
Bug 1568644

Summary: Xvfb returning NULL GLX server string
Product: [Fedora] Fedora
Reporter: Elliott Sales de Andrade <quantum.analyst>
Component: xorg-x11-server
Assignee: X/OpenGL Maintenance List <xgl-maint>
Status: CLOSED ERRATA
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified
Priority: unspecified
Version: rawhide
CC: alexl, awilliam, bskeggs, caillon+fedoraproject, jglisse, john.j5live, ofourdan, philip.chimento, ppisar, rdieter, rhughes, rstrode, sandmann, xgl-maint
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Fixed In Version: xorg-x11-server-1.19.6-8.fc28
Last Closed: 2018-04-27 04:11:16 UTC
Type: Bug

Description Elliott Sales de Andrade 2018-04-18 00:42:16 UTC
Description of problem:
This problem is being hit by matplotlib builds [1] on Rawhide when testing that Qt5 works. It can also be reproduced more simply by running a Qt5 app in Xvfb. It might even be reproducible with a simple XCB app, but I don't know of any off the top of my head.

Version-Release number of selected component (if applicable):
xorg-x11-server-Xvfb-1.19.99.904-2.fc29.x86_64 (but it was also a problem with .902)
1.19.6 from F28 works.


Steps to Reproduce:
1. mock -r fedora-rawhide-x86_64 --install gdb xorg-x11-server-Xvfb mesa-libGL qt5-designer
2. mock -r fedora-rawhide-x86_64 --no-clean --install --enablerepo=fedora-debuginfo mesa-libGL-debuginfo mesa-debugsource qt5-qtbase-gui-debuginfo qt5-qtbase-debugsource libxcb-debuginfo libxcb-debugsource
3. mock -r fedora-rawhide-x86_64 --no-clean --shell
4. xvfb-run -a -s '-screen 0 640x480x24' gdb /usr/bin/designer-qt5 


Actual results:
Thread 1 "designer-qt5" received signal SIGSEGV, Segmentation fault.
xcb_glx_query_server_string_string_length (R=R@entry=0x0) at glx.c:1694
1694	    return R->str_len;
(gdb) bt
#0  xcb_glx_query_server_string_string_length (R=R@entry=0x0) at glx.c:1694
#1  0x00007fffddbf9f99 in __glXQueryServerString (dpy=dpy@entry=0x5555555fb700, opcode=<optimized out>, screen=screen@entry=0, name=name@entry=2)
    at glx_query.c:55
#2  0x00007fffddbf7b93 in AllocAndFetchScreenConfigs (priv=0x55555568a920, dpy=0x5555555fb700) at glxext.c:807
#3  __glXInitialize (dpy=dpy@entry=0x5555555fb700) at glxext.c:946
#4  0x00007fffddbf38f6 in glXGetFBConfigs (dpy=0x5555555fb700, screen=0, nelements=nelements@entry=0x7fffffffdeec) at glxcmds.c:1656
#5  0x00007fffddbf4c86 in glXChooseFBConfig (dpy=<optimized out>, screen=<optimized out>, attribList=0x5555555f5eb8, nitems=0x7fffffffe0a8)
    at glxcmds.c:1594
#6  0x00007ffff11fb6f5 in glXChooseFBConfig () from /lib64/libGLX.so.0
#7  0x00007ffff7e8a944 in qglx_findConfig (display=0x5555555fb700, screen=0, format=..., highestPixelFormat=highestPixelFormat@entry=false, 
    drawableBit=drawableBit@entry=1, flags=flags@entry=0) at ../../../include/QtCore/../../src/corelib/tools/qarraydata.h:209
#8  0x00007ffff7e87d14 in QGLXContext::init (this=0x5555556865e0, screen=0x5555555edba0, share=<optimized out>) at ../../../xcb/qxcbscreen.h:175
#9  0x00007ffff7e8603b in QXcbGlxIntegration::createPlatformOpenGLContext (this=<optimized out>, context=0x5555556467f0)
    at qxcbglxintegration.cpp:184
#10 0x00007fffe1a8ec75 in QXcbIntegration::createPlatformOpenGLContext (this=<optimized out>, context=0x5555556467f0) at qxcbintegration.cpp:279
#11 0x00007ffff6bdbb81 in QOpenGLContext::create (this=this@entry=0x5555556467f0)
    at ../../include/QtGui/5.10.1/QtGui/private/../../../../../src/gui/kernel/qguiapplication_p.h:105
#12 0x00007ffff6b9db68 in QGuiApplicationPrivate::init (this=this@entry=0x5555555e97a0) at kernel/qguiapplication.cpp:1458
#13 0x00007ffff70f1dad in QApplicationPrivate::init (this=0x5555555e97a0) at kernel/qapplication.cpp:576
#14 0x00005555555873c7 in QDesigner::QDesigner(int&, char**) ()
#15 0x0000555555576bdb in main ()

Pulling xorg-x11-server-common and xorg-x11-server-Xvfb 1.19.6 from koji and installing them fixes the crash. I know the backtrace is in XCB, but XCB is processing a reply from the server that seems to be invalid (the reply pointer R is NULL in frame #0).


Additional info:
[1] https://koji.fedoraproject.org/koji/buildinfo?buildID=1069562

Comment 1 Elliott Sales de Andrade 2018-04-18 06:13:51 UTC
Bisect turned out to be relatively easy; seems to point to this commit:

d8ec33fe0542141aed1d9016d2ecaf52da944b4b is the first bad commit
commit d8ec33fe0542141aed1d9016d2ecaf52da944b4b
Author: Adam Jackson <ajax>
Date:   Wed Jan 10 13:05:45 2018 -0500

    glx: Use vnd layer for dispatch (v4)
    
    The big change here is MakeCurrent and context tag tracking. We now
    delegate context tags entirely to the vnd layer, and simply store a
    pointer to the context state as the tag data. If a context is deleted
    while it's current, we allocate a fake ID for the context and move the
    context state there, so the tag data still points to a real context. As
    a result we can stop trying so hard to detach the client from contexts
    at disconnect time and just let resource destruction handle it.
    
    Since vnd handles all the MakeCurrent protocol now, our request handlers
    for it can just be return BadImplementation. We also remove a bunch of
    LEGAL_NEW_RESOURCE, because now by the time we're called vnd has already
    allocated its tracking resource on that XID.
    
    v2: Update to match v2 of the vnd import, and remove more redundant work
    like request length checks.
    
    v3: Add/remove the XID map from the vendor private thunk, not the
    backend. (Kyle Brenneman)
    
    v4: Fix deletion of ghost contexts (Kyle Brenneman)
    
    Signed-off-by: Adam Jackson <ajax>

:100644 100644 9e498e662d0ca763656684be5b1427ff61b6f9d0 c1f389c1e5e02a665aa1a1bb4c8dbca6032f5722 M	configure.ac
:040000 040000 b194a8fa688d757fe5fce74b841ab2a79d56e466 64b608f3696cb8b29cba926e0a1a20882446b7d1 M	glx
:040000 040000 86e24d3c34398d1ee475a38af31ac9feb8b12e4b f0fd28b887ee8a9b00ccaa7528cf1981a4988bc2 M	hw
:040000 040000 d79a9ec87c3c8b591d4adcb8fe7b7c7a0bb2a34f c3f3ccfc7ff44994f5824120482a7f01b6c8d56f M	include

Comment 2 Elliott Sales de Andrade 2018-04-18 06:31:04 UTC
Also, since I didn't post it before, when run with 8753218beae641e5c5ac2c2ba598cfb99a893cf4 (the parent commit), the output is:

QStandardPaths: XDG_RUNTIME_DIR not set, defaulting to '/tmp/runtime-root'
libEGL warning: DRI2: failed to open swrast (search paths /usr/lib64/dri)
libEGL warning: DRI2: failed to open swrast (search paths /usr/lib64/dri)
libEGL warning: DRI2: failed to open swrast (search paths /usr/lib64/dri)
libEGL warning: DRI2: failed to open swrast (search paths /usr/lib64/dri)
QXcbIntegration: Cannot create platform OpenGL context, neither GLX nor EGL are enabled

and then it just runs invisibly.

Comment 3 Philip Chimento 2018-04-22 04:49:54 UTC
This is affecting GNOME's CI builds (https://gitlab.gnome.org/GNOME/gjs/issues/141) and can be reproduced simply with "xvfb-run glxinfo".
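
For CI jobs, the one-line reproducer could be wrapped in a small check. The following is a minimal Python sketch, not something taken from this report: the helper names build_xvfb_command, died_from_segfault and glx_check are hypothetical, and it assumes xvfb-run passes through the wrapped command's exit status, so that a shell-reported SIGSEGV appears as 128 + 11 = 139.

```python
import signal
import subprocess


def build_xvfb_command(command, screen="640x480x24"):
    """Return an argv list that runs `command` under xvfb-run.

    Mirrors the options used in the reproducer in this report:
    -a picks a free display number, -s passes arguments on to Xvfb.
    """
    return ["xvfb-run", "-a", "-s", f"-screen 0 {screen}", *command]


def died_from_segfault(returncode):
    """Return True if a return code suggests death by SIGSEGV.

    subprocess reports a direct child killed by signal N as -N.
    A shell wrapper (assumed here for xvfb-run) reports 128 + N instead.
    """
    segv = int(signal.SIGSEGV)
    return returncode in (-segv, 128 + segv)


def glx_check(command=("glxinfo",)):
    """Run `command` under Xvfb and report whether it segfaulted."""
    result = subprocess.run(
        build_xvfb_command(list(command)),
        capture_output=True,
        text=True,
    )
    return died_from_segfault(result.returncode)
```

Calling glx_check() on an affected system would be expected to return True; it needs xvfb-run and glxinfo to be installed, so treat it as a starting point rather than a finished test.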

Comment 4 Petr Pisar 2018-04-23 11:53:53 UTC
It also breaks building Perl packages that execute X11 applications during tests.

Comment 5 Adam Jackson 2018-04-23 17:58:26 UTC
I'm reasonably sure this is fixed by the following update to libglvnd:

* Wed Apr 18 2018 Adam Jackson <ajax> - 1.0.1-0.5.20180327git5baa1e5
- Go back to Requires: mesa-*, the fallout is too great (#1568881 etc)

Comment 6 Adam Jackson 2018-04-23 18:01:47 UTC
... or it would be, in 1.20. In 1.19 Xvfb's default depth is 8bpp, and GLX doesn't work at that depth. You probably don't want to be running your tests against 8bpp, though.
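
The depth concern in this comment can be checked before a test run starts. This is a minimal sketch, assuming the Xvfb screen geometry is written as WxHxD as in the reproducer's '-screen 0 640x480x24'; the function names parse_screen_spec and depth_warning are illustrative and not part of Xvfb or GLX.

```python
import re

# Geometry strings such as "640x480x24": width x height x depth.
_SCREEN_RE = re.compile(r"^(\d+)x(\d+)x(\d+)$")


def parse_screen_spec(spec):
    """Split a WxHxD screen spec into (width, height, depth) integers."""
    match = _SCREEN_RE.match(spec)
    if match is None:
        raise ValueError(f"not a WxHxD screen spec: {spec!r}")
    width, height, depth = (int(group) for group in match.groups())
    return width, height, depth


def depth_warning(spec):
    """Return a warning string for an 8bpp screen, otherwise None.

    Only the 8bpp case is flagged, because that is the only depth
    this report says GLX does not work at.
    """
    _, _, depth = parse_screen_spec(spec)
    if depth == 8:
        return f"{spec}: 8bpp screen, GLX is not expected to work"
    return None
```

A test harness could call depth_warning on whatever geometry it passes to Xvfb and fail early instead of crashing inside GLX.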

Comment 7 Fedora Update System 2018-04-23 18:32:49 UTC
xorg-x11-server-1.19.6-8.fc28 has been submitted as an update to Fedora 28. https://bodhi.fedoraproject.org/updates/FEDORA-2018-3c8f65c520

Comment 8 Fedora Update System 2018-04-23 22:54:28 UTC
xorg-x11-server-1.19.6-8.fc28 has been pushed to the Fedora 28 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2018-3c8f65c520

Comment 9 Petr Pisar 2018-04-24 08:50:03 UTC
I don't care about color depth. I tried "Xvfb -ac -screen 0 640x480x24 :0", and the Gtk client still segfaults.

It has happened ever since libglvnd-1.0.1-0.5.20180327git5baa1e5, because that build run-requires mesa-libGL. I can get the segfault even with an older libglvnd or Xvfb if I install mesa-libGL.

It looks like GDK detects libGL, uses it, and then segfaults. Of course glxinfo segfaults too.

Comment 10 Adam Williamson 2018-04-24 15:59:52 UTC
Petr: can you check whether it works if you install mesa-dri-drivers?

Comment 11 Petr Pisar 2018-04-25 06:10:07 UTC
If I install mesa-dri-drivers and restart Xvfb, then the Gtk client works, regardless of whether I override the color depth.

Also, if I upgrade xorg-x11-server-Xvfb to 1.19.99.905-1.fc29, the client works even without mesa-dri-drivers being installed.

Thank you for fixing it.
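
The libEGL warnings in comment 2 and the result above both suggest that the software rasterizer was missing from the build root. Below is a sketch of a pre-flight check, assuming the driver file is named swrast_dri.so and is looked up in the /usr/lib64/dri search path quoted in those warnings; find_swrast is a hypothetical helper name.

```python
from pathlib import Path


def find_swrast(search_paths=("/usr/lib64/dri",), name="swrast_dri.so"):
    """Return the path of the first swrast driver found, or None.

    The default directory comes from the libEGL warnings in this
    report; the file name is an assumption about how the driver
    is installed.
    """
    for directory in search_paths:
        candidate = Path(directory) / name
        if candidate.is_file():
            return candidate
    return None
```

If find_swrast() returns None in a build root, installing mesa-dri-drivers, as tested above, is the first thing to try.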

Comment 12 Fedora Update System 2018-04-27 04:11:16 UTC
xorg-x11-server-1.19.6-8.fc28 has been pushed to the Fedora 28 stable repository. If problems still persist, please make note of it in this bug report.