Bug 1366897
Summary: | Many apps crash in gdk_event_source_check when logging out of GNOME | |
---|---|---|---
Product: | [Fedora] Fedora | Reporter: | Christian Stadelmann <fedora>
Component: | gtk3 | Assignee: | Matthias Clasen <mclasen>
Status: | CLOSED ERRATA | QA Contact: | Fedora Extras Quality Assurance <extras-qa>
Severity: | high | Docs Contact: |
Priority: | unspecified | |
Version: | 25 | CC: | alecbcs, awilliam, bojan, brhahlen+tech, cgarnach, chmelarz, cosimo.cecchi, davejohansen, debarshir, diogocamposwd, dkopecek, edelgado81, extras-qa, fedora, fmuellner, gmarr, jan.vesely, jaragunde, jfrieben, jkurik, jmccann, joe, juliux.pigface, kluksa, kparal, lray+redhatbugzilla, lukas.polivka+rh, mattdm, mclasen, mfabian, mihai, mikhail.v.gavrilov, motoskov, noobusinghacks, nuno.dias, oholy, otaylor, peljasz, peter, rmatos, robatino, rstrode, sgallagh, spacewar, tcfxfzoi, thughes, yonatan.el.amigo, zbyszek
Target Milestone: | --- | Keywords: | CommonBugs, Triaged
Target Release: | --- | |
Hardware: | Unspecified | |
OS: | Unspecified | |
Whiteboard: | PrioritizedBug;RejectedBlocker https://fedoraproject.org/wiki/Common_F25_bugs#gnome-logout-apps-crash | |
Fixed In Version: | gtk3-3.22.5-1.fc25 | Doc Type: | If docs needed, set a value
Doc Text: | | Story Points: | ---
Clone Of: | 1237277 | Environment: |
Last Closed: | 2016-12-15 23:31:48 UTC | Type: | Bug
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Bug Depends On: | | |
Bug Blocks: | 1277927 | |
Attachments: | | |

Description
Christian Stadelmann
2016-08-14 12:00:57 UTC
The reason this happened in ~F22 was probably:

https://github.com/systemd/systemd/issues/317

I am not exactly sure what the current status of that bug is; I know we reverted the patch downstream at some point. Basically, I'm not sure what you're seeing in F24 is actually exactly the same as what you were seeing in F22...

(In reply to Adam Williamson from comment #1)
> The reason this happened in ~F22 was probably:
>
> https://github.com/systemd/systemd/issues/317
>
> I am not exactly sure what the current status of that bug is; I know we
> reverted the patch downstream at some point. Basically, I'm not sure what
> you're seeing in F24 is actually exactly the same as what you were seeing in
> F22...

I think this issue is different now. See below.

This issue is still present in Wayland F25. Now on every logout I get 1…5 applications crashing and showing up in abrt later on. I guess this crasher is caused by gnome-shell stopping as a wayland compositor before all wayland clients have exited, resulting in the clients crashing due to protocol errors.

Some example bugs: #1373517 #1378569 #1378570 #1378566

Note that this might be an issue in gnome-shell, not gnome-session. Feel free to change the "component" field since you probably know better than I do.

Affected software:
Looks like all applications (or at least all applications running on wayland backend) could be affected. Not limited to Gtk+ 3.x applications but also affecting Qt5 applications.

How reproducible:
On almost every logout.

Steps to reproduce:
1. log into a gnome@wayland session
2. start some applications or search in gnome-shell triggering start of search providers
3. log out
4. log in again, have a look at gnome-abrt and its crashes listed

Software versions:
gnome-shell-3.21.92-1.fc25
gnome-session-3.21.90-1.fc25
libwayland-client-1.12.0-1.fc25
systemd-231-4.fc25

Additional info:
I can't test right now whether this bug is specific to wayland sessions since I can't log into gnome@X11 due to another bug.

In case wayland will be default in F25, this bug will cause crash notifications on every login but the first one. This is not an exact violation of the final release criteria:
"There must be no SELinux denial notifications or crash notifications on boot of or during installation from a release-blocking live image, or at first login after a default install of a release-blocking desktop."
Due to the high impact of happening on _every_ successive login I think it should be a blocker anyway.

The criterion we considered for the older bug was the 'data loss' criterion - basically, if apps don't get the opportunity to shut down cleanly, you could lose unsaved data in them unexpectedly...we could consider the same one here. I'll try and find a bit of time to look into this; CCing kparal, who may also be interested.

I was discussing this with halfline last week. gnome-session's XSMP support is pretty broken!

(But it has been for a while; I wouldn't recommend this as a blocker.)

*** Bug 1378570 has been marked as a duplicate of this bug. ***

(In reply to Michael Catanzaro from comment #5)
> (But it has been for a while; I wouldn't recommend this as a blocker.)

I think always crashing applications on every logout is a major bug and should be fixed. At least by sending a SIGTERM to applications in the same user session and waiting 1…5 seconds for them to exit.

> I was discussing this with halfline last week. gnome-session's XSMP support is pretty broken!

I think this bug is about wayland, not X. Does gnome@wayland use XSMP for wayland applications?
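
Editorial sketch of the "send SIGTERM, wait a few seconds, then force-kill" idea from the comment above. This is not gnome-session or gnome-shell code: the names `spawn_child` and `terminate_with_grace` are hypothetical, and the sketch assumes the caller is the parent of the process it terminates (so that `waitpid` works), which a real session manager generally is not.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical demo helper: fork a child that either dies on SIGTERM
 * (default action) or ignores it. A pipe makes the parent wait until the
 * child has set its SIGTERM disposition. Returns the pid, or -1 on error. */
pid_t spawn_child(int ignore_sigterm)
{
    int ready[2];
    char byte = 0;
    pid_t pid;

    if (pipe(ready) < 0)
        return -1;
    pid = fork();
    if (pid < 0) {
        close(ready[0]);
        close(ready[1]);
        return -1;
    }
    if (pid == 0) {                       /* child */
        close(ready[0]);
        signal(SIGTERM, ignore_sigterm ? SIG_IGN : SIG_DFL);
        if (write(ready[1], &byte, 1) != 1)
            _exit(1);
        close(ready[1]);
        for (;;)
            pause();                      /* stand-in for "application is running" */
    }
    close(ready[1]);                      /* parent */
    if (read(ready[0], &byte, 1) != 1) {  /* child died before it was ready */
        close(ready[0]);
        kill(pid, SIGKILL);
        waitpid(pid, NULL, 0);
        return -1;
    }
    close(ready[0]);
    return pid;
}

/* Ask a child to exit with SIGTERM, poll for up to timeout_ms, then fall
 * back to SIGKILL. Returns 0 if the child exited within the grace period,
 * 1 if SIGKILL was needed, -1 on error. */
int terminate_with_grace(pid_t pid, int timeout_ms)
{
    const struct timespec step = { 0, 10 * 1000 * 1000 };  /* 10 ms */
    int status;

    assert(timeout_ms >= 0);
    if (pid <= 0)                         /* never signal process groups or "everyone" */
        return -1;
    if (kill(pid, SIGTERM) < 0)
        return -1;

    for (int waited = 0; waited < timeout_ms; waited += 10) {
        pid_t r = waitpid(pid, &status, WNOHANG);
        if (r == pid)
            return 0;                     /* exited during the grace period */
        if (r < 0)
            return -1;
        nanosleep(&step, NULL);
    }

    if (kill(pid, SIGKILL) < 0)
        return -1;
    return waitpid(pid, &status, 0) == pid ? 1 : -1;
}
```

A caller would use something like `terminate_with_grace(pid, 5000)` for the 5-second upper bound mentioned above; the 10 ms polling interval is an arbitrary choice for the sketch.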
Discussed during the 2016-09-26 blocker review meeting: [1]

The decision to classify this bug as a RejectedBlocker was made due to the fact that the resolution of the blocker would be to classify it as "CommonBugs" which we will do regardless of whether or not the bug is marked as a blocker.

[1] https://meetbot.fedoraproject.org/fedora-blocker-review/2016-09-26/f25-blocker-review.2016-09-26-16.04.txt

*** Bug 1378566 has been marked as a duplicate of this bug. ***

*** Bug 1378569 has been marked as a duplicate of this bug. ***

*** Bug 1355754 has been marked as a duplicate of this bug. ***

*** Bug 1356493 has been marked as a duplicate of this bug. ***

*** Bug 1359529 has been marked as a duplicate of this bug. ***

*** Bug 1346728 has been marked as a duplicate of this bug. ***

I think all of these bugs:
https://bugzilla.redhat.com/buglist.cgi?classification=Fedora&list_id=5956197&longdesc=gdk_event_source_check&longdesc_type=allwordssubstr&product=Fedora&query_format=advanced&short_desc=SIGTRAP&short_desc_type=allwordssubstr
are duplicates of this one. At least all of them with "gdk_event_source_check()" in their title.

I think that if this is not fixed before F25 is released (with wayland by default), Fedora's infrastructure (most notably retrace server) will have to handle hundreds or thousands of these crashes per day.

(In reply to Michael Catanzaro from comment #5)
> I was discussing this with halfline last week. gnome-session's XSMP support
> is pretty broken!
>
> (But it has been for a while; I wouldn't recommend this as a blocker.)

Yeah I didn't read comment #2 at all, sorry. Let's try gnome-shell for this.

Despite what I wrote in comment #2, this bug not only affects applications you start manually but also stuff like search providers.

(In reply to Michael Catanzaro from comment #5)
> I was discussing this with halfline last week. gnome-session's XSMP support
> is pretty broken!

Not sure this is related to our discussion, actually. Maybe a regression from this commit?

https://git.gnome.org/browse/gnome-session/commit/?id=58c9323ea7b8e51f19449f596bb6826e7600c020

*** Bug 1359530 has been marked as a duplicate of this bug. ***

So, yeah, I just fired the RC request, and this is still broken. S.S. Shipit ahoy.

*** Bug 1385147 has been marked as a duplicate of this bug. ***

*** Bug 1384227 has been marked as a duplicate of this bug. ***

This also prevents Emacs from saving modified files for later recovery, which it would do on SIGTERM.

*** Bug 1385135 has been marked as a duplicate of this bug. ***

*** Bug 1390902 has been marked as a duplicate of this bug. ***

*** Bug 1394452 has been marked as a duplicate of this bug. ***

*** Bug 1395075 has been marked as a duplicate of this bug. ***

*** Bug 1396716 has been marked as a duplicate of this bug. ***

*** Bug 1357231 has been marked as a duplicate of this bug. ***

*** Bug 1357449 has been marked as a duplicate of this bug. ***

*** Bug 1371432 has been marked as a duplicate of this bug. ***

*** Bug 1389937 has been marked as a duplicate of this bug. ***

*** Bug 1393155 has been marked as a duplicate of this bug. ***

(In reply to Eric Smith from comment #23)
> This also prevents Emacs from saving modified files for later recovery, which
> it would do on SIGTERM.

I highly doubt that Emacs is affected by this. This is a crash in GTK+. You'll want to open another bug for the Emacs issue.

I guess that Emacs problem would be bug #1394937. It's in the See Also field as a pointer, but it's a different issue.

mcatanzaro: well. we're not actually sure there's any genuine bug different from this one in 1394937. It's a very fuzzy report.

What happens if you have emacs running inside a gnome-terminal?

(In reply to Adam Williamson from comment #36)
> mcatanzaro: well. we're not actually sure there's any genuine bug different
> from this one in 1394937. It's a very fuzzy report.

It's completely different, that bug is about processes getting SIGKILL from systemd, this bug is about a crash in GTK+. Actually I see there's no backtrace in this "original" bug, but we've duped 20 different GTK+ crash reports against this bug that are all the same crash in gdk_event_source_check. So this bug covers one specific crash, and if you're hitting some other issue not this crash, you need a different bug. Let me rename the title for clarity.

> What happens if you have emacs running inside a gnome-terminal?

I don't know, but I guess emacs will probably receive SIGHUP as a side effect of gnome-terminal dying. So that would be a third way for a process to die at logout. :) Anyway, I doubt that's the scenario of the complaint in comment #23.

Anyway SIGKILL is a much less serious issue as it doesn't trigger the cascade of ABRT reports like crashing does.

Even just powering off or rebooting your computer without closing all your apps manually can trigger this from many apps; it's really embarrassing for us and we'll be lucky if reviewers don't complain about the flood of crash notices when they log in. It should have been a blocker under the no ABRT notifications criterion, but it's true that it's a bit iffy as it is possible to avoid the crashes by closing everything manually, and also we probably didn't recognize the scope of the problem at the blocker meeting.

(In reply to Michael Catanzaro from comment #37)
> It's completely different, that bug is about processes getting SIGKILL from
> systemd, this bug is about a crash in GTK+.

Ah, so the first comment in this bug looks like some completely different issue, but everything from comment #2 and below is all about the GTK+ crash. We should have probably created a new bug instead of adding comment #2 and duping all the bugs here, but too late now.

Um. What happens is, when you log out of GNOME-on-Wayland, a lot of running apps 'crash' because they just get SIGKILLed (I think).
This shows up as a crash in abrt - typically this backtrace running through gdk_event_source_check - but it's not really some kind of actual crasher bug in GTK+. It's just the apps being killed on logout.

Just try it - have some apps running, log out, log back in, and see if you get an abrt notification for a bunch of 'crashed' apps.

(In reply to Adam Williamson from comment #40)
> Um. What happens is, when you log out of GNOME-on-Wayland, a lot of running
> apps 'crash' because they just get SIGKILLed (I think). This shows up as a
> crash in abrt - typically this backtrace running through
> gdk_event_source_check - but it's not really some kind of actual crasher bug
> in GTK+. It's just the apps being killed on logout.
>
> Just try it - have some apps running, log out, log back in, and see if you
> get an abrt notification for a bunch of 'crashed' apps.

No, if you check the backtraces it really is just a crash that has nothing to do with SIGKILL; SIGKILL would not cause crash warnings from ABRT anyway. It is caused by the Wayland compositor going away. All GTK+ clients crash when that happens, it hits a g_error (SIGTRAP crash) "Error reading events from display: Broken pipe".

Actually it's arguably the opposite problem; if applications received SIGKILL they wouldn't then have any opportunity to go on and crash. :)

I'm still thinking this is a bug in gnome-shell, not in Gtk+, because applications will also crash when the X server dies. So I think the issue has to be fixed in gnome-shell, not in Gtk+.

So how about solving the issue this way in gnome-shell:
1. After a user chooses logout or shutdown, send a SIGTERM to all GUI applications (i.e. all wayland and X11 clients). Gnome-shell has this list already in its "looking glass" window list you can get by pressing [Alt]+[F2], "lg", [Enter]. In fact there is a bit more to that, e.g. applications which only have a tray icon.
2. wait for the children to die with a timeout (e.g. 30 seconds).
3. after one second, show a dialog with all remaining applications and provide a way to force-kill the apps
4. after time runs out or user chose to force-kill, send SIGKILL to processes.

So gdk/wayland/gdkeventsource.c:gdk_event_source_check has this in it:

  if (source->pfd.revents & G_IO_IN)
    {
      if (wl_display_read_events (display_wayland->wl_display) < 0)
        g_error ("Error reading events from display: %s", g_strerror (errno));
    }

The equivalent situation with the X backend is gdk/x11/gdkmain-x11.c:gdk_x_io_error which has:

  if (errno == EPIPE)
    {
      g_warning ("The application '%s' lost its connection to the display %s;\n"
                 "most likely the X server was shut down or you killed/destroyed\n"
                 "the application.\n",
                 g_get_prgname (),
                 display ? DisplayString (display) : gdk_get_display_arg_name ());
    }
  else
    {
      g_warning ("%s: Fatal IO error %d (%s) on X server %s.\n",
                 g_get_prgname (),
                 errno, g_strerror (errno),
                 display ? DisplayString (display) : gdk_get_display_arg_name ());
    }

  _exit (1);

I honestly don't think either behavior is right, if the display connection goes away, we don't need a stampede of applications announcing that fact (g_message sure, whatever, but not g_error or g_warning). But at a minimum, gtk on wayland should exit and not g_error to reach parity with X11 but I think we should cut out the nastygrams from both sides really.

Still, I guess the problem leading to the messages is gnome-shell is exiting before other clients, and we should fix that too (which i guess would be a gnome-session fix)

(In reply to Ray Strode [halfline] from comment #43)
> I honestly don't think either behavior is right, if the display connection
> goes away, we don't need a stampede of applications announcing that fact
> (g_message sure, whatever, but not g_error or g_warning). But at a minimum,
> gtk on wayland should exit and not g_error to reach parity with X11 but I
> think we should cut out the nastygrams from both sides really.

My thoughts exactly.
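
Editorial sketch of the behavior argued for above: treat a dead display connection as a normal condition the caller can react to quietly, rather than as a fatal assertion. It uses a plain Unix socket pair as a stand-in for the Wayland connection; it is not GTK+ or libwayland code, and `conn_status`, `read_display_events` and `demo_peer_close` are hypothetical names.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

enum conn_status { CONN_OK, CONN_LOST };

/* Drain pending bytes from a (stand-in) display connection and report
 * whether the peer is still there. The caller decides what to do on
 * CONN_LOST, for example log one line and exit with a non-zero status,
 * instead of aborting and producing a core dump. */
enum conn_status read_display_events(int fd)
{
    char buf[256];
    ssize_t n;

    assert(fd >= 0);
    do {
        n = read(fd, buf, sizeof buf);
    } while (n < 0 && errno == EINTR);   /* retry if interrupted by a signal */

    if (n > 0)
        return CONN_OK;                  /* got event data */
    if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
        return CONN_OK;                  /* non-blocking fd, nothing pending */
    return CONN_LOST;                    /* EOF or a hard error such as ECONNRESET */
}

/* Simulate the compositor going away. Returns 1 if the client saw CONN_OK
 * while data was pending and CONN_LOST after the peer closed, 0 otherwise. */
int demo_peer_close(void)
{
    int sv[2];
    int ok;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return 0;
    ok = write(sv[1], "x", 1) == 1
         && read_display_events(sv[0]) == CONN_OK;
    close(sv[1]);                        /* the "compositor" exits */
    ok = ok && read_display_events(sv[0]) == CONN_LOST;
    close(sv[0]);
    return ok;
}
```

Whether the caller should then use `exit()` or `_exit()` is a separate question that the thread takes up below; the sketch only shows that the lost connection can be detected and reported as a return value.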
Ideally, applications would save state (if they have any) when that happens, and exit cleanly.

> Still, I guess the problem leading to the messages is gnome-shell is exiting
> before other clients, and we should fix that too (which i guess would be a
> gnome-session fix)

Yep, that should be pursued in parallel, but is probably much more complex than 1.

(In reply to Ray Strode [halfline] from comment #43)
> I honestly don't think either behavior is right, if the display connection
> goes away, we don't need a stampede of applications announcing that fact
> (g_message sure, whatever, but not g_error or g_warning).

I think it warrants g_warning because _exit doesn't allow the application to shut down cleanly; it's not hard to imagine data loss caused by quitting without running exit handlers or SIGTERM handlers.

Why is it calling _exit? Shouldn't it call exit() to give the process a chance to clean up?

(In reply to Zbigniew Jędrzejewski-Szmek from comment #46)
> Why is it calling _exit? Shouldn't it call exit() to give the process a
> chance to clean up?

I thought about suggesting that, but then we have to make the rest of GTK+ able to function safely in this case without crashing.

This issue is complex and subtle and will impact the vast majority of Fedora GNOME Wayland session users to at least some extent. This bug has been approved for the "Prioritized bugs list".

So, is there anyone who is not on this bug report already who we need to bring this to the attention of?

Fixed upstream by https://git.gnome.org/browse/gtk+/commit/?h=gtk-3-22&id=43b2b107f123a97e2040ddb7f429b611a16bdc41 which will be present in GTK+ 3.22.5. (Note the commit message is wrong, as that was a crash and not a warning.)

*** Bug 1375437 has been marked as a duplicate of this bug. ***

*** Bug 1384228 has been marked as a duplicate of this bug. ***

*** Bug 1384194 has been marked as a duplicate of this bug. ***

*** Bug 1402208 has been marked as a duplicate of this bug. ***

gtk3-3.22.5-1.fc25 has been submitted as an update to Fedora 25. https://bodhi.fedoraproject.org/updates/FEDORA-2016-4171f5555c

(In reply to Fedora Update System from comment #56)
> gtk3-3.22.5-1.fc25 has been submitted as an update to Fedora 25.
> https://bodhi.fedoraproject.org/updates/FEDORA-2016-4171f5555c

Seems to fix this.

gtk3-3.22.5-1.fc25 has been pushed to the Fedora 25 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2016-4171f5555c

gtk3-3.22.5-1.fc25 has been pushed to the Fedora 25 stable repository. If problems still persist, please make note of it in this bug report.

*** Bug 1371983 has been marked as a duplicate of this bug. ***

*** Bug 1401999 has been marked as a duplicate of this bug. ***

*** Bug 1397559 has been marked as a duplicate of this bug. ***

Problem still persists.
Today, during a system update, gnome-terminal crashed :(

# rpm -q gtk3
gtk3-3.22.7-1.fc25.x86_64

Original bug report: https://bugzilla.redhat.com/show_bug.cgi?id=1397559 which was closed as a duplicate of this bug report.

Similar problem has been detected:

Open the web inspector by right click -> inspect element. An empty bottom panel popped up for a split second, then it disappeared and the crash happened.

reporter: libreport-2.7.2
backtrace_rating: 4
cmdline: /usr/libexec/webkit2gtk-4.0/WebKitWebProcess 86
crash_function: pthread_cond_wait@@GLIBC_2.3.2
executable: /usr/libexec/webkit2gtk-4.0/WebKitWebProcess
global_pid: 3866
kernel: 4.9.6-100.fc24.x86_64
package: webkitgtk4-2.14.3-1.fc24
pkg_fingerprint: 73BD E983 81B4 6521
pkg_vendor: Fedora Project
reason: WebKitWebProcess killed by SIGSEGV
runlevel: N 5
type: CCpp
uid: 1000

Created attachment 1247547
File: backtrace

Similar problem has been detected:

I had opened the following GIF with Epiphany on webapp mode: https://twitter.com/madtrick/status/831440822730162177

Sometimes it just crashes (which led to this report), but it often freezes my desktop and I have no other choice but restart the computer.

reporter: libreport-2.7.2
backtrace_rating: 4
cmdline: /usr/libexec/webkit2gtk-4.0/WebKitWebProcess 28
crash_function: pthread_cond_wait@@GLIBC_2.3.2
executable: /usr/libexec/webkit2gtk-4.0/WebKitWebProcess
global_pid: 2629
kernel: 4.9.7-101.fc24.x86_64
package: webkitgtk4-2.14.3-1.fc24
pkg_fingerprint: 73BD E983 81B4 6521
pkg_vendor: Fedora Project
reason: WebKitWebProcess killed by SIGSEGV
runlevel: N 5
type: CCpp
uid: 1000

(In reply to Jacobo Aragunde from comment #67)
> Sometimes it just crashes (which led to this report), but it often freezes
> my desktop and I have no other choice but restart the computer.

This is a different bug. You should report the crash in gnome-shell or XWayland in this case. The crash in Gtk+ is just a result of that.

By the way, it looks like you are still on Fedora 24. In Fedora 24, Wayland was not the default choice for GNOME desktop, so please use X11 or update to Fedora 25.

> I had opened the following GIF with Epiphany on webapp mode: https://twitter.com/madtrick/status/831440822730162177

I cannot reproduce that. Can you retry with Fedora 25?

(In reply to Mikhail from comment #64)
> Problem still persists.
> Today, during a system update, gnome-terminal crashed :(
>
> # rpm -q gtk3
> gtk3-3.22.7-1.fc25.x86_64
>
> Original bug report: https://bugzilla.redhat.com/show_bug.cgi?id=1397559
> which was closed as a duplicate of this bug report.

Looks like a different bug to me too. This bug report (1366897) is only for the crashes that occur at logout, not for any other application crashes.

*** Bug 1372488 has been marked as a duplicate of this bug. ***

*** Bug 1258818 has been marked as a duplicate of this bug. ***

(In reply to Ray Strode [halfline] from comment #43)
> So gdk/wayland/gdkeventsource.c:gdk_event_source_check has this in it:
>
> [...]
>
> The equivalent situation with the X backend is
> gdk/x11/gdkmain-x11.c:gdk_x_io_error which has:
>
> [...]
>
> I honestly don't think either behavior is right, if the display connection
> goes away, we don't need a stampede of applications announcing that fact
> (g_message sure, whatever, but not g_error or g_warning). But at a minimum,
> gtk on wayland should exit and not g_error to reach parity with X11 but I
> think we should cut out the nastygrams from both sides really.

For what it is worth, the X11 backend's logging was demoted in:

https://git.gnome.org/browse/gtk+/commit/?id=c70ba3a4f0043e11fffe0023685f99ec3990c644

*** Bug 1347989 has been marked as a duplicate of this bug. ***