Bug 1526164
Summary: | [abrt] gnome-shell: raise(): gnome-shell killed by SIGTRAP, maybe related to libst blur_pixels | ||
---|---|---|---|
Product: | [Fedora] Fedora | Reporter: | kevin |
Component: | gnome-shell | Assignee: | Owen Taylor <otaylor> |
Status: | CLOSED ERRATA | QA Contact: | Fedora Extras Quality Assurance <extras-qa> |
Severity: | unspecified | Docs Contact: | |
Priority: | unspecified | ||
Version: | 27 | CC: | adler, andywangchn, awilliam, bruce, bryanhundven, bugzilla, damien, debarshir, dwagelaar, ejhuff, fmuellner, gnome-sig, h, jan.brummer, jfrieben, John_Sauter, j.orti.alcaine, jose.miguel.perez.hernandez, kevin, kmathews, louis, mike, nalimilan, news.gdc, otaylor, plazaga, pschiffe, redhat-bugzilla, redhat, rickyb.com |
Target Milestone: | --- | ||
Target Release: | --- | ||
Hardware: | x86_64 | ||
OS: | Linux | ||
Whiteboard: | PrioritizedBug | ||
Fixed In Version: | gnome-shell-3.26.2-5.fc27 | Doc Type: | If docs needed, set a value |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2018-04-18 01:23:19 UTC | Type: | Bug |
Description
kevin
2017-12-14 21:01:20 UTC
Seems to be caused by a bogus `blur` value related to the shadow_spec of an icon. The malloc size is derived from the height and width calculations in blur_pixels, which are driven by the blur value passed in. That value comes from shadow_spec->blur, which is 33554432.

    (gdb) frame 8
    #8  0x00007f90b7c23424 in g_malloc0 (n_bytes=18446744072098939136) at gmem.c:129
    129       g_error ("%s: failed to allocate %"G_GSIZE_FORMAT" bytes",
    (gdb) frame 9
    #9  0x00007f90b56544e0 in blur_pixels (pixels_in=pixels_in@entry=0x55a13a1f16c0 "", width_in=width_in@entry=16, height_in=height_in@entry=16, rowstride_in=rowstride_in@entry=16, blur=<optimized out>, width_out=width_out@entry=0x7ffc8cad95e4, height_out=height_out@entry=0x7ffc8cad95e8, rowstride_out=0x7ffc8cad95ec) at ../src/st/st-private.c:280
    280       pixels_out = g_malloc0 (*rowstride_out * *height_out);
    (gdb) list
    275
    276       *width_out = width_in + 2 * half;
    277       *height_out = height_in + 2 * half;
    278       *rowstride_out = (*width_out + 3) & ~3;
    279
    280       pixels_out = g_malloc0 (*rowstride_out * *height_out);
    281       line = g_malloc0 (*rowstride_out);
    282
    283       kernel = calculate_gaussian_kernel (sigma, n_values);
    284
    (gdb) list 240,280
    240 blur_pixels (guchar *pixels_in,
    241              gint width_in,
    242              gint height_in,
    243              gint rowstride_in,
    244              gdouble blur,
    245              gint *width_out,
    246              gint *height_out,
    247              gint *rowstride_out)
    248 {
    249   guchar *pixels_out;
    250   float sigma;
    251
    252   /* The CSS specification defines (or will define) the blur radius as twice
    253    * the Gaussian standard deviation. See:
    254    *
    255    * http://lists.w3.org/Archives/Public/www-style/2010Sep/0002.html
    256    */
    257   sigma = blur / 2.;
    258
    259   if ((guint) blur == 0)
    260     {
    261       *width_out = width_in;
    262       *height_out = height_in;
    263       *rowstride_out = rowstride_in;
    264       pixels_out = g_memdup (pixels_in, *rowstride_out * *height_out);
    265     }
    266   else
    267     {
    268       gdouble *kernel;
    269       guchar *line;
    270       gint n_values, half;
    271       gint x_in, y_in, x_out, y_out, i;
    272
    273       n_values = (gint) 5 * sigma;
    274       half = n_values / 2;
    275
    276       *width_out = width_in + 2 * half;
    277       *height_out = height_in + 2 * half;
    278       *rowstride_out = (*width_out + 3) & ~3;
    279
    280       pixels_out = g_malloc0 (*rowstride_out * *height_out);
    (gdb) frame 10
    #10 0x00007f90b5654cbe in _st_create_shadow_pipeline (shadow_spec=shadow_spec@entry=0x7f90908b08d0, src_texture=src_texture@entry=0x55a141098110) at ../src/st/st-private.c:372
    372   pixels_out = blur_pixels (pixels_in, width_in, height_in, rowstride_in,
    (gdb) list
    367   pixels_in = g_malloc0 (rowstride_in * height_in);
    368
    369   cogl_texture_get_data (src_texture, COGL_PIXEL_FORMAT_A_8,
    370                          rowstride_in, pixels_in);
    371
    372   pixels_out = blur_pixels (pixels_in, width_in, height_in, rowstride_in,
    373                             shadow_spec->blur,
    374                             &width_out, &height_out, &rowstride_out);
    375   g_free (pixels_in);
    376
    (gdb) print shadow_spec->blur
    $25 = 33554432

Thanks for looking into this. We actually have three very similar reports already:

https://bugzilla.redhat.com/show_bug.cgi?id=1508398
https://bugzilla.redhat.com/show_bug.cgi?id=1506325
https://bugzilla.redhat.com/show_bug.cgi?id=1502183

As this bug is public and you've done more digging, I'll close those as dupes of this. Thanks again.

*** Bug 1508398 has been marked as a duplicate of this bug. ***
*** Bug 1506325 has been marked as a duplicate of this bug. ***
*** Bug 1502183 has been marked as a duplicate of this bug. ***

Found some relevant upstream bugs:

https://bugzilla.gnome.org/show_bug.cgi?id=788908
https://bugzilla.gnome.org/show_bug.cgi?id=788627

*** Bug 1514850 has been marked as a duplicate of this bug. ***

Thanks. I'll try to find the time to apply the patch from 788908 and if you don't hear back from me, the patch is probably good. This has been happening pretty frequently for me.

*** Bug 1515926 has been marked as a duplicate of this bug. ***
*** Bug 1516253 has been marked as a duplicate of this bug. ***
*** Bug 1516633 has been marked as a duplicate of this bug. ***
*** Bug 1517234 has been marked as a duplicate of this bug. ***
*** Bug 1525979 has been marked as a duplicate of this bug. ***

Applied patch like so:

    $ rpmdev-setuptree
    $ cd ~/rpmbuild/
    $ dnf download --source gnome-shell
    $ rpm -ivh gnome-shell-3.26.2-1.fc27.src.rpm
    $ cd SOURCES/
    $ wget -O StIcon-only-compute-shadow-pipeline-when-the-textu.patch https://bug788908.bugzilla-attachments.gnome.org/attachment.cgi?id=362437
    $ cd ../SPECS/
    $ sed -i "$(($(grep -n Patch1 gnome-shell.spec | cut -d : -f 1) + 1))iPatch2: StIcon-only-compute-shadow-pipeline-when-the-textu.patch" gnome-shell.spec
    $ sed -i "$(($(grep -n patch1 gnome-shell.spec | cut -d : -f 1) + 1))i%patch2 -p1 -b .gnome788908" gnome-shell.spec
    $ cd ..
    $ rpmbuild -ba ~/rpmbuild/SPECS/gnome-shell.spec
    $ sudo dnf reinstall -y ~/rpmbuild/RPMS/x86_64/gnome-shell-*rpm

Fingers crossed...

Kevin: one of the devs on the upstream bug is asking if there's a reliable reproducer for this - do you have one? Does it require having the Trendnet KVM you're using? Thanks!

My core trace is similar, but I am using a plain HDMI connection. No switches. The gnome-shell instance will restart and all windows will stay running. I've only had one case of the entire session ending. I haven't been able to reliably reproduce it. Creating new windows is the only common theme. The application is not a common factor, as I've seen it with every application I've started.
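For anyone wondering where the n_bytes value in frame 8 of the description comes from: it follows directly from shadow_spec->blur = 33554432 and the 16x16 input reported in frame 9. Below is a minimal sketch of that arithmetic, assuming a 32-bit gint, a 64-bit gsize, and two's-complement wraparound for the overflowing multiplication (formally undefined behaviour in C, but apparently what happened in the crashing build). Python is used only to make the integer widths explicit; `wrap_int32` and `blur_alloc_size` are hypothetical helper names, not part of St.

```python
# Emulates the size arithmetic of the else-branch of blur_pixels()
# (st-private.c lines 257-280 in the gdb listing from the description).
# Assumptions: gint is 32-bit signed, gsize is 64-bit unsigned, and signed
# overflow wraps around. Only non-negative blur values are modelled.


def wrap_int32(value: int) -> int:
    """Reduce an integer to the range of a 32-bit signed int (hypothetical helper)."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value >= (1 << 31) else value


def blur_alloc_size(width_in: int, height_in: int, blur: float) -> int:
    """Return the byte count g_malloc0() would receive, as a 64-bit gsize."""
    sigma = blur / 2.0
    # In C, `(gint) 5 * sigma` casts only the 5; the floating-point product
    # is truncated when it is assigned to the gint n_values.
    n_values = wrap_int32(int(5 * sigma))
    half = n_values // 2
    width_out = wrap_int32(width_in + 2 * half)
    height_out = wrap_int32(height_in + 2 * half)
    rowstride_out = (width_out + 3) & ~3
    # gint * gint: the true product needs far more than 32 bits.
    product = wrap_int32(rowstride_out * height_out)
    # The negative gint is sign-extended when converted to gsize.
    return product & 0xFFFFFFFFFFFFFFFF


print(blur_alloc_size(16, 16, 33554432))  # 18446744072098939136
```

With a sane blur such as 4, the same function gives 728 bytes (rowstride 28 times height 26), so the allocation size only goes wrong once the blur value is garbage.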
"My core trace is similar" The key thing is the "failed to allocate 18446744072098939136 bytes" error. So far I'm working on the basis that all cases of that are fundamentally the same bug, if it doesn't have that in it, it's not the same bug. (In reply to Adam Williamson from comment #15) > Kevin: one of the devs on the upstream bug is asking if there's a reliable > reproducer for this - do you have one? Does it require having the Trendnet > KVM you're using? Thanks! Yes, I can reproduce with the same symptom (a few times a week). Here are two recent dumps (output snipped): $ coredumpctl TIME PID UID GID SIG COREFILE EXE Fri 2017-12-01 09:25:23 PST 2256 1000 1000 6 missing /usr/bin/Xwayland Fri 2017-12-01 09:25:23 PST 2242 1000 1000 5 missing /usr/bin/gnome-shell Fri 2017-12-01 18:08:04 PST 10166 1000 1000 6 missing /usr/bin/Xwayland Fri 2017-12-01 18:08:04 PST 10066 1000 1000 11 missing /usr/bin/gnome-shell Fri 2017-12-01 18:08:42 PST 21694 1000 1000 6 missing /usr/bin/Xwayland Fri 2017-12-01 18:08:43 PST 21606 1000 1000 11 missing /usr/bin/gnome-shell Mon 2017-12-11 17:10:49 PST 1906 1000 1000 6 missing /usr/bin/Xwayland Mon 2017-12-11 17:10:50 PST 1886 1000 1000 5 missing /usr/bin/gnome-shell Thu 2017-12-14 08:42:20 PST 26991 1000 1000 6 present /usr/bin/Xwayland Thu 2017-12-14 08:42:20 PST 26936 1000 1000 5 present /usr/bin/gnome-shell Thu 2017-12-14 17:45:01 PST 4089 1000 1000 5 present /usr/bin/gnome-shell Thu 2017-12-14 17:45:01 PST 4139 1000 1000 6 present /usr/bin/Xwayland $ coredumpctl gdb 26936 [...] (gdb) frame 8 #8 0x00007f90b7c23424 in g_malloc0 (n_bytes=18446744072098939136) at gmem.c:129 129 g_error ("%s: failed to allocate %"G_GSIZE_FORMAT" bytes", $ coredumpctl gdb 4089 [...] (gdb) frame 8 #8 0x00007fdc75d03424 in g_malloc0 (n_bytes=18446744072098939136) at gmem.c:129 129 g_error ("%s: failed to allocate %"G_GSIZE_FORMAT" bytes", I only experience it when switching with the KVM. I'm glad to remove the patch and run with debug if needed. 
Unfortunately when devs ask for a reliable reproducer what they really mean is "tell me a series of steps I can perform to make this crash every time, on demand, on my system (or a virtual machine)" - as a developer you really kinda need this in order to dig into the crash. So just knowing that it crashes regularly (but not every time) for you on your hardware isn't exactly what they're looking for. But thanks for the info - if you *don't* have a way to reproduce it on demand, then you don't, we'll just have to deal with it somehow :)

*** Bug 1511198 has been marked as a duplicate of this bug. ***
*** Bug 1519166 has been marked as a duplicate of this bug. ***

(In reply to Adam Williamson from comment #19)
> Unfortunately when devs ask for a reliable reproducer what they really mean
> is "tell me a series of steps I can perform to make this crash every time,
> on demand, on my system (or a virtual machine)" - as a developer you really
> kinda need this in order to dig into the crash. So just knowing that it
> crashes regularly (but not every time) for you on your hardware isn't
> exactly what they're looking for. But thanks for the info - if you *don't*
> have a way to reproduce it on demand, then you don't, we'll just have to
> deal with it somehow :)

Sorry, but I don't have any repeatable steps other than it seems to occur when using the KVM switch for me (although only about 5% of the time).

*** Bug 1527141 has been marked as a duplicate of this bug. ***

abrt reported my bug 1529400 as a duplicate of bug 1510059, but reading through that bug I think it's really a duplicate of this bug 1526164:

    # cat /var/spool/abrt/ccpp-2017-12-27-15\:04\:39.236206-1587/backtrace
    [...]
    #6  0x00007f911c44ba7d in g_logv (log_domain=0x7f911c48cfee "GLib", log_level=G_LOG_LEVEL_ERROR, format=<optimized out>, args=args@entry=0x7ffdc5e172a0) at gmessages.c:1341
            domain = 0x0
            data = 0x0
            depth = 1
            log_func = 0x557a3cfd3a90 <default_log_handler>
            domain_fatal_mask = <optimized out>
            masquerade_fatal = 0
            test_level = 6
            was_fatal = 0
            was_recursion = 0
            msg = 0x557a45712f00 "gmem.c:130: failed to allocate 18446744072098939136 bytes"
            msg_alloc = 0x557a45712f00 "gmem.c:130: failed to allocate 18446744072098939136 bytes"
            i = 2

*** Bug 1529400 has been marked as a duplicate of this bug. ***

Christian: yep, indeed - good detective work.

I've been running with the patch mentioned in comment #14 for a few weeks now and I've not experienced the issue with the patch.

Thanks a lot - we should probably pass that along on the upstream bug.

*** Bug 1540857 has been marked as a duplicate of this bug. ***

Note: I recently `dnf update`d which brought in gnome-shell-3.26.2-4 and overwrote my patch from bug788908 and the issue started happening again. I see that they're still considering the best way to fix the bug because the previous patch is suboptimal (although I haven't seen any issues), and they can reproduce the issue, so I guess we'll just have to wait for the ultimate fix.

Bug 1548768 has the magic number 18446744072098939136

*** Bug 1565745 has been marked as a duplicate of this bug. ***

https://gitlab.gnome.org/GNOME/gnome-shell/merge_requests/49 is a PR which has been tested and reported to fix this. GNOME folks, can we please get this merged and backported to Fedora releases? I just keep finding more dupes of this bug.

*** Bug 1558663 has been marked as a duplicate of this bug. ***

Florian just merged this upstream as f6a08472a0fef65f . But we still need a plan to get it into f27, f28 and Rawhide.

mutter-3.26.2-3.fc27 gnome-shell-3.26.2-5.fc27 has been submitted as an update to Fedora 27.
https://bodhi.fedoraproject.org/updates/FEDORA-2018-cb46ea0702

gnome-shell-3.26.2-5.fc27, mutter-3.26.2-3.fc27 has been pushed to the Fedora 27 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2018-cb46ea0702

gnome-shell-3.26.2-5.fc27, mutter-3.26.2-3.fc27 has been pushed to the Fedora 27 stable repository. If problems still persist, please make note of it in this bug report.

I am running Fedora 28, and just had the below which referred me to bug 1516633 which referred me to this one.

    --- Running report_uReport ---
    ('report_uReport' completed successfully)
    --- Running analyze_CCpp ---
    Ok to upload core dump? (It may contain sensitive data). If your answer is 'No', a stack trace will be generated locally. (It may download a huge amount of data). 'YES'
    Querying server settings
    Retrace server can not be used, because the crash is too large. Try local retracing.
    The size of your crash is 1.5 GiB, but the retrace server only accepts crashes smaller or equal to 1.2 GiB.
    Do you want to generate a stack trace locally? (It may download a huge amount of data but reporting can't continue without stack trace). 'YES'
    Analyzing coredump 'coredump'
    Missing build id: libnvidia-glcore.so.390.77
    Missing build id: libnvidia-tls.so.390.77
    Missing build id: libGLX_nvidia.so.0
    Missing build id: libnvidia-egl-wayland.so.1
    Missing build id: libnvidia-glsi.so.390.77
    Missing build id: libEGL_nvidia.so.0
    Coredump references 201 debuginfo files, 7 of them are not installed
    Initializing package manager
    Setting up repositories
    Looking for needed packages in repositories
    Packages to download: 5
    Downloading 3.05Mb, installed size: 11.93Mb. Continue? 'YES'
    Downloading (1 of 5) gdm-debuginfo-3.28.4-1.fc28.x86_64.rpm: 100%
    Extracting cpio from /var/tmp/dnf-abrt-j5l_k0ey/updates-debuginfo-31001629c662a347/packages/gdm-debuginfo-3.28.4-1.fc28.x86_64.rpm
    Caching files from unpacked.cpio made from gdm-debuginfo-3.28.4-1.fc28.x86_64.rpm
    Downloading (2 of 5) pango-debuginfo-1.42.4-1.fc28.x86_64.rpm: 100%
    Extracting cpio from /var/tmp/dnf-abrt-j5l_k0ey/updates-debuginfo-31001629c662a347/packages/pango-debuginfo-1.42.4-1.fc28.x86_64.rpm
    Caching files from unpacked.cpio made from pango-debuginfo-1.42.4-1.fc28.x86_64.rpm
    Downloading (3 of 5) libxcrypt-debuginfo-4.1.2-1.fc28.x86_64.rpm: 100%
    Extracting cpio from /var/tmp/dnf-abrt-j5l_k0ey/updates-debuginfo-31001629c662a347/packages/libxcrypt-debuginfo-4.1.2-1.fc28.x86_64.rpm
    Caching files from unpacked.cpio made from libxcrypt-debuginfo-4.1.2-1.fc28.x86_64.rpm
    Downloading (4 of 5) gnutls-debuginfo-3.6.3-4.fc28.x86_64.rpm: 100%
    Extracting cpio from /var/tmp/dnf-abrt-j5l_k0ey/updates-debuginfo-31001629c662a347/packages/gnutls-debuginfo-3.6.3-4.fc28.x86_64.rpm
    Caching files from unpacked.cpio made from gnutls-debuginfo-3.6.3-4.fc28.x86_64.rpm
    Downloading (5 of 5) gnome-bluetooth-libs-debuginfo-3.28.2-1.fc28.x86_64.rpm: 100%
    Extracting cpio from /var/tmp/dnf-abrt-j5l_k0ey/updates-debuginfo-31001629c662a347/packages/gnome-bluetooth-libs-debuginfo-3.28.2-1.fc28.x86_64.rpm
    Caching files from unpacked.cpio made from gnome-bluetooth-libs-debuginfo-3.28.2-1.fc28.x86_64.rpm
    Removing /var/tmp/abrt-tmp-debuginfo.wcrkuR
    All debuginfo files are available
    Generating backtrace
    Backtrace is generated and saved, 84451 bytes
    --- Running analyze_BodhiUpdates ---
    Looking for similar problems in bugzilla
    Duplicate bugzilla bug '#1516633' was found
    Searching for updates
    No updates for this package found

I am getting pointed to bug #1516633 which is supposedly a duplicate of this bug. I am running Fedora 28.
I get this issue almost every time I log on after my PC auto locks once the screen goes dark (I guess we call it a screensaver).

This still happens to me on Fedora 30. ABRT points me to #1516633.

Steps to reproduce:
1. Start a YouTube video in Firefox
2. Lock your screen

When you come back to your system, your gnome-session is gone, and on logging in you are launched into a new session.

The bug Louis is hitting is not actually the same as this one. I don't know if Dennis is hitting the same bug as Louis, though from the description it sounds like it might be.

abrt's duplicate detection is somewhat broken because of the path we crash on here: glib has this odd function which basically explicitly logs an error message and then crashes on purpose. So when anything uses that, at least the first several frames of the backtrace look the same, even though the bit of code where the actual error condition is triggered might be completely different.

So in the initial bug here, the error message was "failed to allocate 18446744072098939136 bytes", and all the bugs I marked as dupes - including 1516633 - had that same message in the logs from the original reporter. The message Louis is hitting is "Creating pipes for GWakeup: Too many open files". So it's definitely not the same bug.

Louis, can you please file a new bug and include your backtrace and system log files from around the time of the crash? Can you also see if Dennis' reproducer triggers the bug for you? Dennis, can you check your backtrace and/or system logs and see if you are also seeing the same "Creating pipes for GWakeup: Too many open files"? If so, you'll want to follow the same bug report as Louis, otherwise file a separate one for yourself. Thanks!

(In reply to Adam Williamson from comment #42)
> The bug Louis is hitting is not actually the same as this one. I don't know
> if Dennis is hitting the same bug as Louis, though from the description it
> sounds like it might be.
>
> abrt's duplicate detection is somewhat broken because of the path we crash
> on here: glib has this odd function which basically explicitly logs an error
> message and then crashes on purpose. So when anything uses that, at least
> the first several frames of the backtrace look the same, even though the bit
> of code where the actual error condition is triggered might be completely
> different.
>
> So in the initial bug here, the error message was "failed to allocate
> 18446744072098939136 bytes", and all the bugs I marked as dupes - including
> 1516633 - had that same message in the logs from the original reporter. The
> message Louis is hitting is "Creating pipes for GWakeup: Too many open
> files". So it's definitely not the same bug.
>
> Louis, can you please file a new bug and include your backtrace and system
> log files from around the time of the crash? Can you also see if Dennis'
> reproducer triggers the bug for you? Dennis, can you check your backtrace
> and/or system logs and see if you are also see the same "Creating pipes for
> GWakeup: Too many open files"? If so, you'll want to follow the same bug
> report as Louis, otherwise file a separate one for yourself. Thanks!

Error message was different for me: reported https://bugzilla.redhat.com/show_bug.cgi?id=1823445
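The triage rule from comment #42 is to compare the logged error message rather than the top backtrace frames, since those frames look alike for every crash that goes through glib's deliberate abort path. A rough sketch of that rule, for anyone sorting their own crash before commenting here: the function name and return labels are made up for illustration, and the two message strings are the ones quoted in this report.

```python
# Sort a crash by the logged GLib error message instead of by backtrace frames.
# The two strings below are quoted in this bug; everything else is hypothetical.

BLUR_ALLOC_MESSAGE = "failed to allocate 18446744072098939136 bytes"
GWAKEUP_MESSAGE = "Creating pipes for GWakeup: Too many open files"


def classify_crash(log_text: str) -> str:
    """Return a rough triage label for a backtrace or journal excerpt."""
    if BLUR_ALLOC_MESSAGE in log_text:
        return "same as bug 1526164"
    if GWAKEUP_MESSAGE in log_text:
        return "different bug (GWakeup: too many open files)"
    return "unknown, compare the error message by hand"


sample = 'msg = 0x557a45712f00 "gmem.c:130: failed to allocate 18446744072098939136 bytes"'
print(classify_crash(sample))  # same as bug 1526164
```

Feeding it the `msg =` line from an abrt backtrace file or the matching journal line is enough; the frame addresses and the gmem.c line number differ between reports and are deliberately ignored.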