Bug 1371517 - eu-stack killed by SIGABRT processing gcore created core file
Summary: eu-stack killed by SIGABRT processing gcore created core file
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: elfutils
Version: 7.4
Hardware: x86_64
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: alpha
Target Release: 7.4
Assignee: Mark Wielaard
QA Contact: qe-baseos-tools-bugs
URL: https://retrace.fedoraproject.org/faf...
Whiteboard: abrt_hash:d5718852e850010617546bd78ad...
Depends On: 1365812
Blocks: 1260074 1371380
Reported: 2016-08-30 11:55 UTC by Matej Habrnal
Modified: 2017-08-01 22:06 UTC (History)

Fixed In Version: elfutils-0.168-5.el7
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1365812
Environment:
Last Closed: 2017-08-01 22:06:37 UTC
Target Upstream Version:
Embargoed:


Links
System: Red Hat Product Errata
ID: RHEA-2017:2020
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: elfutils bug fix update
Last Updated: 2017-08-01 19:31:43 UTC

Description Matej Habrnal 2016-08-30 11:55:29 UTC
+++ This bug was initially created as a clone of Bug #1365812 +++

Description of problem:
I ran eu-stack with a core dump file generated by gcore.

$ sleep 1000 &
$ SLEEP_PID=$!
$ gcore $SLEEP_PID
$ eu-stack --executable=/usr/bin/sleep --core=core.$SLEEP_PID
eu-stack: link_map.c:846: dwfl_link_map_report: Assertion `in.d_size == phnum * phent' failed.
Aborted (core dumped)

Version-Release number of selected component:
elfutils-0.166-2.fc25

Additional info:
reporter:       libreport-2.7.2.6.g6ac1
backtrace_rating: 3
cmdline:        eu-stack --executable=/usr/bin/sleep --core=core.6984
executable:     /usr/bin/eu-stack
global_pid:     7060
kernel:         4.7.0-0.rc7.git4.2.fc25.x86_64
pkg_vendor:     Fedora Project
runlevel:       N 5
type:           CCpp
uid:            18601

Truncated backtrace:
Thread no. 1 (7 frames)
 #25 ??
 #26 dwfl_link_map_report at link_map.c:846
 #27 dwfl_core_file_report at core-file.c:531
 #28 parse_opt at stack.c:595
 #29 parser_parse_arg at argp-parse.c:716
 #30 parser_parse_next at argp-parse.c:865
 #31 __argp_parse at argp-parse.c:921

--- Additional comment from Jakub Filak on 2016-08-10 05:11:01 EDT ---



--- Additional comment from Jakub Filak on 2016-08-10 05:11:02 EDT ---



--- Additional comment from Jakub Filak on 2016-08-10 05:11:04 EDT ---



--- Additional comment from Jakub Filak on 2016-08-10 05:11:06 EDT ---



--- Additional comment from Jakub Filak on 2016-08-10 05:11:08 EDT ---



--- Additional comment from Jakub Filak on 2016-08-10 05:11:09 EDT ---



--- Additional comment from Jakub Filak on 2016-08-10 05:11:11 EDT ---



--- Additional comment from Jakub Filak on 2016-08-10 05:11:13 EDT ---



--- Additional comment from Jakub Filak on 2016-08-10 05:11:14 EDT ---



--- Additional comment from Jakub Filak on 2016-08-10 05:11:16 EDT ---



--- Additional comment from Jakub Filak on 2016-08-10 05:11:18 EDT ---



--- Additional comment from Jakub Filak on 2016-08-10 05:11:19 EDT ---



--- Additional comment from Matej Habrnal on 2016-08-10 05:30:56 EDT ---

It is possible to generate a backtrace from such a core dump using gdb:

$ gdb -batch -ex 'file /usr/bin/sleep' -ex "core-file core.$SLEEP_PID" -ex "bt"
[New LWP 22760]
Core was generated by `sleep'.
#0  0x00007fe59479c810 in __nanosleep_nocancel () at ../sysdeps/unix/syscall-template.S:84
84	T_PSEUDO (SYSCALL_SYMBOL, SYSCALL_NAME, SYSCALL_NARGS)
#0  0x00007fe59479c810 in __nanosleep_nocancel () at ../sysdeps/unix/syscall-template.S:84
#1  0x000055d61b75845f in rpl_nanosleep ()
#2  0x000055d61b7582c0 in xnanosleep ()
#3  0x000055d61b75587d in main ()

--- Additional comment from Jan Kratochvil on 2016-08-10 07:55:09 EDT ---

The core file generated by gcore is bogus so it is rather a GDB bug:
3    AT_PHDR              Program headers for program    0x55848c2b5040
9    AT_ENTRY             Entry point of program         0x55848c2b6940
^^^ it points to nowhere:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  LOAD           0x0000000000000df8 0x000055848c4bb000 0x0000000000000000
                 0x0000000000001000 0x0000000000001000  R      1
  LOAD           0x0000000000001df8 0x000055848c4bc000 0x0000000000000000
                 0x0000000000001000 0x0000000000001000  RW     1
  LOAD           0x0000000000002df8 0x000055848e3b7000 0x0000000000000000
                 0x0000000000021000 0x0000000000021000  RW     1
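
The "points to nowhere" observation can be checked mechanically. The following is a minimal sketch, assuming the three LOAD entries quoted above are the segments of interest; the struct and function names are hypothetical and are not part of elfutils or GDB. It tests whether an address falls inside any segment's [VirtAddr, VirtAddr + MemSiz) range.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One PT_LOAD entry as printed by readelf: start address and size in memory.
   Hypothetical type for illustration only.  */
struct load_segment
{
  uint64_t vaddr;
  uint64_t memsz;
};

/* The three LOAD segments quoted in the comment above.  */
static const struct load_segment core_segments[] =
{
  { 0x000055848c4bb000ULL, 0x1000ULL },
  { 0x000055848c4bc000ULL, 0x1000ULL },
  { 0x000055848e3b7000ULL, 0x21000ULL },
};

/* Return true if ADDR lies inside [vaddr, vaddr + memsz) of any segment.  */
static bool
addr_is_mapped (const struct load_segment *segs, size_t nsegs, uint64_t addr)
{
  assert (segs != NULL || nsegs == 0);
  for (size_t i = 0; i < nsegs; i++)
    if (addr >= segs[i].vaddr && addr - segs[i].vaddr < segs[i].memsz)
      return true;
  return false;
}
```

With these values, neither AT_PHDR (0x55848c2b5040) nor AT_ENTRY (0x55848c2b6940) is covered by a quoted segment, since both lie below the first LOAD address.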

--- Additional comment from Jan Kratochvil on 2016-08-10 15:50:35 EDT ---

A workaround is:
(gdb) shell cat /proc/20477/coredump_filter
00000033
(gdb) shell echo 0x37 >/proc/20477/coredump_filter
(gdb) shell cat /proc/20477/coredump_filter
00000037
(gdb) gcore /tmp/sleep.core
(gdb) shell echo 0x33 >/proc/20477/coredump_filter

This is not a regression: before 'set use-coredump-filter' (gdb >= 7.10), GDB also never dumped pages of binary code.
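
For reference, the only difference between 0x33 and 0x37 is bit 2, which core(5) documents as "dump file-backed private mappings". A minimal sketch of decoding the value read from /proc/PID/coredump_filter follows; the helper names are hypothetical and chosen for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Bit 2 of coredump_filter per core(5): dump file-backed private mappings.  */
#define FILTER_FILE_PRIVATE (1UL << 2)

/* Parse the hexadecimal text read from /proc/PID/coredump_filter,
   for example "00000033".  Hypothetical helper.  */
static unsigned long
parse_coredump_filter (const char *text)
{
  assert (text != NULL);
  return strtoul (text, NULL, 16);
}

/* Return true if the filter asks for file-backed private mappings,
   which is where the pages of the binary itself live.  */
static bool
dumps_file_backed_private (unsigned long filter)
{
  return (filter & FILTER_FILE_PRIVATE) != 0;
}
```

Under this reading, "00000033" leaves the bit clear and "00000037" sets it, which matches the workaround of temporarily writing 0x37 before running gcore.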

--- Additional comment from Mark Wielaard on 2016-08-10 18:22:00 EDT ---

Please create a new bug, or clone this bug for gdb if you want to fix the bogus core file creation by gcore. It would certainly be nice to fix that. But the original bug is real. eu-stack does crash and it shouldn't, even on a bogus core file. The assert should be fixed by some other sanity check that doesn't cause eu-stack to abort.

--- Additional comment from Mark Wielaard on 2016-08-11 17:19:03 EDT ---

So the assert is actually a good thing. It shows that something was wrong with our assumption that in.d_size == phnum * phent, the value we had previously set in.d_size to explicitly. The core reading code reset it (to zero) when it detected an error in the core file. That is why we now try to reread the phdrs from the executable, reusing the same buffer and size. So instead of asserting that the size is as expected, all we really need to do is actually set the expected size:

diff --git a/libdwfl/link_map.c b/libdwfl/link_map.c
index 28d7382..604be1b 100644
--- a/libdwfl/link_map.c
+++ b/libdwfl/link_map.c
@@ -843,7 +843,10 @@ dwfl_link_map_report (Dwfl *dwfl, const void *auxv, size_t auxv_size,
                }
              off_t off = ehdr->e_phoff;
              assert (in.d_buf == NULL);
-             assert (in.d_size == phnum * phent);
+             /* Note this in the !in_ok path.  That means the memory_callback
+                failed.  But the callback might still have reset the in.d_size
+                value (to zero).  So explicitly set it here again.  */
+             in.d_size = phnum * phent;
              in.d_buf = malloc (in.d_size);
              if (unlikely (in.d_buf == NULL))
                {

And with that we get:

$ LD_LIBRARY_PATH=backends:libelf:libdw src/stack --core=core.$SLEEP_PID --exec=/bin/sleep
PID 9603 - core
TID 9603:
#0  0x00007f5185837810 __nanosleep
#1  0x000055bc65b7f45f rpl_nanosleep
#2  0x000055bc65b7f2c0 xnanosleep
#3  0x000055bc65b7c87d main
#4  0x00007f518578f731 __libc_start_main
#5  0x000055bc65b7c969 _start
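
The pattern behind the patch can be shown in isolation. This is a simplified sketch, not the libdwfl code: struct phdr_buf is a hypothetical stand-in for the real data descriptor, failing_memory_callback is a made-up callback that mimics the failure, and the overflow guard is an addition of this sketch that is not in the patch above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the buffer descriptor; not the real Elf_Data.  */
struct phdr_buf
{
  void *d_buf;
  size_t d_size;
};

/* Hypothetical callback that fails and, as a side effect, zeroes d_size.
   This models what the core reading code did on the bogus core file.  */
static bool
failing_memory_callback (struct phdr_buf *in)
{
  in->d_buf = NULL;
  in->d_size = 0;
  return false;
}

/* Allocate room for PHNUM program headers of PHENT bytes each.  The size
   is recomputed here instead of being asserted, because a failed callback
   may have clobbered it.  */
static bool
alloc_phdr_buf (struct phdr_buf *in, size_t phnum, size_t phent)
{
  assert (in->d_buf == NULL);	/* This invariant still holds.  */
  if (phent != 0 && phnum > SIZE_MAX / phent)
    return false;		/* Overflow guard, added in this sketch only.  */
  in->d_size = phnum * phent;
  in->d_buf = malloc (in->d_size);
  return in->d_buf != NULL;
}
```

The point of the design is that a value derived from untrusted input is re-established at the point of use, so a malformed core file leads to an ordinary error path rather than an abort.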

--- Additional comment from Mark Wielaard on 2016-08-12 05:55:32 EDT ---

Patch suggested upstream:
https://lists.fedorahosted.org/archives/list/elfutils-devel@lists.fedorahosted.org/message/UP3NBBN7D3DTIABVJEQTGIRDA4HO5D7L/

--- Additional comment from Fedora Update System on 2016-08-26 10:59:55 EDT ---

elfutils-0.167-1.fc25 has been submitted as an update to Fedora 25. https://bodhi.fedoraproject.org/updates/FEDORA-2016-de1f4e692b

--- Additional comment from Fedora Update System on 2016-08-27 08:52:33 EDT ---

elfutils-0.167-1.fc25 has been pushed to the Fedora 25 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2016-de1f4e692b

Comment 2 Mark Wielaard 2016-12-01 18:40:31 UTC
The patch for this is already upstream and in Fedora elfutils-0.167-1.

Comment 4 Martin Cermak 2017-05-03 13:55:22 UTC
Reproduced on rhel-7.3 and f25, verified on rhel-7.4 using elfutils-0.168-5.el7.

Comment 5 errata-xmlrpc 2017-08-01 22:06:37 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:2020

