Bug 1787426 - bind-9.11.4-9.P2.el7.ppc64 SIGSEGV Crash
Summary: bind-9.11.4-9.P2.el7.ppc64 SIGSEGV Crash
Keywords:
Status: CLOSED DUPLICATE of bug 1779589
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: bind
Version: 7.7
Hardware: All
OS: All
Priority: unspecified
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Petr Menšík
QA Contact: qe-baseos-daemons
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2020-01-02 19:53 UTC by Anthony Zone
Modified: 2023-09-12 02:16 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-02-27 14:52:13 UTC
Target Upstream Version:
Embargoed:



Description Anthony Zone 2020-01-02 19:53:52 UTC
Description of problem:
The customer's system, running bind-9.11.4-9.P2.el7.ppc64, randomly crashes with a SIGSEGV.  This is not seen when they roll back to bind-9.9.4-74.el7_6.1.ppc64.

Version-Release number of selected component (if applicable):
bind-9.11.4-9.P2.el7.ppc64

How reproducible:

The customer cannot reproduce the crash on demand; named runs for a day or two and then segfaults.


Steps to Reproduce:
1. Install bind-9.11.4-9.P2.el7.ppc64
2. unknown
3. Coredump

Actual results:

bind-9.11.4-9.P2.el7.ppc64 dumps core with signal 11 after an unknown trigger.

Expected results:

Continues to run and doesn't crash.

Additional info:

Looking at the core file, we see a null pointer dereference:

(gdb) bt
#0  ttl_sooner (v1=0x0, v2=0x3fff219f0280) at ../../../lib/dns/rbtdb.c:1127
#1  0x00003fff78cf15bc in isc_heap_delete (heap=0x3fff6c501278, idx=<optimized out>) at ../../../lib/isc/heap.c:233
#2  0x00003fff791a75e8 in free_rdataset (rdataset=0x3fff219f0280, mctx=<optimized out>, rbtdb=0x3fff6c554010)
    at ../../../lib/dns/rbtdb.c:1721
#3  clean_stale_headers (top=0x3fff421834f0, mctx=<optimized out>, rbtdb=0x3fff6c554010) at ../../../lib/dns/rbtdb.c:1805
#4  clean_cache_node (node=0x3fff6c582780, rbtdb=0x3fff6c554010) at ../../../lib/dns/rbtdb.c:1822
#5  decrement_reference (rbtdb=rbtdb@entry=0x3fff6c554010, node=node@entry=0x3fff6c582780, least_serial=least_serial@entry=0, 
    nlock=nlock@entry=isc_rwlocktype_read, tlock=tlock@entry=isc_rwlocktype_none, pruning=pruning@entry=isc_boolean_false)
    at ../../../lib/dns/rbtdb.c:2254
#6  0x00003fff791a9cc0 in detachnode (db=0x3fff6c554010, targetp=targetp@entry=0x3fff70f0e020) at ../../../lib/dns/rbtdb.c:5523
#7  0x00003fff791a9f6c in rdataset_disassociate (rdataset=<optimized out>) at ../../../lib/dns/rbtdb.c:8783
#8  0x00003fff792173d0 in dns_rdataset_disassociate (rdataset=<optimized out>) at ../../../lib/dns/rdataset.c:116
#9  0x00003fff79116980 in free_adbfetch (adb=0x3fff6c280010, fetch=<synthetic pointer>) at ../../../lib/dns/adb.c:1963
#10 fetch_callback (task=<optimized out>, ev=0x3fff373c70a0) at ../../../lib/dns/adb.c:3994
#11 0x00003fff78d18304 in dispatch (manager=0x3fff77ff7010) at ../../../lib/isc/task.c:1141
#12 run (uap=0x3fff77ff7010) at ../../../lib/isc/task.c:1313
#13 0x00003fff7894cafc in start_thread (arg=0x3fff70f0f0b0) at pthread_create.c:309
#14 0x00003fff78436f4c in .__clone () at ../sysdeps/unix/sysv/linux/powerpc/powerpc64/clone.S:104

(gdb) f 0
#0  ttl_sooner (v1=0x0, v2=0x3fff219f0280) at ../../../lib/dns/rbtdb.c:1127
1127            return (ISC_TF(h1->rdh_ttl < h2->rdh_ttl));
(gdb) p h1
$3 = (rdatasetheader_t *) 0x0
(gdb) p h2
$4 = (rdatasetheader_t *) 0x3fff219f0280
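
Frames #0 and #1 above show the heap code handing a NULL element to the TTL comparator, which then reads h1->rdh_ttl through a null pointer. The following is a minimal sketch of that failure shape, not BIND code: toy_header_t, toy_ttl_sooner and toy_heap_find_null are hypothetical names, and the 1-based slot layout is an assumption made for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for rdatasetheader_t; only the TTL field is modeled. */
typedef struct toy_header {
    uint32_t rdh_ttl;
} toy_header_t;

/* Same shape as the comparator in frame #0: it trusts both pointers. */
static bool toy_ttl_sooner(void *v1, void *v2) {
    toy_header_t *h1 = v1;
    toy_header_t *h2 = v2;
    return h1->rdh_ttl < h2->rdh_ttl; /* faults if h1 or h2 is NULL */
}

/*
 * Hypothetical diagnostic: scan slots 1..last of a 1-based heap array and
 * return the index of the first NULL entry, or 0 if every slot is populated.
 */
static size_t toy_heap_find_null(void **array, size_t last) {
    assert(array != NULL);
    for (size_t i = 1; i <= last; i++) {
        if (array[i] == NULL) {
            return i;
        }
    }
    return 0;
}
```

The comparator itself is not at fault here; it is only where the damage becomes visible. A NULL slot means the heap contents were already inconsistent by the time the delete ran, which is why a check like toy_heap_find_null would help locate the bad slot in a core file but would not be a fix.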

Comment 3 Petr Menšík 2020-01-09 19:50:44 UTC
This crash appears to match a recently fixed upstream issue, solved by merge request [1]. We were unable to figure out why only the ppc64le platform seems to be affected by those issues, but very similar crashes were noticed in RHEL 8, tracked in bug #1740511.

1. https://gitlab.isc.org/isc-projects/bind9/merge_requests/2703

Comment 4 Miroslav Lichvar 2020-01-27 12:15:43 UTC
This does look like a duplicate of bug #1779589 (and RHEL8 bug #1740511).

There is a potential fix that modifies the memory order of some atomic operations. Could you please test the packages from the following build?

http://people.redhat.com/~mlichvar/tmp/bind-1779589/
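
For readers unfamiliar with what "memory order of some atomic operations" refers to, here is a general C11 illustration. It is not the patch in the build above, whose contents are not described in this bug; toy_entry_t, toy_publish and toy_consume are hypothetical names. On weakly ordered CPUs such as POWER, a relaxed store of a pointer may become visible to another thread before the plain writes that filled in the object, so a publisher typically uses a release store and the reader an acquire load.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical record; stands in for any object shared between threads. */
typedef struct toy_entry {
    unsigned int ttl;
} toy_entry_t;

/* Fill in the entry, then publish its address with release ordering so the
 * plain write to entry->ttl cannot be observed after the pointer store. */
static void toy_publish(_Atomic(toy_entry_t *) *slot, toy_entry_t *entry,
                        unsigned int ttl) {
    assert(slot != NULL && entry != NULL);
    entry->ttl = ttl;
    atomic_store_explicit(slot, entry, memory_order_release);
}

/* Read the pointer with acquire ordering; if it is non-NULL, the fields
 * written before the matching release store are visible to this thread. */
static toy_entry_t *toy_consume(_Atomic(toy_entry_t *) *slot) {
    assert(slot != NULL);
    return atomic_load_explicit(slot, memory_order_acquire);
}
```

If both calls used memory_order_relaxed instead, the code would usually still work on x86, which could be one reason an ordering bug shows up on a single architecture; whether that applies to this crash is not established in this report.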

Comment 6 Tomáš Hozza 2020-02-27 14:52:13 UTC
We believe that this bug is a duplicate of Bug #1779589. However, since we do not have a reproducer, we cannot be 100% sure. Please reopen if resolving Bug #1779589 has no effect.

*** This bug has been marked as a duplicate of bug 1779589 ***

Comment 7 Red Hat Bugzilla 2023-09-12 02:16:16 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days

