Bug 1472954 - Hang while waiting for mutex lock
Summary: Hang while waiting for mutex lock
Keywords:
Status: CLOSED DUPLICATE of bug 1470352
Alias: None
Product: Fedora
Classification: Fedora
Component: nss
Version: 26
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Kai Engert (:kaie) (inactive account)
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2017-07-19 16:27 UTC by Jonathan Lebon
Modified: 2017-07-19 16:47 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-07-19 16:47:53 UTC
Type: Bug
Embargoed:



Description Jonathan Lebon 2017-07-19 16:27:48 UTC
Description of problem:

rpm-ostree hangs while waiting for repos to be updated. The backtrace shows libnss trying to lock a mutex that is already locked:

(gdb) bt
#0  0x00007f0fc8e3dfad in __lll_lock_wait () from /host/lib64/libpthread.so.0
#1  0x00007f0fc8e36f44 in pthread_mutex_lock () from /host/lib64/libpthread.so.0
#2  0x00007f0fc1fc51b9 in PR_Lock (lock=0x7f0fb45e32c0) at ../../../nspr/pr/src/pthreads/ptsynch.c:177
#3  0x00007f0fc4e3ccb5 in nssSlot_IsTokenPresent () from /host/lib64/libnss3.so
#4  0x00007f0fc4e3cee6 in nssSlot_GetToken () from /host/lib64/libnss3.so
#5  0x00007f0fc4e36d4d in nssTrustDomain_FindCertificatesBySubject () from /host/lib64/libnss3.so
#6  0x00007f0fc4e35b07 in nssCertificate_BuildChain () from /host/lib64/libnss3.so
#7  0x00007f0fc4deec36 in CERT_FindCertIssuer () from /host/lib64/libnss3.so
#8  0x00007f0fc4def13f in cert_VerifyCertChain () from /host/lib64/libnss3.so
#9  0x00007f0fc4defac9 in CERT_VerifyCertChain () from /host/lib64/libnss3.so
#10 0x00007f0fc4df0909 in cert_VerifyCertWithFlags () from /host/lib64/libnss3.so
#11 0x00007f0fc4df0bc2 in CERT_VerifyCert () from /host/lib64/libnss3.so
#12 0x00007f0fc2a5f55d in SSL_AuthCertificate () from /host/lib64/libssl3.so
#13 0x00007f0fc2a56950 in ssl3_AuthCertificate () from /host/lib64/libssl3.so
#14 0x00007f0fc2a57028 in ssl3_CompleteHandleCertificate () from /host/lib64/libssl3.so
#15 0x00007f0fc2a595b6 in ssl3_HandleHandshakeMessage () from /host/lib64/libssl3.so
#16 0x00007f0fc2a5cd0a in ssl3_HandleRecord () from /host/lib64/libssl3.so
#17 0x00007f0fc2a5ea00 in ssl3_GatherCompleteHandshake () from /host/lib64/libssl3.so
#18 0x00007f0fc2a64e69 in SSL_ForceHandshake () from /host/lib64/libssl3.so
#19 0x00007f0fc63fdef5 in nss_connect_common () from /host/lib64/libcurl.so.4
#20 0x00007f0fc63fa550 in Curl_ssl_connect_nonblocking () from /host/lib64/libcurl.so.4
#21 0x00007f0fc63aefd2 in https_connecting () from /host/lib64/libcurl.so.4
#22 0x00007f0fc63d5f36 in multi_runsingle () from /host/lib64/libcurl.so.4
#23 0x00007f0fc63d6fb3 in curl_multi_perform () from /host/lib64/libcurl.so.4
#24 0x00007f0fca4c56fe in lr_download () from /host/lib64/librepo.so.0
#25 0x00007f0fca4c5ce1 in lr_download_single_cb () from /host/lib64/librepo.so.0
#26 0x00007f0fca4d3b0a in lr_yum_perform () from /host/lib64/librepo.so.0
#27 0x00007f0fca4cabb9 in lr_handle_perform () from /host/lib64/librepo.so.0
#28 0x00007f0fcade5c4e in dnf_repo_update (repo=repo@entry=0x7f0fb422a6c0, flags=flags@entry=DNF_REPO_UPDATE_FLAG_FORCE, state=state@entry=0x7f0fb4525c50, error=error@entry=0x7f0fbcb20ca0)
    at /usr/src/debug/rpm-ostree-2017.7/libdnf/libdnf/dnf-repo.c:1609
#29 0x000055f0dcea473d in rpmostree_context_download_metadata (self=self@entry=0x7f0fb40048a0, cancellable=cancellable@entry=0x55f0de354480, error=error@entry=0x7f0fbcb20ca0)
    at src/libpriv/rpmostree-core.c:956
#30 0x000055f0dcea509e in rpmostree_context_prepare (self=self@entry=0x7f0fb40048a0, cancellable=cancellable@entry=0x55f0de354480, error=error@entry=0x7f0fbcb20ca0)
    at src/libpriv/rpmostree-core.c:1488
#31 0x000055f0dcec8940 in do_local_assembly (error=0x7f0fbcb20ca0, cancellable=0x55f0de354480, self=0x55f0de303ef0) at src/daemon/rpmostree-sysroot-upgrader.c:849
#32 maybe_do_local_assembly (error=0x7f0fbcb20ca0, cancellable=0x55f0de354480, self=0x55f0de303ef0) at src/daemon/rpmostree-sysroot-upgrader.c:971
#33 rpmostree_sysroot_upgrader_deploy (self=self@entry=0x55f0de303ef0, cancellable=cancellable@entry=0x55f0de354480, error=error@entry=0x7f0fbcb20ca0) at src/daemon/rpmostree-sysroot-upgrader.c:994
#34 0x000055f0dcec3464 in deploy_transaction_execute (transaction=0x55f0de3580a0, cancellable=0x55f0de354480, error=0x7f0fbcb20ca0) at src/daemon/rpmostreed-transaction-types.c:864
#35 0x000055f0dceba979 in transaction_execute_thread (task=0x55f0de30d380, source_object=<optimized out>, task_data=<optimized out>, cancellable=0x55f0de354480)
    at src/daemon/rpmostreed-transaction.c:296
#36 0x00007f0fc9ef8086 in g_task_thread_pool_thread () from /host/lib64/libgio-2.0.so.0
#37 0x00007f0fc997af00 in g_thread_pool_thread_proxy () from /host/lib64/libglib-2.0.so.0
#38 0x00007f0fc997a536 in g_thread_proxy () from /host/lib64/libglib-2.0.so.0
#39 0x00007f0fc8e3436d in start_thread () from /host/lib64/libpthread.so.0
#40 0x00007f0fc8b6cb8f in clone () from /host/lib64/libc.so.6

The mutex is held by the very thread that is now trying to lock it again, so the thread deadlocks on itself. One can poke around pthread internals to verify this (see also https://stackoverflow.com/a/3491304/308136):

(gdb) frame 2
#2  0x00007f0fc1fc51b9 in PR_Lock (lock=0x7f0fb45e32c0) at ../../../nspr/pr/src/pthreads/ptsynch.c:177
177         rv = pthread_mutex_lock(&lock->mutex);
(gdb) print lock->mutex.__data.__owner
$6 = 9061
(gdb) thread find 9061
Thread 4 has target id 'LWP 9061'
(gdb) thread
[Current thread is 4 (LWP 9061)]
(gdb)
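
For illustration only, here is a minimal sketch of the same relock-by-owner pattern outside of NSS. It is not taken from the NSS or NSPR sources, and the function name relock_errorcheck_mutex is hypothetical. It uses a PTHREAD_MUTEX_ERRORCHECK mutex so that the second lock attempt reports EDEADLK instead of blocking; with a default (non-recursive) mutex, the same second call would be expected to hang the way frame #0 does above.

```c
#define _POSIX_C_SOURCE 200809L

#include <errno.h>
#include <pthread.h>

/*
 * Hypothetical helper: lock an error-checking mutex twice from the same
 * thread. Returns the result of the second pthread_mutex_lock() call,
 * or the error code of an earlier setup step if one failed.
 */
int relock_errorcheck_mutex(void)
{
    pthread_mutexattr_t attr;
    pthread_mutex_t mutex;
    int rv;

    rv = pthread_mutexattr_init(&attr);
    if (rv != 0)
        return rv;

    /* ERRORCHECK makes a relock by the owner fail instead of blocking. */
    rv = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    if (rv == 0)
        rv = pthread_mutex_init(&mutex, &attr);
    pthread_mutexattr_destroy(&attr);
    if (rv != 0)
        return rv;

    rv = pthread_mutex_lock(&mutex);      /* first lock succeeds */
    if (rv == 0) {
        rv = pthread_mutex_lock(&mutex);  /* relock by the owning thread */
        pthread_mutex_unlock(&mutex);     /* release the single held lock */
    }

    pthread_mutex_destroy(&mutex);
    return rv;
}
```

Comparing the return value against EDEADLK in a small test program (built with -pthread) is one way to confirm that the second lock was attempted by the thread that already owned the mutex.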

I'm opening this against nss since it owns most of that backtrace. It may be that the issue is in nspr or libcurl. Let me know if so.

Version-Release number of selected component (if applicable):

# rpm -q rpm-ostree librepo libcurl nss nspr
rpm-ostree-2017.7-1.fc26.x86_64
librepo-1.7.20-3.fc26.x86_64
libcurl-7.53.1-7.fc26.x86_64
nss-3.31.0-1.0.fc26.x86_64
nspr-4.15.0-1.fc26.x86_64

How reproducible:

Sporadic.

Steps to Reproduce:
1. rpm-ostree install wget

Actual results:

Hangs.

Expected results:

Doesn't hang.

Additional info:

I can upload a core file somewhere if required.

Thanks!

Comment 1 Jonathan Lebon 2017-07-19 16:39:48 UTC
Core file available at https://jlebon.fedorapeople.org/core.1989.rhbz1472954 (507M).

Comment 2 Daiki Ueno 2017-07-19 16:44:18 UTC
(In reply to Jonathan Lebon from comment #0)

> nss-3.31.0-1.0.fc26.x86_64

Try -1.1 from:
https://bodhi.fedoraproject.org/updates/FEDORA-2017-244f799ac9
(see bug 1470352)

Comment 3 Jonathan Lebon 2017-07-19 16:47:53 UTC
Thanks! Marking as dupe.

*** This bug has been marked as a duplicate of bug 1470352 ***

