Bug 1436158 - dnf crashes when downloading updates from Katello server
Summary: dnf crashes when downloading updates from Katello server
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: curl
Version: 25
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Kamil Dudka
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks: 1473158
Reported: 2017-03-27 10:57 UTC by Rob Sanders
Modified: 2017-07-25 00:24 UTC (History)
CC List: 9 users

Fixed In Version: curl-7.53.1-8.fc26 curl-7.51.0-8.fc25
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 1473158
Environment:
Last Closed: 2017-07-23 03:58:05 UTC
Type: Bug
Embargoed:


Attachments
debug_trace (8.92 KB, text/plain), 2017-06-16 15:03 UTC, Rob Sanders
dnf ccpp (13.47 MB, application/x-xz), 2017-07-19 08:34 UTC, Rob Sanders
backtrace from the last crash (6.61 KB, text/plain), 2017-07-19 08:41 UTC, Rob Sanders
dnf ccpp (13.66 MB, application/x-xz), 2017-07-19 14:04 UTC, Rob Sanders

Description Rob Sanders 2017-03-27 10:57:41 UTC
Description of problem:

My Fedora client is managed by a Katello 3.3 server. When I run dnf update -y on it, dnf crashes.


Version-Release number of selected component (if applicable):

$ rpm -qa | grep dnf
python3-dnf-1.1.10-6.fc25.noarch
dnf-1.1.10-6.fc25.noarch
python3-dnf-plugins-core-0.1.21-4.fc25.noarch
dnf-yum-1.1.10-6.fc25.noarch
dnf-conf-1.1.10-6.fc25.noarch
dnf-plugins-core-0.1.21-4.fc25.noarch



Steps to Reproduce:
sudo dnf update -y

Actual results:

[SKIPPED] qemu-kvm-2.7.1-4.fc25.x86_64.rpm: Already downloaded                                                                                                                         
[SKIPPED] qemu-guest-agent-2.7.1-4.fc25.x86_64.rpm: Already downloaded                                                                                                                 
(153/258): realmd-0.16.2-5.fc25_0.16.2-8.fc25.x86_64.drpm                                                                                               398 kB/s |  91 kB     00:00    
[DRPM] openldap-2.4.44-2.fc25_2.4.44-7.fc25.x86_64.drpm: done                                                                                                                          
[DRPM] openslp-2.0.0-9.fc25_2.0.0-10.fc25.x86_64.drpm: done                                                                                                                            
[DRPM] openvas-scanner-5.0.6-1.fc25_5.0.7-1.fc25.x86_64.drpm: done                                                                                                                     
[DRPM] opus-1.1.3-1.fc25_1.1.3-2.fc25.x86_64.drpm: done                                                                                                                                
[DRPM] oz-0.15.0-2.fc25_0.15.0-5.fc25.noarch.drpm: done                                                                                                                                
(154/258): redhat-rpm-config-44-1.fc25_45-1.fc25.noarch.drpm                                                                                            121 kB/s |  29 kB     00:00    
(155/258): redland-1.0.17-7.fc25_1.0.17-8.fc25.x86_64.drpm                                                                                              220 kB/s |  54 kB     00:00    
[DRPM] pam-kwallet-5.8.1-1.fc25_5.8.6-1.fc25.x86_64.drpm: done                                                                                                                         
Segmentation fault (core dumped)8.fc25_2.3.3-61.1.fc25.x86_64.drpm       57% [=========================================-                              ] 413 kB/s |  95 MB     02:51 ETA



$ dmesg -T | grep dnf
[Mon Mar 27 09:40:14 2017] dnf[13839]: segfault at 8 ip 00007f3489082af0 sp 00007ffc88c176e0 error 4 in libc-2.24.so[7f3489013000+1bd000]
[Mon Mar 27 09:43:34 2017] dnf[14322]: segfault at 8 ip 00007fbdff0bfaf0 sp 00007ffc523ed950 error 4 in libc-2.24.so[7fbdff050000+1bd000]
[Mon Mar 27 09:44:29 2017] dnf[14766]: segfault at ffffffffffffffff ip ffffffffffffffff sp 00007ffe428fbd98 error 15
[Mon Mar 27 09:54:31 2017] dnf[18951]: segfault at 7f006e69616d ip 00007f006e69616d sp 00007ffcc31b59b8 error 14
[Mon Mar 27 09:54:51 2017] dnf[19128]: segfault at 0 ip 00007f3ebb285ace sp 00007ffdb55b3fa0 error 4 in libc-2.24.so[7f3ebb216000+1bd000]
[Mon Mar 27 10:02:15 2017] dnf[22116]: segfault at 8 ip 00007f7a97cafaf0 sp 00007ffeb85392d0 error 4 in libc-2.24.so[7f7a97c40000+1bd000]
[Mon Mar 27 10:20:19 2017] dnf[28485]: segfault at 300010080 ip 0000000300010080 sp 00007ffdba6ffee8 error 14 in system-python[55997cd79000+1000]
[Mon Mar 27 10:22:33 2017] dnf[28634]: segfault at 300010080 ip 0000000300010080 sp 00007ffe278ff178 error 14 in system-python[5588b1965000+1000]
[Mon Mar 27 11:00:46 2017] dnf[29962]: segfault at 8 ip 00007faaa53dcaf0 sp 00007ffc5a027650 error 4 in libc-2.24.so[7faaa536d000+1bd000]
[Mon Mar 27 11:38:27 2017] dnf[11741]: segfault at 300010080 ip 0000000300010080 sp 00007fff12a53858 error 14 in system-python[5643fb95d000+1000]
[Mon Mar 27 11:43:32 2017] dnf[28404]: segfault at 55adbbd2a2d0 ip 000055adbbd2a2d0 sp 00007ffd0d4e0528 error 15
[Mon Mar 27 11:43:42 2017] traps: dnf[28505] general protection ip:7f3f0b9a6093 sp:7ffd27a0ba98 error:0
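
The kernel lines above can be compared more easily once the instruction pointer is turned into an offset inside the reported mapping. The following is a minimal sketch, assuming the usual x86-64 "segfault at ... ip ... error ... in object[base+size]" layout; the names SEGFAULT_RE and parse_segfaults are illustrative and not part of any tool mentioned in this report. The sample lines are copied from the log above.

```python
import re

# Four lines copied from the dmesg output in this report.
SAMPLE = """\
[Mon Mar 27 09:40:14 2017] dnf[13839]: segfault at 8 ip 00007f3489082af0 sp 00007ffc88c176e0 error 4 in libc-2.24.so[7f3489013000+1bd000]
[Mon Mar 27 09:43:34 2017] dnf[14322]: segfault at 8 ip 00007fbdff0bfaf0 sp 00007ffc523ed950 error 4 in libc-2.24.so[7fbdff050000+1bd000]
[Mon Mar 27 09:44:29 2017] dnf[14766]: segfault at ffffffffffffffff ip ffffffffffffffff sp 00007ffe428fbd98 error 15
[Mon Mar 27 10:20:19 2017] dnf[28485]: segfault at 300010080 ip 0000000300010080 sp 00007ffdba6ffee8 error 14 in system-python[55997cd79000+1000]
"""

# The trailing "in object[base+size]" part is optional in the kernel message.
SEGFAULT_RE = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) sp [0-9a-f]+ "
    r"error (?P<err>[0-9a-f]+)"
    r"(?: in (?P<obj>\S+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\])?"
)


def parse_segfaults(text):
    """Return one dict per segfault line, with ip as an offset into the mapping."""
    records = []
    for m in SEGFAULT_RE.finditer(text):
        ip = int(m.group("ip"), 16)
        obj = m.group("obj")
        offset = None
        if obj is not None:
            base = int(m.group("base"), 16)
            size = int(m.group("size"), 16)
            # Only trust the offset when ip really lies inside the mapping.
            if base <= ip < base + size:
                offset = ip - base
        records.append(
            {"addr": int(m.group("addr"), 16), "ip": ip, "obj": obj, "offset": offset}
        )
    return records


records = parse_segfaults(SAMPLE)
for r in records:
    if r["offset"] is not None:
        where = f"{r['obj']}+{r['offset']:#x}"
    else:
        where = "outside any reported mapping"
    print(f"ip={r['ip']:#x} -> {where}")
```

For the two libc lines in the sample, the offset comes out the same (0x6faf0), which suggests those crashes hit the same libc code each time. The other two lines jump to an address outside the mapping the kernel reports.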


Additional info:

Katello server logs at the time the crash happens:

Mar 27 11:38:23 puppet.teamwpc.local pulp_streamer[7694]: [-] 127.0.0.1 - - [27/Mar/2017:10:38:23 +0000] "GET /var/lib/pulp/content/units/rpm/f8/11c045bd847e72db972379806893a549596f111bc8150d68bf376acb066d76/qemu-system-cris-2.7.1-4.fc25.x86_64.rpm HTTP/1.1" 200 1524246 "-" "dnf/1.1.10"
Mar 27 11:38:24 puppet.teamwpc.local pulp_streamer[7694]: [-] 127.0.0.1 - - [27/Mar/2017:10:38:24 +0000] "GET /var/lib/pulp/content/units/rpm/94/e1891fc8e9ae091dac2630af92df0119db3c9828b2b0fdc2f4c5617267b760/qemu-kvm-2.7.1-4.fc25.x86_64.rpm HTTP/1.1" 200 67682 "-" "dnf/1.1.10"
Mar 27 11:38:24 puppet.teamwpc.local pulp_streamer[7694]: [-] 127.0.0.1 - - [27/Mar/2017:10:38:24 +0000] "GET /var/lib/pulp/content/units/rpm/df/7ed0caedf48386aad7eaef5c7fa002256ba9ec3dde8a4e77fa58aa7220ec41/qemu-guest-agent-2.7.1-4.fc25.x86_64.rpm HTTP/1.1" 200 204150 "-" "dnf/1.1.10"
Mar 27 11:38:24 puppet.teamwpc.local pulp_streamer[7694]: [-] 127.0.0.1 - - [27/Mar/2017:10:38:24 +0000] "GET /var/lib/pulp/content/units/rpm/42/39ad5e40d634c1650e874933ea69200c431c912ee8faac7c3402b00d449ac6/qt5-srpm-macros-5.7.1-1.fc25.noarch.rpm HTTP/1.1" 200 8066 "-" "dnf/1.1.10"

Nothing out of the ordinary here.

FYI, the same Katello server serves all my CentOS 6 and 7 servers and there are no problems with it at all. The problem is Fedora specific.

There are a few things you should be aware of:

1) If I run dnf update against a Fedora mirror, everything works. It only crashes when DOWNLOADING from the Katello server. Once the download stage is finished, it installs updates fine.

2) When I run smaller updates, such as dnf update o* p* q* -y, it WILL complete successfully. It only crashes when I try to upgrade the entire system (or a significantly high number of packages, 300+). The workstation I'm installing updates on has 64 GB of RAM and is NOT running out of memory during the upgrade process.

Comment 1 Igor Gnatenko 2017-03-27 11:13:28 UTC
Without a backtrace, I doubt it's possible to do anything about this. Most probably this bug has been fixed in DNF 2.x (or related components)...

Comment 2 Igor Gnatenko 2017-03-29 11:10:54 UTC
Please provide a backtrace (probably via ABRT).

Comment 3 Rob Sanders 2017-04-05 10:26:33 UTC
I managed to reproduce it with DNF 2.2.0:

$ sudo dnf update
OS                                                                                                                                                       83 MB/s |  59 MB     00:00    
Openh264                                                                                                                                                 23 kB/s | 3.1 kB     00:00    
Segmentation fault                                                        0% [                                                                        ] ---  B/s |   0  B     --:-- ETA


dmesg -T | grep segfault
[Wed Apr  5 11:19:26 2017] dnf[25601]: segfault at 300010080 ip 0000000300010080 sp 00007ffe1945fb98 error 14 in system-python[55bc6444d000+1000]


$ dnf --version
2.2.0
  Installed: dnf-0:2.3.0-0.16g2620114.fc25.noarch at 2017-04-05 10:21
  Built    :  at 2017-04-04 01:50

  Installed: rpm-0:4.13.0.1-1.fc25.x86_64 at 2017-03-27 08:53
  Built    : Fedora Project at 2017-02-24 12:48

Because it doesn't happen every time, it's hard to provide a backtrace. I will try to produce one the next time I run a dnf command.

Comment 4 Jaroslav Mracek 2017-06-16 14:04:37 UTC
Could you please provide a core dump, or investigate the core dump yourself? Here are some hints:

# Change the core dump size limit
ulimit -c unlimited

# Change core dump file location
cat /proc/sys/kernel/core_pattern
echo "core.%e.%p" > /proc/sys/kernel/core_pattern

Then run the app that fails and open the resulting core file in gdb:

sudo gdb /usr/bin/python3.5 core.dnf.9299

Then type "bt" at the gdb prompt.
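
The file name core.dnf.9299 used above follows from the core.%e.%p pattern: per core(5), %e is the executable name and %p is the PID of the dumped process. Here is a minimal sketch of that expansion, handling only %e, %p and %%. The function name expand_core_pattern is illustrative, and unlike the kernel, this sketch passes any other specifier through unchanged.

```python
def expand_core_pattern(pattern, exe, pid):
    """Expand the %e, %p and %% specifiers of a core_pattern template."""
    substitutions = {"e": exe, "p": str(pid), "%": "%"}
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "%" and i + 1 < len(pattern):
            spec = pattern[i + 1]
            # Specifiers this sketch does not know are left untouched.
            out.append(substitutions.get(spec, "%" + spec))
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


core_name = expand_core_pattern("core.%e.%p", "dnf", 9299)
print(core_name)  # core.dnf.9299
```

Knowing the expected name makes it easier to find the right core file when several crashes have left dumps in the same directory.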

Comment 5 Rob Sanders 2017-06-16 14:24:39 UTC
Thank you for this info. I will get it sorted.

Yesterday I updated one of my systems to the Fedora 26 beta and had the same problem. Interestingly, when I ran it via gdb:
gdb -ex r --args /usr/libexec/system-python /usr/bin/dnf system-upgrade download --refresh --releasever=26 --allowerasing -y

it didn't crash. The args line:
/usr/libexec/system-python /usr/bin/dnf system-upgrade download --refresh --releasever=26 --allowerasing -y

was extracted via ps ax when running dnf directly.

Today I was upgrading one of the F25 machines via dnf and again it was segfaulting.

At least there is an easy way to reproduce it on multiple systems.

Comment 6 Rob Sanders 2017-06-16 15:03:26 UTC
Created attachment 1288389 [details]
debug_trace

s dnf reinstall `rpm -qa` -y crashed it pretty quick. Log attached.

Comment 7 Jaroslav Mracek 2017-06-19 09:43:57 UTC
Could you please install the missing debuginfo packages and provide the output again? The most important line shows only ??:

Missing separate debuginfos, use: dnf debuginfo-install gpgme-1.8.0-12.fc26.x86_64 libassuan-2.4.3-2.fc26.x86_64 libattr-2.4.47-18.fc26.x86_64 libblkid-2.29.1-2.fc26.x86_64 libcap-2.25-5.fc26.x86_64 libcom_err-1.43.4-2.fc26.x86_64 libcomps-0.1.8-3g01a4759.fc26.x86_64 libffi-3.1-10.fc26.x86_64 libgpg-error-1.25-2.fc26.x86_64 libmount-2.29.1-2.fc26.x86_64 libpsl-0.17.0-2.fc26.x86_64 librepo-1.7.20-12g2865c01.fc26.x86_64 libsolv-0.6.27-23g3538163.fc26.x86_64 libssh2-1.8.0-2.fc26.x86_64 libunistring-0.9.7-1.fc26.x86_64 libuuid-2.29.1-2.fc26.x86_64 libxml2-2.9.4-2.fc26.x86_64 lz4-libs-1.7.5-3.fc26.x86_64 ncurses-libs-6.0-8.20170212.fc26.x86_64 popt-1.16-8.fc26.x86_64 python3-gpg-1.8.0-12.fc26.x86_64 python3-libcomps-0.1.8-3g01a4759.fc26.x86_64 python3-librepo-1.7.20-12g2865c01.fc26.x86_64

Also, any additional information about the settings in your .repo files or on the Katello server could be helpful.

Thanks a lot.

Comment 8 Igor Gnatenko 2017-06-19 09:45:31 UTC
This bug is either in libcurl or in nss... definitely not in DNF.

Comment 9 Rob Sanders 2017-06-19 10:00:43 UTC
Unfortunately I've got them all installed:

dnf debuginfo-install gpgme-1.8.0-12.fc26.x86_64 libassuan-2.4.3-2.fc26.x86_64 libattr-2.4.47-18.fc26.x86_64 libblkid-2.29.1-2.fc26.x86_64 libcap-2.25-5.fc26.x86_64 libcom_err-1.43.4-2.fc26.x86_64 libcomps-0.1.8-3g01a4759.fc26.x86_64 libffi-3.1-10.fc26.x86_64 libgpg-error-1.25-2.fc26.x86_64 libmount-2.29.1-2.fc26.x86_64 libpsl-0.17.0-2.fc26.x86_64 librepo-1.7.20-12g2865c01.fc26.x86_64 libsolv-0.6.27-23g3538163.fc26.x86_64 libssh2-1.8.0-2.fc26.x86_64 libunistring-0.9.7-1.fc26.x86_64 libuuid-2.29.1-2.fc26.x86_64 libxml2-2.9.4-2.fc26.x86_64 lz4-libs-1.7.5-3.fc26.x86_64 ncurses-libs-6.0-8.20170212.fc26.x86_64 popt-1.16-8.fc26.x86_64 python3-gpg-1.8.0-12.fc26.x86_64 python3-libcomps-0.1.8-3g01a4759.fc26.x86_64 python3-librepo-1.7.20-12g2865c01.fc26.x86_64
enabling updates-testing-debuginfo repository
enabling rpmfusion-free-updates-testing-debuginfo repository
enabling rpmfusion-free-debuginfo repository
Last metadata expiration check: 0:00:00 ago on Mon 19 Jun 2017 10:52:54 BST.
Package glibc-debuginfo-2.25-5.fc26.x86_64 is already installed, skipping.
Package libselinux-debuginfo-2.6-6.fc26.x86_64 is already installed, skipping.
Package pcre-debuginfo-8.40-7.fc26.x86_64 is already installed, skipping.
Package libidn2-debuginfo-2.0.2-1.fc26.x86_64 is already installed, skipping.
Package openssl-debuginfo-1:1.1.0f-3.fc26.x86_64 is already installed, skipping.
Package curl-debuginfo-7.53.1-7.fc26.x86_64 is already installed, skipping.
Package krb5-debuginfo-1.15.1-8.fc26.x86_64 is already installed, skipping.
Package keyutils-debuginfo-1.5.10-1.fc26.x86_64 is already installed, skipping.
Package openldap-debuginfo-2.4.44-10.fc26.x86_64 is already installed, skipping.
Package nspr-debuginfo-4.14.0-2.fc26.x86_64 is already installed, skipping.
Package nss-debuginfo-3.30.2-1.1.fc26.x86_64 is already installed, skipping.
Package nss-util-debuginfo-3.30.2-1.0.fc26.x86_64 is already installed, skipping.
Package cyrus-sasl-debuginfo-2.1.26-32.fc26.x86_64 is already installed, skipping.
Package nss-softokn-debuginfo-3.30.2-1.0.fc26.x86_64 is already installed, skipping.
Package libdb-debuginfo-5.3.28-21.fc26.x86_64 is already installed, skipping.
Package nghttp2-debuginfo-1.21.1-1.fc26.x86_64 is already installed, skipping.
Package gcc-debuginfo-7.1.1-2.fc26.x86_64 is already installed, skipping.
Package glib2-debuginfo-2.52.2-2.fc26.x86_64 is already installed, skipping.
Package libsolv-debuginfo-0.6.27-2.fc26.x86_64 is already installed, skipping.
Package rpm-debuginfo-4.13.0.1-4.fc26.x86_64 is already installed, skipping.
Package acl-debuginfo-2.2.52-14.fc26.x86_64 is already installed, skipping.
Package elfutils-debuginfo-0.169-1.fc26.x86_64 is already installed, skipping.
Package lua-debuginfo-5.3.4-3.fc26.x86_64 is already installed, skipping.
Package python3-debuginfo-3.6.1-6.fc26.x86_64 is already installed, skipping.
Package gdbm-debuginfo-1.13-1.fc26.x86_64 is already installed, skipping.
Package sqlite-debuginfo-3.19.1-1.fc26.x86_64 is already installed, skipping.
Dependencies resolved.
Nothing to do.
Complete!

My repos config:

/etc/yum.repos.d/redhat.repo 
#
# Certificate-Based Repositories
# Managed by (rhsm) subscription-manager
#
# *** This file is auto-generated.  Changes made here will be over-written. ***
# *** Use "subscription-manager repo-override --help" if you wish to make changes. ***
#
# If this file is empty and this system is subscribed consider 
# a "yum repolist" to refresh available repos
#

[World_Programming_Ltd_Fedora_26_Updates]
metadata_expire = 1
sslclientcert = /etc/pki/entitlement/1571189452804602295.pem
baseurl = https://puppet.teamwpc.local/pulp/repos/World_Programming_Ltd/DEV/OS_Updates_Fedora_26/custom/Fedora_26/Updates
sslverify = 1
name = Updates
sslclientkey = /etc/pki/entitlement/1571189452804602295-key.pem
enabled = 1
sslcacert = /etc/rhsm/ca/katello-server-ca.pem
gpgcheck = 0

[World_Programming_Ltd_Fedora_26_Openh264]
metadata_expire = 1
sslclientcert = /etc/pki/entitlement/1571189452804602295.pem
baseurl = https://puppet.teamwpc.local/pulp/repos/World_Programming_Ltd/DEV/OS_Updates_Fedora_26/custom/Fedora_26/Openh264
sslverify = 1
name = Openh264
sslclientkey = /etc/pki/entitlement/1571189452804602295-key.pem
enabled = 1
sslcacert = /etc/rhsm/ca/katello-server-ca.pem
gpgcheck = 0

[World_Programming_Ltd_Fedora_26_OS]
metadata_expire = 1
sslclientcert = /etc/pki/entitlement/1571189452804602295.pem
baseurl = https://puppet.teamwpc.local/pulp/repos/World_Programming_Ltd/DEV/OS_Updates_Fedora_26/custom/Fedora_26/OS
sslverify = 1
name = OS
sslclientkey = /etc/pki/entitlement/1571189452804602295-key.pem
enabled = 1
sslcacert = /etc/rhsm/ca/katello-server-ca.pem
gpgcheck = 0


My CA is a 4096-bit sha512WithRSAEncryption certificate.
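
The repository stanzas above use plain INI syntax, so the TLS-related options can be listed per repository with a few lines of Python. This is a minimal sketch using the standard configparser module on one stanza copied from the file above. It is not how dnf itself reads the file, and the helper name tls_settings is illustrative.

```python
import configparser

# One stanza copied from /etc/yum.repos.d/redhat.repo as posted above.
SAMPLE_REPO = """\
# Managed by (rhsm) subscription-manager
[World_Programming_Ltd_Fedora_26_Updates]
metadata_expire = 1
sslclientcert = /etc/pki/entitlement/1571189452804602295.pem
baseurl = https://puppet.teamwpc.local/pulp/repos/World_Programming_Ltd/DEV/OS_Updates_Fedora_26/custom/Fedora_26/Updates
sslverify = 1
name = Updates
sslclientkey = /etc/pki/entitlement/1571189452804602295-key.pem
enabled = 1
sslcacert = /etc/rhsm/ca/katello-server-ca.pem
gpgcheck = 0
"""

TLS_KEYS = ("sslverify", "sslcacert", "sslclientcert", "sslclientkey")


def tls_settings(repo_text):
    """Map each repository id to its TLS-related options (None if unset)."""
    # interpolation=None keeps any '%' in values from being treated specially.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(repo_text)
    result = {}
    for section in parser.sections():
        repo = parser[section]
        result[section] = {key: repo.get(key) for key in TLS_KEYS}
    return result


settings = tls_settings(SAMPLE_REPO)
for repo_id, opts in settings.items():
    print(repo_id)
    for key, value in opts.items():
        print(f"  {key} = {value}")
```

The output shows that every download from this repository goes over HTTPS with server verification and a client certificate.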

Comment 10 Martin Hatina 2017-06-21 11:22:39 UTC
We think the bug is in curl. Can you (guys from curl) confirm this? Thanks.

Comment 11 Kamil Dudka 2017-06-21 12:35:35 UTC
(In reply to Martin Hatina from comment #10)
> We think the bug is in curl. Can you (guys from curl) confirm this?

Nope.  I cannot see any obvious bug in curl here.  Is the core file available?

Could you please check the value of data->set.fdebug inside Curl_debug()?

Is the value equal to the address that the process attempted to jump to?

Does librepo use the CURLOPT_DEBUGFUNCTION option of libcurl?
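
Whether the process attempted to jump to a bad address can be partly read from the kernel log alone. If the fault address equals the instruction pointer and the error code marks an instruction fetch, the CPU was trying to execute code at that address, which is what a call through a corrupted function pointer would look like. The following is a minimal sketch, assuming the standard x86 page-fault error code bits and that the kernel prints the code in hexadecimal; describe_fault and its result keys are illustrative names.

```python
# x86 page-fault error code bits (the kernel log prints the code in hex).
PF_PROT = 0x01   # 1: protection violation, 0: page not present
PF_WRITE = 0x02  # 1: write access, 0: read access
PF_USER = 0x04   # 1: fault happened in user mode
PF_RSVD = 0x08   # 1: reserved bit set in a page-table entry
PF_INSTR = 0x10  # 1: fault was an instruction fetch


def describe_fault(addr_hex, ip_hex, error_hex):
    """Decode one 'segfault at ADDR ip IP ... error CODE' triple."""
    addr = int(addr_hex, 16)
    ip = int(ip_hex, 16)
    code = int(error_hex, 16)
    if code & PF_INSTR:
        access = "instruction fetch"
    elif code & PF_WRITE:
        access = "write"
    else:
        access = "read"
    return {
        "cause": "protection violation" if code & PF_PROT else "page not present",
        "access": access,
        "user_mode": bool(code & PF_USER),
        # Executing at the faulting address itself: a jump to a bad target.
        "wild_jump": addr == ip and bool(code & PF_INSTR),
    }


# Values taken from the dmesg lines in the description of this bug.
read_fault = describe_fault("8", "00007f3489082af0", "4")
jump_fault = describe_fault("300010080", "0000000300010080", "14")
noexec_fault = describe_fault("55adbbd2a2d0", "000055adbbd2a2d0", "15")

for fault in (read_fault, jump_fault, noexec_fault):
    print(fault)
```

With these inputs, the "error 4" line decodes as a plain read of an unmapped address near NULL, while the "error 14" and "error 15" lines decode as attempts to execute at the faulting address.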

Comment 12 Rob Sanders 2017-07-11 15:24:49 UTC
Hi,

Fedora 26 is now out and we'll be upgrading a few machines. Please let me know if there's any progress or a workaround for this issue.

Thanks,
Rob

Comment 13 Kamil Dudka 2017-07-11 15:56:23 UTC
Do not expect any progress from me unless the questions from comment #11 are answered.  Switching to librepo for further analysis...

Comment 14 Rob Sanders 2017-07-18 10:49:40 UTC
Not sure if that's related or a completely different issue, but after the update to Fedora 26 dnf just hangs now. If I remove the Katello subscription and let it work against the default Fedora repos, it's all fine.

Strace:

stat("/etc/rhsm/ca/katello-server-ca.pem", {st_mode=S_IFREG|0644, st_size=2553, ...}) = 0
open("/etc/rhsm/ca/katello-server-ca.pem", O_RDONLY) = 21
fstat(21, {st_mode=S_IFREG|0644, st_size=2553, ...}) = 0
read(21, "-----BEGIN CERTIFICATE-----\nMIIH"..., 2553) = 2553
close(21)                               = 0
stat("/etc/pki/entitlement/3267844884140372615.pem", {st_mode=S_IFREG|0644, st_size=2790, ...}) = 0
stat("/etc/pki/entitlement/3267844884140372615.pem", {st_mode=S_IFREG|0644, st_size=2790, ...}) = 0
open("/etc/pki/entitlement/3267844884140372615.pem", O_RDONLY) = 21
fstat(21, {st_mode=S_IFREG|0644, st_size=2790, ...}) = 0
read(21, "-----BEGIN CERTIFICATE-----\nMIIE"..., 2790) = 2790
close(21)                               = 0
fcntl(18, F_SETLK, {l_type=F_RDLCK, l_whence=SEEK_SET, l_start=1073741824, l_len=1}) = 0
fcntl(18, F_SETLK, {l_type=F_RDLCK, l_whence=SEEK_SET, l_start=1073741826, l_len=510}) = 0
fcntl(18, F_SETLK, {l_type=F_UNLCK, l_whence=SEEK_SET, l_start=1073741824, l_len=1}) = 0
stat("/etc/pki/nssdb/cert9.db-journal", 0x7fff76a0e790) = -1 ENOENT (No such file or directory)
fstat(18, {st_mode=S_IFREG|0644, st_size=9216, ...}) = 0
pread64(18, "\0\0\0\2\0\0\0\t\0\0\0\0\0\0\0\0", 16, 24) = 16
fstat(18, {st_mode=S_IFREG|0644, st_size=9216, ...}) = 0
stat("/etc/pki/nssdb/cert9.db-wal", 0x7fff76a0e790) = -1 ENOENT (No such file or directory)
fstat(18, {st_mode=S_IFREG|0644, st_size=9216, ...}) = 0
pread64(18, "\n\0\0\0\0\4\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1024, 4096) = 1024
fcntl(18, F_SETLK, {l_type=F_UNLCK, l_whence=SEEK_SET, l_start=0, l_len=0}) = 0
futex(0x55b573563610, FUTEX_WAIT_PRIVATE, 2, NULL

Comment 15 Kamil Dudka 2017-07-18 20:19:16 UTC
(In reply to Rob Sanders from comment #14)
> Not sure if that's related or a completely different issue, but after the
> update to Fedora 26 dnf just hangs now.

I do not think it is related, sounds like bug #1470352 to me...

Comment 16 Rob Sanders 2017-07-19 08:34:58 UTC
Created attachment 1300924 [details]
dnf ccpp

Hi Kamil,

Yes, you're right, it was related to the other ticket. Unfortunately, upgrading nss didn't fix this issue.

# rpm -i nss-3.31.0-1.1.fc26.x86_64.rpm 
# dnf update
Updates                                                                                                                                                  28 MB/s | 8.1 MB     00:00    
Openh264                                                                                                                                                 14 kB/s | 3.1 kB     00:00    
Segmentation fault (core dumped)                                          0% [                                                                        ] ---  B/s |   0  B     --:-- ETA


I would appreciate any help in getting to the bottom of this ticket. Please let me know what you require and I'll do my best to provide you with all the information.

I've attached the ccpp dump from this crash.

Comment 17 Rob Sanders 2017-07-19 08:41:13 UTC
Created attachment 1300925 [details]
backtrace from the last crash

bt full from attached coredump.

Comment 18 Kamil Dudka 2017-07-19 10:29:32 UTC
Comment on attachment 1300924 [details]
dnf ccpp

Thank you for sharing the core dump.  According to dso_list, librepo.so.0 is loaded from librepo-1.7.20-12g2865c01.fc26, which I am not able to find in Koji.  Could you please retest with librepo-1.7.20-3.fc26?

Comment 19 Rob Sanders 2017-07-19 12:00:45 UTC
It must have been a version from dnf 2.x repo. After downgrading librepo to 1.7.20-3, I can still reproduce the bug.

Comment 20 Kamil Dudka 2017-07-19 12:21:56 UTC
(In reply to Rob Sanders from comment #19)
> It must have been a version from dnf 2.x repo. After downgrading librepo to
> 1.7.20-3, I can still reproduce the bug.

Then please upload the coredump from the crash with librepo-1.7.20-3.fc26.  I am not able to fully map attachment #1300924 [details] because of the library version mismatch.  Nor do I know which dnf 2.x repo you refer to.  I am just a libcurl developer trying to help (although I am almost sure that libcurl is not the cause of this bug).

Comment 21 Rob Sanders 2017-07-19 14:04:17 UTC
Created attachment 1301128 [details]
dnf ccpp

It looks like the compressed log file was 24MB this time, so I couldn't add it here. I've uploaded it to the first free online upload service Google found. It's available here:
https://ufile.io/snu6x

It also turns out that this bug affects packagekit, in the same function.

Comment 22 Kamil Dudka 2017-07-19 15:32:45 UTC
Thanks!

The data of the easy handle looks like garbage.  So I suspect that in some case curl_easy_cleanup() is called without calling curl_multi_remove_handle() first.  Unfortunately, I do not know librepo internals well enough to confirm the hypothesis.  Could you please try to rebuild librepo with the following patch applied?

diff --git a/librepo/fastestmirror.c b/librepo/fastestmirror.c
index 42dee1f..a7395f6 100644
--- a/librepo/fastestmirror.c
+++ b/librepo/fastestmirror.c
@@ -63,8 +63,10 @@ lr_lrfastestmirror_free(LrFastestMirror *mirror)
 {
     if (!mirror)
         return;
+#if 0
     if (mirror->curl)
         curl_easy_cleanup(mirror->curl);
+#endif
     g_free(mirror);
 }

diff --git a/librepo/handle.c b/librepo/handle.c
index ccea79b..941eb06 100644
--- a/librepo/handle.c
+++ b/librepo/handle.c
@@ -118,8 +118,10 @@ lr_handle_free(LrHandle *handle)
 {
     if (!handle)
         return;
+#if 0
     if (handle->curl_handle)
         curl_easy_cleanup(handle->curl_handle);
+#endif
     if (handle->mirrorlist_fd != -1)
         close(handle->mirrorlist_fd);
     if (handle->metalink_fd != -1)

Comment 23 Rob Sanders 2017-07-19 16:17:17 UTC
Unfortunately the same problem after the patch.

Comment 24 Kamil Dudka 2017-07-19 17:34:10 UTC
I have identified a possible bug in libcurl source code.  Could you please try libcurl from the following scratch build?

https://koji.fedoraproject.org/koji/taskinfo?taskID=20614836

Comment 25 Rob Sanders 2017-07-19 18:03:22 UTC
Looks like you've nailed it!

I can no longer reproduce this issue after upgrading curl/libcurl, and trust me, I've tried. Previously I could reproduce it every few seconds:

[Wed Jul 19 17:06:16 2017] yum[26175]: segfault at a1 ip 00000000000000a1 sp 00007ffd5a113e18 error 14 in system-python[55e3ef392000+2000]
[Wed Jul 19 17:06:28 2017] traps: yum[26320] general protection ip:7ff60b330aee sp:7ffe5c0a8040 error:0 in libc-2.25.so[7ff60b2be000+1c7000]
[Wed Jul 19 17:06:35 2017] traps: yum[26354] general protection ip:7fe03c3a1063 sp:7ffcf206fa28 error:0 in libcurl.so.4.4.0[7fe03c389000+7c000]
[Wed Jul 19 17:06:41 2017] traps: yum[26413] general protection ip:7fa9483df063 sp:7ffc4bbe5a08 error:0 in libcurl.so.4.4.0[7fa9483c7000+7c000]



Is it possible to push this change to F25? This will save me, and others, lots of manual patching before a bigger F26 rollout.

Comment 26 Kamil Dudka 2017-07-19 21:04:58 UTC
Perfect.  Thanks for confirmation!  I will prepare builds for f25/f26 tomorrow.  It is interesting that you discovered the bug now because it was introduced in 2009 (exactly 8 years ago) by this commit:

https://github.com/curl/curl/commit/curl-7_19_5-204-g5f0cae803

Comment 28 Fedora Update System 2017-07-20 07:34:16 UTC
curl-7.51.0-8.fc25 has been submitted as an update to Fedora 25. https://bodhi.fedoraproject.org/updates/FEDORA-2017-7a6144ef24

Comment 29 Fedora Update System 2017-07-20 07:34:28 UTC
curl-7.53.1-8.fc26 has been submitted as an update to Fedora 26. https://bodhi.fedoraproject.org/updates/FEDORA-2017-d9de901507

Comment 30 Rob Sanders 2017-07-20 08:12:19 UTC
Hi Kamil,

I have hundreds of el6/el7 servers pulling updates from Katello and this issue doesn't affect them. I guess they use different code paths than newer Fedora. Only once we introduced Fedora as a desktop for our developers did we stumble upon this issue.

Also, when you push updates from the Katello server to a Fedora client, it works fine. Goferd must be using different code paths than dnf/packagekit.

I would like to personally thank you for investing so much time in getting to the bottom of this issue and responding so promptly.

Comment 31 Kamil Dudka 2017-07-20 08:30:12 UTC
No problem.  Thanks for the help with debugging it!  And sorry that it took so long since the initial bug report.

Comment 32 Fedora Update System 2017-07-20 22:51:10 UTC
curl-7.51.0-8.fc25 has been pushed to the Fedora 25 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2017-7a6144ef24

Comment 33 Fedora Update System 2017-07-21 01:22:42 UTC
curl-7.53.1-8.fc26 has been pushed to the Fedora 26 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2017-d9de901507

Comment 34 Fedora Update System 2017-07-23 03:58:05 UTC
curl-7.53.1-8.fc26 has been pushed to the Fedora 26 stable repository. If problems still persist, please make note of it in this bug report.

Comment 35 Fedora Update System 2017-07-25 00:24:19 UTC
curl-7.51.0-8.fc25 has been pushed to the Fedora 25 stable repository. If problems still persist, please make note of it in this bug report.

