Bug 1157233
Summary: | 'dnf download' cannot be run in parallel as non-root user
---|---
Product: | [Fedora] Fedora
Reporter: | Richard W.M. Jones <rjones>
Component: | dnf
Assignee: | Michal Luscon <mluscon>
Status: | CLOSED ERRATA
QA Contact: | Fedora Extras Quality Assurance <extras-qa>
Severity: | unspecified
Priority: | low
Version: | rawhide
CC: | akozumpl, jsilhan, jzeleny, mluscon, pnemade, ptoscano, rholy, rjones, tla
Keywords: | Reopened
Hardware: | Unspecified
OS: | Unspecified
Fixed In Version: | dnf-plugins-core-0.1.5-2.fc21
Doc Type: | Bug Fix
Last Closed: | 2015-04-06 18:49:06 UTC
Type: | Bug
Bug Blocks: | 910269, 1156498
Description
Richard W.M. Jones
2014-10-26 12:03:03 UTC
Hi, thanks for the report, we'll take a look.

Why not just fix this now? It's an obvious bug that prevents 'dnf download' from being used safely; as well as blocking us from adopting dnf at all.

(In reply to Richard W.M. Jones from comment #2)
> Why not just fix this now? It's an obvious bug that prevents
> 'dnf download' from being used safely; as well as blocking us from
> adopting dnf at all.

May I ask who is "us" and why should this block people from adopting dnf? Nobody else expressed his opinion in this bugzilla and if I understand it correctly, the only thing that is actually blocked is concurrent execution of dnf download, sequential execution still works.

While I understand people reporting bugs consider each one very important, we have to carefully evaluate every bug and decide which bugs get fixed now and which later. Obviously, those bugs that affect major use cases have preference.

"Us" is supermin/libguestfs. See the dependent bug 1156498 which was filed because apparently everything should use dnf in Fedora 22.

'dnf download' is broken if it randomly corrupts things when you run it in parallel. Imagine the case where someone writes a script to download packages, and then discovers that the script breaks if run in parallel. I really can't believe you can argue this is not broken.

Also we cannot solve this in supermin because although we could make instances of supermin run serially (which would be horrible, but could be done) it would still break if an unrelated script or the user ran 'dnf download' at the same time.

(In reply to Richard W.M. Jones from comment #4)
> "Us" is supermin/libguestfs. See the dependent bug 1156498
> which was filed because apparently everything should use dnf
> in Fedora 22.

Fair enough.
On the other hand, if you just download the package and don't use yum/dnf to do anything else, my guess is that it might be possible to postpone the switch for one more Fedora release - yum is not going away instantly, it will be just phased out.

> 'dnf download' is broken if it randomly corrupts things when
> you run it in parallel. Imagine the case where someone writes
> a script to download packages, and then discovers that
> the script breaks if run in parallel. I really can't believe
> you can argue this is not broken.

I don't. I just questioned the importance of the use case. If you provided the information below right at the beginning, I would not do that.

> Also we cannot solve this in supermin because although we
> could make instances of supermin run serially (which would
> be horrible, but could be done) it would still break if
> an unrelated script or the user ran 'dnf download' at the same time.

Ok, let's get this back for evaluation. With this new information I believe it can be addressed sooner than "later or never".

Can you please elaborate on the use case a bit? For what do you use "dnf download" exactly? What can be the expected input and what should be the expected output in your case?

Thank you in advance.

https://github.com/libguestfs/supermin/blob/master/src/rpm.ml#L308

See above line 308 for the currently working yumdownloader version. It is passed a list of RPM NEVRs and runs this command:

dnf download --destdir <some-tmpdir> <list-of-RPM-NEVR>

It's necessary because we need to unpack some RPMs to get original files like /etc files that might be modified by the user.

Richard, I still don't get why you run it in parallel for the same list twice. It only happens when two same packages are downloaded at once so the processes are overriding it. Yes, it is a bug. Bug of wrong use case. If you wanna download package sets simultaneously, divide the <list-of-RPM-NEVR> in half and execute "dnf download" for each part.
When multiple users unintentionally download rpms to the same directory, you can set different --destdir and then use "mv". If we find that this bug is inside DNF and related to more filed bugs, we will add higher priority.

It's because you can run supermin twice in two independent processes. There's no way to "divide the list" between independent copies of supermin, run from different places.

Even if we did coordinate multiple copies of supermin, there's still a problem that supermin could be running and the user could independently run 'dnf download' causing supermin and/or dnf to break.

Not sure about your point about --destdir, as it always uses a randomly generated tmpdir for --destdir so they are never going to be the same.

Created attachment 959666 [details]
0001-dnf-Ensure-two-processes-cannot-overwrite-each-other.patch
This patch adds a simple lock file to the non-root cache
directory, which fixes the problem for me.
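
The attached patch itself is not reproduced in this report. Purely as an illustration of the idea it describes (a lock file placed in the per-user cache directory), a minimal Python sketch could look like the one below. The helper name `cache_lock` and the file name `download.lock` are hypothetical and are not taken from the patch or from dnf; the sketch uses an advisory `fcntl.flock` lock, which is only one of several ways to do this.

```python
import contextlib
import fcntl
import os
import tempfile


@contextlib.contextmanager
def cache_lock(cachedir, name="download.lock"):
    """Hold an exclusive advisory lock on a file inside cachedir.

    Hypothetical helper: the lock file name is an assumption, not dnf's.
    """
    os.makedirs(cachedir, exist_ok=True)
    path = os.path.join(cachedir, name)
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        # Blocks until every other holder has released the lock.
        fcntl.flock(fd, fcntl.LOCK_EX)
        yield path
    finally:
        # Closing the descriptor also drops the lock.
        os.close(fd)


if __name__ == "__main__":
    # Stand-in for a per-user cache directory under /var/tmp.
    demo_cachedir = tempfile.mkdtemp(prefix="dnf-demo-")
    with cache_lock(demo_cachedir) as lock_path:
        # Only one process at a time gets here for a given cachedir.
        print("holding", lock_path)
```

Because the lock is advisory, it only helps if every process touching the cache directory takes it; an unrelated tool that ignores the lock file would not be held back.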
Another alternative might be allow to specify on command line the cachedir used when running; this way each dnf run in supermin could have a different cachedir than /var/tmp/dnf-$USER-$RANDOM/, reducing even further the conflicts between dnf runs.

(In reply to Pino Toscano from comment #11)
> Another alternative might be allow to specify on command line the cachedir
> used when running; this way each dnf run in supermin could have a different
> cachedir than /var/tmp/dnf-$USER-$RANDOM/, reducing even further the
> conflicts between dnf runs.

For supermin that would certainly be sufficient. In fact the code mentions a --tempcache option, although it doesn't appear to be implemented.

For general dnf download use (eg from user scripts), locking the cache is better.

Created attachment 959921 [details]
0001-dnf-Ensure-two-processes-cannot-overwrite-each-other.patch
Second version without the obvious bug this time.
Pino, thats a good point. Till the bug is fixed, append `--setopt=cachedir=<dir>` to command line.

Richard, thanks for taking initiative, we will look at the patch.

(In reply to Jan Silhan from comment #14)
> Pino, thats a good point. Till the bug is fixed, append
> `--setopt=cachedir=<dir>` to command line.

It does not seem to work here (current rawhide updated as of ~right now):

$ rm -rf /var/tmp/dnf-pino-*
$ ls -d /var/tmp/dnf-pino-* 2>/dev/null | wc -l
0
$ mkdir dest
$ dnf download --destdir=dest --setopt=cachedir=$PWD/dest -v bash.x86_64
cachedir: /var/tmp/dnf-pino-CkAXFj/x86_64/22
Loaded plugins: copr, playground, download, Query, kickstart, generate_completion_cache, debuginfo-install, builddep, noroot, protected_packages
DNF version: 0.6.2
Fedora - Rawhide - Developmental packages for the next Fedora release [...] 71 MB/s | 43 MB 00:00
not found updateinfo for: Fedora - Rawhide - Developmental packages for the next Fedora release
Completion plugin: Can't write completion cache: [Errno 13] Permission denied: u'/var/cache/dnf/available.cache'
bash-4.3.30-2.fc22.x86_64.rpm [...] 40 MB/s | 1.6 MB 00:00
$ ls -d /var/tmp/dnf-pino-* 2>/dev/null | wc -l
1
$ ls dest/
bash-4.3.30-2.fc22.x86_64.rpm

According to your suggestion, there should have been no /var/tmp/dnf-$USER-* created at all, while its content being in "destdir".

Download phase in dnf-0.6.4 will be secured by blocking lock mechanism.

dnf-plugins-core-0.1.5-1.fc21,hawkey-0.5.3-2.fc21,dnf-0.6.4-1.fc21 has been submitted as an update for Fedora 21.
https://admin.fedoraproject.org/updates/dnf-plugins-core-0.1.5-1.fc21,hawkey-0.5.3-2.fc21,dnf-0.6.4-1.fc21

I'm now testing with dnf-plugins-core-0.1.5-1.fc21, hawkey-0.5.3-2.fc21, dnf-0.6.4-1.fc21 but this functionality doesn't work correctly.
I see this error, although not easily reproducible:

- "metadata already locked by <pid>"

This causes a failure, but it should just wait for the other process to complete:

cachedir: /var/tmp/dnf-rjones-wwvQYa/x86_64/21
Loaded plugins: copr, playground, download, Query, protected_packages, needs-restarting, builddep, debuginfo-install, reposync, kickstart, noroot, generate_completion_cache
DNF version: 0.6.4
metadata already locked by 29226
The application with PID 29226 is: dnf
Memory : 58 M RSS (284 MB VSZ)
Started: Mon Feb 16 08:51:02 2015 - 00:02 ago
State : Running
supermin: /usr/bin/dnf download -v --destdir '/tmp/supermin01fa6c.tmpdir/yo45jxsg' 'bash.x86_64' 'coreutils.x86_64' 'glibc.x86_64' 'info.x86_64' 'grep.x86_64' 'libattr.x86_64' 'openssl-libs.x86_64' 'glibc-common.x86_64' 'ca-certificates.noarch' 'crypto-policies.noarch' 'krb5-libs.x86_64' 'setup.noarch' 'fedora-release.noarch' 'fedora-repos.noarch': command failed, see earlier errors

----

Also if you use --destdir with a non-existent directory name, then dnf creates destdir as a *file* and writes every package to the same file, which seems like a bug, although we work around it by creating a randomly named destdir.
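
The --destdir workaround mentioned above (always creating a fresh, randomly named directory before passing it to dnf) can be sketched roughly as follows. `build_download_command` is a hypothetical helper written for this illustration, not supermin's code; it only builds the argument list and does not run dnf.

```python
import os
import tempfile


def build_download_command(packages, parent=None):
    """Return (argv, destdir) for a 'dnf download' invocation.

    The directory is created up front, so --destdir always names an
    existing directory rather than a path dnf would have to create.
    """
    destdir = tempfile.mkdtemp(prefix="download-", dir=parent)
    argv = ["dnf", "download", "--destdir", destdir] + list(packages)
    return argv, destdir


if __name__ == "__main__":
    argv, destdir = build_download_command(["bash.x86_64", "glibc.x86_64"])
    print(argv)
```

Since `tempfile.mkdtemp` picks a unique name each time, two callers never share a destination directory, which matches the point made earlier in the thread that the destdirs "are never going to be the same".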
Another random failure (rarer than the above) is:

Completion plugin: Can't write completion cache: unable to open database file

The way to reproduce at least some of these bugs is to run the following commands in parallel, as non-root (same user):

In Window 1, run:

mkdir -p /tmp/t1 ; while dnf -v download --destdir /tmp/t1 bash.x86_64 glibc.x86_64 info.x86_64 grep.x86_64 libattr.x86_64 openssl-libs.x86_64 coreutils.x86_64 glibc-common.x86_64 ca-certificates.noarch crypto-policies.noarch krb5-libs.x86_64 setup.noarch >& /tmp/log1 ; do echo -n .; done

In Window 2, run:

mkdir -p /tmp/t2 ; while dnf -v download --destdir /tmp/t2 bash.x86_64 glibc.x86_64 info.x86_64 grep.x86_64 libattr.x86_64 openssl-libs.x86_64 coreutils.x86_64 glibc-common.x86_64 ca-certificates.noarch crypto-policies.noarch krb5-libs.x86_64 setup.noarch >& /tmp/log2 ; do echo -n .; done

In Window 3, run:

mkdir -p /tmp/t3 ; while dnf -v download --destdir /tmp/t1 bash.x86_64 glibc.x86_64 info.x86_64 grep.x86_64 libattr.x86_64 openssl-libs.x86_64 coreutils.x86_64 glibc-common.x86_64 ca-certificates.noarch crypto-policies.noarch krb5-libs.x86_64 setup.noarch >& /tmp/log3 ; do echo -n .; done

Let that run for quite a while. When it fails, examine the log files (/tmp/log[123]).
Try again, this time with the correct commands:

In Window 1, run:

mkdir -p /tmp/t1 ; while dnf -v download --destdir /tmp/t1 bash.x86_64 glibc.x86_64 info.x86_64 grep.x86_64 libattr.x86_64 openssl-libs.x86_64 coreutils.x86_64 glibc-common.x86_64 ca-certificates.noarch crypto-policies.noarch krb5-libs.x86_64 setup.noarch >& /tmp/log1 ; do echo -n .; done

In Window 2, run:

mkdir -p /tmp/t2 ; while dnf -v download --destdir /tmp/t2 bash.x86_64 glibc.x86_64 info.x86_64 grep.x86_64 libattr.x86_64 openssl-libs.x86_64 coreutils.x86_64 glibc-common.x86_64 ca-certificates.noarch crypto-policies.noarch krb5-libs.x86_64 setup.noarch >& /tmp/log2 ; do echo -n .; done

In Window 3, run:

mkdir -p /tmp/t3 ; while dnf -v download --destdir /tmp/t3 bash.x86_64 glibc.x86_64 info.x86_64 grep.x86_64 libattr.x86_64 openssl-libs.x86_64 coreutils.x86_64 glibc-common.x86_64 ca-certificates.noarch crypto-policies.noarch krb5-libs.x86_64 setup.noarch >& /tmp/log3 ; do echo -n .; done

Another error is:

repo: using cache for: updates
Using metadata from Mon Feb 16 08:44:21 2015
Completion plugin: Can't write completion cache: unable to open database file
Waiting for process with pid 4268 to finish.
Waiting for process with pid 4022 to finish.
[Errno 2] No such file or directory: u'/var/tmp/dnf-rjones-wwvQYa/x86_64/21/updates/packages/bash-4.3.33-1.fc21.x86_64.rpm'

where it seems as if a parallel dnf deleted the file from the cache.

Here's a better and simpler reproducer of the 'metadata already locked' bug. It seems to happen when two instances of 'dnf download' are started at exactly the same time:

$ mkdir -p /tmp/t1 /tmp/t2 ; dnf download --destdir /tmp/t1 bash.x86_64 & dnf download --destdir /tmp/t2 bash.x86_64

That fails about 2/3rds of the time for me.
When it fails you will see:

metadata already locked by 3199
The application with PID 3199 is: dnf
Memory : 35 M RSS (361 MB VSZ)
Started: Mon Feb 16 09:46:17 2015 - 00:02 ago
State : Running
Using metadata from Mon Feb 16 08:44:21 2015
bash-4.3.33-1.fc21.x86_64.rpm 2.2 MB/s | 1.6 MB 00:00
[1]+ Exit 1 dnf download --destdir /tmp/t1 bash.x86_64

Note the 'Exit 1' indicating that one of the dnf processes exited with a failure instead of waiting.

Package hawkey-0.5.3-2.fc21, dnf-plugins-core-0.1.5-1.fc21, dnf-0.6.4-1.fc21:
* should fix your issue,
* was pushed to the Fedora 21 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing hawkey-0.5.3-2.fc21 dnf-plugins-core-0.1.5-1.fc21 dnf-0.6.4-1.fc21'
as soon as you are able to.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2015-2139/dnf-plugins-core-0.1.5-1.fc21,hawkey-0.5.3-2.fc21,dnf-0.6.4-1.fc21
then log in and leave karma (feedback).

Please can you remove this bug from the update.

hawkey-0.5.3-2.fc21, dnf-plugins-core-0.1.5-1.fc21, dnf-0.6.4-1.fc21 has been pushed to the Fedora 21 stable repository. If problems still persist, please make note of it in this bug report.

This bug appears to have been reported against 'rawhide' during the Fedora 22 development cycle. Changing version to '22'.

More information and reason for this action is here:
https://fedoraproject.org/wiki/Fedora_Program_Management/HouseKeeping/Fedora22

(In reply to Richard W.M. Jones from comment #25)
> Please can you remove this bug from the update.

too late.

We try to hold the download lock longer during rpm transaction and make metadata lock non-blocking. Implemented in https://github.com/rpm-software-management/dnf/pull/234

hawkey-0.5.4-1.fc22,dnf-0.6.5-1.fc22 has been submitted as an update for Fedora 22.
https://admin.fedoraproject.org/updates/hawkey-0.5.4-1.fc22,dnf-0.6.5-1.fc22

Package hawkey-0.5.4-1.fc22, dnf-0.6.5-1.fc22:
* should fix your issue,
* was pushed to the Fedora 22 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing hawkey-0.5.4-1.fc22 dnf-0.6.5-1.fc22'
as soon as you are able to.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2015-5337/hawkey-0.5.4-1.fc22,dnf-0.6.5-1.fc22
then log in and leave karma (feedback).

dnf-plugins-extras-0.0.6-2.fc22,yum-utils-1.1.31-505.fc22,yum-3.4.3-505.fc22,hawkey-0.5.4-1.fc22,dnf-0.6.5-1.fc22 has been submitted as an update for Fedora 22.
https://admin.fedoraproject.org/updates/dnf-plugins-extras-0.0.6-2.fc22,yum-utils-1.1.31-505.fc22,yum-3.4.3-505.fc22,hawkey-0.5.4-1.fc22,dnf-0.6.5-1.fc22

Yes this version now appears to work reliably.

dnf-plugins-extras-0.0.6-2.fc22, yum-3.4.3-505.fc22, dnf-0.6.5-1.fc22, yum-utils-1.1.31-505.fc22, hawkey-0.5.4-1.fc22 has been pushed to the Fedora 22 stable repository. If problems still persist, please make note of it in this bug report.

dnf-plugins-core-0.1.5-2.fc21,dnf-0.6.4-5.fc21 has been submitted as an update for Fedora 21.
https://admin.fedoraproject.org/updates/dnf-plugins-core-0.1.5-2.fc21,dnf-0.6.4-5.fc21

dnf-plugins-core-0.1.5-2.fc21, dnf-0.6.4-5.fc21 has been pushed to the Fedora 21 stable repository. If problems still persist, please make note of it in this bug report.
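
For reference, the behaviour asked for earlier in this report (wait for the other process instead of exiting with "metadata already locked by <pid>") can be illustrated with a small Python sketch. This is not dnf's implementation and is not taken from the pull request linked above; `acquire_or_wait` is a hypothetical helper, and storing the holder's PID inside the lock file is an assumption made so the waiter can print a message in the style seen in the logs.

```python
import fcntl
import os


def acquire_or_wait(path, log=print):
    """Take an exclusive lock on path, waiting if it is already held.

    Returns the open file descriptor; the caller closes it to release.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        try:
            # First try without blocking so we can tell the user we wait.
            fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            holder = os.pread(fd, 32, 0).decode().strip() or "unknown"
            log("Waiting for process with pid %s to finish." % holder)
            # Block here instead of failing the whole command.
            fcntl.flock(fd, fcntl.LOCK_EX)
        # Record our PID so that later waiters can report the holder.
        os.ftruncate(fd, 0)
        os.pwrite(fd, str(os.getpid()).encode(), 0)
        return fd
    except BaseException:
        os.close(fd)
        raise
```

A caller would wrap the critical section as `fd = acquire_or_wait(lock_path)` followed by `os.close(fd)` in a `finally` clause; a second process calling the same function in the meantime prints the waiting message and then proceeds once the first one closes its descriptor.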