Bug 1636239 - get_best_query().filter(latest=True) is returning incorrect results
Summary: get_best_query().filter(latest=True) is returning incorrect results
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Fedora
Classification: Fedora
Component: dnf
Version: 29
Hardware: Unspecified
OS: Linux
Priority: urgent
Severity: high
Target Milestone: ---
Assignee: rpm-software-management
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard: PrioritizedBug, RejectedBlocker
Depends On: 1548586
Blocks: IoT
Reported: 2018-10-04 19:32 UTC by Brian Lane
Modified: 2019-01-18 22:14 UTC
CC List: 17 users

Fixed In Version: dnf-4.0.4
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1548586
Environment:
Last Closed: 2018-11-22 17:29:18 UTC
Type: Bug
Embargoed:


Attachments
Test script (1.41 KB, text/plain), 2018-10-04 21:17 UTC, Brian Lane

Description Brian Lane 2018-10-04 19:32:43 UTC
+++ This bug was initially created as a clone of Bug #1548586 +++

In lorax we use dnf queries to select packages to be installed. Today I've run into two problems which look like they are related.

When adding the package "system-logos" the query returns both generic-logos and fedora-logos, which conflict with each other. I had expected it to pick one or the other. Also, when using the F27 Everything repo together with the F27 updates repo, it includes two different versions of fedora-logos rather than only the latest.

Related to that, when selecting '*-firmware' it returns multiple versions of the firmware instead of just the newest from the updates repo, e.g. two versions of iwl1000-firmware. I'm not sure when this behavior changed, but it used to work on F26.

I'll include a Python script to reproduce this.

The same behavior happens with F27 and rawhide.
rawhide dnf version - dnf-2.7.5-8.fc28.noarch
F27 dnf version - dnf-2.7.5-2.fc27.noarch

--- Additional comment from Brian Lane on 2018-02-23 14:34 PST ---

Run this to reproduce the problem.

--- Additional comment from Brian Lane on 2018-02-23 14:35 PST ---

Note the multiple packages returned by the query+filter, and the multiple versions of firmware, eg. iwl1000-firmware from both repos with different versions.

--- Additional comment from Jaroslav Mracek on 2018-02-27 04:27:49 PST ---

I think that the first problem is not a bug. The get_best_query returns all packages that were represented by provided string. Because you used a provide, multiple package names can be returned. The correct solution would be to pass the resulting query to base.goal.install(select=sltr, optional=(not strict)), where:
sltr = dnf.selector.Selector(base.sack)
sltr.set(pkg=query)


The second part, with filter(latest=True), is fixed upstream. Can you please check it with our copr repo ("dnf copr enable rpmsoftwaremanagement/dnf-nightly")? It is fixed in libdnf-0.13.0+.

If you think that the issue is somewhere else, or the upstream version doesn't work as you expect, please don't hesitate to reopen the bug report.

--- Additional comment from Jaroslav Mracek on 2018-02-27 04:35:15 PST ---



--- Additional comment from Adam Williamson on 2018-02-27 09:14:15 PST ---

"The get_best_query returns all packages that were represented by provided string."

In that case, what does "best" mean?

--- Additional comment from Jaroslav Mracek on 2018-02-27 23:57:16 PST ---

I am not sure, but get_best_query() first checks whether the provided string is a NEVRA or part of one, then whether it is a valid provide or a file provide. I think "best" was meant as the best choice in priority order. For example, if the string gives a positive result for both a NEVRA search and a provide search, only the NEVRA result is returned. Similarly, a NEVRA-like string can often be parsed as either a name or a name-version; if both forms produce a result, only one is returned according to the given priority (in forms, or the default priority).
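The priority-ordered matching described in this comment can be illustrated with a small, self-contained sketch. This is a toy model only, not dnf's implementation: `first_match`, `by_name`, `by_provide`, and the `PACKAGES`/`PROVIDES` data are hypothetical names invented for the illustration. It shows why a string that only matches as a provide (such as "system-logos") can legitimately yield several package names.

```python
from typing import Callable, Iterable, List

Matcher = Callable[[str], List[str]]


def first_match(subject: str, matchers: Iterable[Matcher]) -> List[str]:
    """Try each matcher in priority order and return the first non-empty result."""
    for matcher in matchers:
        result = matcher(subject)
        if result:
            return result
    return []


# Hypothetical toy data standing in for a package sack.
PACKAGES = ["fedora-logos", "generic-logos", "bash"]
PROVIDES = {"system-logos": ["fedora-logos", "generic-logos"]}


def by_name(subject: str) -> List[str]:
    """Match the subject as an exact package name."""
    return [p for p in PACKAGES if p == subject]


def by_provide(subject: str) -> List[str]:
    """Match the subject as a provide; several packages may provide it."""
    return list(PROVIDES.get(subject, []))


# Name matching has priority; provides are only consulted if no name matches.
result_name = first_match("bash", [by_name, by_provide])
result_provide = first_match("system-logos", [by_name, by_provide])
print(result_name)
print(result_provide)
```

In this model, "best" selects the best kind of match, not a single best package, which is consistent with the explanation given in this comment.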

--- Additional comment from Dusty Mabe on 2018-05-31 10:22:16 PDT ---

can we get this bug fixed in f28 ? I'm hitting an issue in lorax in f28: https://github.com/weldr/lorax/issues/368

--- Additional comment from Jaroslav Mracek on 2018-06-01 00:45:09 PDT ---

The problem can be fixed in F28 within about the next 30 days, with the dnf-3.0 release.

--- Additional comment from Dusty Mabe on 2018-06-01 05:28:49 PDT ---

should we open another bug to track that?

--- Additional comment from Peter Robinson on 2018-06-03 07:34:54 PDT ---

(In reply to Jaroslav Mracek from comment #8)
> The problem can be fixed in F28 in about next 30 days with dnf-3.0 release.

That's not ideal, it's a regression causing rel-eng issues in particular around composing atomic updates.

--- Additional comment from Chad on 2018-06-04 07:29:49 PDT ---

I ran into this today as well.  I agree with Peter; 30 days isn't ideal or acceptable.

--- Additional comment from Peter Robinson on 2018-06-04 07:40:57 PDT ---

Actually, thinking about this, any regression _AT_ _ALL_ like this in a stable release is completely unacceptable!!

Where is the test process to ensure no regression? This is core functionality now affecting (at a minimum):
* Fedora Atomic host
* Fedora Atomic Workstation
* IoT

--- Additional comment from Matthew Miller on 2018-06-04 09:17:54 PDT ---

Sooooo.... we are past 30 days. Any news?

--- Additional comment from Dusty Mabe on 2018-06-04 09:23:41 PDT ---

(In reply to Matthew Miller from comment #13)
> Sooooo.... we are past 30 days. Any news?

AFAICT the comment about 30 days was made 3 days ago.

--- Additional comment from Matthew Miller on 2018-06-04 09:26:42 PDT ---

(Oh, sorry, calendar math error. I see that we are not past thirty days, just three. Still. That seems like a long time!)

--- Additional comment from Matthew Miller on 2018-06-06 07:28:14 PDT ---

As I understand it, Atomic Host has a workaround in place. Peter, do you have a similar workaround for IoT, or is this blocking you?

--- Additional comment from Colin Walters on 2018-06-06 10:16:41 PDT ---

rpm-ostree uses a (currently forked) version of libdnf and some of the behavior around queries is different.  To my knowledge rpm-ostree isn't affected by this.

--- Additional comment from Colin Walters on 2018-06-06 10:20:06 PDT ---

For a lot of our editions, it's actually just a pungi-imposed constraint that the generated artifacts use an Anaconda version (generated by lorax) from the same package set.  We can trivially unblock ourselves in a lot of these cases by just using a known-good installer image, and updating the "known good" version periodically, etc.

--- Additional comment from Dusty Mabe on 2018-06-06 10:20:57 PDT ---

(In reply to Matthew Miller from comment #16)
> As I understand it, Atomic Host has a workaround in place. 

Yes the workaround is here: https://infrastructure.fedoraproject.org/cgit/ansible.git/commit/roles/bodhi2/backend/templates?id=7855fe096908181b5fb272651659a196596083e2

(In reply to Colin Walters from comment #17)
> rpm-ostree uses a (currently forked) version of libdnf and some of the
> behavior around queries is different.  To my knowledge rpm-ostree isn't
> affected by this.

This was an issue in building the ISO. Not in rpm-ostree itself.

--- Additional comment from Dusty Mabe on 2018-06-06 10:23:22 PDT ---

(In reply to Colin Walters from comment #18)
> For a lot of our editions, it's actually just a pungi-imposed constraint
> that the generated artifacts use an Anaconda version (generated by lorax)
> from the same package set.  We can trivially unblock ourselves in a lot of
> these cases by just using a known-good installer image, and updating the
> "known good" version one periodically, etc.

yeah I agree. Having an installer that was maintained separately wouldn't be a bad idea. I've mentioned it before: https://pagure.io/pungi-fedora/pull-request/598#comment-50701

--- Additional comment from Brian Lane on 2018-07-19 10:49:00 PDT ---

Still hitting this in f28 (at least) with libdnf-0.11.1-3.fc28.x86_64 and nothing in updates-testing. Any ETA on a fix?

--- Additional comment from Jaroslav Mracek on 2018-07-20 00:07:46 PDT ---

Can you please try dnf-3.0 from the copr repo (dnf copr enable rpmsoftwaremanagement/dnf-nightly)?

--- Additional comment from Brian Lane on 2018-07-20 11:57:28 PDT ---

(In reply to Jaroslav Mracek from comment #22)
> Can you please try dnf-3.0 from the copr repo (dnf copr enable
> rpmsoftwaremanagement/dnf-nightly)?

Yes, libdnf-0.16.0-0.11g230dc638.fc28.x86_64 solves the problem for me.

--- Additional comment from Adam Williamson on 2018-09-04 14:05:57 PDT ---

Note that this prevents a test I'm currently working on from being useful for stable releases. I'd like to have an openQA test run on candidate updates that creates a netinst image using lorax then runs install tests with it; obviously for stable releases it should include all packages from the release repo, packages from the stable updates repo, and packages from the candidate update. But because of this bug, the image build always fails if both the release repo and the stable updates repo are used.

--- Additional comment from Matthew Miller on 2018-09-12 08:45:53 PDT ---

Just to make the above comment more obvious since I'd missed the implication:

We'd really like this bug fixed in the stable F28 release, not just in future releases, because it impedes testing of updates.

Comment 1 Brian Lane 2018-10-04 19:35:54 UTC
This appears to STILL be a problem with the versions currently in Fedora-29

dnf-3.6.1-1.fc29.noarch
dnf-data-3.6.1-1.fc29.noarch
dnf-plugins-core-3.0.4-1.fc29.noarch
dnf-yum-4.0.3.6.1-1.fc29.noarch
libdnf-0.20.0-1.fc29.x86_64
python3-dnf-3.6.1-1.fc29.noarch
python3-dnf-plugins-core-3.0.4-1.fc29.noarch
python3-libdnf-0.20.0-1.fc29.x86_64


This prevents lorax-composer from working:

[root@composer-f29 lorax]# composer blueprints depsolve example-development 
2018-10-04 15:35:14,929: example-development: There was a problem depsolving [('cmake', '*'), ('curl', '*'), ('file', '*'), ('gcc', '*'), ('gcc-c++', '*'), ('gdb', '*'), ('git', '*'), ('glibc-devel', '*'), ('gnupg2', '*'), ('libcurl-devel', '*'), ('make', '*'), ('openssl-devel', '*'), ('sqlite', '*'), ('sqlite-devel', '*'), ('sudo', '*'), ('tar', '*'), ('xz', '*'), ('xz-devel', '*'), ('zlib-devel', '*')]: 
 Problem 1: cannot install both curl-7.61.1-1.fc29.x86_64 and curl-7.61.1-1.fc29.x86_64
  - conflicting requests
 Problem 2: cannot install both gdb-8.2-2.fc29.x86_64 and gdb-8.2-2.fc29.x86_64
  - conflicting requests
 Problem 3: cannot install both git-2.19.0-1.fc29.x86_64 and git-2.19.0-1.fc29.x86_64
  - conflicting requests
 Problem 4: cannot install both glibc-devel-2.28-9.fc29.i686 and glibc-devel-2.28-9.fc29.i686
  - conflicting requests
 Problem 5: cannot install both glibc-devel-2.28-9.fc29.x86_64 and glibc-devel-2.28-9.fc29.x86_64
  - conflicting requests
 Problem 6: cannot install both libcurl-devel-7.61.1-1.fc29.i686 and libcurl-devel-7.61.1-1.fc29.i686
  - conflicting requests
 Problem 7: cannot install both libcurl-devel-7.61.1-1.fc29.x86_64 and libcurl-devel-7.61.1-1.fc29.x86_64
  - conflicting requests
 Problem 8: cannot install both openssl-devel-1:1.1.1-3.fc29.i686 and openssl-devel-1:1.1.1-3.fc29.i686
  - conflicting requests
 Problem 9: cannot install both openssl-devel-1:1.1.1-3.fc29.x86_64 and openssl-devel-1:1.1.1-3.fc29.x86_64
  - conflicting requests
blueprint: example-development v0.0.1

Comment 2 Fedora Blocker Bugs Application 2018-10-04 19:37:05 UTC
Proposed as a Blocker for 29-final by Fedora user bcl using the blocker tracking app because:

 Prevents lorax-composer from working.

Comment 3 Adam Williamson 2018-10-04 20:55:10 UTC
Is lorax-composer on any of Fedora's critical paths? It's not, AFAIK?

Comment 4 Brian Lane 2018-10-04 21:17:28 UTC
Created attachment 1490729 [details]
Test script

Here's a test script to reproduce the problem. Run this on a f29 VM and observe the output:

[root@composer-f29 ~]# ./dnf-fail-bash.py
bash wants to install: [<hawkey.Package object id 2787, bash-4.4.23-4.fc29.x86_64, updates-testing>, <hawkey.Package object id 12437, bash-4.4.23-4.fc29.x86_64, fedora>]
bash-4.4.* wants to install: [<hawkey.Package object id 2787, bash-4.4.23-4.fc29.x86_64, updates-testing>, <hawkey.Package object id 12437, bash-4.4.23-4.fc29.x86_64, fedora>]

It appears to have the same package in 2 different repos, but it is returning both of them instead of just one.
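The situation in this output can be modeled without dnf. The sketch below is an illustration under stated assumptions, not lorax or dnf code: `Pkg` and `dedupe_by_nevra` are hypothetical names, and the tuple fields only mimic the name/epoch/version/release/arch identity of a package. It collapses results that differ only in the repository they come from.

```python
from collections import namedtuple

# Hypothetical stand-in for a package object; reponame is not part of the identity.
Pkg = namedtuple("Pkg", "name epoch version release arch reponame")


def dedupe_by_nevra(pkgs):
    """Keep the first package seen for each name-epoch-version-release-arch."""
    seen = {}
    for pkg in pkgs:
        key = (pkg.name, pkg.epoch, pkg.version, pkg.release, pkg.arch)
        seen.setdefault(key, pkg)  # later duplicates from other repos are dropped
    return list(seen.values())


results = [
    Pkg("bash", 0, "4.4.23", "4.fc29", "x86_64", "updates-testing"),
    Pkg("bash", 0, "4.4.23", "4.fc29", "x86_64", "fedora"),
]
unique = dedupe_by_nevra(results)
print(unique)
```

Which repository should win when the NEVRA is identical is a policy choice; this sketch simply keeps the first one encountered.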


(And to answer Adam's question: no, lorax-composer isn't in the critical path, but lorax is, and so are other things that depend on dnf. I don't think this problem is limited to lorax-composer; it's just what's hitting it at the moment.)

Comment 5 Jaroslav Mracek 2018-10-05 18:55:03 UTC
Can you please point me to the code in lorax (or another tool) where the query.latest function is used, and show where and how you fill the goal? We can also have a call directly. I can help you fix the issue; just let us help you.

Comment 6 Brian Lane 2018-10-05 21:08:47 UTC
https://github.com/weldr/lorax/blob/master/src/pylorax/api/projects.py#L215

I've added the max( ... or [None]) to work around the problem for now.
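The `max( ... or [None])` idiom mentioned here can be shown in isolation. This is a minimal sketch of the idiom only, not the lorax code at the linked line: `pick_newest` is a hypothetical helper, and plain tuples stand in for package objects (the sketch assumes the real objects are orderable and that an empty result is falsy).

```python
def pick_newest(candidates):
    """Return the largest candidate, or None when there are no candidates.

    `candidates or [None]` substitutes a one-element list for an empty
    (falsy) result, so max() never raises ValueError on an empty sequence.
    """
    return max(candidates or [None])


# Tuples stand in for comparable package objects (assumption for the sketch).
newest = pick_newest([(4, 4, 19), (4, 4, 23)])
nothing = pick_newest([])
print(newest, nothing)
```

The caller still has to check for `None`, which is how a "No match" case would surface with this workaround.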

Comment 7 Jaroslav Mracek 2018-10-06 13:52:18 UTC
The correct solution would be as I mentioned above:
```
query = dnf.subject.Subject(name).get_best_query(dbo.sack).filter(version__glob=version, latest=True)
if not query:
    install_errors.append(("%s-%s" % (name, version), "No match"))
    continue
sltr = dnf.selector.Selector(dbo.sack)
sltr.set(pkg=query)
dbo.goal.install(select=sltr, optional=(not strict))
```
From my point of view, this is really not a bug and DNF returns everything correctly.

Comment 8 Jaroslav Mracek 2018-10-07 19:05:13 UTC
If you provide additional information about the nature of the "name" variable (whether it is a package name, a provide, a NEVRA, or a partial NEVRA), I can suggest additional performance improvements. In any case, using version__glob=version is not ideal for performance. Would you like additional help?

Comment 9 Daniel Mach 2018-10-08 11:49:14 UTC
Brian,
what Jaroslav suggests in comment#7 is the correct solution you should implement in lorax. If you need any help with changing lorax code, please reach out to us via email or (even better) schedule a call so we can work interactively on solving the problem.

Comment 10 František Zatloukal 2018-10-08 16:48:56 UTC
Discussed during the 2018-10-08 blocker review meeting: [1]

The decision to classify this bug as a RejectedBlocker was made:

"there does not seem to be any current violation of the release criteria here, this is not affecting F29 composes"

[1] https://meetbot-raw.fedoraproject.org/fedora-blocker-review/2018-10-08/f29-blocker-review.2018-10-08-16.00.log.txt

Comment 11 Brian Lane 2018-10-08 18:41:57 UTC
(In reply to Jaroslav Mracek from comment #7)
> The correct solution would be as I mentioned above:
> ```
> query =
> dnf.subject.Subject(name).get_best_query(dbo.sack).
> filter(version__glob=version, latest=True)
> if not query:
>     install_errors.append(("%s-%s" % (name, version), "No match"))
>     continue
> sltr = dnf.selector.Selector(dbo.sack)
> sltr.set(pkg=query)
> dbo.goal.install(select=sltr, optional=(not strict))
> ```
> From my point of view this is really not a bug and DNF returns everything
> correctly.

I'm already using package_install() which contains similar code to that.
Your example doesn't work though, Selector() isn't documented, and self.goal seems to not exist, although self._goal does :)

The problem, which I've worked around using max(), is that it returns too many results. Even if I change it to use this:

dnf.subject.Subject(name).get_best_query(dbo.sack).filter(version__glob=version).latest()

which according to https://dnf.readthedocs.io/en/latest/api_queries.html#dnf.query.Query.latest should limit the result to 1 package per name/arch?

I suppose this could be a corner case since the current issue seems to be that the *same* package is in 2 places for f29.

As for name, version the name is a package name, not a NEVRA or glob. The version is a glob or version number to match.

Comment 12 Jaroslav Mracek 2018-10-08 20:51:34 UTC
Selector is described here: https://dnf.readthedocs.io/en/latest/api_selector.html. Unfortunately, the description is not very detailed.

The description for latest() says 
Return a new query that limits the result to ``limit`` highest version of packages per package
    name and per architecture. In case the limit is negative number, it excludes the number of
    latest versions according to limit.

This means that it can return multiple packages with the latest version. In any case, it is not perfect and we will improve it.
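The quoted description of latest() can be modeled in a few lines of plain Python. This is a toy model based only on the wording above, not libdnf's implementation: `Pkg` and `latest` are hypothetical names, and versions are compared as integer tuples, which is a simplification of real RPM version comparison.

```python
from collections import defaultdict, namedtuple

# Hypothetical stand-in for a package; version is an int tuple for simplicity.
Pkg = namedtuple("Pkg", "name version arch reponame")


def latest(pkgs, limit=1):
    """Keep packages whose version is among the `limit` highest per (name, arch).

    A negative limit instead excludes that many of the highest versions.
    """
    groups = defaultdict(list)
    for pkg in pkgs:
        groups[(pkg.name, pkg.arch)].append(pkg)

    kept = []
    for group in groups.values():
        versions = sorted({p.version for p in group}, reverse=True)
        wanted = set(versions[:limit]) if limit >= 0 else set(versions[-limit:])
        # Every package carrying a wanted version survives, whatever its repo.
        kept.extend(p for p in group if p.version in wanted)
    return kept


pkgs = [
    Pkg("bash", (4, 4, 23), "x86_64", "updates-testing"),
    Pkg("bash", (4, 4, 23), "x86_64", "fedora"),
    Pkg("bash", (4, 4, 19), "x86_64", "fedora"),
]
newest = latest(pkgs)
older = latest(pkgs, limit=-1)
print(newest)
print(older)
```

Because the filter works on versions rather than on individual packages, the same version present in two repositories passes twice, which matches the duplicate bash results reported in comment 4.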

Here is some improvement https://github.com/rpm-software-management/dnf/pull/1240

But back to your code. What about:

for name, version in projects:
    # Find the best package matching the name + version glob.
    # dnf can return multiple packages if the package is in more than one repository.
    query = dbo.sack.query().filterm(name=name)
    if version:
        query.filterm(version=version)
    # Decide which is better: the "latest" filter or "latest_per_arch".
    query.filterm(latest=1)
    # query.filterm(latest_per_arch=1)
    if not query:
        install_errors.append(("%s-%s" % (name, version), "No match"))
        continue
    # If the query contains multiple arches, decide whether you want a single
    # install request or one request per arch.
    sltr = dnf.selector.Selector(dbo.sack)
    sltr.set(pkg=query)
    # In the near future there will be a "goal" attribute on the Base class.
    dbo._goal.install(select=sltr, optional=(not strict))

Comment 13 Brian Lane 2018-10-11 18:14:55 UTC
Thanks for the suggestions. I have it working with a bit more tweaking (I need to use provides__glob instead of name, and version__glob instead of version, in the filterm() calls).

Note that .filterm() isn't documented on dnf.readthedocs.io, which makes it a bit difficult to figure out how to use it.

Comment 14 Adam Williamson 2019-01-18 00:55:52 UTC
Brian: do you think we could get the fix on F28 as well as F29? It's impossible to build an installer image for F28 using updates repo atm...

Comment 15 Adam Williamson 2019-01-18 22:14:44 UTC
Never mind, the F28 bug turned out to be something else.

