Bug 2115865
Summary: | dnssec-keyfromlabel fails with openssl-pkcs11-0.4.12-1.fc36 | ||
---|---|---|---|
Product: | [Fedora] Fedora | Reporter: | Florence Blanc-Renaud <frenaud> |
Component: | openssl-pkcs11 | Assignee: | Jakub Jelen <jjelen> |
Status: | CLOSED CURRENTRELEASE | QA Contact: | Fedora Extras Quality Assurance <extras-qa> |
Severity: | high | Docs Contact: | |
Priority: | unspecified | ||
Version: | 37 | CC: | ansasaki, awilliam, crypto-team, gmarr, jjelen, jvrodrigues, pemensik, robatino |
Target Milestone: | --- | Keywords: | Reopened |
Target Release: | --- | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | RejectedBlocker AcceptedFreezeException | ||
Fixed In Version: | Doc Type: | If docs needed, set a value | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2023-02-07 16:10:09 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | |||
Bug Depends On: | |||
Bug Blocks: | 2009538, 2120605 |
Description
Florence Blanc-Renaud
2022-08-05 15:13:28 UTC
This is probably related to the https://github.com/OpenSC/libp11/pull/460 -- do you have a simple reproducer, which can be tested with a scratch build or should I just update and see if it works?

Hi Jakub, if you have a copr build with the fix, I can relaunch our test with your fix and quickly check if it's working. Unfortunately I don't have a reproducer that would be shorter than the one for the description, requiring ipa server installation.

I just submitted the Fedora 36 build https://koji.fedoraproject.org/koji/taskinfo?taskID=90606849 which should fix this. It has changes from the above PR and one more commit that is introduced upstream and tested in upstream CI as well as with the downstream make check. Let me know if it will solve the issue for you or not.

FEDORA-2022-2f6e9a0b6c has been submitted as an update to Fedora 36. https://bodhi.fedoraproject.org/updates/FEDORA-2022-2f6e9a0b6c

Hum, seeing odd results in openQA testing. On Fedora 36 the tests all passed, but on Rawhide, FreeIPA tests failed:
https://openqa.fedoraproject.org/tests/1354594
https://openqa.fedoraproject.org/tests/1354591
https://openqa.fedoraproject.org/tests/1354592
it seems like after the server updated to this version of openssl-pkcs11, client DNS queries stopped working. I'm not sure why not, we don't get logs from the server in this sort of failure case because of limitations of how openQA handles failures (I can try and bodge some out later). I've rerun the tests several times and the failure has reproduced each time. I'll try re-running them once more, though...

Still failed on the latest retry.

FEDORA-2022-2f6e9a0b6c has been pushed to the Fedora 36 testing repository.
Soon you'll be able to install the update with the following command: `sudo dnf upgrade --enablerepo=updates-testing --refresh --advisory=FEDORA-2022-2f6e9a0b6c` You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2022-2f6e9a0b6c See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.

I see the same behavior as Adam: the fix is working on fedora 36 but failing on fedora 37. On fedora37: the journal shows an error calling dnssec-keyfromlabel:
fatal: failed to get key example.test/RSASHA256: no engine
----- 8< -----
Aug 09 07:02:53 master.ipa.test audit[23932]: AVC avc: denied { read } for pid=23932 comm="dnssec-keyfroml" name="enabled" dev="sysfs" ino=3348 scontext=system_u:system_r:ipa_dnskey_t:s0 tcontext=system_u:object_r:sysfs_t:s0 tclass=file permissive=1
Aug 09 07:02:53 master.ipa.test audit[23932]: AVC avc: denied { open } for pid=23932 comm="dnssec-keyfroml" path="/sys/kernel/mm/transparent_hugepage/enabled" dev="sysfs" ino=3348 scontext=system_u:system_r:ipa_dnskey_t:s0 tcontext=system_u:object_r:sysfs_t:s0 tclass=file permissive=1
Aug 09 07:02:53 master.ipa.test ipa-dnskeysyncd[23573]: Traceback (most recent call last):
Aug 09 07:02:53 master.ipa.test ipa-dnskeysyncd[23573]: File "/usr/libexec/ipa/ipa-dnskeysyncd", line 130, in <module>
Aug 09 07:02:53 master.ipa.test ipa-dnskeysyncd[23573]: while ldap_connection.syncrepl_poll(all=1, msgid=ldap_search):
Aug 09 07:02:53 master.ipa.test ipa-dnskeysyncd[23573]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Aug 09 07:02:53 master.ipa.test ipa-dnskeysyncd[23573]: File "/usr/lib64/python3.11/site-packages/ldap/syncrepl.py", line 435, in syncrepl_poll
Aug 09 07:02:53 master.ipa.test ipa-dnskeysyncd[23573]: self.syncrepl_entry(dn, attrs, c.entryUUID)
Aug 09 07:02:53 master.ipa.test ipa-dnskeysyncd[23573]: File "/usr/lib/python3.11/site-packages/ipaserver/dnssec/syncrepl.py", line 70, in syncrepl_entry
Aug 09 07:02:53 master.ipa.test ipa-dnskeysyncd[23573]: self.application_add(uuid, dn, attributes)
Aug 09 07:02:53 master.ipa.test ipa-dnskeysyncd[23573]: File "/usr/lib/python3.11/site-packages/ipaserver/dnssec/keysyncer.py", line 84, in application_add
Aug 09 07:02:53 master.ipa.test ipa-dnskeysyncd[23573]: self.key_meta_add(uuid, dn, attributes)
Aug 09 07:02:53 master.ipa.test ipa-dnskeysyncd[23573]: File "/usr/lib/python3.11/site-packages/ipaserver/dnssec/keysyncer.py", line 137, in key_meta_add
Aug 09 07:02:53 master.ipa.test ipa-dnskeysyncd[23573]: self.bindmgr_sync(self.dnssec_zones)
Aug 09 07:02:53 master.ipa.test ipa-dnskeysyncd[23573]: File "/usr/lib/python3.11/site-packages/ipaserver/dnssec/keysyncer.py", line 150, in bindmgr_sync
Aug 09 07:02:53 master.ipa.test ipa-dnskeysyncd[23573]: self.bindmgr.sync(dnssec_zones)
Aug 09 07:02:53 master.ipa.test ipa-dnskeysyncd[23573]: File "/usr/lib/python3.11/site-packages/ipaserver/dnssec/bindmgr.py", line 232, in sync
Aug 09 07:02:53 master.ipa.test ipa-dnskeysyncd[23573]: self.sync_zone(zone)
Aug 09 07:02:53 master.ipa.test ipa-dnskeysyncd[23573]: File "/usr/lib/python3.11/site-packages/ipaserver/dnssec/bindmgr.py", line 205, in sync_zone
Aug 09 07:02:53 master.ipa.test ipa-dnskeysyncd[23573]: self.install_key(zone, uuid, attrs, tempdir)
Aug 09 07:02:53 master.ipa.test ipa-dnskeysyncd[23573]: File "/usr/lib/python3.11/site-packages/ipaserver/dnssec/bindmgr.py", line 146, in install_key
Aug 09 07:02:53 master.ipa.test ipa-dnskeysyncd[23573]: result = ipautil.run(cmd, capture_output=True)
Aug 09 07:02:53 master.ipa.test ipa-dnskeysyncd[23573]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Aug 09 07:02:53 master.ipa.test ipa-dnskeysyncd[23573]: File "/usr/lib/python3.11/site-packages/ipapython/ipautil.py", line 599, in run
Aug 09 07:02:53 master.ipa.test ipa-dnskeysyncd[23573]: raise CalledProcessError(
Aug 09 07:02:53 master.ipa.test ipa-dnskeysyncd[23573]: ipapython.ipautil.CalledProcessError: CalledProcessError(Command ['/usr/bin/dnssec-keyfromlabel', '-E', 'pkcs11', '-K', '/var/named/dyndb-ldap/ipa/master/example.test/tmp0tge94y3', '-a', b'RSASHA256', '-l', b'pkcs11:object=90c3402b3109c4ff1bb987b8ed92eca4;pin-source=/var/lib/ipa/dnssec/softhsm_pin', '-P', b'20220809070250', '-A', b'20220809070250', '-I', 'none', '-D', 'none', '-f', 'KSK', '-E', 'pkcs11', 'example.test.'] returned non-zero exit status 1: 'dnssec-keyfromlabel: fatal: failed to get key example.test/RSASHA256: no engine\n')
----- 8< -----

Note on the AVC denials there: they show "permissive=1" which means Florence was testing with SELinux in permissive mode, so the access was not really denied (in permissive mode, the "denial" is just logged, but access is allowed). So they likely aren't the issue here.

The openssl-pkcs11 is on the same version for both Fedora 36 and 37/rawhide. I do not see any difference in the openssl package either, but there is rebase in IPA as well as in bind so I assume the problem will be somewhere in that direction. Florence, is there a way to try on Fedora 37 with the previously working openssl-pkcs11 0.4.11 [1] to rule out the openssl-pkcs11 fault?
[1] https://koji.fedoraproject.org/koji/buildinfo?buildID=2017061

openQA suggests the problem really is triggered by openssl-pkcs11 0.4.11 , because we (now) test all Rawhide updates in openQA. The way the test works, we start from a base image, then update to 'latest 'stable' packages plus the packages from the update', and run the tests. We tested other Rawhide updates around the openssl-pkcs11 0.4.11 update, and all of them passed these tests; the tests fail *only* on the openssl-pkcs11 0.4.1 update. I would expect the tests will start failing on *all* Rawhide updates as soon as we get a new compose with openssl-pkcs11 0.4.1 in it, as then it will be included in the set of 'latest 'stable' updates' and included in all tests. But the results from yesterday and today clearly suggest the problem is really triggered by this update, somehow.
I agree it's odd that it happens on 37 not 36, but there we are.

You can see what I mean on openQA's "next and previous tests" view: https://openqa.fedoraproject.org/tests/1355264#next_previous that's the view for one of the failed tests (basic realm join with sssd test). You can see that it failed twice for FEDORA-2022-19f8b3b23f (that's openssl-pkcs11 0.4.1), with tests immediately before and after that for other updates passing. The view is filtered to tests for the same release version, so those are all other Rawhide update tests. The failure on _advisory_update for FEDORA-2022-1c95f88c82 was a different issue (and you can see the test for that update was re-run a bit later and passed).

I am not so sure about the f37 issue. On fedora36, the error was:
warning: ENGINE_load_private_key failed (not found)\ndnssec-keyfromlabel: fatal: failed to get key dnssec.test/RSASHA256: not found
On fedora 37, I reproduced the same issue (zone not signed and dnssec-keyfromlabel command failing) even with openssl-pkcs11-0.4.11-9.fc37.x86_64. But the error is different:
fatal: failed to get key dnssec.test/RSASHA256: no engine
Note that the same error msg is printed with 0.4.11-9.fc37 or 0.4.12-2.fc37. There is probably something else that we are failing to identify.

The behavior seems to depend on bind version:
bind-9.16.30-1.fc37 + openssl-pkcs11-0.4.12-1.fc37: fatal: failed to get key dnssec.test/RSASHA256: not found
bind-9.16.30-1.fc37 + openssl-pkcs11-0.4.12-2.fc37: success
=> openssl-pkcs11 update to -2 correctly fixes the issue.
bind-9.18.5-1.fc37 + openssl-pkcs11-0.4.12-1.fc37: fatal: failed to get key dnssec.test/RSASHA256: no engine
bind-9.18.5-1.fc37 + openssl-pkcs11-0.4.12-2.fc37: fatal: failed to get key dnssec.test/RSASHA256: no engine
=> the bind update is responsible for the "no engine" error.
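The two failure modes reported so far can be told apart from the dnssec-keyfromlabel stderr text alone. A minimal sketch of that triage follows, assuming the message format quoted in this bug; the helper name `classify_keyfromlabel_error` and its return labels are hypothetical and not part of FreeIPA, bind or openssl-pkcs11.

```python
import re

# Hypothetical triage helper (not part of any package mentioned in this bug).
# It only recognises the two messages quoted in this report.
_ERROR_RE = re.compile(r"fatal: failed to get key \S+: (?P<reason>.+)")


def classify_keyfromlabel_error(stderr: str) -> str:
    """Map a dnssec-keyfromlabel error message to the suspected component."""
    match = _ERROR_RE.search(stderr)
    if match is None:
        return "unknown"
    reason = match.group("reason").strip()
    if reason == "not found":
        # Seen with bind 9.16.30 + openssl-pkcs11 0.4.12-1; fixed by -2.
        return "openssl-pkcs11"
    if reason == "no engine":
        # Seen with bind 9.18.5 regardless of the openssl-pkcs11 build.
        return "bind"
    return "unknown"


if __name__ == "__main__":
    sample = (
        "dnssec-keyfromlabel: fatal: failed to get key "
        "example.test/RSASHA256: no engine\n"
    )
    print(classify_keyfromlabel_error(sample))
```

This only mirrors the four combinations listed above; any other message is reported as "unknown" rather than guessed at.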
I'm opening a new BZ against bind 9.18 for the "no engine" issue: https://bugzilla.redhat.com/show_bug.cgi?id=2117342 and I think we can consider that this BZ with the original error "not found" is properly fixed by openssl-pkcs11 0.4.12-2.

But none of the openQA tests failed on the bind-9.18.5-1.fc37 update: https://openqa.fedoraproject.org/tests/overview?version=37&groupid=2&build=Update-FEDORA-2022-2a542348da&distri=fedora

*** Bug 2117540 has been marked as a duplicate of this bug. ***

FEDORA-2022-2f6e9a0b6c has been pushed to the Fedora 36 stable repository. If problem still persists, please make note of it in this bug report.

and another data point: the first Fedora 37 Branched compose - Fedora-37-20220811.n.0 - included openssl-pkcs11-0.4.12-2.fc37:
https://kojipkgs.fedoraproject.org/compose/branched/Fedora-37-20220811.n.0/compose/Everything/x86_64/os/Packages/o/
and on that compose, all the FreeIPA enrolment tests failed:
https://openqa.fedoraproject.org/tests/1358294
https://openqa.fedoraproject.org/tests/1358301
https://openqa.fedoraproject.org/tests/1358315
for the second Branched compose - Fedora-37-20220811.n.1 - I had Kevin untag openssl-pkcs11-0.4.12-2.fc37 (since we never synced the first compose this was still OK), so it includes openssl-pkcs11-0.4.12-1.fc37:
https://kojipkgs.fedoraproject.org/compose/branched/Fedora-37-20220811.n.1/compose/Everything/x86_64/os/Packages/o/
and on that compose, the FreeIPA enrolment tests passed:
https://openqa.fedoraproject.org/tests/1361775
https://openqa.fedoraproject.org/tests/1361789
https://openqa.fedoraproject.org/tests/1361768
so the openQA tests are definitely, reliably, reproducibly working with 0.4.12-1 and failing with 0.4.12-2. bind was the same version in both composes, bind-9.18.5-1.fc37. I wonder if perhaps FreeIPA is automatically disabling dnssec with 0.4.12-1 (because it's really broken) but not doing it with 0.4.12-2, or something like that? I'll have to look at the logs.
Filed https://bugzilla.redhat.com/show_bug.cgi?id=2117859 for the f37-specific, -2-specific issue.

Thank you for digging further. I actually see some weird issues also in the Fedora 36 container images with the latest two builds after rebase [1], but I was not able to reproduce them locally. Similarly all the low-level tests with openssl-pkcs11 (both upstream and downstream) worked just ok so I would need to have a bit more information on what is IPA doing at the time of failure and how the softhsm? is set up.
[1] https://gitlab.com/jjelen/build-images/-/jobs/2883523060

Let's close this and continue the discussion at https://bugzilla.redhat.com/show_bug.cgi?id=2117859 .

So since we're still holding the -2 update out of Fedora 37 (Branched) because of #2117859 , I'm gonna nominate *this* bug as an F37 Beta blocker. The overall status quo is:
* F36 has 0.4.12-2 and appears fine
* F37 has 0.4.12-1 and so is affected by this bug
* Rawhide has 0.4.12-2 and is affected by https://bugzilla.redhat.com/show_bug.cgi?id=2117859
so I guess for F37, the question is, can we release Beta with this bug? If not, we need a build that fixes both this and 2117859 somehow.

Discussed during the 2022-08-29 blocker review meeting: [0] The decision to classify this bug as a "RejectedBlocker (Beta)" and an "AcceptedFreezeException (Beta)" was made on the grounds that dnssec functionality is not covered in the criteria, and the bug does not affect default functionality. It's accepted as a freeze exception because we do consider dnssec support important and would like this fixed if we can also fix 2117859 so things work correctly.
[0] https://meetbot.fedoraproject.org/fedora-blocker-review/2022-08-29/f37-blocker-review.2022-08-29-16.01.txt

I am trying to wrap my head around all these bugs around this issue. If I see right, the Fedora 36 is solved.
In Fedora 37 and rawhide, the bind is built without the engine support (see the bug #2117342) so I assume this is also a base for the other failures described in this bug, bug #2122841 and bug #2117859. Can we close this bug or is there still something to address in the openssl-pkcs11?

I don't understand the timing. If the ultimate cause of the trouble is bind, why did this bug and 2117859 show up with openssl-pkcs11 updates, not with bind updates? Why do we suddenly have a new problem with bind with the 9.18.6 build that only just showed up? Maybe it all makes sense to you, but it does not to me.

@awilliam There are 2 different issues related to DNSSEC:
- signing of DNS zones managed by IPA
- DNS resolution of zones not managed by IPA when DNSSEC is enabled at IPA level
The first scenario is tested by IPA nightly tests but I don't think it's tested by OpenQA. It's the issue described in this BZ 2115865, with an error when running dnssec-keyfromlabel.
The 2nd issue can be seen by OpenQA in the client enrollment test. Described in 2117859. Client is using IPA server as DNS resolver and fails to resolve kojipkgs.fedoraproject.org, which is a zone not managed by IPA.

So we can agree that the 1st issue is not an issue in openssl-pkcs11, but a bind issue described in #2122841 and in bug #2117342, which removed support for openssl engines with the rebase. Do we have some reproducer for the 2nd issue without the need to go through the first issue? With all the reproducers I tried so far, I am hitting the first issue only ("no engine" error).

Florence: sure, that part I get. But we had both those bug reports filed weeks ago, with bind 9.18.5. Then bind 9.18.6 showed up and caused a new bug we had never seen before - https://bugzilla.redhat.com/show_bug.cgi?id=2122841 - which happens even when deploying FreeIPA with dnssec disabled. What I'm confused by is that you seemed to be saying that new bug is really just some other case of one of the existing bugs?
Jakub: openQA hits issue #2 with bind 9.18.5 and openssl-pkcs11 0.4.11-2, when deploying the server with dnssec enabled and immediately trying to enrol a client which hits a dnssec-signed zone during client deployment. Basically, deploy the server with dnssec enabled, then have something try to resolve a host in a signed zone not managed by the server. If you somehow hit #1 during that process, you must be doing something different to how openQA does it, but I'm not sure precisely what. You can get the gist of openQA's server deployment process here: https://pagure.io/fedora-qa/os-autoinst-distri-fedora/blob/main/f/tests/role_deploy_domain_controller.pm it's perl, but it's basically just a big long set of commands it runs. The logic right now disables dnssec on Rawhide, that is precisely to work around issue #2 so every Rawhide test doesn't fail on that bug.

Adam, how is this bug after the bug #2122841 and bug #2117342 are fixed? Is there still some issue to fix here or in bug #2117859 or can I close them?

I *think* everything should be fixed. I was waiting for the updates to go stable for F37, then was going to drop the disabling of dnssec and see if tests still pass. The F37 update went stable over the weekend, so I'm going to try dropping the dnssec disablement today.

Another half year passed so I assume this is fixed so closing.

oh yes, sorry, I thought I'd updated this. yeah, it's OK now. we still have problems with *upgrades* with dnssec enabled, but fresh deployment on a single release is OK.