Note: This is a public test instance of Red Hat Bugzilla. The data contained within is a snapshot of the live data so any changes you make will not be reflected in the production Bugzilla. Email is disabled so feel free to test any aspect of the site that you want. File any problems you find or give feedback at bugzilla.redhat.com.
Bug 1850290 - FreeIPA server upgrade from Fedora 31/32 to Rawhide (33) often fails
Summary: FreeIPA server upgrade from Fedora 31/32 to Rawhide (33) often fails
Keywords:
Status: CLOSED WORKSFORME
Alias: None
Product: Fedora
Classification: Fedora
Component: java-1.8.0-openjdk
Version: rawhide
Hardware: All
OS: Linux
unspecified
high
Target Milestone: ---
Assignee: Deepak Bhole
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard: openqa
Depends On:
Blocks: F33BetaBlocker
TreeView+ depends on / blocked
 
Reported: 2020-06-23 23:20 UTC by Adam Williamson
Modified: 2020-07-18 01:18 UTC (History)
9 users (show)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-07-18 01:18:16 UTC
Type: Bug
Embargoed:


Attachments (Terms of Use)
tarball of /var/log from the failed test (4.75 MB, application/gzip)
2020-06-23 23:20 UTC, Adam Williamson
no flags Details
tarball of /var/log from the failed test on 2020-06-01 (4.61 MB, application/octet-stream)
2020-06-30 22:37 UTC, Adam Williamson
no flags Details

Description Adam Williamson 2020-06-23 23:20:31 UTC
Created attachment 1698514 [details]
tarball of /var/log from the failed test

We have a test in openQA which does this:

1. Boot the previous release to the one being tested (in this case, Fedora 32) and deploy the system as a FreeIPA server
2. Similarly boot a second system to the previous release and deploy it as a FreeIPA client
3. Upgrade the server system to the release being tested (in this case, Rawhide); check that IPA is still running after the upgrade
4. Upgrade the client as well
5. Run a series of checks to make sure everything works after the upgrade

Up till Fedora-Rawhide-20200622.n.0 this test was passing. Since Fedora-Rawhide-20200623.n.0 today, it's failing. It fails at step 3: the system upgrade of the server fails, but ipa.service then fails to start. The system logs show that ipa-server-upgrade fails, apparently due to issues with the certificate server (dogtag) and/or the LDAP server (389-ds). I'm attaching a tarball of /var/log .

We're filing this against Java because of all the things that changed in the Fedora-Rawhide-20200623.n.0 compose it seems like the most likely suspect. There were a *lot* of at least *potentially* related things that changed, though. Aside from java-1.8.0-openjdk, other things that changed include bind, glibc, the kernel and openssl. Having looked at them, we don't *think* those changes are likely to be the cause, though.

This violates the Beta criteria: "It must be possible to successfully complete a direct upgrade from a fully updated installation of each of the last two stable Fedora Server releases with the system configured as a FreeIPA domain controller or postgresql server as specified in the relevant criteria. The upgraded system must meet all relevant release criteria, including criteria relating to functionality of the server software" - https://fedoraproject.org/wiki/Fedora_33_Beta_Release_Criteria#Server_upgrade_requirements , so proposing as a Beta blocker.

Comment 1 Severin Gehwolf 2020-06-24 18:50:35 UTC
This seems to be caused by tomcat@pki failing to start. Exception I see in the log is:

2020-06-23 11:09:33 [https-jsse-nio-8443-exec-6] WARNING: Invalid base64: [ipara]: java.lang.IllegalArgumentException: Last unit does not have enough valid bits
java.lang.IllegalArgumentException: Last unit does not have enough valid bits
	at java.util.Base64$Decoder.decode0(Base64.java:734)
	at java.util.Base64$Decoder.decode(Base64.java:526)
	at java.util.Base64$Decoder.decode(Base64.java:549)
	at org.mozilla.jss.netscape.security.util.Utils.base64decode(Utils.java:417)
	at org.dogtagpki.server.authentication.AuthToken.getInByteArray(AuthToken.java:113)
	at org.dogtagpki.server.authentication.AuthToken.getInStringArray(AuthToken.java:198)
	at com.netscape.cms.servlet.cert.CertProcessor.setAuthTokenIntoRequest(CertProcessor.java:132)
	at com.netscape.cms.servlet.cert.CertProcessor.populateRequests(CertProcessor.java:345)
	at com.netscape.cms.servlet.cert.EnrollmentProcessor.processEnrollment(EnrollmentProcessor.java:189)
	at com.netscape.cms.servlet.cert.EnrollmentProcessor.processEnrollment(EnrollmentProcessor.java:97)
	at com.netscape.cms.servlet.profile.ProfileSubmitServlet.processEnrollment(ProfileSubmitServlet.java:276)
	at com.netscape.cms.servlet.profile.ProfileSubmitServlet.process(ProfileSubmitServlet.java:130)
	at com.netscape.cms.servlet.base.CMSServlet.service(CMSServlet.java:493)
	at javax.servlet.http.HttpServlet.service(HttpServlet.java:741)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.catalina.security.SecurityUtil$1.run(SecurityUtil.java:282)
	at org.apache.catalina.security.SecurityUtil$1.run(SecurityUtil.java:279)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAsPrivileged(Subject.java:549)
	at org.apache.catalina.security.SecurityUtil.execute(SecurityUtil.java:314)
	at org.apache.catalina.security.SecurityUtil.doAsPrivilege(SecurityUtil.java:170)
	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:225)
	at org.apache.catalina.core.ApplicationFilterChain.access$000(ApplicationFilterChain.java:47)
	at org.apache.catalina.core.ApplicationFilterChain$1.run(ApplicationFilterChain.java:149)
	at org.apache.catalina.core.ApplicationFilterChain$1.run(ApplicationFilterChain.java:145)
	at java.security.AccessController.doPrivileged(Native Method)
	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:144)
	at org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:53)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.catalina.security.SecurityUtil$1.run(SecurityUtil.java:282)
	at org.apache.catalina.security.SecurityUtil$1.run(SecurityUtil.java:279)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAsPrivileged(Subject.java:549)
	at org.apache.catalina.security.SecurityUtil.execute(SecurityUtil.java:314)
	at org.apache.catalina.security.SecurityUtil.doAsPrivilege(SecurityUtil.java:253)
	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:191)
	at org.apache.catalina.core.ApplicationFilterChain.access$000(ApplicationFilterChain.java:47)
	at org.apache.catalina.core.ApplicationFilterChain$1.run(ApplicationFilterChain.java:149)
	at org.apache.catalina.core.ApplicationFilterChain$1.run(ApplicationFilterChain.java:145)
	at java.security.AccessController.doPrivileged(Native Method)
	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:144)
	at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:202)
	at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:96)
	at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:541)
	at com.netscape.cms.tomcat.ExternalAuthenticationValve.invoke(ExternalAuthenticationValve.java:82)
	at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:139)
	at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:92)
	at org.apache.catalina.valves.AbstractAccessLogValve.invoke(AbstractAccessLogValve.java:690)
	at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:74)
	at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:343)
	at org.apache.coyote.http11.Http11Processor.service(Http11Processor.java:373)
	at org.apache.coyote.AbstractProcessorLight.process(AbstractProcessorLight.java:65)
	at org.apache.coyote.AbstractProtocol$ConnectionHandler.process(AbstractProtocol.java:868)
	at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1590)
	at org.apache.tomcat.util.net.SocketProcessorBase.run(SocketProcessorBase.java:49)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
	at java.lang.Thread.run(Thread.java:748)

Comment 2 Severin Gehwolf 2020-06-24 19:11:22 UTC
From /var/log/messages:

Jun 23 11:31:50 ipa001 ipactl[717]: CalledProcessError: CalledProcessError(Command ['/bin/systemctl', 'start', 'pki-tomcatd'] returned non-zero exit status 1: 'Job for pki-tomcatd failed because a timeout was exceeded.\nSee "systemctl status pki-tomcatd" and "journalctl -xe" for details.\n')

And from /var/log/ipaupgrade.log

2020-06-23T15:28:46Z DEBUG Starting external process
2020-06-23T15:28:46Z DEBUG args=['/usr/sbin/tomcat', 'version']
2020-06-23T15:28:47Z DEBUG Process finished, return code=0
2020-06-23T15:28:47Z DEBUG stdout=Server version: Apache Tomcat/9.0.36
Server built:   Dec 12 1969 11:45:33 UTC
Server number:  9.0.36.0
OS Name:        Linux
OS Version:     5.8.0-0.rc2.20200622git625d3449788f.1.fc33.x86_64
Architecture:   amd64
JVM Version:    1.8.0_262-ea-b01
JVM Vendor:     Oracle Corporation

2020-06-23T15:28:47Z DEBUG stderr=
2020-06-23T15:28:47Z DEBUG Starting external process
2020-06-23T15:28:47Z DEBUG args=['/bin/systemctl', 'start', 'pki-tomcatd']
2020-06-23T15:31:50Z DEBUG Process finished, return code=1
2020-06-23T15:31:50Z DEBUG stdout=
2020-06-23T15:31:50Z DEBUG stderr=Job for pki-tomcatd failed because a timeout was exceeded.
See "systemctl status pki-tomcatd" and "journalctl -xe" for details.

2020-06-23T15:31:50Z ERROR IPA server upgrade failed: Inspect /var/log/ipaupgrade.log and run command ipa-server-upgrade manually.



I also see this in /var/log/pki/pki-tomcat/ca/debug*.log:

2020-06-23 11:07:16 [https-jsse-nio-8443-exec-3] SEVERE: Object certificate not found. Error Certificate not found: caSigningCert cert-pki-ca: Certificate not found: caSigningCert cert-pki-ca
Certificate not found: caSigningCert cert-pki-ca: Certificate not found: caSigningCert cert-pki-ca
	at com.netscape.ca.SigningUnit.init(SigningUnit.java:163)
	at com.netscape.ca.CertificateAuthority.initSigUnit(CertificateAuthority.java:1741)
	at com.netscape.ca.CertificateAuthority.init(CertificateAuthority.java:540)
	at com.netscape.cmscore.apps.CMSEngine.reinit(CMSEngine.java:1014)
	at org.dogtagpki.server.ca.CAConfigurator.reinitSubsystems(CAConfigurator.java:170)
	at com.netscape.cms.servlet.csadmin.Configurator.setupDatabase(Configurator.java:447)
	at org.dogtagpki.server.ca.CAConfigurator.setupDatabase(CAConfigurator.java:76)
	at org.dogtagpki.server.rest.SystemConfigService.setupDatabase(SystemConfigService.java:137)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.jboss.resteasy.core.MethodInjectorImpl.invoke(MethodInjectorImpl.java:140)
	at org.jboss.resteasy.core.ResourceMethodInvoker.invokeOnTarget(ResourceMethodInvoker.java:295)
	at org.jboss.resteasy.core.ResourceMethodInvoker.invoke(ResourceMethodInvoker.java:249)
	at org.jboss.resteasy.core.ResourceMethodInvoker.invoke(ResourceMethodInvoker.java:236)
	at org.jboss.resteasy.core.SynchronousDispatcher.invoke(SynchronousDispatcher.java:406)
	at org.jboss.resteasy.core.SynchronousDispatcher.invoke(SynchronousDispatcher.java:213)
	at org.jboss.resteasy.plugins.server.servlet.ServletContainerDispatcher.service(ServletContainerDispatcher.java:228)
	at org.jboss.resteasy.plugins.server.servlet.HttpServletDispatcher.service(HttpServletDispatcher.java:56)
	at org.jboss.resteasy.plugins.server.servlet.HttpServletDispatcher.service(HttpServletDispatcher.java:51)
	at javax.servlet.http.HttpServlet.service(HttpServlet.java:741)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.catalina.security.SecurityUtil$1.run(SecurityUtil.java:282)
	at org.apache.catalina.security.SecurityUtil$1.run(SecurityUtil.java:279)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAsPrivileged(Subject.java:549)
	at org.apache.catalina.security.SecurityUtil.execute(SecurityUtil.java:314)
	at org.apache.catalina.security.SecurityUtil.doAsPrivilege(SecurityUtil.java:170)
	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:225)
	at org.apache.catalina.core.ApplicationFilterChain.access$000(ApplicationFilterChain.java:47)
	at org.apache.catalina.core.ApplicationFilterChain$1.run(ApplicationFilterChain.java:149)
	at org.apache.catalina.core.ApplicationFilterChain$1.run(ApplicationFilterChain.java:145)
	at java.security.AccessController.doPrivileged(Native Method)
	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:144)
	at org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:53)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.catalina.security.SecurityUtil$1.run(SecurityUtil.java:282)
	at org.apache.catalina.security.SecurityUtil$1.run(SecurityUtil.java:279)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAsPrivileged(Subject.java:549)
	at org.apache.catalina.security.SecurityUtil.execute(SecurityUtil.java:314)
	at org.apache.catalina.security.SecurityUtil.doAsPrivilege(SecurityUtil.java:253)
	at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:191)
	at org.apache.catalina.core.ApplicationFilterChain.access$000(ApplicationFilterChain.java:47)
	at org.apache.catalina.core.ApplicationFilterChain$1.run(ApplicationFilterChain.java:149)
	at org.apache.catalina.core.ApplicationFilterChain$1.run(ApplicationFilterChain.java:145)
	at java.security.AccessController.doPrivileged(Native Method)
	at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:144)
	at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:202)
	at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:96)
	at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:541)
	at com.netscape.cms.tomcat.ExternalAuthenticationValve.invoke(ExternalAuthenticationValve.java:82)
	at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:139)
	at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:92)
	at org.apache.catalina.valves.AbstractAccessLogValve.invoke(AbstractAccessLogValve.java:690)
	at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:74)
	at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:343)
	at org.apache.coyote.http11.Http11Processor.service(Http11Processor.java:373)
	at org.apache.coyote.AbstractProcessorLight.process(AbstractProcessorLight.java:65)
	at org.apache.coyote.AbstractProtocol$ConnectionHandler.process(AbstractProtocol.java:868)
	at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1590)
	at org.apache.tomcat.util.net.SocketProcessorBase.run(SocketProcessorBase.java:49)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
	at java.lang.Thread.run(Thread.java:748)
Caused by: org.mozilla.jss.crypto.ObjectNotFoundException: Certificate not found: caSigningCert cert-pki-ca
	at org.mozilla.jss.CryptoManager.findCertByNicknameNative(Native Method)
	at org.mozilla.jss.CryptoManager.findCertByNickname(CryptoManager.java:865)
	at com.netscape.ca.SigningUnit.init(SigningUnit.java:159)
	... 72 more

Comment 3 Andrew John Hughes 2020-06-24 19:55:39 UTC
I don't see anything attributing this to OpenJDK. There seem to be failures throughout the system e.g.

Jun 23 11:09:46 ipa001 certmonger[10065]: 2020-06-23 11:09:46 [10065] error:0D06407A:asn1 encoding routines:a2d_ASN1_OBJECT:first num too large
Jun 23 11:09:53 ipa001 certmonger[10105]: Error obtaining initial credentials: Generic error (see e-text).
Jun 23 11:09:53 ipa001 certmonger[10105]: Error setting up ccache at the client: Generic error (see e-text).
Jun 23 11:11:53 ipa001 dnf[11039]:  - Curl error (6): Couldn't resolve host name for https://dl.fedoraproject.org/pub/fedora/linux/releases/32/Modular/x86_64/os/repodata/repomd.xml [Could not resolve host: dl.fedoraproject.org]
Jun 23 11:16:37 ipa001 named-pkcs11[10722]: LDAP error: Can't contact LDAP server: ldap_sync_poll() failed
Jun 23 11:16:37 ipa001 named-pkcs11[10722]: ldap_syncrepl will reconnect in 60 seconds
Jun 23 11:28:15 ipa001 setroubleshootd[797]: error: Unable to open sqlite database /var/lib/rpm/rpmdb.sqlite: unable to open database file

Comment 4 Adam Williamson 2020-06-24 21:45:43 UTC
Hum. The test passed again today. It's definitely strange that it failed twice in a row exactly the same way with yesterday's compose, but with today's it worked, at least. I'll keep an eye on it and investigate more if it breaks again.

Comment 5 Adam Williamson 2020-06-24 21:47:12 UTC
FWIW, today's test did log a *ton* of AVCs, which may have been related to the failure. I'll file a separate selinux-policy bug for these, but you can see them at https://openqa.fedoraproject.org/tests/620050/file/_console_avc_crash-avcs.txt .

Comment 6 Adam Williamson 2020-06-30 19:30:40 UTC
Well, since I last updated this, the test has failed again every time it's run:

https://openqa.fedoraproject.org/tests/621392#next_previous

ditto a nearly-identical test which upgrades from F31 (rather than F32):

https://openqa.fedoraproject.org/tests/621355#next_previous

so there is definitely still some kind of problem here :( Each test failure has its logs attached on the Logs & Assets tab, so we can poke through to try and find out some common thread. Note the logs will get garbage collected after a couple of weeks, if we don't have this figured out soon I'll try and remember to attach a few more samples.

Comment 7 Adam Williamson 2020-06-30 20:52:16 UTC
I tried a run with SELinux in permissive mode, but still failed:

https://openqa.fedoraproject.org/tests/621579

Comment 8 Adam Williamson 2020-06-30 22:03:13 UTC
Really having trouble getting a handle on this. It seems like the problem is tomcat, but I can't really find anything to say *why*. It starts to start up, then...just...doesn't:

[adamw@adam tmp]$ journalctl --file var-new/log/journal/153ded1a9d5a467b846b7bbd9d256fa9/system.journal -b-1 -u pki-tomcatd --no-pager
-- Logs begin at Tue 2020-06-30 13:15:34 PDT, end at Tue 2020-06-30 13:50:01 PDT. --
Jun 30 13:40:55 ipa001.domain.local systemd[1]: Starting PKI Tomcat Server pki-tomcat...
Jun 30 13:40:57 ipa001.domain.local server[13559]: Java virtual machine used: /usr/lib/jvm/jre-1.8.0-openjdk/bin/java
Jun 30 13:40:57 ipa001.domain.local server[13559]: classpath used: /usr/share/tomcat/bin/bootstrap.jar:/usr/share/tomcat/bin/tomcat-juli.jar:/usr/lib/java/commons-daemon.jar
Jun 30 13:40:57 ipa001.domain.local server[13559]: main class used: org.apache.catalina.startup.Bootstrap
Jun 30 13:40:57 ipa001.domain.local server[13559]: flags used:
Jun 30 13:40:57 ipa001.domain.local server[13559]: options used: -Dcatalina.base=/var/lib/pki/pki-tomcat -Dcatalina.home=/usr/share/tomcat -Djava.endorsed.dirs= -Djava.io.tmpdir=/var/lib/pki/pki-tomcat/temp -Djava.util.logging.config.file=/var/lib/pki/pki-tomcat/conf/logging.properties -Djava.util.logging.manager=org.apache.juli.ClassLoaderLogManager     -Djava.security.manager     -Djava.security.policy==/var/lib/pki/pki-tomcat/conf/catalina.policy
Jun 30 13:40:57 ipa001.domain.local server[13559]: arguments used: start
Jun 30 13:40:57 ipa001.domain.local ipa-pki-wait-running[13560]: pki.client: /usr/libexec/ipa/ipa-pki-wait-running:60: The subsystem in PKIConnection.__init__() has been deprecated (https://www.dogtagpki.org/wiki/PKI_10.8_Python_Changes).
Jun 30 13:40:57 ipa001.domain.local ipa-pki-wait-running[13560]: ipa-pki-wait-running: Created connection http://ipa001.domain.local:8080/ca
Jun 30 13:40:57 ipa001.domain.local ipa-pki-wait-running[13560]: ipa-pki-wait-running: Connection failed: HTTPConnectionPool(host='ipa001.domain.local', port=8080): Max retries exceeded with url: /ca/admin/ca/getStatus (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f2546d9caf0>: Failed to establish a new connection: [Errno 111] Connection refused'))
Jun 30 13:40:58 ipa001.domain.local java[13559]: usr/lib/api/apiutil.c Could not open /run/lock/opencryptoki/LCK..APIlock
Jun 30 13:40:58 ipa001.domain.local ipa-pki-wait-running[13560]: ipa-pki-wait-running: Connection failed: HTTPConnectionPool(host='ipa001.domain.local', port=8080): Max retries exceeded with url: /ca/admin/ca/getStatus (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f2546d9cf70>: Failed to establish a new connection: [Errno 111] Connection refused'))
Jun 30 13:40:59 ipa001.domain.local ipa-pki-wait-running[13560]: ipa-pki-wait-running: Connection failed: HTTPConnectionPool(host='ipa001.domain.local', port=8080): Max retries exceeded with url: /ca/admin/ca/getStatus (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f2546d57370>: Failed to establish a new connection: [Errno 111] Connection refused'))

eventually ipa-pki-wait-running gets bored and times out.

Comment 9 Adam Williamson 2020-06-30 22:36:38 UTC
Hum, well, here's a thing: looking more closely at the history of the two relevant tests, I think this started happening earlier than I thought. The test failed twice in Fedora-Rawhide-20200601.n.1 in a way that looks very similar:

https://openqa.fedoraproject.org/tests/609532
https://openqa.fedoraproject.org/tests/609584

and there are periodic failures since then. The F31 -> Rawhide test failed in the same module in Fedora-Rawhide-20200529.n.2 and even in Fedora-Rawhide-20200516.n.0 ; those logs have been garbage collected now so I can't tell for sure if they were the same.

I'll attach the logs from the 2020-06-01 failure before they get GCed...

Comment 10 Adam Williamson 2020-06-30 22:37:22 UTC
Created attachment 1699394 [details]
tarball of /var/log from the failed test on 2020-06-01

Comment 11 Adam Williamson 2020-07-11 01:50:36 UTC
so the test has now passed every day for the last week. I officially give up. If it keeps passing next week I guess I'll just close the bug again.

Comment 12 Adam Williamson 2020-07-18 01:18:16 UTC
This kept passing for a while, now it's failing again, but it's because of a known bug https://bugzilla.redhat.com/show_bug.cgi?id=1857043 . So I'll close this again for now.


Note You need to log in before you can comment on or make changes to this bug.