Bug 1868482 - krb5-1.18.2-19.fc32 breaks FreeIPA replica deployment (openQA test)
Summary: krb5-1.18.2-19.fc32 breaks FreeIPA replica deployment (openQA test)
Keywords:
Status: CLOSED EOL
Alias: None
Product: Fedora
Classification: Fedora
Component: 389-ds
Version: 32
Hardware: All
OS: Linux
Priority: unspecified
Severity: urgent
Target Milestone: ---
Assignee: Simon Pichugin
QA Contact: Fedora Extras Quality Assurance
URL: https://github.com/389ds/389-ds-base/...
Whiteboard: openqa
Duplicates: 1868207 1869009
Depends On: 1915868
Blocks:
 
Reported: 2020-08-12 19:56 UTC by Adam Williamson
Modified: 2023-09-12 03:46 UTC
CC List: 15 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-05-25 17:12:20 UTC
Type: Bug
Embargoed:


Attachments
/var/log tarball from the replica (1.62 MB, application/octet-stream), 2020-08-12 19:57 UTC, Adam Williamson
dirsrv on primary with connection logging (306.22 KB, application/gzip), 2020-09-01 16:54 UTC, Robbie Harwood
dirsrv logs from replica (16.01 KB, application/gzip), 2020-09-01 16:56 UTC, Robbie Harwood
test 389 + krb5, level 8 dirsrv logs (438.43 KB, application/gzip), 2020-12-02 20:52 UTC, Robbie Harwood
replica installation log (389 test, krb5 1.19) for collation (95.91 KB, text/plain), 2020-12-02 20:53 UTC, Robbie Harwood
ipa.rharwood.test.dirsrv.errors.log (13.31 KB, text/plain), 2020-12-07 16:01 UTC, Simon Pichugin

Description Adam Williamson 2020-08-12 19:56:13 UTC
The recent krb5-1.18.2-19.fc32 update for Fedora 32 - https://bodhi.fedoraproject.org/updates/FEDORA-2020-e33d26ea4e - seems to have broken FreeIPA replica deployment. Unfortunately, some kind of blip in the test scheduling process (not sure what happened yet) meant we didn't run the openQA tests on that update itself, and it went stable without anyone noticing the problem. I noticed the problem now because *all* F32 updates are now failing the test, not because they themselves are broken but because the problematic krb5 update went stable.

I identify krb5 as the culprit because I did a scratch build which was simply the previous stable build - krb5-1.18.2-10.fc32 - with the release tag bumped to 100, so it was versioned krb5-1.18.2-100.fc32:
https://koji.fedoraproject.org/koji/taskinfo?taskID=49169392
and ran the openQA tests on that:
https://openqa.fedoraproject.org/tests/overview?groupid=2&build=Kojitask-49169427-NOREPORT&version=32&distri=fedora

so I was effectively testing a downgrade to krb5-1.18.2-10.fc32, and the tests passed. So the krb5 upgrade does seem to be the problem.

The actual failure goes like this. In openQA we have three tests that run together, a master, a replica, and a client. First the master should deploy itself, then the replica should deploy itself as a replica of the master, then the client should try to enrol against the replica. The master deploys itself fine, but the replica deployment fails partway through, apparently due to a failure to reach the LDAP server on the master:
https://openqa.fedoraproject.org/tests/638488#step/realmd_join_sssd/29

I'm attaching a tarball of /var/log from the replica. Getting the logs out of the master is harder but if necessary I can probably bodge it up somehow.

Comment 1 Adam Williamson 2020-08-12 19:57:21 UTC
Created attachment 1711221 [details]
/var/log tarball from the replica

Comment 2 Robbie Harwood 2020-08-13 16:20:24 UTC
*** Bug 1868207 has been marked as a duplicate of this bug. ***

Comment 3 Robbie Harwood 2020-08-13 16:21:12 UTC
Changes have been reverted.  Bodhi: https://bodhi.fedoraproject.org/updates/FEDORA-2020-d10a284af3

Comment 4 Robbie Harwood 2020-08-13 19:57:20 UTC
There's shockingly little information in those logs.  If there's a way to bump ns_slapd logging (or even better, get KRB5_TRACE from the failing command), I'd appreciate it.  I'll try to reproduce on local machines but may be unable to.

Comment 5 Adam Williamson 2020-08-13 20:53:55 UTC
the tricky thing with FreeIPA tests is there's seventeen zillion different components that each have their own debug logging settings and if we turn them all on at once you can't find anything...also none of them turn on debug logging the same way so I always have to ask how. here is me asking. :D

Comment 6 Robbie Harwood 2020-08-13 22:11:11 UTC
Well, hopefully not me - it's not my area of expertise, and I think we have the same complaints in this area :)

Maybe we can bother Rob?

Comment 7 Jan Pazdziora 2020-08-16 19:27:11 UTC
*** Bug 1869009 has been marked as a duplicate of this bug. ***

Comment 8 Rob Crittenden 2020-08-17 14:55:37 UTC
Is anything needed here since the root cause was identified?

Comment 9 Adam Williamson 2020-08-17 15:58:21 UTC
Was the root cause identified? I thought we were still looking for it. The 'fix' in -20 was only to revert the dns_canonicalize_hostname changes, but I think rharwood would like to put them back only with the actual cause of this bug fixed. I can try and provide more useful logs, but need direction as to what would be useful...or perhaps it's easier for you guys to get it from your CI since you have the issue reproduced there too?

Comment 10 Rob Crittenden 2020-08-17 17:36:53 UTC
Well, the root cause for IPA failing was the new package :P

I don't think we can debug this from automation.

The recorded reason for failure to start replication is:

Error (-2) Problem connecting to replica - LDAP error: Local error (connection error)

That comes from 389-ds, which is acting as a client in this case, trying to resolve the IPA server to set up a replication connection with it.

Comment 11 Adam Williamson 2020-08-17 17:47:37 UTC
well yeah, and that's why I was trying to get the logs out of the server end (the *original* FreeIPA server) too, but just didn't manage it, it's tricky with how openQA works.

Comment 12 Robbie Harwood 2020-08-21 18:35:36 UTC
(Thanks for your patience while I was PTO.)

(In reply to Rob Crittenden from comment #10)
> I don't think we can debug this from automation.

I will try to reproduce locally and see if I can get anything.

Comment 13 Robbie Harwood 2020-08-24 21:36:28 UTC
Reproduced locally, but this is awful to debug.

First, the file seems to use a mixture of logger and print - the print statements don't show up in the log file.  For my purposes, this means that `print("Starting replication, please wait until this has completed.")` is invisible.  So nothing below start_replication() logs a hint of progress, as far as I can tell, and there aren't external calls.

Second, the LDAP functions return error codes, but they're discarded.  On line 1240, we have the return code from check_repl_init, which is meaningful.  And check_repl_init() prints a bunch of stuff, which would be helpful, but again - print statements.

On the replica, no krb5 calls seem to fail.  KCM:0 is left with creds for admin -> {ldap/ipa.rharwood.test, krbtgt/RHARWOOD.TEST, HTTP/ipa.rharwood.test}, while FILE:/tmp/krbccbmvno7oj/ccache I believe has host/replica.rharwood.test -> {krbtgt/RHARWOOD.TEST, HTTP/ipa.rharwood.test, ldap/ipa.rharwood.test} based on KRB5_TRACE output (plus some IPA session cookie stuff).

On the server, nothing is logged for the access attempt that I can find.  I'm not totally out of ideas here, but would appreciate information on how to actually get logs out of dirsrv / the Python LDAP client, preferably without needing to use pdb.
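
To illustrate the pattern I'm complaining about (a minimal sketch with hypothetical simplified names, not the actual installer code):

    import logging

    logging.basicConfig(filename="replica-install.log", level=logging.DEBUG)
    logger = logging.getLogger("ipareplica-install")

    def check_repl_init(conn):
        # hypothetical stand-in for the real lib389/ipa helper
        print("agreement status: error -2")   # stdout only; never reaches the log file
        return -2                             # a meaningful LDAP error code

    check_repl_init(None)       # the pattern in question: return code discarded

    rc = check_repl_init(None)  # capturing and logging the rc surfaces the failure
    if rc != 0:
        logger.error("replication init failed: rc=%d", rc)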

Comment 14 Adam Williamson 2020-08-24 22:46:49 UTC
Dunno if this helps, but one thing I like about Python is that you can just hack it up in place. No need to rebuild anything. Just edit the file and change or add whatever additional logging you like. It's fairly convenient...

Comment 15 Rob Crittenden 2020-08-25 15:23:31 UTC
Not much happens after the "Starting replication" message.

A mod is sent to the remote 389-ds to start replication, then the local 389-ds waits for it to finish by querying the mod'd entry for status updates. Can you tell if the mod was successful, or is it a place where an LDAP error is missed? The connection is made earlier, so I'd expect that to uncover any canonicalization issues before it even got to the point of writes.
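
For reference, the agreement entry being polled can also be inspected by hand, with something along these lines (hostname is a placeholder):

    # query the local 389-ds for the agreement status being polled
    ldapsearch -N -Y GSSAPI -H ldap://ipa.example.test \
        -b 'cn=mapping tree,cn=config' \
        '(objectclass=nsds5replicationagreement)' \
        nsds5replicaLastUpdateStatus nsds5replicaUpdateInProgress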

I'm not sure which error codes you are referring to. I believe that in the case of an exception either the code == the exception or we print/log the error value.

And yes, there is still some inconsistency between what is printed and what is logged; it's unfortunate, but hasn't been a priority to address.

For additional logging on the 389-ds side I'd refer you to https://directory.fedoraproject.org/docs/389ds/FAQ/faq.html#troubleshooting

Replication logging (8192) on both sides might be a good place to start. Interrupting the installer using pdb can be an effective way to inject a value like this during a live install so you can more finely target the logging. There are some log levels that will drag down 389 performance to the point where things start timing out (replication logging isn't one of them).
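
Concretely, something like this should do it (instance name is a placeholder):

    # enable replication logging (8192) on both primary and replica
    dsconf slapd-YOUR_INSTANCE config replace nsslapd-errorlog-level=8192
    # restore the default level (16384) once done
    dsconf slapd-YOUR_INSTANCE config replace nsslapd-errorlog-level=16384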

Comment 16 Robbie Harwood 2020-08-26 21:20:39 UTC
Yeah, this is the problem I'm trying to articulate: it gets printed, but doesn't show up in the log:

    [ldap://ipa.rharwood.test:389] reports: Update failed! Status: [Error (-2) - LDAP error: Local error - no response received]

    [root@replica ~]# grep -i response /var/log/ipareplica-install.log
    [root@replica ~]# 

Cranking the logging on dirsrv (to 136) on the primary gets us:

    [26/Aug/2020:20:07:39.073773891 +0000] - DEBUG - slapd_ldap_sasl_interactive_bind - Error: could not perform interactive bind for id [] mech [GSSAPI]: LDAP error -2 (Local error) (SASL(-1): generic failure: GSSAPI Error: Unspecified GSS failure.  Minor code may provide more information (Matching credential not found)) errno 2 (No such file or directory)
    [26/Aug/2020:20:07:39.078513493 +0000] - DEBUG - slapi_ldap_bind - Error: could not perform interactive bind for id [] authentication mechanism [GSSAPI]: error -2 (Local error)

but bizarrely there's no information about *who* this user was trying to authenticate as or anything like that.

Unfortunately my IPA server and replica are now both cooked and I can't enroll clients.  Also this happens:

[root@ipa tmp]# ipa host-del replica.rharwood.test
ipa: ERROR: Configured time limit exceeded
[root@ipa tmp]# 

I'll rebuild these VMs tomorrow, but there is probably too much chaff here for me to sift through.

Comment 17 Robbie Harwood 2020-08-28 18:33:18 UTC
Yeah, I'm going to refer to the experts on this one.

389ds folks, we're seeing a connection misbehavior during IPA replica install.  (To reproduce, make sure all machines are running krb5-1.18.2-19 and not something newer, then follow https://www.freeipa.org/page/V4/Replica_Setup . The ipa-replica-install step will fail during part of replication.)

While I doubt the eventual fix will be in 389ds, I've been unable to make headway on debugging.  There seem to be a /ton/ of LDAP connections even in an "at rest" IPA, and there isn't anything that I can see like httpd's access log to show what users presented credentials for what.  (See #c16 - that's as much info as I could figure out how to get.)

While the current krb5 builds don't cause this problem, this is because I've reverted the change.  However, the change is still present upstream, so we still need to do something about it - which means it needs to be debugged, and it's only been a problem here.

Comment 18 mreynolds 2020-08-28 18:51:53 UTC
Are the systems still up and running?

Are the DS logs available from both master and replica?  

A stack trace from master when the replica can not connect would be useful.

In this case it would also be interesting to enable "Connection" logging in DS, which might provide more clues as to what's going on.

    # dsconf slapd-YOUR_INSTANCE config replace nsslapd-errorlog-level=8

And, can a client (ldapsearch) reach the master if run from the replica?  Meaning, is the master hung, or is the replication session hung/misbehaving?

Comment 19 Robbie Harwood 2020-08-31 21:26:37 UTC
Stability is not great on my local systems, so no.  However, reproducibility has been consistent - just follow https://www.freeipa.org/page/V4/Replica_Setup with two VMs.  I'll try to get the information you're after tomorrow in case you run into trouble.

Comment 20 Robbie Harwood 2020-09-01 16:54:58 UTC
Created attachment 1713344 [details]
dirsrv on primary with connection logging

Attached logs from dirsrv with connection logging enabled.  (I'll attach logs from the replica in a moment.)

I don't know how to get you a stack trace here since IPA is using the python bindings.

An ldapsearch can reach the primary without issue (I tested using `ldapsearch -N -Y GSSAPI -H ldap://ipa.example.test`).

Comment 21 Robbie Harwood 2020-09-01 16:56:01 UTC
Created attachment 1713345 [details]
dirsrv logs from replica

Comment 22 mreynolds 2020-09-03 13:16:59 UTC
Sorry, there is nothing in the DS logs that provides any more information.  All I can tell is that it's failing when calling OpenLDAP's function ldap_sasl_interactive_bind_s().  There is nothing more we can do in DS to get more information out of this bind failure.  Maybe Matus, who works on OpenLDAP, might have more insight?  CCing Matus...

Comment 23 Robbie Harwood 2020-09-03 14:20:23 UTC
Thanks for taking a look!

Comment 24 Matus Honek 2020-09-29 14:29:43 UTC
I believe we need to add more logging to 389-ds: add an option that allows logging the LDAP_DEBUG_* messages from libldap in, e.g., the error log; this should be fairly easy. And given that the issue might lie in the actual credentials, probably also enable 389-ds to log the SASL structs, but that sounds like more work. I am afraid I cannot suggest anything better, sorry. HTH
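
For what it's worth, libldap's client-side debug output is already controllable via ldap_set_option(); a rough sketch of the kind of plumbing I mean (the level value is illustrative):

    #include <ldap.h>

    /* Route libldap's LDAP_DEBUG_* output (stderr by default) at maximum
     * verbosity; a NULL session handle sets the global default. */
    static int
    enable_libldap_debug(void)
    {
        int level = -1; /* LDAP_DEBUG_ANY */
        return ldap_set_option(NULL, LDAP_OPT_DEBUG_LEVEL, &level);
    }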

Comment 25 Robbie Harwood 2020-12-02 20:51:15 UTC
(Moving to Simon since they provided me some 389ds test builds.)

With the test 389ds from https://koji.fedoraproject.org/koji/taskinfo?taskID=56533162 and krb5 from https://copr.fedorainfracloud.org/coprs/rharwood/krb5-1.19/ , I'm not really seeing more information.  (I set log level 8 based on #c18).  Will upload logs in a moment.

Comment 26 Robbie Harwood 2020-12-02 20:52:13 UTC
Created attachment 1735786 [details]
test 389 + krb5, level 8 dirsrv logs

Comment 27 Robbie Harwood 2020-12-02 20:53:12 UTC
Created attachment 1735787 [details]
replica installation log (389 test, krb5 1.19) for collation

Comment 28 Simon Pichugin 2020-12-03 08:24:29 UTC
(In reply to Robbie Harwood from comment #26)
> Created attachment 1735786 [details]
> test 389 + krb5, level 8 dirsrv logs

Try adding the default level too (16384), so the combined value will be 16392.

Also, could you please provide the exact reproducing steps for the deployment? I'll try to check it locally too.

Comment 29 Robbie Harwood 2020-12-03 16:57:00 UTC
Reproduction instructions (mostly lifted from https://www.freeipa.org/page/V4/Replica_Setup ):

I'm using two VMs, a primary (called ipa.rharwood.test) and a replica (replica.rharwood.test).  They're both fc32 because that's what your packages were built for, and I'm also using packages from my copr.  Before starting, you'll need to make sure the hostnames and such are right and the machines can resolve each other properly.

# on both
dnf -y copr enable rharwood/krb5-1.19
dnf -y install /usr/bin/koji
koji download-task 56533162
dnf -y install ./*.rpm
dnf -y update
dnf -y install freeipa-server{,-dns}

# on primary - install IPA server (I'm using 'secretes' as the password for everything)
ipa-server-install -r RHARWOOD.TEST -p secretes -a secretes --setup-dns -N -U --auto-forwarders

# on replica - install IPA client
ipa-client-install -p admin -w secretes --domain=rharwood.test --server=ipa.rharwood.test -N
# type yes at the prompts

# on primary
printf "secretes\n" | kinit admin
ipa hostgroup-add-member ipaservers --hosts replica.rharwood.test

# snapshot VMs here - replica uninstallation doesn't seem to work

# on replica - set up any debugging (e.g., 389ds on primary) etc. before this step
printf "secretes\n" | kinit admin
ipa-replica-install # may have to type yes at the prompt

Comment 30 Robbie Harwood 2020-12-03 17:26:19 UTC
Actually, digging more into the replica uninstall failure, that ends up in lib389 as well.

[root@ipa ~]# KRB5_TRACE=/dev/stderr ipa-replica-manage list -v ipa.rharwood.test
[1655] 1607016072.730218: ccselect module realm chose cache KCM:0 with client principal admin for server principal ldap/ipa.rharwood.test
[1655] 1607016072.730219: Getting credentials admin -> ldap/ipa.rharwood.test@ using ccache KCM:0
[1655] 1607016072.730220: Retrieving admin -> ldap/ipa.rharwood.test@ from KCM:0 with result: 0/Success
[1655] 1607016072.730222: Creating authenticator for admin -> ldap/ipa.rharwood.test@, seqnum 789394849, subkey aes256-cts/4FC5, session key aes256-cts/F867
[1655] 1607016072.730227: Read AP-REP, time 1607016072.730223, subkey aes256-cts/5B00, seqnum 100356027
[1655] 1607016074.029311: ccselect module realm chose cache KCM:0 with client principal admin for server principal ldap/ipa.rharwood.test
[1655] 1607016074.029312: Getting credentials admin -> ldap/ipa.rharwood.test@ using ccache KCM:0
[1655] 1607016074.029313: Retrieving admin -> ldap/ipa.rharwood.test@ from KCM:0 with result: 0/Success
[1655] 1607016074.029315: Creating authenticator for admin -> ldap/ipa.rharwood.test@, seqnum 721383457, subkey aes256-cts/CD91, session key aes256-cts/F867
[1655] 1607016074.029320: Read AP-REP, time 1607016074.29316, subkey aes256-cts/A190, seqnum 314557370
[1655] 1607016074.029327: ccselect module realm chose cache KCM:0 with client principal admin for server principal ldap/ipa.rharwood.test
[1655] 1607016074.029328: Getting credentials admin -> ldap/ipa.rharwood.test@ using ccache KCM:0
[1655] 1607016074.029329: Retrieving admin -> ldap/ipa.rharwood.test@ from KCM:0 with result: 0/Success
[1655] 1607016074.029331: Creating authenticator for admin -> ldap/ipa.rharwood.test@, seqnum 77337085, subkey aes256-cts/C01D, session key aes256-cts/F867
[1655] 1607016074.029336: Read AP-REP, time 1607016074.29332, subkey aes256-cts/321F, seqnum 96929796
replica.rharwood.test: replica
  last update status: Error (-1) Problem connecting to replica - LDAP error: Can't contact LDAP server (connection error)
  last update ended: 1970-01-01 00:00:00+00:00
[root@ipa ~]# 

(The trace ends up calling sasl_interactive_bind_s - there are no Kerberos errors in the spew.)

Same deal there - plenty of SASL stuff logged, but I can't see why the connection failed.

Comment 31 Robbie Harwood 2020-12-03 17:52:14 UTC
In the logs, there's a lot of SASL connection-layer stuff, and the only things that look like failures are things like:

[03/Dec/2020:17:42:16.942744960 +0000] - DEBUG - slapd_ldap_sasl_interactive_bind - Error: could not perform interactive bind for id [] mech [GSSAPI]: LDAP error -2 (Local error) (SASL(-1): generic failure: GSSAPI Error: Unspecified GSS failure.  Minor code may provide more information (Matching credential not found)) errno 2 (No such file or directory)
[03/Dec/2020:17:42:16.944137963 +0000] - DEBUG - slapi_ldap_bind - Error: could not perform interactive bind for id [] authentication mechanism [GSSAPI]: error -2 (Local error)

(The rest is read/write things.)

Comment 32 Simon Pichugin 2020-12-07 16:01:13 UTC
Created attachment 1737352 [details]
ipa.rharwood.test.dirsrv.errors.log

Comment 33 Simon Pichugin 2020-12-07 16:08:22 UTC
(In reply to Robbie Harwood from comment #29)
> Reproduction instructions (mostly lifted from
> https://www.freeipa.org/page/V4/Replica_Setup ):
> 
> I'm using two VMs, a primary (called ipa.rharwood.test) and a replica
> (replica.rharwood.test).  They're both fc32 because that's what your
> packages built for, and I'm also using packages from my copr.  Before
> starting, you'll need to make sure hostnames and such are right and can
> resolve each other properly.

Thanks for the great guide and repos!
I've installed and reproduced the issue without any problems.

So, with the builds you and I provided, I see the logs I've attached.
This piece of the log repeats, so I've included only that.

Please, tell me if you see anything helpful in the logs.
If not, we need to find another way to get to the bottom of the issue... I'll keep thinking.

P.S. Please note that these logs appear with the default dirsrv log level; you don't need to set anything while doing the steps you've provided.

Comment 34 Simon Pichugin 2020-12-07 16:47:27 UTC
Could you also please provide the link to your commit that was reverted? The one with dns_canonicalize_hostname, I guess.
I can probably track it down, but we need it here anyway. :)

Comment 35 Robbie Harwood 2020-12-08 20:00:31 UTC
> Could you also, please, provide the link for your commit that was reverted? The one with dns_canonicalize_hostname I guess.

Not sure which repo you're referring to, so:

- dist-git: https://src.fedoraproject.org/rpms/krb5/c/c59e4a1c673512e66b4f5cfe53a1c64f7dd6b635?branch=master (added in -19, removed in -20)
- upstream krb5: https://github.com/krb5/krb5/commit/3fcc365a6f049730b3f47168f7112c03997c5c0b

If it helps, I have a COPR with 1.19 (which contains this change) here: https://copr.fedorainfracloud.org/coprs/rharwood/krb5-1.19/ , while the RPMs in all Fedora releases don't have the change.

> Please, tell me if you see anything helpful in the logs.
> If not, we need to find another way to get to the bottom of the issue... I'll keep thinking.

The problem with the logs from my standpoint is that they don't log anything about who tried to connect and to whom.  All that gets logged is:

[07/Dec/2020:10:43:58.022596571 -0500] - ERR - Openldap Client - Log: ldap_sasl_interactive_bind: user selected: GSSAPI
[07/Dec/2020:10:43:58.024496154 -0500] - ERR - Openldap Client - Log: ldap_int_sasl_bind: GSSAPI

This tells me that SASL was attempted using GSSAPI, but not what the client was (or claimed to be), nor what the requested server was.  Since we suspect this is related to canonicalization changes, knowing both of those is strongly desirable.  Compare to an ssh log: it shows what user the connection tried to use, even if the connection failed.

It may be possible to get this information by setting KRB5_TRACE, but I don't understand the 389ds/freeipa interaction (and 389ds architecture) well enough to know how to set that in the process.

Comment 36 Simon Pichugin 2020-12-09 16:39:29 UTC
I think you can set KRB5_TRACE in the systemd service file and it'll print.

For that, I see that FreeIPA has a template for 389-ds server - /usr/share/ipa/ds-ipa-env.conf.template

Right after the 'dnf -y install freeipa-server{,-dns}' step, and before you run any installation steps (ipa-server-install or ipa-client-install), add one of these lines to /usr/share/ipa/ds-ipa-env.conf.template:

    Environment=KRB5_TRACE=/dev/stderr
    # or this (I am not sure if stderr will work properly)
    Environment=KRB5_TRACE=/tmp/foo

Then, proceed with the steps you've provided.
And after you get the failure, you can check the logs on both 'ipa' and 'replica' servers.
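
If useful, you can check after installation whether the unit actually picked the variable up, with something like (the unit name is a guess based on the realm):

    # verify the dirsrv unit picked up the variable (instance name is a guess)
    systemctl show -p Environment dirsrv@RHARWOOD-TEST.service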

Comment 37 Robbie Harwood 2020-12-10 23:34:57 UTC
We're talking about 389ds's ldaputil.c:set_krb5_creds() - which is called with username == "".

krb5_sname_to_principal(ctx, config_get_localhost(), "ldap", KRB5_NT_SRV_HST, &princ); /* Creates princ with realm = "" because we can't know the realm here. [1] */
krb5_unparse_name(ctx, princ, &princ_name); /* princ_name is now "ldap/ipa.rharwood.test@" */
krb5_get_init_creds_keytab(ctx, &creds, princ, kt, 0, NULL, NULL);
krb5_cc_resolve(ctx, "MEMORY:random", &cc);
krb5_cc_initialize(ctx, cc, princ); /* Problem - this initializes for a principal with realm == KRB5_REFERRAL_REALM, i.e., "" */
krb5_cc_store_cred(ctx, cc, &creds);

Then later, SASL hands off to GSSAPI, and since "RHARWOOD.TEST" != "", GSSAPI determines that we do not have the right credentials ("Matching credential not found").

In order of preference, my suggested ways to fix this are:

1. Remove all of this and set KRB5_CLIENT_KTNAME/KRB5_KTNAME.  (See https://web.mit.edu/kerberos/krb5-latest/doc/basic/keytab_def.html#default-client-keytab )
2. Use krb5_get_init_creds_opt_set_out_ccache() and let krb5 do the initialize and store - krb5_get_init_creds_keytab() already does the right thing (see the sketch after this list).  Note that Heimdal doesn't have this function, so this won't work there.  I don't know how much of an issue this is for 389ds.
3. Acquire the correct principal name (i.e., one that knows the realm) from cred.  This will lose FAST negotiation state (i.e., metadata).
4. Do nothing; instruct IPA to set $HACK_PRINCIPAL_NAME.
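
A rough sketch of what option 2 might look like (error handling elided; an illustration, not an actual 389-ds patch):

    #include <krb5.h>

    /* Sketch: krb5_get_init_creds_keytab() resolves the realm from the
     * keytab, and the out_ccache option makes libkrb5 perform the
     * krb5_cc_initialize()/krb5_cc_store_cred() steps itself with that
     * canonicalized principal, instead of the realm-less one. */
    static krb5_error_code
    acquire_ldap_creds(krb5_context ctx, const char *host, krb5_keytab kt,
                       krb5_ccache *cc_out)
    {
        krb5_principal princ = NULL;
        krb5_creds creds;
        krb5_ccache cc = NULL;
        krb5_get_init_creds_opt *opt = NULL;

        krb5_sname_to_principal(ctx, host, "ldap", KRB5_NT_SRV_HST, &princ);
        krb5_cc_resolve(ctx, "MEMORY:ds_tmp", &cc); /* name is illustrative */
        krb5_get_init_creds_opt_alloc(ctx, &opt);
        krb5_get_init_creds_opt_set_out_ccache(ctx, opt, cc);
        /* no manual krb5_cc_initialize()/krb5_cc_store_cred() needed */
        krb5_get_init_creds_keytab(ctx, &creds, princ, kt, 0, NULL, opt);

        krb5_get_init_creds_opt_free(ctx, opt);
        krb5_free_principal(ctx, princ);
        *cc_out = cc;
        return 0;
    }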

I assume the onus is on me to send a 389ds patch for this, but if not I'm happy to provide guidance.  However, I am done for today :)

1: When we retry later, we don't reconstruct the name and instead just pull it from the ccache - but because we've set it there, it's broken forever.

Comment 38 Robbie Harwood 2020-12-10 23:44:29 UTC
Ah, Heimdal doesn't support client keytabs in a released version (they do in their development branch).  How much does 389ds care about maintaining Heimdal support?

Comment 39 Fedora Program Management 2021-04-29 16:49:10 UTC
This message is a reminder that Fedora 32 is nearing its end of life.
Fedora will stop maintaining and issuing updates for Fedora 32 on 2021-05-25.
It is Fedora's policy to close all bug reports from releases that are no longer
maintained. At that time this bug will be closed as EOL if it remains open with a
Fedora 'version' of '32'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora 32 is end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora, you are encouraged to change the 'version' to a later Fedora 
version prior to this bug being closed, as described in the policy above.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 40 Harry Coin 2021-05-17 14:26:41 UTC
This still happens on fc34.  Very basic two-VM sandbox setup.  The master has CA, KRA, and multi-domain DNSSEC running (requires patches to pkcs11/libp11 or named crashes); the UI works well and all services are running.  On the replica, I installed all the freeipa-server / dns / trust packages with dnf.  Of possible interest: the same VM aiming to be a replica that fails below can enroll as a client, but upgrading it to a replica failed (possibly for other reasons).  I did an uninstall / revert to base packages and removed all references to the replica in the topology and host lists on the master.  Then this attempt to install, without the client having been installed first, using the admin user/password, fails as follows:

[root@registry2 log]# tail -f ipareplica-install.log 
2021-05-17T02:27:10Z DEBUG retrieving schema for SchemaCache url=ldap://registry1.1.quietfountain.com:389 conn=<ldap.ldapobject.SimpleLDAPObject object at 0x7f68992a36a0>
2021-05-17T02:27:10Z DEBUG Successfully updated nsDS5ReplicaId.
2021-05-17T02:27:10Z DEBUG Add or update replica config cn=replica,cn=dc\=1\,dc\=quietfountain\,dc\=com,cn=mapping tree,cn=config
2021-05-17T02:27:12Z DEBUG Added replica config cn=replica,cn=dc\=1\,dc\=quietfountain\,dc\=com,cn=mapping tree,cn=config
2021-05-17T02:27:13Z DEBUG Add or update replica config cn=replica,cn=dc\=1\,dc\=quietfountain\,dc\=com,cn=mapping tree,cn=config
2021-05-17T02:27:14Z DEBUG Update replica config cn=replica,cn=dc\=1\,dc\=quietfountain\,dc\=com,cn=mapping tree,cn=config
2021-05-17T02:27:20Z DEBUG Waiting up to 300 seconds for replication (ldap://registry1.1.quietfountain.com:389) cn=meToregistry2.1.quietfountain.com,cn=replica,cn=dc\=1\,dc\=quietfountain\,dc\=com,cn=mapping tree,cn=config (objectclass=*)
2021-05-17T02:27:20Z DEBUG Entry found [LDAPEntry(ipapython.dn.DN('cn=meToregistry2.1.quietfountain.com,cn=replica,cn=dc\=1\,dc\=quietfountain\,dc\=com,cn=mapping tree,cn=config'), {'objectClass': [b'nsds5replicationagreement', b'top'], 'cn': [b'meToregistry2.1.quietfountain.com'], 'nsDS5ReplicaHost': [b'registry2.1.quietfountain.com'], 'nsDS5ReplicaPort': [b'389'], 'nsds5replicaTimeout': [b'120'], 'nsDS5ReplicaRoot': [b'dc=1,dc=quietfountain,dc=com'], 'description': [b'me to registry2.1.quietfountain.com'], 'nsDS5ReplicatedAttributeList': [b'(objectclass=*) $ EXCLUDE memberof idnssoaserial entryusn krblastsuccessfulauth krblastfailedauth krbloginfailedcount'], 'nsDS5ReplicaTransportInfo': [b'LDAP'], 'nsDS5ReplicaBindMethod': [b'SASL/GSSAPI'], 'nsds5ReplicaStripAttrs': [b'modifiersName modifyTimestamp internalModifiersName internalModifyTimestamp'], 'nsDS5ReplicatedAttributeListTotal': [b'(objectclass=*) $ EXCLUDE entryusn krblastsuccessfulauth krblastfailedauth krbloginfailedcount'], 'nsds5replicareapactive': [b'0'], 'nsds5replicaLastUpdateStart': [b'19700101000000Z'], 'nsds5replicaLastUpdateEnd': [b'19700101000000Z'], 'nsds5replicaChangesSentSinceStartup': [b''], 'nsds5replicaLastUpdateStatus': [b'Error (0) Replica acquired successfully: Incremental update started'], 'nsds5replicaLastUpdateStatusJSON': [b'{"state": "green", "ldap_rc": "0", "ldap_rc_text": "Success", "repl_rc": "0", "repl_rc_text": "replica acquired", "date": "2021-05-17T02:27:18Z", "message": "Error (0) Replica acquired successfully: Incremental update started"}'], 'nsds5replicaUpdateInProgress': [b'FALSE'], 'nsds5replicaLastInitStart': [b'19700101000000Z'], 'nsds5replicaLastInitEnd': [b'19700101000000Z']})]
2021-05-17T02:27:20Z DEBUG Waiting up to 300 seconds for replication (ldapi://%2Frun%2Fslapd-1-QUIETFOUNTAIN-COM.socket) cn=meToregistry1.1.quietfountain.com,cn=replica,cn=dc\=1\,dc\=quietfountain\,dc\=com,cn=mapping tree,cn=config (objectclass=*)
2021-05-17T02:27:21Z DEBUG Entry found [LDAPEntry(ipapython.dn.DN('cn=meToregistry1.1.quietfountain.com,cn=replica,cn=dc\=1\,dc\=quietfountain\,dc\=com,cn=mapping tree,cn=config'), {'objectClass': [b'nsds5replicationagreement', b'top'], 'cn': [b'meToregistry1.1.quietfountain.com'], 'nsDS5ReplicaHost': [b'registry1.1.quietfountain.com'], 'nsDS5ReplicaPort': [b'389'], 'nsds5replicaTimeout': [b'120'], 'nsDS5ReplicaRoot': [b'dc=1,dc=quietfountain,dc=com'], 'description': [b'me to registry1.1.quietfountain.com'], 'nsDS5ReplicatedAttributeList': [b'(objectclass=*) $ EXCLUDE memberof idnssoaserial entryusn krblastsuccessfulauth krblastfailedauth krbloginfailedcount'], 'nsDS5ReplicaTransportInfo': [b'LDAP'], 'nsDS5ReplicaBindMethod': [b'SASL/GSSAPI'], 'nsds5ReplicaStripAttrs': [b'modifiersName modifyTimestamp internalModifiersName internalModifyTimestamp'], 'nsDS5ReplicatedAttributeListTotal': [b'(objectclass=*) $ EXCLUDE entryusn krblastsuccessfulauth krblastfailedauth krbloginfailedcount'], 'nsds5replicareapactive': [b'0'], 'nsds5replicaLastUpdateStart': [b'19700101000000Z'], 'nsds5replicaLastUpdateEnd': [b'19700101000000Z'], 'nsds5replicaChangesSentSinceStartup': [b''], 'nsds5replicaLastUpdateStatus': [b'Error (0) No replication sessions started since server startup'], 'nsds5replicaLastUpdateStatusJSON': [b'{"state": "green", "ldap_rc": "0", "ldap_rc_text": "success", "repl_rc": "0", "repl_rc_text": "replica acquired", "date": "2021-05-17T02:27:21Z", "message": "Error (0) No replication sessions started since server startup"}'], 'nsds5replicaUpdateInProgress': [b'FALSE'], 'nsds5replicaLastInitStart': [b'19700101000000Z'], 'nsds5replicaLastInitEnd': [b'19700101000000Z']})]
2021-05-17T02:28:19Z DEBUG Traceback (most recent call last):
  File "/usr/lib/python3.9/site-packages/ipaserver/install/service.py", line 635, in start_creation
    run_step(full_msg, method)
  File "/usr/lib/python3.9/site-packages/ipaserver/install/service.py", line 621, in run_step
    method()
  File "/usr/lib/python3.9/site-packages/ipaserver/install/dsinstance.py", line 425, in __setup_replica
    repl.setup_promote_replication(
  File "/usr/lib/python3.9/site-packages/ipaserver/install/replication.py", line 1922, in setup_promote_replication
    raise RuntimeError("Failed to start replication")
RuntimeError: Failed to start replication

Frustrating not to be able to do 'the basics' with freeipa.

Comment 41 Harry Coin 2021-05-17 14:52:34 UTC
P.S. I posted the above here, as https://bugzilla.redhat.com/show_bug.cgi?id=1869009 was marked as a duplicate of this bug, and that's where the similar failure log was posted.

Comment 42 Adam Williamson 2021-05-17 15:55:39 UTC
well, openQA hasn't hit this since, due to the change mentioned in https://bugzilla.redhat.com/show_bug.cgi?id=1868482#c9 . Did you change anything regarding that in your local setup?

Comment 43 Harry Coin 2021-05-17 15:59:47 UTC
(In reply to Adam Williamson from comment #42)
> well, openQA hasn't hit this since, due to the change mentioned in
> https://bugzilla.redhat.com/show_bug.cgi?id=1868482#c9 . Did you change
> anything regarding that in your local setup?

AFAIK it's exactly and only what dnf does to add freeipa-server + dns + adtrust to a generic fc34 workstation install.

Comment 44 Ben Cotton 2021-05-25 17:12:20 UTC
Fedora 32 changed to end-of-life (EOL) status on 2021-05-25. Fedora 32 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.

Comment 45 Adam Williamson 2021-05-25 18:49:49 UTC
Harry: the 'dupe' bug was marked as fixed by the same update as this. So I suspect what you're seeing is different at least somehow, and this bug just got EOLed. At this point can you just post a new bug against F34 for your issue, with all relevant logs? Thanks!

Comment 46 Red Hat Bugzilla 2023-09-12 03:46:12 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 500 days.

