
Bug 1946074

Summary: boot failure when LV is a cryptoluks device used as sysroot
Product: [Fedora] Fedora
Component: dracut
Version: 34
Hardware: Unspecified
OS: Unspecified
Status: CLOSED ERRATA
Severity: unspecified
Priority: unspecified
Keywords: Reopened
Whiteboard: AcceptedBlocker
Fixed In Version: dracut-053-2.fc34
Last Closed: 2021-06-25 13:14:54 UTC
Type: Bug
Reporter: Chris Murphy <bugzilla>
Assignee: dracut-maint-list
QA Contact: Fedora Extras Quality Assurance <extras-qa>
CC: alexus_m, awilliam, bugzilla, dracut-maint-list, dsanzmor, gmarr, jan.public, jlayton, jonathan, mail, niki.guldbrand, robatino, scorreia, steeve.mccauley, tcamuso, updates, vashirov, vonbrand, zbyszek
Bug Blocks: 1829024
Attachments:
screenshot of failure
rdsosreport.txt

Description Chris Murphy 2021-04-04 03:59:50 UTC
Created attachment 1768992 [details]
screenshot of failure

Description of problem:

When an LVM LV used as sysroot is LUKS-encrypted (as opposed to a LUKS partition used as the LVM PV, which encrypts all LVs as a group; Anaconda supports both layouts), boot fails in the initramfs when the initramfs was created with dracut 053.


Version-Release number of selected component (if applicable):
dracut-053-1.fc34.x86_64

How reproducible:
Always, in this configuration.


Steps to Reproduce:
1. Use Custom installation and check only the 'Encrypt' checkbox for the '/' mountpoint, so that only this LV is encrypted. I used Fedora-Workstation-Live-x86_64-34-20210401.n.0.iso because it has dracut 053 in it but didn't get the bump to systemd-248; it still has rc4.
2. Complete the installation
3. Reboot

Actual results:

Fails (see screenshot)

Expected results:

Should boot

Additional info:

Just to be extra clear, there are two layouts possible in Anaconda:
A. LVM->LUKS->FS
B. LUKS->LVM->FS

The (A) layout is what this bug affects. We can get here in Custom partitioning by creating an LVM layout as in Figure 20 [https://docs.fedoraproject.org/en-US/fedora/f33/install-guide/install/Installing_Using_Anaconda/#sect-installation-gui-manual-partitioning-lvm], selecting the '/' mountpoint, and checking the 'Encrypt' box to the right of the 'Device Type: LVM' popup.

The (B) layout is what we used to get with Fedora 32 Automatic + "Encrypt my data" checked, and what we get in Fedora 34 Custom if you check the new "Encrypt my data" box. The partition is LUKS-encrypted, the dm-crypt device on top of it is made the PV for the VG, and all LVs are made from that one LUKS device. That layout still works.
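
To tell which of the two layouts an installed system uses, one option is to look at the order of device types underneath the root file system. The following is only a sketch and not something from the original debugging: `classify_layout` is a hypothetical helper that reads device types one per line, starting at the device that holds the file system and ending at the disk, and prints `A`, `B` or `other`.

```shell
# Hypothetical helper: classify the storage stack under a file system.
# Input (stdin): device types, one per line, ordered from the device that
# holds the file system down to the physical disk (e.g. crypt, lvm, part, disk).
classify_layout() {
    pos=0
    crypt_pos=0
    lvm_pos=0
    while read -r type; do
        pos=$((pos + 1))
        case "$type" in
            crypt) if [ "$crypt_pos" -eq 0 ]; then crypt_pos=$pos; fi ;;
            lvm)   if [ "$lvm_pos" -eq 0 ]; then lvm_pos=$pos; fi ;;
        esac
    done
    if [ "$crypt_pos" -eq 0 ] || [ "$lvm_pos" -eq 0 ]; then
        echo "other"    # not a combined LVM + LUKS stack
    elif [ "$crypt_pos" -lt "$lvm_pos" ]; then
        echo "A"        # LUKS sits on top of an LV: LVM->LUKS->FS
    else
        echo "B"        # LVs sit on top of LUKS: LUKS->LVM->FS
    fi
}
```

On a running system the input could come from something like `lsblk -n -s -o TYPE <root device>`, assuming that prints one type per line in top-to-bottom order; I have not checked that across util-linux versions, so look at the raw output first.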

See also bug 1945596, where quite a lot of debugging was done, including finding the (unexpected) fix, which I've tested.
https://github.com/dracutdevs/dracut/commit/ba4bcf5f4f11ad624c647ddf4f566997186135e7

Likely also related, but these are failed updates rather than failed clean installs, so I'm not going to flag them as dups: bug 1945901, bug 1945950, bug 1945530.

Comment 1 Chris Murphy 2021-04-04 04:00:24 UTC
Created attachment 1768993 [details]
rdsosreport.txt

Comment 2 Fedora Blocker Bugs Application 2021-04-04 04:02:43 UTC
Proposed as a Blocker for 34-final by Fedora user chrismurphy using the blocker tracking app because:

 Final: The installer must be able to create and install to any workable partition layout using any file system and/or container format combination offered in a default installer configuration. 

Basic: All release-blocking images must boot in their supported configurations.

Comment 3 Geoffrey Marr 2021-04-05 23:43:07 UTC
Discussed during the 2021-04-05 blocker review meeting: [0]

The decision to classify this bug as an "AcceptedBlocker (Final)" was made as it violates the following criterion:

"The installer must be able to create and install to any workable partition layout using any file system and/or container format combination offered in a default installer configuration"

[0] https://meetbot.fedoraproject.org/fedora-blocker-review/2021-04-05/f34-blocker-review.2021-04-05-16.02.txt

Comment 4 Chris Murphy 2021-04-07 05:32:23 UTC
Small detail: I cited the wrong Basic criterion in comment 2. The ISO image boots OK.

The problem is the resulting installation doesn't boot. That's in the section Post-install requirements -> Expected installed system boot behavior, which has three bullets all of which require boot to work.  https://fedoraproject.org/wiki/Basic_Release_Criteria#Expected_installed_system_boot_behavior

Comment 5 Adam Williamson 2021-04-07 20:54:55 UTC
*** Bug 1945596 has been marked as a duplicate of this bug. ***

Comment 6 Adam Williamson 2021-04-07 20:56:07 UTC
According to the dupe, https://github.com/dracutdevs/dracut/commit/ba4bcf5f4f11ad624c647ddf4f566997186135e7 resolves this. So marking as POST.

Comment 7 Steeve McCauley 2021-04-07 21:52:43 UTC
The update to /usr/lib/dracut/modules.d/35network-manager/nm-run.service also fixed this for me.

I originally just tried reinstalling the kernel, but it still wouldn't boot. I had to rebuild the initramfs manually:

$ sudo dracut --force --kver 5.11.11-300.fc34.x86_64

after which the fc34 kernel booted.

Comment 8 Fedora Update System 2021-04-09 00:46:33 UTC
FEDORA-2021-50707f8501 has been submitted as an update to Fedora 34. https://bodhi.fedoraproject.org/updates/FEDORA-2021-50707f8501

Comment 9 Fedora Update System 2021-04-09 13:34:24 UTC
FEDORA-2021-50707f8501 has been pushed to the Fedora 34 testing repository.
Soon you'll be able to install the update with the following command:
`sudo dnf upgrade --enablerepo=updates-testing --advisory=FEDORA-2021-50707f8501`
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2021-50707f8501

See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.

Comment 10 Justin M. Forbes 2021-04-09 17:01:20 UTC
*** Bug 1947952 has been marked as a duplicate of this bug. ***

Comment 11 Horst H. von Brand 2021-04-09 19:11:38 UTC
Running the cited `dnf upgrade` command says there are no packages to update.

As far as I see, this is a new `dracut`, i.e., a tool to (re)create the `initrd` image. Isn't a further step to update said image required as part of the fix?

Comment 12 Horst H. von Brand 2021-04-09 21:56:03 UTC
Updated `dracut` as directed, rebuilt `initramfs` for the latest Fedora 34 kernel with:

```
dracut --force /boot/initramfs-5.11.12-300.fc34.x86_64.img 5.11.12-300.fc34.x86_64
```

The result now boots fine.
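
For systems with more than one installed kernel, the same rebuild presumably has to be repeated per kernel. A minimal sketch along the lines of the command above: `print_rebuild_cmds` is a hypothetical helper that only prints the `dracut` invocations for every kernel version directory it finds (by default under `/lib/modules`, which is an assumption about where the kernels live), so they can be reviewed before being run as root.

```shell
# Hypothetical helper: print (not run) a dracut rebuild command for each
# kernel version directory found under the given modules directory.
print_rebuild_cmds() {
    moddir=${1:-/lib/modules}
    for dir in "$moddir"/*/; do
        [ -d "$dir" ] || continue    # the glob matched nothing
        kver=$(basename "$dir")
        printf 'dracut --force /boot/initramfs-%s.img %s\n' "$kver" "$kver"
    done
}
```

Running `print_rebuild_cmds` with no argument lists one command per installed kernel; nothing is rebuilt until those lines are executed by hand.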

Comment 13 Adam Williamson 2021-04-09 23:04:09 UTC
The message is auto-generated, it's not smart enough to be adjusted for specific foibles like that. (I actually also thought a scriptlet should trigger a rebuild when updating dracut, though it seems not).

Comment 14 Niki Guldbrand 2021-04-11 06:59:41 UTC
*** Bug 1948063 has been marked as a duplicate of this bug. ***

Comment 15 Fedora Update System 2021-04-13 01:34:32 UTC
FEDORA-2021-50707f8501 has been pushed to the Fedora 34 stable repository.
If problem still persists, please make note of it in this bug report.

Comment 16 Tony Camuso 2021-06-22 17:19:21 UTC
I don't know if this is the same problem. 

If not, I'll open a new BZ.

I provisioned F34 just a couple days ago with kernel 5.12.11-300.fc34.x86_64

Today, I added the SSL3 repo and ran a dnf update. The updated kernel is listed by grub as 5.13.0-0.rc3.25.ssl3.x86_64.

However, when I rebooted, I got this little prize.

[   21.248359] sd 1:1:0:0: [sda] Write Protect is off
[   21.248361] fbcon: mgag200drmfb (fb0) is primary device
[   21.248412] sd 1:1:0:0: [sda] Write cache: disabled, read cache: disabled, doesn't support DPO or FUA
[   21.248420] sd 1:1:0:0: [sda] Optimal transfer size 262144 bytes
[   21.332426]  sda: sda1 sda2 sda3
[   21.333278] sd 1:1:0:0: [sda] Attached SCSI disk
[   21.381948] random: fast init done
[   21.382719] Console: switching to colour frame buffer device 128x48
[   21.519005] sr 6:0:0:0: [sr0] scsi3-mmc drive: 24x/24x writer dvd-ram cd/rw xa/form2 cdda tray
[   21.541641] mgag200 0000:01:00.1: [drm] fb0: mgag200drmfb frame buffer device
[   21.556853] cdrom: Uniform CD-ROM driver Revision: 3.20
[  OK  ] Started Show Plymouth Boot Screen.
[  OK  ] Started Forward Password R…s to Plymouth Directory Watch.
[  OK  ] Reached target Paths.
[  OK  ] Reached target Basic System.
[   ***] A start job is running for /dev/dis…b236-3c6cca655d04 (25s / no limit)

Comment 17 Tony Camuso 2021-06-22 17:21:27 UTC
The original installed kernel with the original initramfs boots just fine.

Comment 18 Chris Murphy 2021-06-22 21:01:28 UTC
I don't see how it could be the same as this bug. What are the contents of the /boot/loader/entries directory? Is there a drop in file for that kernel? And can you post its contents?

Comment 19 Tony Camuso 2021-06-23 10:33:39 UTC
(In reply to Chris Murphy from comment #18)
> I don't see how it could be the same as this bug. What are the contents of
> the /boot/loader/entries directory? Is there a drop in file for that kernel?
> And can you post its contents?

Here are the contents of /boot ...

$ ls -alch /boot
total 126M
dr-xr-xr-x. 6 root root 4.0K Jun 22 12:00 .
dr-xr-xr-x. 1 root root  140 Jun 22 13:26 ..
-rw-r--r--. 1 root root  168 Jun 21 08:14 .vmlinuz-5.12.11-300.fc34.x86_64.hmac
-rw-r--r--. 1 root root  172 Jun 22 12:00 .vmlinuz-5.13.0-0.rc3.25.ssl3.x86_64.hmac
-rw-------. 1 root root 5.5M Jun 21 08:14 System.map-5.12.11-300.fc34.x86_64
-rw-------. 1 root root 4.9M Jun 22 12:00 System.map-5.13.0-0.rc3.25.ssl3.x86_64
-rw-r--r--. 1 root root 228K Jun 21 08:14 config-5.12.11-300.fc34.x86_64
-rw-r--r--. 1 root root 206K Jun 22 12:00 config-5.13.0-0.rc3.25.ssl3.x86_64
drwx------. 3 root root 4.0K Dec 31  1969 efi
drwx------. 2 root root 4.0K Jun 22 13:01 grub2
-rw-------. 1 root root  50M Jun 21 08:15 initramfs-0-rescue-f726897024a54920a7c1f31ae1dcc22b.img
-rw-------. 1 root root  18M Jun 21 08:17 initramfs-5.12.11-300.fc34.x86_64.img
-rw-------. 1 root root  18M Jun 22 12:00 initramfs-5.13.0-0.rc3.25.ssl3.x86_64.img
drwxr-xr-x. 3 root root 4.0K Jun 21 08:14 loader
drwx------. 2 root root  16K Jun 21 08:11 lost+found
-rwxr-xr-x. 1 root root  11M Jun 21 08:15 vmlinuz-0-rescue-f726897024a54920a7c1f31ae1dcc22b
-rwxr-xr-x. 1 root root  11M Jun 21 08:14 vmlinuz-5.12.11-300.fc34.x86_64
-rwxr-xr-x. 1 root root  11M Jun 22 12:00 vmlinuz-5.13.0-0.rc3.25.ssl3.x86_64

I don't know what you mean by, "drop-in" file. I can give you access to the system, if you need it.

Comment 20 Chris Murphy 2021-06-23 16:39:17 UTC
Please try again. List the contents of the /boot/loader/entries/ directory. Thanks.

Comment 21 Tony Camuso 2021-06-23 16:46:48 UTC
# ls -alch /boot/loader/entries/
total 20K
drwx------. 2 root root 4.0K Jun 22 12:00 .
drwxr-xr-x. 3 root root 4.0K Jun 21 08:14 ..
-rw-r--r--. 1 root root  415 Jun 21 08:17 f726897024a54920a7c1f31ae1dcc22b-0-rescue.conf
-rw-r--r--. 1 root root  343 Jun 21 08:17 f726897024a54920a7c1f31ae1dcc22b-5.12.11-300.fc34.x86_64.conf
-rw-r--r--. 1 root root  358 Jun 22 12:00 f726897024a54920a7c1f31ae1dcc22b-5.13.0-0.rc3.25.ssl3.x86_64.conf


# cat /boot/loader/entries/*
title Fedora (0-rescue-f726897024a54920a7c1f31ae1dcc22b) 34 (Thirty Four)
version 0-rescue-f726897024a54920a7c1f31ae1dcc22b
linux /vmlinuz-0-rescue-f726897024a54920a7c1f31ae1dcc22b
initrd /initramfs-0-rescue-f726897024a54920a7c1f31ae1dcc22b.img
options root=UUID=28e9e697-9fa3-4746-b236-3c6cca655d04 ro rootflags=subvol=root console=ttyS1,115200n81 
grub_users $grub_users
grub_arg --unrestricted
grub_class kernel
title Fedora (5.12.11-300.fc34.x86_64) 34 (Thirty Four)
version 5.12.11-300.fc34.x86_64
linux /vmlinuz-5.12.11-300.fc34.x86_64
initrd /initramfs-5.12.11-300.fc34.x86_64.img
options root=UUID=28e9e697-9fa3-4746-b236-3c6cca655d04 ro rootflags=subvol=root console=ttyS1,115200n81 
grub_users $grub_users
grub_arg --unrestricted
grub_class kernel
title Fedora (5.13.0-0.rc3.25.ssl3.x86_64) 34 (Thirty Four)
version 5.13.0-0.rc3.25.ssl3.x86_64
linux /vmlinuz-5.13.0-0.rc3.25.ssl3.x86_64
initrd /initramfs-5.13.0-0.rc3.25.ssl3.x86_64.img
options root=UUID=28e9e697-9fa3-4746-b236-3c6cca655d04 ro rootflags=subvol=root console=ttyS1,115200n81
grub_users $grub_users
grub_arg --unrestricted
grub_class kernel

Comment 22 Chris Murphy 2021-06-23 21:34:17 UTC
OK I don't see a problem with the BLS dropin snippet. It has the proper rootflags option, same as the working boot entries. But this message at the hanging startup:

>[   ***] A start job is running for /dev/dis…b236-3c6cca655d04 (25s / no limit)

matches the root=UUID found in each BLS snippet, which suggests that this file system is not being found, thus not mounted. Please post the output from:

sudo grep -i btrfs /boot/config-5.13.0-0.rc3.25.ssl3.x86_64
sudo lsinitrd -i btrfs /boot/initramfs-5.13.0-0.rc3.25.ssl3.x86_64.img

I suspect that there isn't a btrfs driver enabled in this kernel, so let's see... 

If it's y, or if it's m and the module is in the initrd, then I'll need to see an rdsosreport or equivalent. You can get one by booting the bad entry with this boot parameter added: rd.timeout=60. Once you're stuck in a dracut shell, you can do something like:
blkid    ## find the ext4 boot volume
mount /dev/sdXN /sysroot    ## mount that volume; sdXN is a placeholder for whatever blkid shows
journalctl -b > /sysroot/journal.log
umount /sysroot
reboot

Then boot the good entry and post that /boot/journal.log. I don't need any of that if btrfs is not being built in this ssl3 kernel, though; I'm not at all familiar with it.
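
The comparison between the device in the hanging start job and the root= argument of each boot entry can also be scripted. A minimal sketch, assuming each BLS entry keeps its kernel arguments on a single line beginning with `options`; `bls_root` is a hypothetical helper name.

```shell
# Hypothetical helper: print the root= argument from one BLS entry file.
bls_root() {
    # Take the "options" line, split it into one word per line,
    # and keep the first word that starts with root=.
    sed -n 's/^options[[:space:]]*//p' "$1" |
        tr -s '[:space:]' '\n' |
        grep '^root=' |
        head -n 1
}
```

For example, `for f in /boot/loader/entries/*.conf; do echo "$f: $(bls_root "$f")"; done` run as root prints one root= value per entry, which can then be compared with the tail of the UUID in the start job message.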

Comment 23 Tony Camuso 2021-06-24 12:44:31 UTC
(In reply to Chris Murphy from comment #22)

First, let me say, many thanks for the replies and the help.
 
> sudo grep -i btrfs /boot/config-5.13.0-0.rc3.25.ssl3.x86_64

# grep -i btrfs /boot/config-5.13.0-0.rc3.25.ssl3.x86_64
# CONFIG_BTRFS_FS is not set

> sudo lsinitrd -i btrfs /boot/initramfs-5.13.0-0.rc3.25.ssl3.x86_64.img

# lsinitrd -i btrfs /boot/initramfs-5.13.0-0.rc3.25.ssl3.x86_64.img
getopt: invalid option -- 'i'

So I tried this instead ...

# lsinitrd  /boot/initramfs-5.13.0-0.rc3.25.ssl3.x86_64.img | grep btrfs
btrfs
-rw-r--r--   1 root     root           20 Jun 10 11:53 etc/cmdline.d/00-btrfs.conf
-rw-r--r--   1 root     root          387 May 13 09:42 usr/lib/udev/rules.d/64-btrfs-dm.rules
-rw-r--r--   1 root     root          616 May 15 13:10 usr/lib/udev/rules.d/64-btrfs.rules
-rwxr-xr-x   1 root     root      1047216 May 13 19:39 usr/sbin/btrfs
lrwxrwxrwx   1 root     root            5 Jun 10 11:53 usr/sbin/btrfsck -> btrfs
-rwxr-xr-x   1 root     root         1189 May 13 09:42 usr/sbin/fsck.btrfs


> I suspect that there isn't a btrfs driver enabled in this kernel, so let's
> see... 
> 
> If it's y, or if it's m and the mod is in the initrd, then I'll just need to

It's neither y nor m, but the mod is in the initrd. So, building a kernel and
expecting to boot it would be folly without first setting CONFIG_BTRFS_FS in
the .config.

But that shouldn't affect installing a kernel.

> see an rdsosreport or equivalent which you can get by booting the bad entry
> with this boot parameter added:  rd.timeout=60 And while you'll be stuck in
> a dracut shell you can do something like:
> blkid    ##find the ext4 boot volume
> mount    ##mount it to /sysroot
> journalctl -b > /sysroot/journal.log
> umount /sysroot
> reboot
> 
> Now you can boot the good one, and post that /boot/journal.log but I don't
> need any of that if btrfs is not being built in this ssl3 kernel; I'm not at
> all familiar with it.

I will take these steps today, if I can get another system to offload my work.

Comment 24 Chris Murphy 2021-06-25 02:09:36 UTC
># lsinitrd -i btrfs /boot/initramfs-5.13.0-0.rc3.25.ssl3.x86_64.img
>getopt: invalid option -- 'i'

Ooops, should have been 'lsinitrd /path | grep -i btrfs'

But in any case, no btrfs module is in there and that's the problem.
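
To separate the kernel module from the user space tools in a listing like that, the grep can be narrowed to module file names. A minimal sketch, assuming the module would show up as `btrfs.ko` with an optional compression suffix; `has_btrfs_module` is a hypothetical helper that reads an `lsinitrd` listing on stdin.

```shell
# Hypothetical helper: succeed only if the listing on stdin contains a
# btrfs kernel module file (btrfs.ko, optionally compressed).
has_btrfs_module() {
    grep -Eq '/btrfs\.ko(\.(xz|gz|zst))?$'
}
```

Usage would be along the lines of `lsinitrd /boot/initramfs-5.13.0-0.rc3.25.ssl3.x86_64.img | has_btrfs_module && echo "module present"`. A missing module is only conclusive together with the kernel config, because a built-in driver does not appear as a module file either.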

>It's neither y nor m, but the mod is in the initrd. 

Only user space tools are in the initrd. CONFIG_BTRFS_FS must be y or m. That's the problem with this kernel, it flat out does not support btrfs. Again, I'm not familiar with this ssl kernel but whoever is building it needs to have a bug filed against their spec file, such that CONFIG_BTRFS_FS is either y or m. For many years Fedora only had it configured as m, and it worked fine as sysroot. Since btrfs became the default, it's a built-in driver.
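
The y/m/unset distinction can be read straight from the config file in /boot. A minimal sketch; `btrfs_config_state` is a hypothetical helper, and it treats anything other than an explicit `=y` or `=m` line (including the "is not set" comment) as unset.

```shell
# Hypothetical helper: report how a kernel config file sets CONFIG_BTRFS_FS.
# Prints "builtin" (=y), "module" (=m) or "unset".
btrfs_config_state() {
    if grep -q '^CONFIG_BTRFS_FS=y$' "$1"; then
        echo "builtin"
    elif grep -q '^CONFIG_BTRFS_FS=m$' "$1"; then
        echo "module"
    else
        echo "unset"
    fi
}
```

For the kernel discussed here, `btrfs_config_state /boot/config-5.13.0-0.rc3.25.ssl3.x86_64` would be expected to print `unset`, given the "is not set" line quoted above.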

Comment 25 Tony Camuso 2021-06-25 13:14:54 UTC
(In reply to Chris Murphy from comment #24)
> ># lsinitrd -i btrfs /boot/initramfs-5.13.0-0.rc3.25.ssl3.x86_64.img
> >getopt: invalid option -- 'i'
> 
> Ooops, should have been 'lsinitrd /path | grep -i btrfs'
> 
> But in any case, no btrfs module is in there and that's the problem.
> 
> >It's neither y nor m, but the mod is in the initrd. 
> 
> Only user space tools are in the initrd. CONFIG_BTRFS_FS must be y or m.
> That's the problem with this kernel, it flat out does not support btrfs.
> Again, I'm not familiar with this ssl kernel but whoever is building it
> needs to have a bug filed against their spec file, such that CONFIG_BTRFS_FS
> is either y or m. For many years Fedora only had it configured as m, and it
> worked fine as sysroot. Since btrfs became the default, it's a built-in
> driver.

'dnf provides kernel' reveals the culprit.

kernel-core-5.13.0-0.rc3.25.ssl3.x86_64 : The Linux kernel
Repo        : c9s-build-ssl3
Matched from:
Provide    : kernel = 5.13.0-0.rc3.25.ssl3

I'll need to reach out to the CentOS team to see why they do not configure btrfs at least as m.

Meanwhile, I can rebuild my ssl3 kernel with CONFIG_BTRFS_FS=y

Many thanks. I'm reclosing this BZ with ERRATA, because it is not related to the original problem.