Bug 2247872
Summary: | Don't write /etc/lvm/devices/system.devices when not doing an end-user install | ||
---|---|---|---|
Product: | [Fedora] Fedora | Reporter: | Adam Williamson <awilliam> |
Component: | anaconda | Assignee: | Vojtech Trefny <vtrefny> |
Status: | ON_QA --- | QA Contact: | Fedora Extras Quality Assurance <extras-qa> |
Severity: | high | Docs Contact: | |
Priority: | unspecified | ||
Version: | 39 | CC: | anaconda-maint, bugzilla, emmanuel, gmarr, k.koukiou, kparal, pboy, pbrobinson, pwhalen, robatino, teigland, vslavik, vtrefny, w |
Target Milestone: | --- | Keywords: | CommonBugs |
Target Release: | --- | Flags: | teigland: needinfo? (pboy) |
Hardware: | All | ||
OS: | Linux | ||
Whiteboard: | https://discussion.fedoraproject.org/t/95126 RejectedBlocker AcceptedFreezeException | ||
Fixed In Version: | anaconda-40.22.3-1.fc40 | Doc Type: | If docs needed, set a value |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | Type: | Bug | |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | |||
Bug Depends On: | |||
Bug Blocks: | 2187795 |
Description
Adam Williamson
2023-11-04 00:07:15 UTC
Nominating for a F40 blocker discussion.

OS images should not include /etc/lvm/devices/system.devices. It is specific to the hardware of the system. Ideally, image installers will have methods to generate system.devices after a system has been installed (e.g. run "vgimportdevices -a" after install). LVM will run fine without a devices file, but will be missing the advantages it provides. In the future we might look at having lvm itself detect a newly installed system and generate a local system.devices itself.

Maybe having a oneshot systemd script with a ConditionFirstBoot would allow it to check for /etc/lvm/devices/system.devices and do the bits needed?

(In reply to Peter Robinson from comment #3)
> Maybe having a oneshot systemd script with a ConditionFirstBoot would allow
> it to check for /etc/lvm/devices/system.devices and do the bits needed?

I'll take a look at that, it sounds like it might work. Thanks for the suggestion.

(In reply to Adam Williamson from comment #0)
> Since 9cccada80d21d30b4b4adc8919e278d7dbc316d1 , anaconda writes a
> /etc/lvm/devices/system.devices file when PVs are present in the install.
...
> See https://bugzilla.redhat.com/show_bug.cgi?id=2246871 for some earlier
> discussion of this, I am splitting it out into a separate bug report for
> clarity, and nominating it for CommonBugs status.
>
> I suspect this affects F38 and earlier too, but we noticed it with F39
> (pboy, would be interesting if you could check if this was also the case
> with earlier releases).

I checked F38 and F37. In those distribution images the directory /etc/lvm/devices is empty, so we didn't have that issue there. According to my findings, while "normally" using the system, a system.devices file was never created during system operation. So we always missed the advantages David mentioned in #2. I also checked the libvirt x86 VM images again. They contain the system.devices file, too, but it causes no issue because a VM usually uses /dev/vda3.
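The oneshot first-boot idea suggested above could be sketched as a systemd unit. This is only an illustrative sketch: the unit name is hypothetical, and using "vgimportdevices -a" as the generator command is an assumption based on David's earlier suggestion, not an agreed design.

```ini
# /etc/systemd/system/lvm-generate-devices.service (hypothetical name)
[Unit]
Description=Generate /etc/lvm/devices/system.devices on first boot
# Only run on the very first boot of the installed image...
ConditionFirstBoot=yes
# ...and only if no devices file exists yet
ConditionPathExists=!/etc/lvm/devices/system.devices

[Service]
Type=oneshot
ExecStart=/usr/sbin/vgimportdevices -a

[Install]
WantedBy=multi-user.target
```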
So, the question may be what made Anaconda leave the /etc/lvm/devices directory for aarch64 empty in F37/F38 but not in F39, and why it created the file on x86.

Furthermore, in F38 using arm-image-installer 3.8, it recognized the existence of a VG on the server system with the same name as in the SBC image and announced it would be renamed to fedora-server. But it did not; the name is still fedora. That caused no issue as far as I know, neither when generating the mSD card nor during system operation using the mSD. But if you try to install the system, from a system booted from an mSD card, to onboard eMMC storage, the installation fails, not because of VGs with the same name, but because of 2 partitions with the same UUID. But that's probably off-topic here and worth a separate bug.

Proposed fixes for Anaconda and Blivet:
- https://github.com/rhinstaller/anaconda/pull/5325
- https://github.com/storaged-project/blivet/pull/1169/

With this we won't create the devices file when running an image install. If we want a oneshot service to create it during the first boot, I think it should be handled by LVM, not by the installer.

I think these solutions fix a follow-up problem; the root cause is somewhere else, and they don't fix the broader problem. See #2258764. The LVM group made an unannounced change in F38 that modified the behavior of vgscan and vgchange: it changed the lvm.conf option 'use_devicesfile' to 1 (previously 0). Therefore, both commands now work differently for new VGs and limit the search to devices listed in the /etc/lvm/devices/system.devices file. Unfortunately, this renders both commands more or less useless for practical LVM administration tasks. If you want to integrate a new VG into a running system, it will most likely be on a new device. It is therefore probably better to revert to the previous state. The LVM group does not appear to be keen to do this, and has not yet specified any reason why this change has been made.
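The configuration change being discussed lives in the devices section of lvm.conf. A fragment for orientation (the commented filter value is a made-up example of the older opt-out style, not a recommended setting):

```ini
# /etc/lvm/lvm.conf (fragment)
devices {
    # 1 (the new default since F38): lvm only uses devices listed in
    # /etc/lvm/devices/system.devices ("opt in")
    # 0 (the old behavior): lvm scans all devices, minus any rejected
    # by a filter ("opt out")
    use_devicesfile = 1

    # Old-style opt-out filtering, e.g. reject /dev/sdb, accept everything
    # else (example value only):
    # filter = [ "r|^/dev/sdb$|", "a|.*|" ]
}
```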
Just an additional proposal: A complete solution would include

1. Remove the devices file from the image, because as mentioned in #2: OS images should not include /etc/lvm/devices/system.devices. It is specific to the hardware of the system.

2. Create the devices file specifically for the installed hardware at first system boot, because upstream introduced that feature and we miss features otherwise (see #2).

3. Resolve the vgscan/vgchange issue by
* either changing the package and setting use_devicesfile = 0 in lvm.conf, if the LVM group is happy with this,
* or changing lvm.conf just for Server, unless the LVM group advises against it,
* or rewriting various scripts, specifically arm-image-installer, to use vgimportdevices -a before any other LVM commands, as mentioned in Bug #2258764 #3 by Paul Howarth, and resetting the devices file afterward, as well as adjusting a lot of our documentation, because it is misleading or incomplete now.

David Teigland provided additional information in another thread I would like to share here:

===
use_devicesfile=1 is meant to be the standard way of using lvm for years now, and it should have been that way in fedora at least a couple years ago. The devices file feature was introduced to lvm primarily as an "opt in" mechanism for using devices with lvm. Previously, lvm has always had an "opt out" approach in which devices needed to be rejected with the filter to stop lvm from using them. Over the past several years, it's become increasingly likely that lvm devices attached to a host no longer belong to the host and it's not safe for the host to assume it can use them. e.g. lvm devices are quite likely to belong to a guest VM, and there were many instances of hosts using and corrupting lvm devices that were in use by a guest VM. Similar problems exist with machines connected to a SAN.
===

So it's probably best not to permanently change the LVM configuration nor permanently remove the devices file.
Sorry for this "additional addition". David has kindly provided further information and suggestions (see https://bugzilla.redhat.com/show_bug.cgi?id=2258764#c10). Accordingly, points 1. and 2. from #8 (and the already proposed PR) make sense, as does the third alternative from 3. In arm-image-installer, either "--devices /dev/foo" would have to be added to the LVM commands, or the device would have to be added to the devices file at the beginning and then removed at the end (lvmdevices --adddev|--deldev /dev/foo).

So I think, after a longer round trip, we end up with an adaptation of the Anaconda image generation, quasi as a side effect of the original issue, and a modification of the arm-image-installer script, as assumed at the beginning.

Discussed during the 2024-02-12 blocker review meeting: [0] The decision to delay the classification of this as a blocker bug was made as pboy is working through the plan and the implications here; we will delay the decision for a bit so we have a clearer picture.

[0] https://meetbot.fedoraproject.org/blocker-review_matrix_fedoraproject-org/2024-02-12/f40-blocker-review.2024-02-12-17.05.txt

We could `rm /etc/lvm/devices/system.devices` in %post of the kickstart used to create the image. If the file is included, LVM commands don't work as expected.

@Vojtech Trefny
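The kickstart idea mentioned above would be a one-line %post script. A minimal sketch (where exactly this belongs in the image-building kickstart is an assumption about the build setup):

```
%post
# Remove the build host's devices file so the image does not ship
# hardware-specific LVM state
rm -f /etc/lvm/devices/system.devices
%end
```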
>Proposed fixes for Anaconda and Blivet:
> - https://github.com/rhinstaller/anaconda/pull/5325
> - https://github.com/storaged-project/blivet/pull/1169/
What is the current status here? As far as I see it has been merged (Jan10), but with Fedora 40 Branched 20240219 the image file still includes the devices file.
@pwhalen

> If the file is included, LVM commands don't work as expected.

Well, you overlook that the "work as expected" has changed! With F39, the new "work as expected" is that the vg* commands look into the devices file and work only on those devices listed therein. If you connect a new device and want to have it included, you either have to add it on purpose to the devices file (opt in) using 'lvmdevices --adddev /dev/foo', or instruct each vg* command to use additional devices using the command line option --devices /dev/foo. See my description in #9 and #10 or, if you prefer to read the original and not just my citation, you may check https://bugzilla.redhat.com/show_bug.cgi?id=2258764, specifically #8, #9, #10.

Therefore, we have to adjust the images *and* the arm-image-installer script to the new system-wide way LVM works. We missed this for F39 because unfortunately this change was discussed neither on the devel mailing list nor in a change proposal (see https://bugzilla.redhat.com/show_bug.cgi?id=2258764#c4).

@pwhalen: As an addendum: During our entire discussion last year about the F39 installation problem, we were "on the wrong side of the fence" the whole time because we didn't know about that change. Nearly all the time and effort we spent on Bug https://bugzilla.redhat.com/show_bug.cgi?id=2246871 was for nothing, and all the arguments and considerations we made there turned out to be incorrect and completely beside the point. We now have to rethink this.

(In reply to Peter Boy from comment #13)
> @Vojtech Trefny
>
> >Proposed fixes for Anaconda and Blivet:
> > - https://github.com/rhinstaller/anaconda/pull/5325
> > - https://github.com/storaged-project/blivet/pull/1169/
>
> What is the current status here? As far as I see it has been merged (Jan10),
> but with Fedora 40 Branched 20240219 the image file still includes the
> devices file.
I forgot there is one more place where the LVM devices file is being written by Anaconda: https://github.com/rhinstaller/anaconda/pull/5484

(In reply to Peter Boy from comment #14)
> @pwhalen
>
> >If the file is included, LVM commands don't work as expected.
>
> Well, you overlook that the "work as expected" has changed!

If the file is deleted, functionality returns to what was in previous releases.

> Therefore, we have to adjust the images *and* the arm-image-installer script
> to the new system-wide way LVM works. We missed this for F39 because
> unfortunately this change was discussed neither on the devel mailing list
> nor in a change proposal (see
> https://bugzilla.redhat.com/show_bug.cgi?id=2258764#c4).

Let's keep this BZ focused on LVM. Please open another for any issues encountered with the arm-image-installer.

@Vojtech Trefny: Just a question: Does the Anaconda image handling capability allow configuring the image to issue something such as "vgimportdevices -a" at first boot on the target system? That would ensure we get the same configuration for systems installed with an ISO file and installed using an image file. I would like to get a consistent installation result across the various installation methods, specifically for Fedora Server Edition. Affected would be the aarch64 installation image we are talking about all the time here, and the KVM image.

I'm curious about any advice about the current best practice for a first-boot, one-shot service that would run vgimportdevices -a. Googling found various examples of this sort of thing, but most were fairly old, and possibly outdated. I'm also thinking about a variation of vgimportdevices to run here, which would basically be "vgimportdevices <rootvg>", and only import LVM devices for the root VG, rather than everything. We don't really know if other VGs, which happen to be attached and visible during first boot, are truly safe for the host to be using.
So, it would be safest to import only the root VG, and require the admin to decide themselves which other VGs the host should have access to. That said, anaconda does run vgimportdevices -a during install, which is going to help in cases where the user does want to access other existing VGs after install.

> Let's keep this BZ focused on LVM. Please open another for any issues encountered with the arm-image-installer.

Created: New system-wide LVM configuration requires adaptation of arm-image-installer (https://bugzilla.redhat.com/show_bug.cgi?id=2265422)

Discussed during the 2024-03-04 blocker review meeting: [0] The decision to classify this bug as a "RejectedBlocker (Final)" and an "AcceptedFreezeException (Final)" was made as this cannot block the release, as it affects the qcow2 image only, which is not in the release-blocking list for Fedora 40. As it's a significant issue in a non-blocking image, we grant it a freeze exception.

[0] https://meetbot.fedoraproject.org/blocker-review_matrix_fedoraproject-org/2024-03-04/f40-blocker-review.2024-03-04-17.00.log.txt

The release Beta 1.10 still contains the wrong devices file in both images:
* Fedora-Server-KVM-40_Beta-1.10.x86_64.qcow2
* Fedora-Server-40_Beta-1.10.aarch64.raw.xz

We have a FE for this bug, so it could still get fixed for the final release. Any chance?

> I'm curious about any advice about the current best practice for a first-boot, one-shot service that would run vgimportdevices -a. Googling found various examples of this sort of thing, but most were fairly old, and possibly outdated.

In a recent posting Stephen Gallagher (sgallagh) referred to https://docs.fedoraproject.org/en-US/packaging-guidelines/Initial_Service_Setup/ as the Fedora way to handle this.

> I'm also thinking about a variation of vgimportdevices to run here, which would basically be "vgimportdevices <rootvg>", and only import LVM devices for the root VG, rather than everything.
> We don't really know if other VGs, which happen to be attached and visible
> during first boot, are truly safe for the host to be using.

That would be the very conservative and very cautious approach. I think the VGs that are configured during the installation or during the generation of an image should be sufficiently safe to include.

> We have a FE for this bug, so it could still get fixed for the final release. Any chance?
The Anaconda part of the fix is included in 40.22.3, which was released a few days ago, so this should be fixed in the next compose.
Peter, can you confirm if this is good now? Thanks!

I checked with branched 2024-04-06. Both images still contain the file /etc/lvm/devices/system.devices with the content /dev/vda3. But if I dd the Fedora-Server-40-20240406.n.0.aarch64.raw.xz directly to a box with Tow-Boot in SPI (so it can boot the root filesystem directly w/o modifications by arm-image-installer), the system.devices file is obviously changed at first boot and contains the correct partition entries. I haven't checked this with the KVM image yet. In the end, it looks like a successful fix here, different from the early plan to remove the system.devices file from images. But I'm not sure. It would be helpful to get some information from the maintainers.

Tested with RC 1.12:
* Both Server image files (Fedora-Server-KVM-40-1.12.x86_64.qcow2 & Fedora-Server-40-1.12.aarch64.raw.xz) still contain a devices file in /etc/lvm/devices/system.devices, which refers to /dev/vda3, obviously from the build host.
* If you leave this unchanged, the devices file in the KVM image will be left as /dev/vda3, which is correct for a libvirt KVM, and for the ARM SBC device it is correctly changed to mmcblk1p3.
* If you manually delete the devices file in each of the images, then after the first boot the /etc/lvm/devices subdirectory in both images remains empty.

So, the fix provides a correct result, albeit differently than originally thought. Nevertheless, we need information on the rules according to which the correction is made, in order to be able to document this in the documentation and in the release notes.

I am glad it works now, but unfortunately not thanks to the changes that I made. I just found that the fixes for Anaconda and Blivet don't work in this situation at all -- the LVM devices feature is skipped only during image installation, and the server images are not installed this way.
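For anyone inspecting the images, the devices file can be checked with plain file tools. The sketch below writes a sample system.devices in the approximate on-disk format (the VERSION and PVID values are invented for illustration) and extracts the device entry the reports above refer to:

```shell
#!/bin/sh
# Write a sample system.devices (approximate format; VERSION and PVID
# below are invented for illustration) into a scratch directory.
tmp=$(mktemp -d)
cat > "$tmp/system.devices" <<'EOF'
# LVM uses devices listed in this file.
VERSION=1.1.2
IDTYPE=devname IDNAME=/dev/vda3 DEVNAME=/dev/vda3 PVID=aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa PART=3
EOF

# Extract the DEVNAME field -- this is what "the image refers to
# /dev/vda3" means in the test reports above.
devname=$(grep -o 'DEVNAME=[^ ]*' "$tmp/system.devices")
echo "$devname"
rm -rf "$tmp"
```

On an image built on an x86 VM host this is where /dev/vda3 shows up, which is why the file is wrong for an ARM SBC booting from mmcblk1p3.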
As far as I can tell (at least for the KVM image), it's just a normal installation, so we won't be able to tell that we need to skip writing /etc/lvm/devices/system.devices in this case. This needs to be solved in a different way, for example in a post script in the kickstart.

I have an initial working version of automatic system.devices creation for image-based OS installation. I'm not familiar with the image preparation process, so I'd like to know if the image preparation steps this requires are workable:
- create empty file /etc/lvm/devices/auto-import-rootvg
- remove any existing /etc/lvm/devices/system.devices
- enable lvm-devices-init.path and lvm-devices-init.service

Patch description: This is intended for image-based OS deployments, where an installer is not run on the target machine to create a custom system.devices. Instead, the OS image preparation can configure the image so that lvm will automatically create system.devices for the root VG on first boot.

image preparation:
- create empty file /etc/lvm/devices/auto-import-rootvg
- remove any existing /etc/lvm/devices/system.devices
- enable lvm-devices-init.path and lvm-devices-init.service

on first boot:
- udev triggers vgchange -aay --autoactivation event <rootvg>
- vgchange activates LVs in the root VG
- vgchange finds auto-import-rootvg, and no system.devices, so it creates /run/lvm/lvm-devices-init
- lvm-devices-init.path is run when /run/lvm/lvm-devices-init appears, and triggers lvm-devices-init.service
- lvm-devices-init.service runs vgimportdevices --rootvg --auto
- vgimportdevices finds auto-import-rootvg, and no system.devices, so it creates system.devices containing PVs in the root VG, and removes /etc/lvm/devices/auto-import-rootvg and /run/lvm/lvm-devices-init

lvm-devices-import.path:

[Unit]
Description=lvm-devices-import to create system.devices
# /run/lvm/lvm-devices-import created by vgchange -aay <rootvg>

[Path]
PathExists=/run/lvm/lvm-devices-import
Unit=lvm-devices-import.service
ConditionPathExists=!/etc/lvm/devices/system.devices

[Install]
WantedBy=multi-user.target

lvm-devices-import.service:

[Unit]
Description=Create lvm system.devices

[Service]
Type=oneshot
RemainAfterExit=no
ExecStart=/usr/sbin/vgimportdevices --rootvg --auto
ConditionPathExists=!/etc/lvm/devices/system.devices

[Install]
WantedBy=multi-user.target
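The three image-preparation steps listed in the patch description could be scripted roughly as below. This is a sketch under assumptions: IMAGE_ROOT stands for wherever the image's root filesystem is mounted during build (here it defaults to a scratch directory so the sketch is harmless to run), and the symlink is the manual equivalent of enabling the path unit with systemctl; the unit name follows the draft units above.

```shell
#!/bin/sh
# Hypothetical mount point of the image root during build; defaults to a
# scratch directory for demonstration purposes.
IMAGE_ROOT=${IMAGE_ROOT:-$(mktemp -d)}

# 1. create the empty marker file that requests root-VG auto-import
mkdir -p "$IMAGE_ROOT/etc/lvm/devices"
: > "$IMAGE_ROOT/etc/lvm/devices/auto-import-rootvg"

# 2. remove any system.devices inherited from the build host
rm -f "$IMAGE_ROOT/etc/lvm/devices/system.devices"

# 3. enable the path unit; the symlink is the manual equivalent of
#    "systemctl --root=$IMAGE_ROOT enable lvm-devices-import.path"
mkdir -p "$IMAGE_ROOT/etc/systemd/system/multi-user.target.wants"
ln -sf /usr/lib/systemd/system/lvm-devices-import.path \
  "$IMAGE_ROOT/etc/systemd/system/multi-user.target.wants/lvm-devices-import.path"

echo "prepared $IMAGE_ROOT"
```

On first boot the units drafted above would then take over, and lvm removes the marker file once system.devices has been generated.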