Note: This is a public test instance of Red Hat Bugzilla. The data contained within is a snapshot of the live data so any changes you make will not be reflected in the production Bugzilla. Email is disabled so feel free to test any aspect of the site that you want. File any problems you find or give feedback at bugzilla.redhat.com.
Bug 1976653
Summary: | dracut lock-up of 1m30s in initrd due to new dbus-broker ordering causing slow boot | ||||||||
---|---|---|---|---|---|---|---|---|---|
Product: | [Fedora] Fedora | Reporter: | Dusan Bobak <drunkez> | ||||||
Component: | dracut | Assignee: | dracut-maint-list | ||||||
Status: | NEW --- | QA Contact: | Fedora Extras Quality Assurance <extras-qa> | ||||||
Severity: | medium | Docs Contact: | |||||||
Priority: | unspecified | ||||||||
Version: | 34 | CC: | ade.rixon, adrien-xx-redhatbz, ahabig, amessina, balay, belegdol, bmason, bmchugh, bugzilla, bugzilla, casep, chriscarson60187, christof, danielsun3164, denezmarchand, dfediuck, dgilbert, dominik, dracut-maint-list, drunkez, faber, florian, frank, fschwarz, fsumsal, garrett.mitchener, goeran, hdegoede, henrik, hpillay, i.maximets, ipazosat, jburgess777, jesus, jonathan, jonjac, joshas, larsper, lprosseda, m3lh0r, mail, marko.bevc, mdtha, mihai, mironov.ivan, nalin, pdwyer, pioann, redhat, remyabel, remy-redhat, rgarcia, robatino, rsandu, Simon.Gerhards, skrenger, stanley.king, stefan.hoelldampf, stefano.biagiotti, stefanrin, stepglenn, stevenfalco, suvi, todoleza, tomastrnka, tseewald, viraptor, viraptor, wurstsemmel, zbyszek | ||||||
Target Milestone: | --- | ||||||||
Target Release: | --- | ||||||||
Hardware: | x86_64 | ||||||||
OS: | Linux | ||||||||
Whiteboard: | |||||||||
Fixed In Version: | Doc Type: | If docs needed, set a value | |||||||
Doc Text: | Story Points: | --- | |||||||
Clone Of: | Environment: | ||||||||
Last Closed: | Type: | Bug | |||||||
Regression: | --- | Mount Type: | --- | ||||||
Documentation: | --- | CRM: | |||||||
Verified Versions: | Category: | --- | |||||||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||||
Cloudforms Team: | --- | Target Upstream Version: | |||||||
Embargoed: | |||||||||
Attachments: |
|
Description
Dusan Bobak
2021-06-27 21:28:04 UTC
What is your kernel command line? cat /proc/cmdline Hi, I am facing the same issue. Today I upgraded to I upgraded my laptop from Fedora 32 to Fedora 34 and everything works fine but it takes a lot to boot. After a quick analysis I found out this service is consuming all the boot time [ipazosat@x1 ~]$ systemd-analyze blame 1min 30.453s dracut-initqueue.service Logs: dracut-initqueue.service - dracut initqueue hook Loaded: loaded (/usr/lib/systemd/system/dracut-initqueue.service; static) Active: inactive (dead) since Mon 2021-06-28 18:07:54 CEST; 5min ago Docs: man:dracut-initqueue.service(8) Main PID: 480 (code=exited, status=0/SUCCESS) CPU: 0 Jun 28 18:06:23 x1 systemd[1]: Starting dracut initqueue hook... Jun 28 18:06:23 x1 dracut-initqueue[500]: Scanning devices nvme0n1p3 for LVM logical volumes fedora_localhost-live/root Jun 28 18:06:23 x1 dracut-initqueue[500]: fedora_localhost-live/swap Jun 28 18:06:23 x1 dracut-initqueue[521]: inactive '/dev/fedora_localhost-live/swap' [7.77 GiB] inherit Jun 28 18:06:23 x1 dracut-initqueue[521]: inactive '/dev/fedora_localhost-live/home' [159.11 GiB] inherit Jun 28 18:06:23 x1 dracut-initqueue[521]: inactive '/dev/fedora_localhost-live/root' [70.00 GiB] inherit Jun 28 18:07:53 x1 dracut-initqueue[523]: WARNING: D-Bus notification failed: Transport endpoint is not connected Jun 28 18:07:54 x1 systemd[1]: Finished dracut initqueue hook. Jun 28 18:07:54 x1 systemd[1]: dracut-initqueue.service: Deactivated successfully. Jun 28 18:07:54 x1 systemd[1]: Stopped dracut initqueue hook. I don't have units failing in my laptop etc. ● x1 State: running Jobs: 0 queued Failed: 0 units Model is X1 Carbon GEN 7. [ipazosat@x1 ~]$ cat /proc/cmdline BOOT_IMAGE=(hd0,gpt2)/vmlinuz-5.12.12-300.fc34.x86_64 root=/dev/mapper/fedora_localhost--live-root ro resume=/dev/mapper/fedora_localhost--live-swap rd.lvm.lv=fedora_localhost-live/root rd.lvm.lv=fedora_localhost-live/swap rhgb quiet BR, Iván Same problem here. $ sudo systemd-analyze blame | head -1 1min 40.713s dracut-initqueue.service $ cat /proc/cmdline BOOT_IMAGE=(hd0,gpt2)/vmlinuz-5.12.12-300.fc34.x86_64 root=/dev/mapper/fedora-root ro resume=/dev/mapper/fedora-swap rd.lvm.lv=fedora/root rd.luks.uuid=luks-e4b4abe4-9c4d-4d67-9faf-9148c7ad2642 rd.lvm.lv=fedora/swap rhgb quiet systemd.unified_cgroup_hierarchy=0 $ sudo journalctl -b -u dracut-initqueue.service -- Journal begins at Fri 2021-01-08 08:31:53 CET, ends at Tue 2021-06-29 08:10:14 CEST. -- Jun 29 07:46:24 localhost.localdomain systemd[1]: Starting dracut initqueue hook... Jun 29 07:46:35 localhost.localdomain dracut-initqueue[721]: Scanning devices dm-0 for LVM logical volumes fedora/root Jun 29 07:46:35 localhost.localdomain dracut-initqueue[721]: fedora/swap Jun 29 07:46:35 localhost.localdomain dracut-initqueue[743]: inactive '/dev/fedora/swap' [7.78 GiB] inherit Jun 29 07:46:35 localhost.localdomain dracut-initqueue[743]: inactive '/dev/fedora/home' [<894.89 GiB] inherit Jun 29 07:46:35 localhost.localdomain dracut-initqueue[743]: inactive '/dev/fedora/root' [50.00 GiB] inherit Jun 29 07:48:05 localhost.localdomain dracut-initqueue[745]: WARNING: D-Bus notification failed: Transport endpoint is not connected Jun 29 07:48:05 localhost.localdomain systemd[1]: Finished dracut initqueue hook. Jun 29 07:48:05 localhost.localdomain systemd[1]: dracut-initqueue.service: Deactivated successfully. Jun 29 07:48:05 localhost.localdomain systemd[1]: Stopped dracut initqueue hook. This problem started to happen a couple of days ago: -- Boot c997dd7f982b4ddba44687cc30aa7893 -- Jun 26 11:52:33 localhost.localdomain systemd[1]: Starting dracut initqueue hook... Jun 26 11:52:56 localhost.localdomain dracut-initqueue[892]: Scanning devices dm-0 for LVM logical volumes fedora/root Jun 26 11:52:56 localhost.localdomain dracut-initqueue[892]: fedora/swap Jun 26 11:52:56 localhost.localdomain dracut-initqueue[913]: inactive '/dev/fedora/swap' [7.78 GiB] inherit Jun 26 11:52:56 localhost.localdomain dracut-initqueue[913]: inactive '/dev/fedora/home' [<894.89 GiB] inherit Jun 26 11:52:56 localhost.localdomain dracut-initqueue[913]: inactive '/dev/fedora/root' [50.00 GiB] inherit Jun 26 11:52:56 localhost.localdomain systemd[1]: Finished dracut initqueue hook. Jun 26 11:52:56 localhost.localdomain systemd[1]: dracut-initqueue.service: Deactivated successfully. Jun 26 11:52:56 localhost.localdomain systemd[1]: Stopped dracut initqueue hook. -- Boot e130f712098347768c3481fea9143c41 -- Jun 26 11:58:03 localhost.localdomain systemd[1]: Starting dracut initqueue hook... Jun 26 11:58:12 localhost.localdomain dracut-initqueue[707]: Scanning devices dm-0 for LVM logical volumes fedora/root Jun 26 11:58:12 localhost.localdomain dracut-initqueue[707]: fedora/swap Jun 26 11:58:12 localhost.localdomain dracut-initqueue[728]: inactive '/dev/fedora/swap' [7.78 GiB] inherit Jun 26 11:58:12 localhost.localdomain dracut-initqueue[728]: inactive '/dev/fedora/home' [<894.89 GiB] inherit Jun 26 11:58:12 localhost.localdomain dracut-initqueue[728]: inactive '/dev/fedora/root' [50.00 GiB] inherit Jun 26 11:59:43 localhost.localdomain dracut-initqueue[730]: WARNING: D-Bus notification failed: Transport endpoint is not connected Jun 26 11:59:43 localhost.localdomain systemd[1]: Finished dracut initqueue hook. Jun 26 11:59:43 localhost.localdomain systemd[1]: dracut-initqueue.service: Deactivated successfully. Jun 26 11:59:43 localhost.localdomain systemd[1]: Stopped dracut initqueue hook. dnf history shows I upgraded around that moment, at 2021-06-26 11:55. dracut was not upgraded, but dbus-broker was. Perhaps that's the issue? Transaction ID : 771 Begin time : Sat 26 Jun 2021 11:55:02 AM CEST Begin rpmdb : 2878:43d2b0ad8a3a1726a1144c156f05d3d7887e763d End time : Sat 26 Jun 2021 11:57:23 AM CEST (141 seconds) End rpmdb : 2877:11a12724f93cb5006c00a2fcb70c53cdb5afaa90 User : root <root> Return-Code : Success Releasever : Command Line : Comment : Packages Altered: Install kernel-5.12.12-300.fc34.x86_64 @updates Install kernel-core-5.12.12-300.fc34.x86_64 @updates Install kernel-debug-devel-5.12.12-300.fc34.x86_64 @updates Install kernel-devel-5.12.12-300.fc34.x86_64 @updates Install kernel-modules-5.12.12-300.fc34.x86_64 @updates Install kernel-modules-extra-5.12.12-300.fc34.x86_64 @updates Upgrade cmake-3.20.5-1.fc34.x86_64 @updates Upgrade cmake-data-3.20.5-1.fc34.noarch @updates Upgrade cmake-filesystem-3.20.5-1.fc34.x86_64 @updates Upgrade cmake-rpm-macros-3.20.5-1.fc34.noarch @updates Upgrade dbus-broker-29-1.fc34.x86_64 @updates Upgrade gst-editing-services-1.19.1-2.1.18.4.fc34.x86_64 @updates Upgrade gstreamer1-1.19.1-2.1.18.4.fc34.x86_64 @updates Upgrade gstreamer1-plugins-bad-free-1.19.1-2.1.18.4.fc34.x86_64 @updates Upgrade gstreamer1-plugins-base-1.19.1-2.1.18.4.fc34.x86_64 @updates Upgrade gstreamer1-plugins-good-1.19.1-2.1.18.4.fc34.x86_64 @updates Upgrade gstreamer1-plugins-good-gtk-1.19.1-2.1.18.4.fc34.x86_64 @updates Upgrade gstreamer1-plugins-good-qt-1.19.1-2.1.18.4.fc34.x86_64 @updates Upgrade gstreamer1-plugins-ugly-free-1.19.1-2.1.18.4.fc34.x86_64 @updates Upgrade libinput-1.18.0-2.fc34.x86_64 @updates Upgrade libinput-devel-1.18.0-2.fc34.x86_64 @updates Upgrade libxcrypt-4.4.22-2.fc34.i686 @updates Upgrade libxcrypt-4.4.22-2.fc34.x86_64 @updates Upgrade libxcrypt-compat-4.4.22-2.fc34.x86_64 @updates Upgrade libxcrypt-devel-4.4.22-2.fc34.i686 @updates Upgrade libxcrypt-devel-4.4.22-2.fc34.x86_64 @updates Upgrade nano-5.8-2.fc34.x86_64 @updates Upgrade nano-default-editor-5.8-2.fc34.noarch @updates Upgrade pam-1.5.1-6.fc34.x86_64 @updates Upgrade pipewire-0.3.30-4.fc34.x86_64 @updates Upgrade pipewire-alsa-0.3.30-4.fc34.x86_64 @updates Upgrade pipewire-gstreamer-0.3.30-4.fc34.x86_64 @updates Upgrade pipewire-jack-audio-connection-kit-0.3.30-4.fc34.x86_64 @updates Upgrade pipewire-libs-0.3.30-4.fc34.x86_64 @updates Upgrade pipewire-pulseaudio-0.3.30-4.fc34.x86_64 @updates Upgrade pipewire-utils-0.3.30-4.fc34.x86_64 @updates Upgrade ripgrep-13.0.0-1.fc34.x86_64 @updates Upgrade rpmautospec-rpm-macros-0.2.3-1.fc34.noarch @updates Upgrade tigervnc-1.11.0-12.fc34.x86_64 @updates Upgrade tigervnc-icons-1.11.0-12.fc34.noarch @updates Upgrade tigervnc-license-1.11.0-12.fc34.noarch @updates Upgrade totem-1:3.38.1-1.fc34.x86_64 @updates Upgrade totem-video-thumbnailer-1:3.38.1-1.fc34.x86_64 @updates Upgrade usermode-1.114-2.fc34.x86_64 @updates Upgrade xorg-x11-server-Xwayland-21.1.1-3.fc34.x86_64 @updates Upgraded kernel-5.12.9-300.fc34.x86_64 @@System Upgraded kernel-core-5.12.9-300.fc34.x86_64 @@System Upgraded kernel-debug-devel-5.12.9-300.fc34.x86_64 @@System Upgraded kernel-devel-5.12.9-300.fc34.x86_64 @@System Upgraded kernel-modules-5.12.9-300.fc34.x86_64 @@System Upgraded kernel-modules-extra-5.12.9-300.fc34.x86_64 @@System Removed kmod-VirtualBox-5.12.9-300.fc34.x86_64-6.1.22-1.fc34.x86_64 @@System Upgraded cmake-3.20.3-1.fc34.x86_64 @@System Upgraded cmake-data-3.20.3-1.fc34.noarch @@System Upgraded cmake-filesystem-3.20.3-1.fc34.x86_64 @@System Upgraded cmake-rpm-macros-3.20.3-1.fc34.noarch @@System Upgraded dbus-broker-28-3.fc34.x86_64 @@System Upgraded gst-editing-services-1.19.1-1.fc34.x86_64 @@System Upgraded gstreamer1-1.19.1-1.fc34.x86_64 @@System Upgraded gstreamer1-plugins-bad-free-1.19.1-2.fc34.x86_64 @@System Upgraded gstreamer1-plugins-base-1.19.1-1.fc34.x86_64 @@System Upgraded gstreamer1-plugins-good-1.19.1-1.fc34.x86_64 @@System Upgraded gstreamer1-plugins-good-gtk-1.19.1-1.fc34.x86_64 @@System Upgraded gstreamer1-plugins-good-qt-1.19.1-1.fc34.x86_64 @@System Upgraded gstreamer1-plugins-ugly-free-1.19.1-1.fc34.x86_64 @@System Upgraded libinput-1.18.0-1.fc34.x86_64 @@System Upgraded libinput-devel-1.18.0-1.fc34.x86_64 @@System Upgraded libxcrypt-4.4.20-2.fc34.i686 @@System Upgraded libxcrypt-4.4.20-2.fc34.x86_64 @@System Upgraded libxcrypt-compat-4.4.20-2.fc34.x86_64 @@System Upgraded libxcrypt-devel-4.4.20-2.fc34.i686 @@System Upgraded libxcrypt-devel-4.4.20-2.fc34.x86_64 @@System Upgraded nano-5.6.1-1.fc34.x86_64 @@System Upgraded nano-default-editor-5.6.1-1.fc34.noarch @@System Upgraded pam-1.5.1-5.fc34.x86_64 @@System Upgraded pipewire-0.3.30-2.fc34.x86_64 @@System Upgraded pipewire-alsa-0.3.30-2.fc34.x86_64 @@System Upgraded pipewire-gstreamer-0.3.30-2.fc34.x86_64 @@System Upgraded pipewire-jack-audio-connection-kit-0.3.30-2.fc34.x86_64 @@System Upgraded pipewire-libs-0.3.30-2.fc34.x86_64 @@System Upgraded pipewire-pulseaudio-0.3.30-2.fc34.x86_64 @@System Upgraded pipewire-utils-0.3.30-2.fc34.x86_64 @@System Upgraded ripgrep-12.1.1-6.fc34.x86_64 @@System Upgraded rpmautospec-rpm-macros-0.2.1-1.fc34.noarch @@System Upgraded tigervnc-1.11.0-11.fc34.x86_64 @@System Upgraded tigervnc-icons-1.11.0-11.fc34.noarch @@System Upgraded tigervnc-license-1.11.0-11.fc34.noarch @@System Upgraded totem-1:3.38.0-6.fc34.x86_64 @@System Upgraded totem-video-thumbnailer-1:3.38.0-6.fc34.x86_64 @@System Upgraded usermode-1.114-1.fc34.x86_64 @@System Upgraded xorg-x11-server-Xwayland-21.1.1-2.fc34.x86_64 @@System The delay is due to a 90-second timeout when dracut-initqueue tries to call DBus immediately after finding the root device: čen 29 09:45:20 electra dracut-initqueue[676]: Scanning devices dm-0 for LVM logical volumes electra/root čen 29 09:45:20 electra dracut-initqueue[676]: electra/swap čen 29 09:45:20 electra dracut-initqueue[697]: inactive '/dev/electra/root' [50.00 GiB] inherit čen 29 09:45:20 electra dracut-initqueue[697]: inactive '/dev/electra/home' [340.00 GiB] inherit čen 29 09:45:20 electra dracut-initqueue[697]: inactive '/dev/electra/swap' [8.00 GiB] inherit čen 29 09:45:20 electra systemd[1]: Found device /dev/mapper/electra-root. čen 29 09:45:20 electra systemd[1]: Reached target Initrd Root Device. čen 29 09:46:50 electra dracut-initqueue[699]: WARNING: D-Bus notification failed: Transport endpoint is not connected čen 29 09:46:50 electra systemd[1]: Found device /dev/electra/swap. čen 29 09:46:50 electra systemd[1]: Starting Resume from hibernation using device /dev/electra/swap... čen 29 09:46:50 electra systemd-hibernate-resume[714]: Could not resume from '/dev/electra/swap' (253:2). čen 29 09:46:50 electra kernel: PM: Image not found (code -22) čen 29 09:46:50 electra systemd[1]: systemd-hibernate-resume: Deactivated successfully. čen 29 09:46:50 electra systemd[1]: Finished Resume from hibernation using device /dev/electra/swap. čen 29 09:46:50 electra systemd[1]: Reached target Local File Systems (Pre). čen 29 09:46:50 electra systemd[1]: Reached target Local File Systems. čen 29 09:46:50 electra systemd[1]: Reached target System Initialization. At this point, dbus-broker isn't running yet, so the call can't succeed. This is due to the following recent commit to dbus-broker, which explicitly orders it after sysinit.target (which, as you can see above, is reached long after the affected DBus call). https://github.com/bus1/dbus-broker/commit/28af5ac996c94922fc49660e0efa955f57575187 In F34 as of https://bodhi.fedoraproject.org/updates/FEDORA-2021-221498e066 > From 28af5ac996c94922fc49660e0efa955f57575187 Mon Sep 17 00:00:00 2001 > From: David Rheinsberg <david.rheinsberg> > Date: Mon, 21 Jun 2021 11:12:41 +0200 > Subject: [PATCH] units: explicitly order against dbus.socket and > sysinit.target > > First of all, order all our units after dbus.socket. This is implicit > in systemd for all socket activated units. However, dues to the aliasing > to dbus.service, this is only true if our units are enabled. Hence, by > being explicit, we make sure the dependency is correctly shown when > debugging disabled services as well. > > Secondly, make sure to order after sysinit.target, so basic system > setup is done before we start. This is especially important to make sure > the launcher can resolve user names and read the disk. > > Signed-off-by: David Rheinsberg <david.rheinsberg> As a workaround until this gets fixed properly, just edit /usr/lib/systemd/system/dbus-broker.service and remove "sysinit.target" from the "After=" line. (Or if you want it the kosher way, copy that file into /etc/systemd/system and customize it there. But then you'll have to remember to remove the file once this gets fixed in the package.) Afterwards, run "dracut -f" to rebuild your initramfs. *** Bug 1976621 has been marked as a duplicate of this bug. *** Can confirm that the workaround in comment #4 works (Bug #1976621) *** Bug 1976989 has been marked as a duplicate of this bug. *** I use dnf update --exclude=dracut* till this is fixed. *** Bug 1977919 has been marked as a duplicate of this bug. *** Another workaround (Ricardo Garcia in Bug #1976989) > In my case setting notify_dbus=0 in /etc/lvm/lvm.conf and running "dracut > -f" as root did fix the issue. *** Bug 1978109 has been marked as a duplicate of this bug. *** *** Bug 1978218 has been marked as a duplicate of this bug. *** Hi, I have two computers with F34 fully upgraded on : kernel-5.12.13-300.fc34.x86_64 and dracut-055-2.fc34.x86_64. On one of them no problem : sudo systemd-analyze blame | grep dracut-initqueue\.service 215ms dracut-initqueue.service On the other one, the bug is there : sudo systemd-analyze blame | grep dracut-initqueue\.service 1min 30.421s dracut-initqueue.service I don't know why there is such a difference. Regards, Denez Who has the problem, who also has any crypto_LUKS devices attached at boot time? Could someone who is having this problem boot with rd.debug and attach a journal? journalctl -b -o short-monotonic --no-hostname > journal.log I tried to reproduce the problem to provide the information Chris Murphy requested in Comment 14 - and the problem is gone. Details: I had the problem on two systems, one with and one without encrypted partitions mounted at boot time. To work around I downgraded dracut on both system; that took me from 55-2 to 53-4. When I updated both back to 55-2 to reproduce the problem, the problem did not recur on either system. I explored the history a bit. The dracut installed at the time of my f33->f34 upgrade was 53-5. A subsequent upgrade took me to 54-12. Following that I was at 55-2. Why going back to 53-4 and forward to 55-2 suppressed the problem, I can't say. Note that downgrading/upgrading dracut has a good chance of not fixing or recreating the problem on its own. You'll need to rebuild the initramfs *with the version of dracut* you're testing, because dracut issues (and fixes) get baked into the initramfs. To trigger the problem the initrd need to be built with both a dracut-055-2, and with the latest dbus-broker update. The cause to the problem is a change in dbus-broker, causing it not to start at the time dracut expects it to start. The workaround in Comment #4 reverts the core cause to the problem, modifying the start condition for dbus-broker. But there is apparently some discussion on what is the proper fix as the change in dbus-broker is there for a reason. Thanks for the summary Henrik. Obviously I glossed over comment 4. oops! Created attachment 1797740 [details]
journal.log wiht rd.debug
On this box - /home is the encrypted partition.
Created attachment 1797742 [details] output from journalctl as requested by comment 14 *** Bug 1976873 has been marked as a duplicate of this bug. *** *** Bug 1979013 has been marked as a duplicate of this bug. *** (In reply to Henrik Nordström from comment #17) > To trigger the problem the initrd need to be built with both a dracut-055-2, > and with the latest dbus-broker update. Actually dacut-054 or later. > The cause to the problem is a change in dbus-broker, causing it not to start > at the time dracut expects it to start. > > The workaround in Comment #4 reverts the core cause to the problem, > modifying the start condition for dbus-broker. But there is apparently some > discussion on what is the proper fix as the change in dbus-broker is there > for a reason. Forgot to add that the system need to be configured forhibernate/resume operation to trigger this. How it triggers this is still unclear to me. But the latter is just a trigger, not the cause, I think. Don't quite understand yet how the resume= command line argument triggers it, or why it also makes a difference for the boot screen setup. See Bug #1976621 for more tests regarding dracut versions and resume= command line option. (In reply to Henrik Nordström from comment #23) > (In reply to Henrik Nordström from comment #17) > But the latter is just a trigger, not the cause, I think. > > Don't quite understand yet how the resume= command line argument triggers > it, or why it also makes a difference for the boot screen setup. That's because systemd-hibernate-resume (in my case) is generated from the kernel command line, and it is ordered Before=local-fs-pre.target, which in turn is in an ordering chain of local-fs-pre.target → local-fs.target → sysinit.target → dbus-broker.service. If you don't have a resume=, local-fs-pre won't do much after the point where the root dev appears and local-fs.target will be reached as soon as the root device is found and mounted (which is when the DBus call gets triggered) and dbus-broker will thus be online in time to catch that call. (In reply to Tomáš Trnka from comment #24) > If you don't have a resume=, local-fs-pre won't do much after the point > where the root dev appears > and local-fs.target will be reached as soon as the root device is found and > mounted (which is when > the DBus call gets triggered) and dbus-broker will thus be online in time to > catch that call. Still something missing in the explanation I think. Are you saying it's currently timing related where the dbus notification just happens to work without resume? The systemd-hibernate-resume@.service slice places a dependency on the swap device before local-fs-pre.target, delaying local-fs-pre.target a bit. And what exactly is it that generates the dbus event, and why? Logs indicate it's triggered via dracut-initqueue, not systemd. From what I can gather from the log messages it's from "lvm lvscan" in usr/sbin/lvm_scan (/usr/lib/dracut/modules.d/90lvm/lvm_scan.sh), which is apparently sourced by dracut-initqueue somehow. Apparently etc/udev/rules.d/64-lvm.rules (/usr/lib/dracut/modules.d/90lvm/64-lvm.rules) And is there any drawbacks from the dbus-broker startup workaround in comment #4? The dbus-broker commit it reverts says "This is especially important to make sure the launcher can resolve user names and read the disk.". And how do this guarantee that the boot order is correct for this dbus notification to work? Or is it still just timing dependent? (In reply to Henrik Nordström from comment #25) > (In reply to Tomáš Trnka from comment #24) > Still something missing in the explanation I think. Are you saying it's > currently timing related where the dbus notification just happens to work > without resume? The point is that without resume=, starting dbus-broker is independent from the initqueue processing (which is on the DBus call), so it's pretty much guaranteed that dbus-broker will start in time to catch the call (it has a 90-second timeout) and unblock initqueue. With resume=, dbus-broker can't get started until the DBus call finishes or times out. > The systemd-hibernate-resume@.service slice places a dependency on the swap > device before local-fs-pre.target, delaying local-fs-pre.target a bit. Yes, but that "a bit" is the crucial bit here. The sequence of operations is as follows: 1) initqueue runs /sbin/lvm_scan 2) /sbin/lvm_scan runs "lvchange -ay" or "vgchange -ay" to activate selected LVs/VGs depending on rd.lvm.* arguments on the kernel cmdline 3) lvchange/vgchange activates root LV 4) lvchange/vgchange fires a notification of the activated LV via DBus 5) lvchange/vgchange proceeds to activate other LVs, such as the swap LV As soon as 3) happens, systemd notices the root device (reaching local-fs-pre.target) and proceeds to mount the root FS, reaching local-fs.target and subsequently sysinit.target, which unblocks dbus-broker.service. So if there's no resume=, local-fs-pre.target only depends on the root device and nothing else, so all the other targets will be reached in parallel with initqueue sitting at the DBus call, dbus-broker will eventually come up and unblock the whole thing. If there's a resume= from a LV that would get activated in step 5, the whole thing deadlocks because step 5 can't be reached before DBus is ready and DBus can't start before step 5 is done (because sysinit.target then depends on the swap device through local-fs-pre.target). > And what exactly is it that generates the dbus event, and why? Logs indicate > it's triggered via dracut-initqueue, not systemd. From what I can gather > from the log messages it's from "lvm lvscan" in usr/sbin/lvm_scan > (/usr/lib/dracut/modules.d/90lvm/lvm_scan.sh), which is apparently sourced > by dracut-initqueue somehow. Apparently etc/udev/rules.d/64-lvm.rules > (/usr/lib/dracut/modules.d/90lvm/64-lvm.rules) Exactly, it's a notification of a LV activation, fired by lvm if the "notify_dbus" setting is enabled in lvm.conf. And lvm_scan is enqueued into the initqueue by etc/udev/rules.d/64-lvm.rules. > And is there any drawbacks from the dbus-broker startup workaround in > comment #4? The dbus-broker commit it reverts says "This is especially > important to make sure the launcher can resolve user names and read the > disk.". Well, that's just dbus-broker assuming it can't get much work done without a root file system, which is not a valid assumption inside the initramfs. Any name resolution mechanism requiring SSSD and/or the network is bound to fail in the initramfs anyway. > And how do this guarantee that the boot order is correct for this > dbus notification to work? Or is it still just timing dependent? Assuming that your dbus-broker isn't configured to run under an user that requires network name resolution, everything should work. All the files necessary for dbus-broker to start should be bundled by dracut in the initramfs image, so not having any filesystems should not be an issue. Eliminating the dependency on sysinit.target just breaks the dependency cycle. I can confirm, got the same issue going from F33->F34 on T490s and https://bugzilla.redhat.com/show_bug.cgi?id=1976653#c4 solved it here! Thanks. The issue arose for me after a F33->F34 update but the workaround in comment 4 did not work for me. I did run `sudo dracut -f` afterwards. I will follow this bug for developments. (In reply to Demetris M from comment #28) > The issue arose for me after a F33->F34 update but the workaround in comment > 4 did not work for me. I did run `sudo dracut -f` afterwards. I will follow > this bug for developments. Did you follow both steps? Edit /usr/lib/systemd/system/dbus-broker.service and THEN run dracut -f? (In reply to Henrik Nordström from comment #29) > Did you follow both steps? Edit /usr/lib/systemd/system/dbus-broker.service > and THEN run dracut -f? Yes, with the caveat that I did copy the service file to /etc/systemd/system and made the changes there, like the comment suggested. I confirmed that the correct path was active with systemctl status. I will try to edit /usr/lib/systemd/system/dbus-broker.service directly and report any difference in the outcome. (In reply to Henrik Nordström from comment #29) > (In reply to Demetris M from comment #28) > > The issue arose for me after a F33->F34 update but the workaround in comment > > 4 did not work for me. I did run `sudo dracut -f` afterwards. I will follow > > this bug for developments. > > Did you follow both steps? Edit /usr/lib/systemd/system/dbus-broker.service > and THEN run dracut -f? Update: There is no difference in the outcome, the workaround didn't work on my system. To be clear, I made sure the service file was reloaded after editing it. (In reply to Demetris M from comment #31) > Update: There is no difference in the outcome, the workaround didn't work on > my system. > > To be clear, I made sure the service file was reloaded after editing it. Which kernel were you running when running "dracut -f"? If nothing is specified then it only rebuilds the initramfs for the current running kernel, not other versions. If not running the kernel where the boot is delayed then you must also specify the kernel version to dracut. a) Let the computer sit until it boots. Then run "dracut -f". or b) Boot another kernel version with an initrd which do not delay the boot. Then run "dracut -f 5.12.13-300.fc34.x86_64" (or whatever kernel version is it you want to rebuild the initramfs for, "rpm -q kernel" for installed versions) (In reply to Henrik Nordström from comment #32) > Which kernel were you running when running "dracut -f"? If nothing is > specified then it only rebuilds the initramfs for the current running > kernel, not other versions. > > If not running the kernel where the boot is delayed then you must also > specify the kernel version to dracut. > > a) Let the computer sit until it boots. Then run "dracut -f". > > or > > b) Boot another kernel version with an initrd which do not delay the boot. > Then run "dracut -f 5.12.13-300.fc34.x86_64" (or whatever kernel version is > it you want to rebuild the initramfs for, "rpm -q kernel" for installed > versions) Hi, thank you for the instructions. I can confirm that the workaround has now worked after a fresh attempt, although I am not sure exactly what step I missed before (I was booting the problematic -latest- kernel version, and running dracut -f without a kernel specified, and then rebooting the same latest version). *** Bug 1981292 has been marked as a duplicate of this bug. *** Suggesting to update the title of this BZ, otherwise there will be more and more duplicates. Search in bugzilla is horrible especially if title is not very informative. So I can see that there this was caused by the following issues/commits https://github.com/bus1/dbus-broker/issues/268 https://github.com/bus1/dbus-broker/commit/28af5ac996c94922fc49660e0efa955f57575187 They contain a little bit of discussion but seeing that Lennart Poettering has openend the ticket it seems unlikely that it will be rolled back. Is there another thread that discusses solutions? The Github commit mentions contributing a separate unit file for initrd, for example. Rolling that back seems the obvious solution given the severity of the regression it introduced vs the subtleties of the improvements it made. For several weeks now, anyone running (or installing) Fedora on a laptop can't boot unless they're glacially patient. The population of users and potential users who can find this bugzilla and implement the workaround is surely only a fraction of those who've added themselves to the CC list. If the upstream dbus people say "not our problem", then reverting it in the fedora spec file seems straightforward enough till a long-term compromise is reached. Agree ^^ As a counter data point, I did add myself to Cc of this bug out of curiosity, but I'm not seeing this 1.5 minute boot delay on any of my machines (and they had a couple of kernel updates since the fix was released). The fix to the original bug does work for me and makes one of my VMs boot properly instead of booting without network connectivity. So, I'd very much like to see the fix not being reverted. (In reply to Dominik 'Rathann' Mierzejewski from comment #39) > As a counter data point, I did add myself to Cc of this bug out of curiosity, > but I'm not seeing this 1.5 minute boot delay on any of my machines Same. I never experienced this bug. Currently I have: dracut-055-3.fc34.x86_64 dbus-broker-29-1.fc34.x86_64 kernel-5.13.1-300.fc34.x86_64 It appears that having suspend/resume enabled in one's booting kernel is required to trigger the problem. So, if you're running on a laptop: it's a problem; if not, no worries. Comments 24 and following have a discussion of how/why this might matter. If people without the problem also happen to have resume set, then maybe there's some other way out of the problem? > So, I'd very much like to see the fix not being reverted.
What purpose does saying you are not experiencing the issue without providing additional info provide? You're just adding noise to the issue. I and many others do experience this issue and the dbus broker workaround works. I subscribed to the bug to see when a fix will be provided. If the problem doesn't exist for you great, not sure why preventing a fix for other people is in any way productive.
For what it's worth, as a laptop user I also found this bug intrusive at the highest degree. Achieving reliable suspend and hibernation on Linux eas a great achievement and giving that up for some nearly invisible benefit is incomprehensible. I don't think the issue is unique to laptops. I'm on a desktop and don't use suspend/hibernate. I experience the issue in the OP whenever I reboot my computer. If resume=UUID=<swapUUID> boot parameter is needed to trigger the bug, that might explain the variable results. Fedora 33 and newer do not create a swap partition by default and thus this boot parameter doesn't exist. It would only be present on systems upgraded from Fedora 32 and older; or custom installations with a swap partition added manually. I tried that on my Lenovo laptop. Normally it has a resume=/dev/mapper/… option set. When I entered the boot menu and edited out that option, it no longer hangs for a minute and a half during the boot. (In reply to remyabel from comment #42) > > So, I'd very much like to see the fix not being reverted. > > What purpose does saying you are not experiencing the issue without > providing additional info provide? You're just adding noise to the issue. I > and many others do experience this issue and the dbus broker workaround > works. I subscribed to the bug to see when a fix will be provided. If the > problem doesn't exist for you great, not sure why preventing a fix for other > people is in any way productive. Well, because reverting the dbus-broker dependencies fix as proposed in Comment 37 would break my VM? Without the fix, I need to manually log in to the console and restart systemd-networkd to get network connectivity. This bug slows down boot on my laptop since some time. I found this bug report via a post on Ask Fedora (https://ask.fedoraproject.org/t/slow-boot-initrd-dracut-initqueue-service/10278/4). Some details below: Filesystem: /, /home and swap within LVM with LUKS. Swap partition with 26 GB, system allows hibernation to disk. Laptop: Dell XPS 13 with 16 GB RAM OS: Fedora 34, upgrading since Fedora ~28 $ systemd-analyze blame 1min 45.652s dracut-initqueue.service $ cat /proc/cmdline BOOT_IMAGE=(hd0,gpt4)/vmlinuz-5.12.14-300.fc34.x86_64 root=/dev/mapper/fedora_localhost--live-root ro resume=/dev/mapper/fedora_localhost--live-swap rd.lvm.lv=fedora_localhost-live/root rd.luks.uuid=luks-0eb6e5b3-704e-4df7-a28b-c37ae7264274 rd.lvm.lv=fedora_localhost-live/swap rhgb quiet $ rpm -q dracut dracut-055-3.fc34.x86_64 $ uname -r 5.12.14-300.fc34.x86_64 $ rpm -q dbus-broker dbus-broker-29-1.fc34.x86_64 (In reply to wurstsemmel from comment #48) > This bug slows down boot on my laptop since some time. I found this bug > report via a post on Ask Fedora > > $ cat /proc/cmdline > resume=/dev/mapper/fedora_localhost--live-swap Any of the workarounds mentioned related to this bug should help on your system Workaround from comment #4: Revert the dbus-broker change, allowing it to start early as it did before the last update: 0. Let the system sit until it boots. 1. Edit /usr/lib/systemd/system/dbus-broker.service and remove "sysinit.target" from the "After=" line. 2. Run "dracut -f" to rebuild the initramfs. Note: if you boot another kernel version then you need to specify the kernel version for which you want to rebuild the initrd to dracut. By default it only rebuilds the initrd for the current booted kernel. Or disable hibernate/resume from disk support. Most people do not use this and having the option enabled triggers the bug even if you do not use it. Does not affect normal suspend to ram. grubby --remove-args=resume --update-kernel=ALL Or disable dbus notification in lvm. 0. Let the system sit until it boots. 1. Edit /etc/lvm/lvm.conf and change notify_dbus to 0 (the normal setting is 1). 2. Run "dracut -f" to rebuild the initramfs. Note: if you boot another kernel version then you need to specify the kernel version for which you want to rebuild the initrd to dracut. By default it only rebuilds the initrd for the current booted kernel. then reboot to verify (In reply to Dominik 'Rathann' Mierzejewski from comment #47) > Well, because reverting the dbus-broker dependencies fix as proposed in > Comment 37 would break my VM? Without the fix, I need to manually log in to > the console and restart systemd-networkd to get network connectivity. Is there anything special with the network setup in your VM? Why does changing the startup priority of dbus-broker break your VM? (In reply to Henrik Nordström from comment #50) > (In reply to Dominik 'Rathann' Mierzejewski from comment #47) > > > Well, because reverting the dbus-broker dependencies fix as proposed in > > Comment 37 would break my VM? Without the fix, I need to manually log in to > > the console and restart systemd-networkd to get network connectivity. > > Is there anything special with the network setup in your VM? Why does > changing the startup priority of dbus-broker break your VM? I'm using systemd-networkd instead of NetworkManager to minimize the OS footprint. It has 2 static IPs, so there's not need for all the NM features, anyway. systemd-networkd fails to start if dbus-broker is unavailable. See bug 1948042 , https://github.com/systemd/systemd/issues/19212 and https://github.com/bus1/dbus-broker/issues/268 for context. |