Bug 2121791 - kernel regression causing mdraid systems to hang during reboot
Summary: kernel regression causing mdraid systems to hang during reboot
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: rawhide
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard: AcceptedBlocker
Depends On:
Blocks: F37BetaBlocker
Reported: 2022-08-26 15:59 UTC by Dusty Mabe
Modified: 2023-02-24 21:11 UTC
CC List: 30 users

Fixed In Version: kernel-5.19.6-300.fc37
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-09-02 22:27:35 UTC
Type: Bug
Embargoed:


Attachments
console.txt (deleted), 2022-08-26 16:02 UTC, Dusty Mabe
journal.txt (deleted), 2022-08-26 16:03 UTC, Dusty Mabe

Description Dusty Mabe 2022-08-26 15:59:21 UTC
1. Please describe the problem:

Also reported here: https://github.com/coreos/fedora-coreos-tracker/issues/1282

In Fedora CoreOS we have tests that set up RAID1 on the /boot/ and /root/ partitions and then remove one of the disks to simulate a failure. Recently this test started timing out occasionally. Looking a bit closer, it appears instances are getting stuck during reboot:

```
Kernel 6.0.0-0.rc2.19.fc38.x86_64 on an x86_64 (ttyS0)
SSH host key: SHA256:cLWgLYKmVuFgnQsGTQu1SqTYzm85d3mdtzFN9hISmzk (ECDSA)
SSH host key: SHA256:xhQ8cmpyrUg9NqHNHss1aCaISL7NjbikRyTn/BPXbfM (ED25519)
SSH host key: SHA256:5WrL3R0ThWe+dbXyhYNG9bNOWNjo0jQha9ADU/AnJJI (RSA)
ens5: 10.0.2.15 fec0::118e:7c2c:92c5:a1ce
Ignition: ran on 2022/08/23 12:42:42 UTC (at least 2 boots ago)
Ignition: user-provided config was applied
Ignition: wrote ssh authorized keys file for user: core
qemu0 login: [   17.165546] md/raid1:md126: Disk failure on vdb3, disabling device.
[   17.165546] md/raid1:md126: Operation continuing on 1 devices.
[   17.168987] md: super_written gets error=-5
[   17.169323] md/raid1:md127: Disk failure on vdb4, disabling device.
[   17.169323] md/raid1:md127: Operation continuing on 1 devices.
[  OK  ] Stopped session-3.scope - Session 3 of User core.
[  OK  ] Stopped session-12.scope - Session 12 of User core.
[  OK  ] Removed slice system-after…- Slice /system/afterburn-sshkeys.
[  OK  ] Removed slice system-modpr…lice - Slice /system/modprobe.
[  OK  ] Removed slice system-sshd\…e - Slice /system/sshd-keygen.
[  OK  ] Stopped target multi-user.target - Multi-User System.
[  OK  ] Stopped target afterburn-s…hkeys@.service template instances.
[  OK  ] Stopped target getty.target - Login Prompts.
[  OK  ] Stopped target machines.target - Containers.
[  OK  ] Stopped target nss-lookup.… - Host and Network Name Lookups.
[  OK  ] Stopped target remote-cryp…et - Remote Encrypted Volumes.
[  OK  ] Stopped target sound.target - Sound Card.
[  OK  ] Stopped target timers.target - Timer Units.
[  OK  ] Stopped fstrim.timer - Discard unused blocks once a week.
[  OK  ] Stopped logrotate.timer - Daily rotation of log files.
[  OK  ] Stopped raid-check.timer… - Weekly RAID setup health check.
[  OK  ] Stopped rpm-ostree-countme… Weekly rpm-ostree Count Me timer.
[  OK  ] Stopped systemd-tmpfiles-c… Cleanup of Temporary Directories.
[  OK  ] Closed lvm2-lvmpolld.socket - LVM2 poll daemon socket.
[  OK  ] Closed systemd-coredump.so…et - Process Core Dump Socket.
[  OK  ] Closed systemd-rfkill.sock…l Switch Status /dev/rfkill Watch.
         Unmounting etc.mount - /etc...
         Unmounting usr.mount - /usr...
         Stopping NetworkManager-di…nager Script Dispatcher Service...
         Stopping chronyd.service - NTP client/server...
[  OK  ] Stopped console-login-help…via console-login-helper-messages.
         Stopping dracut-shutdown.s…tore /run/initramfs on shutdown...
         Stopping getty - Getty on tty1...
         Stopping polkit.service - Authorization Manager...
         Stopping rpm-ostreed.servi…ostree System Management Daemon...
         Stopping serial-getty@ttyS…ice - Serial Getty on ttyS0...
         Stopping sshd.service - OpenSSH server daemon...
         Stopping systemd-hostnamed.service - Hostname Service...
         Stopping systemd-logind.se…ice - User Login Management...
         Stopping systemd-random-se…ice - Load/Save Random Seed...
         Stopping user - User Manager for UID 1000...
         Stopping zincati.service - Zincati Update Agent...
[  OK  ] Stopped systemd-logind.service - User Login Management.
[  OK  ] Stopped sshd.service - OpenSSH server daemon.
[  OK  ] Stopped zincati.service - Zincati Update Agent.
[  OK  ] Stopped systemd-hostnamed.service - Hostname Service.
[  OK  ] Stopped chronyd.service - NTP client/server.
[  OK  ] Stopped rpm-ostreed.servic…m-ostree System Management Daemon.
[  OK  ] Stopped polkit.service - Authorization Manager.
[  OK  ] Stopped NetworkManager-dis…Manager Script Dispatcher Service.
[  OK  ] Stopped getty - Getty on tty1.
[  OK  ] Stopped serial-getty@ttyS0…rvice - Serial Getty on ttyS0.
[  OK  ] Stopped user - User Manager for UID 1000.
[FAILED] Failed unmounting etc.mount - /etc.
[FAILED] Failed unmounting usr.mount - /usr.
[  OK  ] Stopped systemd-random-see…rvice - Load/Save Random Seed.
[  OK  ] Stopped dracut-shutdown.se…estore /run/initramfs on shutdown.
[  OK  ] Removed slice system-getty.slice - Slice /system/getty.
[  OK  ] Removed slice system-seria… - Slice /system/serial-getty.
[  OK  ] Stopped target boot-comple…arget - Boot Completion Check.
         Starting dracut-shutdown-o…down failure to perform cleanup...
         Stopping user-runtime-dir@…untime Directory /run/user/1000...
[  OK  ] Finished dracut-shutdown-o…utdown failure to perform cleanup.
[  OK  ] Unmounted run-user-1000.mount - /run/user/1000.
[  OK  ] Stopped user-runtime-dir@1… Runtime Directory /run/user/1000.
[  OK  ] Removed slice user-1000.slice - User Slice of UID 1000.
         Stopping systemd-user-sess…vice - Permit User Sessions...
[  OK  ] Stopped systemd-user-sessi…ervice - Permit User Sessions.
[  OK  ] Stopped target network.target - Network.
[  OK  ] Stopped target nss-user-lo… - User and Group Name Lookups.
[  OK  ] Stopped target remote-fs.target - Remote File Systems.
[  OK  ] Stopped target remote-fs-p…eparation for Remote File Systems.
         Stopping NetworkManager.service - Network Manager...
[  OK  ] Stopped console-login-help…via console-login-helper-messages.
[  OK  ] Stopped target sshd-keygen.target.
[  OK  ] Stopped coreos-ignition-wr…reate Ignition Status Issue Files.
         Stopping systemd-homed-act…vice - Home Area Activation...
[  OK  ] Stopped NetworkManager.service - Network Manager.
[  OK  ] Stopped systemd-homed-acti…ervice - Home Area Activation.
[  OK  ] Stopped target network-pre…get - Preparation for Network.
         Stopping systemd-homed.service - Home Area Manager...
[  OK  ] Stopped systemd-homed.service - Home Area Manager.
[  OK  ] Stopped target basic.target - Basic System.
[  OK  ] Stopped target paths.target - Path Units.
[  OK  ] Stopped ostree-finalize-st… OSTree Monitor Staged Deployment.
[  OK  ] Stopped target slices.target - Slice Units.
[  OK  ] Removed slice user.slice - User and Session Slice.
[  OK  ] Stopped target sockets.target - Socket Units.
[  OK  ] Closed bootupd.socket.
[  OK  ] Closed docker.socket - Docker Socket for the API.
[  OK  ] Closed iscsid.socket - Open-iSCSI iscsid Socket.
[  OK  ] Closed iscsiuio.socket - Open-iSCSI iscsiuio Socket.
         Stopping dbus-broker.servi… - D-Bus System Message Bus...
[  OK  ] Stopped dbus-broker.service - D-Bus System Message Bus.
[  OK  ] Closed dbus.socket - D-Bus System Message Bus Socket.
[  OK  ] Stopped target sysinit.target - System Initialization.
[  OK  ] Unset automount proc-sys-f…rmats File System Automount Point.
[  OK  ] Stopped target cryptsetup.…get - Local Encrypted Volumes.
[  OK  ] Stopped systemd-ask-passwo…quests to Console Directory Watch.
[  OK  ] Stopped systemd-ask-passwo… Requests to Wall Directory Watch.
[  OK  ] Stopped target integrityse…Local Integrity Protected Volumes.
[  OK  ] Stopped target veritysetup… - Local Verity Protected Volumes.
[  OK  ] Stopped systemd-boot-updat… - Automatic Boot Loader Update.
         Stopping systemd-resolved.…e - Network Name Resolution...
[  OK  ] Stopped systemd-sysctl.service - Apply Kernel Variables.
[  OK  ] Stopped coreos-printk-quie…eOS: Set printk To Level 4 (warn).
[  OK  ] Stopped systemd-modules-lo…service - Load Kernel Modules.
         Stopping systemd-update-ut…rd System Boot/Shutdown in UTMP...
[  OK  ] Stopped systemd-resolved.s…ice - Network Name Resolution.
[  OK  ] Stopped systemd-update-utm…cord System Boot/Shutdown in UTMP.
[  OK  ] Stopped systemd-tmpfiles-s…te Volatile Files and Directories.
[  OK  ] Stopped target local-fs.target - Local File Systems.
         Unmounting boot.mount - CoreOS Dynamic Mount for /boot...
         Unmounting tmp.mount - Temporary Directory /tmp...
         Stopping systemd-journal-f…h Journal to Persistent Storage...
[  OK  ] Stopped systemd-journal-fl…ush Journal to Persistent Storage.
[  OK  ] Stopped ostree-remount.ser… - OSTree Remount OS/ Bind Mounts.
         Unmounting var.mount - /var...
[  OK  ] Unmounted boot.mount - CoreOS Dynamic Mount for /boot.
[  OK  ] Unmounted tmp.mount - Temporary Directory /tmp.
[  OK  ] Stopped target swap.target - Swaps.
[  OK  ] Stopped systemd-fsck@dev-d…32080-d1b0-4d76-8ea2-c0fdd5f2737a.
[  OK  ] Removed slice system-syste… - Slice /system/systemd-fsck.
[  OK  ] Unmounted var.mount - /var.
         Unmounting sysroot-ostree-…ostree/deploy/fedora-coreos/var...
[  OK  ] Unmounted sysroot-ostree-d…t/ostree/deploy/fedora-coreos/var.
         Unmounting sysroot.mount - /sysroot...
[  OK  ] Unmounted sysroot.mount - /sysroot.
[  OK  ] Stopped target blockdev@de… Preparation for /dev/mapper/root.
[  OK  ] Stopped target local-fs-pr…reparation for Local File Systems.
         Stopping lvm2-monitor.serv…ng dmeventd or progress polling...
         Stopping systemd-cryptsetu… - Cryptography Setup for root...
[  OK  ] Stopped systemd-remount-fs…ount Root and Kernel File Systems.
[  OK  ] Stopped systemd-tmpfiles-s…reate Static Device Nodes in /dev.
[  OK  ] Stopped systemd-cryptsetup… - Cryptography Setup for root.
[  OK  ] Stopped target cryptsetup-… - Local Encrypted Volumes (Pre).
[  OK  ] Reached target umount.target - Unmount All Filesystems.
[  OK  ] Stopped lvm2-monitor.servi…sing dmeventd or progress polling.
[  OK  ] Reached target shutdown.target - System Shutdown.
[  OK  ] Reached target final.target - Late Shutdown Services.
[  OK  ] Finished systemd-reboot.service - System Reboot.
[  OK  ] Reached target reboot.target - System Reboot.
[   17.978854] block device autoloading is deprecated and will be removed.
[   17.982555] block device autoloading is deprecated and will be removed.
[   17.985537] block device autoloading is deprecated and will be removed.
[   17.987546] block device autoloading is deprecated and will be removed.
[   17.989540] block device autoloading is deprecated and will be removed.
[   17.991547] block device autoloading is deprecated and will be removed.
[   17.993555] block device autoloading is deprecated and will be removed.
[   17.995539] block device autoloading is deprecated and will be removed.
[   17.997577] block device autoloading is deprecated and will be removed.
[   17.999544] block device autoloading is deprecated and will be removed.
[   22.979465] blkdev_get_no_open: 1666 callbacks suppressed
[   22.979467] block device autoloading is deprecated and will be removed.
[   22.984459] block device autoloading is deprecated and will be removed.
[   22.986473] block device autoloading is deprecated and will be removed.
[   22.988470] block device autoloading is deprecated and will be removed.
[   22.990469] block device autoloading is deprecated and will be removed.
[   22.992471] block device autoloading is deprecated and will be removed.
[   22.994471] block device autoloading is deprecated and will be removed.
[   22.996469] block device autoloading is deprecated and will be removed.
[   22.998468] block device autoloading is deprecated and will be removed.
[   23.000470] block device autoloading is deprecated and will be removed.
...
...
...
[  618.221270] blkdev_get_no_open: 1664 callbacks suppressed
[  618.221273] block device autoloading is deprecated and will be removed.
[  618.224274] block device autoloading is deprecated and will be removed.
[  618.227267] block device autoloading is deprecated and will be removed.
[  618.229274] block device autoloading is deprecated and will be removed.
[  618.231277] block device autoloading is deprecated and will be removed.
[  618.233277] block device autoloading is deprecated and will be removed.
[  618.235282] block device autoloading is deprecated and will be removed.
[  618.237370] block device autoloading is deprecated and will be removed.
[  618.239356] block device autoloading is deprecated and will be removed.
[  618.241290] block device autoloading is deprecated and will be removed.
```
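
For anyone triaging a similar hang from a saved serial console log, the message flood in the log above can be summarized with a small shell helper. This is only a sketch: the function name `autoload_flood_summary` and the idea of saving the console output to a file first are assumptions, not part of the original report. It counts the "autoloading is deprecated" lines that were actually printed and adds up the "callbacks suppressed" counters reported by the kernel's rate limiting.

```shell
#!/usr/bin/env bash
# Hypothetical helper: summarize the autoloading message flood in a saved console log.
# Usage: autoload_flood_summary <logfile>
autoload_flood_summary() {
    local log="$1"
    local printed suppressed

    # Lines the kernel actually printed to the console.
    printed=$(grep -c 'block device autoloading is deprecated' "$log")

    # Messages dropped by rate limiting, reported as
    # "blkdev_get_no_open: N callbacks suppressed"; sum every N.
    suppressed=$(awk '/blkdev_get_no_open: [0-9]+ callbacks suppressed/ {
        for (i = 1; i <= NF; i++)
            if ($i == "blkdev_get_no_open:") sum += $(i + 1)
    } END { print sum + 0 }' "$log")

    echo "printed=$printed suppressed=$suppressed"
}
```

Called as `autoload_flood_summary console.txt`, it prints both totals on one line. A suppressed total in the thousands after `reboot.target` has been reached is consistent with the loop shown in the log, though it is not proof of the same root cause.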



2. What is the Version-Release number of the kernel:

6.0.0-0.rc2.19.fc38.x86_64

3. Did it work previously in Fedora? If so, what kernel version did the issue
   *first* appear?  Old kernels are available for download at
   https://koji.fedoraproject.org/koji/packageinfo?packageID=8 :

Yes. It appears this problem was introduced between `kernel-5.19.0-0.rc3.27.fc37` (good)
and `kernel-5.19.0-0.rc4.33.fc37` (bad). 


4. Can you reproduce this issue? If so, please provide the steps to reproduce
   the issue below:

Yes I can reproduce the issue.

The steps to set it up are complicated. The source for our mirror tests is
here: https://github.com/coreos/coreos-assembler/blob/61b68b437d489da8d899bf5788732c329e49c9a9/mantle/kola/tests/misc/boot-mirror.go

TL;DR: set up a RAID, simulate a disk failure, then try to reboot. The reboot takes much longer than usual.

5. Does this problem occur with the latest Rawhide kernel? To install the
   Rawhide kernel, run ``sudo dnf install fedora-repos-rawhide`` followed by
   ``sudo dnf update --enablerepo=rawhide kernel``:

Yes.


6. Are you running any modules that are not shipped directly with Fedora's kernel?:

No


7. Please attach the kernel logs. You can get the complete kernel log
   for a boot with ``journalctl --no-hostname -k > dmesg.txt``. If the
   issue occurred on a previous boot, use the journalctl ``-b`` flag.


Will attach.

Comment 1 Dusty Mabe 2022-08-26 16:02:50 UTC
Created attachment 1907916
console.txt

Comment 2 Dusty Mabe 2022-08-26 16:03:33 UTC
Created attachment 1907917
journal.txt

Comment 3 Dusty Mabe 2022-08-26 16:05:35 UTC
I did a git bisect between v5.19-rc3 and v5.19-rc4. I believe the first bad commit is a09b314005f3:


```
$ git bisect bad
a09b314005f3a0956ebf56e01b3b80339df577cc is the first bad commit
commit a09b314005f3a0956ebf56e01b3b80339df577cc
Author: Christoph Hellwig <hch>
Date:   Tue Jun 14 09:48:27 2022 +0200

    block: freeze the queue earlier in del_gendisk
    
    Freeze the queue earlier in del_gendisk so that the state does not
    change while we remove debugfs and sysfs files.
    
    Ming mentioned that being able to observer request in debugfs might
    be useful while the queue is being frozen in del_gendisk, which is
    made possible by this change.
    
    Signed-off-by: Christoph Hellwig <hch>
    Link: https://lore.kernel.org/r/20220614074827.458955-5-hch@lst.de
    Signed-off-by: Jens Axboe <axboe>

 block/genhd.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)
```


Reverting this commit and building on top of latest git master (4c612826b) gave me successful results.
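
For reference, the bisect and revert described in this comment can be sketched roughly as follows. This assumes a clone of the mainline kernel tree. The `run` wrapper and its `DRY_RUN` switch are hypothetical conveniences added for illustration, and the build, boot, and test step between bisect iterations is left out because it depends on the local setup.

```shell
#!/usr/bin/env bash
# Sketch of the bisect-and-revert workflow. Assumption: the current directory
# is a clone of the mainline Linux tree.

# Print the command when DRY_RUN=1 (hypothetical switch), otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Mark the known-bad and known-good tags; git then checks out a midpoint commit.
bisect_start() {
    run git bisect start
    run git bisect bad v5.19-rc4
    run git bisect good v5.19-rc3
}

# Revert the commit that the bisect identified as the first bad one.
revert_suspect() {
    run git revert --no-edit a09b314005f3a0956ebf56e01b3b80339df577cc
}
```

After `bisect_start`, each candidate kernel is built and tested, then marked with `git bisect good` or `git bisect bad` until git reports the first bad commit. `revert_suspect` is meant to be run on the tree being tested once the bisect has finished.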

Comment 4 Ivan Mironov 2022-08-28 05:06:11 UTC
This happens to me on Fedora 36 x86_64 with kernel 5.19.4. I see a lot of "block device autoloading is deprecated and will be removed" during reboot or poweroff on a machine with mdadm RAID6, and reboot/poweroff is not happening. I can confirm that rebuilding 5.19.4 with reverted "block: freeze the queue earlier in del_gendisk" fixes this.

Comment 5 Ivan Mironov 2022-08-28 12:41:16 UTC
Interestingly, Fedora 36 aarch64 with kernel 5.19.4 on an Allwinner H6 SBC with mdadm RAID1 is not affected.

Comment 6 Leszek Matok 2022-08-29 16:28:53 UTC
F36 (x86-64), mdadm raid1 on / (no separate /boot, but that doesn't seem important).

I've had this issue with 5.19.1 (from https://koji.fedoraproject.org/koji/buildinfo?buildID=2044709) and now after the official 5.19.4 F36 update, more people will see their reboots hang with these messages.

I came here to remind future visitors that Alt-SysRq-B helps :)

Comment 7 Norbert Jurkeit 2022-08-30 13:36:58 UTC
Same for me on a PC with several mdadm raid1 devices after upgrade from 5.18.19-200.fc36.x86_64 to 5.19.4-200.fc36.x86_64.

This didn't happen however during kernel test days running kerneltest-5.19.1.iso on the same hardware.

Comment 8 Fedora Blocker Bugs Application 2022-08-31 19:50:17 UTC
Proposed as a Freeze Exception for 37-beta by Fedora user dustymabe using the blocker tracking app because:

 Machines with RAID1 setups should be able to shut down/reboot without hanging.

Comment 9 pgnet.dev 2022-08-31 20:01:26 UTC
fwiw, given source of the logged message,

[v2] block: deprecate autoloading based on dev_t
 https://patchwork.kernel.org/project/linux-block/patch/20220104071647.164918-1-hch@lst.de/#24842631

[PATCH] block: deprecate autoloading based on dev_t
  https://patchwork.kernel.org/project/linux-block/patch/20220104071647.164918-1-hch@lst.de/#24842631


here, with

	uname -rm
		5.19.4-200.fc36.x86_64 x86_64

changing ,

edit /etc/mdadm.conf

	MAILADDR root
-	AUTO +imsm +1.x -all
+	#AUTO +imsm +1.x -all
-	ARRAY /dev/md/0 level=raid1 num-devices=2 UUID=11...bb
-	ARRAY /dev/md/1 level=raid1 num-devices=2 UUID=22...cc
+	ARRAY /dev/md0 level=raid1 num-devices=2 metadata=1.2 UUID=11...bb name=dev003:0
+	ARRAY /dev/md1 level=raid1 num-devices=2 metadata=1.2 UUID=22...cc name=dev003:1

seems to consistently eliminate the message on boot start

before edit,

	mdadm --detail --scan
		ARRAY /dev/md/dev003:0 metadata=1.2 name=dev003:0 UUID=11...bb
		ARRAY /dev/md/dev003:1 metadata=1.2 name=dev003:1 UUID=22...cc

	ls -al /dev/md/dev003\:* /dev/md{0,1}
		brw-rw---- 1 root disk 9, 0 Aug 31 12:42 /dev/md0
		brw-rw---- 1 root disk 9, 1 Aug 31 12:42 /dev/md1
		lrwxrwxrwx 1 root root    6 Aug 31 12:42 /dev/md/dev003:0 -> ../md0
		lrwxrwxrwx 1 root root    6 Aug 31 12:42 /dev/md/dev003:1 -> ../md1

	dmesg | grep deprecated
		[    7.026798] block device autoloading is deprecated and will be removed.

after edit,

	mdadm --detail --scan
		ARRAY /dev/md0 metadata=1.2 name=dev003:0 UUID=11...bb
		ARRAY /dev/md1 metadata=1.2 name=dev003:1 UUID=22...cc

	ls -al /dev/md/dev003\:* /dev/md{0,1}
		ls: cannot access '/dev/md/dev003:*': No such file or directory
		brw-rw---- 1 root disk 9, 0 Aug 31 12:42  /dev/md0
		brw-rw---- 1 root disk 9, 1 Aug 31 12:42  /dev/md1

	dmesg | grep deprecated
		(empty)

and, on this one test machine, eliminates boot hang/loop on restart; have NOT tested more broadly yet

Comment 10 pgnet.dev 2022-08-31 21:02:42 UTC
> and, on this one test machine, eliminates boot hang/loop on restart; have NOT tested more broadly yet

tested these changes on 4 machines.

2 stopped looping on boot, 2 continue to do so.

there's more to this ...

Comment 11 Fedora Blocker Bugs Application 2022-09-01 18:15:12 UTC
Proposed as a Blocker for 37-beta by Fedora user bcotton using the blocker tracking app because:

 Adding an F37 Beta blocker nomination to the existing FE nomination. This seems like a violation of the basic release criterion:

    It must be possible to trigger a clean system shutdown using standard console commands.

https://fedoraproject.org/wiki/Basic_Release_Criteria#Shutdown

Comment 12 Adam Williamson 2022-09-01 18:28:12 UTC
The commit in question was reverted in 5.19.6-300.fc37, it looks like:

https://koji.fedoraproject.org/koji/buildinfo?buildID=2055666

"- Revert "block: freeze the queue earlier in del_gendisk" (Justin M. Forbes)"

can you try that kernel and see if it helps?

Comment 13 Justin M. Forbes 2022-09-01 19:30:22 UTC
Yes, it was reverted, and discussed.  I do not want this bug closed because upstream has made no movement on this yet.  It is not reverted from Rawhide and won't be reverted in the 6.0 branch unless upstream chooses to do so. Leaving this open will help me track it without forgetting.

Comment 14 Adam Williamson 2022-09-01 19:37:33 UTC
We don't have to close the bug, but for F37 Beta purposes, I need to know if that update addresses it, so we can pull it into Beta.

Comment 15 pgnet.dev 2022-09-01 20:20:49 UTC
fwiw, on 2 F36 boxes, with

	uname -rm
		5.19.4-200.fc36.x86_64 x86_64

hanging in loop @ shutdown, upgrading to

	uname -rm
		5.19.6-200.fc36.x86_64 x86_64

, remaking init, after reboot, subsequent reboots are OK.  no more loop. 

no testing beyond that.

Comment 16 Fedora Update System 2022-09-01 21:31:43 UTC
FEDORA-2022-ccb0138bb6 has been submitted as an update to Fedora 37. https://bodhi.fedoraproject.org/updates/FEDORA-2022-ccb0138bb6

Comment 17 Dusty Mabe 2022-09-02 03:39:19 UTC
5.19.6 builds seem to be working for me.

Comment 18 Fedora Update System 2022-09-02 08:26:50 UTC
FEDORA-2022-ccb0138bb6 has been pushed to the Fedora 37 testing repository.
Soon you'll be able to install the update with the following command:
`sudo dnf upgrade --enablerepo=updates-testing --refresh --advisory=FEDORA-2022-ccb0138bb6`
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2022-ccb0138bb6

See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.

Comment 19 Norbert Jurkeit 2022-09-02 12:07:28 UTC
I haven't installed F37 yet but can confirm that 5.19.6 builds fix the issue for me with F35 and F36.
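
A quick way to check whether a given Fedora 5.19.x kernel predates the build that carries the revert might look like the sketch below. The function names are hypothetical, the comparison relies on GNU `sort -V`, and the 5.19.6 threshold is taken from the comments in this bug. Pre-release (rc) builds and other kernel series, including Rawhide 6.0, are deliberately reported as "not covered" rather than guessed at.

```shell
#!/usr/bin/env bash
# Sketch only: the threshold comes from this report (revert shipped in Fedora 5.19.6 builds).

# Return 0 when version $1 sorts strictly before version $2 (GNU `sort -V` ordering).
version_lt() {
    [ "$1" != "$2" ] &&
        [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n1)" = "$1" ]
}

# Exit status: 0 = 5.19.x build older than 5.19.6 (revert missing),
#              1 = 5.19.6 or newer 5.19.x build (revert present),
#              2 = not covered by this sketch (pre-release builds, other series).
lacks_revert() {
    local kver="${1:-$(uname -r)}"
    case "$kver" in
        5.19.0-0.rc*) return 2 ;;
        5.19.*) version_lt "$kver" "5.19.6" ;;
        *) return 2 ;;
    esac
}
```

With no argument the function checks the running kernel, for example `lacks_revert && echo "kernel predates the revert"`. A zero status only means the revert is missing from that build, not that the machine will necessarily hang.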

Comment 20 Adam Williamson 2022-09-02 16:59:26 UTC
+5 in https://pagure.io/fedora-qa/blocker-review/issue/882 , marking accepted.

Comment 21 Fedora Update System 2022-09-02 22:27:35 UTC
FEDORA-2022-ccb0138bb6 has been pushed to the Fedora 37 stable repository.
If problem still persists, please make note of it in this bug report.

Comment 22 Dusty Mabe 2022-09-24 15:54:12 UTC
The revert for the offending kernel commit landed upstream in https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=4c66a326b5ab784cddd72de07ac5b6210e9e1b06

