Bug 736386 - kernel-2.6.40-4.fc15.x86_64 fails to boot due to LVM PV on RAID not starting
Summary: kernel-2.6.40-4.fc15.x86_64 fails to boot due to LVM PV on RAID not starting
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Fedora
Classification: Fedora
Component: dracut
Version: 15
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: dracut-maint
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2011-09-07 14:59 UTC by Doug Ledford
Modified: 2012-01-23 09:26 UTC
CC List: 11 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of: 729205
Environment:
Last Closed: 2012-01-23 09:26:20 UTC
Type: ---
Embargoed:



Description Doug Ledford 2011-09-07 14:59:55 UTC
+++ This bug was initially created as a clone of Bug #729205 +++


--- Additional comment from rhbugzilla on 2011-09-03 01:02:27 EDT ---

I installed the mdadm-3.2.2-9.fc15 package and rebooted as directed.  It made no difference.  Looking further, it seems that the initramfs needs to be rebuilt, so I did that and copied it into /boot.
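
(For reference, a rebuild along those lines would look roughly like the following; the exact commands Rodney used aren't recorded in this bug, and the kernel version and image path below are taken from the bug title and the usual /boot layout.)

    # rebuild the initramfs for the non-booting kernel, as root
    dracut --force /boot/initramfs-2.6.40-4.fc15.x86_64.img 2.6.40-4.fc15.x86_64
    # or build it elsewhere first and copy it into place afterwards
    dracut --force /tmp/initramfs-2.6.40-4.fc15.x86_64.img 2.6.40-4.fc15.x86_64
    cp /tmp/initramfs-2.6.40-4.fc15.x86_64.img /boot/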

Now, booting still fails and the following messages from dmesg seem relevant...

dracut: Autoassembling MD Raid
md: md127 stopped.
md: bind <sda>
md: bind <sdb>
dracut: mdadm: Container /dev/md127 has been assembled with 2 devices
md: md126 stopped
md: bind <sda>
md: bind <sdb>
md: raid1 personality registered for level 1
bio: create slab <bio-1> at 1 [Not sure this is relevant, but it's here in the middle of the others.]
dracut: mdadm: array /dev/md126 now has 2 devices
dracut Warning: No root device "block:/dev/mapper/vg_hostname-lv_root" found
dracut Warning: LVM vg_host/lv_root not found
dracut Warning: LVM vg_host/lv_swap not found

Assuming the numbers that precede the lines are seconds, there's about a 23 second lag between the first "dracut Warning:" line and the previous line.

Rodney


--- Additional comment from dledford on 2011-09-07 10:56:21 EDT ---

OK, this bug is getting overly confusing because we are getting different problems reported under the same bug.

First, Rodney, your original bug was this:
dracut: mdadm: Container /dev/md127 has been assembled with 2 drives
dracut: mdadm (IMSM): Unsupported attributes: 40000000
dracut: mdadm IMSM metadata load not allowed due to attribute incompatibility

In response to that specific bug (about the unsupported attributes) I built a new mdadm with a patch to fix the issue.  Your system still doesn't boot now, so the question is why.  You then posted these messages:
md: raid1 personality registered for level 1
bio: create slab <bio-1> at 1 [Not sure this is relevant, but it's here in the
middle of the others.]
dracut: mdadm: array /dev/md126 now has 2 devices
dracut Warning: No root device "block:/dev/mapper/vg_hostname-lv_root" found
dracut Warning: LVM vg_host/lv_root not found
dracut Warning: LVM vg_host/lv_swap not found

The important thing to note here is that mdadm is no longer rejecting your array, and in fact it started your raid device.  Now, what's happening is that the lvm PV on top of your raid device isn't getting started.  Regardless of the fact that your system isn't up and running yet, the original bug in the bug report *has* been fixed and verified.  So, this bug is no longer appropriate for any other problem reports because the specific issue in this bug is resolved.

Of course, that doesn't get your system, or any of the other posters' systems, running, so we need to open new bugs to track the remaining issues.

I've not heard back from Charlweed on what his problem is.  Rodney, your new problem appears to be that the raid device is started, but the lvm PV on top of your raid device is not.  Michael, unless you edited lines out of the debug messages you posted, I can't see where your hard drives are being detected, and I can't see where the raid array is even attempting to start: dracut is starting md autoassembly, but it's not finding anything to assemble and so it does nothing.  So I'll clone this twice to track the two different issues.  This bug, however, is now verified and ready to be closed out when the package is pushed live.
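
(As a rough illustration of the kind of checks that can be run from the dracut emergency shell when the array assembles but the LVM PV on top of it doesn't come up; this is a sketch using the device and VG names from the messages above, not the verified fix for this bug, and it assumes the lvm tools were included in the initramfs.)

    cat /proc/mdstat            # confirm md126/md127 were actually assembled
    lvm pvscan                  # does a PV show up on /dev/md126 (or its partition)?
    lvm vgscan
    lvm vgchange -ay vg_host    # try activating the volume group by hand
    ls /dev/mapper/             # did vg_host-lv_root appear?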

Comment 1 Ivor Durham 2011-09-08 22:56:34 UTC
After my ill-fated decision to install FC15 last Sunday because of growing flakiness in my FC14 installation, I have been dead in the water (apart from repeated attempts to rebuild the system, only to have it die after some later update), first with #729205 and now with this bug.

I re-installed from the Live DVD, with a "Use all" default repartition. Rebooted after the basic installation completed. Executed the yum command to install mdadm-3.2.2-9.fc15, then "yum update kernel*" rather than the full yum update. Then I re-built initramfs as instructed. I got a slew of warnings about potential missing firmware, but I don't know if they are expected or relevant. I rebooted successfully, getting past #729205. Then I did the full "yum update" which also completed successfully. Just to be sure everything was ok before restoring my "stuff", I rebooted again and now it dies with this problem. I confirmed mdadm-3.2.2-9.fc15 was still installed after the full update and before rebooting this last time. It looks like something modified during the final "yum update" may have introduced this problem.

dmesg shows exactly the same sequence of messages as above: assembling md127, then md126, and then "No root device "block:/dev/mapper/vg_clowder-lv_root" found", where clowder is the assigned hostname of my system. I don't have a way to capture the console log unless it can be written to a USB flash drive, but if there's any other information that would help get me past this showstopper I'll get it by hook or by crook as quickly as possible. The system is a Dell Dimension E520 with integrated Intel Matrix Storage Manager, through which two identical disks were configured for RAID 1.
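
(One way to capture that information onto a USB flash drive from the dracut debug shell, sketched here under the assumption that the stick shows up as /dev/sdc1; the actual device name will differ and should be checked in the dmesg output after plugging the stick in.)

    mkdir /tmp/usb
    mount /dev/sdc1 /tmp/usb
    dmesg > /tmp/usb/dmesg.txt
    cat /proc/mdstat > /tmp/usb/mdstat.txt
    umount /tmp/usb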

Comment 2 Doug Ledford 2011-09-09 00:26:58 UTC
Ivor: can you boot up the live CD (or boot from a backup kernel entry), install the dracut package on the live CD, rebuild your initramfs for the kernel that doesn't boot using the dracut from the live CD, and then see if it boots up properly?  I'm beginning to suspect that a dracut upgrade in that final yum update might be playing a role here, and downgrading to the older dracut and rebuilding your initramfs might work around it.
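
(The downgrade route would look roughly like this on the installed system, assuming the older dracut package is still available in the repositories or the yum cache, and using the kernel version Ivor reports in the next comment; this is a sketch, not something verified in this bug.)

    yum downgrade dracut
    dracut --force /boot/initramfs-2.6.40.4-5.fc15.x86_64.img 2.6.40.4-5.fc15.x86_64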

Comment 3 Ivor Durham 2011-09-09 06:14:22 UTC
Doug, I am able to boot after re-building initramfs with the Live DVD dracut. Here are the steps I took:

1. Booted from the Live DVD and created directories /tmp/a and /tmp/b
2. Mounted /dev/md126p1 as /tmp/a to get to the /boot partition
3. Mounted /dev/dm-4 as /tmp/b to get to the / partition
4. cd /lib/modules (in the Live system)
5. (cd /tmp/b/lib/modules; tar cf - 2.6.40.4-5.fc15.x86_64) | tar xf -
   (Without copying the files over I got errors during the boot about being unable to find modules and it crashed with this bug again.)
6. cd /tmp/a
7. dracut initramfs-2.6.40.4-5.fc15.x86_64 2.6.40.4-5.fc15.x86_64 --force
   The dracut package was already available on the Live DVD system. I got the slew of warnings about not finding firmware ".bin" files here as it built the new initramfs. "ls -l initramfs..." reports:
-rw-r--r--. 1 root root 14932953 Sep  9  2011 initramfs-2.6.40.4-5.fc15.x86_64.img
8. Rebooted successfully!
   uname -a reports:
   Linux clowder 2.6.40.4-5.fc15.x86_64 #1 SMP Tue Aug 30 14:38:32 UTC 2011 x86_64 x86_64 x86_64 GNU/Linux
   "rpm -qa | fgrep dracut" reports: dracut-009-12.fc15.noarch This is the version from my full "yum update" which was installed between my previous successful boot with the updated mdadm and the crash with this bug.
   
I hope this is the correct sequence of steps you had in mind. It at least got me over a big hurdle! Thank you.

Comment 4 Doug Ledford 2011-09-09 15:19:24 UTC
Ivor: Thanks very much!  Now that you are up and running, you can copy your initramfs to a backup file name (such as initramfs-2.6.40.4-5.fc15.x86_64.img.good) and add a new entry to /etc/grub.conf that lists this backup file.  Then, on your running system, remake your original initramfs using the dracut on the system and attempt to reboot to the new initramfs (keeping the old one as a backup).  If it doesn't boot, go back to using the backup initramfs; if it does boot, then something odd is happening that causes dracut to fail to build a good initramfs only on upgrade rather than all the time.  Regardless, I'm pretty sure this is a dracut issue at this point, so I'm going to reassign the bug.  However, I *think* the dracut owner is out traveling at the moment, so I don't know how quickly this will get looked at.
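
(A sketch of those steps, assuming the image name from comment 3 and the GRUB legacy configuration that F15 uses at /boot/grub/grub.conf, with /etc/grub.conf as a symlink to it.)

    cd /boot
    cp initramfs-2.6.40.4-5.fc15.x86_64.img initramfs-2.6.40.4-5.fc15.x86_64.img.good
    # duplicate the existing stanza in /etc/grub.conf and point the copy's
    # initrd line at the .img.good file, then rebuild the original image:
    dracut --force /boot/initramfs-2.6.40.4-5.fc15.x86_64.img 2.6.40.4-5.fc15.x86_64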

Comment 5 Rodney Barnett 2011-09-11 00:18:54 UTC
I finally got back to this and started off by booting 2.6.40-4.fc15.x86_64 with the initramfs I made as described in the bug from which this one was cloned.  I was rather surprised to see the system boot successfully since I hadn't changed anything.

I rebooted a few times and noticed that when the Intel firmware reported the RAID volume status as Normal, the system would boot and when it reported the status as Verify, it would not.

I'm currently running 2.6.40-5.fc15.x86_64.  It needed no modifications to boot with a Normal volume, though it seems to have the same trouble when the RAID volume has a Verify status.

Rodney

Comment 6 Rodney Barnett 2011-09-11 00:26:08 UTC
Sorry, I should've said 2.6.40.4-5.fc15.x86_64 for the version of the kernel I'm now running.

Rodney

Comment 7 Doug Ledford 2011-09-12 15:14:44 UTC
Hmmm, ok, so when the array is in VERIFY state it doesn't boot, but when it's in a clean state, it does.  Can you boot the machine up with the array in VERIFY state, wait until dracut drops you to a debug shell, and then get me the output of mdadm -E /dev/sda (assuming your array is on /dev/sda; if not, use any one of the disks that makes up your array)?  I need to see why mdadm is failing to start your array when it's in VERIFY state.

Comment 8 Rodney Barnett 2011-09-13 02:35:52 UTC
I get approximately the following from /sbin/mdadm -E /dev/sda...

/dev/sda:
          Magic : Intel Raid ISM Cfg Sig.
        Version : 1.1.00
    Orig Family : 00000000
         Family : 47b4aff4
     Generation : 00041d10
     Attributes : All supported
           UUID : ...:...:...:...
       Checksum : 905161a2 correct
    MPB Sectors : 2
          Disks : 2
   RAID Devices : 1

  Disk00 Serial : WD-...
          State : active
             Id : 00000000
    Usable Size : ... 250.06 GB)

[Volume_0000]:
           UUID : ...:...:...:...
     RAID Level : 1 <-- 1
        Members : 2 <-- 2
          Slots : [UU] <-- [UU]
    Failed disk : none
      This Slot : 0
     Array Size : ... 250.06 GB)
   Per Dev Size : ... 250.06 GB)
  Sector Offset : 0
    Num Stripes : 1907704
     Chunk Size : 64 KiB <-- 64 KiB
  Migrate State : repair
      Map State : normal <-- normal
     Checkpoint : 0 (512)
    Dirty State : dirty

  Disk01 Serial : WD-...
          State : active
             Id : 00010000
    Usable Size : ... 250.06 GB)


Rodney

Comment 9 Fedora Admin XMLRPC Client 2011-10-20 16:20:24 UTC
This package has changed ownership in the Fedora Package Database.  Reassigning to the new owner of this component.

Comment 10 Rodney Barnett 2011-10-22 19:57:52 UTC
I haven't followed the other clones of the original bug (729205) very carefully, but I noticed that mdadm-3.2.2-10.fc15 was indicated as a fix for one of those other clones, so I tried it out with a rebuilt initramfs.  My system will now boot even when the Intel firmware reports the RAID volume status as Verify.

Thanks to all who worked on this.

Rodney

