Bug 196556
Summary: | Fedora hangs on boot when kernel-2.6.17-1.2139_FC5smp is used. | | |
---|---|---|---|
Product: | [Fedora] Fedora | Reporter: | Ondrej Dolak <ondrej.dolak> |
Component: | kernel | Assignee: | Alasdair Kergon <agk> |
Status: | CLOSED DUPLICATE | QA Contact: | Brian Brock <bbrock> |
Severity: | high | Docs Contact: | |
Priority: | medium | | |
Version: | 5 | CC: | amlau, andrewg, davej, dmitryburstein, gwendolen.lynch, james, jbonnett, jbrassow, leo_canale, mauelshagen-do-not-use, orion, pjones, vic, wtogami |
Target Milestone: | --- | | |
Target Release: | --- | | |
Hardware: | i686 | | |
OS: | Linux | | |
Whiteboard: | | | |
Fixed In Version: | | Doc Type: | Bug Fix |
Doc Text: | | Story Points: | --- |
Clone Of: | | Environment: | |
Last Closed: | 2006-08-31 04:13:06 UTC | Type: | --- |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | | Category: | --- |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | | | |
Attachments: | | | |
Description
Ondrej Dolak
2006-06-24 17:47:13 UTC
Created attachment 131483
lspci
Sorry, I forgot one important thing: it hangs after device-mapper is initialized.

Same for me. I have an Intel Corporation 82801ER (ICH5R) SATA Controller (rev 02).

I also have an Intel Corporation 82801EB (ICH5) SATA Controller (rev 02), and it hangs at `Making device-mapper control node'.

I'm having the same problem. It hangs after "device-mapper: 4.6.0 - ioctl (2006-02-07) initialized: dm-devel". I'm running the i686 uniprocessor kernel on AMD64/MSI K8N Neo2.

My board has 2 nVidia SATA controllers, but I don't use them, so they were disabled in the BIOS. I didn't know what they were, so I re-enabled them so I could run lspci:

00:09.0 IDE interface: nVidia Corporation CK8S Serial ATA Controller (v2.5) (rev a2)
00:0a.0 IDE interface: nVidia Corporation CK8S Serial ATA Controller (v2.5) (rev a2)

Also hanging here at the following point:

#] device-mapper: 4.6.0-ioctl (2006-02-17) initialised: dm-devel[AT]redhat.com
#] Loading dm-mirror.ko module
#] Loading dm-zero.ko module
#] Loading dm-snapshot.ko module

The system then hangs indefinitely, but *does* respond to <CTRL><ALT><DEL> and reboots cleanly.

Mobo is a Giga-byte GA8ANXP-D with Intel ICH6R and two Raptors (unused by Linux) containing a WinXP install on a striped fake/BIOS RAID. Linux is installed on /dev/hda on the standard IDE bus (not SATA). Works fine on all previous kernel releases.

Am rebuilding now to disable the experimental device-mapper support, since all other attempts to disable it have failed, including:

1) Commenting out the whole of the "#device mapper & related initialization" section of /etc/rc.d/rc.sysinit
2) Turning off the mdmpd service
3) Even deleting the three modules, dm-mirror.ko, dm-zero.ko, and dm-snapshot.ko

Yet still I get the above message, followed by a hang, which is very odd considering that the referenced modules have been deleted???
Just a note while I rebuild: I am seeing a large number of the following warnings:

.config:<nnnn>:warning: trying to reassign symbol <OPT>

where nnnn is a series of numbers and OPT is pretty much every config option, i.e. PCI, ISA, HOTPLUG, AGP, I2C, etc. Some also read "trying to reassign nonexistent symbol", e.g. XEN_PHYSDEV_ACCESS. An earlier rebuild succeeded, however (with the same warnings), and this is probably not related to this bug. Is this due to the experimental module versioning feature?

My box hangs at
>> device-mapper: 4.6.0-ioctl (2006-02-17) initialised: dm-devel[AT]redhat.com
Though I don't think I have an SMP kernel (unless all of them are SMP). The box
is ASUS Terminator C3, based on VIA C3 processor. There's one 160GB Seagate SATA
drive and 512MB RAM in it.
The rebuild didn't help:

#] Loading ext3.ko module
#] /proc/misc: No entry for device-mapper found.
#] Is device-mapper driver missing from kernel?
#] nash received SIGSEGV! Backtrace: <...>
#] kernel panic - not syncing: Attempted to kill init!

So it looks like we're absolutely dependent on device-mapper now, even on systems with no LVM or RAID. Anyway, this is obviously +UPSTREAM.

My box also hangs at

device-mapper: 4.6.0-ioctl (2006-02-17) initialised: dm-devel[AT]redhat.com

I tried both kernel-2.6.17-1.2139_FC5smp and kernel-2.6.17-1.2139_FC5. I have two 80 GB SATA drives in RAID 1 on an Intel RAID chipset, and 1 GB of RAM in this machine. kernel-2.6.16-1.2133_FC5smp works. My other box is non-RAID; with kernel-2.6.17-1.2139_FC5 it boots OK, but it spits out a lot of info.

I have the same problem on a Promise FastTrak 20276 onboard SATA RAID controller with two HDs attached as RAID 0. The boot process of the 2.6.17 (non-SMP) kernel hangs at the same position: device-mapper initialised. A few lines above, "sdb: unknown partition table" is written. It seems that the stripe set is not recognized and both discs are handled separately!? Kernel 2.6.16 is still working.

Is this a duplicate of bug 186842 or something different?
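The `/proc/misc` message quoted in the rebuild report above suggests the initrd looks for a `device-mapper` entry in `/proc/misc` and gives up when the driver has not registered one. A minimal sketch of the same check from a shell on a kernel that does boot, assuming the usual "minor name" layout of that file; the function name `dm_registered` and its optional path argument are illustrative, not part of any Fedora tool:

```shell
#!/bin/sh
# Report whether the device-mapper driver is registered with the kernel.
# /proc/misc lists misc character devices as "<minor> <name>" lines; a
# "device-mapper" line is expected once dm-mod is loaded or built in.
# The path argument exists only so the check can be pointed at a copy.
dm_registered() {
    misc_file="${1:-/proc/misc}"
    # Compare the name field exactly; exit 0 if found, 1 otherwise.
    awk '$2 == "device-mapper" { found = 1 } END { exit !found }' "$misc_file"
}
```

Something like `dm_registered && echo "device-mapper is registered"` would then show whether a given kernel has the driver available at all.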
No way it boots with the previous kernel

[engwnbie@smokey ~]$ su -
Password:
[root@smokey ~]# dmraid -r
/dev/sda: isw, "isw_ececagfhaj", GROUP, ok, 160086526 sectors, data@ 0
/dev/sdb: isw, "isw_ececagfhaj", GROUP, ok, 160086526 sectors, data@ 0
[root@smokey ~]# dmraid -s
*** Group superset isw_ececagfhaj
--> Active Subset
name   : isw_ececagfhaj_RAID_Volume1
size   : 160086016
stride : 128
type   : mirror
status : ok
subsets: 0
devs   : 2
spares : 0
[root@smokey ~]# dmraid -rD
/dev/sda: isw, "isw_ececagfhaj", GROUP, ok, 160086526 sectors, data@ 0
/dev/sdb: isw, "isw_ececagfhaj", GROUP, ok, 160086526 sectors, data@ 0
[root@smokey ~]#

This is not a duplicate of bug 186842: everything is working for me with kernel-smp-2.6.16-1.2133_FC5, but it hangs on kernel-smp-2.6.17-1.2139_FC5. I've tried installing the latest dmraid (1.0.0.rc11) and device-mapper (1.02.07) from the development branch, but with no positive results. Just for your information, the output of "dmraid -rD" is:

/dev/sda: isw, "isw_ecidiahfeh", GROUP, ok, 160086526 sectors, data@ 0
/dev/sdb: isw, "isw_ecidiahfeh", GROUP, ok, 160086526 sectors, data@ 0

Under kernel-2.6.16 my boot RAID 0 volume:

[root@www ~]# dmraid -s
*** Active Set
name   : pdc_fiagfhab
size   : 625163264
stride : 128
type   : stripe
status : ok
subsets: 0
devs   : 2
spares : 0

All FC5 kernel-2.6.17 builds, including 2139, won't boot on my system, which boots a RAID 0 array on a Promise PDC20376 (FastTrak):

#Loading jbd.ko module
#Loading ext3.ko module
#Loading dm-mod.ko module
#device-mapper: 4.6.0-ioctl (2006-02-17) initialised: dm-devel
#Loading dm-mirror.ko module
#Loading dm-zero.ko module
#Loading dm-snapshot.ko module
#Making device-mapper control node

Kernel 2.6.17-2139_FC then hangs!!

Still doesn't work under kernel-2.6.17-1.2145_FC5-smp-i686.
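Several reports above paste `dmraid -s` output, and the affected sets are a mix of mirrors and stripes. A small sketch for pulling just the set name and type out of that output so reports are easier to compare; the helper name `raid_sets` is made up for illustration, and it assumes the "key : value" layout shown in the comments above:

```shell
#!/bin/sh
# Read `dmraid -s` output on stdin and print "<set name> <type>" for each
# set. Assumes the "key : value" layout pasted in this thread.
raid_sets() {
    awk -F' *: *' '
        # Trim surrounding whitespace so indented output also parses.
        { sub(/^[ \t]+/, ""); sub(/[ \t]+$/, "") }
        $1 == "name" { name = $2 }
        $1 == "type" { print name, $2 }
    '
}
```

Run as root on a working kernel, `dmraid -s | raid_sets` would print one line per set.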
My RAID, for additional info:

sudo dmraid -rD
/dev/sda: pdc, "pdc_gdfdahcie", mirror, ok, 156250000 sectors, data@ 0
/dev/sdb: pdc, "pdc_gdfdahcie", mirror, ok, 156250000 sectors, data@ 0
sudo dmraid -s
*** Active Set
name   : pdc_gdfdahcie
size   : 156250000
stride : 128
type   : mirror
status : ok
subsets: 0
devs   : 2
spares : 0

Re: comment 16
Confirmed. Same here, 2145 hangs on boot as well. My machine does not have RAID, so it seems that the issue is not RAID-related.

Re: comment 18
Can you boot to runlevel 3 only, and please describe the last few lines of output before the hang? I.e. is it the same as in comment 6? If not, and you really don't have SATA RAID (in use or otherwise), then you should open a separate bug report.

Re: comment 19
I installed 2145 also. My system is a RAID 1 configuration; it will boot with kernel-smp-2.6.16-1.2133_FC5 (see comment 13). But 2145 and 2139 won't let me boot at all, so I cannot boot to runlevel 3. With 2139 my system only gets to this line:

#] device-mapper: 4.6.0-ioctl (2006-02-17) initialised: dm-devel[AT]redhat.com

With 2145 all it gets to is:

Uncompressing Linux.. Ok, booting the kernel
Red Hat nash version 5.0.32 starting

Then it just sits there.

Leo, I was addressing Comment #18 from Dmitry, which seems like another issue since he does not have RAID, although yes, the 2.6.17 kernels seem to have multiple issues, which will all need to be addressed in bug reports.

Keith, my situation is the same as Leo's. It boots to the same lines (in both cases) and then just sits there. So I'm afraid I can't boot into RL3 either. If there's anything else you'd like me to try, I could do that today (tomorrow the server is going back into its closet, running an older kernel until this bug is fixed).

One other bit of info: I don't have a RAID array, but I do use LVM (since it's a default install option).

Folks who reported RAID issues where it hangs at different lines of output seem to be reporting a different issue.
Mine is the opposite situation; I don't have any LVM filesystems, but I do have an (unused by Linux) SATA BIOS RAID (ICH6R) used by Windows. It's not set to automount, although I believe "dmraid -ay" is called by init, and the new(ish) HAL/udev stuff seems to be doing the same during stage1 (which might account for why I can't disable it), but that does not account for why a kernel rebuild (omitting device-mapper) also fails (absolute dependency on DM?).

Anyway, I am *not* the bug assignee, nor the package maintainer; I'm just an affected user much like yourself. This does AFAICT appear to be purely an upstream issue at kernel dev, and short of patch workarounds, there really isn't much we can do. There are some fairly major changes in 2.6.17 compared to 2.6.16, and as I've said elsewhere, it looks like there are going to be quite a few teething problems with this release. My advice is (not that we have much choice right now) to stick with a working kernel and resist updates until a resolution has been found. It's a poor choice for those hoping for kernel updates to resolve earlier issues, but that's the way it is right now.

Keith, re: comment 21: I realized after I posted and read again what you meant. I was going to post to quantify, but never made it. Also, like you, I wish to help; I'm not whining. You are right, there is a lot of noise on this release of the kernel from most distros. See here: https://www.redhat.com/archives/fedora-test-list/2006-July/msg00048.html. I don't think they know what caused it yet.

I am having exactly the same error as Leo. Booting off a Promise tx150 in mirrored mode.
Not using an SMP kernel.

Just installed updates. Still doesn't work under kernel-2.6.17-1.2157_FC5-smp-i686.

Wahey! kernel-smp-2.6.17-1.2157_FC5.i686 WorksForMe®. No errors or warnings. Well done Dave and Juan!

Still can't boot kernel-2.6.17-1.2157_FC5.

Under kernel-2.6.16 my boot RAID 0 volume:

[root@www ~]# dmraid -s
*** Active Set
name   : pdc_fiagfhab
size   : 625163264
stride : 128
type   : stripe
status : ok
subsets: 0
devs   : 2
spares : 0

All FC5 kernel-2.6.17 builds, including 2157, won't boot on my system, which boots a RAID 0 array on a Promise PDC20376 (FastTrak):

#Loading jbd.ko module
#Loading ext3.ko module
#Loading dm-mod.ko module
#device-mapper: 4.6.0-ioctl (2006-02-17) initialised: dm-devel
#Loading dm-mirror.ko module
#Loading dm-zero.ko module
#Loading dm-snapshot.ko module
#Making device-mapper control node

kernel-2.6.17-1.2157_FC5 still hangs following device-mapper and is obviously still failing to find the RAID 0 volume /dev/mapper/pdc_fiagfhab.

2157 still hangs on
>> Uncompressing Linux.. Ok, booting the kernel
>> Red Hat nash version 5.0.32 starting
in my case. Back to 33.
I believe this is related to 196626. Fix suggested there may fix this bug as well.

Well, well, well... Three holes in the ground. This indeed fixed it for me: I rebuilt parted using the rawhide version and then rebuilt mkinitrd against that, and all is goodness!

(In reply to comment #32)
> I believe this is related to 196626. Fix suggested there may fix this bug as well.

Confirmed. Rebuilding parted and mkinitrd and reinstalling the kernel fixes this :)

kernel-2.6.17-1.2174_FC5 still fails to boot my RAID 0 array on a Promise PDC20376 (FastTrak). Hangs after device-mapper:

#Loading jbd.ko module
#Loading ext3.ko module
#Loading dm-mod.ko module
#device-mapper: 4.6.0-ioctl (2006-02-17) initialised: dm-devel
#Loading dm-mirror.ko module
#Loading dm-zero.ko module
#Loading dm-snapshot.ko module
#Making device-mapper control node

Under kernel-2.6.16 my boot RAID 0 volume:

[root@www ~]# dmraid -s
*** Active Set
name   : pdc_fiagfhab
size   : 625163264
stride : 128
type   : stripe
status : ok
subsets: 0
devs   : 2
spares : 0

Still having to stay with 2.6.16-1.2133_FC5.

Re: Comment 35
Did you try the fix posted in Bug 196626, fimefija? It looks like your system is stopping at exactly the same point mine was, and this fix seems to have worked for me:

#yum -y --enablerepo development update mkinitrd
#mv /boot/initrd-2.6.17-1.2174_FC5.img /boot/initrd-2.6.17-1.2174_FC5.img.old
#mkinitrd /boot/initrd-2.6.17-1.2174_FC5.img 2.6.17-1.2174_FC5

Your RAID array information looks similar to mine, as well:

*** Active Set
name   : pdc_fejcbccf
size   : 1250284544
stride : 128
type   : stripe
status : ok
subsets: 0
devs   : 2
spares : 0

Thanks. Having rebuilt mkinitrd, my system finally boots kernel 2.6.17!
#yum -y --enablerepo development update mkinitrd
#mv /boot/initrd-2.6.17-1.2174_FC5.img /boot/initrd-2.6.17-1.2174_FC5.img.old
#mkinitrd /boot/initrd-2.6.17-1.2174_FC5.img 2.6.17-1.2174_FC5

*** Active Set
name   : pdc_fiagfhab
size   : 625163264
stride : 128
type   : stripe
status : ok
subsets: 0
devs   : 2
spares : 0

Now running 2.6.17-1.2174!

And for those of us who boot the kernel 2.6.17 SMP versions:

[root@smokey ~]# yum -y --enablerepo development update mkinitrd
[root@smokey ~]# mv /boot/initrd-2.6.17-1.2174_FC5smp.img /boot/initrd-2.6.17-1.2174_FC5smp.img.old
[root@smokey ~]# mkinitrd /boot/initrd-2.6.17-1.2174_FC5smp.img 2.6.17-1.2174_FC5smp

[engwnbie@smokey ~]$ su -
Password:
[root@smokey ~]# dmraid -rD
/dev/sda: isw, "isw_ececagfhaj", GROUP, ok, 160086526 sectors, data@ 0
/dev/sdb: isw, "isw_ececagfhaj", GROUP, ok, 160086526 sectors, data@ 0
[root@smokey ~]# dmraid -s
*** Group superset isw_ececagfhaj
--> Active Subset
name   : isw_ececagfhaj_RAID_Volume1
size   : 160086016
stride : 128
type   : mirror
status : ok
subsets: 0
devs   : 2
spares : 0
[root@smokey ~]#
[root@smokey ~]# uname -r
2.6.17-1.2174_FC5smp
[root@smokey ~]#

I have a similar problem on an HP nx9420. kernel-smp-2.6.17-1.2174_FC5 does not boot, while kernel-smp-2.6.17-1.2157_FC5 was OK!! I have no RAID but do have LVM. I tried the proposed fix but can't find the right mkinitrd: I get version 5.0.40 or a version needing a glibc upgrade!!

Re: Comment 39
Yes, Marc, you will need to upgrade glibc as well as, if I recall correctly, one other library in order to upgrade mkinitrd. If you update using yum with the commands listed, it should automatically install all the dependencies for you.

I've added bug 204260 and prodded bug 189708. Perhaps we'll get an errata for parted and mkinitrd sometime soon.

*** This bug has been marked as a duplicate of 189708 ***
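For reference, the mkinitrd workaround reported in the comments above can be wrapped so that it works for any kernel version string (plain or smp). This is only a sketch: the function names `run_step` and `rebuild_initrd` and the `APPLY` switch are invented for illustration, and the three commands are simply the ones quoted in the thread. By default it only prints what it would run.

```shell
#!/bin/sh
# Sketch of the initrd rebuild workaround reported in this thread.
# run_step, rebuild_initrd and APPLY are illustrative names.

# Print the command, or execute it when APPLY=1 (must be run as root).
run_step() {
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    else
        echo "$*"
    fi
}

# rebuild_initrd KERNEL_VERSION, e.g. rebuild_initrd 2.6.17-1.2174_FC5smp
rebuild_initrd() {
    kver="$1"
    if [ -z "$kver" ]; then
        echo "usage: rebuild_initrd KERNEL_VERSION" >&2
        return 2
    fi
    img="/boot/initrd-${kver}.img"

    # Pull the newer mkinitrd from the development repo, keep the old
    # image as a fallback, then regenerate the image for that kernel.
    # Each step runs only if the previous one succeeded.
    run_step yum -y --enablerepo development update mkinitrd &&
    run_step mv "$img" "${img}.old" &&
    run_step mkinitrd "$img" "$kver"
}
```

Calling `rebuild_initrd 2.6.17-1.2174_FC5smp` prints the three commands for review; `APPLY=1 rebuild_initrd 2.6.17-1.2174_FC5smp` would actually run them. As noted in the comments above, the mkinitrd update from development pulls in glibc and other dependencies.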