Bug 220470
Description
Joseph Sacco
2006-12-21 16:06:56 UTC
Created attachment 144195 [details]
boot sequence [most of it...]
The problem persists with 2.6.19-1.2891.fc7smp.

FWIW:
* 2.6.18-1.2849.fc6smp boots OK
* yaboot.conf entry for the 2.6.19 kernel is identical in form to the entry for the 2.6.18 kernel
* replacing lvm2-2.02.17-1.fc7 -> lvm2-2.02.06-4 did not help [LVM is being used]
* replacing sym53c8xx.ko with the version from 2.6.18-1.2849.fc6smp did not help
-Joseph

The source for kernel-2.6.19.1, downloaded from kernel.org, builds and runs. A kernel built using the config file from kernel-2.6.19-1.2891.fc7 successfully booted up and ran. I noticed that the config file lacked some of the network modules for NAT. I rebuilt the kernel using the config file from 2.6.18-1.2868.fc6smp.
-Joseph

The problem persists with 2.6.19-1.2904.fc7smp.
-Joseph

The problem persists with 2.6.19-1.2906.fc7smp.
-Joseph

The problem persists with 2.6.19-1.2909.fc7smp.
-Joseph

The problem persists with 2.6.19-1.2911.fc7smp.
-Joseph

Hi! I have the exact same problem with a 64-bit Intel Itanium workstation (hp workstation zx6000) sporting a SCSI drive subsystem. No problem with the 2.6.18 series kernels. Since this problem seems not to be limited to 32-bit PowerMac systems with SCSI drives, could you please change the bug title accordingly? Thanks.

OK... Done.
-Joseph

Thanks. The problem persists with kernel 2.6.19-1.2912.fc7 on my Itanium workstation.

Same here... The problem persists with 2.6.19-1.2912.fc7smp.
-Joseph

For what it's worth... The current stable kernel source from kernel.org [10Jan07], linux-2.6.19.2, builds and runs. That being said, I have renamed this bug report "2.6.20" series kernels do not boot on systems with SCSI drives.
-Joseph

The problem persists with 2.6.19-1.2913.fc7smp.
-Joseph

The problem persists with 2.6.19-1.2914.fc7smp. Sigh...
-Joseph

The problem persists with 2.6.19-1.2916.fc7smp.
-Joseph

The problem persists with 2.6.19-1.2917.fc7smp.
-Joseph

Please attach bootlog from a working kernel.

Created attachment 147108 [details]
/var/log/messages [compressed]
Attachment contains a compressed copy of the boot sequence recorded in
/var/log/messages.
-Joseph
That's really strange: 2.6.20 doesn't detect any SCSI drives.

Created attachment 147122 [details]
contents of initrd for kernel-2.6.19.2
Created attachment 147123 [details]
contents of initrd for kernel-2.6.19-1.2917.fc7smp [does not boot]
Chuck,

Welcome to my world... :-)

The initrd files are compressed CPIO archives. I have listings for two:

kernel-2.6.19.2 ==> boots
kernel-2.6.19-1.2917.fc7smp ==> does *not* boot

I see that both contain the SCSI modules. The question now is why isn't the 2.6.19-1.2917 ramdisk loading the SCSI drivers?
-Joseph

Even better than that. I did a fresh reinstall on my Itanium workstation last night: the system doesn't boot and no initrd file was created! The elilo.conf file (Linux loader configuration file on Itanium systems) only contains the following lines:

image=vmlinuz-2.6.19-1.2917.fc7
label=linux
read-only
root=/dev/VolGroup00/LogVol00
append="rhgb quiet"

And the EFI subsystem only stores the vmlinux-2.6.19-1.2917.fc7, elilo.efi and elilo.conf files.

Try running mkinitrd with various forced options. I've been having to do this on x86_64 for a while.

(In reply to comment #24)
> Try running mkinitrd with various forced options. I've been having to do this on
> x86_64 for a while.

Are you thinking of a particular option I should try?

(In reply to comment #25)
> Are you thinking of a particular option I should try?

Well, I simply ran

mkinitrd /boot/efi/efi/redhat/initrd-2.6.19-1.2917.fc7.img 2.6.19-1.2917.fc7

and it did the trick for me. So it seems that something necessary was performed during the install process. I know, this is not always a possible option ;-)

Regenerating initrd using:
* mkinitrd with default arguments
or
* new-kernel-pkg --package kernel --mkinitrd --depmod --install 2.6.19-1.a.b.c
does not resolve the problem on my linuxPPC.
-Joseph

The problem persists with 2.6.20-1.2922.fc7smp. Sigh...
-Joseph

Try booting the non-working kernel with the kernel parameter:

scsi_logging_level=0xe00000

I *think* this will log highlevel SCSI events. (It's not documented very well.)

Is this the same issue as bug 218444? (Because the attached logfile also lists a missing /dev/root as the problem)

It may be the same issue, or a different issue with the same symptoms. Actually, we may be talking about a dozen different issues which appear similar. *sigh*.

Created attachment 147502 [details]
configuration file used to build 2.6.20 kernel and modules.
I have managed to build a 2.6.20 kernel that boots and runs on a 32-bit PowerMAC with SCSI drives.

What I did:
* fetch and unpack the 2.6.20 tarball from kernel.org
* run 'make menuconfig'
  I started with a config file that worked with the 2.6.19 series kernels and made two manual changes:
  - reduce number of CPU's to 2
  - configure the 8250 serial driver to load as a module
    [see https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=155895]
* run 'make'
  The build completes for the kernel but fails for the modules.

  Building modules, stage 2.
  MODPOST 1215 modules
  WARNING: "eth_io_copy_and_sum" [drivers/net/smc-ultra.ko] undefined!

  The problem is a change in an include file, asm-ppc/io.h, which I patched after looking at the previous version:

--- io.h- 2007-02-05 21:52:16.000000000 -0500
+++ io.h 2007-02-06 09:47:30.000000000 -0500
@@ -729,6 +729,7 @@
 }
 #define page_to_bus(page) (page_to_phys(page) + PCI_DRAM_OFFSET)
+#define eth_io_copy_and_sum(a,b,c,d) eth_copy_and_sum((a),(void __force *)(void __iomem *)(b),(c),(d))
 #endif /* CONFIG_PPC32 */

  [The latest rawhide SRPM also patched this file among other things.]
  With this change the module build completes.
* run 'make modules_install'
* generate initrd file
  /sbin/new-kernel-pkg --package kernel --mkinitrd --depmod --install 2.6.20
* reboot and watch the fun...
  ==> Success!!!

Discussion
----------
So what's different? The first thing that comes to mind is mkinitrd was updated today. However, I do not believe that mkinitrd was the problem:
* I previously built 2.6.19.2 under rawhide using the kernel.org source
* I was able to build 2.6.20 under FC6.
* Regenerating initrd files for both 2.6.19-1.2917.fc7smp and 2.6.20-1.2922.fc7smp did not allow them to boot.

So... What else could it be? Configuration? Patches?

I have attached a copy of the configuration file I used. Maybe sharper eyes can see something I missed when comparing it with the rawhide configuration files.
-Joseph

Created attachment 147560 [details]
configuration file used to build 2.6.20 kernel and modules.
The kernel I built the other day lacked NAT support. I rebuilt the kernel
starting the configuration process with the default configuration for a pmac32
linux-2.6.20/arch/powerpc/configs/pmac32_defconfig
The build was uneventful. The new kernel with NAT support boots up and runs.
-Joseph
(In reply to comment #29)
> Try booting the non-working kernel with the kernel parameter:
>
> scsi_logging_level=0xe00000
>
> I *think* this will log highlevel SCSI events.
> (It's not documented very well.)

Does not give any additional information for me. With both 2.6.19 and 2.6.20 kernels, I get the same broken behaviour, complaints about missing /dev/root. Maybe the following gives you a clue? Before the complaint about missing /dev/root, I get the following:
- ... scsi driver output ...
- trying to resume from /dev/sdb2 (my swap partition)
- unable to access resume device (/dev/sdb2)
Does this mean the kernel "knows" there are scsi devices, but is unable to access them?

FWIW, I just tried a Knoppix 5.1.1 DVD, which uses a 2.6.19 kernel. Boots up fine and is able to access the scsi disk.

(In reply to comment #29)
> Try booting the non-working kernel with the kernel parameter:
>
> scsi_logging_level=0xe00000
>
> I *think* this will log highlevel SCSI events.
> (It's not documented very well.)

I tried various kernel boot params, but NONE of them gave additional scsi output on the console :-/

broken2.txt:Kernel command line: ro root=/dev/sdb3 console=ttyS0,9600n8 console=tty0 scsi_logging_level=0xe00000 1
broken3.txt:Kernel command line: ro root=/dev/sdb3 console=ttyS0,9600n8 console=tty0 scsi_logging_level=0xffffff 1
broken4.txt:Kernel command line: ro root=/dev/sdb3 console=ttyS0,9600n8 console=tty0 scsi_logging_level=0xffffffff 1

Just for kicks, I had tried to install the i586 kernel to see whether it makes a difference, but no.

I am about to attach two log files from the external console.
workinglog.txt is from booting the last rawhide fc6 kernel
brokenlog.txt is from booting the latest rawhide fc7 kernel, currently i586 installed
I'll also attach the diff between them.

We get a kernel panic in the middle of scsi module init, I think.

But there is another difference that surprises me. It appears the new kernel is not loading the IDE driver either? This might explain why the F7 test 1 LiveCD does not work on that system (which uses an IDE drive). I mentioned that in bug 218444 comment 12.

Created attachment 147613 [details]
Kai's working fc6 boot log
Created attachment 147614 [details]
Kai's broken fc7 boot log
Created attachment 147615 [details]
Kai's boot log diff
diff -uw rpmbuild-kernel-working/SOURCES/kernel-2.6.18-i686.config rpmbuild-kernel-broken/SOURCES/kernel-2.6.20-i686.config

gives

-CONFIG_IDE=y
-CONFIG_IDE_GENERIC=y
+# CONFIG_IDE is not set

Does it make sense that the latest rawhide kernel has IDE disabled? Looking at Joseph's working config file, he has CONFIG_IDE enabled.

Created attachment 147622 [details]
Kai's initrd contents of broken kernel
00:11.1 IDE interface: VIA Technologies, Inc.
VT82C586A/B/VT82C686/A/B/VT823x/A/C PIPC Bus Master IDE (rev 06)
DaveJ said the initrd might need to have pata_via, but it does not.
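Whether a given driver made it into an initrd can be checked from a file listing of the image. The following is a minimal sketch, assuming the listing has already been saved to a text file with one path per line (for example `lib/pata_via.ko`). The function name `missing_modules` and the listing file are hypothetical and not part of mkinitrd.

```shell
#!/bin/sh
# Hypothetical helper: print each named module that does NOT appear in an
# initrd file listing. The listing is a text file with one path per line.
missing_modules() {
    listing=$1
    shift
    for mod in "$@"; do
        # Match "<mod>.ko" as the last path component of a line.
        if ! grep -E -q "(^|/)${mod}\.ko\$" "$listing"; then
            echo "$mod"
        fi
    done
}
```

Since the initrd images in this thread are described as compressed CPIO archives, a listing could plausibly be produced with something like `zcat initrd.img | cpio -it > listing.txt` before calling `missing_modules listing.txt sym53c8xx pata_via libata`. That pipeline is an assumption about the image format, not something taken from the attachments.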
Created attachment 147623 [details]
Kai's mkinitrd -v log
mkinitrd -f -v /boot/initrd-2.6.20-1.2922.fc7.img 2.6.20-1.2922.fc7 2>&1 | tee initrdlog
Created attachment 147625 [details]
Kai's mkinitrd -v --preload=sym53c8xx --preload=pata_via log
Created attachment 147626 [details]
Kai's working fc7 boot log w/ mkinitrd --preload=sym53c8xx --preload=pata_via
I have built a kernel from the 2.6.20-1.2922 source that boots and runs.
What I did:
* unpack the src RPM
* run 'rpmbuild -bp'
* move linux-2.6.20.ppc /usr/src/kernels
* run 'make mrproper'
* copy .config from successful 2.6.20 build
[see attachment #147560 [details]]
* run 'make'
The build fails to compile drivers/md/md.c
because of Patch1793: linux-2.6-raid-autorun.patch
* compare current .config file with kernel-2.6.20-ppc-smp.config
==> notice differences in Multi-device support (RAID and LVM)
* run 'make menuconfig' using current .config file
alter Multi-device support (RAID and LVM) to match what is in
kernel-2.6.20-ppc-smp.config
* restart 'make'
==> build completes
* install kernel, System.map, modules
* generate initrd file
/sbin/new-kernel-pkg --package kernel-smp --mkinitrd --depmod --install 2.6.20-1.2922.smp
* reboot
-Joseph
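The config comparison step above can be sketched as a small helper. This is only an illustration, assuming both files use the usual kernel `.config` syntax (`CONFIG_FOO=y` or `# CONFIG_FOO is not set`). The function name `config_diff` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the CONFIG_ options whose setting differs
# between two kernel config files. Lines found only in the first file are
# prefixed with "-", lines found only in the second with "+".
config_diff() {
    a=$(mktemp) && b=$(mktemp) || return 1
    # Keep only option lines (set, or explicitly "is not set") and sort
    # them, so that mere ordering differences are not reported.
    grep -E '^(CONFIG_|# CONFIG_)' "$1" | LC_ALL=C sort > "$a"
    grep -E '^(CONFIG_|# CONFIG_)' "$2" | LC_ALL=C sort > "$b"
    LC_ALL=C comm -23 "$a" "$b" | sed 's/^/-/'
    LC_ALL=C comm -13 "$a" "$b" | sed 's/^/+/'
    rm -f "$a" "$b"
}
```

For example, `config_diff .config kernel-2.6.20-ppc-smp.config` would list only the options that differ, which is easier to scan than a full `diff` when the two files order their options differently.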
Created attachment 147661 [details]
configuration file used to build 2.6.20-1.2922smp kernel and modules.
I was told that it is intended to have CONFIG_IDE disabled in the latest kernels, because CONFIG_ATA is supposed to replace it.

As I said earlier:

(In reply to comment #44)
> Created an attachment (id=147625) [edit]
> Kai's mkinitrd -v --preload=sym53c8xx --preload=pata_via log

these options, in addition to standard mkinitrd options, produced a booting system for me.

However, there are many different modules that a system might require on boot, and you might not know the name of the module to preload.

Playing with mkinitrd, I found that giving options

--force-ide-probe --force-scsi-probe

also added the missing pata_via and libata modules to the initrd image that I require to boot.

So, I'm not a kernel nor a mkinitrd expert, so I don't know if the following is generally reasonable advice. But maybe you could try to produce an initrd image using the force options, and see if you can boot that way.

If you're curious whether it makes any difference, you could compare the output of "mkinitrd -v" runs with and without the force options.

Regenerating initrd has not helped. Neither choice of additional mkinitrd options:
* --preload=sym53c8xx --preload=pata_via
* --force-ide-probe --force-scsi-probe
results in a bootable 2.6.20-1.2922.fc7smp kernel on my system.
-Joseph

For what it's worth... I built 2.6.20-git4 using the source from kernel.org. After patching asm-powerpc/io.h, the kernel and modules build and run.
-Joseph

Built 2.6.20-git5 using the source from kernel.org. The saga continues...
-Joseph

More of the same... The problem persists with 2.6.20-1.2925.fc7smp.
-Joseph

A "me too" report, but a 32-bit x86 box with aic7xxx scsi controller, and --preload aic7xxx doesn't help any. Will poke box more when I can actually get into the office...

2.6.20-1.2930.fc7smp: more of the same.
-Joseph

I'm seeing two different problems comparing the console output from 2.6.19-1.2895.fc6 and 2.6.20-1.2925.fc7 (will attach both momentarily).

First up, with the fc7 kernel, only one of the two scsi drives in this system even shows up in the console output. Second, with the fc7 kernel, the one drive that is showing up in the console output is never assigned to sda like it is under fc6.

Under the fc6 kernel:

scsi0 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 7.0
<Adaptec aic7890/91 Ultra2 SCSI adapter>
aic7890/91: Ultra2 Wide Channel A, SCSI Id=7, 32/253 SCBs
scsi 0:0:0:0: Direct-Access ZJCS ZJCS2-36GB S5BS PQ: 0 ANSI: 3
scsi0:A:0:0: Tagged Queuing enabled. Depth 4
target0:0:0: Beginning Domain Validation
target0:0:0: wide asynchronous
target0:0:0: FAST-40 WIDE SCSI 80.0 MB/s ST (25 ns, offset 63)
target0:0:0: Domain Validation skipping write tests
target0:0:0: Ending Domain Validation
SCSI device sda: 71687340 512-byte hdwr sectors (36704 MB)
sda: Write Protect is off
SCSI device sda: drive cache: write back
SCSI device sda: 71687340 512-byte hdwr sectors (36704 MB)
sda: Write Protect is off
SCSI device sda: drive cache: write back
sda: sda1 sda2 sda3 sda4 < sda5 sda6 >
sd 0:0:0:0: Attached scsi disk sda

Under the fc7 kernel:

Loading aic7xxx.ko module
scsi0 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 7.0
<Adaptec aic7890/91 Ultra2 SCSI adapter>
aic7890/91: Ultra2 Wide Channel A, SCSI Id=7, 32/253 SCBs
scsi 0:0:0:0: Direct-Access ZJCS ZJCS2-36GB S5BS PQ: 0 ANSI: 3
scsi0:A:0:0: Tagged Queuing enabled. Depth 4
target0:0:0: Beginning Domain Validation
target0:0:0: wide asynchronous
target0:0:0: FAST-40 WIDE SCSI 80.0 MB/s ST (25 ns, offset 63)
target0:0:0: Domain Validation skipping write tests
target0:0:0: Ending Domain Validation
Loading uhci-hcd.ko module
[... and so on, without any mention of sda ...]

Created attachment 148114 [details]
Working FC6 kernel boot log, aic7xxx adapter
Created attachment 148115 [details]
Failing FC7 kernel boot log, aic7xxx adapter
I'm changing this to mkinitrd, because I think that's more likely to get this on the radar of the people who can address the problem.

For me, running mkinitrd with "--with=libata --with=ata_piix" fixes the issue.

Ah, I stupidly missed the part where Kai got his system working on an fc7 kernel w/some extra mkinitrd flags to include more ata bits. Trying something similar here now...

Okay, with ata_piix added to the mix, I get a bit further -- at least both scsi disks are seen, but still kernel panicking. Why ata bits are required to see scsi disks seems, uh, wrong, but... Will poke at it more later...

Last bits of boot log:

SCSI device sda: write cache: enabled, read cache: enabled, doesn't support DPO or FUA
SCSI device sda: 71687340 512-byte hdwr sectors (36704 MB)
sda: Write Protect is off
md: Autodetecting RAID arrays.
md: autorun ...
md: ... autorun DONE.
SCSI device sda: write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sda:Trying to resume from LABEL=SWAP-sda3
sda1 sda2 sda3 sda4 < sda5 sda6 >
sd 0:0:0:0: Attached scsi disk sda
SCSI device sdb: 71687340 512-byte hdwr sectors (36704 MB)
Unable to access resume device (LABEL=SWAP-sda3)
Creating root device.
sdb: Write Protect is off
SCSI device sdb: write cache: enabled, read cache: enabled, doesn't support DPO or FUA
SCSI device sdb: 71687340 512-byte hdwr sectors (36704 MB)
sdb: Write Protect is off
SCSI device sdb: write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sdb: sdb1 sdb2 sdb3 sdb4 <Mounting root filesystem.
sdb5 sdb6 >
sd 0:0:8:0: Attached scsi disk sdb
EXT3-fs: unable to read superblock
mount: error mounting /dev/root on /sysroot as ext3: Invalid argument
Setting up other filesystems.
Setting up new root fs
setuproot: moving /dev failed: No such file or directory
no fstab.sys, mounting internal defaults
setuproot: error mounting /proc: No such file or directory
setuproot: error mounting /sys: No such file or directory
Switching to new root and running init.Kernel panic - not syncing: Attempted to kill init!
unmounting old /dev
unmounting old /proc
unmounting old /sys
switchroot: mount failed: No such file or directory

For what it's worth... I am now running 2.6.20-git11 built using the source from kernel.org. I started with the default configuration for pmac32 and tweaked it in a couple of places. I did not change anything associated with SCSI or LVM.
-Joseph

I thought of something to try: add some delay in the initscript on the initrd. It almost looks like the sd driver is still scanning when init tries to mount the root fs.

Chuck, you might be on to something there. I notice a definite delay while libata fires up -- a delay that doesn't happen with that not included.

More of the same... The problem persists with 2.6.20-1.2932.fc7smp.
-Joseph

(In reply to comment #62)
> I thought of something to try: add some delay in the initscript
> on the initrd. It almost looks like the sd driver is still scanning when
> init tries to mount the root fs.

Bingo. Added a bit of a delay after loading modules in the initrd, and now everything is coming up as expected with the fc7 kernel. It looks like there's already some code that's *supposed* to do this for you in the initrd:

echo Waiting for driver initialization.
stabilized --hash --interval 250 /proc/scsi/scsi

For whatever reason though, that doesn't appear to be cutting it. (A 5 second sleep right before the 'stabilized' bit is what got me booted).

Was just looking at that too. "stabilized" is an undocumented nash builtin. Looks like Jeremy Katz and Peter Jones added the calls to the init script in versions 6.0.4 and 6.0.5 for pata and ahci/stat_*.
This appears to read the given file (in this case, /proc/scsi/scsi) and loop until it stops changing. What if you add a "stabilized" call just like the above right after the aic7xxx.ko module is loaded, instead of the sleep?

Hrm... I suppose I could try throwing another one of those in there, but I'd figured the one that was already there should have been covering me for both ata_piix and aic7xxx. Here's a bit more of what my init looked like before adding the sleep:

mkblkdevs
echo "Loading scsi_mod.ko module"
insmod /lib/scsi_mod.ko
echo "Loading sd_mod.ko module"
insmod /lib/sd_mod.ko
echo "Loading scsi_transport_spi.ko module"
insmod /lib/scsi_transport_spi.ko
echo "Loading aic7xxx.ko module"
insmod /lib/aic7xxx.ko
echo "Loading libata.ko module"
insmod /lib/libata.ko
echo "Loading ata_piix.ko module"
insmod /lib/ata_piix.ko
echo Waiting for driver initialization.
stabilized --hash --interval 250 /proc/scsi/scsi

I'll throw another stabilized in after the aic7xxx insmod, leave the sleep out and see what happens...

Okay, so for this round, I bumped to kernel 2932. I left out the ata_piix and libata modules, since there's nothing hooked up via ata in this system. I added the 'stabilized' command just after aic7xxx is insmod'd, and there's a noticeable pause there, but still no dice. Still fails to get the drives all the way up.

----------
Loading aic7xxx.ko module
scsi0 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 7.0
<Adaptec aic7890/91 Ultra2 SCSI adapter>
aic7890/91: Ultra2 Wide Channel A, SCSI Id=7, 32/253 SCBs
scsi 0:0:0:0: Direct-Access ZJCS ZJCS2-36GB S5BS PQ: 0 ANSI: 3
scsi0:A:0:0: Tagged Queuing enabled. Depth 4
Waiting for driv target0:0:0: Beginning Domain Validation
er initialization.
target0:0:0: wide asynchronous
target0:0:0: FAST-40 WIDE SCSI 80.0 MB/s ST (25 ns, offset 63)
target0:0:0: Domain Validation skipping write tests
target0:0:0: Ending Domain Validation
scsi 0:0:4:0: CD-ROM PLEXTOR CD-R PX-W8220T 1.00 PQ: 0 ANSI: 2
target0:0:4: Beginning Domain Validation
target0:0:4: FAST-10 SCSI 10.0 MB/s ST (100 ns, offset 8)
target0:0:4: Domain Validation skipping write tests
target0:0:4: Ending Domain Validation
scsi 0:0:5:0: CD-ROM TOSHIBA DVD-ROM SD-M1201 1011 PQ: 0 ANSI: 2
target0:0:5: Beginning Domain Validation
target0:0:5: FAST-20 SCSI 20.0 MB/s ST (50 ns, offset 16)
target0:0:5: Domain Validation skipping write tests
target0:0:5: Ending Domain Validation
scsi 0:0:8:0: Direct-Access IBM DDYS-T36950N S96H PQ: 0 ANSI: 3
scsi0:A:8:0: Tagged Queuing enabled. Depth 4
target0:0:8: Beginning Domain Validation
target0:0:8: wide asynchronous
target0:0:8: FAST-40 WIDE SCSI 80.0 MB/s ST (25 ns, offset 63)
target0:0:8: Domain Validation skipping write tests
target0:0:8: Ending Domain Validation
md: Autodetecting RAID arrays.
md: autorun ...
md: ... autorun DONE.
Trying to resume from LABEL=SWAP-sda3
Unable to access resume device (LABEL=SWAP-sda3)
Creating root device.
Mounting root filesystem.
EXT3-fs: unable to read superblock
mount: error mounting /dev/root on /sysroot as ext3: Invalid argument
Setting up other filesystems.
Setting up new root fs
setuproot: moving /dev failed: No such file or directory
no fstab.sys, mounting internal defaults
setuproot: error mounting /proc: No such file or directory
setuproot: error mounting /sys: No such file or directory
Switching to new root and running init.Kernel panic - not syncing: Attempted to kill init!

Now that 'we' are all exploring the inner workings of initrd, I thought I would unpack initrd-2.6.20-git11 and initrd-2.6.20.1-2932fc7smp and compare the contents. The first thing that jumped out was that the git11 init file does not contain any scsi instructions. Hmmm... Why is that?
A quick look at the kernel config file and the lib/modules/kernel/drivers/scsi directory shows that the scsi drivers needed by my system are built directly into the kernel.

So... It would appear that the time to load the drivers for the fc7 kernel may well be the issue.
-Joseph

Also, see bug #228689 where SAN attached storage needs an 8 second delay.

This seems to have regressed more, at least for me. As said before, I had been able to get a booting 2.6.20-1.2922 kernel using

mkinitrd --preload=sym53c8xx --preload=pata_via

After upgrading to 2.6.20-1.2936 the automatically created initrd does not boot. It stops after "attaching disk /dev/sdc". (Right before one would usually get "waiting for scsi driver init")

I booted back into 2922 and tried the above mkinitrd again. This improves things a bit. The boot process goes on, even arrives at "switching to new root and running init". However, then my kernel stops again - which might be a different problem? The last things displayed are messages about USB HID, keyboard + mouse.

While the kernel stops, it's not a "hard" stop. After nothing happened for 1 minute, I pressed "ctrl-alt-del", and got a "stopping all devices" and a reboot.

FWIW, I compared the contents of the automatically created initrd-2936 and the contents of my manually created initrd-2936 (using --preload).

@@ -33,49 +33,49 @@
 mknod /dev/tty11 c 4 11
 mknod /dev/tty12 c 4 12
 mknod /dev/ttyS0 c 4 64
 mknod /dev/ttyS1 c 4 65
 mknod /dev/ttyS2 c 4 66
 mknod /dev/ttyS3 c 4 67
 echo Setting up hotplug.
 hotplug
 echo Creating block device nodes.
 mkblkdevs
+echo "Loading scsi_mod.ko module"
+insmod /lib/scsi_mod.ko
+echo "Loading sd_mod.ko module"
+insmod /lib/sd_mod.ko
+echo "Loading scsi_transport_spi.ko module"
+insmod /lib/scsi_transport_spi.ko
+echo "Loading sym53c8xx.ko module"
+insmod /lib/sym53c8xx.ko
+echo "Loading libata.ko module"
+insmod /lib/libata.ko
+echo "Loading pata_via.ko module"
+insmod /lib/pata_via.ko
+echo Waiting for driver initialization.
+stabilized --hash --interval 250 /proc/scsi/scsi
 echo "Loading uhci-hcd.ko module"
 insmod /lib/uhci-hcd.ko
 echo "Loading ohci-hcd.ko module"
 insmod /lib/ohci-hcd.ko
 echo "Loading ehci-hcd.ko module"
 insmod /lib/ehci-hcd.ko
 mount -t usbfs /proc/bus/usb /proc/bus/usb
 echo "Loading mbcache.ko module"
 insmod /lib/mbcache.ko
 echo "Loading jbd.ko module"
 insmod /lib/jbd.ko
 echo "Loading ext3.ko module"
 insmod /lib/ext3.ko
-echo "Loading scsi_mod.ko module"
-insmod /lib/scsi_mod.ko
-echo "Loading sd_mod.ko module"
-insmod /lib/sd_mod.ko
-echo "Loading libata.ko module"
-insmod /lib/libata.ko
 echo "Loading ata_generic.ko module"
 insmod /lib/ata_generic.ko
-echo "Loading pata_via.ko module"
-insmod /lib/pata_via.ko
-echo Waiting for driver initialization.
-stabilized --hash --interval 250 /proc/scsi/scsi
-echo "Loading scsi_transport_spi.ko module"
-insmod /lib/scsi_transport_spi.ko
-echo "Loading sym53c8xx.ko module"
-insmod /lib/sym53c8xx.ko
 mkblkdevs
 resume /dev/sdb2
 echo Creating root device.
 mkrootdev -t ext3 -o defaults,ro sdc2
 echo Mounting root filesystem.
 mount /sysroot
 echo Setting up other filesystems.
 setuproot
 echo Switching to new root and running init.
 switchroot

kernel 2.6.20-1.2947 gives me exactly the same behaviour as described in comment 71 with 2.6.20-1.2936 (automatic initrd stops in the middle of scsi, preload initrd stops at USB HID).

kernel 2.6.20-1.2949: More of the same. Sigh...
-Joseph

kernel 2.6.20-1.2953, kernel 2.6.20-1.2960: Ditto
-Joseph

*** Bug 228977 has been marked as a duplicate of this bug. ***

kernel 2.6.20-1.2962: yada, yada, yada...
-Joseph

kernel 2.6.20-1.2966: no love...
-Joseph

kernel 2.6.20-1.2967: Still borked.
-Joseph

So, in lieu of a real fix, the following patch to mkinitrd works around the problem for me. A better fix would be to find the right file to pass to "stabilized" (as per comment #65, /proc/scsi/scsi isn't doing it), but for now, sleeping at least gets my system up.
--- mkinitrd.orig 2007-03-08 12:42:48.000000000 -0500
+++ mkinitrd 2007-03-08 12:44:39.000000000 -0500
@@ -1357,6 +1357,10 @@
         emit "echo Waiting for driver initialization."
         emit "stabilized --hash --interval 250 /proc/scsi/scsi"
     fi
+    if [ "$module" = "aic7xxx" ]; then
+        emit "echo Sleeping for 5 seconds because that seems to work."
+        emit "sleep 5"
+    fi
 done
 unset usb_mounted

Tried it. Did not work on my PowerMac.
-Joseph

If you have a different scsi driver, that's to be expected. Looks like in your case it's sym53c8xx -- try this instead:

--- mkinitrd.orig 2007-03-08 12:42:48.000000000 -0500
+++ mkinitrd 2007-03-08 12:44:39.000000000 -0500
@@ -1357,6 +1357,10 @@
         emit "echo Waiting for driver initialization."
         emit "stabilized --hash --interval 250 /proc/scsi/scsi"
     fi
+    if [ "$module" == "aic7xxx" -o "$module" == "sym53c8xx" ]; then
+        emit "echo Sleeping for 5 seconds because that seems to work."
+        emit "sleep 5"
+    fi
 done
 unset usb_mounted

Tried that also... :-)

...
        emit "echo Waiting for driver initialization."
        emit "stabilized --hash --interval 250 /proc/scsi/scsi"
    fi
    if [ "$module" = "aic7xxx" ]; then
        emit "echo Sleeping for 5 seconds because that seems to work."
        emit "sleep 5"
    fi
    if [ "$module" = "sym53c8xx" ]; then
        emit "echo Sleeping for 5 seconds because that seems to work."
        emit "sleep 5"
    fi
...

Can you attach the init file from your mkinitrd?

Created attachment 149620 [details]
init file extracted from initrd-2.6.20-1.2967.fc7smp.img
Maybe a 5 second delay is not long enough for an older powerMAC.
-Joseph
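Rather than guessing a fixed delay, the idea behind the `stabilized` call discussed earlier in this thread (poll a file until it stops changing) can be sketched in plain shell. This is not nash's implementation. The function name `wait_stable`, its arguments, and the use of `cksum` are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical sketch: poll FILE until two consecutive checksums match,
# or give up after MAX_TRIES polls. INTERVAL is in seconds.
# Usage: wait_stable FILE INTERVAL MAX_TRIES
wait_stable() {
    file=$1
    interval=$2
    max_tries=$3
    prev=$(cksum "$file")
    tries=0
    while [ "$tries" -lt "$max_tries" ]; do
        sleep "$interval"
        cur=$(cksum "$file")
        if [ "$cur" = "$prev" ]; then
            return 0    # contents unchanged across one interval
        fi
        prev=$cur
        tries=$((tries + 1))
    done
    return 1            # still changing after MAX_TRIES polls
}
```

A call such as `wait_stable /proc/scsi/scsi 1 30` would wait up to roughly 30 seconds. As the reports in this thread suggest, this only helps if the watched file actually reflects the state being waited on: a file that has stopped changing does not by itself prove that the disks are ready.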
Possible. Or maybe your problem is actually different from the one most of the rest of us are seeing. Anyway, try something much bigger and see what happens.

A ten second delay is what was required. I am now running 2.6.20-1.2967.fc7smp. Life is better.

Showing that no good deed goes unpunished... The boot sequence initiated an SELinux relabeling. Took a very long time.
-Joseph

Here's the latest status of my SCSI system:
- default initrd img still does not boot
- adding "just" a delay doesn't help me (I even got dropped into an emergency filesystem recovery shell)
- I still need --preload to get my system past SCSI init
- combining --preload with the delay does not improve things for me

What I said in comment 71 about 2.6.20-1.2922.fc7 is still my most recent success. With any newer kernel I still get the USB-HID failure I reported in comment 71. The "scsi delay" did not influence that.

My current conclusion is:
- my system needs --preload
- my system does not need a delay
- the USB-HID issue is a separate bug?

This looks remarkably similar to bug #162685 from Fedora Core 3....

Any news on this from anyone @ red hat? It seems like this is a showstopper level bug for Fedora 7.

I just read the help text for SCSI_SCAN_ASYNC again (which is probably the root cause of all this)...

If you have built SCSI as modules, enabling this option can be a problem as the devices may not have been found by the time your system expects them to have been. You can load the scsi_wait_scan module to ensure that all scans have completed. If you build your SCSI drivers into the kernel, then everything will work fine if you say Y here.

Peter, are we loading that module ? If not, perhaps we should :)

Adding a me too. Upgraded to FC7 test, which went ok, but doesn't boot (also reported previously several times when trying to run recent rawhide kernels under FC6). I'm using Fusion MPT scsi.

scsi_wait_scan works for me.
It needs to be loaded after the scsi modules (I hacked my mkinitrd to run a findmodule for it if the current module is dm-mod).

--- /sbin/mkinitrd.orig 2007-04-11 11:06:31.000000000 -0400
+++ /sbin/mkinitrd 2007-04-11 11:53:19.000000000 -0400
@@ -235,6 +235,9 @@
         findmodule ieee1394
         findmodule ohci1394
         modName="sbp2"
+    elif [ "$modName" = "dm-mod" ]; then
+        findmodule scsi_wait_scan
+        modName="dm-mod"
     elif [ "$modName" = "gfs2" ]; then
         findmodule lock_nolock
         modName="gfs2"

I did NOT need the stabilize line (which alone got the disk detected but the system then failed to find /dev/root anyway). I suspect the pata/sata/sleep stuff people have been using is just accidentally solving the problem, by causing delays which may or may not be correct.

Now I'm seeing udev take a long time to start up and several

Mounting other filesystems: mount: /dev/sysfs already mounted or /sys busy
mount: according to mtab, sysfs is already mounted on /sys

but at least the system comes up now.

James, I can confirm that your patch for mkinitrd works on a G4 PowerMac with 3 SCSI drives and ATTO controllers.
-Joseph

Works with my aic7xxx box as well.

Works for me on aic7xxx too. Moving out of needinfo state; let's get this in, yeah?

OK... we now have four data points. Should be enough... [:-)]
-Joseph

Eep. Minor correction... The patch in comment #92 does NOT work as-is on my aic7xxx box. Including scsi-wait-scan in the initrd works, but it's not getting included by that patch, as there is no dm-mod getting pulled into my initrd to begin with. I'd suggest perhaps the following instead, which works for me:

--- /sbin/mkinitrd.orig 2007-04-12 12:32:31.000000000 -0400
+++ /sbin/mkinitrd 2007-04-12 12:32:50.000000000 -0400
@@ -1004,6 +1004,7 @@
                 # RAID controllers with drivers in block/
                 findmodule $n
             done
+            findmodule scsi-wait-scan
         fi
     fi
 fi

I believe that'll pull it in if we pull in any scsi modules period, rather than relying on having dm-mod.
That, or slap that in around line 305:

# need to handle prescsimods here -- they need to go _after_ scsi_mod
if [ "$modName" = "scsi_mod" ]; then
    for n in $PRESCSIMODS ; do
        findmodule $n
    done
    findmodule scsi-wait-scan
fi

Jarod, Your suggested fix does *not* work on my system.
-Joseph

The fix in #97 does work on my system. The fix in #98 does not.
-Joseph

That's what I get for putting it in there w/o actually trying it first... #98 doesn't work for me either on another box, but #97 does. In any case, Peter actually has a better fix forthcoming that has been successfully tested on my box.

"We" are hopeful... This has been a very long road to travel.
-Joseph

Hopefully this should be addressed in mkinitrd-6.0.9-1 .

(In reply to comment #103)
> Hopefully this should be addressed in mkinitrd-6.0.9-1 .

Appears to work, although udev startup pauses for some minutes waiting on udev_settle.

mkinitrd 6.0.9-1 and kernel 2.6.20-1.3079.fc7 works for me without need for any tweaks.

Confirmed: mkinitrd-6.0.9.1 + kernel 2.6.20-1.3079.fc7smp works on my system.
-Joseph

Confirmed: mkinitrd-6.0.9.1 + kernel 2.6.20-1.3079.fc7

I was initially dropped to an emergency shell. Because I have both IDE and SCSI disks, my boot SCSI disk is now on a different /dev/sdX node. I had to manually adjust /etc/fstab

My system boots up fine now. (The other issue about USB HID I had reported in comment 71 is now gone too!)

*** Bug 236475 has been marked as a duplicate of this bug. ***

Maybe I should file a different bug, but: Kernel 3084 gives me different device-to-/dev/sdX assignments than 3079 :-/

In the past the order of devices has always been stable for me. Now I can't boot, because fstab wants to mount from /dev/sdc, but suddenly my boot disk became /dev/sdb.

Your fstab should contain LABEL= lines rather than hardcoded /dev names.

*** Bug 228699 has been marked as a duplicate of this bug. ***

So, this should be closed now, because kernel+mkinitrd are doing the right thing now.. correct?

Yes...
-Joseph
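The advice above about using LABEL= entries in fstab can be checked mechanically. The following is a minimal sketch that lists fstab entries still referring to sd/hd device nodes by name. The function name `hardcoded_devices` is hypothetical, and the file to inspect is passed as an argument rather than assumed.

```shell
#!/bin/sh
# Hypothetical helper: print "device mountpoint" for every fstab entry
# whose first field is a raw /dev/sdX or /dev/hdX node (with or without a
# partition number). LABEL= entries, LVM paths and comments are ignored.
hardcoded_devices() {
    awk '$1 ~ /^\/dev\/[sh]d[a-z]+[0-9]*$/ { print $1, $2 }' "$1"
}
```

Running `hardcoded_devices /etc/fstab` on a system like the one described would print any entries that are likely to break when the kernel assigns /dev/sdX names in a different order.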