Bug 220470

Summary: 2.6.20 series kernels do not boot on systems with SCSI drives
Product: [Fedora] Fedora
Component: mkinitrd
Version: rawhide
Hardware: All
OS: Linux
Status: CLOSED RAWHIDE
Severity: urgent
Priority: medium
Reporter: Joseph Sacco <jsacco>
Assignee: Peter Jones <pjones>
QA Contact: David Lawrence <dkl>
CC: cebbert, davej, dwmw2, emeric.maschino, jarod, jmorris, kengert, mattdm, mgansser, wtogami
Doc Type: Bug Fix
Last Closed: 2007-05-09 21:54:33 UTC
Bug Blocks: 150226
Attachments:
boot ssequence [most of it...] (flags: none)
/var/log/messages [compresssed] (flags: none)
contents of initrd for kernel-2.6.19.2 (flags: none)
contents of initrd for kernel-2.6.19-1.2917.gc7smp [does not boot] (flags: none)
configuration file used to build 2.6.20 kernel and modules. (flags: none)
configuration file used to build 2.6.20 kernel and modules. (flags: none)
Kai's working fc6 boot log (flags: none)
Kai's broken fc7 boot log (flags: none)
Kai's boot log diff (flags: none)
Kai's initrd contents of broken kernel (flags: none)
Kai's mkinitrd -v log (flags: none)
Kai's mkinitrd -v --preload=sym53c8xx --preload=pata_via log (flags: none)
Kai's working fc7 boot log w/ mkinitrd --preload=sym53c8xx --preload=pata_via (flags: none)
configuration file used to build 2.6.20-1.2922smp kernel and modules. (flags: none)
Working FC6 kernel boot log, aic7xxx adapter (flags: none)
Failing FC7 kernel boot log, aic7xxx adapter (flags: none)
init file extracted from initrd-2.6.20-1.2967.fc7smp.img (flags: none)

Description Joseph Sacco 2006-12-21 16:06:56 UTC
PowerMac: dual G4 533MHZ, 1GB RAM, 3 SCSI drives

The 2.6.19 series of kernels do not boot on a 32-bit PowerMac with SCSI drives.
See attachment.

A similar problem reported for lintel boxes has been resolved by updating
mkinitrd.


-Joseph

Comment 1 Joseph Sacco 2006-12-21 16:06:57 UTC
Created attachment 144195
boot ssequence [most of it...]

Comment 2 Joseph Sacco 2006-12-22 16:27:39 UTC
The problem persists with 2.6.19-1.2891.fc7smp.

FWIW:
* 2.6.18-1.2849.fc6smp boots OK

* yaboot.conf entry for the 2.6.19 kernel is identical in form to the entry for
the 2.6.18 kernel

* replacing lvm2-2.02.17-1.fc7 -> lvm2-2.02.06-4 did not help
[LVM is being used]

* replacing sym53c8xx.ko with the version from 2.6.18-1.2849.fc6smp did not help


-Joseph

Comment 3 Joseph Sacco 2006-12-27 13:54:22 UTC
The source for kernel-2.6.19.1, downloaded from kernel.org, builds and runs.  

A kernel built using the config file from kernel-2.6.19-1.2891.fc7 successfully
booted up and ran.  I noticed that the config file lacked some of the
network modules for NAT.  I rebuilt the kernel using the config file from
2.6.18-1.2868.fc6smp.


-Joseph

Comment 4 Joseph Sacco 2007-01-04 14:46:42 UTC
The problem persists with 2.6.19-1.2904.fc7smp.

-Joseph

Comment 5 Joseph Sacco 2007-01-08 16:54:39 UTC
The problem persists with 2.6.19-1.2906.fc7smp.

-Joseph

Comment 6 Joseph Sacco 2007-01-11 15:25:55 UTC
The problem persists with 2.6.19-1.2909.fc7smp.

-Joseph

Comment 7 Joseph Sacco 2007-01-13 18:45:40 UTC
The problem persists with 2.6.19-1.2911.fc7smp.

-Joseph

Comment 8 Émeric Maschino 2007-01-14 16:42:16 UTC
Hi! I have the exact same problem with a 64-bit Intel Itanium workstation (hp
workstation zx6000) sporting a SCSI drive subsystem. No problem with the 2.6.18
series kernels. Since this problem seems not to be limited to 32-bit PowerMac
systems with SCSI drives, could you please change the bug title accordingly? Thanks.

Comment 9 Joseph Sacco 2007-01-14 17:13:00 UTC
OK... Done.

-Joseph

Comment 10 Émeric Maschino 2007-01-15 22:50:45 UTC
Thanks. The problem persists with kernel 2.6.19-1.2912.fc7 on my Itanium
workstation.

Comment 11 Joseph Sacco 2007-01-15 22:57:34 UTC
Same here... The problem persists with 2.6.19-1.2912.fc7smp.

-Joseph

Comment 12 Joseph Sacco 2007-01-16 19:23:23 UTC
For what it's worth... The current stable kernel source from kernel.org
[10Jan07] linux-2.6.19.2, builds and runs.

That being said, I have renamed this bug report to "2.6.20 series kernels do
not boot on systems with SCSI drives".


-Joseph

Comment 13 Joseph Sacco 2007-01-25 14:22:48 UTC
The problem persists with 2.6.19-1.2913.fc7smp.

-Joseph

Comment 14 Joseph Sacco 2007-01-28 20:05:40 UTC
The problem persists with 2.6.19-1.2914.fc7smp.

Sigh...

-Joseph

Comment 15 Joseph Sacco 2007-01-31 15:21:51 UTC
The problem persists with 2.6.19-1.2916.fc7smp.

-Joseph

Comment 16 Joseph Sacco 2007-02-01 15:26:44 UTC
The problem persists with 2.6.19-1.2917.fc7smp.

-Joseph

Comment 17 Chuck Ebbert 2007-02-01 15:41:09 UTC
Please attach bootlog from a working kernel.

Comment 18 Joseph Sacco 2007-02-01 15:55:26 UTC
Created attachment 147108
/var/log/messages [compresssed]

Attachment contains a compressed copy of the boot sequence recorded in
/var/log/messages.

-Joseph

Comment 19 Chuck Ebbert 2007-02-01 16:39:49 UTC
That's really strange: 2.6.20 doesn't detect any SCSI drives.


Comment 20 Joseph Sacco 2007-02-01 17:17:50 UTC
Created attachment 147122
contents of initrd for kernel-2.6.19.2

Comment 21 Joseph Sacco 2007-02-01 17:18:45 UTC
Created attachment 147123
contents of initrd for kernel-2.6.19-1.2917.gc7smp [does not boot]

Comment 22 Joseph Sacco 2007-02-01 17:22:45 UTC
Chuck,

Welcome to my world... :-)

The initrd files are compressed CPIO archives.  I have listings for two:

kernel-2.6.19.2 ==> boots
kernel-2.6.19-1.2917.fc7smp ==> does *not* boot

I see that both contain the SCSI modules. The question now is why isn't the
2.6.19-1.2917 ramdisk loading the SCSI drivers?

-Joseph
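
For comparing two initrd images like the ones in comment 22, a listing can be produced without unpacking anything. The following is a minimal Python sketch, assuming the image is a gzip-compressed cpio archive in the "newc" format; the function names and any path passed to `list_initrd` are illustrative and not part of any tool mentioned in this report.

```python
import gzip

HEADER_LEN = 110                 # 6-byte magic + 13 fields of 8 hex digits
MAGIC = (b"070701", b"070702")   # newc, and newc with checksum


def _pad4(n):
    """Round n up to the next multiple of 4 (newc alignment)."""
    return (n + 3) & ~3


def list_cpio_names(data):
    """Return the file names stored in an uncompressed newc cpio archive."""
    names = []
    pos = 0
    while pos + HEADER_LEN <= len(data):
        header = data[pos:pos + HEADER_LEN]
        if header[:6] not in MAGIC:
            raise ValueError("not a newc cpio header at offset %d" % pos)
        # Each field is 8 ASCII hex digits; index 6 is filesize, 11 is namesize.
        fields = [int(header[6 + 8 * i:14 + 8 * i], 16) for i in range(13)]
        filesize, namesize = fields[6], fields[11]
        name_start = pos + HEADER_LEN
        # namesize counts the trailing NUL, which is dropped here.
        name = data[name_start:name_start + namesize - 1].decode()
        if name == "TRAILER!!!":
            break
        names.append(name)
        data_start = _pad4(name_start + namesize)
        pos = _pad4(data_start + filesize)
    return names


def list_initrd(path):
    """List the contents of a gzip-compressed initrd image."""
    with gzip.open(path, "rb") as fh:
        return list_cpio_names(fh.read())
```

Sorting the two resulting lists and diffing them shows which modules one image has that the other lacks; it says nothing about whether the init script inside actually loads them.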

Comment 23 Émeric Maschino 2007-02-02 08:28:59 UTC
Even better than that. I did a fresh reinstall on my Itanium workstation last 
night: the system doesn't boot and no initrd file was created! The elilo.conf 
file (Linux loader configuration file on Itanium systems) only contains the 
following lines:

image=vmlinuz-2.6.19-1.2917.fc7
    label=linux
    read-only
    root=/dev/VolGroup00/LogVol00
    append="rhgb quiet"

And the EFI subsystem only stores the vmlinux-2.6.19-1.2917.fc7, elilo.efi and 
elilo.conf files.

Comment 24 Matthew Miller 2007-02-02 12:13:22 UTC
Try running mkinitrd with various forced options. I've been having to do this on
x86_64 for a while.

Comment 25 Émeric Maschino 2007-02-02 15:19:35 UTC
(In reply to comment #24)
> Try running mkinitrd with various forced options. I've been having to do this on
> x86_64 for a while.

Are you thinking of a particular option I should try?

Comment 26 Émeric Maschino 2007-02-02 19:45:11 UTC
(In reply to comment #25)

> Are you thinking of a particular option I should try?

Well, I simply ran

mkinitrd /boot/efi/efi/redhat/initrd-2.6.19-1.2917.fc7.img 2.6.19-1.2917.fc7

and it did the trick for me.

So it seems that something necessary was performed during the install process. I
know, this is not always a possible option ;-)

Comment 27 Joseph Sacco 2007-02-02 21:02:25 UTC
Regenerating initrd using:

   * mkinitrd with default arguments 

or

   * new-kernel-pkg --package kernel --mkinitrd --depmod --install 2.6.19-1.a.b.c

does not resolve the problem on my linuxPPC.


-Joseph

Comment 28 Joseph Sacco 2007-02-05 17:39:16 UTC
The problem persists with 2.6.20-1.2922.fc7smp.  Sigh...

-Joseph

Comment 29 Chuck Ebbert 2007-02-05 20:10:09 UTC
Try booting the non-working kernel with the kernel parameter:

    scsi_logging_level=0xe00000

I *think* this will log highlevel SCSI events.
(It's not documented very well.)


Comment 30 Kai Engert (:kaie) (inactive account) 2007-02-05 20:49:43 UTC
Is this the same issue as bug 218444?
(Because the attached logfile also lists a missing /dev/root as the problem)


Comment 31 Matthew Miller 2007-02-05 20:52:40 UTC
It may be the same issue, or a different issue with the same symptoms. Actually,
we may be talking about a dozen different issues which appear similar. *sigh*.

Comment 32 Joseph Sacco 2007-02-06 18:46:58 UTC
Created attachment 147502
configuration file used to build 2.6.20 kernel and modules.

Comment 33 Joseph Sacco 2007-02-06 18:48:37 UTC
I have managed to build a 2.6.20 kernel that boots and runs on a 32-bit PowerMac
with SCSI drives.

What I did:
* fetch and unpack the 2.6.20 tarball from kernel.org

* run 'make menuconfig'
I started with a config file that worked with the 2.6.19 series kernels and
made two manual changes:
- reduce number of CPUs to 2
- configure the 8250 serial driver to load as a module
[see https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=155895]

* run 'make'
The build completes for the kernel but fails for the modules.
     Building modules, stage 2.
     MODPOST 1215 modules
   WARNING: "eth_io_copy_and_sum" [drivers/net/smc-ultra.ko] undefined!

 The problem is a change in an include file

   asm-ppc/io.h

which I patched after looking at the previous version:

--- io.h-       2007-02-05 21:52:16.000000000 -0500
+++ io.h        2007-02-06 09:47:30.000000000 -0500
@@ -729,6 +729,7 @@
 }
 
 #define page_to_bus(page)      (page_to_phys(page) + PCI_DRAM_OFFSET)
+#define eth_io_copy_and_sum(a,b,c,d)            eth_copy_and_sum((a),(void __force *)(void __iomem *)(b),(c),(d))
 
 #endif /* CONFIG_PPC32 */

[The latest rawhide SRPM also patched this file among other things.]

With this change the module build completes.

* run 'make modules_install'

* generate initrd file
/sbin/new-kernel-pkg --package kernel --mkinitrd --depmod --install 2.6.20

* reboot and watch the fun...

==> Success!!!

Discussion
----------
So what's different?  The first thing that comes to mind is mkinitrd was updated
today.  However, I do not believe that mkinitrd was the problem:

* I previously built 2.6.19.2 under rawhide using the kernel.org source
* I was able to build 2.6.20 under FC6.
* Regenerating initrd files for both 2.6.19-1.2917.fc7smp and
2.6.20-1.2922.fc7smp did not allow them to boot.

So... What else could it be? Configuration? Patches?

I have attached a copy of the configuration file I used.  Maybe sharper eyes can
see something I missed when comparing it with the rawhide configuration files.

-Joseph


Comment 34 Joseph Sacco 2007-02-07 14:24:35 UTC
Created attachment 147560
configuration file used to build 2.6.20 kernel and modules.

The kernel I built the other day lacked NAT support. I rebuilt the kernel
starting the configuration process with the default configuration for a pmac32

    linux-2.6.20/arch/powerpc/configs/pmac32_defconfig

The build was uneventful.  The new kernel with NAT support boots up and runs.

-Joseph

Comment 35 Kai Engert (:kaie) (inactive account) 2007-02-07 22:37:49 UTC
(In reply to comment #29)
> Try booting the non-working kernel with the kernel parameter:
> 
>     scsi_logging_level=0xe00000
> 
> I *think* this will log highlevel SCSI events.
> (It's not documented very well.)


Does not give any additional information for me.

With both 2.6.19 and 2.6.20 kernels, I get the same broken behaviour, complaints
about missing /dev/root.

Maybe the following gives you a clue? Before the complaint about missing
/dev/root, I get the following:

- ... scsi driver output ...
- trying to resume from /dev/sdb2 (my swap partition)
- unable to access resume device (/dev/sdb2)

Does this mean, the kernel "knows" there are scsi devices, but is unable to
access them?


FWIW, I just tried a Knoppix 5.1.1 DVD, which uses a 2.6.19 kernel. Boots up
fine and is able to access the scsi disk.


Comment 36 Kai Engert (:kaie) (inactive account) 2007-02-07 23:37:26 UTC
(In reply to comment #29)
> Try booting the non-working kernel with the kernel parameter:
> 
>     scsi_logging_level=0xe00000
> 
> I *think* this will log highlevel SCSI events.
> (It's not documented very well.)


I tried various kernel boot params, but NONE of them gave additional scsi output
on the console :-/

broken2.txt:Kernel command line: ro root=/dev/sdb3 console=ttyS0,9600n8
console=tty0 scsi_logging_level=0xe00000 1

broken3.txt:Kernel command line: ro root=/dev/sdb3 console=ttyS0,9600n8
console=tty0 scsi_logging_level=0xffffff 1

broken4.txt:Kernel command line: ro root=/dev/sdb3 console=ttyS0,9600n8
console=tty0 scsi_logging_level=0xffffffff 1
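
The three values tried above differ only in which logging facilities they switch on. A minimal Python sketch of decoding them, assuming the bit layout of the kernel's scsi_logging.h (ten facilities, three bits each, starting at bit 0). The facility order below is reproduced from memory and should be checked against the source of the kernel being booted; the function names are illustrative.

```python
# Facility names in bit order, 3 bits each (assumed layout, see the note above).
FACILITIES = [
    "error", "timeout", "scan", "mlqueue", "mlcomplete",
    "llqueue", "llcomplete", "hlqueue", "hlcomplete", "ioctl",
]


def decode_scsi_logging_level(value):
    """Map a scsi_logging_level value to {facility: level 0..7}."""
    return {
        name: (value >> (3 * index)) & 0x7
        for index, name in enumerate(FACILITIES)
    }


def enabled(value):
    """Return only the facilities that have a non-zero level."""
    return {k: v for k, v in decode_scsi_logging_level(value).items() if v}
```

Under that assumed layout, 0xe00000 sets only hlqueue to level 7, 0xffffff sets the first eight facilities to 7, and 0xffffffff sets all ten. None of this has any effect unless the kernel was built with CONFIG_SCSI_LOGGING, which may be one reason no extra output appeared.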


Comment 37 Kai Engert (:kaie) (inactive account) 2007-02-07 23:42:12 UTC
Just for kicks, I had tried to install the i586 kernel to see whether it makes a
difference, but no.

I am about to attach two log files from the external console.
workinglog.txt is from booting the last rawhide fc6 kernel.
brokenlog.txt is from booting the latest rawhide fc7 kernel, currently i586
installed.

I'll also attach the diff between them.
We get a kernel panic in the middle of scsi module init, I think.

But there is another difference that surprises me.
It appears the new kernel is not loading the IDE driver either?

This might explain why the F7 test 1 LiveCD does not work on that system (which
uses an IDE drive). I mentioned that in bug 218444 comment 12.



Comment 38 Kai Engert (:kaie) (inactive account) 2007-02-07 23:45:56 UTC
Created attachment 147613
Kai's working fc6 boot log

Comment 39 Kai Engert (:kaie) (inactive account) 2007-02-07 23:46:56 UTC
Created attachment 147614
Kai's broken fc7 boot log

Comment 40 Kai Engert (:kaie) (inactive account) 2007-02-07 23:47:44 UTC
Created attachment 147615
Kai's boot log diff

Comment 41 Kai Engert (:kaie) (inactive account) 2007-02-08 00:30:56 UTC
diff -uw rpmbuild-kernel-working/SOURCES/kernel-2.6.18-i686.config
rpmbuild-kernel-broken/SOURCES/kernel-2.6.20-i686.config

gives

-CONFIG_IDE=y
-CONFIG_IDE_GENERIC=y
+# CONFIG_IDE is not set

Does it make sense that the latest rawhide kernel has IDE disabled?


Looking at Joseph's working config file, he has CONFIG_IDE enabled.
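
A textual diff of two kernel config files is noisy because options move around between versions. A minimal Python sketch that compares them option by option instead; the function names are illustrative, and treating "# CONFIG_X is not set" as the value "n" is a simplification chosen for this example.

```python
def parse_kernel_config(text):
    """Return {option: value}; '# CONFIG_X is not set' is recorded as 'n'."""
    suffix = " is not set"
    options = {}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("# CONFIG_") and line.endswith(suffix):
            options[line[2:-len(suffix)]] = "n"
        elif line.startswith("CONFIG_") and "=" in line:
            name, value = line.split("=", 1)
            options[name] = value
    return options


def config_changes(old_text, new_text):
    """Return {option: (old value, new value)} for every option that differs.

    None means the option does not appear in that file at all.
    """
    old = parse_kernel_config(old_text)
    new = parse_kernel_config(new_text)
    changes = {}
    for name in sorted(set(old) | set(new)):
        before, after = old.get(name), new.get(name)
        if before != after:
            changes[name] = (before, after)
    return changes
```

Applied to the two lines quoted in comment 41, this would report CONFIG_IDE going from "y" to "n" and CONFIG_IDE_GENERIC disappearing from the file.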


Comment 42 Kai Engert (:kaie) (inactive account) 2007-02-08 01:25:52 UTC
Created attachment 147622
Kai's initrd contents of broken kernel

00:11.1 IDE interface: VIA Technologies, Inc.
VT82C586A/B/VT82C686/A/B/VT823x/A/C PIPC Bus Master IDE (rev 06)

DaveJ said the initrd might need to have pata_via, but it does not.

Comment 43 Kai Engert (:kaie) (inactive account) 2007-02-08 01:42:12 UTC
Created attachment 147623
Kai's mkinitrd -v log

mkinitrd -f -v /boot/initrd-2.6.20-1.2922.fc7.img 2.6.20-1.2922.fc7 2>&1 | tee initrdlog

Comment 44 Kai Engert (:kaie) (inactive account) 2007-02-08 01:54:53 UTC
Created attachment 147625
Kai's mkinitrd -v --preload=sym53c8xx --preload=pata_via log

Comment 45 Kai Engert (:kaie) (inactive account) 2007-02-08 01:56:56 UTC
Created attachment 147626
Kai's working fc7 boot log w/ mkinitrd --preload=sym53c8xx --preload=pata_via

Comment 46 Joseph Sacco 2007-02-08 15:56:18 UTC
I have built a kernel from the 2.6.20-1.2922 source that boots and runs.

What I did:
* unpack the src RPM
* run 'rpmbuild -bp'
* move linux-2.6.20.ppc /usr/src/kernels
* run 'make mrproper'
* copy .config from successful 2.6.20 build
[see attachment #147560]
* run 'make'
The build fails to compile drivers/md/md.c
because of Patch1793: linux-2.6-raid-autorun.patch

* compare current .config file with kernel-2.6.20-ppc-smp.config
==> notice differences in Multi-device support (RAID and LVM)

* run 'make menuconfig' using current .config file
alter Multi-device support (RAID and LVM) to match what is in
kernel-2.6.20-ppc-smp.config

* restart 'make'
==> build completes

* install kernel, System.map, modules

* generate initrd file
/sbin/new-kernel-pkg --package kernel-smp --mkinitrd --depmod --install 2.6.20-1.2922.smp

* reboot

-Joseph





Comment 47 Joseph Sacco 2007-02-08 15:59:25 UTC
Created attachment 147661
configuration file used to build 2.6.20-1.2922smp kernel and modules.

Comment 48 Kai Engert (:kaie) (inactive account) 2007-02-09 01:09:49 UTC
I was told that it is intended to have CONFIG_IDE disabled in the latest
kernels, because CONFIG_ATA is supposed to replace it.

As I said earlier:

(In reply to comment #44)
> Created an attachment (id=147625)
> Kai's mkinitrd -v --preload=sym53c8xx --preload=pata_via log


these options, in addition to standard mkinitrd options, produced a booting
system for me. However, there are many different modules that a system might
require on boot, and you might not know the name of the module to preload.

Playing with mkinitrd, I found that giving options
  --force-ide-probe --force-scsi-probe
also added the missing pata_via and libata modules to the initrd image that I
require to boot.

I'm neither a kernel nor a mkinitrd expert, so I don't know if the following is
generally reasonable advice. But maybe you could try to produce an initrd image
using the force options, and see if you can boot that way.

If you're curious whether it makes any difference, you could compare the output
of "mkinitrd -v" runs with and without the force options.


Comment 49 Joseph Sacco 2007-02-09 16:09:33 UTC
Regenerating initrd has not helped. 

Neither choice of additional mkinitrd options:

* --preload=sym53c8xx --preload=pata_via
* --force-ide-probe --force-scsi-probe

results in a bootable 2.6.20-1.2922.fc7smp kernel on my system.

-Joseph

Comment 50 Joseph Sacco 2007-02-10 03:37:37 UTC
For what it's worth... I built 2.6.20-git4 using the source from kernel.org. 
After patching asm-powerpc/io.h,  the kernel and modules build and run.

-Joseph

Comment 51 Joseph Sacco 2007-02-10 22:19:13 UTC
Built 2.6.20-git5 using the source from kernel.org.  The saga continues...


-Joseph

Comment 52 Joseph Sacco 2007-02-14 14:46:19 UTC
More of the same... The problem persists with 2.6.20-1.2925.fc7smp.

-Joseph

Comment 53 Jarod Wilson 2007-02-14 21:07:31 UTC
A "me too" report, but a 32-bit x86 box with aic7xxx scsi controller, and --preload aic7xxx doesn't help 
any. Will poke box more when I can actually get into the office...

Comment 54 Joseph Sacco 2007-02-15 14:54:21 UTC
2.6.20-1.2930.fc7smp: more of the same.

-Joseph

Comment 55 Jarod Wilson 2007-02-15 15:54:24 UTC
I'm seeing two different problems comparing the console output from
2.6.19-1.2895.fc6 and 2.6.20-1.2925.fc7 (will attach both momentarily).

First up, with the fc7 kernel, only one of the two scsi drives in this system
even shows up in the console output.

Second, with the fc7 kernel, the one drive that is showing up in the console
output is never assigned to sda like it is under fc6.


Under the fc6 kernel:

scsi0 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 7.0
        <Adaptec aic7890/91 Ultra2 SCSI adapter>
        aic7890/91: Ultra2 Wide Channel A, SCSI Id=7, 32/253 SCBs

scsi 0:0:0:0: Direct-Access     ZJCS     ZJCS2-36GB       S5BS PQ: 0 ANSI: 3
scsi0:A:0:0: Tagged Queuing enabled.  Depth 4
 target0:0:0: Beginning Domain Validation
 target0:0:0: wide asynchronous
 target0:0:0: FAST-40 WIDE SCSI 80.0 MB/s ST (25 ns, offset 63)
 target0:0:0: Domain Validation skipping write tests
 target0:0:0: Ending Domain Validation
SCSI device sda: 71687340 512-byte hdwr sectors (36704 MB)
sda: Write Protect is off
SCSI device sda: drive cache: write back
SCSI device sda: 71687340 512-byte hdwr sectors (36704 MB)
sda: Write Protect is off
SCSI device sda: drive cache: write back
 sda: sda1 sda2 sda3 sda4 < sda5 sda6 >
sd 0:0:0:0: Attached scsi disk sda


Under the fc7 kernel:

Loading aic7xxx.ko module
scsi0 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 7.0
        <Adaptec aic7890/91 Ultra2 SCSI adapter>
        aic7890/91: Ultra2 Wide Channel A, SCSI Id=7, 32/253 SCBs

scsi 0:0:0:0: Direct-Access     ZJCS     ZJCS2-36GB       S5BS PQ: 0 ANSI: 3
scsi0:A:0:0: Tagged Queuing enabled.  Depth 4
 target0:0:0: Beginning Domain Validation
 target0:0:0: wide asynchronous
 target0:0:0: FAST-40 WIDE SCSI 80.0 MB/s ST (25 ns, offset 63)
 target0:0:0: Domain Validation skipping write tests
 target0:0:0: Ending Domain Validation
Loading uhci-hcd.ko module
[... and so on, without any mention of sda ...]

Comment 56 Jarod Wilson 2007-02-15 16:01:44 UTC
Created attachment 148114
Working FC6 kernel boot log, aic7xxx adapter

Comment 57 Jarod Wilson 2007-02-15 16:02:50 UTC
Created attachment 148115
Failing FC7 kernel boot log, aic7xxx adapter

Comment 59 Matthew Miller 2007-02-15 16:24:38 UTC
I'm changing this to mkinitrd, because I think that's more likely to get this on
the radar of the people who can address the problem.


For me, running mkinitrd with "--with=libata --with=ata_piix" fixes the issue.

Comment 60 Jarod Wilson 2007-02-15 16:42:39 UTC
Ah, I stupidly missed the part where Kai got his system working on an fc7 kernel
w/some extra mkinitrd flags to include more ata bits. Trying something similar
here now...

Okay, with ata_piix added to the mix, I get a bit further -- at least both scsi
disks are seen, but still kernel panicking. Why ata bits are required to see
scsi disks seems, uh, wrong, but... Will poke at it more later...


Last bits of boot log:
SCSI device sda: write cache: enabled, read cache: enabled, doesn't support DPO
or FUA
SCSI device sda: 71687340 512-byte hdwr sectors (36704 MB)
sda: Write Protect is off
md: Autodetecting RAID arrays.
md: autorun ...
md: ... autorun DONE.
SCSI device sda: write cache: enabled, read cache: enabled, doesn't support DPO
or FUA
 sda:Trying to resume from LABEL=SWAP-sda3
 sda1 sda2 sda3 sda4 < sda5 sda6 >
sd 0:0:0:0: Attached scsi disk sda
SCSI device sdb: 71687340 512-byte hdwr sectors (36704 MB)
Unable to access resume device (LABEL=SWAP-sda3)
Creating root device.
sdb: Write Protect is off
SCSI device sdb: write cache: enabled, read cache: enabled, doesn't support DPO
or FUA
SCSI device sdb: 71687340 512-byte hdwr sectors (36704 MB)
sdb: Write Protect is off
SCSI device sdb: write cache: enabled, read cache: enabled, doesn't support DPO
or FUA
 sdb: sdb1 sdb2 sdb3 sdb4 <Mounting root filesystem.
 sdb5 sdb6 >
sd 0:0:8:0: Attached scsi disk sdb
EXT3-fs: unable to read superblock
mount: error mounting /dev/root on /sysroot as ext3: Invalid argument
Setting up other filesystems.
Setting up new root fs
setuproot: moving /dev failed: No such file or directory
no fstab.sys, mounting internal defaults
setuproot: error mounting /proc: No such file or directory
setuproot: error mounting /sys: No such file or directory
Switching to new root and running init.Kernel panic - not syncing: Attempted to
kill init!

unmounting old /dev
unmounting old /proc
unmounting old /sys
switchroot: mount failed: No such file or directory


Comment 61 Joseph Sacco 2007-02-15 19:54:22 UTC
For what it's worth... I am now running 2.6.20-git11 built using the source from
kernel.org. I started with the default configuration for pmac32 and tweaked it
in a couple of places.  I did not change anything associated with SCSI or LVM.


-Joseph


Comment 62 Chuck Ebbert 2007-02-15 19:55:19 UTC
I thought of something to try: add some delay in the initscript
on the initrd. It almost looks like the sd driver is still scanning when
init tries to mount the root fs.

Comment 63 Matthew Miller 2007-02-15 20:25:03 UTC
Chuck, you might be on to something there. I notice a definite delay while
libata fires up -- a delay that doesn't happen with that not included.

Comment 64 Joseph Sacco 2007-02-16 14:10:02 UTC
More of the same... The problem persists with 2.6.20-1.2932.fc7smp.

-Joseph

Comment 65 Jarod Wilson 2007-02-16 14:34:30 UTC
(In reply to comment #62)
> I thought of something to try: add some delay in the initscript
> on the initrd. It almost looks like the sd driver is still scanning when
> init tries to mount the root fs.

Bingo. Added a bit of a delay after loading modules in the initrd, and now
everything is coming up as expected with the fc7 kernel. It looks like there's
already some code that's *supposed* to do this for you in the initrd:

echo Waiting for driver initialization.
stabilized --hash --interval 250 /proc/scsi/scsi

For whatever reason though, that doesn't appear to be cutting it. (A 5 second
sleep right before the 'stabilized' bit is what got me booted).

Comment 66 Matthew Miller 2007-02-16 14:45:49 UTC
Was just looking at that too. "stabilized" is an undocumented nash builtin.
Looks like Jeremy Katz and Peter Jones added the calls to the init script in
versions 6.0.4 and 6.0.5 for pata and ahci/sata_*. This appears to read the
given file (in this case, /proc/scsi/scsi) and loop until it stops changing.

What if you add a "stabilized" call just like the above right after the
aic7xxx.ko module is loaded, instead of the sleep?
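
The behaviour described in comment 66 (re-read a file until it stops changing) can be sketched as follows. This is a Python illustration of the idea only, not nash's implementation: the function name, the use of SHA-1, and the `max_wait` timeout are assumptions, and what the real builtin does with `--hash` and `--interval` is not documented in this report.

```python
import hashlib
import time


def wait_until_stable(path, interval=0.25, max_wait=30.0):
    """Poll `path` until two consecutive reads have the same hash.

    Returns True once the content is stable, False if max_wait expires.
    """
    def digest():
        try:
            with open(path, "rb") as fh:
                return hashlib.sha1(fh.read()).hexdigest()
        except OSError:
            return None  # a missing file is treated as "still changing"

    deadline = time.monotonic() + max_wait
    previous = digest()
    while time.monotonic() < deadline:
        time.sleep(interval)
        current = digest()
        if current is not None and current == previous:
            return True
        previous = current
    return False
```

The sketch makes one limitation visible: a single quiet interval is taken as proof that scanning has finished. If a driver pauses for longer than the interval between discovering devices, the check can pass too early, which may be what the logs in this report show.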

Comment 67 Jarod Wilson 2007-02-16 14:57:05 UTC
Hrm... I suppose I could try throwing another one of those in there, but I'd
figured the one that was already there should have been covering me for both
ata_piix and aic7xxx. Here's a bit more of what my init looked like before
adding the sleep:

mkblkdevs
echo "Loading scsi_mod.ko module"
insmod /lib/scsi_mod.ko 
echo "Loading sd_mod.ko module"
insmod /lib/sd_mod.ko 
echo "Loading scsi_transport_spi.ko module"
insmod /lib/scsi_transport_spi.ko 
echo "Loading aic7xxx.ko module"
insmod /lib/aic7xxx.ko 
echo "Loading libata.ko module"
insmod /lib/libata.ko 
echo "Loading ata_piix.ko module"
insmod /lib/ata_piix.ko 
echo Waiting for driver initialization.
stabilized --hash --interval 250 /proc/scsi/scsi


I'll throw another stabilized in after the aic7xxx insmod, leave the sleep out
and see what happens...

Comment 68 Jarod Wilson 2007-02-16 15:31:17 UTC
Okay, so for this round, I bumped to kernel 2932. I left out the ata_piix and
libata modules, since there's nothing hooked up via ata in this system. I added
the 'stabilized' command just after aic7xxx is insmod'd, and there's a noticeable
pause there, but still no dice. Still fails to get the drives all the way up.

----------
Loading aic7xxx.ko module
scsi0 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 7.0
        <Adaptec aic7890/91 Ultra2 SCSI adapter>
        aic7890/91: Ultra2 Wide Channel A, SCSI Id=7, 32/253 SCBs

scsi 0:0:0:0: Direct-Access     ZJCS     ZJCS2-36GB       S5BS PQ: 0 ANSI: 3
scsi0:A:0:0: Tagged Queuing enabled.  Depth 4
Waiting for driv target0:0:0: Beginning Domain Validation
er initialization.
 target0:0:0: wide asynchronous
 target0:0:0: FAST-40 WIDE SCSI 80.0 MB/s ST (25 ns, offset 63)
 target0:0:0: Domain Validation skipping write tests
 target0:0:0: Ending Domain Validation
scsi 0:0:4:0: CD-ROM            PLEXTOR  CD-R   PX-W8220T 1.00 PQ: 0 ANSI: 2
 target0:0:4: Beginning Domain Validation
 target0:0:4: FAST-10 SCSI 10.0 MB/s ST (100 ns, offset 8)
 target0:0:4: Domain Validation skipping write tests
 target0:0:4: Ending Domain Validation
scsi 0:0:5:0: CD-ROM            TOSHIBA  DVD-ROM SD-M1201 1011 PQ: 0 ANSI: 2
 target0:0:5: Beginning Domain Validation
 target0:0:5: FAST-20 SCSI 20.0 MB/s ST (50 ns, offset 16)
 target0:0:5: Domain Validation skipping write tests
 target0:0:5: Ending Domain Validation
scsi 0:0:8:0: Direct-Access     IBM      DDYS-T36950N     S96H PQ: 0 ANSI: 3
scsi0:A:8:0: Tagged Queuing enabled.  Depth 4
 target0:0:8: Beginning Domain Validation
 target0:0:8: wide asynchronous
 target0:0:8: FAST-40 WIDE SCSI 80.0 MB/s ST (25 ns, offset 63)
 target0:0:8: Domain Validation skipping write tests
 target0:0:8: Ending Domain Validation
md: Autodetecting RAID arrays.
md: autorun ...
md: ... autorun DONE.
Trying to resume from LABEL=SWAP-sda3
Unable to access resume device (LABEL=SWAP-sda3)
Creating root device.
Mounting root filesystem.
EXT3-fs: unable to read superblock
mount: error mounting /dev/root on /sysroot as ext3: Invalid argument
Setting up other filesystems.
Setting up new root fs
setuproot: moving /dev failed: No such file or directory
no fstab.sys, mounting internal defaults
setuproot: error mounting /proc: No such file or directory
setuproot: error mounting /sys: No such file or directory
Switching to new root and running init.Kernel panic - not syncing: Attempted to
kill init!


Comment 69 Joseph Sacco 2007-02-16 15:32:49 UTC
Now that 'we' are all exploring the inner workings of initrd, I thought I would
unpack initrd-2.6.20-git11 and initrd-2.6.20-1.2932.fc7smp and compare the contents.

The first thing that jumped out was that the git11 init file does not contain
any scsi instructions. Hmmm... Why is that? A quick look at the kernel config
file and the lib/modules/kernel/drivers/scsi directory shows that the scsi
drivers needed by my system are built directly into the kernel.

So... It would appear that the time to load the drivers for the fc7 kernel may
well be the issue.

-Joseph






Comment 70 Chuck Ebbert 2007-02-16 17:41:41 UTC
Also, see bug #228689 where SAN attached storage needs an 8 second delay.


Comment 71 Kai Engert (:kaie) (inactive account) 2007-02-27 01:44:42 UTC
This seems to have regressed more, at least for me.

As said before, I had been able to get a booting 2.6.20-1.2922 kernel using
  mkinitrd --preload=sym53c8xx --preload=pata_via

After upgrading to 2.6.20-1.2936 the automatically created initrd does not boot.
It stops after "attaching disk /dev/sdc".
(Right before one would usually get "waiting for scsi driver init")

I booted back into 2922 and tried the above mkinitrd again.
This improves things a bit.
The boot process goes on, even arrives at "switching to new root and running init".

However, then my kernel stops again - which might be a different problem?
The last things displayed are messages about USB HID, keyboard + mouse.
While the kernel stops, it's not a "hard" stop.
After nothing happened for 1 minute, I pressed "ctrl-alt-del", and got a
"stopping all devices" and a reboot.

FWIW, I compared the contents of the automatically created initrd-2936 and the
contents of my manually created initrd-2936 (using --preload).

@@ -33,49 +33,49 @@
 mknod /dev/tty11 c 4 11
 mknod /dev/tty12 c 4 12
 mknod /dev/ttyS0 c 4 64
 mknod /dev/ttyS1 c 4 65
 mknod /dev/ttyS2 c 4 66
 mknod /dev/ttyS3 c 4 67
 echo Setting up hotplug.
 hotplug
 echo Creating block device nodes.
 mkblkdevs
+echo "Loading scsi_mod.ko module"
+insmod /lib/scsi_mod.ko
+echo "Loading sd_mod.ko module"
+insmod /lib/sd_mod.ko
+echo "Loading scsi_transport_spi.ko module"
+insmod /lib/scsi_transport_spi.ko
+echo "Loading sym53c8xx.ko module"
+insmod /lib/sym53c8xx.ko
+echo "Loading libata.ko module"
+insmod /lib/libata.ko
+echo "Loading pata_via.ko module"
+insmod /lib/pata_via.ko
+echo Waiting for driver initialization.
+stabilized --hash --interval 250 /proc/scsi/scsi
 echo "Loading uhci-hcd.ko module"
 insmod /lib/uhci-hcd.ko
 echo "Loading ohci-hcd.ko module"
 insmod /lib/ohci-hcd.ko
 echo "Loading ehci-hcd.ko module"
 insmod /lib/ehci-hcd.ko
 mount -t usbfs /proc/bus/usb /proc/bus/usb
 echo "Loading mbcache.ko module"
 insmod /lib/mbcache.ko
 echo "Loading jbd.ko module"
 insmod /lib/jbd.ko
 echo "Loading ext3.ko module"
 insmod /lib/ext3.ko
-echo "Loading scsi_mod.ko module"
-insmod /lib/scsi_mod.ko
-echo "Loading sd_mod.ko module"
-insmod /lib/sd_mod.ko
-echo "Loading libata.ko module"
-insmod /lib/libata.ko
 echo "Loading ata_generic.ko module"
 insmod /lib/ata_generic.ko
-echo "Loading pata_via.ko module"
-insmod /lib/pata_via.ko
-echo Waiting for driver initialization.
-stabilized --hash --interval 250 /proc/scsi/scsi
-echo "Loading scsi_transport_spi.ko module"
-insmod /lib/scsi_transport_spi.ko
-echo "Loading sym53c8xx.ko module"
-insmod /lib/sym53c8xx.ko
 mkblkdevs
 resume /dev/sdb2
 echo Creating root device.
 mkrootdev -t ext3 -o defaults,ro sdc2
 echo Mounting root filesystem.
 mount /sysroot
 echo Setting up other filesystems.
 setuproot
 echo Switching to new root and running init.
 switchroot


Comment 72 Kai Engert (:kaie) (inactive account) 2007-02-27 02:06:05 UTC
kernel 2.6.20-1.2947 gives me exactly the same behaviour as described in comment
71 with 2.6.20-1.2936
(automatic initrd stops in the middle of scsi, preload initrd stops at USB HID).


Comment 73 Joseph Sacco 2007-02-28 20:14:34 UTC
kernel 2.6.20-1.2949: 

     More of the same. 

Sigh...

-Joseph

Comment 74 Joseph Sacco 2007-03-03 16:23:33 UTC
kernel 2.6.20-1.2953, kernel 2.6.20-1.2960:

   Ditto


-Joseph

Comment 75 Matthew Miller 2007-03-04 19:00:58 UTC
*** Bug 228977 has been marked as a duplicate of this bug. ***

Comment 76 Joseph Sacco 2007-03-04 19:09:14 UTC
kernel 2.6.20-1.2962:

       yada, yada, yada...



-Joseph

Comment 77 Joseph Sacco 2007-03-06 14:58:39 UTC
kernel 2.6.20-1.2966:

       no love...



-Joseph



Comment 78 Joseph Sacco 2007-03-08 15:39:28 UTC
kernel 2.6.20-1.2967:

   Still borked.


-Joseph

Comment 79 Matthew Miller 2007-03-08 17:58:42 UTC
So, in lieu of a real fix, the following patch to mkinitrd works around the
problem for me. A better fix would be to find the right file to pass to
"stabilized" (as per comment #65, /proc/scsi/scsi isn't doing it), but for now,
sleeping at least gets my system up.

--- mkinitrd.orig       2007-03-08 12:42:48.000000000 -0500
+++ mkinitrd    2007-03-08 12:44:39.000000000 -0500
@@ -1357,6 +1357,10 @@
         emit "echo Waiting for driver initialization."
         emit "stabilized --hash --interval 250 /proc/scsi/scsi"
     fi
+    if [ "$module" = "aic7xxx" ]; then
+       emit "echo Sleeping for 5 seconds because that seems to work."
+       emit "sleep 5"
+    fi
 done
 unset usb_mounted
 


Comment 80 Joseph Sacco 2007-03-08 19:28:35 UTC
Tried it.  Did not work on my PowerMac.

-Joseph

Comment 81 Matthew Miller 2007-03-08 19:34:56 UTC
If you have a different scsi driver, that's to be expected. Looks like in your
case it's sym53c8xx -- try this instead:

--- mkinitrd.orig       2007-03-08 12:42:48.000000000 -0500
+++ mkinitrd    2007-03-08 12:44:39.000000000 -0500
@@ -1357,6 +1357,10 @@
         emit "echo Waiting for driver initialization."
         emit "stabilized --hash --interval 250 /proc/scsi/scsi"
     fi
+    if [ "$module" == "aic7xxx" -o "$module" == "sym53c8xx" ]; then
+       emit "echo Sleeping for 5 seconds because that seems to work."
+       emit "sleep 5"
+    fi
 done
 unset usb_mounted

Comment 82 Joseph Sacco 2007-03-08 19:39:12 UTC
Tried that also... :-)

                                  ...

        emit "echo Waiting for driver initialization."
        emit "stabilized --hash --interval 250 /proc/scsi/scsi"
    fi
    if [ "$module" = "aic7xxx" ]; then
       emit "echo Sleeping for 5 seconds because that seems to work."
       emit "sleep 5"
    fi
    if [ "$module" = "sym53c8xx" ]; then
       emit "echo Sleeping for 5 seconds because that seems to work."
       emit "sleep 5"
    fi

                           ...

Comment 83 Matthew Miller 2007-03-08 19:45:46 UTC
Can you attach the init file from your mkinitrd?

Comment 84 Joseph Sacco 2007-03-08 20:17:33 UTC
Created attachment 149620 [details]
init file extracted from initrd-2.6.20-1.2967.fc7smp.img

Maybe a 5-second delay is not long enough for an older PowerMac.

-Joseph

Comment 85 Matthew Miller 2007-03-08 20:21:45 UTC
Possible. Or maybe your problem is actually different from the one most of the
rest of us are seeing. Anyway, try something much bigger and see what happens.

Comment 86 Joseph Sacco 2007-03-08 21:14:00 UTC
A ten second delay is what was required. I am now running 2.6.20-1.2967.fc7smp.
Life is better.

Showing that no good deed goes unpunished... The boot sequence initiated an
SELinux relabeling, which took a very long time.


-Joseph
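
The fixed sleeps tried in comments 79 through 86 had to be tuned per machine
(5 seconds on one box, 10 on another). A bounded poll is one alternative. What
follows is only a sketch in plain POSIX shell, not something the mkinitrd of
this era can emit as-is (the generated init is run by nash). The helper name
wait_for_path and the choice of /sys/block/sdc as the path to watch are
assumptions for illustration.

```shell
# Hypothetical helper: poll until a path exists, giving up after a
# timeout in seconds (default 10). Returns 0 if the path appeared,
# 1 if the timeout expired first.
wait_for_path() {
    path=$1
    timeout=${2:-10}
    elapsed=0
    while [ ! -e "$path" ]; do
        if [ "$elapsed" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 0
}

# Example (assumes sysfs is mounted on /sys and the root disk is sdc):
#   wait_for_path /sys/block/sdc 30 || echo "disk sdc never appeared"
```

Unlike a fixed sleep, this returns as soon as the disk is registered and only
costs the full timeout on machines where it never shows up.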

Comment 87 Kai Engert (:kaie) (inactive account) 2007-03-08 23:04:33 UTC
Here's the latest status of my SCSI system:

- default initrd img still does not boot
- adding "just" a delay doesn't help me (I even got dropped into an emergency
filesystem recovery shell)
- I still need --preload to get my system past SCSI init
- combining --preload with the delay does not improve things for me

What I said in comment 71 about 2.6.20-1.2922.fc7 is still my most recent success.

With any newer kernel I still get the USB-HID failure I reported in comment 71.
The "scsi delay" did not influence that.

My current conclusion is:
- my system needs --preload
- my system does not need a delay
- the USB-HID issue is a separate bug?


Comment 88 Matthew Miller 2007-03-21 02:10:48 UTC
This looks remarkably similar to bug #162685 from Fedora Core 3....

Comment 89 Matthew Miller 2007-03-28 14:57:06 UTC
Any news on this from anyone at Red Hat? It seems like this is a
showstopper-level bug for Fedora 7.

Comment 90 Dave Jones 2007-04-04 05:13:58 UTC
I just read the help text for SCSI_SCAN_ASYNC again (which is probably the root
cause of all this)...

          If you have built SCSI as modules, enabling this option can
          be a problem as the devices may not have been found by the
          time your system expects them to have been.  You can load the
          scsi_wait_scan module to ensure that all scans have completed.
          If you build your SCSI drivers into the kernel, then everything
          will work fine if you say Y here.

Peter, are we loading that module? If not, perhaps we should :)
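
For anyone who wants to check whether their kernel was built with this option,
here is a minimal sketch. It assumes the kernel config is available as a text
file (Fedora installs one as /boot/config-<version>); the function name
scan_async_enabled is made up for this example.

```shell
# Hypothetical helper: succeed if the given kernel config file enables
# asynchronous SCSI scanning (CONFIG_SCSI_SCAN_ASYNC=y).
scan_async_enabled() {
    grep -q '^CONFIG_SCSI_SCAN_ASYNC=y' "$1"
}

# Example:
#   if scan_async_enabled "/boot/config-$(uname -r)"; then
#       echo "async scanning is on; modular SCSI needs scsi_wait_scan"
#   fi
```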

Comment 91 James Morris 2007-04-11 14:15:58 UTC
Adding a me too.  Upgraded to FC7 test, which went ok, but doesn't boot (also
reported previously several times when trying to run recent rawhide kernels
under FC6).  I'm using Fusion MPT scsi.

Comment 92 James Morris 2007-04-11 16:03:53 UTC
scsi_wait_scan works for me.

It needs to be loaded after the scsi modules (I hacked my mkinitrd to run a
findmodule for it if the current module is dm-mod).

--- /sbin/mkinitrd.orig 2007-04-11 11:06:31.000000000 -0400
+++ /sbin/mkinitrd      2007-04-11 11:53:19.000000000 -0400
@@ -235,6 +235,9 @@
         findmodule ieee1394
         findmodule ohci1394
         modName="sbp2"
+    elif [ "$modName" = "dm-mod" ]; then
+        findmodule scsi_wait_scan
+        modName="dm-mod"
     elif [ "$modName" = "gfs2" ]; then
         findmodule lock_nolock
         modName="gfs2"


I did NOT need the stabilized line (which alone got the disk detected but the
system then failed to find /dev/root anyway).  I suspect the pata/sata/sleep
stuff people have been using is just accidentally solving the problem, by
causing delays which may or may not be correct.

Now I'm seeing udev take a long time to start up and several of these messages:

Mounting other filesystems:  mount: /dev/sysfs already mounted or /sys busy
mount: according to mtab, sysfs is already mounted on /sys

but at least the system comes up now.


Comment 93 Joseph Sacco 2007-04-11 17:41:11 UTC
James,

I can confirm that your patch for mkinitrd works on a G4 PowerMac with 3 SCSI
drives and ATTO controllers.

-Joseph

Comment 94 Jarod Wilson 2007-04-11 17:43:09 UTC
Works with my aic7xxx box as well.

Comment 95 Matthew Miller 2007-04-11 18:16:20 UTC
Works for me on aic7xxx too. Moving out of needinfo state; let's get this in, yeah?

Comment 96 Joseph Sacco 2007-04-11 19:12:20 UTC
OK... we now have four data points. Should be enough... [:-)]

-Joseph

Comment 97 Jarod Wilson 2007-04-12 16:34:55 UTC
Eep. Minor correction... The patch in comment #92 does NOT work as-is on my
aic7xxx box. Including scsi-wait-scan in the initrd works, but it's not getting
included by that patch, as there is no dm-mod getting pulled into my initrd to
begin with. I'd suggest perhaps the following instead, which works for me:

--- /sbin/mkinitrd.orig 2007-04-12 12:32:31.000000000 -0400
+++ /sbin/mkinitrd      2007-04-12 12:32:50.000000000 -0400
@@ -1004,6 +1004,7 @@
     # RAID controllers with drivers in block/
                 findmodule $n
             done
+        findmodule scsi-wait-scan
         fi
     fi
 fi

I believe that'll pull it in whenever we pull in any SCSI modules at all, rather
than relying on having dm-mod.
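
Since the patch in comment #92 silently left the module out on this box, it is
worth checking what actually ended up in the image. A rough sketch, assuming
the initrd is a gzip-compressed cpio archive; the function name
listing_has_wait_scan is made up here, and it accepts either the hyphen or
underscore spelling of the module file name.

```shell
# Hypothetical helper: read a file listing on stdin (one path per line)
# and succeed if it contains the scsi_wait_scan module.
listing_has_wait_scan() {
    grep -Eq '(^|/)scsi[-_]wait[-_]scan\.ko$'
}

# Example (the image path is whatever mkinitrd wrote on your system):
#   zcat "$INITRD_IMAGE" | cpio -it | listing_has_wait_scan \
#       && echo "scsi_wait_scan is in the initrd"
```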

Comment 98 Jarod Wilson 2007-04-12 16:37:13 UTC
That, or slap that in around line 305:

    # need to handle prescsimods here -- they need to go _after_ scsi_mod
    if [ "$modName" = "scsi_mod" ]; then
        for n in $PRESCSIMODS ; do
            findmodule $n
        done
        findmodule scsi-wait-scan
    fi


Comment 99 Joseph Sacco 2007-04-12 17:10:04 UTC
Jarod,

Your suggested fix does *not* work on my system.

-Joseph

Comment 100 Joseph Sacco 2007-04-12 17:23:33 UTC
The fix in #97 does work on my system. The fix in #98 does not.

-Joseph

Comment 101 Jarod Wilson 2007-04-12 19:29:53 UTC
That's what I get for putting it in there w/o actually trying it first... #98
doesn't work for me either on another box, but #97 does. In any case, Peter
actually has a better fix forthcoming that has been successfully tested on my box.

Comment 102 Joseph Sacco 2007-04-12 20:10:32 UTC
"We" are hopeful... This has been a very long road to travel.

-Joseph

Comment 103 Peter Jones 2007-04-16 22:27:12 UTC
Hopefully this should be addressed in mkinitrd-6.0.9-1 .  

Comment 104 James Morris 2007-04-17 15:47:36 UTC
(In reply to comment #103)
> Hopefully this should be addressed in mkinitrd-6.0.9-1 .  

Appears to work, although udev startup pauses for some minutes waiting on
udev_settle.

Comment 105 Jarod Wilson 2007-04-17 15:52:59 UTC
mkinitrd 6.0.9-1 and kernel 2.6.20-1.3079.fc7 works for me without need for any
tweaks.

Comment 106 Joseph Sacco 2007-04-17 19:07:37 UTC
Confirmed: 
   
     mkinitrd-6.0.9-1 + kernel 2.6.20-1.3079.fc7smp works on my system.


-Joseph

Comment 107 Kai Engert (:kaie) (inactive account) 2007-04-17 20:27:24 UTC
Confirmed: mkinitrd-6.0.9-1 + kernel 2.6.20-1.3079.fc7

I was initially dropped to an emergency shell.
Because I have both IDE and SCSI disks, my boot SCSI disk is now on a different
/dev/sdX node.

I had to manually adjust /etc/fstab.
My system boots up fine now.

(The other issue about USB HID I had reported in comment 71 is now gone too!)


Comment 108 David Woodhouse 2007-04-18 22:46:02 UTC
*** Bug 236475 has been marked as a duplicate of this bug. ***

Comment 109 Kai Engert (:kaie) (inactive account) 2007-04-20 15:38:42 UTC
Maybe I should file a different bug, but:

Kernel 3084 gives me different device-to-/dev/sdX assignments than 3079 :-/

In the past the order of devices has always been stable for me.
Now I can't boot, because fstab wants to mount from /dev/sdc, but suddenly my
boot disk became /dev/sdb.



Comment 110 Dave Jones 2007-04-23 16:11:14 UTC
Your fstab should contain LABEL= lines rather than hardcoded /dev names.
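
As a rough illustration of that change, the sketch below rewrites the device
field of an fstab entry to a LABEL= reference. The function name
fstab_use_label is made up, and the device and label used in the example
(/dev/sdc2 labelled "/") are assumptions; running e2label on the device prints
the label an ext3 filesystem actually carries.

```shell
# Hypothetical helper: filter fstab text on stdin. Wherever the first
# field equals the device in $1, replace it with LABEL=$2.
# Note: awk rebuilds a changed line with single spaces between fields.
fstab_use_label() {
    awk -v dev="$1" -v label="$2" '
        $1 == dev { $1 = "LABEL=" label }
        { print }
    '
}

# Example (writes to a new file for review instead of editing in place):
#   fstab_use_label /dev/sdc2 / < /etc/fstab > /tmp/fstab.new
```

With labels in place, the entry keeps working whether the kernel names the
disk /dev/sdb or /dev/sdc on a given boot.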

Comment 111 Dave Jones 2007-04-23 19:03:47 UTC
*** Bug 228699 has been marked as a duplicate of this bug. ***

Comment 112 Will Woods 2007-05-08 21:49:25 UTC
So, this should be closed, because kernel+mkinitrd are doing the right thing
now... correct?

Comment 113 Joseph Sacco 2007-05-08 22:03:14 UTC
Yes...


-Joseph