Bug 254007
Summary: | [sata_sil] Can't boot because newer kernels can't access SATA disk | ||||||
---|---|---|---|---|---|---|---|
Product: | [Fedora] Fedora | Reporter: | David A. De Graaf <dad> | ||||
Component: | kernel | Assignee: | Jeff Garzik <jgarzik> | ||||
Status: | CLOSED NOTABUG | QA Contact: | Fedora Extras Quality Assurance <extras-qa> | ||||
Severity: | urgent | Docs Contact: | |||||
Priority: | urgent | ||||||
Version: | 7 | CC: | cebbert, chris.brown, davej, juha.anon, kevin, peterm | ||||
Target Milestone: | --- | ||||||
Target Release: | --- | ||||||
Hardware: | i686 | ||||||
OS: | Linux | ||||||
Whiteboard: | |||||||
Fixed In Version: | Doc Type: | Bug Fix | |||||
Doc Text: | Story Points: | --- | |||||
Clone Of: | Environment: | ||||||
Last Closed: | 2008-01-13 18:25:51 UTC | Type: | --- | ||||
Regression: | --- | Mount Type: | --- | ||||
Documentation: | --- | CRM: | |||||
Verified Versions: | Category: | --- | |||||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
Cloudforms Team: | --- | Target Upstream Version: | |||||
Embargoed: | |||||||
Bug Depends On: | |||||||
Bug Blocks: | 172490 | ||||||
Attachments: |
|
Description
David A. De Graaf
2007-08-23 15:58:32 UTC
Hello, I'm reviewing this bug as part of the kernel bug triage project, an attempt to isolate current bugs in the Fedora kernel. http://fedoraproject.org/wiki/KernelBugTriage

I am CC'ing myself to this bug and will try to assist you in resolving it if I can. There hasn't been much activity on this bug for a while. Could you tell me if you are still having problems with the latest kernel? If the problem no longer exists then please close this bug, or I'll do so in a few days if no additional information is lodged.

I am sad to report that this bug remains alive and well. I have just added the latest kernel, vmlinuz-2.6.22.9-91.fc7. It, too, cannot access my SATA disk, which contains important but non-system data files. I have not tried unplugging the SATA cable; I'm pretty sure that would assuage the problem, but it wouldn't allow access to those important filesystems.

Please DO NOT close this bugzilla. This is a show-stopper for me. I'm stuck with vmlinuz-2.6.21-1.3194.fc7, the only Fedora 7 kernel that works. What additional data can I provide?

Okay, thanks for the update. I'm re-assigning this to the SATA maintainer who may wish to review this further. I'm also adding a F8 blocker bug as it might prevent a successful install of the next version of Fedora. To confirm this, it would be helpful if you could download the latest live cd version from: http://torrent.fedoraproject.org/torrents//rawhide-i386-Live-20070925.torrent and see if this detects your disk.

I'd love to try the latest live cd. I spent all day yesterday trying to get a bittorrent download to finish. I restarted it at 1:30, and again it stopped short of completion. Now, 11 hours later, I have

Size: 694.6 MB (728,258,633 bytes)
Transferred: 694.6 MB (728,242,249 bytes)

The actual files received are:

-rw-rw-r-- 1 dad dad 0 2007-10-02 13:32 SHA1SUM
-rw-rw-r-- 1 dad dad 728258560 2007-10-02 17:09 rawhide-i386-Live-20070925.iso

The last ~16K bytes never arrive. I can't find any non-bittorrent way (ftp, rsync, http) to obtain this file. Is there one?

Thanks for adding the "F8 blocker bug" label. It seems appropriate.

Created attachment 215111 [details]
photos of F8 booting
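
The byte counts quoted in the download comment above can be checked with a little arithmetic. The following is only a sketch using the three figures from that comment; the constant and function names are made up for illustration. The reading of the results is an assumption, not something the thread confirms: 16 KiB is a common BitTorrent request block size, and 73 bytes would fit a one-line SHA1SUM file (40 hex digits, two spaces, the 30-character ISO name, and a newline).

```python
# Figures copied from the download comment; names are illustrative only.
TORRENT_SIZE = 728_258_633   # "Size" reported by the client
TRANSFERRED = 728_242_249    # "Transferred" reported by the client
ISO_ON_DISK = 728_258_560    # size of the .iso in the ls -l listing


def byte_gap(total: int, part: int) -> int:
    """Return how many bytes of `total` are not covered by `part`."""
    return total - part


# Bytes the client still had to fetch when the download stalled.
missing = byte_gap(TORRENT_SIZE, TRANSFERRED)

# Torrent payload that is not part of the .iso itself
# (assumption: this is the SHA1SUM file, which is 0 bytes on disk).
outside_iso = byte_gap(TORRENT_SIZE, ISO_ON_DISK)

print(f"missing: {missing} bytes ({missing / 1024:g} KiB)")
print(f"payload outside the .iso: {outside_iso} bytes")
```

The gap comes out at exactly 16384 bytes, which matches the "~16K" in the report. Note that the .iso showing its full size on disk does not by itself prove the file is complete, since a client may allocate the whole file before every block has arrived.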
Please don't attach .gz files, nobody can view them. And jpegs are already compressed...

F8 Test 3 is out any day now which will be available over ftp, http...

I have booted F8 Test 3 Live CD (Fedora-7.92-Live-i686.iso). During the detection phase it failed to detect my third disk, which is a SATA drive, but apparently did detect the two ATA drives. I will refrain from posting photos of the extensive error messages. They were similar to what I've seen with the F7 sata driver.

The GUI interface came up correctly and I did fdisk -l which correctly displayed /dev/sda and /dev/sdb, but not /dev/sdc. fdisk -l also listed two other disks that are a mystery to me:

/dev/dm-0: 4294 MB
/dev/dm-1: 4294 MB

Sadly, the sata driver in F8T3 is still broken and unable to see my SATA disk.

Odd. My main test machine here uses sata_sil and works great... from the bootup messages:

libata version 2.21 loaded.
sata_sil 0000:00:12.0: version 2.3
ACPI: PCI Interrupt 0000:00:12.0[A] -> GSI 22 (level, low) -> IRQ 22
scsi0 : sata_sil
scsi1 : sata_sil
ata1: SATA max UDMA/100 cmd 0xffffc200001f8080 ctl 0xffffc200001f808a bmdma 0xffffc200001f8000 irq 22
ata2: SATA max UDMA/100 cmd 0xffffc200001f80c0 ctl 0xffffc200001f80ca bmdma 0xffffc200001f8008 irq 22
input: PS/2 Logitech Mouse as /class/input/input1
usb 2-4: new full speed USB device using ohci_hcd and address 2
usb 2-4: configuration #1 chosen from 1 choice
ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata1.00: ATA-7: SAMSUNG SP2504C, VT100-33, max UDMA7
ata1.00: 488397168 sectors, multi 16: LBA48 NCQ (depth 0/32)
ata1.00: configured for UDMA/100
ata2: SATA link down (SStatus 0 SControl 300)
scsi 0:0:0:0: Direct-Access ATA SAMSUNG SP2504C VT10 PQ: 0 ANSI: 5
sd 0:0:0:0: [sda] 488397168 512-byte hardware sectors (250059 MB)
sd 0:0:0:0: [sda] Write Protect is off
sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sd 0:0:0:0: [sda] 488397168 512-byte hardware sectors (250059 MB)
sd 0:0:0:0: [sda] Write Protect is off
sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sda: sda1 sda2
sd 0:0:0:0: [sda] Attached SCSI disk

lspci: 00:12.0 IDE interface: ATI Technologies Inc 4379 Serial ATA Controller
uname: 2.6.23-6.fc8

Happy to provide further info if it would help track this down.

I think I have this problem now in Fedora 8 Test 3. But I got it only after today's updates, which included a kernel change. I think the update was the cause of the problem, because there was nothing like that before. And now it's repeatable. My system now fails to reboot every second time with something like this:

Red Hat nash version 6.0.19 Starting handlers:
[<f88d558c>] (ata_interrupt 0x0/0x1c0 [libata])
Disabling IRQ #22
ata3.01: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
ata3.01: cmd c8/00:08:00:00:00/00:00:00:00:00/f0 tag 0 cdb 0x0 data 4096 in
ata3.00: revalidation failed (errno=-5)
...

I can repeat that with exactly the same result. But every second time I can boot the system. Before the above booting log, there's a message about a BIOS bug, about a memory area, I think: I haven't quite caught it yet, but will do so if it would help.
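
For readers comparing logs: the SStatus values in the working sata_sil boot messages above (113 for the port with a disk, 0 for the empty port) are register dumps that libata prints in hexadecimal, as far as I know. The sketch below decodes the three 4-bit fields as laid out in the Serial ATA specification (DET in bits 0-3, SPD in bits 4-7, IPM in bits 8-11). The function name and the wording of the descriptions are my own, and only the commonly seen field values are listed.

```python
# Field meanings follow the SATA SStatus register layout; the tables are
# deliberately partial and the description strings are informal.
DET = {0: "no device detected", 1: "device detected, no phy communication",
       3: "device detected, phy communication established", 4: "phy offline"}
SPD = {0: "no speed negotiated", 1: "Gen1 (1.5 Gbps)", 2: "Gen2 (3.0 Gbps)"}
IPM = {0: "device not present", 1: "active", 2: "partial", 6: "slumber"}


def decode_sstatus(value: int) -> dict:
    """Split an SStatus register value into its DET, SPD and IPM fields."""
    det = value & 0xF
    spd = (value >> 4) & 0xF
    ipm = (value >> 8) & 0xF
    return {
        "DET": DET.get(det, f"reserved ({det})"),
        "SPD": SPD.get(spd, f"reserved ({spd})"),
        "IPM": IPM.get(ipm, f"reserved ({ipm})"),
    }


# The two values from the boot log, parsed as hex.
for text in ("113", "0"):
    print(text, decode_sstatus(int(text, 16)))
```

Decoded this way, 113 reads as an established 1.5 Gbps link in the active power state, and 0 reads as a port with nothing attached, which agrees with the "link up" and "link down" text on the same log lines.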
I have both ATA and SATA disks in my system: lspci reports this:

00:00.0 Host bridge: Intel Corporation 82975X Memory Controller Hub (rev c0)
00:01.0 PCI bridge: Intel Corporation 82975X PCI Express Root Port (rev c0)
00:1b.0 Audio device: Intel Corporation 82801G (ICH7 Family) High Definition Audio Controller (rev 01)
00:1c.0 PCI bridge: Intel Corporation 82801G (ICH7 Family) PCI Express Port 1 (rev 01)
00:1c.3 PCI bridge: Intel Corporation 82801G (ICH7 Family) PCI Express Port 4 (rev 01)
00:1c.4 PCI bridge: Intel Corporation 82801GR/GH/GHM (ICH7 Family) PCI Express Port 5 (rev 01)
00:1c.5 PCI bridge: Intel Corporation 82801GR/GH/GHM (ICH7 Family) PCI Express Port 6 (rev 01)
00:1d.0 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI Controller #1 (rev 01)
00:1d.1 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI Controller #2 (rev 01)
00:1d.2 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI Controller #3 (rev 01)
00:1d.3 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI Controller #4 (rev 01)
00:1d.7 USB Controller: Intel Corporation 82801G (ICH7 Family) USB2 EHCI Controller (rev 01)
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev e1)
00:1f.0 ISA bridge: Intel Corporation 82801GB/GR (ICH7 Family) LPC Interface Bridge (rev 01)
00:1f.1 IDE interface: Intel Corporation 82801G (ICH7 Family) IDE Controller (rev 01)
00:1f.2 IDE interface: Intel Corporation 82801GB/GR/GH (ICH7 Family) SATA IDE Controller (rev 01)
00:1f.3 SMBus: Intel Corporation 82801G (ICH7 Family) SMBus Controller (rev 01)
01:03.0 FireWire (IEEE 1394): Texas Instruments TSB43AB22/A IEEE-1394a-2000 Controller (PHY/Link)
02:00.0 SATA controller: JMicron Technologies, Inc. JMicron 20360/20363 AHCI Controller (rev 02)
02:00.1 IDE interface: JMicron Technologies, Inc. JMicron 20360/20363 AHCI Controller (rev 02)
03:00.0 Ethernet controller: Marvell Technology Group Ltd. 88E8053 PCI-E Gigabit Ethernet Controller (rev 20)
04:00.0 Ethernet controller: Marvell Technology Group Ltd. 88E8053 PCI-E Gigabit Ethernet Controller (rev 20)
06:00.0 VGA compatible controller: nVidia Corporation G70 [GeForce 7600 GS] (rev a1)

-- Juha

We've been unable to reproduce this bug. It might be because of the combination of sata_sil and pata_amd? Have you tried anything more recent than Test3?

My latest test was of F8Test3, which failed to detect and initialize the SATA third disk. I'd be happy to test whatever you point me to, but I've not been following the rapid evolution toward F8.

My machine that won't boot any Fedora kernel except the original distribution - vmlinuz-2.6.21-1.3194.fc7 - has 5 (!) ATA devices:

ata1, master - Maxtor 6Y120L0 122.9 GB
ata1, slave - Pioneer DVD-RW, DVR-105
ata2, master - Seagate ST310240A 102 GB
ata2, slave - NEC CD-ROM, drive: 28G Rev: 3.24
ata3, master - WDC WD1200JB-00E, 120.0 GB

The motherboard, ABIT NF7-S, has the usual 2 Ultra DMA 33/66/100/133 IDE connectors, plus two SATA 150 MB/s data channels. It came with a "Serillel" adapter which claims to allow "Serial ATA RAID Now!". This adapter plugs directly onto an ATA disk and has a serial socket. A normal SATA cable connects from this socket to the mobo SATA socket. Purportedly, this converts an ATA drive to SATA. It is this device that gives rise to this bugzilla report, since it works fine with kernel 2.6.21-1.3194.fc7, but not with any newer kernel.

I have just now tried to swap the NEC CD-ROM and the WD hard drive so that all three hard drives are on the IDE ports and the CD-ROM uses the "Serillel" adapter. This did not work - the 2.6.21-1.3194.fc7 kernel drops error messages right after it starts, e.g.,

ata3: COMRESET failed (device not ready) [3 of these]
ata3: reset failed, giving up

Thereafter, all three hard drives are properly detected and mounted, but the CD-ROM is not available.
The newest kernel, 2.6.22.9-91, produces similar but more extensive errors and cannot access the Serillel-adapted device. I have no further info about this "Serillel" adapter, and I see that ABIT has undergone a "corporate restructuring".

I will probably solve this problem by the purchase of a PCI card with additional IDE ports. If I am the only one with this problem, it seems unproductive to spend any more effort on this bugzilla report. Of course, the intellectual question remains - what change in the kernel makes it unable to detect and initialize a pseudo-SATA disk?

As I mentioned in a comment above (on 2007-10-18 06:35 EST) I have something of this kind. Right now with the latest Fedora 8 Test 3 update from today (with kernel vmlinuz-2.6.23.1-35.fc8). But for me, it always boots successfully when I try a second time. That may help in getting diagnostics from my system.

After the Red Hat nash version 0.6.19 starting handlers: I get the text:

Reading all physical volumes. ...

when it succeeds. When it fails I get:

[<If88d55825>] (ata_interrupt+0x0/0x1be [libata]) ...

and it doesn't seem to be able to read my SATA disks. Changing kernels can make the problem seem to disappear. But not permanently. With the original ISO image, I had no problems of this kind at all. I got it at an update. It disappeared after a later update. But now it's back. (I currently have the problem with two different versions of the kernel, since I'm sometimes running the KDE version, and there I currently have kernel: vmlinuz-2.6.23.1-31.fc8)

-- Juha

Does adding "pci=nomsi,nommconf" to the kernel boot options make a difference?

I made some tries with and without those kernel options and think there's a difference in the frequency of success. But it can fail also with those options.

With the kernel options: booted, booted, booted, booted, failed, failed.
Without them: booted, failed, booted, failed.

It would need more tries to be sure about any change. I don't think there have been two failures in direct sequence without any extra options: it has always booted on the next try after a failure.

-- Juha

I'm sorry to report that adding the kernel boot option "pci=nomsi,nommconf" had no effect. The error messages were unchanged with kernel 2.6.22.9-91, e.g., when nash runs it reports:

ata3.00: revalidation failed (errno=-5)
ata3.00: failed to set xfermode(err_mask=0x40)
ata3.00: failed to set xfermode(err_mask=0x40)
ata3.00: exception Emask 0x10 SAct 0x0 SErr 0x0 action 0x2 frozen

and the disk connected with the Serillel converter cannot be accessed.

I'm afraid the Serillel ATA-to-SATA converter is not usable. It is consigned to my junk bin. I have, however, purchased a Creative I/O Ultra ATA IDE Controller pci card for $15.99 and it works perfectly. I have five ATA devices connected: 3 disks, a CD-ROM and a DVD-RW. All are working well.

Closing NOTABUG as the original reporter indicates it may have been faulty hardware and others have tried to reproduce and have failed. Please re-open if I have somehow misunderstood the above comment, and thank you for filing the bug originally.
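
Juha's remark that "it would need more tries to be sure" can be made concrete. The sketch below runs a two-sided Fisher's exact test on the boot counts reported above (4 booted and 2 failed with pci=nomsi,nommconf; 2 booted and 2 failed without). Choosing Fisher's test is my own suggestion, not something anyone in the thread used, and the function name is illustrative. Only the standard library is needed.

```python
from fractions import Fraction
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> Fraction:
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(k: int) -> Fraction:
        # Probability that k of the col1 "booted" outcomes fall in row 1.
        return Fraction(comb(row1, k) * comb(row2, col1 - k), comb(total, col1))

    observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    return sum((prob(k) for k in range(lowest, highest + 1)
                if prob(k) <= observed), Fraction(0))


# Rows: with options, without options. Columns: booted, failed.
p_value = fisher_exact_two_sided(4, 2, 2, 2)
print(f"p = {float(p_value):.3f}")
```

With only ten boots the observed split is the single most likely outcome under "the options make no difference", so the p-value is 1 and the counts say nothing either way. That is in line with the later report in this thread that the options had no effect on the other machine.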