Bug 103821
| Summary: | loading firewire drivers while external hard disk is connected causes several problems | | |
|---|---|---|---|
| Product: | [Retired] Red Hat Raw Hide | Reporter: | Alexandre Oliva <aoliva> |
| Component: | kernel | Assignee: | Matthew Galgoci <mgalgoci> |
| Status: | CLOSED RAWHIDE | QA Contact: | Brian Brock <bbrock> |
| Severity: | high | Priority: | medium |
| Version: | 1.0 | CC: | benhsu, laurivan, maxer1, riel |
| Hardware: | i686 | OS: | Linux |
| Fixed In Version: | kernel-2.4.22-1.2105.nptl | Doc Type: | Bug Fix |
| Last Closed: | 2004-02-16 22:33:49 UTC | Bug Blocks: | 100643, 106926 |
Description
Alexandre Oliva
2003-09-05 12:42:09 UTC
Ditto what aoliva said.

Created attachment 94792
Message to 1394 list containing patch
A patch was recently posted to the 1394-devel mailing list for this bug. I
applied this patch to my machine and here are the results:
- machine boots cleanly with 1394 drive turned on
- drive detected with rescan-scsi-bus.sh
- reads and writes on drive do not show any warnings or lockups.
To give proper credit to the author (Sergey Vlasov), I am attaching the
original email.

Created attachment 94793
patch itself

Should be fixed in beta2?

$ cat /home/davej/firewire.diff | patch -p1
patching file drivers/ieee1394/nodemgr.c
Reversed (or previously applied) patch detected! Assume -R? [n] n
Apply anyway? [n] n
Skipping patch.
1 out of 1 hunk ignored -- saving rejects to file drivers/ieee1394/nodemgr.c.rej

I'm sorry, but I would have to beg to differ. I was able to reproduce this with the default kernel in test2, and I applied the patch manually (with vi and not patch), so I know that piece of code did not have the patch.

2061 certainly fails in just the same way as Severn1.

This was fixed in 2.4.22.1.2064, which unfortunately was too late for beta2.

(For the record, after the bugzilla database was restored without this transaction.) Just tried 2075, the problem is still there. Reading from /proc/bus/ieee1394/devices still hangs, kudzu still hangs, loading sbp2 after ohci1394 still gets stuck in initializing state, and load is still stuck at >=1.0 if sbp2 is loaded before ohci1394.

Curiouser and curiouser. I saw this bug with the binary 2.4.22-1.2082 kernel, but when I installed the kernel-source rpm and recompiled with the default i686 config file (with SMP turned on to work around another bug), I was then able to boot with my 1394 drive turned on.

Created attachment 95129
mkinitrd patch that enables boot-time loading of firewire drivers without hanging sbp2
Kernel 2088 still has the same problem. This patch to mkinitrd is the hack
I've been using to work around part of the problem (the hang at boot time), but
the following problems are still present:
- throughput is limited to 5MB/s, instead of 17MB/s as in Shrike
- it's probably operating without DMA (how do I tell?)
- there's still some kernel thread that gets the system load stuck at >= 1.0
- reading from /proc/bus/firewire still hangs. This requires kudzu to be
disabled *and* nofirewire to be added to the kernel command line. The latter
doesn't effectively disable firewire, since the modules have already been
loaded.
As for the mkinitrd patch itself, the changes for insmod -k and the reformatting
of usb-storage can probably be taken out, but the rest of the patch would be
very nice to have in the next mkinitrd build, at least until the kernel sbp2
module is fixed so as to not hang if loaded after ohci1394 when there are
firewire devices connected.
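
The 5MB/s versus 17MB/s comparison above comes down to a timed sequential read. A minimal sketch in Python of one way to take such a measurement, assuming a readable file or block device; the helper name `read_throughput` and its parameters are hypothetical and not taken from this report. It does not bypass the page cache, so a repeated run over the same data will report an inflated figure.

```python
import time

MIB = 1024 * 1024


def read_throughput(path, max_bytes=64 * MIB, chunk_size=MIB):
    """Sequentially read up to max_bytes from path.

    Returns (bytes_read, mib_per_second). Hypothetical helper for
    illustration only; results are indicative, not a benchmark.
    """
    total = 0
    start = time.perf_counter()
    # buffering=0 gives raw reads, so Python adds no buffering layer of its own.
    with open(path, "rb", buffering=0) as f:
        while total < max_bytes:
            chunk = f.read(min(chunk_size, max_bytes - total))
            if not chunk:  # end of file or device
                break
            total += len(chunk)
    elapsed = time.perf_counter() - start
    rate = (total / MIB) / elapsed if elapsed > 0 else float("inf")
    return total, rate
```

Calling `read_throughput` on the device node of the firewire disk (whatever name it has on the machine in question) before and after a kernel change gives two numbers to compare, ideally on a region of the disk that has not been read since boot.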

Still no improvements in 2097.

Created attachment 95302
fix problems in sbp2 module
This patch fixes all of the problems I'd run into when sbp2.o is loaded when
there are firewire devices already connected to the bus. It no longer hangs if
loaded after ohci1394, throughput is back to the expected range, reading from
/proc/bus/firewire displays the correct information and kudzu no longer hangs.
I won't pretend to understand why the semaphore was down()ed twice before, and
why it's ok to down() it only once now, but this is certainly an improvement.
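
Several comments in this report describe a read from the firewire proc file that never returns. A hedged sketch in Python of how such a hang could be checked without wedging the shell doing the check: the read is done by a child `cat` process with a timeout. The helper name `probe_read` and the status strings are hypothetical. One assumption to note: a reader stuck in uninterruptible sleep inside the kernel may not die when killed, so the sketch deliberately does not wait for the child after a timeout.

```python
import subprocess


def probe_read(path, timeout=5.0):
    """Read path in a child process so a hung read cannot block the caller.

    Returns ("ok", data), ("timeout", None) or ("error", None).
    Hypothetical helper for illustration only.
    """
    proc = subprocess.Popen(
        ["cat", path],
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,
    )
    try:
        out, _ = proc.communicate(timeout=timeout)
    except subprocess.TimeoutExpired:
        # Request the kill but do not wait for it: a process blocked in
        # uninterruptible sleep may never exit.
        proc.kill()
        return ("timeout", None)
    if proc.returncode != 0:
        return ("error", None)
    return ("ok", out)
```

For example, `probe_read("/proc/bus/ieee1394/devices")` would be expected to return a `"timeout"` status on an affected kernel and an `"ok"` status with the device listing on a fixed one.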

The mkinitrd patch was broken up into smaller pieces in bug 103665. The only one that is really needed now in order for firewire devices to be visible when raid devices needed for the root filesystem are started is mkinitrd.01-sbp2-rescan.patch.

Ben Collins gives me reasons to consider this patch wrong, so I take it back. I'm investigating further.

Created attachment 95330
proper fix for the sbp2-hangs-on-load problem
We were freeing the packet data structure before the thread we woke up had a
chance to look at the semaphore. This patch fixes the problem properly, and I
guess Ben Collins actually likes it, because he sent me a very similar patch
just as I finished testing mine :-)
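
The race described above (the waker frees a structure that the woken thread still needs to look at) can be modelled outside the kernel. The following is only a Python model of the ownership rule, not the sbp2 code and not the contents of attachment 95330; the names `Packet`, `complete` and `submit_and_wait` are hypothetical. It shows one common way to avoid this class of bug: the completion side only signals, and the waiting side frees the packet after it has finished with it.

```python
import threading


class Packet:
    """Stand-in for a request packet with an embedded completion semaphore."""

    def __init__(self):
        self.done = threading.Semaphore(0)  # released by the completion side
        self.result = None
        self.freed = False

    def free(self):
        self.freed = True

    def read_result(self):
        if self.freed:
            raise RuntimeError("use after free")
        return self.result


def complete(packet, value):
    """Completion side: publish the result and wake the waiter.

    It does not free the packet, because the waiter still has to touch it.
    """
    packet.result = value
    packet.done.release()


def submit_and_wait(packet, value):
    """Waiting side: start the completion, sleep on the semaphore, then free."""
    worker = threading.Thread(target=complete, args=(packet, value))
    worker.start()
    packet.done.acquire()          # woken by complete()
    result = packet.read_result()  # the packet is still valid here
    packet.free()                  # only the waiter releases the packet
    worker.join()
    return result
```

If `complete` called `packet.free()` right after releasing the semaphore, `read_result` in the waiter could raise depending on thread scheduling, which is the model's equivalent of the intermittent failure reported here.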

*** Bug 101901 has been marked as a duplicate of this bug. ***

Works for me now on detecting my lone BUSLink firewire device. No hang on kernel boot with device plugged. Works very nicely on kernel 2.4.22-1.2108.nptl.i686! Good work!

Well folks, 2115 worked since the rawhide push, but... 2.4.22-1.2115.nptl.i686 broke today! Can't leave CDRW hot plugged on boot; I can use rescan-scsi-bus after booting then plugging. Can't figure it out. Nothing has changed on my system at all. No packages were changed since the last push to rawhide.

`Can't leave' meaning what? Does it hang? Does it just fail to be brought up? I had some problems with sbp2 failing to load into devices with sbp2 loaded after ohci1394. Arranging for sbp2 to be loaded before ohci1394 fixes it, but I won't even pretend to understand why. This problem has always happened to me with these modules loaded from initrd.img, where hotplug just doesn't work, but I can't tell whether that's related.

I believe I was very clear in my comment. With 2105.nptl all the way through and including 2115.nptl, installed a few days back, I could leave my BUSLink Firewire CDRW drive plugged. Now all of a sudden the boot process hangs just as before. If I leave my drive unplugged and wait till after that point in the kernel where ohci loads, then I can use the bourne script rescan-scsi-bus to successfully detect the drive and use it... go figure. The only addition I have added recently was BitTorrent-3.3-1.

BitTorrent 3.3-1 hoses Firewire detection in kernel 2115.nptl. I did an rpm -e on BitTorrent-3.3-1 that I built last night from source and voila... FIXED

This bug is back in kernel-2.6.0-0.test11.1.13, and is only detected
reliably with slab poisoning enabled. Attachment 95330 applies
cleanly and fixes the problem. I wish Ben Collins would merge the fix
into 2.6 before it's too late...

I just tested 2.6.1-rc1 vanilla and it seems to work.

Confirmed fixed in FC2test1.