Bug 379341
Description
tengel
2007-11-13 05:07:45 UTC
Created attachment 256241 [details]
working syslog
Created attachment 256251 [details]
crash syslog
Created attachment 256271 [details]
working dmesg
Created attachment 256281 [details]
crash dmesg
Created attachment 256291 [details]
working lsusb -v showing BlackBerry device
Created attachment 256301 [details]
working lsmod after fresh boot
Created attachment 256311 [details]
crash lsmod after fresh boot
Created attachment 256321 [details]
crash udevd strace -p output showing loop/hang
Created attachment 256351 [details]
working full output of lsusb -v
previous attachment comment states it includes hub info, mistake on my part -
that's the BlackBerry only. this attachment is the full lsusb -v output with
the device plugged directly into the laptop as well as the hub.
and what does udev do, to crash the BlackBerry??

to test if udev starts programs which kill the blackberry you may try the following:

1. stop the haldaemon
# service haldaemon stop
If your Blackberry does not crash now, it is something hal triggered. Maybe automounting your BlackBerry via gnome-volume-manager. Please reassign to component hal.

2. If your BlackBerry still crashes, kill the udev daemon before you plug in the blackberry.
# killall udevd
If your BlackBerry still crashes, you may assign to component kernel.

Thank you.

I just tested both methods as requested - killall udevd "fixes" the problem, my BB does not crash. The haldaemon test did not change the circumstance, it still crashed.
So, I think that means it's udevd right? Running, crash. Not running, no crash.

Created attachment 257091 [details]
dell inspiron xps gen 2 + d-link dub-h7 lsusb -v output
I have reproduced the issue (100%) using different gear here at work but
connected in the same manner - a direct plugin to the laptop works fine, but a
plug into the external hub causes a device crash and udevd loop.
laptop: Dell Inspiron XPS Gen 2
usb hub: D-Link USB 2.0 7-port hub (model DUB-H7)
I am attaching the lsusb -v output of this laptop (hostname "ironclad") with
the D-Link hub plugged in for comparison to the Thinkpad + Dynex hub output.
FYI: both of these hubs work in F7: the Dynex was working with the Thinkpad T43
previously before the upgrade, and I just plugged in the D-Link hub to my
workstation F7 install (Dell Precision 450) and the device works fine on the
hub.
try this:
# mv /etc/udev/rules.d/60-persistent-storage.rules \
  /etc/udev/rules.d/60-persistent-storage.rules.bak
and see if it crashes.

I am reassigning (temporarily?) to component kernel for more input.

tested it, it still crashes in the same manner with udevd gone wild. I can try on the other setup tomorrow as well.

also tested moving the 60-persistent-storage.rules on the second test setup (Dell XPS + D-Link hub) and it also does not help, the device still crashes and udevd loops. please let me know if there's any other testing or info I can provide.

can you strace the "wild" udevd to see what it is looping on?

it's attached above already, here's the link: https://bugzilla.redhat.com/attachment.cgi?id=256321

Seems like the Blackberry is connecting/disconnecting the whole time. This is why udev is looping, because it is receiving add/remove events the whole time. Can you please "strace -f" udevd so that I can see what the child processes (clone()) are doing to the BlackBerry?

actually it's not reconnecting repeatedly -- the results of that udevd loop are from after the device crash, meaning the power has cycled on my device (if you're a BlackBerry user, the red LED is in its initial boot state while it's checking ROM before booting) and I've immediately unplugged the device. the crash is of the hard variety (BB's can reboot in hard or soft reboot mode, this is a hard - as if the battery were physically pulled) that I see when a bad command has been sent over the USB channel. I work with the barry project and have crashed it sort of the same way while testing alpha code that might accidentally send bad commands to the device. this USB hub crash "feels" the same, like a bad command has been sent.
The crash is instantaneous - the second (microsecond) I plug in the device it crashes and is no longer talking to the laptop(s), so there's no possible way it can still be connecting/disconnecting as I've removed the cable for safety, don't want to damage my device accidentally.
I'll run a strace -f -p <udevd PID> for you on the Dell XPS + D-Link laptop and post shortly. I'll make sure the trace is running and logging before plugging in the BlackBerry so that it (hopefully!) provides as much info as possible for you. In fact, I'll have it running before even plugging in the external hub - that should hopefully provide even more information of interest.

Created attachment 260161 [details]
strace -f -p <pidof udevd> output during test
log is ~14meg, zipping for better attachment portability
this is the output of "strace -f -p 531 1>f8_ironclad_udevd_debug.txt 2>&1"
launched before plugging in the external D-Link USB hub. After connecting the
hub I paused for roughly 10 seconds before plugging in the BlackBerry which
crashed immediately. I unplugged the device and let it log the looping udevd
for roughly another 5 seconds to capture that output.
ok, test:
1. kill udevd
2. plug in the Berry
3.
# modprobe berry_charge
# modprobe scsi_mod
# modprobe usb-storage
does it crash the Berry?

fantastic! it appears to be specifically berry_charge and only when used through the hub, and *only* if it runs before usb_storage! I am unable to unload scsi_mod as I have a SATA harddrive needing it, but that doesn't seem to be the problem or related.

Crash with USB hub:
# killall udevd
# modprobe berry_charge

Working with USB hub:
# killall udevd
# modprobe usb_storage
# modprobe berry_charge

Working without USB hub (direct plugin to laptop):
# killall udevd
# modprobe berry_charge

3 attachments coming with the output of syslog/dmesg for each case above showing what's working and what's crashing. yay for making progress. :)

Created attachment 261531 [details]
modprobe berry_charge while plugged into the hub - crash
The syslog/dmesg result of:
# killall udevd
# modprobe berry_charge
...while plugged into the external USB 2.0 hub.
Created attachment 261541 [details]
modprobe berry_charge while plugged directly into laptop (works)
The output of:
# killall udevd
# modprobe berry_charge
...while plugged directly into the laptop. working.
Created attachment 261551 [details]
modprobe usb_storage then berry_charge on hub (working)
Output of:
# killall udevd
# modprobe usb_storage
# modprobe berry_charge
...while plugged into the external USB hub. working!
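For anyone repeating the comparison, the two load orders tested above can be written down as a small script. This is only a sketch: the function name `print_case` and the case names `charge-only` and `storage-first` are made up for this example, and the function prints the commands rather than running them, so nothing touches udevd or the device until the output is reviewed.

```shell
#!/bin/sh
# Sketch only: print the command sequence for one of the load orders
# tested in this bug. The case names are invented for this example.
print_case() {
    case "$1" in
        charge-only)   mods="berry_charge" ;;               # crashed on the hub
        storage-first) mods="usb_storage berry_charge" ;;   # worked on the hub
        *) echo "unknown case: $1" >&2; return 1 ;;
    esac
    echo "killall udevd"
    for m in $mods; do
        echo "modprobe $m"
    done
}
```

Running `print_case storage-first` prints the three commands from the working hub case; piping that output to `sh` as root would execute them, with the device already plugged in as in the tests above.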
you may create a preinstall usb_storage line for berry_charge in /etc/modprobe.conf like this:
install berry_charge { /sbin/modprobe usb_storage; } && /sbin/modprobe --first-time --ignore-install berry_charge

This stops it from crashing when I plug in, but berry_charge no longer works now; my device is not properly adjusted to 500mA as usual. The USB Mass Storage does mount the SD card, however.

I have the same problem with a slightly different setup. I plug my blackberry 8703e (no usb storage) directly into my laptop and udevd goes into the exact same loop described and straced here. If udevd is started after berry_charge is loaded and the device plugged in, there is no problem. But if udevd is already running and berry_charge gets loaded after the device is plugged in, then there's the problem. If I disable berry_charge so it doesn't autoload, have udevd running, and plug in the device, there's no problem. But as soon as I load berry_charge, udevd starts looping. I have had this problem since F7. One other problem with berry_charge is that it takes over the device, so I can't talk to it. So berry_charge is just going to stay disabled for now.

So what's happened here, nothing? As soon as we figured out it's not udev and not Harald's bug, I see dead silence.

Hello, I'm working on the bug triage project, and just saw your last comment. One thing that you may want to try, not sure if it will have any impact or not, is to add usbcore.autosuspend=-1 to the kernel command line. It seems that there are references in this BZ to udev becoming confused/looping due to connects and disconnects. Let me know if this helps or changes the behavior at all.

Hi, thanks for the help - I'm afraid it didn't help (or hurt, no change) the situation. I first rebooted and confirmed the old behaviour was still happening with the current RPMs (yes), then rebooted with the commandline usbcore and verified in /proc/cmdline that it was "heard".
Plugging in my device to the external hub still caused the same device crash and the /var/log/messages looks the same - no change in the output there either. No output was registered in /var/log/dmesg.
As of this writing we're on:
kernel-2.6.23.9-85.fc8
udev-116-3.fc8
This is the same laptop/hub in the same original configuration as reported in the beginning, no changes in gear. No, I take that back - I upgraded from 512mb to 1gig RAM but I don't think that matters here.

Adding some data to this bug to flag as USB. Does adding 'usbcore.autosuspend=-1' to the kernel boot options help? See https://fedoraproject.org/wiki/KernelCommonProblems

Hi Chuck, please refer to the comments two posts right above yours where this was tried - it didn't help or hurt the situation.

The loop is the big problem, but a secondary one. It is triggered by a small problem of Berry's refusal to work when the control is delivered by a TT (Transaction Translator) hub. The way to fix the trigger problem is to see what Windows does with a sniffer, and incorporate changes into berry_charge (the workaround is not to use a hub, or use a USB 1.1 hub, or any hub without TT function). I think I'll ask Greg Kroah about this.
As regards the bigger problem, that one is fixable without Windows. Unfortunately, I cannot use dmesgs which Tengel collected, because they were tampered with. Most importantly, the supposed event flood that Harald detected in the strace is not present in dmesg, which raises a big alarm (or it would, if I were sure I saw real dmesgs and not edits). So, if I had a complete dmesg, usbmon trace, and /proc/bus/usb/devices, I should be able to come up with something. Note, fixing udevd loop won't help Blackberry to charge.

Greg says his Blackberry died in the same way when he tried it: http://marc.info/?l=linux-usb&m=120155944805973&w=2
I think it never worked through a hub. Hmm...

Created attachment 294954 [details]
Windows USB snoop of Pearl 8100 plug-in
Pete, I just remembered that I sent this attachment in January 2007 to the
'barry' project. This is a Windows USB snoop of plugging in a Pearl 8100 with
the normal RIM Desktop software installed, then unplugging it. Apparently it
was quite useful in helping the team debug the initial chitter-chatter and do a
little reverse engineering.
Unfortunately I don't have the Windows around any longer to snoop again with
this setup (laptop + hub), if I remember correctly this was done with a
directly-connected device. (90% sure it was)
Hope this helps.
May I close this, since I'm not actively working on this bug until someone with the device figures out the workaround for the hub case?

1) why would you close a bug that was never fixed? you close it, it will disappear into the dungeons of time and never be seen again - we all know this is what happens. the fact that nobody knows how to properly fix it (from what I read/understand, nothing personal) is not cause to close this bug.
2) I upgraded the same original laptop to F9 over the weekend, I will retry this issue and post any results (new, changed, or same).

Fedora 9 does not crash! It doesn't properly adjust the charge to 500mA even though berry_charge.ko is loaded, but that's a different result. I can at least plug in my device to the hub now without the device crashing and the udevd stuff going wild, so someone changed something to prevent the crash in F9. The USB Mass Storage properly loads and presents the device's internal SD card, normal operation.

This message is a reminder that Fedora 8 is nearing its end of life. Approximately 30 (thirty) days from now Fedora will stop maintaining and issuing updates for Fedora 8. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as WONTFIX if it remains open with a Fedora 'version' of '8'.
Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version prior to Fedora 8's end of life.
Bug Reporter: Thank you for reporting this issue and we are sorry that we may not be able to fix it before Fedora 8 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora please change the 'version' of this bug to the applicable version. If you are unable to change the version, please add a comment here and someone will do it for you.
Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete. The process we are following is described here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Fedora 8 changed to end-of-life (EOL) status on 2009-01-07. Fedora 8 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug.
If you can reproduce this bug against a currently maintained version of Fedora please feel free to reopen this bug against that version.
Thank you for reporting this bug and we are sorry it could not be fixed.

Using F10, I still see the problem with udevd using lots of CPU and I have more specific information. It only happens when the berry_charge module is already loaded and then I plug in the blackberry. I don't see any problem with the blackberry (it doesn't crash or reboot). However the events/0 process starts using a bunch of CPU time, which must be what's causing udevd so much trouble. Unplugging the blackberry stops the events/0 process from using CPU, but it takes the udevd process a long time to process all the events. udevd seems to be getting messages to repeatedly add and remove the same device.
Can someone reopen this or should I file a new bug because I don't need a hub to trigger this bug?
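One way to put numbers on the repeated add/remove messages would be to save the kernel uevents to a file and count them per device. The sketch below makes several assumptions: the log is taken with something like `udevadm monitor --kernel > events.log` (on older udev releases the tool has a different name), each kernel event line looks like `KERNEL[1234.567] add /devices/... (usb)`, and `count_events` is a name invented for this example.

```shell
#!/bin/sh
# Sketch only: count kernel uevents per (action, devpath) in a saved monitor log.
# Assumes lines shaped like: KERNEL[1234.567] add /devices/... (usb)
# Lines that do not start with "KERNEL[" are ignored.
count_events() {
    awk '/^KERNEL\[/ { n[$2 " " $3]++ } END { for (k in n) print n[k], k }' "$@" | sort -rn
}
```

Running `count_events events.log` after unplugging the device lists the most frequent action and device path pairs first; a flood would show up as large, roughly equal add and remove counts for the same path.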