Bug 100920
| Field | Value |
|---|---|
| Summary | rhgb hangs system with pcmcia network card |
| Product | [Fedora] Fedora |
| Component | pcmcia-cs |
| Version | 1 |
| Status | CLOSED RAWHIDE |
| Severity | high |
| Priority | medium |
| Hardware | i386 |
| OS | Linux |
| Reporter | Alexandre Oliva <oliva> |
| Assignee | Arjan van de Ven <arjanv> |
| QA Contact | Brian Brock <bbrock> |
| CC | david_j_morse, erik.hemdal, jrb, kmilos, me, michel.salim, mitr, notting, wtogami |
| Fixed In Version | I've verified that this problem is fixed in kernel-2.6.3-2.1.253.2.1 |
| Doc Type | Bug Fix |
| Last Closed | 2004-03-19 16:28:40 UTC |
| Bug Blocks | 100643 |

Description
Alexandre Oliva
2003-07-27 04:21:04 UTC
Created attachment 93175 [details]
strace output for various programs during a graphical boot
FWIW, the hang does not occur when using a vanilla 2.4.21 kernel. There's no acpi support, of course, and the pcmcia network card doesn't work (why? the needed modules are there!), but rhgb completes and gdm starts successfully.

Ok, I figured out why pcmcia wouldn't load: because I hadn't deleted /lib/modules/<release>/pcmcia. As soon as I did, the network card would be enabled on boot again, but then rhgb would hang when exiting. We seem to have some conflict between rhgb and pcmcia :-( This is with acpi=off, btw.

I'm changing this to kernel, since it now seems to be another symptom of problems in the pcmcia modules in the kernel, like the instant hang I get when acpi is enabled (bug 100528), just this one hangs only at the end of rhgb.

Eeek. And the latest kernel erratum for Shrike (2.4.20-19.9), installed on a Severn tree, hangs just the same when rhgb is enabled on this machine.

Still broken in kernel-2.4.22-20.1.2024.2.36.nptl, in case it matters (it probably does).

Are you seeing this with both the latest rhgb and the mount /dev/pts line?

It still hangs after installing today's updates, including rhgb-0.10.2-1.

Grab initscripts-7.36-2 and kudzu-1.1.32-1 (or later).

Woohoo! That fixed it!

Argh! I spoke too soon. It actually made it all the way to opening the GDM login screen, but then the system froze as before. Second time, it froze even before X for GDM started. I.e., no change :-(

This sounds like more of a kernel or X issue at this point.

Same problem occurs for me attempting to start X (startx) from run level 3. Machine hangs, num lock won't work. I believe it started just _after_ I grabbed the latest kudzu & initscripts from freshrpms.net this afternoon.

I am seeing this on an athlon machine with a via chipset. It hangs when X for gdm starts. The numlock will not respond, the display is garbled. One thing I did find out is that if I move the mouse, normal operation is restored. acpi on or off makes no difference.

*** Bug 101724 has been marked as a duplicate of this bug. ***

Looks like X to me...

Realistically, for this kind of hardware-specific problem, I'm not sure if or how I or anyone else here at Red Hat will be able to debug and fix this problem without having the hardware physically in front of them and running things through a debugger, etc. I don't have this hardware available to do that, so someone who does will have to either:

- Narrow the problem down and prove it is X, preferably with specific details of where it is in X that is causing the problem.

or

- Send me this hardware (Ontario, Canada) to use for debugging purposes for an indeterminate amount of time.

Who all out there can reproduce this, and can you please narrow it down, and report back what the specific problem is? Personally I don't see any 100% proof present that this is an XFree86 bug, but it's certainly a possibility. Awaiting feedback...

Also, let me assume it is X for a minute... Try disabling 2D acceleration with:

    Option "noaccel"

If that works around the problem, comment out the noaccel option and try the XaaNo options from the XF86Config manpage one at a time. Try to find which, if any, solve the problem. If either of these handles the issue it would be a video driver bug, which can be worked around in the driver, but only if someone who has the actual hardware can test this now and provide details as to what works and what doesn't. HTH, TIA

Well, scratch me off (Comment #12). I have stopped having X hangups w/ the radeon driver (Fire GL R300). I'm afraid I lost track of exactly what upgrade solved the problem, but it disappeared 1-2 weeks ago and I sync to Rawhide every 2-3 days. (Still no joy w/ dual head radeon)

The problem is not in X, but in the kernel. I found out that if I switch to vt1 before loading $PCIC in /etc/init.d/pcmcia (that's yenta_socket on this box), and switch back to vt8 right after it, the machine no longer hangs.
This may very well be related to the other problems on this machine that have required noacpi or pci=noacpi in the past (bug 100528).

Some more info. The problem only occurs when a 3Com Megahertz 3CXFE574BT card is inserted in a PCMCIA socket. In fact, I found out that, when the machine hangs, removing the card brings it back to life. But then, inserting it back will freeze the machine again upon the next text-to-graphical-mode switch. Unloading the 3c574_cs module is not enough to fix the problem. It's necessary to unload the yenta_socket module and load it again while in text mode to fix it, and then the fix is permanent (well, until the next reboot). Even if I stop pcmcia (such that even yenta_socket is unloaded) and load it again while X is active and visible in VT7, the problem no longer occurs. Maybe it has to do with the fact that, when I load yenta_socket in text mode, it prints messages such as:

    Yenta IRQ list 06b0, PCI irq11
    Socket status: 30000007
    Yenta IRQ list 06b0, PCI irq11
    Socket status: 30000011

I wonder if the problems could be related to the fact that the video card seems to use system memory, and Something Bad (TM) happens when the module attempts to write the messages above in text mode while we're in graphical mode.

Created attachment 95480 [details]
patch that works around the hang
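
The attachment itself is not reproduced in this report, so the following is only a rough sketch of the workaround as described in the comment above (switch to vt1, load $PCIC, switch back to vt8), not the actual patch. The names `run`, `load_pcic_in_text_mode`, `TEXT_VT`, `GFX_VT` and `DRY_RUN` are made up for illustration; `chvt` and `modprobe` are the standard tools, and the VT numbers and the yenta_socket default are taken from the report.

```shell
#!/bin/sh
# Sketch only: load the PCMCIA socket driver while a text console is in
# the foreground, then return to the VT that rhgb is displaying on.

TEXT_VT=${TEXT_VT:-1}        # text console (vt1 in the report)
GFX_VT=${GFX_VT:-8}          # graphical boot console (vt8 in the report)
PCIC=${PCIC:-yenta_socket}   # socket driver, as set in the pcmcia config
DRY_RUN=${DRY_RUN:-0}        # assumption: 1 prints commands instead of running them

# Run a command, or just print it when DRY_RUN=1.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

load_pcic_in_text_mode() {
    # Bail out if the switch to the text console fails.
    run chvt "$TEXT_VT" || return 1
    run modprobe "$PCIC"
    status=$?
    # Always try to get back to the graphical VT, even if modprobe failed.
    run chvt "$GFX_VT"
    return $status
}
```

In an init script, a call to `load_pcic_in_text_mode` would presumably stand in for the plain module load of $PCIC. Setting `DRY_RUN=1` first is a way to check the command sequence on a machine without the affected hardware.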

[comment added at the request of Alexandre] My networking does not initialize properly on FC1 after a clean install. This notebook has a Netgear FA511 10/100 pcmcia card and onboard ATI video. This did not happen under RH9. Until I applied the patch I had to either reinsert/hotplug the network card (after the boot completed) or set GRAPHICAL=no in /etc/sysconfig/init (and reboot) to get the ether up. Here are the lines from dmesg and XFree86 log after the patch:

    eth0: ADMtek Comet rev 17 at 0xc88e7000, 00:10:7A:6B:19:21, IRQ 10.
    (--) PCI:*(1:0:0) ATI Technologies Inc 3D Rage LT Pro AGP-133 rev 220, Mem @ 0xd8000000/24, 0xd9000000/12, I/O @ 0x8000/8, BIOS @ 0x000c0000/17

This pretty much rules XFree86 out of the picture, since we use completely different video cards. The network cards are also totally unrelated, so yenta_socket (can you confirm you're using this module?) is my prime suspect.

I get this same problem with rhgb, acpi=on and my Orinoco card plugged in. Remove the card and the machine unfreezes. I have yenta socket as well. By the way, where do you apply the patch?

I guess this covers my bug 106838 as well. Just tried Alexandre's workaround and got a functional system on boot for the first time with rhgb, 3c59x and cs4232 working ok as before Severn.

FC1, fully up-to-date (as of 12/17/03), Dell Latitude CPxH w/ Linksys PCM200 (tulip). I had the exact same symptoms as John McBride reported here: http://www.redhat.com/archives/fedora-list/2003-November/msg01073.html (apparently the Linksys PCM200 and Netgear FA511 use the same chip, as mine also reports ADMtek Comet rev 17). Symptom: with rhgb, my PCM200 NIC fails with these messages repeatedly:

    eth0: Transmit timed out, status fc67c057, CSR12 00000000, resetting...

Either disabling graphical boot or applying Alexandre's patch fixes this issue.

FWIW, the same problem was present with kernel-2.6.0-0.test11.1.13.i686.rpm

I see this issue using a Dell TrueMobile 1050 PCMCIA card on an Inspiron 1100.
It uses the Intersil driver. My system also uses a portion of system RAM for the (awful) Intel video subsystem. I get the hang just after starting X. Removing the card allows the boot to complete; then plugging it back in brings back eth1. I am using yenta-socket. I also get a hang on shutdown when I try to shut down ntpd while using the wireless card. If I disable the card before shutting down, then I get shutdown failures of processes like ntpd, but I do get a clean shutdown.

Same problem here. Network hangs after boot with card inserted. My system is: KDS Notebook, FC1, Kernel 2.4.22-1.2149.nptl, Video Trident CyberBlade Ai1. I tried 3 different net cards:

- 3Com 3CXFE575CT
- D-Link DFE650
- D-Link DWL650+ (Wireless)

Only the second one (DFE650) worked all right. The two other cards needed Oliva's patch, or had to be inserted only after the boot to get them working. The main difference between the DFE650 card and the others is that it is a 16-bit pcmcia card, while the others are 32-bit PC Cards. Maybe this information helps to find out what is the cause of the bug.

Another piece of information: I noted another misbehavior in cardctl. It should emit a beep when the card is inserted and recognized and another beep when the proper modules are loaded and the initscripts executed. This is also supposed to happen in the boot process when the pcmcia service is started. When the card is not recognized, cardctl is supposed to emit a lower-pitch beep. With the DFE650 card the two beeps are emitted as expected, but with the two other cards there are no beeps, even when the card is inserted after the boot. Hope this helps. Persio

It's getting better in current rawhide (kernel 2.6.3-1.106). Preloading yenta-socket in initrd, with acpi disabled, it boots perfectly well and the network card works. Working around the incompatibility of the current /etc/init.d/pcmcia with kernel 2.6's /proc, as suggested in bug 116205, however, it will sometimes, but not always, work.
At least once I noticed that cardmgr had detected only 1 socket, while the notebook has two, and the network card was in the other. Restarting cardmgr was enough to get the network card to work.

Pre-loading yenta-socket is no longer needed, but pci=noacpi still is on this particular laptop, which is known to have buggy acpi tables. I suppose we can leave this closed, even though I closed it by mistake, forgetting I still had the pci=noacpi flag in the boot command line.