Bug 576278 - Fedora 13 freezes at udev on Acer Aspire One D250
Summary: Fedora 13 freezes at udev on Acer Aspire One D250
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 13
Hardware: i386
OS: Linux
Priority: low
Severity: urgent
Target Milestone: ---
Assignee: John W. Linville
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard: https://fedoraproject.org/wiki/Common...
Depends On: 533746
Blocks: F13Beta, F13BetaBlocker 574895
Reported: 2010-03-23 17:22 UTC by John W. Linville
Modified: 2010-04-18 17:13 UTC (History)

Fixed In Version: kernel-2.6.33.1-19.fc13
Doc Type: Bug Fix
Doc Text:
Clone Of: 533746
Environment:
Last Closed: 2010-03-24 00:50:01 UTC
Type: ---
Embargoed:



Description John W. Linville 2010-03-23 17:22:18 UTC
+++ This bug was initially created as a clone of Bug #533746 +++

Description of problem:


Version-Release number of selected component (if applicable):


How reproducible:

Boot from SDCard the livecd.

write image on SDCard
dd if=F12-Beta-i686-Live-KDE.iso of=/dev/sdb
 or lxde-i386-20091107.20.iso

Steps to Reproduce:
1. Write the ISO image to the SD card:
dd if=F12-Beta-i686-Live-KDE.iso of=/dev/sdb
 or lxde-i386-20091107.20.iso
2. Boot from the USB multicard reader, with or without combinations of acpi=off pci=noacpi security=off intel_iommu=off, without quiet ...
  
Actual results:
After the udev daemon starts there are 3 seconds of disk activity and then the system freezes (Caps Lock does not respond).

Expected results:
To boot the Fedora 12 live CD on the Acer Aspire One D250 and do a proper install.

Additional info:
On the SATA harddisk there are two ext4 partitions with Ubuntu Karmic.

--- Additional comment from M8R-7fin56 on 2009-11-08 22:01:54 EST ---

I've had this problem on my PC with the Beta 2 LiveCD (non-KDE), the Beta 2 LXDE LiveCD, and a couple of LXDE nightlies since then (most recent being the 5th, those using LiveUSB creator and liveiso-to-usb).  I posted about it in a similar bug, but received no response.

--- Additional comment from fedora-triage-list on 2009-11-16 10:19:57 EST ---


This bug appears to have been reported against 'rawhide' during the Fedora 12 development cycle.
Changing version to '12'.

More information and reason for this action is here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

--- Additional comment from probinson on 2009-11-21 20:51:11 EST ---

I now have a pretty paperweight (Acer Aspire One D250). These comments are for when this bug is addressed:

Please do subsequent testing on the D250 model, see below.  Acer has changed 'something' for this model.

Not specifically related to this bug, but I've read everyone's glowing statements about F10, F11 and F12 on the Acer Aspire one.  Those comments do not apply to the D250 model.

F10:  Live boots, installs fine. But the NIC and wireless h/w don't even appear to this OS.  Period.  No network connectivity possible even obtaining and compiling in the tg3 drivers manually.

F11:  live boots, installs fine.  The 10/100 NIC at least appears - you see the MAC address.  Wireless doesn't even appear.  Still no network access.  F11 seems slower than F10 - even keyboard responsiveness.

Willing to be a guinea pig for any rush patches to F12 assist!

--- Additional comment from probinson on 2009-11-21 21:24:06 EST ---

Install DVD i386 gets to 'waiting for hardware to initialize...'

F12 Live I've tried removing 'quiet rhgb' and all combinations of 'noprobe nomodeset acpi=off acpi=noirq noacpi pci=noacpi noapm nodma nolapic noapic nolapic_timer'

Still no luck.

--- Additional comment from caldodge on 2009-11-24 13:22:21 EST ---

I hate to be a "me, too" guy, but I have the same experience while trying to install F12 on my D250-1165 (which hangs during "initializing hardware").

Some other versions of Linux will boot on this (Gparted 0.4.8.6, Knoppix 6.0, F11 install (it just doesn't recognize the built-in NIC)). I hope that provides some clue as to what's happening.

--- Additional comment from hafflys on 2009-11-27 17:52:35 EST ---

Please add me to the list.  I can't even get the live distro on an SD card to boot, let alone install.

--- Additional comment from beland.edu on 2009-11-30 03:14:23 EST ---

I also have an Aspire One D250, and I can confirm that the LiveCD locks up at the udev step with desktop-i386-20091129.00.iso (Fedora 13 Rawhide) as well as Fedora-12-i686-Live.iso, but I can boot up and log in with Fedora-11-i686-Live.iso.  There's only the original Windows partition on this machine.

--- Additional comment from beland.edu on 2009-12-01 10:21:55 EST ---

I did some testing (with original F11 RPMs - no updates except kernel-firmware-2.6.30.9-100.fc11.noarch), and it appears this problem was introduced in the rebase to the 2.6.30 kernel. kernel-2.6.29.6-217.2.6.fc11.i586 boots OK, but as reported, kernel-2.6.30.5-43.fc11.i586 wedges during udev startup, badly enough that pressing Caps Lock doesn't affect the corresponding LED.

It looks like other hardware configurations also stopped booting after this update; see: https://admin.fedoraproject.org/updates/F11/FEDORA-2009-9167

--- Additional comment from airlied on 2009-12-01 16:06:41 EST ---

can you try ignore_loglevel on boot and see what the last printed thing is?

--- Additional comment from beland.edu on 2009-12-02 12:23:16 EST ---

Booting Fedora-12-i686-Live.iso without rhgb and quiet, but with ignore_loglevel, the last lines printed are:

Starting udev: udev: starting version 145
ACPI: WMI: Mapper loaded
b43-pci-bridge 000:01:00.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
b43-pci-bridge 000:01:00.0: setting latency timer to 64
intel_rng: FWH not detected

--- Additional comment from beland.edu on 2009-12-08 12:54:04 EST ---

Created an attachment (id=376960)
Output of "lspci -vvv"

Doing the same thing with desktop-i386-20091203.16.iso (which has kernel-2.6.32-0.65.rc8.git5) I get only:

Starting udev: udev: starting version 147
ACPI: WMI: Mapper loaded
b43-pci-bridge 000:01:00.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
b43-pci-bridge 000:01:00.0: setting latency timer to 64

I'll attach hardware profile info obtained from booting under F11.

--- Additional comment from beland.edu on 2009-12-08 12:55:07 EST ---

Created an attachment (id=376962)
Output of smoltSendProfile

--- Additional comment from awilliam on 2009-12-09 12:07:48 EST ---

http://fedoraproject.org/wiki/Acer_Aspire_One suggests that the kernel parameter 'ssb.blacklist=1' helps with the AO751h model - could you try that with this model and see if it's maybe the same?

-- 
Fedora Bugzappers volunteer triage team
https://fedoraproject.org/wiki/BugZappers

--- Additional comment from beland.edu on 2009-12-09 13:37:55 EST ---

Yes, ssb.blacklist=1 enables my F12 LiveUSB image to boot.  Networking isn't working out of the box, but it wasn't working on F11 either.

--- Additional comment from vlad on 2009-12-09 14:31:10 EST ---

I used unetbootin to write the ISO image on the SD Card.
I booted successfully, but the install stopped with "no root device found".

--- Additional comment from hafflys on 2009-12-09 22:34:21 EST ---

I used livecd-iso-to-disk to write the ISO to the SD card after both unetbootin and the Fedora liveusb-creator failed to create the SD card properly.

Adding the ssb.blacklist=1 gives an error that the parameter is not recognized and is being ignored, but F12 will boot. Wired networking is functional out-of-box, but wireless is not.

I did a quick download of gparted to resize the NTFS partition and I am working on installing F12 now.  Thanks Adam.

--- Additional comment from hafflys on 2009-12-10 00:39:47 EST ---

See this thread on Fedora Forum about getting wireless to work.  I just tried it on a fresh F12 installation on the Acer Aspire D250, and it works.  Be sure to add ssb and b43 to the /etc/modprobe.d/blacklist.conf file too.

http://forums.fedoraforum.org/showthread.php?t=234055&highlight=ssb.blacklist%3D1

--- Additional comment from hafflys on 2009-12-10 00:42:28 EST ---

Correction:  It seems that installing kmod-wl from rpmfusion creates a broadcom-wl-blacklist.conf file in which bcm43xx, ssb, b43, and ndiswrapper are all specified, so manually entering them in the blacklist.conf file is probably not needed.
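
The blacklist entries discussed in the two comments above can be sketched as a small shell function. This is only an illustration: the function name write_ssb_blacklist and the file name ssb-blacklist.conf are hypothetical, and the target directory is a parameter so the result can be inspected in a scratch directory before anything is placed in /etc/modprobe.d.

```sh
#!/bin/sh
# Hypothetical helper: write a modprobe blacklist file for ssb and b43.
# Usage: write_ssb_blacklist DIR
# (DIR would be /etc/modprobe.d on a real system; root is needed there.)
write_ssb_blacklist() {
    dir=$1
    # "blacklist <module>" is the modprobe.d syntax that stops a module
    # from being loaded automatically via its hardware aliases.
    # printf repeats the format once per argument, giving one line each.
    printf 'blacklist %s\n' ssb b43 > "$dir/ssb-blacklist.conf"
}
```

As noted in the correction above, installing kmod-wl reportedly creates an equivalent file already, so a manual entry like this would only matter on systems without that package.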

--- Additional comment from caldodge on 2009-12-10 09:59:16 EST ---

(In reply to comment #13)
> http://fedoraproject.org/wiki/Acer_Aspire_One suggests that the kernel
> parameter 'ssb.blacklist=1' helps with the AO751h model - could you try that
> with this model and see if it's maybe the same?
> 

Yes, that did the trick with my D250-1165.

I then ran into the "the installer has tried to mount image 1" problem with my flash drive, but merely added "askmethod" to the boot line, then pointed the computer to an NFS share (yes, the F12 install kernel recognizes the NIC).

It's installing right now. I'll post again on the install's success when it's done.

--- Additional comment from beland.edu on 2009-12-10 13:16:59 EST ---

Assuming it's the same cause, Fedora-12-i386-netinst.iso gets stuck after the "detecting hardware..." line is printed, wedging with the same symptoms (Caps Lock doesn't work).  Using "ssb.blacklist=1" or "noprobe" gets past the hang.

But the installer assumes I'm doing a hard drive installation and asks me which partition the installation image is on.  When I try to force a URL install by using the Back button, it cannot detect either the wired or the wireless network interface.  I do have an Ethernet cable plugged in that I tested with a different machine, so I'm not sure why the wired interface isn't working for me but is working for Calvin.

--- Additional comment from awilliam on 2009-12-11 16:31:18 EST ---

your systems may have different wired ethernet adapters, I suppose. the 'ssb' module is vital not just for the b43 module (for Broadcom BCM43xx wireless controllers) but also for the b44 module (for Broadcom BCM44xx wired controllers). If your wired controller happens to use the b44 module, blacklisting ssb will cause it not to work...

kernel folks, looks like ssb is busted.

can anyone try loading ssb *after* booting and see if you get some fun logs? or even if it works (that'd be annoying)?


--- Additional comment from awilliam on 2009-12-11 16:33:12 EST ---

"Adding the ssb.blacklist=1 gives an error that the parameter is not recognized
and is being ignored"

this is normal, btw. See http://bugzilla.kernel.org/show_bug.cgi?id=14164 .


--- Additional comment from beland.edu on 2009-12-12 17:09:23 EST ---

When I do "modprobe ssb" with kernel-2.6.31.6-166.fc12.i686 and while logged in under Gnome, the system wedges hard enough that Caps Lock doesn't work, no error messages are printed on the screen, and nothing is added to /var/log/messages.

--- Additional comment from jvsmith on 2010-01-07 15:01:43 EST ---

*** Bug 551747 has been marked as a duplicate of this bug. ***

--- Additional comment from jvsmith on 2010-01-07 15:17:59 EST ---

I have an HP Mini 311 with the same Broadcom Corporation BCM4312 802.11b/g (rev
01). Here's what I've found. I downloaded the kernel rpm builds from
https://admin.fedoraproject.org/updates/search/kernel?_csrf_token=b76346f9c9e9fbef7773431aed20e2accef5059a

kernel 2.6.29.4-167.fc11.i686.PAE default F11 kernel boots fine with no issues.
kernel 2.6.29.6-213.fc11.i686.PAE an updated F11 kernel boots w/o issue.
kernel 2.6.29.6-217.2.16.fc11.i686.PAE another updated F11 kernel boots w/o
issue.
kernel 2.6.30.5-43.fc11 locks the system up hard at starting udev. If I add
ssb.blacklist=1 when booting 2.6.30.5-43.fc11.i686.PAE, the system boots w/o
issue.

Hope this helps.

--- Additional comment from jvsmith on 2010-01-08 18:03:02 EST ---

Using kernel 2.6.32.3-10.fc12.i686.PAE downloaded from koji my 311 system still doesn't boot except with adding ssb to the blocklist. 

Just some more information that I don't see in this report. Using lspci -vvvn for the wireless card in the 311 the first line is

03:00.0 0280: 14e4:4315 (rev 01)

According to http://wireless.kernel.org/en/users/Drivers/b43#Known_PCI_devices 14e4:4315 should be supported with 2.6.32 or later.
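
For anyone comparing hardware across the affected machines, the vendor:device pair can be pulled out of a numeric lspci line with a short shell sketch. The function names below are hypothetical, and the sketch assumes the line layout shown above (slot, class, then vendor:device).

```sh
#!/bin/sh
# Hypothetical helper: extract the vendor:device pair from one line of
# numeric lspci output, e.g. "03:00.0 0280: 14e4:4315 (rev 01)".
pci_id_from_lspci_line() {
    # Whitespace-separated fields: slot, class (with trailing colon),
    # vendor:device, then the revision.
    printf '%s\n' "$1" | awk '{ print $3 }'
}

# Hypothetical check: is this the 14e4:4315 device named in this report?
is_bcm4315() {
    [ "$(pci_id_from_lspci_line "$1")" = "14e4:4315" ]
}

# Example (not run here):
#   is_bcm4315 '03:00.0 0280: 14e4:4315 (rev 01)' && echo "same device"
```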

--- Additional comment from awilliam on 2010-01-11 13:11:27 EST ---

has anyone looked to see if they can get any kind of useful traceback from loading ssb? perhaps by booting with it blacklisted then manually loading it after boot ('modprobe ssb')? thanks.




--- Additional comment from jvsmith on 2010-01-11 14:59:46 EST ---

Adam, thanks for your help. Once I added only 'blacklist ssb' to /etc/modprobe.d/anaconda.conf I could do a modprobe ssb w/o getting the 'Unknown parameter...' in /var/log/messages. The problem is I don't get any useful information as my machine locks up hard. Same as on boot if I commented out the 'blacklist ssb' line in /etc/modprobe.d/anaconda.conf. I did this running 2.6.32.3-10.fc12.i686.PAE.

--- Additional comment from awilliam on 2010-01-12 18:19:01 EST ---

well, thanks for trying :( so you get nothing at all relevant in /var/log/messages from the time you tried the modprobe and saw the lockup?




--- Additional comment from jvsmith on 2010-01-12 23:06:21 EST ---

(In reply to comment #29)
> well, thanks for trying :( so you get nothing at all relevant in
> /var/log/messages from the time you tried the modprobe and saw the lockup?

Nothing when doing modprobe ssb. Just a quick hard lockup. When I do a modprobe b43, I get a few lines of lib80211 stuff, then some frequency information, and that's all.

--- Additional comment from kurt on 2010-02-07 03:22:15 EST ---

Acer D250 with a 40gig Intel X25-V SSD, Fedora 12 from DVD (external USB drive) fails to install (waiting for hardware to initialize...), CentOS 5.4 i386 DVD installs fine.

--- Additional comment from beland.edu on 2010-02-12 16:59:37 EST ---

Nominating as F13 beta blocker, because this is a serious regression (if hardware-specific) and listed on Common Bugs.  I assume it could be fixed using older software or at least hacked so that the machines boot with the problem hardware disabled.  Fix by beta would allow time for adequate testing.

--- Additional comment from linville on 2010-03-12 13:57:07 EST ---

*** Bug 532369 has been marked as a duplicate of this bug. ***

--- Additional comment from linville on 2010-03-12 14:11:36 EST ---

Can the Acer Aspire One D250 users confirm that the hang when loading the ssb driver occurs even after removing b43-openfwwf (indeed, with no /lib/firmware/b43 at all) and even when running a 2.6.32-based kernel?  FWIW, has anyone tried a 2.6.33 (or later) kernel yet?

--- Additional comment from awilliam on 2010-03-12 15:21:33 EST ---

Note - testing with 2.6.33 can easily be achieved by trying a Fedora 13 image, either the Alpha or the nightly builds at http://alt.fedoraproject.org/pub/alt/nightly-composes/desktop/ .




--- Additional comment from kurt on 2010-03-12 16:11:11 EST ---

Ok got the i386 iso (verified sha256sum), booted, choosing "Boot" option and the logo fills up most of the way and then dies, DVD drive spins down. ctrl-alt-del doesn't work. Power cycled laptop. Tried again, chose "verify and boot", same deal, logo fills up most of the way, system stalls and becomes non responsive, DVD drive spins down. Also tried letting it do automatic boot, same results.

I guess that means it is still broken on the D250 =(.

--- Additional comment from beland.edu on 2010-03-12 16:17:24 EST ---

I tried desktop-i386-20100310.20.iso.  Got stuck at the udev stage again, until I added ssb.blacklist=1 to the boot parameters.

--- Additional comment from beland.edu on 2010-03-12 19:27:38 EST ---

With kernel-2.6.32.9-70.fc12.i686, even after removing b43-openfwwf-5.2-3.fc12.noarch and confirming that /lib/firmware/b43 does not exist, I tried rebooting and got the hang again until I added ssb.blacklist=1 as a boot parameter.
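
The two conditions checked in this test (no b43 firmware directory, and ssb.blacklist=1 on the kernel command line) can be verified with a sketch like the following. Both function names are hypothetical, and the command line and filesystem root are passed in as arguments so the checks can be tried without the affected hardware.

```sh
#!/bin/sh
# Hypothetical helper: succeed if PARAM appears as a whole word in CMDLINE.
# Usage: cmdline_has_param CMDLINE PARAM
cmdline_has_param() {
    # Pad with spaces so a match must be a complete parameter.
    case " $1 " in
        *" $2 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical helper: succeed if ROOT/lib/firmware/b43 does not exist.
# Usage: b43_firmware_absent ROOT   (ROOT is "" for the running system)
b43_firmware_absent() {
    [ ! -e "$1/lib/firmware/b43" ]
}

# Example on a running system (not run here):
#   cmdline_has_param "$(cat /proc/cmdline)" ssb.blacklist=1 && echo "ssb blacklisted"
#   b43_firmware_absent "" && echo "no b43 firmware installed"
```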

--- Additional comment from larry.finger on 2010-03-12 19:45:19 EST ---

After booting with ssb blacklisted, is it possible to 'modprobe -v ssb'? This command requires root privilege (sudo) and may not be in the default path (I'm not a Fedora user). If it works, please check the tail of dmesg for any output.

If that works, then try 'modprobe -v mac80211'. Again check dmesg output.

Finally, if that works, try 'modprobe -v b43' and check dmesg output.

I have downloaded the 32-bit live CD and will be trying it.

--- Additional comment from awilliam on 2010-03-12 20:11:53 EST ---

larry: I already asked that earlier. Several people have replied that attempting to load ssb freezes the machine.




--- Additional comment from larry.finger on 2010-03-12 20:40:49 EST ---

Sorry, I missed that info. Is there any possibility of either serial or network console to get any dump info?

It seems that a 'modprobe -v b43' does not lock up instantly (Comment #30). Could someone try issuing that command and immediately switch to a logging console? Is that Ctrl-Alt-F10 on Fedora, or is it somewhere else? We might get some info that way.

On my i686 system with both b43 and b43legacy devices, the Live CD booted just fine and I was able to connect with the b43 using firmware from the openfwwf project. In any case, ssb loaded without error.

--- Additional comment from awilliam on 2010-03-12 20:55:17 EST ---

larry: this affects specific models - so far we know for sure of the Acer Aspire One 751h and D250. Many other systems with Broadcom adapters are known to boot successfully.




--- Additional comment from larry.finger on 2010-03-12 21:44:58 EST ---

Have I told you I hate Netbooks? There are a number of machines that generate DMA errors when trying to use the BCM4312 devices, though none of them crashed on booting, and we have been unable to discover the reason. All reported so far will work if the kernel is configured for PIO rather than DMA.

Perhaps the diagnostics from these systems will help with the others. Any info from the crash would be really useful.

--- Additional comment from beland.edu on 2010-03-12 22:45:31 EST ---

With desktop-i386-20100310.20.iso (which looks like kernel-2.6.33-1), if I add ignore_loglevel to the kernel parameters, I'm now getting:

>>
Starting udev: udev: starting version 151
ACPI: WMI: Mapper loaded
b43-pci-bridge 000:01:00.0: PCI INT A -> GSI (level, low) -> IRQ 16
b43-pci-bridge 000:01:00.0: setting latency timer to 64
ssb: Core 0 found: ChipCommon (cc 0x800, rev 0x16, vendor 0x4243)
ssb: Core 1 found: IEE 802.11 (cc 0x812, rev 0x0F, vendor 0x4243)
ssb: Core 2 found: PCMCIA (cc 0x08D, rev 0x0A, vendor 0x4243)
ssb: Core 3 found: PCI-E (cc 0x820, rev 0x09, vendor 0x4243)
<<

That's transcribed from a photo.  This netbook has no serial cable, only USB, monitor, audio, and Ethernet that's not working because of all this Broadcom craziness.

--- Additional comment from beland.edu on 2010-03-12 23:02:06 EST ---

I played around with "modprobe -v b43"; all I learned was that it wedges the machine hard when it reaches the insmod step that loads ssb.ko (which I had to do manually because it was confused by the "blacklist" parameter).

--- Additional comment from larry.finger on 2010-03-12 23:14:17 EST ---

Doing the modprobe won't help unless you get to the logging console before it freezes. It may not show anything, but it might.

BTW, the Broadcom wl driver should work. The article at 
http://fedoramobile.org/fc-wireless/broadcom-linux-sta-driver gives the F9 and
F10 links for sneakernetting the necessary RPM file. It will taint your kernel, but you will have networking.

--- Additional comment from larry.finger on 2010-03-12 23:35:48 EST ---

Question for the people with this problem - are any of you in the Kansas City, MO area?

--- Additional comment from beland.edu on 2010-03-13 02:45:31 EST ---

I can set up networking that works in F12 using F12 RPMs from RPMFusion, but that blacklists ssb as a side effect.  I was tailing /var/log/messages while testing, and if the kernel didn't have time to get output from there back to the screen, I certainly wouldn't have made it to a different virtual terminal.  Normally kernel oopses during boot would be printed directly to the screen where I could see them, but in this case I don't see anything, so I'm thinking the kernel might not survive long enough to even spit out an error.  The hard wedge after insmod is pretty instantaneous.

--- Additional comment from jvsmith on 2010-03-13 21:11:38 EST ---

On my HP Mini 311 that I posted about previously and using the F13 Alpha release the only way I can get the machine to boot is by adding ssb.blacklist=1 to the boot command.

--- Additional comment from linville on 2010-03-15 11:31:11 EDT ---

FWIW, having rebuilt 2.6.32.9-67.fc12 locally with B43 (and B43LEGACY) disabled (along with the B43_PCI_BRIDGE) I was able to load ssb without a hang on the HP Mini here (which hangs w/ the stock configuration).  Trying a build now (taking forever, damned netbooks) w/ B43_PIO=y...

--- Additional comment from linville on 2010-03-15 11:40:48 EDT ---

...and that still locks-up tight on modprobe.  I'm not sure if that really points at the problem, since the SSB code won't be exercised (much) w/o b43 to use it.  It is a shame that we can't plug a b44 adapter into these netbooks...

I guess I'll try some "printk" debugging to pinpoint the failure...

--- Additional comment from larry.finger on 2010-03-15 11:51:48 EDT ---

I'm confused. At first I thought that modprobing ssb was OK if b43 was not available, now I'm not sure. What is the exact configuration in Comment 51?

--- Additional comment from mb on 2010-03-15 12:00:34 EDT ---

Trying random kernels over and over again certainly is not going to fix the issue. There have been basically no changes in the bootup/initialization code for months (years?).
I think somebody has to insert a fair amount of printks to find the place where it hangs.
Also keep in mind that the PCI-E core code _is_ broken. We know that from debugging of the DMA problem. Just as a hint.

--- Additional comment from linville on 2010-03-15 13:29:28 EDT ---

Larry, modprobing ssb is fine if you have disabled b43.  If you enable b43 it will lock-up when modprobing ssb (most likely due to the subsequent load of b43), even if using B43_PIO=y -- sorry if that was unclear.

Michael, the "random kernels" is to try to determine a rough cut for pinpointing the problem.  I could start with a printk in start_kernel, but I suspect it may take a while to pinpoint things that way. :-)

--- Additional comment from mb on 2010-03-15 13:43:50 EDT ---

(In reply to comment #54)
> Michael, the "random kernels" is to try to determine a rough cut for
> pinpointing the problem.

Well, is this a regression? Didn't sound like one to me.

> I could start with a printk in start_kernel, but I
> suspect it may take a while to pinpoint things that way. :-)    

That would be pretty silly, because we know that the problem is within the ssb or b43 initialization code. I think that codepath is small enough to add some printks to track down the point of failure. The very first thing would be to check whether the failure occurs in ssb or b43, because I think that is still unclear. Once we got a _rough_ pointer to where the failure is, we can add more specific printks to find the exact place.

--- Additional comment from larry.finger on 2010-03-15 14:07:49 EDT ---

I think it is a "regression" only in that LP PHYs were not supported before 2.6.32, thus the problem exists in .32, but not in .31. I have assumed that John's system has PCI ID of 14e4:4315 - at least the other reporters have that card. If you have some other card, its identity is important.

John:

If you do a 'sleep 5; modprobe ssb' and switch to the logging console, does any logging output show up?

If you do have the 4315, does this patch keep the system from freezing?

Index: wireless-testing/drivers/net/wireless/b43/main.c
===================================================================
--- wireless-testing.orig/drivers/net/wireless/b43/main.c
+++ wireless-testing/drivers/net/wireless/b43/main.c
@@ -4024,7 +4024,7 @@ static int b43_phy_versioning(struct b43
 #endif
 #ifdef CONFIG_B43_PHY_LP
        case B43_PHYTYPE_LP:
-               if (phy_rev > 2)
+//             if (phy_rev > 2)
                        unsupported = 1;
                break;
 #endif

--- Additional comment from j.a.watson on 2010-03-15 17:08:16 EDT ---

I have just run into the same problem, when loading F13 Alpha.  Once for certain, and once very probably but I am still trying to verify it.

For certain: HP Pavillion dm1-1020ez, Broadcom 4315 wireless adapter.  The Live Image freezes on boot every time.  Adding ssb.blacklist=1 to the boot command (at Adam's suggestion) fixes the freeze, and the Live Image boots and installs ok.  The installed image likewise freezes on boot, so I added the blacklist to the kernel line in the grub menu.lst file.  It then boots without trouble, but of course the Broadcom adapter doesn't work.

Probable: HP Pavillion dv2-1010ez, Atheros 9285 wireless adapter.  The Live Image hangs on boot intermittently, not always.  Once I got it to boot and install, the installed image doesn't seem to hang at all, at least so far.

--- Additional comment from awilliam on 2010-03-15 17:15:19 EDT ---

j.a: your 'probable' is some other bug, this is specific to Broadcom chipsets (the ssb driver is only used with Broadcom chipsets). Just FYI, you can actually get the wireless working with ssb blacklisted by using the proprietary Broadcom driver, wl, which I think is available in RPMFusion.




--- Additional comment from dxm523 on 2010-03-16 06:29:16 EDT ---

Hi,
I can also confirm this bug on an HP Compaq 615. Same symptoms as described above. It is not tied to Fedora, though: I am running 64-bit Debian and can only use Broadcom wl for WLAN. BUT for some reason the b43 driver sometimes does work (about one boot in 20; I rebooted a lot while trying to install), but I cannot relate it to any particular event, whether that is running Windows or the proprietary drivers. As far as I could see on the bcm43xx list, there seem to be some problems related to the ssb code writing to memory ranges it should not write to :)
I would be glad to help somehow, but I am not much of a kernel hacker ;) Comment 44 reflects what I have seen, too. Afterwards the screen turns black, with no kernel oops.

--- Additional comment from linville on 2010-03-16 16:21:47 EDT ---

Larry, loading ssb locks it up tight -- nothing on the console, nothing on netconsole, nothing at all.  As for the LP phy patch from comment 56, that does _not_ avoid the hang.

I'm still poking at it, but I have yet to narrow-down the failing line -- still trying...

--- Additional comment from larry.finger on 2010-03-17 12:05:10 EDT ---

AFAIK, these are single CPU computers, thus we probably have a locking problem, or an infinite loop.

One way that might help eliminate a lot of steps would be to build a kernel with MMIO tracing enabled (CONFIG_MMIOTRACE=y). In one console, enter the following (as root):

echo 5600 > /sys/kernel/debug/tracing/buffer_size_kb
echo mmiotrace > /sys/kernel/debug/tracing/current_tracer
cat /sys/kernel/debug/tracing/trace_pipe &

In a second console, 'sleep 10; modprobe b43' and switch back to the first console before the system freezes. The last screenful of trace messages before the freeze should help us determine where it was in the initialization process.
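
Before following the steps above, it may be worth confirming that the running kernel actually offers the mmiotrace tracer. The sketch below is one way to check; the function name tracer_available is hypothetical, and the file to inspect is passed as an argument (on a real system it would be /sys/kernel/debug/tracing/available_tracers, readable as root with debugfs mounted).

```sh
#!/bin/sh
# Hypothetical helper: succeed if TRACER is listed in the given
# available_tracers file.
# Usage: tracer_available FILE TRACER
tracer_available() {
    # The file holds a single space-separated line of tracer names.
    for t in $(cat "$1"); do
        [ "$t" = "$2" ] && return 0
    done
    return 1
}

# Example (not run here):
#   tracer_available /sys/kernel/debug/tracing/available_tracers mmiotrace \
#       || echo "kernel was not built with CONFIG_MMIOTRACE=y"
```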

--- Additional comment from mb on 2010-03-17 12:41:43 EDT ---

> AFAIK, these are single CPU computers, thus we probably have a locking problem,

On UP it is extremely unlikely to have locking problems that lock up the complete machine, because spinlocks don't exist. Also, in the initialization code there's basically no concurrency. But it's trivial to rule out locking problems by enabling lockdep.

> or an infinite loop.

I think that is rather unlikely, too.

I think it is just locking up on an invalid bus access. So the kernel tries to read (or possibly write) a register that does not exist and thus the device does not respond on the bus. That would lock up the CPU in hardware.
The Broadcom 43xx device is _known_ to exhibit undefined behavior (lockups) when dangling registers are accessed.

> echo 5600 > /sys/kernel/debug/tracing/buffer_size_kb
> echo mmiotrace > /sys/kernel/debug/tracing/current_tracer
> cat /sys/kernel/debug/tracing/trace_pipe &

This might help a bit, but keep in mind that it will most likely not lead you to the actual line of code that locks up the whole machine. Userspace is involved in the "cat" and if it's not able to print the contents of trace_pipe (because the CPU is busy running b43's initialization code), it won't print everything (or anything at all).
So for this to work you need a preemptible kernel. And even then it won't print everything that you want to see (if it prints something at all).

I think the only way to properly debug this is to insert synchronous printks into the code. But I think I already said that a few times...

(If it really locks up the CPU on an invalid bus access, a kernel level debugger also won't help.)

--- Additional comment from larry.finger on 2010-03-17 14:31:04 EDT ---

Once again I demonstrate my ignorance of the entire subject of locking.

I thought we had all the dangling register references worked out of b43. The x86 architecture has been fairly forgiving by returning all ones in reads of non-existent registers. Many were found because the PPC arch generates a machine check. At one point, I had instrumented all the ssb register read code to report the condition. In addition, all the instances that we had earlier were for registers that did not exist on older hardware. I would not expect such problems on new stuff.

I will put my checks back in to see if I find a read of a nonexistent register on my 4315.

--- Additional comment from mb on 2010-03-17 14:48:01 EDT ---

> I will put my checks back in to see if I find a read of a nonexistent register
> on my 4315.

It is a lot easier to sprinkle printks over the code to first get an idea of what is going on at all. Nobody (including me) knows:
- Where it crashes (We don't even know for sure which module it crashes in, yet).
- What kind of crash occurs (loop, MMIO, lock, whatever etc...)

I think these two things are the very first things one needs to find out _before_ anything else is done.

The printk sprinkling can be done in several iterations. Think of it like a git-bisect. It's a very fast way of narrowing down this type of hard-to-track-down-bug to a few hundred lines of code. For the first patch version I would just add a few printks to determine whether it hangs in SSB or b43.

I do not know if it hangs on some kind of MMIO access, of course. It's just a theory. So it's probably not a good idea to waste a lot of time searching for some "nonexistent registers". (I don't know how you'd correctly do that anyway).

--- Additional comment from linville on 2010-03-17 15:27:25 EDT ---

I'm doing the printk thing, currently narrowed down to call of ssb_pci_sprom_get in drivers/ssb/pci.c -- still iterating to pinpoint it further.

--- Additional comment from mb on 2010-03-17 16:11:49 EDT ---

Thanks John for tracking that down.

So this might be one of these devices that completely lack an SPROM (for whatever braindamaged reasons). I had reports of them in the past, but they didn't lock up back then.

We had a few discussions on how to handle these devices back then and it basically boiled down to a solution using the firmware loading mechanism.
It would basically work this way: A userspace script generates an SPROM image and stores it in /lib/firmware for the ssb kernel module to fetch via the firmware loading mechanism. The main task of the userspace script is to generate a MAC address in the SPROM image. We cannot generate the SPROM inside the kernel, because there's no sane way to generate a sufficiently unique MAC address that's also constant across reboots and kernel or hardware changes. So the SPROM needs to be generated _once_ and then be stored on HDD.

I don't have an implementation for that, nor do I plan to do one, however.
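
For illustration only, the MAC address part of such a userspace tool might look roughly like the sketch below. make_local_mac() is a hypothetical helper, not part of any existing tool. It assumes the caller supplies six bytes obtained once from a random source, and it does not model the real SPROM layout, its checksum, or the step of storing the result in /lib/firmware.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Turn six arbitrary bytes into a unicast, locally administered MAC
 * address. Only the two flag bits in the first octet are forced; the
 * remaining 46 bits are taken from the input unchanged.
 */
static void make_local_mac(const uint8_t random_bytes[6], uint8_t mac[6])
{
	size_t i;

	for (i = 0; i < 6; i++)
		mac[i] = random_bytes[i];

	mac[0] &= (uint8_t)~0x01u;	/* clear the multicast (group) bit */
	mac[0] |= 0x02u;		/* set the locally administered bit */
}
```

The address stays constant across reboots only because the generated image is written out once and reused, which is the point made above about generating the SPROM outside the kernel.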

--- Additional comment from larry.finger on 2010-03-17 17:12:54 EDT ---

John: Your comment rang a bell in my mind as well.

In my RE work, I have come across a routine named is_sprom_available() that returns a bool. I have not finished understanding the routine, but there is a section that refers to BCM4312 devices, i.e. those with PCI ID 14e4:4315.

To preserve clean-room conditions, I will not be able to write a patch for you, but I can give you a prescription (I don't think Michael will help here either.):

In struct ssb_bus, you should add a u32 to contain the chipcommon status.

In ssb_bus_scan() where the ssb routine reads the chipcommon id, revision, and capabilities, you should read the register with offset 0x2C (SSB_CHIPCO_CHIPSTAT) and save the result in the new word in ssb_bus. Ultimately, this read will be conditional on the rev >= 11, but that will be true for you.

Before the ssb code reads the SPROM in ssb_pci_sprom_get(), check whether the status word from above & 3 is equal to 2. If it is, your device does not have an SPROM and we need to do some fixup similar to what Michael described above. For now, you can return ENOMEM or ENODEV.

If this Q&D patch keeps your machine from freezing, I'll work on a complete set of specs for a proper is_sprom_available() and Gabor or Rafal will be able to put together a set of patches and a userland utility to create a suitable SPROM replacement file in /lib/firmware.

--- Additional comment from mb on 2010-03-18 12:36:06 EDT ---

> In struct ssb_bus, you should add a u32 to contain the chipcommon status.

Please don't put the variable into the bus structure, but into the chipcommon data structure.

> For now, you can return ENOMEM or ENODEV.

Yeah, as a quick fix for the fatal hang this is an acceptable workaround.

> Gabor or Rafal will be able to
> put together a set of patches and a userland utility to create a suitable SPROM
> replacement file in /lib/firmware.

I will put that stuff into the b43-tools package, of course. But I'm currently unable to write these tools. I accept patches, of course. Note that you need to implement an asynchronous firmware (sprom) fetching mechanism using the asynchronous firmware library functions.
For the userspace tool it's probably best to extend the ssb-sprom tool to support generating a valid sprom.

--- Additional comment from linville on 2010-03-18 14:41:25 EDT ---

http://bcm-v4.sipsolutions.net/802.11/IsSpromAvailable

--- Additional comment from linville on 2010-03-18 14:43:22 EDT ---

Created an attachment (id=401103)
ssb_check_for_sprom.patch

Haven't tested this one yet, but an open-coded check just of the chipcommon status avoided the crash on the box here.  Also it still avoids a working device as well, but better not to crash... :-)

--- Additional comment from larry.finger on 2010-03-18 15:09:32 EDT ---

Michael and I are working out the details of the user-space and kernel components of supplying a virtual SPROM image, but that shouldn't take long.

Your patch looks fine. I probably would have coded it as

		return ((bus->chipco.status & 0x3) != 2);

rather than

		if ((bus->chipco.status & 0x3) != 2)
			return true;
		else
			return false;

We should also supply some defines for all those magic numbers, but that is also a matter of taste.

Thanks for the grunt work on this problem.

--- Additional comment from linville on 2010-03-18 15:17:15 EDT ---

Scratch build w/ above patch is (or will be) available here:

http://koji.fedoraproject.org/koji/taskinfo?taskID=2061828

Again, the wireless still won't work but (hopefully) it won't crash on load of
ssb.ko...

Re: style -- I was debating about that.  Honestly, both ways seem a bit awkward/ugly. :-(  Do you have any suggestions for naming the magic numbers?  Or might they already be defined somewhere?

--- Additional comment from mb on 2010-03-18 15:23:34 EDT ---

(In reply to comment #70)
> Created an attachment (id=401103)
> ssb_check_for_sprom.patch
> 
> Haven't tested this one yet, but an open-coded check just of the chipcommon
> status avoided the crash on the box here.  Also it still avoids a working
> device as well, but better not to crash... :-)    

Please send patches for review via email. It's a pain to comment on patches here.

The patch has a few problems:

1) Don't read chipstat if it doesn't exist (it only exists on chipcommon rev >= 11). We know what happens on reading registers that don't exist. ;)
2) You are checking the chip revision where you should check the chipcommon core revision.
3) Please create a defined name for chipcommon capability 0x40000000. It obviously is a "SPROM-present" capability flag.

Just as a sidenote: The patch does have a potential for creating regressions, IMO. I think we should not blindly apply it without any testing on a fair amount of devices.

I also support Larry's comment.
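
A small userspace model of points 1) and 2), as a sketch only and not the actual patch. struct chipco_model, chipco_cache_status(), sprom_present() and fake_read32() are hypothetical names. The 0x2C offset and the rev >= 11 condition come from this thread. Two things are assumptions: that (status & 0x3) == 2 means "no SPROM", and that cores too old to have the status register are treated as having an SPROM.

```c
#include <stdbool.h>
#include <stdint.h>

#define CHIPCO_CHIPSTAT_OFFSET 0x2C	/* register offset given in the thread */

/* Hypothetical model of the chipcommon core data. */
struct chipco_model {
	uint8_t core_rev;			/* chipcommon core revision */
	uint32_t status;			/* cached chipstat, 0 if absent */
	uint32_t (*read32)(uint16_t offset);	/* stand-in for an MMIO read */
};

/* Test double: counts reads so a forbidden access can be detected. */
static unsigned int fake_reads;

static uint32_t fake_read32(uint16_t offset)
{
	fake_reads++;
	return offset == CHIPCO_CHIPSTAT_OFFSET ? 0x2 : 0xFFFFFFFFu;
}

static void chipco_cache_status(struct chipco_model *cc)
{
	/* Key off the chipcommon core revision, not the chip revision. */
	if (cc->core_rev >= 11)
		cc->status = cc->read32(CHIPCO_CHIPSTAT_OFFSET);
	else
		cc->status = 0;	/* register does not exist: never touch it */
}

static bool sprom_present(const struct chipco_model *cc)
{
	if (cc->core_rev < 11)
		return true;	/* assumption: old cores always have an SPROM */

	/* Assumption: selector value 2 means "no SPROM". */
	return (cc->status & 0x3) != 2;
}
```

A caller in the position of ssb_pci_sprom_get() would then bail out with an error instead of reading the SPROM when sprom_present() returns false.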

--- Additional comment from awilliam on 2010-03-18 15:34:26 EDT ---

John: with this patch, should the wired networking now work on systems which have b44 wired adapters? previously, because you had to blacklist ssb, you also lost wired functionality on systems whose wired adapter is also broadcom...



-- 
Fedora Bugzappers volunteer triage team
https://fedoraproject.org/wiki/BugZappers

--- Additional comment from mb on 2010-03-18 15:39:58 EDT ---

(In reply to comment #74)
> John: with this patch, should the wired networking now work on systems which
> have b44 wired adapters? previously, because you had to blacklist ssb, you also
> lost wired functionality on systems whose wired adapter is also broadcom...

Most likely, yes. b44 should most likely work with this patch.
At least we don't know of b44 devices without SPROM. You should simply try it.

--- Additional comment from awilliam on 2010-03-18 15:56:34 EDT ---

I can't, I don't have an affected system. I'm just trying to track the issue for documentation purposes.

--- Additional comment from larry.finger on 2010-03-18 17:27:34 EDT ---

John:

Some defines for the magic numbers:

0x40000000 can be called SSB_CHIPCO_CAP_SPROM and defined in include/linux/ssb/ssb_driver_chipcommon.h

The 0x40 in the 0x4322 branch is BCM4322_SPROM_PRESENT (Note: I simplified the specs.).

The 0x1 in the 0x4325 branch is BCM4325_SPROM_PRESENT.

--- Additional comment from linville on 2010-03-18 20:23:35 EDT ---

Cool, thanks Larry...any suggestions for the numbers in the 0x4312 branch?  I suspect that it is similar to the mappings for SSB_CHIPCO_CHST_4325_SPROM_OTP_SEL and the related definitions below it...?

--- Additional comment from larry.finger on 2010-03-19 00:41:51 EDT ---

The 3 is SSB_CHIPCO_CHST_4325_SPROM_OTP_SEL.

The 2 is  SSB_CHIPCO_CHST_4325_OTP_SEL.

These two definitions have 4325 in them because they originated from the N PHY code, but still apply to the 4312.
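
With those names, the 0x4312 branch could be written along these lines. This is a sketch: the two macro names are the ones given above, with the values 3 and 2 from this thread, while bcm4312_sprom_present() is a hypothetical helper name and the reading of selector value 2 as "no SPROM" is an assumption.

```c
#include <stdbool.h>
#include <stdint.h>

/* "The 3": mask selecting the SPROM/OTP field of the chipcommon status. */
#define SSB_CHIPCO_CHST_4325_SPROM_OTP_SEL	0x00000003
/* "The 2": field value assumed here to mean that no SPROM is present. */
#define SSB_CHIPCO_CHST_4325_OTP_SEL		2

/* Hypothetical helper for the 0x4312 branch of the SPROM check. */
static bool bcm4312_sprom_present(uint32_t chipco_status)
{
	return (chipco_status & SSB_CHIPCO_CHST_4325_SPROM_OTP_SEL) !=
	       SSB_CHIPCO_CHST_4325_OTP_SEL;
}
```

This is the same test as the open-coded (status & 0x3) != 2, just without the bare numbers.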

--- Additional comment from dxm523 on 2010-03-19 07:43:19 EDT ---

Larry's patch did work for me. modprobe b43, which loads ssb, does not hang any more. Here's the dmesg after modprobe:

b43-pci-bridge 0000:06:00.0: PCI INT A -> GSI 17 (level, low) -> IRQ 17
b43-pci-bridge 0000:06:00.0: setting latency timer to 64
ssb: Core 0 found: ChipCommon (cc 0x800, rev 0x16, vendor 0x4243)
ssb: Core 1 found: IEEE 802.11 (cc 0x812, rev 0x0F, vendor 0x4243)
ssb: Core 2 found: PCMCIA (cc 0x80D, rev 0x0A, vendor 0x4243)
ssb: Core 3 found: PCI-E (cc 0x820, rev 0x09, vendor 0x4243)
b43-pci-bridge 0000:06:00.0: PCI INT A disabled
Broadcom 43xx driver loaded [ Features: PL, Firmware-ID: FW13 ]

thanks a lot, looking forward to the SPROM userspace stuff

--- Additional comment from dxm523 on 2010-03-19 07:47:29 EDT ---

sorry it is actually John's patch ;) btw my uname -a if it matters somehow:

Linux mobilemog-ng 2.6.34-rc1-next-20100319+ #1 SMP Fri Mar 19 11:40:44 CET 2010 x86_64 GNU/Linux

--- Additional comment from linville on 2010-03-19 16:47:54 EDT ---

Created an attachment (id=401347)
0001-ssb-do-not-read-SPROM-if-it-does-not-exist.patch

--- Additional comment from linville on 2010-03-19 18:14:29 EDT ---

Created an attachment (id=401358)
0001-ssb-do-not-read-SPROM-if-it-does-not-exist.patch

--- Additional comment from linville on 2010-03-20 19:17:26 EDT ---

F-12:

http://koji.fedoraproject.org/koji/buildinfo?buildID=162784

F-13:

http://koji.fedoraproject.org/koji/buildinfo?buildID=162794

--- Additional comment from updates on 2010-03-23 10:56:34 EDT ---

kernel-2.6.32.10-90.fc12 has been submitted as an update for Fedora 12.
http://admin.fedoraproject.org/updates/kernel-2.6.32.10-90.fc12

Comment 1 Fedora Update System 2010-03-23 17:27:35 UTC
kernel-2.6.33.1-19.fc13 has been submitted as an update for Fedora 13.
http://admin.fedoraproject.org/updates/kernel-2.6.33.1-19.fc13

Comment 2 Adam Williamson 2010-03-23 21:43:08 UTC
Can people seeing this problem on f13 please test the updated kernel and see how it works? thanks.

Comment 3 Fedora Update System 2010-03-24 00:49:56 UTC
kernel-2.6.33.1-19.fc13 has been pushed to the Fedora 13 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 4 Christopher Beland 2010-03-28 07:32:55 UTC
I can confirm that desktop-i386-20100326.17.iso is now booting on my D250, with some warnings I'll post separately.

Comment 5 Christopher Beland 2010-04-18 17:13:54 UTC
For the record, the warnings were reported under Bug 567325.

