
Bug 577463

Summary: b44 module fails to load with 2.6.33.1-19.fc13.i686.PAE, doesn't boot
Product: [Fedora] Fedora
Component: kernel
Version: 13
Hardware: i686
OS: Linux
Status: CLOSED ERRATA
Severity: urgent
Priority: low
Reporter: Honza 'thingie' Bartoš <thingie>
Assignee: John W. Linville <linville>
QA Contact: Fedora Extras Quality Assurance <extras-qa>
CC: anton, awilliam, bojan, dougsland, fschwarz, gansalmon, goran.wallin.dev, itamar, johannbg, jonathan, kernel-maint, mads, milan.slanar, volker, walovaton
Fixed In Version: kernel-2.6.33.1-24.fc13
Doc Type: Bug Fix
Last Closed: 2010-04-07 21:50:30 UTC
Bug Blocks: 538274
Attachments:
Description Flags
0001-ssb-avoid-null-ptr-deref-in-ssb_is_sprom_available.patch none

Description Honza 'thingie' Bartoš 2010-03-27 08:04:38 UTC
Description of problem:
When booting the 2.6.33.1-19.fc13.i686.PAE kernel on an HP Compaq nx7400 (EY505ES#AKB) laptop with a b44 ethernet card, loading the b44 module produces a stack trace, modprobe fails, and the system hangs after the "setting hostname" message, so it never finishes booting. I could boot only after removing the b44 module.

Version-Release number of selected component (if applicable):
2.6.33.1-19.fc13.i686.PAE

How reproducible:
Always.

BUG: unable to handle kernel NULL pointer dereference at 00000010
IP: [<f9116b23>] ssb_is_sprom_available+0xe/0x7c [ssb]
*pdpt = 0000000032274001 *pde = 0000000000000000 
Oops: 0000 [#1] SMP 
last sysfs file: /sys/module/rfkill/initstate
Modules linked in: b44(+) ppdev ssb snd_timer parport_pc iTCO_wdt snd hp_wmi microcode parport serio_raw mii soundcore iTCO_vendor_support joydev btusb mmc_core bluetooth snd_page_alloc wmi rfkill dm_multipath firewire_ohci yenta_socket rsrc_nonstatic firewire_core crc_itu_t i915 drm_kms_helper drm i2c_algo_bit i2c_core video output [last unloaded: scsi_wait_scan]

Pid: 635, comm: modprobe Not tainted 2.6.33.1-19.fc13.i686.PAE #1 30A2/HP Compaq nx7400 (EY505ES#AKB)
EIP: 0060:[<f9116b23>] EFLAGS: 00010296 CPU: 0
EIP is at ssb_is_sprom_available+0xe/0x7c [ssb]
EAX: f23694f0 EBX: f23694f0 ECX: 00000000 EDX: 00000000
ESI: ffffffed EDI: f2301e4c EBP: f2301da4 ESP: f2301da4
 DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
Process modprobe (pid: 635, ti=f2300000 task=f21ad680 task.ti=f2300000)
Stack:
 f2301dc4 f91173b5 00000000 f23694f0 f2301dcc f23694f0 f9117397 f2301e4c
<0> f2301e58 f9115bf3 00000000 00000000 00000000 00000000 00000000 00000000
<0> 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
Call Trace:
 [<f91173b5>] ? ssb_pci_get_invariants+0x1e/0x4fb [ssb]
 [<f9117397>] ? ssb_pci_get_invariants+0x0/0x4fb [ssb]
 [<f9115bf3>] ? ssb_fetch_invariants+0x27/0x68 [ssb]
 [<f9116390>] ? ssb_bus_register+0xd0/0x161 [ssb]
 [<f9117397>] ? ssb_pci_get_invariants+0x0/0x4fb [ssb]
 [<f9116519>] ? ssb_bus_pcibus_register+0x29/0x48 [ssb]
 [<f9117f25>] ? ssb_pcihost_probe+0xa2/0xd5 [ssb]
 [<c05dd264>] ? local_pci_probe+0x13/0x15
 [<c05ddcee>] ? pci_device_probe+0x48/0x6b
 [<c06741a3>] ? driver_probe_device+0xca/0x1d2
 [<c06742f3>] ? __driver_attach+0x48/0x64
 [<c067378a>] ? bus_for_each_dev+0x42/0x6c
 [<c0673f91>] ? driver_attach+0x19/0x1b
 [<c06742ab>] ? __driver_attach+0x0/0x64
 [<c0673a19>] ? bus_add_driver+0x101/0x24a
 [<c067455e>] ? driver_register+0x81/0xe8
 [<c05cfbb6>] ? __raw_spin_lock_init+0x28/0x4e
 [<c05dded1>] ? __pci_register_driver+0x51/0xae
 [<c045d436>] ? up_read+0x1b/0x31
 [<f918b000>] ? b44_init+0x0/0x58 [b44]
 [<f9117dbb>] ? ssb_pcihost_register+0x33/0x35 [ssb]
 [<f918b02e>] ? b44_init+0x2e/0x58 [b44]
 [<c040306c>] ? do_one_initcall+0x62/0x170
 [<c04753e1>] ? sys_init_module+0xae/0x1e9
 [<c0408bdf>] ? sysenter_do_call+0x12/0x38
Code: ff ff 75 08 89 15 fc ca 11 f9 31 c0 5d c3 55 89 e5 0f 1f 44 00 00 a1 fc ca 11 f9 5d c3 55 89 e5 0f 1f 44 00 00 8b 90 a0 02 00 00 <8a> 4a 10 80 f9 0a 76 62 8b 90 90 00 00 00 66 81 fa 22 43 74 1f 
EIP: [<f9116b23>] ssb_is_sprom_available+0xe/0x7c [ssb] SS:ESP 0068:f2301da4
CR2: 0000000000000010
---[ end trace 77dedb5755bdea58 ]---
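
A note on reading the oops above: the faulting address (CR2) is 0x10 and EDX is 0. The marked bytes in the Code: line, `<8a> 4a 10`, appear to decode to a one-byte load from EDX+0x10. That pattern is what a field read through a NULL struct pointer looks like: the faulting address equals the field's offset inside the struct. The sketch below illustrates the arithmetic only. `mock_device`, `mock_device_id` and `revision_read_address` are hypothetical names, and the layout is padded so the field lands at 0x10; it is not the real `struct ssb_device`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a device ID; NOT the kernel's real layout. */
struct mock_device_id {
    uint16_t vendor;
    uint16_t coreid;
    uint8_t revision;
};

/* Hypothetical device struct, padded so that id.revision sits at 0x10. */
struct mock_device {
    uint8_t leading_fields[12];
    struct mock_device_id id;
};

/*
 * Address the CPU would read for dev->id.revision, computed from the
 * base address without dereferencing anything.
 */
static uintptr_t revision_read_address(uintptr_t dev_base)
{
    return dev_base
         + offsetof(struct mock_device, id)
         + offsetof(struct mock_device_id, revision);
}
```

With a base address of 0 (a NULL pointer), the computed read address is 0x10, which is the value the oops reports as the faulting address.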

Comment 1 Milan Slanař 2010-03-30 14:58:05 UTC
I have this problem on an Acer TravelMate 660 with Fedora 12 i686, with kernels 2.6.32.10-90 and 2.6.32.10-92.
Kernel 2.6.32.10-83 boots O.K.

Comment 2 Chuck Ebbert 2010-03-30 16:44:21 UTC
*** Bug 577311 has been marked as a duplicate of this bug. ***

Comment 3 Chuck Ebbert 2010-03-30 17:05:37 UTC
drivers/ssb/sprom.c:

bool ssb_is_sprom_available(struct ssb_bus *bus)
{
       /* status register only exists on chipcomon rev >= 11 */
       if (bus->chipco.dev->id.revision < 11)
               return true;

bus->chipco.dev is NULL
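
One way to guard against the NULL pointer described above is to test it before reading the revision. The following is a minimal user-space sketch of that idea, not the contents of the attached patch. The `mock_*` types, the `MOCK_SPROM_PRESENT` bit and everything after the first `if` are assumptions made for illustration; only the revision check itself comes from the excerpt.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified, hypothetical stand-ins for the ssb structures. */
struct mock_device_id { uint8_t revision; };
struct mock_device { struct mock_device_id id; };
struct mock_chipcommon {
    struct mock_device *dev; /* may be NULL when no core is attached */
    uint32_t status;         /* hypothetical status register copy */
};
struct mock_bus { struct mock_chipcommon chipco; };

#define MOCK_SPROM_PRESENT 0x1u /* hypothetical status bit */

static bool mock_is_sprom_available(const struct mock_bus *bus)
{
    /*
     * Check the pointer first. A missing core is treated like an old
     * revision, which has no status register to consult.
     */
    if (bus->chipco.dev == NULL || bus->chipco.dev->id.revision < 11)
        return true;

    /* Assumed remainder: consult a status bit on newer revisions. */
    return (bus->chipco.status & MOCK_SPROM_PRESENT) != 0;
}
```

Because `||` short-circuits, `id.revision` is never read when `dev` is NULL, so the faulting access in the trace cannot happen in this variant.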

Comment 4 John W. Linville 2010-03-30 17:28:09 UTC
Crud...let me review the calling sequence...

Comment 5 John W. Linville 2010-03-30 17:49:18 UTC
Created attachment 403536 [details]
0001-ssb-avoid-null-ptr-deref-in-ssb_is_sprom_available.patch

Can you confirm that this fixes the problem?

Comment 7 Honza 'thingie' Bartoš 2010-03-30 20:08:13 UTC
Everything looks OK with the patch. b44 loaded, ethernet presumably works (can't try right now), boots as expected.

Comment 8 John W. Linville 2010-03-30 20:41:29 UTC
Excellent...thanks for the report (and sorry for the problem)!

Comment 9 Jóhann B. Guðmundsson 2010-03-31 08:39:28 UTC
Confirmed fixed here as well.

Comment 10 Felix Schwarz 2010-03-31 20:29:06 UTC
Can you please push kernel-2.6.32.10-94.fc12 (http://koji.fedoraproject.org/koji/buildinfo?buildID=164636) for Fedora 12 as well ASAP? It fixes a very similar problem for me (bug 578217).

Comment 11 Göran Wallin 2010-04-04 13:25:19 UTC
Confirming this bug for kernel 2.6.32.10-90 and 2.6.32.10-92 on a Dell Inspiron 8600. 

Kernel will not boot.

Confirming that kernel-2.6.32.10-94.fc12 from Koji fixes the issue. Hoping to see it soon on updates-testing. 

The following bugs seem to be duplicates but haven't been marked as such yet:
Bug 579122
Bug 579118
Bug 577463

Comment 12 William Lovaton 2010-04-04 18:15:26 UTC
I confirm that I have the same boot problem for Fedora 13 Rawhide and that the new kernel fixes it.

Comment 13 Adam Williamson 2010-04-06 20:23:14 UTC
I'm setting this to block the Beta. F13 kernels since -24 fix this issue, but Beta RC4 has -19 :(



-- 
Fedora Bugzappers volunteer triage team
https://fedoraproject.org/wiki/BugZappers

Comment 14 Adam Williamson 2010-04-06 20:38:16 UTC
We're fairly sure about this, but could someone with an affected system please try booting an F13 Beta RC4 image (http://serverbeach1.fedoraproject.org/pub/alt/stage/13-Beta.RC4/) and confirm that it fails? Any of the images should be okay for testing; the netinst ISO is the smallest download. This is just to make sure we're not on a false trail here.

For the record, I checked with John Linville, and he confirms that the first fix for the initial issue (which is https://bugzilla.redhat.com/show_bug.cgi?id=533746 ) went into kernel -19 for F13, and kernel -90 for F12. That fix caused this regression for most (or possibly all) systems with Broadcom wired ethernet adapters supported by the b44 driver (there are many many such systems). We had multiple reports from F12 testers that F12 kernel -90 fails to boot; we've no reason to assume that F13 kernel -19 would behave any different. A fix for the regressions was added to F12 kernel -94 and F13 kernel -24, which multiple testers confirm resolves the regressions. That's why we should take that kernel (at least) into Beta.
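
The release numbers above can be summarised as a small lookup, sketched here under the assumption that every build from the one that introduced the regression up to (but not including) the one that fixed it is affected. The function and struct names are hypothetical, and only the release suffix (the number after the dash) is compared.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Release suffixes from this comment; the half-open range is an assumption. */
struct regression_window {
    int fedora;     /* Fedora release, e.g. 13 */
    int introduced; /* first kernel release suffix with the regression */
    int fixed;      /* first kernel release suffix with the fix */
};

static const struct regression_window windows[] = {
    { 12, 90, 94 },
    { 13, 19, 24 },
};

/* Returns true if the given kernel release suffix falls in the affected range. */
static bool build_has_b44_regression(int fedora, int release)
{
    for (size_t i = 0; i < sizeof(windows) / sizeof(windows[0]); i++) {
        if (windows[i].fedora == fedora)
            return release >= windows[i].introduced
                && release < windows[i].fixed;
    }
    return false; /* no data for other Fedora releases */
}
```

For example, F13 kernel -19 and F12 kernel -92 fall inside the affected ranges, while F13 -24 and F12 -94 do not.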

Comment 15 Bojan Smojver 2010-04-06 22:03:16 UTC
(In reply to comment #14)
> we've no reason to assume that F13 kernel -19 would behave any different

This kernel is busted on Dell Inspiron 6400 (which has b44), just like -90 and -92. No boot.

Whatever you do, don't take -19 to beta.

Comment 16 Göran Wallin 2010-04-07 06:57:26 UTC
Confirming that F13 Beta RC4 fails to boot on a Dell Inspiron 8600 with the same error.

Comment 17 Jóhann B. Guðmundsson 2010-04-07 10:04:57 UTC
FYI: all nightly composes from 20100324 to the present contain the kernel with the offending patch (kernel-2.6.33.1-19.fc13), so we have been composing images with this broken patch for a long time. Either compose the image with kernel 2.6.33.1-24 or newer, where this has been fixed, or rename a compose that predates 20100324 to Beta RC4.

Comment 18 John W. Linville 2010-04-07 13:09:37 UTC
Just for the record, anything older than -19 will crash on some number of b43-equipped devices, particularly newer netbooks.  So my money would be on moving forward to -24...

Comment 19 Jóhann B. Guðmundsson 2010-04-07 13:27:17 UTC
Is there anything standing in the way of going all the way to 2-35? 

We might as well expose the latest build to reporters, since there is no point in shipping a kernel that gets replaced as soon as the reporter runs an update, which happens right after install.

Comment 20 John W. Linville 2010-04-07 14:42:05 UTC
No objection either way from me, but -24 is somewhat closer to what has been getting tested so far.

Comment 21 Adam Williamson 2010-04-07 18:43:37 UTC
Johann: it does serve a purpose. If you have kernel -24 and then get -35 as an update, and -35 doesn't work, you can still boot -24. If we ship the beta with -35 and it turns out to be bad, there's no such option.

We're following the principle of taking the smallest possible change that includes the fix we need.

Thanks for the info, reporters. We spun Beta RC5 with kernel -24. If no problems emerge in that build, it will likely be shipped as Beta. You can test that to make sure it boots, if you like...thanks.

Comment 22 Fedora Update System 2010-04-07 21:28:08 UTC
kernel-2.6.33.1-24.fc13 has been submitted as an update for Fedora 13.
http://admin.fedoraproject.org/updates/kernel-2.6.33.1-24.fc13

Comment 23 Fedora Update System 2010-04-07 21:50:26 UTC
kernel-2.6.33.1-24.fc13 has been pushed to the Fedora 13 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 24 Mads Kiilerich 2010-04-08 10:32:38 UTC
This issue is also in kernel-PAE-2.6.32.10-90.fc12.i686 which is in F12 updates and kernel-PAE-2.6.32.10-92.fc12.i686 which is in F12 updates-testing.

Please push 2.6.32.10-94 or later to F12.

Comment 25 Bojan Smojver 2010-04-08 10:48:40 UTC
(In reply to comment #24)
 
> Please push 2.6.32.10-94 or later to F12.

You can vote here:

https://admin.fedoraproject.org/updates/kernel-2.6.32.11-99.fc12

Comment 26 Mads Kiilerich 2010-04-08 11:06:26 UTC
(In reply to comment #25)
> You can vote here:
> 
> https://admin.fedoraproject.org/updates/kernel-2.6.32.11-99.fc12    

Thanks, and sorry. Three days after the build it still wasn't in updates-testing, so I assumed it hadn't been pushed to Bodhi.