Bug 1365917
| Field | Value |
|---|---|
| Summary | kernel panic at boot - x2apic_cluster_probe+0x33/0x70 |
| Product | [Fedora] Fedora |
| Component | kernel |
| Version | rawhide |
| Hardware | All |
| OS | Linux |
| Status | CLOSED ERRATA |
| Severity | medium |
| Priority | medium |
| Type | Bug |
| Reporter | Peter Gervase <pgervase> |
| Assignee | Kernel Maintainer List <kernel-maint> |
| QA Contact | Fedora Extras Quality Assurance <extras-qa> |
| CC | awilliam, byodlows, gansalmon, itamar, jforbes, jonathan, kernel-maint, kparal, labbott, madhu.chinakonda, mchehab, pgervase, plautrba, pschindl, robatino |
| Whiteboard | RejectedBlocker AcceptedFreezeException |
| Fixed In Version | kernel-4.8.0-0.rc2.git3.1.fc25 |
| Last Closed | 2016-08-22 22:07:58 UTC |
| Bug Blocks | 1277284, 1277285 |
Description
Peter Gervase
2016-08-10 13:48:34 UTC
Created attachment 1189632 [details]
screen shot showing the new panic message two minutes after the first one
Can you test the following scratch build? It contains a probable fix from the upstream developers
http://koji.fedoraproject.org/koji/taskinfo?taskID=15217136

I saw the same/similar kernel panic problem on my Lenovo x240 with kernel-4.8.0-0.rc1.git3.1.fc25.x86_64
http://koji.fedoraproject.org/koji/taskinfo?taskID=15217136 build fixes it. Thanks!

*** Bug 1367396 has been marked as a duplicate of this bug. ***

kernel from koji build linked in comment 3 works for me. System boots normally with it. I tested it with Fedora 24 with kernel-4.8.0-0.rc0.git3.1.fc25.x86_64 installed and it didn't boot (kernel panic). Then I installed kernel from koji and it booted normally.

Can you confirm that kernel-4.8.0-0.rc1.git0.1.fc25 does not display this behaviour?

for blocker / release engineering purposes: labbott states she's certain that kernel-4.8.0-0.rc1.git0.1.fc25 - which is the current 'stable' f25 kernel build, i.e. the one in the 'fedora' repo and which is included in composes - *would* be affected by this bug. That means that if we decide the bug is a blocker, we must find a fix for it before we can ship Alpha.

But, she and jforbes also believe this is fixed in upstream kernel by commit d52c0569bab4edc888832df44dc7ac28517134f6 , and that furthermore that means the bug should be fixed by these Fedora builds:

f25: http://koji.fedoraproject.org/koji/buildinfo?buildID=792279 (kernel-4.8.0-0.rc2.git1.1.fc25)
Rawhide: http://koji.fedoraproject.org/koji/buildinfo?buildID=792280 (kernel-4.8.0-0.rc2.git1.1.fc26)

that build is not currently submitted as an update for F25. It would be good if reporters could confirm the fix.

labbott also states she'd vote -1 blocker / +1 FE for this bug, given the range of hardware affected. jforbes says "1365917 could theoretically impact any modern intel machine", the upstream commit can be seen at https://lkml.org/lkml/2016/8/11/516 , describing the issue, if anyone feels up to evaluating its impact themselves.

"any modern intel machine" is quite scary to me, I might be more inclined to go +1 blocker for this one, I'm definitely +1 FE.

To clarify the "Any modern intel machine": x2apic was introduced with Nehalem, so about 6 years ago. It can also be "opted out" of by firmware, and frequently is. I don't know the percentages of machines that do or don't opt out, I know by a quick look at 3 machines here, 2 have it turned off, 1 has it turned on. You can check by looking at a dmesg after boot, you will either see "x2apic enabled" or "DMAR-IR: x2apic is disabled because BIOS sets x2apic opt out bit" with instructions on how to override the opt out. A quick google search shows that several people have seen this bug, but it is still hard to determine because no one shipped a kernel to masses of users with the bug.

Bit more discussion about the range of hardware likely affected by this:

<jwb> jforbes: eh... i won't disagree but that might be stretching it
<jforbes> jwb: theoretically. x2apic came in with Nehalem, and it is basically a race condition with CPU state change. realistically it is probably a smaller subset, but a quick google search says it is non trivial
<jwb> jforbes: yeah, but i thought there was a firmware component to x2apic support too. i might be thinking of something else
<jforbes> jwb: there is, thus the theoretical part
<jwb> right. so the stretch is that most laptop class hardware doesn't have the firmware bits for x2apic. at least not that i've seen. but desktop/larger servers are certainly a possibility
now if we only could tell for certainty what most Fedora users have for machines.
IN A WORLD
<jforbes> Well, that would certainly be nice. only 1 out of 3 machines here has it enabled. I could power on and check others I suppose. But even in the ones that disable by default, it can be overridden

For those that have dep issues installing:

    $ sudo rpm -ivh kernel-4.8.0-0.rc2.git1.1.fc26.x86_64.rpm
    error: Failed dependencies:
        kernel-core-uname-r = 4.8.0-0.rc2.git1.1.fc26.x86_64 is needed by kernel-4.8.0-0.rc2.git1.1.fc26.x86_64
        kernel-modules-uname-r = 4.8.0-0.rc2.git1.1.fc26.x86_64 is needed by kernel-4.8.0-0.rc2.git1.1.fc26.x86_64

I made https://bugzilla.redhat.com/show_bug.cgi?id=1367929 to clean up the dep checking - "uname -r" not getting parsed. I'll test booting to that rc2 kernel...

er...you're reading that wrong. you have to install at least the kernel, kernel-core and kernel-modules packages when manually installing a kernel build. The package called 'kernel' is basically just a metapackage and doesn't contain anything. The actual kernel is in 'kernel-core', the modules are in 'kernel-modules'. You may also need 'kernel-modules-extra' depending on your hardware.

Right, you need all three, but the error shouldn't say "uname-r" in the failed deps. kernel-core-4.8.0-0.rc2.git1.1.fc26.x86_64 and kernel-modules-4.8.0-0.rc2.git1.1.fc26.x86_64 are what should be specified, not "kernel-core-uname-r" or "kernel-modules-uname-r".

    $ sudo rpm -ivh kernel-4.8.0-0.rc2.git1.1.fc26.x86_64.rpm kernel-core-4.8.0-0.rc2.git1.1.fc26.x86_64.rpm kernel-modules-4.8.0-0.rc2.git1.1.fc26.x86_64.rpm
    Preparing...                          ################################# [100%]
    Updating / installing...
       1:kernel-core-4.8.0-0.rc2.git1.1.fc################################# [ 33%]
       2:kernel-modules-4.8.0-0.rc2.git1.1################################# [ 67%]
       3:kernel-4.8.0-0.rc2.git1.1.fc26   ################################# [100%]

nah, the Provides: are explicitly named that way in the spec, the spec clearly doesn't expect the 'uname-r' to be interpreted as a command:

http://pkgs.fedoraproject.org/cgit/rpms/kernel.git/tree/kernel.spec#n633
http://pkgs.fedoraproject.org/cgit/rpms/kernel.git/tree/kernel.spec#n824
http://pkgs.fedoraproject.org/cgit/rpms/kernel.git/tree/kernel.spec#n847

etc. I dunno why the kernel team decided to use those names, but it's a conscious choice.

Per Paul Whalen: "adding 'nox2apic' (on Fedora-25-20160807.n.0) got the installer booting on an x220 laptop". Given that there's a relatively straightforward workaround on the kernel boot command line, I'm inclined to say -1 blocker, +1 FE here.

Discussed at 2016-08-18 go/no-go meeting, functioning as a blocker review meeting: https://meetbot-raw.fedoraproject.org/fedora-meeting/2016-08-18/f25-alpha-go_no_go-meeting.2016-08-18-17.00.html . Given our best estimate as to the range of hardware affected, and on the basis there's a simple documentable workaround, we decided to reject it as an Alpha blocker, but accept it as a freeze exception issue.

kernel-4.8.0-0.rc2.git2.1.fc25 has been submitted as an update to Fedora 25. https://bodhi.fedoraproject.org/updates/FEDORA-2016-0dd1a509c8

I can confirm that with 'nox2apic' I can boot (installer and installed system).

kernel-4.8.0-0.rc2.git2.1.fc25 has been pushed to the Fedora 25 testing repository. If problems still persist, please make note of it in this bug report. See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates. You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2016-0dd1a509c8

kernel-4.8.0-0.rc2.git3.1.fc25 has been pushed to the Fedora 25 testing repository. If problems still persist, please make note of it in this bug report. See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates. You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2016-0dd1a509c8

kernel-4.8.0-0.rc2.git3.1.fc25 has been pushed to the Fedora 25 stable repository. If problems still persist, please make note of it in this bug report.

kernel-4.8.0-0.rc2.git3.1.fc25 really solves the problem for me.
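
For anyone wanting to check whether a machine falls in the potentially affected group, the dmesg check described in the thread could be sketched roughly as below. This is only an illustration: `classify_x2apic` is a made-up helper name, and the two strings it matches are taken from the comment above rather than verified against every kernel version, so treat an "unknown" result as inconclusive.

```shell
#!/bin/sh
# Hypothetical helper (not part of any Fedora tool): reads kernel log text
# on stdin and reports the x2apic state based on the two messages quoted
# in this bug report. The exact wording may differ between kernel versions.
classify_x2apic() {
    log=$(cat)
    case "$log" in
        *"x2apic is disabled because BIOS sets x2apic opt out bit"*)
            # Firmware opted out, so x2apic is off unless overridden.
            echo "opted-out" ;;
        *"x2apic enabled"*)
            # x2apic is in use; this is the configuration the thread
            # describes as potentially affected.
            echo "enabled" ;;
        *)
            # Neither message found (log may be truncated or worded differently).
            echo "unknown" ;;
    esac
}
```

Something like `dmesg | classify_x2apic` would be the intended use; on systems that restrict the kernel log, `dmesg` may need to be run as root.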