Bug 1305181
Summary: | [abrt] BUG: unable to handle kernel NULL pointer dereference at 0000000000000060
---|---
Product: | [Fedora] Fedora
Component: | xorg-x11-drv-mga
Version: | 23
Hardware: | x86_64
OS: | Unspecified
Status: | CLOSED ERRATA
Severity: | unspecified
Priority: | unspecified
Keywords: | Reopened
Reporter: | lgwalts <lgw0619>
Assignee: | X/OpenGL Maintenance List <xgl-maint>
QA Contact: | Fedora Extras Quality Assurance <extras-qa>
CC: | bradleywilliams2007, dhill171, eddie, edgar.hoch, gansalmon, itamar, jonathan, kernel-maint, madhu.chinakonda, mchehab, ray, xgl-maint
URL: | https://retrace.fedoraproject.org/faf/reports/bthash/792132632e38da4cabdd9b4227f6749a18948fb4
Whiteboard: | abrt_hash:7a2b9ea6b2b64ede39df6ad51d68326a18018efb;
Doc Type: | Bug Fix
Last Closed: | 2016-03-07 14:28:07 UTC
Description
lgwalts
2016-02-05 23:46:01 UTC
Created attachment 1121527
File: dmesg
Created attachment 1121683
messages from failed boot kernel 4.3.4.300
See line 4071; after that, systemd has stopped.
Created attachment 1121686
messages from kernel 4.2.8.200
4.2.8.200 boots completely with no errors.
Still broken in 4.3.5.300.

Installed kde and switched to kdm: system boots but kdm does not function completely. Removed kde and switched to gnome: could not log in with gdm (it would not accept the password); switched to kdm and the login screen appeared, allowing gnome to work. Removed xfce, kde and gnome as completely as possible, then re-installed xfce with lightdm. The system would boot completely with kernel 4.2.8.200; newer kernels failed at login. Switched to kdm and the system boots to kernel 4.2.8.800 and 4.3.5.300, but kdm has some bad behavior, mostly on shutdown or reboot. I have been unable to find the source of the kernel oops!

The problems are more involved and less fatal to the system than expressed here. The first symptom appears after the first upgrade of a fresh Fedora 23 install: the 4.2.3 kernel worked well (not going to mention having to disable Wayland to get the console to not look like a TV with a defective horizontal hold), but once the kernel upgrades to 4.3.5 the Ctrl-Alt-F2 switch to text login fails to trigger. To get into the system to debug it, a telnet server or ssh server has to be installed; I used telnet as it is easier (to me). Once a telnet session has been opened, the system remains accessible to text-based debug tools. The next step to trigger the Oops is to log out of the console GUI login. This triggers the Oops, and the telnet session can then be used to debug the hung xorg program or possibly the kernel Oops itself (not my forte). This was using the Fedora 23 Workstation install on a Dell T530. Most Dell servers should reproduce this if the onboard iDRAC G200 video is used by the console. Yes, the console goes black and the system seems hung, but an alternative way into a text login that is not based on the mgag200 driver should allow quick assessment of the problem. I look forward to a resolution, as the system has not crashed and it should be possible to obtain more information. Thanks.
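Since a remote text session keeps the kernel log reachable after the console goes black, one way to narrow down where an Oops comes from is to scan saved `dmesg` or `journalctl -k` output for report headers and for the module names attached to call-trace frames. The following is only a minimal sketch under that assumption: the function name `summarize_kernel_log`, the marker list and the frame pattern are illustrative choices, not part of abrt or any other tool mentioned in this report.

```python
import re

# Strings that commonly open a kernel oops or warning report (illustrative list).
OOPS_MARKERS = (
    "BUG: unable to handle kernel",
    "Oops:",
    "WARNING: CPU:",
)

# A call-trace frame looks like
#   [<ffffffffa0375f0c>] some_function+0x17c/0x1a0 [module]
# The trailing bracketed name, when present, is the module owning the function.
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\]\s+(?:\?\s+)?(\S+)\+0x[0-9a-f]+/0x[0-9a-f]+(?:\s+\[(\w+)\])?"
)


def summarize_kernel_log(text):
    """Return (marker_lines, modules) found in saved kernel log text."""
    # Lines that announce an oops or warning.
    marker_lines = [
        line.strip()
        for line in text.splitlines()
        if any(marker in line for marker in OOPS_MARKERS)
    ]
    # Modules named on call-trace frames, in order of first appearance.
    modules = []
    for match in FRAME_RE.finditer(text):
        module = match.group(2)
        if module and module not in modules:
            modules.append(module)
    return marker_lines, modules
```

Called on the text of a saved log, the function returns the header lines and the list of modules seen in the trace. Frames from `mgag200` in that list would be consistent with the driver suspected above, but the sketch only reports names and proves nothing about the cause.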
Please, maintainers, currently there are at least four bug reports for the same bug, but no reaction from the maintainers on any of them. The bug is fatal, as many servers have an MGA graphics card on board for the console, and the servers are unresponsive immediately after reboot because of this crash! In my case the problem occurs on Supermicro servers with on-board graphics "Matrox Electronics Systems Ltd. MGA G200eW WPCM450 (rev 0a)" (output from lspci when booted in the rescue kernel, the only kernel that doesn't crash; 4.3.* kernels crash). I think bug reports #1302824, #1299901 and #1303327 describe the same problem. Please, can you solve the problem!

Please test this scratch build when it completes and let me know if it resolves the issue.
http://koji.fedoraproject.org/koji/taskinfo?taskID=13022511

I spent two weeks chasing this problem and looking for a workaround or solution that would work out for a novice sys-admin, only to discover two things:
1: The Fedora Kernel Team is completely unresponsive.
2: Someone let the village idiot loose in configuration files.
Unfortunately I needed some up-time from my server, so it was flipped to Debian and is not currently available for testing. On a side note: Fedora 23 is the most problematic release in recent memory. I hope Fedora 24 is delayed 8 months so that things can be fixed and tested. It would be nice to see the team stop releasing crap! For those of you still running Fedora, please test the scratch build above as requested.

To add some more information, this is a bug in the upstream kernel that may have been fixed in the 4.4 kernel release. The scratch build contains a backport of the upstream patch.

Dear Josh, many thanks for the quick response!
I have installed the following files on two of our servers:

kernel-4.3.5-301.fc23.x86_64.rpm
kernel-core-4.3.5-301.fc23.x86_64.rpm
kernel-devel-4.3.5-301.fc23.x86_64.rpm
kernel-headers-4.3.5-301.fc23.x86_64.rpm
kernel-modules-4.3.5-301.fc23.x86_64.rpm
kernel-modules-extra-4.3.5-301.fc23.x86_64.rpm
kernel-tools-4.3.5-301.fc23.x86_64.rpm
kernel-tools-libs-4.3.5-301.fc23.x86_64.rpm

And yes, success:

The problem is solved, there is no kernel crash as described above, both servers have booted to the login prompt (text console, but gdm tested too) and seem to work normally.

However, on one of them - Supermicro X8DAH Mainboard with Intel Xeon X5680 CPU - journalctl shows another kernel backtrace, which may be related to graphics output too. But this server seems to be usable too (usually we use ssh to access the servers).

kernel: ------------[ cut here ]------------
kernel: WARNING: CPU: 6 PID: 1400 at drivers/dma/ioat/dca.c:342 ioat_dca_init+0x17c/0x1a0 [ioatdma]()
kernel: ioatdma 0000:80:16.0: APICID_TAG_MAP set incorrectly by BIOS, disabling DCA
kernel: Modules linked in: edac_core ioatdma(+) tpm_infineon lpc_ich shpchp snd_hwdep i2c_i801 acpi_cpufreq tpm_tis snd_pcm_oss snd_seq snd_seq_device snd_pcm snd_timer vboxnetadp(OE) nfsd vbo
kernel: CPU: 6 PID: 1400 Comm: systemd-udevd Tainted: G OE 4.3.5-301.fc23.x86_64 #1
kernel: Hardware name: Supermicro X8DAH/X8DAH, BIOS 2.1 12/30/2011
kernel: 0000000000000000 00000000d0cca38e ffff8823e6067958 ffffffff813a63ef
kernel: ffff8823e60679a0 ffff8823e6067990 ffffffff810a07d2 ffff8811ea296000
kernel: ffff8811ea1aac10 ffff8823e9d04d00 ffffc900197f0100 0000000000000100
kernel: Call Trace:
kernel: [<ffffffff813a63ef>] dump_stack+0x44/0x55
kernel: [<ffffffff810a07d2>] warn_slowpath_common+0x82/0xc0
kernel: [<ffffffff810a08e4>] warn_slowpath_fmt_taint+0x54/0x70
kernel: [<ffffffffa0375f0c>] ioat_dca_init+0x17c/0x1a0 [ioatdma]
kernel: [<ffffffffa0371e2e>] ioat_pci_probe+0x85e/0xdc0 [ioatdma]
kernel: [<ffffffff8178184e>] ? _raw_spin_unlock_irqrestore+0xe/0x10
kernel: [<ffffffff813eff35>] local_pci_probe+0x45/0xa0
kernel: [<ffffffff813f131d>] pci_device_probe+0xfd/0x140
kernel: [<ffffffff814d9d12>] driver_probe_device+0x222/0x480
kernel: [<ffffffff814d9ff4>] __driver_attach+0x84/0x90
kernel: [<ffffffff814d9f70>] ? driver_probe_device+0x480/0x480
kernel: [<ffffffff814d77ec>] bus_for_each_dev+0x6c/0xc0
kernel: [<ffffffff814d94ce>] driver_attach+0x1e/0x20
kernel: [<ffffffff814d900b>] bus_add_driver+0x1eb/0x280
kernel: [<ffffffffa0361000>] ? 0xffffffffa0361000
kernel: [<ffffffff814da840>] driver_register+0x60/0xe0
kernel: [<ffffffff813ef91c>] __pci_register_driver+0x4c/0x50
kernel: [<ffffffffa036108c>] ioat_init_module+0x8c/0x1000 [ioatdma]
kernel: [<ffffffff81002123>] do_one_initcall+0xb3/0x200
kernel: [<ffffffff8120435e>] ? kmem_cache_alloc_trace+0x19e/0x220
kernel: [<ffffffff811a4a27>] ? do_init_module+0x27/0x1e5
kernel: [<ffffffff811a4a5f>] do_init_module+0x5f/0x1e5
kernel: [<ffffffff8112554e>] load_module+0x201e/0x2630
kernel: [<ffffffff81121a10>] ? __symbol_put+0x60/0x60
kernel: [<ffffffff811e584c>] ? alloc_vmap_area+0x2fc/0x360
kernel: [<ffffffff81125cae>] SyS_init_module+0x14e/0x190
kernel: [<ffffffff81781dae>] entry_SYSCALL_64_fastpath+0x12/0x71
kernel: ---[ end trace 9c3a43c266863a7c ]---

Should I open another bug report?

(In reply to Edgar Hoch from comment #11)
> Dear Josh, many thanks for the quick response!
>
> I have installed the following files on two of our servers:
>
> kernel-4.3.5-301.fc23.x86_64.rpm
> kernel-core-4.3.5-301.fc23.x86_64.rpm
> kernel-devel-4.3.5-301.fc23.x86_64.rpm
> kernel-headers-4.3.5-301.fc23.x86_64.rpm
> kernel-modules-4.3.5-301.fc23.x86_64.rpm
> kernel-modules-extra-4.3.5-301.fc23.x86_64.rpm
> kernel-tools-4.3.5-301.fc23.x86_64.rpm
> kernel-tools-libs-4.3.5-301.fc23.x86_64.rpm
>
> And yes, success:
>
> The problem is solved, there is no kernel crash as described above, both
> servers have booted to the login prompt (text console, but gdm tested too) and
> seem to work normally.

Thank you for testing.

> However, on one of them - Supermicro X8DAH Mainboard with Intel Xeon X5680
> CPU - journalctl shows another kernel backtrace, which may be related to
> graphics output too. But this server seems to be usable too (usually we use
> ssh to access the servers).
>
>
> kernel: ------------[ cut here ]------------
> kernel: WARNING: CPU: 6 PID: 1400 at drivers/dma/ioat/dca.c:342
> ioat_dca_init+0x17c/0x1a0 [ioatdma]()
> kernel: ioatdma 0000:80:16.0: APICID_TAG_MAP set incorrectly by BIOS,
> disabling DCA
> kernel: Modules linked in: edac_core ioatdma(+) tpm_infineon lpc_ich shpchp
> snd_hwdep i2c_i801 acpi_cpufreq tpm_tis snd_pcm_oss snd_seq snd_seq_device
> snd_pcm snd_timer vboxnetadp(OE) nfsd vbo
> kernel: CPU: 6 PID: 1400 Comm: systemd-udevd Tainted: G OE
> 4.3.5-301.fc23.x86_64 #1
> kernel: Hardware name: Supermicro X8DAH/X8DAH, BIOS 2.1 12/30/2011
> kernel: 0000000000000000 00000000d0cca38e ffff8823e6067958 ffffffff813a63ef
> kernel: ffff8823e60679a0 ffff8823e6067990 ffffffff810a07d2 ffff8811ea296000
> kernel: ffff8811ea1aac10 ffff8823e9d04d00 ffffc900197f0100 0000000000000100
> kernel: Call Trace:
> kernel: [<ffffffff813a63ef>] dump_stack+0x44/0x55
> kernel: [<ffffffff810a07d2>] warn_slowpath_common+0x82/0xc0
> kernel: [<ffffffff810a08e4>] warn_slowpath_fmt_taint+0x54/0x70
> kernel: [<ffffffffa0375f0c>] ioat_dca_init+0x17c/0x1a0 [ioatdma]
> kernel: [<ffffffffa0371e2e>] ioat_pci_probe+0x85e/0xdc0 [ioatdma]
> kernel: [<ffffffff8178184e>] ? _raw_spin_unlock_irqrestore+0xe/0x10
> kernel: [<ffffffff813eff35>] local_pci_probe+0x45/0xa0
> kernel: [<ffffffff813f131d>] pci_device_probe+0xfd/0x140
> kernel: [<ffffffff814d9d12>] driver_probe_device+0x222/0x480
> kernel: [<ffffffff814d9ff4>] __driver_attach+0x84/0x90
> kernel: [<ffffffff814d9f70>] ? driver_probe_device+0x480/0x480
> kernel: [<ffffffff814d77ec>] bus_for_each_dev+0x6c/0xc0
> kernel: [<ffffffff814d94ce>] driver_attach+0x1e/0x20
> kernel: [<ffffffff814d900b>] bus_add_driver+0x1eb/0x280
> kernel: [<ffffffffa0361000>] ? 0xffffffffa0361000
> kernel: [<ffffffff814da840>] driver_register+0x60/0xe0
> kernel: [<ffffffff813ef91c>] __pci_register_driver+0x4c/0x50
> kernel: [<ffffffffa036108c>] ioat_init_module+0x8c/0x1000 [ioatdma]
> kernel: [<ffffffff81002123>] do_one_initcall+0xb3/0x200
> kernel: [<ffffffff8120435e>] ? kmem_cache_alloc_trace+0x19e/0x220
> kernel: [<ffffffff811a4a27>] ? do_init_module+0x27/0x1e5
> kernel: [<ffffffff811a4a5f>] do_init_module+0x5f/0x1e5
> kernel: [<ffffffff8112554e>] load_module+0x201e/0x2630
> kernel: [<ffffffff81121a10>] ? __symbol_put+0x60/0x60
> kernel: [<ffffffff811e584c>] ? alloc_vmap_area+0x2fc/0x360
> kernel: [<ffffffff81125cae>] SyS_init_module+0x14e/0x190
> kernel: [<ffffffff81781dae>] entry_SYSCALL_64_fastpath+0x12/0x71
> kernel: ---[ end trace 9c3a43c266863a7c ]---
>
> Should I open another bug report?

Yes. The above has nothing to do with the originally reported problem.

Thanks Josh, the scratch build works great. The kernel Oops is gone, the system reboots, logs out on the console GDM GUI and Ctrl-Alt-F2 works again.

Thanks for the build, works great.

Is there a bug fix for having to disable Wayland to get a viewable display (not torn like bad horizontal control) on these G200 based console systems?

(In reply to Brad from comment #13)
> Thanks Josh the scratch build works great.
> The kernel Oops is gone, the
> system reboots, logs out on the console GDM GUI and Ctrl-Alt-F2 works again.
>
> Thanks for the build, works great.

Thanks for testing.

> Is there a bug fix for having to disable Wayland to get a viewable (not torn
> like bad horizontal control) on these G200 based console systems?

I have no idea. As far as I know, G200 isn't a highly tested GPU for Wayland development. Someone on the xgl-maint team might know better. Either way, that would be covered under a different bug.

Per Edgar and Brad, the fix works; this issue should be marked closed.

Our update tooling will close the bug when the fix is actually available in the updates repositories.

*** Bug 1302824 has been marked as a duplicate of this bug. ***

*** Bug 1310557 has been marked as a duplicate of this bug. ***

*** Bug 1303327 has been marked as a duplicate of this bug. ***

kernel-4.3.6-201.fc22 has been submitted as an update to Fedora 22. https://bodhi.fedoraproject.org/updates/FEDORA-2016-e7162262b0

kernel-4.4.2-300.fc23 has been submitted as an update to Fedora 23. https://bodhi.fedoraproject.org/updates/FEDORA-2016-ec8b4ce774

kernel-4.3.6-201.fc22 has been pushed to the Fedora 22 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2016-e7162262b0

kernel-4.3.6-201.fc22 solved the problem here. Switching ttys on the console with the Ctrl-Alt-F2/F1 keys is also working again. I'm a happy administrator. Thank you.

kernel-4.4.2-301.fc23 has been submitted as an update to Fedora 23. https://bodhi.fedoraproject.org/updates/FEDORA-2016-7e12ae5359

kernel-4.4.2-301.fc23 has been pushed to the Fedora 23 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2016-7e12ae5359

kernel-4.4.3-200.fc22 has been submitted as an update to Fedora 22. https://bodhi.fedoraproject.org/updates/FEDORA-2016-a5ac00e07c

kernel-4.3.6-201.fc22 has been pushed to the Fedora 22 stable repository. If problems still persist, please make note of it in this bug report.

kernel-4.4.2-301.fc23 has been pushed to the Fedora 23 stable repository. If problems still persist, please make note of it in this bug report.

kernel-4.4.3-200.fc22 has been pushed to the Fedora 22 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2016-a5ac00e07c

kernel-4.4.3-201.fc22 has been submitted as an update to Fedora 22. https://bodhi.fedoraproject.org/updates/FEDORA-2016-9fbe2c258b

kernel-4.4.3-201.fc22 has been pushed to the Fedora 22 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2016-9fbe2c258b

kernel-4.4.3-201.fc22 has been pushed to the Fedora 22 stable repository. If problems still persist, please make note of it in this bug report.
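
For anyone checking whether a machine already runs a kernel at or past the builds this report lists as pushed to stable (kernel-4.3.6-201.fc22 and kernel-4.4.2-301.fc23), the comparison can be sketched in Python. This is a rough illustration only: the names `FIXED_BUILDS`, `parse_release` and `at_or_after_stable_fix` are hypothetical, the numeric tuple comparison is a simplification and not the full RPM version-ordering algorithm, and it does not recognize the 4.3.5-301.fc23 scratch build, which carried the backported patch but sorts below the stable build.

```python
import re

# First builds this report lists as pushed to the stable repositories.
FIXED_BUILDS = {
    "fc22": (4, 3, 6, 201),
    "fc23": (4, 4, 2, 301),
}

# Matches release strings such as "4.3.5-301.fc23.x86_64".
RELEASE_RE = re.compile(r"^(\d+)\.(\d+)\.(\d+)-(\d+)\.(fc\d+)")


def parse_release(release):
    """Split a kernel release string into (version tuple, dist tag)."""
    match = RELEASE_RE.match(release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    numbers = tuple(int(part) for part in match.groups()[:4])
    return numbers, match.group(5)


def at_or_after_stable_fix(release):
    """True or False for fc22/fc23 kernels; None for releases not listed here."""
    numbers, dist = parse_release(release)
    fixed = FIXED_BUILDS.get(dist)
    if fixed is None:
        return None
    # Simplified ordering: compare (major, minor, patch, build) numerically.
    return numbers >= fixed
```

The string returned by `platform.release()` from the standard library could be passed to `at_or_after_stable_fix` on a Fedora system; on other systems `parse_release` raises `ValueError` because the release string lacks an `fcNN` tag.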