Bug 1513150 - nouveau kernel panic in nvkm_dp_train_sense
Summary: nouveau kernel panic in nvkm_dp_train_sense
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 27
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Rob Clark
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Duplicates: 1527936
Depends On:
Blocks:
 
Reported: 2017-11-14 20:03 UTC by Jeff Peeler
Modified: 2018-02-02 17:39 UTC (History)
CC List: 23 users

Fixed In Version: kernel-4.14.16-200.fc26 kernel-4.14.16-300.fc27
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-02-02 16:58:28 UTC
Type: Bug
Embargoed:


Attachments
journalctl -a -b (376.51 KB, text/x-vhdl)
2017-11-14 20:03 UTC, Jeff Peeler
lspci -nn (2.30 KB, text/plain)
2017-11-14 20:05 UTC, Jeff Peeler
4.14.11 dmesg output (8.17 KB, text/plain)
2018-01-09 00:40 UTC, Jeff Peeler
4.14.12 custom compiled debug kernel (8.41 KB, text/plain)
2018-01-09 00:41 UTC, Jeff Peeler
4.14.12 debug full dmesg (79.33 KB, text/plain)
2018-01-09 02:33 UTC, Jeff Peeler
journalctl -b -k 4.15 kernel (100.41 KB, text/x-vhdl)
2018-01-09 02:39 UTC, Jeff Peeler

Description Jeff Peeler 2017-11-14 20:03:55 UTC
Created attachment 1352145 [details]
journalctl -a -b

I don't know how to reproduce this, but it's been a problem on Fedora 26 as well (now running Fedora 27). The bug happens both with Wayland and Xorg. Here's the trace output:

Nov 14 11:49:16 fedora kernel: WARNING: CPU: 1 PID: 5 at drivers/gpu/drm/nouveau/include/nvkm/subdev/i2c.h:169 nvkm_dp_train_sense+0xd9/0x200 [nouveau]
Nov 14 11:49:16 fedora kernel: Modules linked in: rfcomm fuse xt_CHECKSUM ipt_MASQUERADE nf_nat_masquerade_ipv4 tun xt_addrtype nf_conntrack_netbios_ns nf_conntrack_broadcast xt_CT ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 br_netfilter xt_conntrack overlay ip_set nfnetlink ebtable_nat ebtable_broute bridge stp llc ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack libcrc32c iptable_mangle iptable_raw iptable_security ebtable_filter ebtables ip6table_filter ip6_tables cmac bnep sunrpc usblp arc4 iTCO_wdt iTCO_vendor_support mei_wdt intel_rapl snd_hda_codec_hdmi iwlmvm x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel mac80211 kvm irqbypass intel_cstate intel_uncore iwlwifi intel_rapl_perf
Nov 14 11:49:16 fedora kernel:  snd_hda_codec_realtek cfg80211 snd_hda_codec_generic wmi_bmof rtsx_pci_ms snd_hda_intel i2c_i801 memstick snd_hda_codec uvcvideo btusb btrtl videobuf2_vmalloc btbcm snd_hda_core btintel bluetooth joydev snd_hwdep videobuf2_memops snd_seq videobuf2_v4l2 videobuf2_core snd_seq_device videodev snd_pcm thinkpad_acpi media snd_timer snd tpm_tis mei_me tpm_tis_core ecdh_generic mei soundcore intel_pch_thermal tpm rfkill shpchp dm_crypt hid_logitech_hidpp hid_uclogic hid_logitech_dj rtsx_pci_sdmmc mmc_core nouveau crct10dif_pclmul crc32_pclmul crc32c_intel mxm_wmi i2c_algo_bit drm_kms_helper ghash_clmulni_intel e1000e ttm nvme ptp serio_raw pps_core drm rtsx_pci nvme_core wmi video
Nov 14 11:49:16 fedora kernel: CPU: 1 PID: 5 Comm: kworker/u16:0 Not tainted 4.13.11-300.fc27.x86_64 #1
Nov 14 11:49:16 fedora kernel: Hardware name: LENOVO 20EQS64N0B/20EQS64N0B, BIOS N1EET73W (1.46 ) 09/28/2017
Nov 14 11:49:16 fedora kernel: Workqueue: nvkm-disp gf119_disp_super [nouveau]
Nov 14 11:49:16 fedora kernel: task: ffff9313c0750000 task.stack: ffffa7528316c000
Nov 14 11:49:16 fedora kernel: RIP: 0010:nvkm_dp_train_sense+0xd9/0x200 [nouveau]
Nov 14 11:49:16 fedora kernel: RSP: 0018:ffffa7528316fc60 EFLAGS: 00010297
Nov 14 11:49:16 fedora kernel: RAX: 0000000000000000 RBX: ffff9313b9caa000 RCX: 0000000000000000
Nov 14 11:49:16 fedora kernel: RDX: 0000000000000006 RSI: ffffa7528500e534 RDI: 0000000001009005
Nov 14 11:49:16 fedora kernel: RBP: ffffa7528316fca0 R08: ffffa7528316fd48 R09: ffffa7528316fc6e
Nov 14 11:49:16 fedora kernel: R10: 0000000000000000 R11: 0000000000000010 R12: ffff9313bb34cc00
Nov 14 11:49:16 fedora kernel: R13: ffffa7528316fd40 R14: 0000000000000000 R15: 0000000000000000
Nov 14 11:49:16 fedora kernel: FS:  0000000000000000(0000) GS:ffff9313e3c40000(0000) knlGS:0000000000000000
Nov 14 11:49:16 fedora kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Nov 14 11:49:16 fedora kernel: CR2: 00007f1620d9a000 CR3: 0000000873e09000 CR4: 00000000003406e0
Nov 14 11:49:16 fedora kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Nov 14 11:49:16 fedora kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Nov 14 11:49:16 fedora kernel: Call Trace:
Nov 14 11:49:16 fedora kernel:  nvkm_dp_acquire+0x50b/0xcd0 [nouveau]
Nov 14 11:49:16 fedora kernel:  nv50_disp_super_2_2+0x5d/0x470 [nouveau]
Nov 14 11:49:16 fedora kernel:  ? nvkm_devinit_pll_set+0xf/0x20 [nouveau]
Nov 14 11:49:16 fedora kernel:  gf119_disp_super+0x19c/0x2f0 [nouveau]
Nov 14 11:49:16 fedora kernel:  process_one_work+0x193/0x3c0
Nov 14 11:49:16 fedora kernel:  worker_thread+0x4e/0x3c0
Nov 14 11:49:16 fedora kernel:  kthread+0x125/0x140
Nov 14 11:49:16 fedora kernel:  ? process_one_work+0x3c0/0x3c0
Nov 14 11:49:16 fedora kernel:  ? kthread_park+0x60/0x60
Nov 14 11:49:16 fedora kernel:  ret_from_fork+0x25/0x30
Nov 14 11:49:16 fedora kernel: Code: b9 02 02 00 00 ba 09 00 00 00 be 01 00 00 00 48 89 df 49 89 c0 48 89 45 c0 e8 04 92 fd ff 85 c0 41 89 c7 75 5d 80 7d ce 06 74 02 <0f> ff 48 89 df e8 ed 8f fd ff 45 84 f6 75 55 49 8b 44 24 08 83-
Nov 14 11:49:16 fedora kernel: ---[ end trace ef2c7fa1bd7cbc7f ]---

I'll do anything I can to help debug this. There doesn't appear to be much of a way to recover in a Wayland session, but Xorg eventually seems to lock up too.
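
For anyone collecting several of these traces, the following is a minimal Python sketch (not part of the original report) that pulls the faulting symbol, offset, function size, and module out of a `RIP:` line like the one above. The regular expression and the name `parse_rip` are illustrative assumptions, written against the log format shown in this report only.

```python
import re

# Matches the "RIP:" line of a warning/oops as it appears in this report, e.g.
#   RIP: 0010:nvkm_dp_train_sense+0xd9/0x200 [nouveau]
RIP_RE = re.compile(
    r"RIP:\s+[0-9a-f]{4}:(?P<symbol>[A-Za-z_][\w.]*)"
    r"\+0x(?P<offset>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
    r"(?:\s+\[(?P<module>\w+)\])?"
)


def parse_rip(line):
    """Return (symbol, offset, size, module) for a RIP line, or None.

    offset and size are converted from hex to int; module is None when
    the symbol is not followed by a [module] tag.
    """
    m = RIP_RE.search(line)
    if m is None:
        return None
    return (
        m.group("symbol"),
        int(m.group("offset"), 16),
        int(m.group("size"), 16),
        m.group("module"),
    )


if __name__ == "__main__":
    line = ("Nov 14 11:49:16 fedora kernel: RIP: "
            "0010:nvkm_dp_train_sense+0xd9/0x200 [nouveau]")
    print(parse_rip(line))
```

Lines whose RIP has no symbol (for example a `(null)` RIP) deliberately return `None` rather than a partial result.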

Comment 1 Jeff Peeler 2017-11-14 20:05:06 UTC
Created attachment 1352146 [details]
lspci -nn

Comment 2 Jeff Peeler 2017-11-14 20:11:59 UTC
I know the output I included already identifies the laptop, but to be explicit, this is a Lenovo P50. I have the laptop in the dock with the lid open. One external monitor is connected to the DVI port and (perhaps obviously) another external monitor is connected via DisplayPort.

Comment 3 Laura Abbott 2017-11-14 21:40:28 UTC
Moving this to the graphics team to take a look.

Comment 4 Jeff Peeler 2017-11-22 18:38:10 UTC
What additional information should I be gathering when this happens?

Nov 22 09:18:29 fedora kernel: ------------[ cut here ]------------
Nov 22 09:18:29 fedora kernel: WARNING: CPU: 6 PID: 6772 at drivers/gpu/drm/nouveau/include/nvkm/subdev/i2c.h:169 init_rdauxr+0xe5/0x120 [nouveau]
Nov 22 09:18:29 fedora kernel: Modules linked in: vfat fat rfcomm fuse xt_CHECKSUM tun ipt_MASQUERADE nf_nat_masquerade_ipv4 xt_addrtype nf_conntrack_netbios_ns nf_conntrack_broadcast xt_CT ip6t_rpfilter ip6t_REJE
Nov 22 09:18:29 fedora kernel:  intel_uncore snd_hda_codec_generic intel_rapl_perf snd_hda_intel snd_hda_codec uvcvideo snd_hda_core videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 snd_hwdep iwlwifi videobuf2_co
Nov 22 09:18:29 fedora kernel: CPU: 6 PID: 6772 Comm: kworker/u16:2 Tainted: G        W       4.13.12-300.fc27.x86_64 #1
Nov 22 09:18:29 fedora kernel: Hardware name: LENOVO 20EQS64N0B/20EQS64N0B, BIOS N1EET73W (1.46 ) 09/28/2017
Nov 22 09:18:29 fedora kernel: Workqueue: nvkm-disp gf119_disp_super [nouveau]
Nov 22 09:18:29 fedora kernel: task: ffff8c61793a26c0 task.stack: ffff9e9bcb8f4000
Nov 22 09:18:29 fedora kernel: RIP: 0010:init_rdauxr+0xe5/0x120 [nouveau]
Nov 22 09:18:29 fedora kernel: RSP: 0018:ffff9e9bcb8f7c10 EFLAGS: 00010297
Nov 22 09:18:29 fedora kernel: RAX: 0000000000000000 RBX: ffff9e9bcb8f7d08 RCX: 0000000000000000
Nov 22 09:18:29 fedora kernel: RDX: 0000000000000001 RSI: ffff9e9bc500e534 RDI: 0000000001009000
Nov 22 09:18:29 fedora kernel: RBP: ffff9e9bcb8f7c40 R08: ffff9e9bcb8f7c1e R09: ffff9e9bcb8f7c1f
Nov 22 09:18:29 fedora kernel: R10: 0000000000000000 R11: 0000000000000010 R12: ffff8c6239f82800
Nov 22 09:18:29 fedora kernel: R13: 0000000000000102 R14: 00000000000000dc R15: 0000000000000000
Nov 22 09:18:29 fedora kernel: FS:  0000000000000000(0000) GS:ffff8c6263d80000(0000) knlGS:0000000000000000
Nov 22 09:18:29 fedora kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Nov 22 09:18:29 fedora kernel: CR2: 00001359e0de1438 CR3: 00000003a8e09000 CR4: 00000000003406e0
Nov 22 09:18:29 fedora kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Nov 22 09:18:29 fedora kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Nov 22 09:18:29 fedora kernel: Call Trace:
Nov 22 09:18:29 fedora kernel:  init_auxch+0x108/0x1a0 [nouveau]
Nov 22 09:18:29 fedora kernel:  nvbios_exec+0x47/0xd0 [nouveau]
Nov 22 09:18:29 fedora kernel:  nvkm_dp_acquire+0x6bf/0xcd0 [nouveau]
Nov 22 09:18:29 fedora kernel:  nv50_disp_super_2_2+0x5d/0x470 [nouveau]
Nov 22 09:18:29 fedora kernel:  ? pick_next_task_fair+0x137/0x550
Nov 22 09:18:29 fedora kernel:  ? __switch_to+0x1fc/0x4a0
Nov 22 09:18:29 fedora kernel:  gf119_disp_super+0x19c/0x2f0 [nouveau]
Nov 22 09:18:29 fedora kernel:  process_one_work+0x193/0x3c0
Nov 22 09:18:29 fedora kernel:  worker_thread+0x4e/0x3c0
Nov 22 09:18:29 fedora kernel:  kthread+0x125/0x140
Nov 22 09:18:29 fedora kernel:  ? process_one_work+0x3c0/0x3c0
Nov 22 09:18:29 fedora kernel:  ? kthread_park+0x60/0x60
Nov 22 09:18:29 fedora kernel:  ret_from_fork+0x25/0x30
Nov 22 09:18:29 fedora kernel: Code: 00 fa eb 85 4c 8d 4d df 4c 8d 45 de 44 89 e9 ba 09 00 00 00 be 01 00 00 00 4c 89 e7 e8 55 3c 03 00 85 c0 75 19 80 7d df 01 74 02 <0f> ff 4c 89 e7 e8 41 3a 03 00 0f b6 45 de e9 
Nov 22 09:18:29 fedora kernel: ---[ end trace 79db828187cb27ea ]---

---

Nov 22 12:44:30 fedora kernel: ------------[ cut here ]------------
Nov 22 12:44:30 fedora kernel: WARNING: CPU: 5 PID: 14468 at drivers/gpu/drm/drm_atomic_helper.c:1682 drm_atomic_helper_commit_hw_done+0x93/0xa0 [drm_kms_helper]
Nov 22 12:44:30 fedora kernel: Modules linked in: vfat fat rfcomm fuse xt_CHECKSUM tun ipt_MASQUERADE nf_nat_masquerade_ipv4 xt_addrtype nf_conntrack_netbios_ns nf_conntrack_broadcast xt_CT ip6t_rpfilter ip6t_REJE
Nov 22 12:44:30 fedora kernel:  intel_uncore snd_hda_codec_generic intel_rapl_perf snd_hda_intel snd_hda_codec uvcvideo snd_hda_core videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 snd_hwdep iwlwifi videobuf2_co
Nov 22 12:44:30 fedora kernel: CPU: 5 PID: 14468 Comm: kworker/u16:1 Tainted: G        W       4.13.12-300.fc27.x86_64 #1
Nov 22 12:44:30 fedora kernel: Hardware name: LENOVO 20EQS64N0B/20EQS64N0B, BIOS N1EET73W (1.46 ) 09/28/2017
Nov 22 12:44:30 fedora kernel: Workqueue: events_unbound nv50_disp_atomic_commit_work [nouveau]
Nov 22 12:44:30 fedora kernel: task: ffff8c61d0c58000 task.stack: ffff9e9bca26c000
Nov 22 12:44:30 fedora kernel: RIP: 0010:drm_atomic_helper_commit_hw_done+0x93/0xa0 [drm_kms_helper]
Nov 22 12:44:30 fedora kernel: RSP: 0018:ffff9e9bca26fdb0 EFLAGS: 00010206
Nov 22 12:44:30 fedora kernel: RAX: ffff8c61789d7000 RBX: 0000000000000000 RCX: ffff8c6239f85800
Nov 22 12:44:30 fedora kernel: RDX: ffff8c6111682c00 RSI: 000000000000004c RDI: ffff8c617a4b2780
Nov 22 12:44:30 fedora kernel: RBP: ffff9e9bca26fdd0 R08: ffffffffc05d6da0 R09: ffff8c616f65ce40
Nov 22 12:44:30 fedora kernel: R10: 0000000000000000 R11: 00000000000003b3 R12: ffff8c6239de8000
Nov 22 12:44:30 fedora kernel: R13: ffff8c617a4b2000 R14: ffff8c617a4b2780 R15: ffff8c6239f97410
Nov 22 12:44:30 fedora kernel: FS:  0000000000000000(0000) GS:ffff8c6263d40000(0000) knlGS:0000000000000000
Nov 22 12:44:30 fedora kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Nov 22 12:44:30 fedora kernel: CR2: 00007f4001d7fd40 CR3: 00000003a8e09000 CR4: 00000000003406e0
Nov 22 12:44:30 fedora kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Nov 22 12:44:30 fedora kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Nov 22 12:44:30 fedora kernel: Call Trace:
Nov 22 12:44:30 fedora kernel:  nv50_disp_atomic_commit_tail+0x847/0x39e0 [nouveau]
Nov 22 12:44:30 fedora kernel:  nv50_disp_atomic_commit_work+0x12/0x20 [nouveau]
Nov 22 12:44:30 fedora kernel:  process_one_work+0x193/0x3c0
Nov 22 12:44:30 fedora kernel:  worker_thread+0x4e/0x3c0
Nov 22 12:44:30 fedora kernel:  kthread+0x125/0x140
Nov 22 12:44:30 fedora kernel:  ? process_one_work+0x3c0/0x3c0
Nov 22 12:44:30 fedora kernel:  ? kthread_park+0x60/0x60
Nov 22 12:44:30 fedora kernel:  ret_from_fork+0x25/0x30
Nov 22 12:44:30 fedora kernel: Code: fa 49 8d 7d 30 e8 4e fe c1 f9 41 c6 84 24 28 04 00 00 00 49 8b 4e 08 83 c3 01 39 99 38 03 00 00 7f 9d 5b 41 5c 41 5d 41 5e 5d c3 <0f> ff eb c5 f3 c3 0f 1f 80 00 00 00 00 0f 1f 
Nov 22 12:44:30 fedora kernel: ---[ end trace 79db828187cb27eb ]---

Comment 5 nik 2017-12-25 19:45:12 UTC
I can reproduce a very similar crash reliably on Fedora 27.

This bug looks strongly related to https://bugs.freedesktop.org/show_bug.cgi?id=103351 and https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1723619.

The external monitor works perfectly on 4.12.9-300.fc26.x86_64 and breaks on 4.13.1-301.fc27.x86_64. The commit mentioned in that issue was introduced between these two versions.

Dec 25 01:41:47 localhost gnome-shell[1662]: Failed to apply DRM plane transform 0: Invalid argument
Dec 25 01:41:47 localhost gnome-shell[1662]: Failed to apply DRM plane transform 0: Invalid argument
Dec 25 01:41:48 localhost gnome-shell[1662]: JS WARNING: [resource:///org/gnome/shell/ui/workspaceThumbnail.js 892]: reference to undefined property "_switchWorkspaceNotifyId"
Dec 25 01:41:48 localhost gsd-color[1301]: no xrandr-Dell Inc.-DELL U2415-7MT0167S57AS device found: Failed to find output xrandr-Dell Inc.-DELL U2415-7MT0167S57AS
Dec 25 01:41:48 localhost kernel: BUG: unable to handle kernel NULL pointer dereference at           (null)
Dec 25 01:41:48 localhost kernel: IP:           (null)
Dec 25 01:41:48 localhost kernel: PGD 0 
Dec 25 01:41:48 localhost kernel: P4D 0 
Dec 25 01:41:48 localhost kernel: 
Dec 25 01:41:48 localhost kernel: Oops: 0010 [#1] SMP
Dec 25 01:41:48 localhost kernel: Modules linked in: rfcomm fuse ccm nf_conntrack_netbios_ns nf_conntrack_broadcast xt_CT ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ip_set nfnetlink ebtable_nat ebtable_broute bridge stp llc ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack libcrc32c iptable_mangle iptable_raw iptable_security ebtable_filter ebtables ip6table_filter ip6_tables bnep sunrpc vfat fat intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm iTCO_wdt iTCO_vendor_support mei_wdt irqbypass intel_cstate intel_uncore intel_rapl_perf arc4 joydev wmi_bmof uvcvideo iwldvm btusb btrtl mac80211 btbcm btintel bluetooth videobuf2_vmalloc videobuf2_memops videobuf2_v4l2
Dec 25 01:41:48 localhost kernel:  videobuf2_core videodev snd_hda_codec_hdmi i2c_i801 thinkpad_acpi media iwlwifi lpc_ich snd_hda_codec_conexant snd_hda_codec_generic mei_me snd_hda_intel ecdh_generic mei snd_hda_codec cfg80211 snd_hda_core snd_hwdep snd_seq snd_seq_device snd_pcm snd_timer snd shpchp tpm_tis soundcore tpm_tis_core tpm rfkill dm_crypt nouveau crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel serio_raw mxm_wmi i2c_algo_bit drm_kms_helper sdhci_pci ttm sdhci e1000e mmc_core drm ptp pps_core wmi video
Dec 25 01:41:48 localhost kernel: CPU: 0 PID: 68 Comm: kworker/u16:1 Not tainted 4.13.1-301.fc27.x86_64 #1
Dec 25 01:41:48 localhost kernel: Hardware name: LENOVO 4180PC4/4180PC4, BIOS 83ET76WW (1.46 ) 07/05/2013
Dec 25 01:41:48 localhost kernel: Workqueue: nvkm-disp gf119_disp_super [nouveau]
Dec 25 01:41:48 localhost kernel: task: ffff8f14901b8000 task.stack: ffffa0d8c1250000
Dec 25 01:41:48 localhost kernel: RIP: 0010:          (null)
Dec 25 01:41:48 localhost kernel: RSP: 0018:ffffa0d8c1253c10 EFLAGS: 00010206
Dec 25 01:41:48 localhost kernel: RAX: ffffffffc03b5ee0 RBX: 0000000000000000 RCX: 0000000000000016
Dec 25 01:41:48 localhost kernel: RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8f148f0fe720
Dec 25 01:41:48 localhost kernel: RBP: ffffa0d8c1253c98 R08: 0000000000000000 R09: 0000000000000000
Dec 25 01:41:48 localhost kernel: R10: 0000000000000000 R11: 0000000000001000 R12: 0000000000000000
Dec 25 01:41:48 localhost kernel: R13: 0000000000000000 R14: ffff8f148fd90800 R15: ffffa0d8c1253d38
Dec 25 01:41:48 localhost kernel: FS:  0000000000000000(0000) GS:ffff8f149dc00000(0000) knlGS:0000000000000000
Dec 25 01:41:48 localhost kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Dec 25 01:41:48 localhost kernel: CR2: 0000000000000000 CR3: 0000000122e09000 CR4: 00000000000406f0
Dec 25 01:41:48 localhost kernel: Call Trace:
Dec 25 01:41:48 localhost kernel:  ? nvkm_dp_train_drive+0x183/0x2c0 [nouveau]
Dec 25 01:41:48 localhost kernel:  nvkm_dp_acquire+0x4f3/0xcd0 [nouveau]
Dec 25 01:41:48 localhost kernel:  nv50_disp_super_2_2+0x5d/0x470 [nouveau]
Dec 25 01:41:48 localhost kernel:  ? nvkm_devinit_pll_set+0xf/0x20 [nouveau]
Dec 25 01:41:48 localhost kernel:  gf119_disp_super+0x19c/0x2f0 [nouveau]
Dec 25 01:41:48 localhost kernel:  process_one_work+0x193/0x3c0
Dec 25 01:41:48 localhost kernel:  worker_thread+0x4a/0x3a0
Dec 25 01:41:48 localhost kernel:  kthread+0x125/0x140
Dec 25 01:41:48 localhost kernel:  ? process_one_work+0x3c0/0x3c0
Dec 25 01:41:48 localhost kernel:  ? kthread_park+0x60/0x60
Dec 25 01:41:48 localhost kernel:  ret_from_fork+0x25/0x30
Dec 25 01:41:48 localhost kernel: Code:  Bad RIP value.
Dec 25 01:41:48 localhost kernel: RIP:           (null) RSP: ffffa0d8c1253c10
Dec 25 01:41:48 localhost kernel: CR2: 0000000000000000
Dec 25 01:41:48 localhost kernel: ---[ end trace 9a99aaab375d014e ]---
Dec 25 01:41:50 localhost abrt-dump-journal-oops[942]: abrt-dump-journal-oops: Found oopses: 1
Dec 25 01:41:50 localhost abrt-dump-journal-oops[942]: abrt-dump-journal-oops: Creating problem directories
Dec 25 01:41:50 localhost kernel: nouveau 0000:01:00.0: DRM: EVO timeout
Dec 25 01:41:51 localhost abrt-notification[2289]: System encountered a non-fatal error in nvkm_dp_train_drive()
Dec 25 01:41:52 localhost kernel: nouveau 0000:01:00.0: DRM: base-1: timeout
Dec 25 01:41:54 localhost kernel: nouveau 0000:01:00.0: DRM: base-1: timeout
Dec 25 01:41:56 localhost kernel: nouveau 0000:01:00.0: DRM: base-1: timeout
Dec 25 01:41:58 localhost kernel: nouveau 0000:01:00.0: DRM: base-1: timeout
Dec 25 01:42:00 localhost kernel: nouveau 0000:01:00.0: DRM: base-1: timeout
Dec 25 01:42:02 localhost kernel: nouveau 0000:01:00.0: DRM: base-1: timeout
Dec 25 01:42:04 localhost kernel: nouveau 0000:01:00.0: DRM: base-1: timeout
Dec 25 01:42:06 localhost kernel: nouveau 0000:01:00.0: DRM: base-1: timeout
Dec 25 01:42:08 localhost kernel: nouveau 0000:01:00.0: DRM: base-1: timeout
Dec 25 01:42:10 localhost kernel: nouveau 0000:01:00.0: DRM: base-1: timeout
Dec 25 01:42:12 localhost kernel: nouveau 0000:01:00.0: DRM: base-1: timeout
Dec 25 01:42:14 localhost kernel: nouveau 0000:01:00.0: DRM: base-1: timeout
Dec 25 01:42:16 localhost kernel: nouveau 0000:01:00.0: DRM: base-1: timeout
Dec 25 01:42:18 localhost kernel: nouveau 0000:01:00.0: DRM: base-1: timeout
Dec 25 01:42:20 localhost kernel: nouveau 0000:01:00.0: DRM: base-1: timeout

Comment 6 Jan Vlug 2017-12-31 12:20:08 UTC
Related to #1514831 and #1529854?

Comment 7 Rob Clark 2018-01-06 15:30:37 UTC
*** Bug 1527936 has been marked as a duplicate of this bug. ***

Comment 8 Rob Clark 2018-01-06 15:33:02 UTC
upstream bug: https://bugs.freedesktop.org/show_bug.cgi?id=103421

I think I see what the issue is.

Comment 9 Rob Clark 2018-01-06 16:03:04 UTC
This patch fixes the issue for me:

https://patchwork.freedesktop.org/patch/196301/

Comment 10 Jeff Peeler 2018-01-08 18:59:01 UTC
Rob, any chance you have a kernel RPM that I could test with?

Comment 11 Rob Clark 2018-01-08 19:24:14 UTC
(In reply to Jeff Peeler from comment #10)
> Rob, any chance you have a kernel RPM that I could test with?

unfortunately not, I was just building the kernel outside of koji/rpmbuild.  But I was having the same crash on my desktop at home without the patch.

Comment 12 Jeff Peeler 2018-01-09 00:40:31 UTC
Created attachment 1378826 [details]
4.14.11 dmesg output

Comment 13 Jeff Peeler 2018-01-09 00:41:40 UTC
Created attachment 1378827 [details]
4.14.12 custom compiled debug kernel

Comment 14 Jeff Peeler 2018-01-09 00:43:37 UTC
I recompiled a Fedora kernel with the fix made by Rob, but I'm still experiencing problems. I'm unable to boot a 4.14-based kernel with my external monitors plugged in (the problem seems to be getting worse), so I have to plug in my monitors after booting. Above I attached two dmesg outputs: one from the latest available kernel in Fedora and the second from my compiled kernel.

Comment 15 Rob Clark 2018-01-09 01:44:48 UTC
(In reply to Jeff Peeler from comment #14)
> I recompiled a Fedora kernel with the fix made by Rob, however I'm still
> experiencing problems. I'm unable to boot a 4.14 based kernel with my
> external monitors plugged in (the problem seems to be getting worse), so I
> have to plug in my monitors after booting. Above I attached two dmesg
> outputs: one from the latest available kernel in Fedora and the second from
> my compiled kernel.

well, the 4.14 dmesg is at least a *different* problem.. so I guess that is progress.

My desktop at home, where I had the same problem (at least the one originally reported in this bz; I haven't seen the 2nd issue you posted yet), has a GF119 card with a single monitor that is always connected at boot.

Any chance I could get a full dmesg w/ the 4.14+patch kernel?  What you posted there is a lockdep warning, which might be the issue you are hitting or might just be a warning about some potential issue.  IIRC the -debug kernel should have CONFIG_DETECT_HUNG_TASK=y, although I don't remember the default timeout, i.e. CONFIG_DEFAULT_HUNG_TASK_TIMEOUT.. maybe 60 or 120 sec.  If you can ssh to the system after it fails to boot to the login screen, and it is stuck at that point due to a lock issue, then after however many seconds HUNG_TASK_TIMEOUT is set to you should start getting more backtraces in dmesg.  This would be useful to see.
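
As a rough aid for the check described above, here is a small Python sketch (an illustration, not something posted in this bug) that reads the runtime hung-task timeout and looks up an option in kernel config text. It assumes the timeout is exposed at `/proc/sys/kernel/hung_task_timeout_secs`, which is only present when hung-task detection is built in, and that the config text comes from a file such as `/boot/config-<release>`. The function names are hypothetical.

```python
from pathlib import Path


def hung_task_timeout(path="/proc/sys/kernel/hung_task_timeout_secs"):
    """Return the hung-task timeout in seconds, or None if unavailable.

    None covers both a missing file (detection not built in, or not
    Linux) and unparsable content. A value of 0 means checking is off.
    """
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None


def config_value(option, config_text):
    """Return the value of a CONFIG_* option from kernel config text.

    Returns None when the option has no 'OPTION=value' line, which is
    also how a '# OPTION is not set' entry is reported.
    """
    prefix = option + "="
    for line in config_text.splitlines():
        if line.startswith(prefix):
            return line[len(prefix):]
    return None


if __name__ == "__main__":
    # Sample text standing in for a real config file; values are made up.
    sample = "CONFIG_DETECT_HUNG_TASK=y\nCONFIG_DEFAULT_HUNG_TASK_TIMEOUT=120\n"
    print(config_value("CONFIG_DETECT_HUNG_TASK", sample))
    print(hung_task_timeout())
```

If the timeout comes back as `None` or 0, no hung-task backtraces should be expected no matter how long the machine sits at the failed boot.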

Thanks

Comment 16 Rob Clark 2018-01-09 01:51:10 UTC
hmm, possibly relevant.. my dp monitor is dp single stream, instead of mst.. it looks from the backtrace like yours is mst (although some monitors have a menu option to switch between mst and sst)..

from a quick look, the new issue doesn't look like something that should be a nouveau specific issue, but I'll have a closer look in the morning.

Comment 17 Jeff Peeler 2018-01-09 02:33:24 UTC
Created attachment 1378855 [details]
4.14.12 debug full dmesg

Comment 18 Jeff Peeler 2018-01-09 02:39:58 UTC
Created attachment 1378856 [details]
journalctl -b -k 4.15 kernel

One other data point is I tried booting a 4.15 kernel from koji. I don't see any panic, but the monitors also don't work at all.

Comment 19 Rob Clark 2018-01-09 13:12:18 UTC
(In reply to Jeff Peeler from comment #17)
> Created attachment 1378855 [details]
> 4.14.12 debug full dmesg

ok, from the full log, I realize that you have GM107, not GF119, and your issue is different from the backtrace that nik posted in #c5 (which is the actual issue that I fixed, but that same problem doesn't apply to GM107).
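
To make the difference between the two signatures in this bug concrete, here is a hedged Python sketch that labels a log excerpt as either the NULL-RIP oops from #c5 or a plain `WARNING:` splat like the ones in the description and #c4. The labels `null-call`, `warning`, and `unknown` and the function name are made up for illustration, and the matching is keyed only to the log text that appears in this report.

```python
import re

# A RIP line with no symbol, as in "RIP: 0010:          (null)"
# or the trailing "RIP:           (null) RSP: ..." form.
NULL_RIP_RE = re.compile(r"RIP:\s+(?:[0-9a-f]{4}:)?\s*\(null\)")


def classify(lines):
    """Roughly label a kernel log excerpt by its failure signature."""
    text = "\n".join(lines)
    # An oops whose instruction pointer itself is NULL.
    if "NULL pointer dereference" in text and NULL_RIP_RE.search(text):
        return "null-call"
    # A WARN splat: execution continues after the trace is printed.
    if "WARNING:" in text:
        return "warning"
    return "unknown"


if __name__ == "__main__":
    oops = [
        "kernel: BUG: unable to handle kernel NULL pointer dereference at           (null)",
        "kernel: RIP: 0010:          (null)",
    ]
    warn = [
        "kernel: WARNING: CPU: 1 PID: 5 at drivers/gpu/drm/nouveau/"
        "include/nvkm/subdev/i2c.h:169 nvkm_dp_train_sense+0xd9/0x200 [nouveau]",
    ]
    print(classify(oops), classify(warn))
```

This is only a triage aid: two excerpts with the same label can still have different root causes.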

Comment 23 Jeff Peeler 2018-01-18 16:00:00 UTC
When will the function pointer fix be in fedora's kernel?

Since a real issue was solved here and my particular issue seems to vary with the exact kernel version, I think I should open a new bug.

Also, I had previously considered checking whether the issue was still present with a DisplayPort to HDMI adapter, but those cost four times as much ($40) as the HDMI to DisplayPort adapters do.

Comment 24 Jeff Peeler 2018-01-21 18:38:53 UTC
The issue I'm experiencing is related to bug 1509294. There's also an upstream bug here: https://bugs.freedesktop.org/show_bug.cgi?id=103721. I'm going to submit further info there instead.

Comment 25 Fedora Update System 2018-01-24 16:48:10 UTC
kernel-4.14.15-300.fc27 has been submitted as an update to Fedora 27. https://bodhi.fedoraproject.org/updates/FEDORA-2018-0b4c4e1fed

Comment 26 Fedora Update System 2018-01-24 16:49:37 UTC
kernel-4.14.15-200.fc26 has been submitted as an update to Fedora 26. https://bodhi.fedoraproject.org/updates/FEDORA-2018-7a461886fb

Comment 27 Fedora Update System 2018-01-25 07:55:31 UTC
kernel-4.14.15-200.fc26 has been pushed to the Fedora 26 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2018-7a461886fb

Comment 28 Fedora Update System 2018-01-25 08:38:27 UTC
kernel-4.14.15-300.fc27 has been pushed to the Fedora 27 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2018-0b4c4e1fed

Comment 29 Fedora Update System 2018-01-28 15:37:00 UTC
kernel-4.14.15-301.fc27 has been submitted as an update to Fedora 27. https://bodhi.fedoraproject.org/updates/FEDORA-2018-eaf32ed38c

Comment 30 Fedora Update System 2018-01-28 15:38:09 UTC
kernel-4.14.15-201.fc26 has been submitted as an update to Fedora 26. https://bodhi.fedoraproject.org/updates/FEDORA-2018-951143e759

Comment 31 Fedora Update System 2018-01-31 16:57:24 UTC
kernel-4.14.15-301.fc27 has been submitted as an update to Fedora 27. https://bodhi.fedoraproject.org/updates/FEDORA-2018-eaf32ed38c

Comment 32 Fedora Update System 2018-01-31 16:57:59 UTC
kernel-4.14.15-201.fc26 has been submitted as an update to Fedora 26. https://bodhi.fedoraproject.org/updates/FEDORA-2018-951143e759

Comment 33 Fedora Update System 2018-02-01 04:17:11 UTC
kernel-4.14.16-300.fc27 has been submitted as an update to Fedora 27. https://bodhi.fedoraproject.org/updates/FEDORA-2018-d09a73ce72

Comment 34 Fedora Update System 2018-02-01 04:18:20 UTC
kernel-4.14.16-200.fc26 has been submitted as an update to Fedora 26. https://bodhi.fedoraproject.org/updates/FEDORA-2018-d82b617d6c

Comment 35 Fedora Update System 2018-02-01 19:10:15 UTC
kernel-4.14.16-200.fc26 has been pushed to the Fedora 26 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2018-d82b617d6c

Comment 36 Fedora Update System 2018-02-01 19:31:28 UTC
kernel-4.14.16-300.fc27 has been pushed to the Fedora 27 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2018-d09a73ce72

Comment 37 Fedora Update System 2018-02-02 16:58:28 UTC
kernel-4.14.16-200.fc26 has been pushed to the Fedora 26 stable repository. If problems still persist, please make note of it in this bug report.

Comment 38 Fedora Update System 2018-02-02 17:39:53 UTC
kernel-4.14.16-300.fc27 has been pushed to the Fedora 27 stable repository. If problems still persist, please make note of it in this bug report.
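
A quick way to tell whether a given kernel release string is at or past the fixed build is sketched below in Python. This is a simplified comparison on the numeric version and the leading release number only; it is not RPM's full version-comparison algorithm, it ignores the distribution tag, and the function names are hypothetical.

```python
import re


def kernel_key(release):
    """Turn '4.14.16-300.fc27' into a comparable ((4, 14, 16), 300)."""
    m = re.match(r"(\d+(?:\.\d+)*)-(\d+)", release)
    if m is None:
        raise ValueError("unrecognised kernel release: %r" % release)
    version = tuple(int(part) for part in m.group(1).split("."))
    return version, int(m.group(2))


def has_fix(running, fixed):
    """True if `running` sorts at or after `fixed` in this simple scheme."""
    return kernel_key(running) >= kernel_key(fixed)


if __name__ == "__main__":
    fixed = "4.14.16-300.fc27"
    for running in ("4.13.11-300.fc27.x86_64", "4.14.16-300.fc27.x86_64"):
        print(running, has_fix(running, fixed))
```

Compare only against the fixed build for the same Fedora release (fc26 or fc27), since the tag is not part of the comparison, and note that a locally built kernel may or may not carry the patch regardless of its version string.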

