Bug 529292

Summary: Graphics hang with KMS on nVidia 7800GT with FC12 beta RC2 install
Product: [Fedora] Fedora
Reporter: Chris Ball <chris-rhbugs>
Component: kernel
Assignee: Ben Skeggs <bskeggs>
Status: CLOSED RAWHIDE
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: high
Priority: low
Version: rawhide
CC: awilliam, bskeggs, carl, dougsland, gansalmon, itamar, jlaska, jwboyer, kernel-maint
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Doc Type: Bug Fix
Last Closed: 2009-11-06 16:28:45 UTC
Bug Blocks: 530341    
Attachments:
dmesg with drm debug=15 and nouveau.modeset=1 (flags: none)
X log when launching with nouveau DDX and no KMS (flags: none)
X log when launching with nv DDX and no KMS (flags: none)
binary edid from Apple Cinema 30" (flags: none)
dmesg after Ben's patch (flags: none)
xorg log after Ben's patch (flags: none)
Compendium of similar failures from four reporters (flags: none)
Versions of drivers, libdrm and kernel (flags: none)

Description Chris Ball 2009-10-16 02:02:36 UTC
Description of problem:

When booting an installed FC12 beta RC2 (installed with vesa to avoid the hang), the display stops updating when the modeset happens and stays hung; nothing I try produces any further change in video output.  Running nv or nouveau *without* modesetting also hangs, with no further video updates.
I haven't tried the nvidia driver because I can't get it to build against F12 beta.

Version-Release number of selected component (if applicable):

The card is an eVGA 7800GT, 10de:0092 3842:c517.

The display is an Apple Cinema 30" connected via dual-link DVI at 2560x1600.

F12 beta RC2, 2.6.31.1-56.fc12.i686.PAE, 64-bit Athlon64 3800+ machine

How reproducible:

Every time.

Steps to Reproduce:
1. boot without nomodeset
  
Actual results:
video hangs

Expected results:
working video

Additional info:

Attaching:

* dmesg with drm debug=15
* nouveau DDX log (doesn't seem to show anything wrong, but the display doesn't draw anything past a glitched pointer)
* nv DDX log (doesn't seem to show anything wrong; the display draws the GDM background, then glitches and hangs when the GDM box animates/expands to show the user choices)
* binary copy of the EDID

Comment 1 Chris Ball 2009-10-16 02:03:37 UTC
Created attachment 364997 [details]
dmesg with drm debug=15 and nouveau.modeset=1

Comment 2 Chris Ball 2009-10-16 02:04:12 UTC
Created attachment 364998 [details]
X log when launching with nouveau DDX and no KMS

Comment 3 Chris Ball 2009-10-16 02:04:48 UTC
Created attachment 364999 [details]
X log when launching with nv DDX and no KMS

Comment 4 Chris Ball 2009-10-16 02:05:14 UTC
Created attachment 365000 [details]
binary edid from Apple Cinema 30"

Comment 5 Chris Ball 2009-10-16 02:19:58 UTC
Tried kernel 2.6.31.4-83.fc12.i686 just in case it helped; same result.

Comment 6 Chris Ball 2009-10-16 02:24:09 UTC
Also tried the other DVI port -- this card has two DVI ports, one of which has a dual-link TMDS and can run at full resolution.

Doing so results in the mode attempted being 1280x800 instead (which is correct for the single transmitter), but it still hangs rather than setting the mode.
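
For context, a rough sketch of why the single-link port cannot carry the full-resolution mode. It assumes the 165 MHz single-link TMDS limit from the DVI specification and CVT reduced-blanking totals at a nominal 60 Hz; the timings are assumptions for illustration and were not read from the attached EDID.

```python
# A single TMDS link is capped at a 165 MHz pixel clock by the DVI spec.
SINGLE_LINK_MAX_HZ = 165_000_000


def pixel_clock_hz(h_total, v_total, refresh_hz):
    """Pixel clock = pixels per frame (including blanking) * frames per second."""
    return h_total * v_total * refresh_hz


def needs_dual_link(h_total, v_total, refresh_hz):
    """True when the mode exceeds what one TMDS link can carry."""
    return pixel_clock_hz(h_total, v_total, refresh_hz) > SINGLE_LINK_MAX_HZ


# Assumed CVT reduced-blanking totals (h_total, v_total, refresh), not taken from the EDID.
MODES = {
    "2560x1600": (2720, 1646, 60),
    "1280x800": (1440, 823, 60),
}

for name, timing in MODES.items():
    mhz = pixel_clock_hz(*timing) / 1e6
    print(f"{name}: {mhz:.1f} MHz, dual-link needed: {needs_dual_link(*timing)}")
```

Under these assumptions 2560x1600 needs roughly 269 MHz, well over one link's limit, while 1280x800 needs roughly 71 MHz, which is consistent with the smaller mode being chosen on the single-link port.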

Comment 7 Chris Ball 2009-10-16 17:08:30 UTC
Ben committed a kernel patch to nouveau GIT to fix this last night, and I've tried it out.  

It modesets -- I get a 2560x1600 mode that is legible, but contains small amounts of glitching and pixel trails from previous text.

When X starts with nouveau, I get a screenful of snow/noise.  If I hit ctrl+alt+backspace, X exits with a "Fatal server error: Detected GPU lockup".

Remarkably, after this happens, the fb is no longer glitchy, and starting X *another* time gets me a working X session.  However, I see that the kernel has said "GPU lockup - switching to software fbcon".

Am attaching the new dmesg, with the GPU lockup at the bottom, and new X log, also with the GPU lockup message.
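
For anyone comparing logs, here is a minimal sketch of a scan for the two lockup messages quoted above. The function name and the sample text are hypothetical; the only message strings used are the ones quoted in this comment, and real logs may carry extra prefixes or different wording.

```python
import re

# Message fragments quoted in this report; real logs may word them differently.
LOCKUP_PATTERNS = [
    re.compile(r"Detected GPU lockup"),
    re.compile(r"GPU lockup - switching to software fbcon"),
]


def find_lockups(log_text):
    """Return (line_number, line) pairs for lines that mention a GPU lockup."""
    hits = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        if any(pattern.search(line) for pattern in LOCKUP_PATTERNS):
            hits.append((number, line.strip()))
    return hits


# Hypothetical sample standing in for the attached dmesg and X log.
sample = (
    "unrelated line\n"
    "GPU lockup - switching to software fbcon\n"
    "Fatal server error: Detected GPU lockup\n"
)

for number, line in find_lockups(sample):
    print(number, line)
```

The same function could be pointed at the text of a saved dmesg or Xorg log to check whether the lockup was reported by the kernel, by X, or by both.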

Comment 8 Chris Ball 2009-10-16 17:10:26 UTC
Created attachment 365069 [details]
dmesg after Ben's patch

Comment 9 Chris Ball 2009-10-16 17:11:11 UTC
Created attachment 365070 [details]
xorg log after Ben's patch

Comment 10 Adam Williamson 2009-11-05 22:15:59 UTC
So, we have four people hitting very similar issues. I'm not sure these are the same, but concentrating them into one report for now. Ben can tell us if they need to be split up.

Those affected are Chris Ball (reporter of this bug), James Laska, Carl van Tonder (via #530169) and Josh Boyer.

I am attaching a compendium of the messages each reporter is getting. They're obviously similar, but not identical, so we don't know if these are all the same bug. Symptoms are slightly different in each case, but we can say at least that this generally renders the current X session completely unusable.

-- 
Fedora Bugzappers volunteer triage team
https://fedoraproject.org/wiki/BugZappers

Comment 11 Adam Williamson 2009-11-05 22:16:40 UTC
Created attachment 367743 [details]
Compendium of similar failures from four reporters

Comment 12 Adam Williamson 2009-11-05 22:18:10 UTC
Adding all affected users to CC and setting bug to block F12.

Comment 13 Adam Williamson 2009-11-05 22:29:58 UTC
Ben would like everyone to test kernel -122 and see if it resolves these problems. It's available at:

http://koji.fedoraproject.org/koji/buildinfo?buildID=139823

Comment 14 Carl van Tonder 2009-11-05 23:25:55 UTC
Created attachment 367754 [details]
Versions of drivers, libdrm and kernel

Still busy installing the -122 kernel, so I will post results after that. In the meantime, I've attached my versions of -drv-nouveau, libdrm and server-Xorg, as well as my *current* kernel.

Comment 15 Josh Boyer 2009-11-05 23:49:48 UTC
The -122 kernel seems to have fixed things on my iMac G5 using Nouveau.

Comment 16 Adam Williamson 2009-11-06 00:38:23 UTC
Setting this to MODIFIED. We have a tag request in for -122:

https://fedorahosted.org/rel-eng/ticket/3126

Comment 17 Adam Williamson 2009-11-06 00:39:20 UTC
Note that Carl van Tonder reported on IRC that the fix works for him. James Laska reports that the protection fault errors are gone from his logs, but he cannot confirm X is working as he's not in front of the machine. Chris Ball has not yet been able to test.

Comment 18 James Laska 2009-11-06 12:10:42 UTC
I can report that the hangs no longer occur on my nVidia Corporation NV44 [Quadro NVS 285] system using kernel-2.6.31.5-122.fc12.i686.

Comment 19 Chris Ball 2009-11-07 23:39:23 UTC
-122 is actually worse for me, but I suspect a bug's been fixed in the process.

Previously, I would get a somewhat glitchy modeset from nouveau, then start X, X would detect a GPU lockup, I'd start X *again*, and I'd have a working X session.

With -122, I get the glitchy modeset, but the lockup never happens.  Then when I start X I get a glitchy X session, and that soon turns into a full system hang.