Bug 1167223 - system freezes when playing videos in firefox after upgrade to mesa-10.3.3-1.20141110.fc20
Summary: system freezes when playing videos in firefox after upgrade to mesa-10.3.3-1....
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: mesa
Version: 20
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Igor Gnatenko
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2014-11-24 09:18 UTC by Kamil Páral
Modified: 2014-12-13 09:50 UTC
CC List: 3 users

Fixed In Version: mesa-10.3.5-1.20141207.fc21
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2014-12-13 09:50:22 UTC
Type: Bug
Embargoed:



Description Kamil Páral 2014-11-24 09:18:40 UTC
Description of problem:
After I upgraded from mesa-10.1.5-1.20140607.fc20 to mesa-10.3.3-1.20141110.fc20 yesterday, my system *very* often freezes when playing videos in firefox (flash or html5): 3 times in the last 2 hours. I see this in the journal:

Nov 24 08:50:21 titan kernel: radeon 0000:01:00.0: GPU lockup (waiting for 0x00000000000224a9 last fence id 0x00000000000224a6 on ring 0)
Nov 24 08:50:21 titan kernel: radeon 0000:01:00.0: failed to get a new IB (-35)
Nov 24 08:50:21 titan kernel: [drm:radeon_cs_ib_fill] *ERROR* Failed to get ib !
Nov 24 08:50:22 titan kernel: radeon 0000:01:00.0: Saved 12107 dwords of commands on ring 0.
Nov 24 08:50:22 titan kernel: radeon 0000:01:00.0: GPU softreset: 0x0000006C
Nov 24 08:50:22 titan kernel: radeon 0000:01:00.0:   GRBM_STATUS               = 0xA0003028
Nov 24 08:50:22 titan kernel: radeon 0000:01:00.0:   GRBM_STATUS_SE0           = 0x00000006
Nov 24 08:50:22 titan kernel: radeon 0000:01:00.0:   GRBM_STATUS_SE1           = 0x00000006
Nov 24 08:50:22 titan kernel: radeon 0000:01:00.0:   SRBM_STATUS               = 0x200000C0
Nov 24 08:50:22 titan kernel: radeon 0000:01:00.0:   SRBM_STATUS2              = 0x00000000
Nov 24 08:50:22 titan kernel: radeon 0000:01:00.0:   R_008674_CP_STALLED_STAT1 = 0x00000000
Nov 24 08:50:22 titan kernel: radeon 0000:01:00.0:   R_008678_CP_STALLED_STAT2 = 0x00010000
Nov 24 08:50:22 titan kernel: radeon 0000:01:00.0:   R_00867C_CP_BUSY_STAT     = 0x00400002
Nov 24 08:50:22 titan kernel: radeon 0000:01:00.0:   R_008680_CP_STAT          = 0x84010243
Nov 24 08:50:22 titan kernel: radeon 0000:01:00.0:   R_00D034_DMA_STATUS_REG   = 0x44C83146
Nov 24 08:50:22 titan kernel: radeon 0000:01:00.0:   R_00D834_DMA_STATUS_REG   = 0x44E84266
Nov 24 08:50:22 titan kernel: radeon 0000:01:00.0:   VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x00000000
Nov 24 08:50:22 titan kernel: radeon 0000:01:00.0:   VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x00000000
Nov 24 08:50:22 titan kernel: radeon 0000:01:00.0: GRBM_SOFT_RESET=0x0000DDFF
Nov 24 08:50:22 titan kernel: radeon 0000:01:00.0: SRBM_SOFT_RESET=0x00100140
Nov 24 08:50:22 titan kernel: radeon 0000:01:00.0:   GRBM_STATUS               = 0x00003028
Nov 24 08:50:22 titan kernel: radeon 0000:01:00.0:   GRBM_STATUS_SE0           = 0x00000006
Nov 24 08:50:22 titan kernel: radeon 0000:01:00.0:   GRBM_STATUS_SE1           = 0x00000006
Nov 24 08:50:22 titan kernel: radeon 0000:01:00.0:   SRBM_STATUS               = 0x200000C0
Nov 24 08:50:22 titan kernel: radeon 0000:01:00.0:   SRBM_STATUS2              = 0x00000000
Nov 24 08:50:22 titan kernel: radeon 0000:01:00.0:   R_008674_CP_STALLED_STAT1 = 0x00000000
Nov 24 08:50:22 titan kernel: radeon 0000:01:00.0:   R_008678_CP_STALLED_STAT2 = 0x00000000
Nov 24 08:50:22 titan kernel: radeon 0000:01:00.0:   R_00867C_CP_BUSY_STAT     = 0x00000000
Nov 24 08:50:22 titan kernel: radeon 0000:01:00.0:   R_008680_CP_STAT          = 0x00000000
Nov 24 08:50:22 titan kernel: radeon 0000:01:00.0:   R_00D034_DMA_STATUS_REG   = 0x44C83D57
Nov 24 08:50:22 titan kernel: radeon 0000:01:00.0:   R_00D834_DMA_STATUS_REG   = 0x44C83D57
Nov 24 08:50:22 titan kernel: radeon 0000:01:00.0: GPU reset succeeded, trying to resume
Nov 24 08:50:22 titan kernel: [drm] probing gen 2 caps for device 8086:c01 = 261ad03/e
Nov 24 08:50:22 titan kernel: [drm] PCIE gen 3 link speeds already enabled
Nov 24 08:50:22 titan kernel: [drm] PCIE GART of 1024M enabled (table at 0x0000000000276000).
Nov 24 08:50:22 titan kernel: radeon 0000:01:00.0: WB enabled
Nov 24 08:50:22 titan kernel: radeon 0000:01:00.0: fence driver on ring 0 use gpu addr 0x0000000080000c00 and cpu addr 0xffff88022fef8c00
Nov 24 08:50:22 titan kernel: radeon 0000:01:00.0: fence driver on ring 1 use gpu addr 0x0000000080000c04 and cpu addr 0xffff88022fef8c04
Nov 24 08:50:22 titan kernel: radeon 0000:01:00.0: fence driver on ring 2 use gpu addr 0x0000000080000c08 and cpu addr 0xffff88022fef8c08
Nov 24 08:50:22 titan kernel: radeon 0000:01:00.0: fence driver on ring 3 use gpu addr 0x0000000080000c0c and cpu addr 0xffff88022fef8c0c
Nov 24 08:50:22 titan kernel: radeon 0000:01:00.0: fence driver on ring 4 use gpu addr 0x0000000080000c10 and cpu addr 0xffff88022fef8c10
Nov 24 08:50:22 titan kernel: radeon 0000:01:00.0: fence driver on ring 5 use gpu addr 0x0000000000075a18 and cpu addr 0xffffc900055b5a18
Nov 24 08:50:22 titan kernel: [drm] ring test on 0 succeeded in 4 usecs
Nov 24 08:50:22 titan kernel: [drm] ring test on 1 succeeded in 1 usecs
Nov 24 08:50:22 titan kernel: [drm] ring test on 2 succeeded in 1 usecs
Nov 24 08:50:22 titan kernel: [drm] ring test on 3 succeeded in 6 usecs
Nov 24 08:50:22 titan kernel: [drm] ring test on 4 succeeded in 5 usecs
Nov 24 08:50:23 titan kernel: [drm] ring test on 5 succeeded in 1 usecs
Nov 24 08:50:23 titan kernel: [drm] UVD initialized successfully.
Nov 24 08:50:33 titan kernel: radeon 0000:01:00.0: ring 0 stalled for more than 10000msec
Nov 24 08:50:33 titan kernel: radeon 0000:01:00.0: GPU lockup (waiting for 0x000000000002258d last fence id 0x00000000000224a6 on ring 0)
Nov 24 08:50:33 titan kernel: [drm:r600_ib_test] *ERROR* radeon: fence wait failed (-35).
Nov 24 08:50:33 titan kernel: [drm:radeon_ib_ring_tests] *ERROR* radeon: failed testing IB on GFX ring (-35).
Nov 24 08:50:33 titan kernel: radeon 0000:01:00.0: ib ring test failed (-35).
Nov 24 08:50:33 titan kernel: radeon 0000:01:00.0: GPU softreset: 0x00000048
Nov 24 08:50:33 titan kernel: radeon 0000:01:00.0:   GRBM_STATUS               = 0xA0003028
Nov 24 08:50:33 titan kernel: radeon 0000:01:00.0:   GRBM_STATUS_SE0           = 0x00000006
Nov 24 08:50:33 titan kernel: radeon 0000:01:00.0:   GRBM_STATUS_SE1           = 0x00000006
Nov 24 08:50:33 titan kernel: radeon 0000:01:00.0:   SRBM_STATUS               = 0x200000C0
Nov 24 08:50:33 titan kernel: radeon 0000:01:00.0:   SRBM_STATUS2              = 0x00000000
Nov 24 08:50:33 titan kernel: radeon 0000:01:00.0:   R_008674_CP_STALLED_STAT1 = 0x00000000
Nov 24 08:50:33 titan kernel: radeon 0000:01:00.0:   R_008678_CP_STALLED_STAT2 = 0x00010000
Nov 24 08:50:33 titan kernel: radeon 0000:01:00.0:   R_00867C_CP_BUSY_STAT     = 0x00400002
Nov 24 08:50:33 titan kernel: radeon 0000:01:00.0:   R_008680_CP_STAT          = 0x84010243
Nov 24 08:50:33 titan kernel: radeon 0000:01:00.0:   R_00D034_DMA_STATUS_REG   = 0x44C83D57
Nov 24 08:50:33 titan kernel: radeon 0000:01:00.0:   R_00D834_DMA_STATUS_REG   = 0x44C83D57
Nov 24 08:50:33 titan kernel: radeon 0000:01:00.0:   VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x00000000
Nov 24 08:50:33 titan kernel: radeon 0000:01:00.0:   VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x00000000
Nov 24 08:50:33 titan kernel: radeon 0000:01:00.0: GRBM_SOFT_RESET=0x0000DDFF
Nov 24 08:50:33 titan kernel: radeon 0000:01:00.0: SRBM_SOFT_RESET=0x00000100
Nov 24 08:50:33 titan kernel: radeon 0000:01:00.0:   GRBM_STATUS               = 0x00003028
Nov 24 08:50:33 titan kernel: radeon 0000:01:00.0:   GRBM_STATUS_SE0           = 0x00000006
Nov 24 08:50:33 titan kernel: radeon 0000:01:00.0:   GRBM_STATUS_SE1           = 0x00000006
Nov 24 08:50:33 titan kernel: radeon 0000:01:00.0:   SRBM_STATUS               = 0x200000C0
Nov 24 08:50:33 titan kernel: radeon 0000:01:00.0:   SRBM_STATUS2              = 0x00000000
Nov 24 08:50:33 titan kernel: radeon 0000:01:00.0:   R_008674_CP_STALLED_STAT1 = 0x00000000
Nov 24 08:50:33 titan kernel: radeon 0000:01:00.0:   R_008678_CP_STALLED_STAT2 = 0x00000000
Nov 24 08:50:33 titan kernel: radeon 0000:01:00.0:   R_00867C_CP_BUSY_STAT     = 0x00000000
Nov 24 08:50:33 titan kernel: radeon 0000:01:00.0:   R_008680_CP_STAT          = 0x00000000
Nov 24 08:50:33 titan kernel: radeon 0000:01:00.0:   R_00D034_DMA_STATUS_REG   = 0x44C83D57
Nov 24 08:50:33 titan kernel: radeon 0000:01:00.0:   R_00D834_DMA_STATUS_REG   = 0x44C83D57
Nov 24 08:50:33 titan kernel: radeon 0000:01:00.0: GPU reset succeeded, trying to resume
Nov 24 08:50:34 titan kernel: [drm] probing gen 2 caps for device 8086:c01 = 261ad03/e
Nov 24 08:50:34 titan kernel: [drm] PCIE gen 3 link speeds already enabled
Nov 24 08:50:34 titan kernel: [drm] PCIE GART of 1024M enabled (table at 0x0000000000276000).
Nov 24 08:50:34 titan kernel: radeon 0000:01:00.0: WB enabled
Nov 24 08:50:34 titan kernel: radeon 0000:01:00.0: fence driver on ring 0 use gpu addr 0x0000000080000c00 and cpu addr 0xffff88022fef8c00
Nov 24 08:50:34 titan kernel: radeon 0000:01:00.0: fence driver on ring 1 use gpu addr 0x0000000080000c04 and cpu addr 0xffff88022fef8c04
Nov 24 08:50:34 titan kernel: radeon 0000:01:00.0: fence driver on ring 2 use gpu addr 0x0000000080000c08 and cpu addr 0xffff88022fef8c08
Nov 24 08:50:34 titan kernel: radeon 0000:01:00.0: fence driver on ring 3 use gpu addr 0x0000000080000c0c and cpu addr 0xffff88022fef8c0c
Nov 24 08:50:34 titan kernel: radeon 0000:01:00.0: fence driver on ring 4 use gpu addr 0x0000000080000c10 and cpu addr 0xffff88022fef8c10
Nov 24 08:50:34 titan kernel: radeon 0000:01:00.0: fence driver on ring 5 use gpu addr 0x0000000000075a18 and cpu addr 0xffffc900055b5a18
Nov 24 08:50:34 titan kernel: [drm] ring test on 0 succeeded in 4 usecs
Nov 24 08:50:34 titan kernel: [drm] ring test on 1 succeeded in 1 usecs
Nov 24 08:50:34 titan kernel: [drm] ring test on 2 succeeded in 1 usecs
Nov 24 08:50:34 titan kernel: [drm] ring test on 3 succeeded in 5 usecs
Nov 24 08:50:34 titan kernel: [drm] ring test on 4 succeeded in 5 usecs
Nov 24 08:50:34 titan kernel: [drm] ring test on 5 succeeded in 1 usecs
Nov 24 08:50:34 titan kernel: [drm] UVD initialized successfully.
Nov 24 08:50:34 titan kernel: [drm] ib test on ring 0 succeeded in 0 usecs
Nov 24 08:50:34 titan kernel: [drm] ib test on ring 1 succeeded in 0 usecs
Nov 24 08:50:34 titan kernel: [drm] ib test on ring 2 succeeded in 0 usecs
Nov 24 08:50:34 titan kernel: [drm] ib test on ring 3 succeeded in 0 usecs
Nov 24 08:50:34 titan kernel: [drm] ib test on ring 4 succeeded in 1 usecs
Nov 24 08:50:44 titan kernel: radeon 0000:01:00.0: ring 5 stalled for more than 10000msec
Nov 24 08:50:44 titan kernel: radeon 0000:01:00.0: GPU lockup (waiting for 0x0000000000000004 last fence id 0x0000000000000002 on ring 5)
Nov 24 08:50:44 titan kernel: [drm:uvd_v1_0_ib_test] *ERROR* radeon: fence wait failed (-35).
Nov 24 08:50:44 titan kernel: [drm:radeon_ib_ring_tests] *ERROR* radeon: failed testing IB on ring 5 (-35).
Nov 24 08:50:44 titan kernel: switching from power state:
Nov 24 08:50:44 titan kernel: 	ui class: none
Nov 24 08:50:44 titan kernel: 	internal class: boot 
Nov 24 08:50:44 titan kernel: 	caps: 
Nov 24 08:50:44 titan kernel: 	uvd    vclk: 0 dclk: 0
Nov 24 08:50:44 titan kernel: 		power level 0    sclk: 15000 mclk: 15000 vddc: 900 vddci: 950 pcie gen: 3
Nov 24 08:50:44 titan kernel: 	status: c b 
Nov 24 08:50:44 titan kernel: switching to power state:
Nov 24 08:50:44 titan kernel: 	ui class: performance
Nov 24 08:50:44 titan kernel: 	internal class: none
Nov 24 08:50:44 titan kernel: 	caps: 
Nov 24 08:50:44 titan kernel: 	uvd    vclk: 0 dclk: 0
Nov 24 08:50:44 titan kernel: 		power level 0    sclk: 30000 mclk: 15000 vddc: 875 vddci: 850 pcie gen: 3
Nov 24 08:50:44 titan kernel: 		power level 1    sclk: 45000 mclk: 140000 vddc: 950 vddci: 1000 pcie gen: 3
Nov 24 08:50:44 titan kernel: 		power level 2    sclk: 90000 mclk: 140000 vddc: 1150 vddci: 1000 pcie gen: 3
Nov 24 08:50:44 titan kernel: 		power level 3    sclk: 95500 mclk: 140000 vddc: 1188 vddci: 1000 pcie gen: 3
Nov 24 08:50:44 titan kernel: 	status: r 
Nov 24 08:50:44 titan kernel: radeon 0000:01:00.0: GPU fault detected: 146 0x0fc24404
Nov 24 08:50:44 titan kernel: radeon 0000:01:00.0:   VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x000152FE
Nov 24 08:50:44 titan kernel: radeon 0000:01:00.0:   VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x02044004
Nov 24 08:50:44 titan kernel: VM fault (0x04, vmid 1) at page 86782, read from TC (68)
Nov 24 08:50:46 titan kernel: radeon 0000:01:00.0: ring 5 stalled for more than 11940msec
Nov 24 08:50:46 titan kernel: radeon 0000:01:00.0: GPU lockup (waiting for 0x0000000000000003 last fence id 0x0000000000000002 on ring 5)
Nov 24 08:50:46 titan kernel: radeon 0000:01:00.0: failed to get a new IB (-35)
...
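
For triage, the lockup lines in the journal excerpt above can be summarised per ring. The following is a minimal Python sketch and not part of the original report; `summarize_lockups` and `SAMPLE` are hypothetical names, and the regular expression assumes the message format exactly as quoted above.

```python
import re
from collections import Counter

# Matches the radeon "GPU lockup" journal lines quoted in this report.
# Assumption: the message format is exactly as shown in the excerpt.
LOCKUP_RE = re.compile(
    r"GPU lockup \(waiting for (0x[0-9a-fA-F]+) "
    r"last fence id (0x[0-9a-fA-F]+) on ring (\d+)\)"
)


def summarize_lockups(lines):
    """Return (lockup count per ring, list of (ring, fence id gap))."""
    counts = Counter()
    details = []
    for line in lines:
        match = LOCKUP_RE.search(line)
        if match is None:
            continue
        waiting = int(match.group(1), 16)  # fence id being waited for
        last = int(match.group(2), 16)     # last fence id reported
        ring = int(match.group(3))
        counts[ring] += 1
        details.append((ring, waiting - last))
    return counts, details


# The four lockup lines copied from the journal excerpt above.
SAMPLE = [
    "Nov 24 08:50:21 titan kernel: radeon 0000:01:00.0: GPU lockup (waiting for 0x00000000000224a9 last fence id 0x00000000000224a6 on ring 0)",
    "Nov 24 08:50:33 titan kernel: radeon 0000:01:00.0: GPU lockup (waiting for 0x000000000002258d last fence id 0x00000000000224a6 on ring 0)",
    "Nov 24 08:50:44 titan kernel: radeon 0000:01:00.0: GPU lockup (waiting for 0x0000000000000004 last fence id 0x0000000000000002 on ring 5)",
    "Nov 24 08:50:46 titan kernel: radeon 0000:01:00.0: GPU lockup (waiting for 0x0000000000000003 last fence id 0x0000000000000002 on ring 5)",
]

counts, details = summarize_lockups(SAMPLE)
for ring, count in sorted(counts.items()):
    print(f"ring {ring}: {count} lockup(s)")
```

For the four lockup lines quoted above, this reports two lockups on ring 0 and two on ring 5.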

The display switches off and powers back on at regular intervals, only to show a black screen and power off again.

I'm not able to recover from it normally, though sometimes I can use the sysrq kill signal to get a working VT and then reboot safely.

Downgrading back to mesa-10.1.5-1.20140607.fc20.x86_64 seems to fix the problem.

I'm using:
01:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Curacao PRO [Radeon R9 270] [1002:6811] (prog-if 00 [VGA controller])
	Subsystem: Micro-Star International Co., Ltd. [MSI] Device [1462:3050]
	Flags: bus master, fast devsel, latency 0, IRQ 28
	Memory at e0000000 (64-bit, prefetchable) [size=256M]
	Memory at f0000000 (64-bit, non-prefetchable) [size=256K]
	I/O ports at e000 [size=256]
	Expansion ROM at f0040000 [disabled] [size=128K]
	Capabilities: [48] Vendor Specific Information: Len=08 <?>
	Capabilities: [50] Power Management version 3
	Capabilities: [58] Express Legacy Endpoint, MSI 00
	Capabilities: [a0] MSI: Enable+ Count=1/1 Maskable- 64bit+
	Capabilities: [100] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
	Capabilities: [150] Advanced Error Reporting
	Capabilities: [270] #19
	Capabilities: [2b0] Address Translation Service (ATS)
	Capabilities: [2c0] #13
	Capabilities: [2d0] #1b
	Kernel driver in use: radeon
	Kernel modules: radeon


Version-Release number of selected component (if applicable):
mesa-10.3.3-1.20141110.fc20
kernel-3.17.3-200.fc20.x86_64

How reproducible:
extremely often - many times per day

Steps to Reproduce:
1. open firefox and play some videos on youtube / other video sites
2. see computer hang and display power off

Comment 1 Kamil Páral 2014-11-28 12:42:23 UTC
Is there any hope of fixing this in the foreseeable future? Because at the moment, I must avoid pulling mesa updates from F20 Updates repo.

Comment 2 Igor Gnatenko 2014-11-28 21:30:18 UTC
I'll check tomorrow. Sorry for the late response.

Comment 3 Igor Gnatenko 2014-12-02 09:16:59 UTC
10.3.4 should fix this issue. I'll prepare a new version.

http://cgit.freedesktop.org/mesa/mesa/commit/?h=10.3&id=f02f0559c69daae6ca73e72d32dc329fcb2fd316

Comment 4 Igor Gnatenko 2014-12-02 10:25:45 UTC
http://koji.fedoraproject.org/koji/buildinfo?buildID=596516

please test this build.

Comment 5 Kamil Páral 2014-12-02 16:07:49 UTC
Hi, Igor, 10.3.4 is looking good! :-) In an hour, I haven't seen a single system freeze. I'll continue testing a bit more and report back. Thanks for the update.

Comment 6 Fedora Update System 2014-12-02 16:10:44 UTC
mesa-10.3.4-1.20141202.fc20 has been submitted as an update for Fedora 20.
https://admin.fedoraproject.org/updates/mesa-10.3.4-1.20141202.fc20

Comment 7 Fedora Update System 2014-12-02 16:11:36 UTC
mesa-10.3.4-1.20141202.fc21 has been submitted as an update for Fedora 21.
https://admin.fedoraproject.org/updates/mesa-10.3.4-1.20141202.fc21

Comment 8 Igor Gnatenko 2014-12-02 16:12:22 UTC
(In reply to Kamil Páral from comment #5)
> Hi, Igor, 10.3.4 is looking good! :-) In an hour, I haven't seen a single
> system freeze. I'll continue testing a bit more and report back. Thanks for
> the update.

Cool! It would be good if we could ship the new mesa with the f21 release.

Comment 9 Kamil Páral 2014-12-02 16:38:09 UTC
(In reply to Igor Gnatenko from comment #8)
> Cool! It would be good if we could ship the new mesa with the f21 release.

Unfortunately that's not really likely. We're frozen now and hopefully today's RC2 is the last release candidate for f21. The update would end up in 0-day updates, though, if it earns enough karma.

Comment 10 Kamil Páral 2014-12-02 16:42:02 UTC
One more thing, could you please set a higher limit on karma auto-push for those bodhi updates? We don't want to repeat the last experience, when the update was accepted in a day, and therefore not tested properly. Ideally, I would completely turn off karma auto-push and manually inspect the updates in a week and push them if the feedback is good, but I understand that requires more time from you as package maintainers. So just speaking with my QA hat on. But at least increased auto-push limits would be nice, since this is a core system component. Thanks.

Comment 11 Fedora Update System 2014-12-03 06:07:02 UTC
Package mesa-10.3.4-1.20141202.fc21:
* should fix your issue,
* was pushed to the Fedora 21 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing mesa-10.3.4-1.20141202.fc21'
as soon as you are able to, then reboot.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2014-16183/mesa-10.3.4-1.20141202.fc21
then log in and leave karma (feedback).

Comment 12 Kamil Páral 2014-12-05 07:45:58 UTC
After a few more days, I can confirm I see no freezing while playing videos. Thanks.
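
To check whether a given build string is at or past the version that carries the fix (10.3.4, per comment 3), a rough Python sketch follows. It is not part of the original thread. `parse_nvr`, `version_tuple` and `includes_fix` are hypothetical helper names, and the comparison is a simplified numeric one that assumes dotted integer versions. It is not rpm's own version comparison algorithm.

```python
def parse_nvr(nvr):
    """Split a 'name-version-release' string on its last two hyphens."""
    name, version, release = nvr.rsplit("-", 2)
    return name, version, release


def version_tuple(version):
    """Turn '10.3.4' into (10, 3, 4). Assumes purely numeric components."""
    return tuple(int(part) for part in version.split("."))


def includes_fix(nvr, fixed_version="10.3.4"):
    """True if the build's upstream version is >= the version with the fix.

    This only compares version ordering. It does not say whether an older
    build is affected: 10.1.5 predates the fix but was reported as working.
    """
    _, version, _ = parse_nvr(nvr)
    return version_tuple(version) >= version_tuple(fixed_version)


# Build strings taken from this bug report.
for build in (
    "mesa-10.1.5-1.20140607.fc20",
    "mesa-10.3.3-1.20141110.fc20",
    "mesa-10.3.4-1.20141202.fc20",
    "mesa-10.3.5-1.20141207.fc21",
):
    print(build, includes_fix(build))
```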

Comment 13 Fedora Update System 2014-12-13 09:50:22 UTC
mesa-10.3.5-1.20141207.fc21 has been pushed to the Fedora 21 stable repository.  If problems still persist, please make note of it in this bug report.

