Bug 528312 (udev-intel-flood)
Summary: | udev takes almost 100% CPU due to xorg (intel) continuously re-initializing displays | |||
---|---|---|---|---|
Product: | [Fedora] Fedora | Reporter: | Matěj Cepl <mcepl> | |
Component: | xorg-x11-server | Assignee: | Adam Jackson <ajax> | |
Status: | CLOSED CURRENTRELEASE | QA Contact: | Fedora Extras Quality Assurance <extras-qa> | |
Severity: | high | Docs Contact: | ||
Priority: | medium | |||
Version: | 14 | CC: | alessandro.suardi, alexvillacislasso, andy, apbartok, awilliam, bobg+redhat, bookreviewer, bschneiders, bugzilla.redhat, cbm, cindwhite1, erecio, eric.brunet, erik-fedora, fzuuzf, gbarros, jaroslav.pulchart, jirka, jrb, jruemker, karlcz, karl+rhbugzilla, knnthsrnsn, lfarkas, luto, martin, mcepl, mdl-mailing, me, mefoster, mhlavink, mishu, nkudriavtsev, opensource, paul, pbaumgar, pingou, pnewell0705, ramindeh, rcrodgers622, redhat-bugzilla, redhat, rhbugzilla, smarlow, theo148, tvujec, valent.turkovic, whanlon, xgl-maint, zingale | |
Target Milestone: | --- | Keywords: | Patch, Reopened, Triaged | |
Target Release: | --- | |||
Hardware: | All | |||
OS: | Linux | |||
Whiteboard: | card_GM45 | |||
Fixed In Version: | Doc Type: | Bug Fix | ||
Doc Text: | Story Points: | --- | ||
Clone Of: | ||||
: | 591709 (view as bug list) | Environment: | ||
Last Closed: | 2011-11-30 15:37:30 UTC | Type: | --- | |
Regression: | --- | Mount Type: | --- | |
Documentation: | --- | CRM: | ||
Verified Versions: | Category: | --- | ||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
Cloudforms Team: | --- | Target Upstream Version: | ||
Embargoed: | ||||
Bug Depends On: | ||||
Bug Blocks: | 432388, 591709 | |||
Attachments: |
Created attachment 364367 [details]
/var/log/audit/audit.log
Created attachment 364368 [details]
stdout from dmesg
Can you strace udevd? What is the output of "ps axc" during the 100% CPU usage?
So, of course, the moment I want to debug it, I cannot reproduce it. Setting NEEDINFO and let's see how it goes. When I have more information, I will let you know; otherwise feel free to close.
Created attachment 364640 [details]
strace.txt
Created attachment 364641 [details]
strace-02.txt
Created attachment 364642 [details]
strace-3.txt
Created attachment 364643 [details]
ps-axc-log.txt
Created attachment 364644 [details]
ps-axc-02.txt
(In reply to comment #3)
> can you strace udevd?
>
> what is the output of "ps axc" during the 100% CPU?
It is hard to start all the data collection when udev strikes, but I think I managed to get at least some data. What do you think?
OK :) The important part of the strace is missing... "strace -s 2048" raises the maximum string length that strace prints. It seems something emits a lot of "change" events (which means a lot of open("w")/close() calls on a device), or fnotify does not work. Try running as root:
# udevadm monitor --env
so we can see exactly what is happening.
Created attachment 364819 [details]
udevadm.txt
Created attachment 364820 [details]
udevadm-2.txt
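A saved `udevadm monitor --env` log can be long, so a small script helps to see which device is generating the flood. The following is a minimal sketch, not part of udev: it assumes the event header lines look like the `KERNEL[timestamp] action devpath (subsystem)` and `UDEV [timestamp] action devpath (subsystem)` lines quoted elsewhere in this report, and the function name `count_events` is hypothetical.

```python
import re
from collections import Counter

# Header lines of "udevadm monitor" output, as quoted in this report, e.g.
#   KERNEL[1271104588.993266] change /devices/... (scsi)
#   UDEV  [1271104589.008249] change /devices/... (block)
EVENT_RE = re.compile(
    r"^(KERNEL|UDEV)\s*\[(\d+\.\d+)\]\s+(\S+)\s+(\S+)\s+\((\S+)\)$"
)


def count_events(lines):
    """Count (source, action, devpath) triples; property lines are ignored."""
    counts = Counter()
    for line in lines:
        match = EVENT_RE.match(line.strip())
        if match:
            source, _timestamp, action, devpath, _subsystem = match.groups()
            counts[(source, action, devpath)] += 1
    return counts


# Tiny demonstration on made-up input in the assumed format.
sample = [
    "KERNEL[1.000000] change /devices/example/card0 (drm)",
    "HOTPLUG=1",
    "UDEV  [1.010000] change /devices/example/card0 (drm)",
    "KERNEL[1.020000] change /devices/example/card0 (drm)",
]
for key, number in count_events(sample).most_common():
    print(number, *key)
```

To use it on a real capture, pass the lines of the saved file (for example `open("udevadm.txt")`) to `count_events`; a single device path with a count far above the rest is the likely source of the storm.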
Hmm, the plot thickens ... I am afraid that, after all, Xorg IS the root of all evil :(
Yes, it seems to open()/close() very fast in a loop. Please reassign.
Created attachment 364885 [details]
/var/log/Xorg.0.log
*** Bug 528894 has been marked as a duplicate of this bug. *** I'm not sure what has changed, but I've rebooted several times today (I usually would see these symptoms after a cold boot) and X hasn't gone off the deep end once. Addendum: looking at the list of packages that were updated on this computer yesterday -- yesterday because I didn't get symptoms this morning even -- it looks like it *might* have been the update to xorg-x11-drv-evdev-2.3.0-1.fc12.x86_64 that fixed things ... discussed at today's blocker bug meeting: this is downgraded to target as the impact is not serious enough to be a blocker (just means the system is very slow for a couple of minutes after resuming). -- Fedora Bugzappers volunteer triage team https://fedoraproject.org/wiki/BugZappers Well, looks like I spoke too soon on Friday -- when I booted this morning, the computer was pretty much unusable for about 15 minutes after I logged in (X using 80+% CPU, load average over 5). So for me at least, it's more like very slow for 15 minutes after booting, not just for "a couple of minutes after resuming" -- certainly serious for me. :( Addendum: and it started doing it again about 10 minutes after it stopped! Argh, must be Monday ... Reporters, what outputs does your machine claim to have? 'xrandr' output is sufficient. Here's xrandr on my machine (desktop, Intel graphics): Screen 0: minimum 320 x 200, current 1280 x 1024, maximum 8192 x 8192 VGA1 connected 1280x1024+0+0 (normal left inverted right x axis y axis) 338mm x 270mm 1280x1024 60.0*+ 1024x768 60.0 800x600 60.3 640x480 60.0 DVI1 disconnected (normal left inverted right x axis y axis) DP1 disconnected (normal left inverted right x axis y axis) One more datapoint from me: I've seen this bug in Rawhide for some time now (a month or two?). Lenovo Thinkpad T400, Intel Corporation Mobile 4 Series, i915. 
When I do a clean boot, everything works just fine, but when I suspend to ram, and then resume, I have normal behaviour for maybe 5-10 seconds, then everything locks up, 'top' shows X is eating cpu. After 10-30 seconds I can use the laptop for maybe 2-3 seconds, and things lock up again. This continues for about 1-2 minutes. Then everything works as normal. For me this is 100% reproducable. I use KMS (ie. I don't have nomodeset in grub). $ xrandr Screen 0: minimum 320 x 200, current 1440 x 900, maximum 8192 x 8192 LVDS1 connected 1440x900+0+0 (normal left inverted right x axis y axis) 304mm x 190mm 1440x900 60.2*+ 50.0 1024x768 60.0 800x600 60.3 56.2 640x480 59.9 VGA1 disconnected (normal left inverted right x axis y axis) DVI1 disconnected (normal left inverted right x axis y axis) DP1 disconnected (normal left inverted right x axis y axis) DVI2 disconnected (normal left inverted right x axis y axis) DP2 disconnected (normal left inverted right x axis y axis) DP3 disconnected (normal left inverted right x axis y axis) My current versions: $ rpm -qa \*udev\* \*intel\* libgudev1-145-11.fc12.x86_64 xorg-x11-drv-intel-2.9.1-1.fc12.x86_64 system-config-printer-udev-1.1.13-6.fc12.x86_64 libudev-145-11.fc12.i686 libudev-145-11.fc12.x86_64 libgudev1-145-11.fc12.i686 udev-145-11.fc12.x86_64 intel-gpu-tools-2.9.1-1.fc12.x86_64 xorg-x11-drv-intel-devel-2.9.1-1.fc12.x86_64 Created attachment 366986 [details]
'udevadm monitor --env' from right before suspend, then resume, and until things work as normal
bradford:~$ xrandr Screen 0: minimum 320 x 200, current 1440 x 900, maximum 8192 x 8192 LVDS1 connected 1440x900+0+0 (normal left inverted right x axis y axis) 303mm x 190mm 1440x900 60.0*+ 50.0 1024x768 60.0 800x600 60.3 56.2 640x480 59.9 VGA1 disconnected (normal left inverted right x axis y axis) DVI1 disconnected (normal left inverted right x axis y axis) DP1 disconnected (normal left inverted right x axis y axis) DVI2 disconnected (normal left inverted right x axis y axis) DP2 disconnected (normal left inverted right x axis y axis) DP3 disconnected (normal left inverted right x axis y axis) bradford:~$ and yes behavior is identical to what MartinG described in comment 24 I see that this bug is set as a "fedora-x-target Fedora Universal X target" blocker. Shouldn't it also be a F12 blocker? Or is to too hardware specific? Just asking, no biggie. Please let me know if I can provide any other logs etc... see comment #20, we did discuss it at a meeting and decided the impact was not severe enough to qualify as a release blocker. -- Fedora Bugzappers volunteer triage team https://fedoraproject.org/wiki/BugZappers Good news! I just upgraded the kernel from kernel-2.6.31.5-96.fc12.x86_64 to kernel-2.6.31.5-115.fc12.x86_64, and have not hit the bug yet (crossing fingers). I've just done five suspend(ram)/resume cycles, successfully. I also upgraded a bunch of other packages, xorg-x11-server-common-1.7.0-5.fc12.x86_64, xorg-x11-server-Xorg-1.7.0-5.fc12.x86_64, glibc-2.11-1.x86_64, DeviceKit-disks-009-3.fc12.x86_64 to mention some. And may I add that resuming is incredibly fast, yay! (I've got a solid state disk, btw) (In reply to comment #28) > I see that this bug is set as a "fedora-x-target Fedora Universal X target" > blocker. Shouldn't it also be a F12 blocker? Or is to too hardware specific? fedora-x-target blocks F12Target fedora-x-blocker blocks F12Blocker I spoke too soon. I had this lockup issue again. 
What I did this time, was to put the laptop in suspend-to-ram while the power cord was plugged, let it be suspended for several hours, then unplug the powercord while in suspend, and then open the lid to wake it up. After about five seconds of "normal" behaviour, the mouse got jerky, and things locked up for several seconds. Thanks for the clarification in comment #31, btw. Since this bugzilla report was filed, there have been several major updates in various components of the Xorg system, which may have resolved this issue. Users who have experienced this problem are encouraged to upgrade their system to the latest version of their packages (at least F12Beta, but even better if the very latest versions). Please, if you experience this problem on the up-to-date system, let us now in the comment for this bug, or whether the upgraded system works for you. If you won't be able to reply in one month, I will have to close this bug as INSUFFICIENT_DATA. Thank you. [This is a bulk message for all open Fedora Rawhide Xorg-related bugs. I'm adding myself to the CC list for each bug, so I'll see any comments you make after this and do my best to make sure every issue gets proper attention.] My comment #32 is with a fully updated system. No new updates available in Rawhide per Thu Nov 5 19:52:22 UTC 2009. So this bug is still valid. 
Current packages: kernel-2.6.31.5-115.fc12.x86_64 # rpm -qa \*intel\* \*Xorg\* \*drm\* \*glibc\* \*udev\* libgudev1-145-11.fc12.x86_64 libdrm-devel-2.4.15-1.fc12.x86_64 xorg-x11-drv-intel-2.9.1-1.fc12.x86_64 glibc-2.11-1.i686 system-config-printer-udev-1.1.13-6.fc12.x86_64 glibc-2.11-1.x86_64 libdrm-2.4.15-1.fc12.i686 glibc-headers-2.11-1.x86_64 libdrm-2.4.15-1.fc12.x86_64 libudev-145-11.fc12.i686 libudev-145-11.fc12.x86_64 glibc-debuginfo-2.10.90-25.x86_64 libgudev1-145-11.fc12.i686 udev-145-11.fc12.x86_64 intel-gpu-tools-2.9.1-1.fc12.x86_64 xorg-x11-drv-intel-devel-2.9.1-1.fc12.x86_64 glibc-devel-2.11-1.x86_64 glibc-common-2.11-1.x86_64 xorg-x11-server-Xorg-1.7.0-5.fc12.x86_64 Lenovo Thinkpad T400 I'd be happy to test suggested packages from koji if any. martin: #33 was an automated comment which makes not much sense in this context, sorry. -- Fedora Bugzappers volunteer triage team https://fedoraproject.org/wiki/BugZappers Bug still around using kernel-2.6.31.5-122.fc12.x86_64, xorg-x11-server-Xorg-1.7.1-7.fc12.x86_64. This bug appears to have been reported against 'rawhide' during the Fedora 12 development cycle. Changing version to '12'. More information and reason for this action is here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping *** Bug 538006 has been marked as a duplicate of this bug. *** We filed this bug in the upstream database (https://bugs.freedesktop.org/show_bug.cgi?id=25259) and believe that it is more appropriate to let it be resolved upstream. We will continue to track the issue in the centralized upstream bug tracker, and will review any bug fixes that become available for consideration in future updates. Thank you for the bug report. I experience this bug on a i386 desktop system running Fedora 12 fully updated and it is a show stopper for me. Anyone else experiencing it would agree. The system becomes practically unusable. It should not be closed. see bugs #538196 and #541184. 
I also experience this problem - on an HP EliteBook 6930p, running Fedora 12 i686 fully updated, without using suspend. The problem is very erratic, and I often have long periods of usability, but once it kicks in the system is pretty much unusable. I also consider it a showstopper. it's closed upstream because it's being worked on upstream. It doesn't mean it won't be fixed. -- Fedora Bugzappers volunteer triage team https://fedoraproject.org/wiki/BugZappers *** Bug 538196 has been marked as a duplicate of this bug. *** *** Bug 541184 has been marked as a duplicate of this bug. *** Created attachment 378444 [details]
upstream patch reformatted to fit Fedora kernel tree
I have built 2.6.30.10-104 from the F-11 CVS tree with the patch from #45 included, and it does not fix the problem on my machine. However, also excluding HDMIC_HOTPLUG_INT_STATUS seems to do the trick for me, the udev storms are gone. (In reply to comment #46) > I have built 2.6.30.10-104 from the F-11 CVS tree with the patch from #45 > included, and it does not fix the problem on my machine. Yeah, comments in the upstream bug indicate that it helps only for some models; it actually seems to help me, but not you. Thats vwhat happened with me too. And my old f11 udev scripts no longer works. But xorg problem is more. Important *** Bug 509762 has been marked as a duplicate of this bug. *** Please note after applying the patch listed for my HDMI/udevd issue, I get the following every second in my syslog: Dec 31 15:19:40 pcsca65 kernel: DRHD: handling fault status reg 3 Dec 31 15:19:40 pcsca65 kernel: DMAR:[DMA Write] Request device [00:02.0] fault addr b08003000 Dec 31 15:19:40 pcsca65 kernel: DMAR:[fault reason 05] PTE Write access is not set Dec 31 15:19:40 pcsca65 kernel: DRHD: handling fault status reg 3 Dec 31 15:19:40 pcsca65 kernel: DMAR:[DMA Write] Request device [00:02.0] fault addr b08003000 Dec 31 15:19:40 pcsca65 kernel: DMAR:[fault reason 05] PTE Write access is not set I tried commenting out two bits and just three bits same error message. Anything I can test on my Lenovo T400? I still have this bug with kernel 2.6.32.2-18.fc13.x86_64 (rawhide) on Intel Mobile 4, i915; on every resume from suspend (to ram), the system functions normal for some seconds, and then locks up for up to a minute or so. Is this the same as https://bugzilla.redhat.com/show_bug.cgi?id=523646 ? I don't think so, see my bug 523646 comment 55 even if it's upstream why do you close this bug? udev fills my Xorg.0.log and it becomes a few gigabytes since udev re-discover my samsung lcd. and since udev use 100% cpu i can't use my system. it's still valid on a fully updated f12! 
(In reply to comment #55) > even if it's upstream why do you close this bug? > udev fills my Xorg.0.log and it becomes a few gigabytes since udev re-discover > my samsung lcd. and since udev use 100% cpu i can't use my system. it's still > valid on a fully updated f12! I agree as well. Perhaps I'm ignorant of how things are done, but closing the bug kind of sweeps it under the rug doesn't it? You have a product that is practically crippled when used on certain popular hardware. Should it not stay open in some fashion so that if you do some sort of audit of bugs that need to be addressed, it'll show up? Even if it's moved upstream it still affects your currently released product and it should be recognized as being an outstanding issue. I agree too. I have been running a kernel.org (hand patched) kernel b/c it's the only way I can use my computer. *** Bug 523646 has been marked as a duplicate of this bug. *** If this is the bug that's going to be kept, and the others closed, the title should be changed. This problem has nothing to do with resume from suspend - except as one possible trigger - as I was getting this without any suspension involved. I eventually got the system to work by reverting to a Fedora 11 kernel, but this is hardly a robust solution. This appears to affect all Intel X4500 drivers, so it's a pretty big problem. -- Fedora Bugzappers volunteer triage team https://fedoraproject.org/wiki/BugZappers -- Fedora Bugzappers volunteer triage team https://fedoraproject.org/wiki/BugZappers adjusted, hope that's correct. (In reply to comment #55) > since udev use 100% cpu i can't use my system. it's still > valid on a fully updated f12! Same problem for me, I still have to "killall udevd" each time I reboot the computer The duplicate bug https://bugzilla.redhat.com/show_bug.cgi?id=523646 had priority high and had F13Blocker status. Could we please have those designations added to this one? 
you can nominate it as a blocker yourself, it requires no special privileges. Priority is to be set by the package maintainer only - https://fedoraproject.org/wiki/BugZappers/BugStatusWorkFlow#Priority_and_Severity The recently released F12 kernel 2.6.32.9 appears to have fixed the bug for my ASUS ul80vt notebook (hybrid Intel GM45 / nVidia discrete graphics). I can cold boot, warm boot, and resume from hibernate or suspend with only a few EDID probes triggered by each. Thank you for the kernel version bump. Does anyone still have trouble after updating to this kernel? I have just rebooted on 2.6.32.9-67.fc12.x86_64 and after a while, Xorg starts again to use one processor slowing down to death the computer. The usual killall udevd works still (and is still needed). In my case (might be related I have no idea), I have: * a dual screen (vga and hdmi) * a wireless keyboard/mouse on usb * graphic card: 4 Series Chipset Integrated Graphics Controller [8086:2E12] (driver i915) Problem still happens for me using 2.6.32.9-70.fc12.i686, on HP EliteBook 6930p (Intel GM45). System is completely unusable for 5 - 10 minutes at a time. And that's just cold booting, no suspend/resume. I can get rid of the problem (as before) by patching my kernel as per https://bugs.freedesktop.org/show_bug.cgi?id=25259 - by completely disabling the HDMI bits in the hotplug mask (I'm only using VGA output, not HDMI). But since the last comment in that freedesktop bug suggests it might be fixed in 2.6.33rc7, perhaps I shouldn't be surprised it's not fixed in 2.6.32.9. for me it also still exist with kernel-PAE-2.6.32.9-70.fc12.i686 Same for my F12 installation with kernel 2.6.32.9-70. Suspend and resume is trigger of this issue for me. I'm curious about something: in the upstream bug, comment #2 (https://bugs.freedesktop.org/show_bug.cgi?id=25259#c2), ajax comments that this is because of a patch (uevent.patch?) that Fedora ships that does input plugging events. 
So why not just disable uevent.patch in the SRPM? That's what I did and haven't seen this problem since. Plugging in a VGA monitor, having X notice and extending desktop is cool: but not as cool as having working suspend/resume! Wouldn't this be a stopgap solution for F12? Then try to get it working in F13... 100% agree! That does sound pretty reasonable to me. -- Fedora Bugzappers volunteer triage team https://fedoraproject.org/wiki/BugZappers (In reply to comment #70) > So why not just disable uevent.patch in the SRPM? That's what I did and > haven't seen this problem since. I tried to disable the patch just to find out that it's not there. What exactly have you done to get it working? The well-known patch for 2.6.31 (disabling all HDMI bits) no longer works with 2.6.32. Is there a working solution for the latest F12 kernel? I was able to disable the uevent patch in xorg-x11-drv-intel as per comment #70. The user indicated their reported slowdown problem did not occur since the build without the uevent patch was installed. Created attachment 400407 [details]
Disable HDMI hotplug for 2.6.32.9
(In reply to comment #73) > The well-known patch for 2.6.31 (disabling all HDMI bits) no longer works with > 2.6.32. Is there a working solution for the latest F12 kernel? Works for me. See attachment (id=400407). It's the only way my laptop is usable! Could we expect this patch in kernel build 2.6.32.10.*? I suppose it might be nice to throw in a kernel parameter to enable the hotplug code, for people for whom it works and who actually find it useful? -- Fedora Bugzappers volunteer triage team https://fedoraproject.org/wiki/BugZappers (In reply to comment #75) > Created an attachment (id=400407) [details] > Disable HDMI hotplug for 2.6.32.9 alas this didn't work for me, i still have to kill off udevd to be able to do anything on my systems. I've been running kernel-2.6.32.9-70.fc12.i686 for about a week now and it has resolved my problems. I'm running on a desktop using VGA only (no suspend/resume issues here). This problem still occurs for me on an Intel desktop motherboard. It is a little more difficult to trigger than just booting, but is still consistently triggered on my mythtv machine at home. X will start and auto-login to my mythtv GUI user without trouble. But when I activate mythfrontend, it will trigger this spinning, so it seems somehow the application is inducing the server to start probing the outputs again. Interestingly, on my system apcupsd is also spinning at the same time, and if I kill apcupsd, xorg will stop spinning and behave normally. But apcupsd doesn't act abnormally except with xorg is also spinning. Given that apcupsd is just talking to my UPS via USB, perhaps this clue will help people track down the interrupt handling mess? I will attach a dump of /proc/interrupts from the machine. Created attachment 401161 [details]
/proc/interrupts dump from a machine where xorg intel and apcupsd seem to fight
This is the /proc/interrupts dump I mentioned in my previous comment
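A single /proc/interrupts dump only shows totals since boot, so comparing two snapshots taken a few seconds apart is usually more telling. The following is a rough sketch under the assumption that the file has the usual layout (a header line naming the CPUs, then one `label: count count ... description` row per interrupt); the helper names `parse_interrupts` and `busiest_deltas` are made up for illustration.

```python
def parse_interrupts(text):
    """Return {label: count summed over all CPUs} from /proc/interrupts text."""
    lines = text.strip().splitlines()
    ncpu = len(lines[0].split())  # header row: CPU0 CPU1 ...
    totals = {}
    for line in lines[1:]:
        if ":" not in line:
            continue
        label, _, rest = line.partition(":")
        count = 0
        # Rows such as "ERR: 1" carry fewer columns than there are CPUs.
        for field in rest.split()[:ncpu]:
            if not field.isdigit():
                break
            count += int(field)
        totals[label.strip()] = count
    return totals


def busiest_deltas(before, after, top=5):
    """Return the interrupts whose counts grew most between two snapshots."""
    deltas = {key: after[key] - before.get(key, 0) for key in after}
    return sorted(deltas.items(), key=lambda item: item[1], reverse=True)[:top]
```

Reading /proc/interrupts twice during the slowdown and passing both parsed results to `busiest_deltas` would show whether the graphics or USB interrupt lines are the ones firing.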
System is still unusable with 2.6.32.10-90.fc12 The patch in http://lkml.org/lkml/2010/3/27/88 seams to help here. I would like to know, if it also helps you. Created attachment 403730 [details] patch from the lkml discussion (In reply to comment #84) > The patch in > http://lkml.org/lkml/2010/3/27/88 > seams to help here. > I would like to know, if it also helps you. Taken from http://thread.gmane.org/gmane.linux.kernel/967076 (or http://article.gmane.org/gmane.linux.kernel/967076/raw if you prefer). (In reply to comment #84) > The patch in > http://lkml.org/lkml/2010/3/27/88 > seams to help here. > I would like to know, if it also helps you. Works like a charm for me. Testing scratch build is now brewing at http://koji.fedoraproject.org/koji/taskinfo?taskID=2088892 anybody can download and try this. (In reply to comment #88) > Testing scratch build is now brewing at > http://koji.fedoraproject.org/koji/taskinfo?taskID=2088892 anybody can download > and try this. this also works for me too. thanks same here, fixed. Almost gone, but not quite: I installed kernel-2.6.32.10-94.bz528312.fc12.x86_64, rebooted, logged in and then put the laptop to sleep (Lenovo Thinkpad T400, Intel Mobile 4 series). Then, unplugged the power supply, and opened the lid to wake it up; as usual, the laptop partly froze for several seconds (maybe one minute). However, when I tried to reproduce it, by doing a couple of more suspend/resume cycles, everything seems to work just smooth. "udevadm monitor --property" gives about 52 KiB of text. (btw, the severe flickering seen on eg. kernel-2.6.34-0.19.rc2.git4.fc14.x86_64 is gone too) Created attachment 404303 [details] udevadm monitor --property of kernel 2.6.32.10-94.bz528312.fc12.x86_64 The problem persists on kernel 2.6.32.10-94.bz528312.fc12.x86_64, but is less frequent it seems. Attached is "udevadm monitor --property" from right before suspend/resume cycle, until things behave normal again. 
I had to let the machine stay in suspend for a while to reproduce it. The partial lockup lasts for about a minute. This is on a Lenovo T400, Intel Mobile 4. (In reply to comment #93) > This is on a Lenovo T400, Intel Mobile 4. What kind & count of connectors to external displays does it have? > What kind & count of connectors to external displays does it have?
I have none connected, but there is one VGA connector directly on the laptop, and the docking station has one or maybe two DVI connectors, if I recall correctly. (I am not using the docking station.)
kernel-PAE-2.6.32.11-99.fc12.i686 works great for me. Note that this is with a desktop, so suspend/resume was n/a for me. My laptop: Lenovo T400 with docking station, DVI connected to second monitor Kernel: 2.6.32.10-94.bz528312.fc12.x86_64 OK: - I cannot reproduce this issue after suspend to disk or ram :) ISSUE: - after some "working time" GUI "freeze" again with udev 100% :( I have the same problem: udevd eating up the CPU. Problem did not occur in FC10, started with clean install of FC12. In "normal" state, the CPU is hot (60° and more) and the load is at 52%, what would correspond to one full core of the CPU :-( Killing udevd (all the processes) helps, but then disks will not automount and the cursor focus is lost every minute or so - e.g. in Konsole the focus goes away from the shell and the menu Edit is selected. I have noticed that opening a Dolphin (file manager) window will increase the CPU load, same thing when I mount a removable drive. There is always one udevd process which eats the CPU, a second one which is always restarted by the heavy one, and about a dozen which seem to be just idling and which will not restart if killed. 
System info: Kernel: 2.6.32.11-99.fc12.x86_64 Hardware: MSI-GT725 CPU: Intel core2duo P-9500 dual core ATI M98L mobility Radeon HD-4850 $ xrandr Screen 0: minimum 320 x 200, current 1680 x 1050, maximum 8192 x 8192 LVDS connected 1680x1050+0+0 (normal left inverted right x axis y axis) 0mm x 0mm 1680x1050 60.0*+ 1400x1050 60.0 60.0 1280x1024 59.9 60.0 1440x900 59.9 1280x960 60.0 59.9 1280x854 59.9 1280x800 59.8 1280x720 59.9 1152x768 59.8 1024x768 60.0 59.9 800x600 60.3 59.9 56.2 848x480 59.7 720x480 59.7 640x480 59.9 59.4 VGA-0 disconnected (normal left inverted right x axis y axis) HDMI-0 disconnected (normal left inverted right x axis y axis) $ less /proc/interrupts CPU0 CPU1 0: 5639116 8699832 IO-APIC-edge timer 1: 5 4995 IO-APIC-edge i8042 4: 0 0 IO-APIC-edge enecir 8: 1 0 IO-APIC-edge rtc0 9: 543 240060 IO-APIC-fasteoi acpi 12: 71 65 IO-APIC-edge i8042 16: 1 248 IO-APIC-fasteoi uhci_hcd:usb3, firewire_ohci, mmc0 17: 89 15 IO-APIC-fasteoi HDA Intel 18: 0 0 IO-APIC-fasteoi ehci_hcd:usb1, uhci_hcd:usb8 19: 183242 20 IO-APIC-fasteoi uhci_hcd:usb5, uhci_hcd:usb7 21: 0 0 IO-APIC-fasteoi uhci_hcd:usb4 22: 2631 141 IO-APIC-fasteoi HDA Intel 23: 0 191516 IO-APIC-fasteoi ehci_hcd:usb2, uhci_hcd:usb6 24: 0 0 PCI-MSI-edge pciehp 25: 0 0 PCI-MSI-edge pciehp 26: 0 0 PCI-MSI-edge pciehp 27: 0 0 PCI-MSI-edge pciehp 28: 0 0 PCI-MSI-edge pciehp 29: 8954244 10423 PCI-MSI-edge ahci 30: 163 149265 PCI-MSI-edge radeon 31: 14 34116 PCI-MSI-edge eth0 32: 0 0 PCI-MSI-edge iwlagn NMI: 0 0 Non-maskable interrupts LOC: 12111100 10081211 Local timer interrupts SPU: 0 0 Spurious interrupts PMI: 0 0 Performance monitoring interrupts PND: 0 0 Performance pending work RES: 148588 242181 Rescheduling interrupts CAL: 58 140 Function call interrupts TLB: 84812 81857 TLB shootdowns TRM: 0 0 Thermal event interrupts THR: 0 0 Threshold APIC interrupts MCE: 0 0 Machine check exceptions MCP: 54 54 Machine check polls ERR: 1 MIS: 0 From /var/log/Xorg.0.log : (EE) AIGLX error: dlopen of 
/usr/lib64/dri/r600_dri.so failed (/usr/lib64/dri/r600_dri.so: cannot open shared object file: No such file or directory) $ udevadm monitor --env (spews lots of entries like the one below - after udevd is killed, only one entry per 2-3 seconds) monitor will print the received events for: UDEV - the event which udev sends out after rule processing KERNEL - the kernel uevent KERNEL[1271104588.993266] change /devices/pci0000:00/0000:00:1f.2/host4/target4:0:0/4:0:0:0 (scsi) UDEV_LOG=3 ACTION=change DEVPATH=/devices/pci0000:00/0000:00:1f.2/host4/target4:0:0/4:0:0:0 SUBSYSTEM=scsi SDEV_MEDIA_CHANGE=1 DEVTYPE=scsi_device DRIVER=sr MODALIAS=scsi:t-0x05 SEQNUM=471291 UDEV [1271104589.008249] change /devices/pci0000:00/0000:00:1f.2/host4/target4:0:0/4:0:0:0/block/sr0 (block) UDEV_LOG=3 ACTION=change DEVPATH=/devices/pci0000:00/0000:00:1f.2/host4/target4:0:0/4:0:0:0/block/sr0 SUBSYSTEM=block DEVNAME=/dev/sr0 DEVTYPE=disk SEQNUM=388466 ID_CDROM=1 ID_CDROM_CD_R=1 ID_CDROM_CD_RW=1 ID_CDROM_DVD=1 ID_CDROM_DVD_R=1 ID_CDROM_DVD_RW=1 ID_CDROM_DVD_RAM=1 ID_CDROM_DVD_PLUS_R=1 ID_CDROM_DVD_PLUS_RW=1 ID_CDROM_DVD_PLUS_R_DL=1 ID_CDROM_MRW=1 ID_CDROM_MRW_W=1 ID_VENDOR=Optiarc ID_VENDOR_ENC=Optiarc\x20 ID_MODEL=DVD_RW_AD-7560S ID_MODEL_ENC=DVD\x20RW\x20AD-7560S\x20 ID_REVISION=SX01 ID_TYPE=cd ID_BUS=scsi ID_PATH=pci-0000:00:1f.2-scsi-4:0:0:0 ACL_MANAGE=1 ANACBIN=/sbin GENERATED=1 DKD_PRESENTATION_NOPOLICY=0 MAJOR=11 MINOR=0 DEVLINKS=/dev/block/11:0 /dev/scd0 /dev/disk/by-path/pci-0000:00:1f.2-scsi-4:0:0:0 /dev/cdrom /dev/cdrw /dev/dvd /dev/dvdrw KERNEL[1271104589.055744] change /devices/pci0000:00/0000:00:1f.2/host4/target4:0:0/4:0:0:0 (scsi) UDEV_LOG=3 ACTION=change DEVPATH=/devices/pci0000:00/0000:00:1f.2/host4/target4:0:0/4:0:0:0 SUBSYSTEM=scsi SDEV_MEDIA_CHANGE=1 DEVTYPE=scsi_device DRIVER=sr MODALIAS=scsi:t-0x05 SEQNUM=471292 ...8<..................................... Hope this helps. 
I tested 2.6.32.11-99.fc12.x86_64 and at least on the first boot it seems to stabilize pretty quickly. However, watching top via ssh, I did notice that Xorg and apcupsd both get very busy at the same time several times (getting up to 50-60% CPU each for 10 or more seconds, much more than I see on other systems during X startup), I think corresponding to initial gdm startup and then again during auto-login to GNOME desktop on my media-center PC. Also, I notice while watching top that some CPU% numbers are spurious like negative numbers or 9999% during a single refresh of the screen, then go back to sensible numbers. I have seen this on several kernel versions now, during this bootup phase when Xorg tends to go crazy. I don't think I ever see such behavior from top on other systems. Is it possible there is an underlying system time bug that triggers this Xorg/intel problem...? The clocksource is defaulting to HPET. Also, unlike my earlier comment #81 killing apcupsd does not always resolve the issue when it was malfunctioning. Sometimes killing apcupsd and even udevd were not enough, and I had to kill -9 Xorg as well (so gdm would restart it). Still experiencing this problem on kernel-2.6.32.11-99.fc12.x86_64 Hello! I have a bug https://bugzilla.redhat.com/show_bug.cgi?id=528312 with partially different symptoms, but hope the source of the bug is the same. I found out the kernel commit with regression https://bugzilla.redhat.com/show_bug.cgi?id=573200#c11 . I did not post a patch to disable the commit by myself. You can look at http://lkml.indiana.edu/hypermail//linux/kernel/1001.1/00966.html how to do that. kernel-2.6.32.11-99.fc12.x86_64 seems to have resolved the problem for me, I experienced the bug every time for the -90 kernel (had other hassle with -94.bz528312 but -99 has been fine. (Desktop system only, was previously seeing high CPU usage and huge Xorg.log files every time, but seems resolved by -99). Thanks. 
If we speak about the same bug, it is resolved only for i8xx chipsets. I have GM45. I updated to F13 (devel) and this issue is still valid (Xorg.0.log is full of "EDID for output ...") after suspend to disk and resume. Kernel version 2.6.33.3-79.fc13.x86_64. This issue is still live in Fedora 12. I'm using: kernel-2.6.32.11-99.fc12.x86_64 xorg-x11-drv-intel-2.9.1-1.fc12.x86_64 I've rebuilt the xorg-x11-drv-intel package without the uevent.patch as a workaround. This makes Xorg not use 100% CPU and keeps the Xorg.0.log from filling up with EDID events. But udevd is still constantly using some CPU instead. This is an HP Pavilion p6340f. The sticker says "Intel GMA X4500 integrated graphics". lspci says "VGA compatible controller: Intel Corporation 4 Series Chipset Integrated Graphics Controller (rev 03)" and suggests it's using the i915 driver. dmesg says "agpgart-intel 0000:00:00.0: Intel G45/G43 Chipset". First time for me to notice udevd spinning at 100% on one of my two cores, GM45 chipset on a Dell E6400 - on 2.6.34-git8. I was just working in Xorg and found laptop hot on my lap... killing udevd brought back my laptop to normal state. No suspend/resume. No extra logging found in Xorg.0.log. I build and run approx 75-90% of all released -git kernels, and I had *never* seen this before. Last known good kernel: 2.6.34-git4. Latest Xorg related yum update: May 16 18:20:05 Updated: xorg-x11-drv-evdev-2.3.3-1.fc12.x86_64 Oh, never mind... Mine seems to be a new mainline kernel bug, as per: http://lkml.org/lkml/2010/5/23/100 Sorry for the noise. Can someone who can reproduce this please boot with drm.debug=0x0f and attach dmesg from the resulting udev storm? Created attachment 416285 [details]
dmesg with drm.debug=0x0f from affected system
Booted 2.6.32.12-115.fc12.x86_64 with drm.debug=0x0f.
System misbehaved at around May 25 16:49.
Attached is copy of dmesg.
Created attachment 416287 [details]
messages, dmesg and xorg.0.log from affected system
Booted kernel with drm.debug=0x0f
It ran for perhaps 10 minutes, and at around May 25 16:49 it started misbehaving. (The current kernel seems to run for a short time, 10-15 minutes, before getting upset; the previous one displayed the problem immediately!)
The attachment contains dmesg, messages, and Xorg.0.log.
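For anyone comparing logs, here is a rough Python sketch for measuring how bad a storm was from Xorg.0.log: it counts the "EDID for output ..." lines per output and the time between the first and last one. It assumes the timestamped log format quoted elsewhere in this bug ("[ 17359.744] (II) intel(0): EDID for output VGA1"); older logs without timestamps will not match. The function name count_edid_probes is made up for illustration.

```python
import re
from collections import Counter

# Assumed line format (taken from log excerpts quoted in this bug):
# [ 17359.744] (II) intel(0): EDID for output VGA1
EDID_LINE = re.compile(
    r"^\[\s*(\d+\.\d+)\]\s+\(II\)\s+intel\(0\):\s+EDID for output (\S+)"
)


def count_edid_probes(log_text):
    """Return (per-output probe counts, seconds between first and last probe)."""
    counts = Counter()
    first = last = None
    for line in log_text.splitlines():
        match = EDID_LINE.match(line)
        if not match:
            continue  # not an EDID probe line
        timestamp = float(match.group(1))
        if first is None:
            first = timestamp
        last = timestamp
        counts[match.group(2)] += 1
    duration = 0.0 if first is None else last - first
    return counts, duration


# Example use (path is the usual default, adjust as needed):
# counts, seconds = count_edid_probes(open("/var/log/Xorg.0.log").read())
```

A healthy system should show each output probed only a handful of times; hundreds of probes of the same output within a minute or so would match the behaviour described in this bug.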
Hi, yes, this time the uptime was 20 minutes (for me) before the issue turned up :( (F13 kernel 2.6.33.4-95.fc13.x86_64)
Created attachment 416461 [details]
dmesg with drm.debug=0x0f on Intel Mobile 4
After my second suspend/resume cycle (uptime 23 hours or so), the system hung for about a minute (right before May 25 20:06:46 CEST 2010). See attached dmesg. This is on a Lenovo Thinkpad T400:
$ cat /proc/cmdline
ro root=/dev/VolGroup00/lv_root rhgb quiet selinux=0 vga=0x318 SYSFONT=latarcyrheb-sun16 LANG=en_US.UTF-8 KEYTABLE=no rd_plytheme=charge intel_iommu=igfx_off drm.debug=0x0f
$ uname -r
2.6.34-11.fc14.x86_64
$ rpm -qa \*intel\*
intel-gpu-tools-2.10.0-5.fc14.x86_64
xorg-x11-drv-intel-2.10.0-5.fc14.x86_64
$ lspci|grep VGA
00:02.0 VGA compatible controller: Intel Corporation Mobile 4 Series Chipset Integrated Graphics Controller (rev 07)
Created attachment 416462 [details]
relevant parts of /var/log/messages corresponding to previous dmesg output
I filed https://bugzilla.redhat.com/show_bug.cgi?id=600465 but it's almost certainly a duplicate of this bug. A couple of data points that I haven't seen mentioned:
1. When the system is unresponsive it's possible to hotkey out of X to a textual virtual terminal, and performance returns to normal immediately. Switching back to the X session makes the system bog down again.
2. This bug appeared for the first time only a week or two ago for me and was very intermittent and solved by a reboot. It got worse and worse, especially in the last couple of days, to the point where my system is almost totally unusable, almost all the time. Possibly related: the weather turned distinctly hotter in the last few days (the only other relevant variable I can think of).
Mine is a Dell "Studio Slim" x86_64 desktop. This is a total showstopper for me.
In intel_setup_outputs in drivers/gpu/drm/i915/intel_display.c, try commenting out the entire contents of the "} else if (SUPPORTS_DIGITAL_OUTPUTS(dev)) {" block. That fixes it for me (at the cost of breaking DisplayPort).
In a clarification I just added to bug 600465 (a likely duplicate of this bug), I pointed out that while some sufferers of this bug have symptoms for just a minute or two, mine continue indefinitely once they begin. From some of the comments here it sounds like others may be having the same experience? Please clarify if you can. I also noted: "this problem was recurring a few times each day when I first reported it, but in the past few days it has happened less than once per day. The only difference I can think of between then and now has been the ambient temperature -- it was very hot, but it cooled off. This weekend is supposed to be very hot again, so we'll see if the problem worsens"
Bob: what kernel are you on? I think 2.6.35 has a regression, and I'm about to post a patch for 2.6.35 that should at least not make it worse.
Also, for my amusement, can you wait until the problem is happening, then (after switching to a VT if you have to) run intel_reg_read 0x61114 a bunch of times and send me all the output? intel_reg_read 0x61110 might also be interesting, but you only have to do that once. Created attachment 423471 [details]
Workaround patch
For those of you who just want to use your computer without waiting for this to get fixed for real, try the attached kernel patch. Then boot with i915.hotplug_mask=0x38000000. That might mean you have to run xrandr (no parameters needed) after plugging or unplugging a digital cable.
You could also try specifying even fewer bits to see if some combination keeps the problem fixed but lets hotplug work. For example, 0x08000000 stops the bug for me (but I haven't tested hotplug yet since I'm away from my docking station).
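To check whether the mask actually took effect, compare the value of the hotplug enable register (0x61110, as read with intel_reg_read) against the mask you booted with. The following is only a small Python sketch of that bit test; the helper names are made up, and the "0x61110 : 0x320" line format is assumed from the single intel_reg_read output quoted later in this bug.

```python
HOTPLUG_MASK = 0x38000000  # value passed as i915.hotplug_mask in the workaround


def masked_bits_still_set(reg_value, mask=HOTPLUG_MASK):
    """Return the subset of `mask` bits that are still set in `reg_value`.

    A result of 0 means every masked hotplug bit is disabled.
    """
    return reg_value & mask


def parse_reg_read(line):
    """Parse a line like '0x61110 : 0x320' into (register, value).

    The format is an assumption based on one sample in this bug.
    """
    reg, value = (int(part.strip(), 16) for part in line.split(":"))
    return reg, value


# Example with register values reported later in this bug:
# masked_bits_still_set(0x38000320) is non-zero, so the mask was not applied;
# masked_bits_still_set(0x320) is zero, so the masked bits are off.
```

The same test works for narrower masks such as 0x08000000, which may help when experimenting with fewer disabled bits.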
Andy: I'm using an up-to-date F12, so I'm on kernel 2.6.32.12-115.fc12.x86_64. Also, my intel-gpu-tools is 2.9.1 and doesn't have intel_reg_read. Please see bug 600465 for some possibly informative attachments I just added in response to a NEEDINFO request.
Another workaround, which works for me on my Q45-based system: killall -STOP Xorg during the storm, then wait until udev finishes spinning at 100% (a minute or less), then killall -9 Xorg. After this, the system seems to behave normally. For me, the storm only happens after bootup, and not every time, but I never use suspend/resume on this host...
(In reply to comment #118)
> For those of you who just want to use your computer without waiting for this to
> get fixed for real, try the attached kernel patch. Then boot with
> i915.hotplug_mask=0x38000000. That might mean you have to run xrandr (no
> parameters needed) after plugging or unplugging a digital cable.
Thanks for the workaround, I applied it and my laptop has been working fine for the past 18 hours.
I've been using the workaround patch from Andy for a week now, and I've never had this performance issue again (F13 with Intel GM45 and kernel 2.6.33.5-112). But a permanent solution is really needed :-(
Update: I built a kernel with the patch from comment #118 (and ran it with the appropriate flags) and it DID NOT HELP. The symptoms appeared after a warm and a cold reboot. So I tried disabling the uevent patch in xorg-x11-drv-intel as suggested in comment #70 and it DID HELP. That is, the udev storms continued to happen, but they did not slow X to a crawl. In fact, a udev storm is happening right now as I type this, but it's monopolizing just one of my four cores, which I can live with for now. (Cross-posting this update to bug #600465.)
Bob, did you append "i915.hotplug_mask=0x38000000" to your kernel arguments?
Yes I did; in fact I hard-coded it into grub.conf. And I double-checked the kernel arguments at the boot menu.
And I triple-checked with dmesg that that argument made it to the running kernel. (I also counted the correct number of trailing zeroes a few times to make sure I had it right.) Bob: what kernel version are you running? If it's 2.6.35-anything, try the patch here in addition to the hotplug_mask patch: https://patchwork.kernel.org/patch/105727/ Failing that, can you run either intel_reg_read 0x61110 or intel_reg_dumper and post the output? Both of them live in intel-gpu-tools. Hi Andy, As I reported in comment #119, my version of intel-gpu-tools does not have intel_reg_read (or intel_reg_dumper). But it does have something called intel_gpu_dump, so just in case that's useful, I'll attach its output. On the other hand, I'm on a slightly newer kernel now: 2.6.32.14-127.i915_irq_patch.fc12.x86_64. Not new enough for the patch you suggested -- though it looks like the patch will apply to 2.6.32 just fine, except for the hotplug_en &= CRT_HOTPLUG_MASK; line in intel_crt.c, which doesn't exist. (In fact, although i915_reg.h defines CRT_HOTPLUG_MASK, no code in that directory appears to use it.) If I get some time this weekend perhaps I will try the patch anyway. Created attachment 426952 [details]
Output of intel_gpu_dump
Unfortunately, the gpu dump doesn't help, and your kernel might have different hotplug code. Basically, if the PORT_HOTPLUG_EN (0x61110) register has any of bits 0x38000000 set, then my patch didn't work. If not, then either your kernel does something strange or there's a different bug. Is there any chance you could download the intel-gpu-tools source and build it? git link and tarballs are here: http://cgit.freedesktop.org/xorg/app/intel-gpu-tools/ You shouldn't need the other patch unless fc12 backported a regression from 2.6.35, which sounds rather unlikely.
OK, I built the newer intel-gpu-tools, and the output of intel_reg_read 0x61110 is... 0x38000320. So I quadruple-checked the kernel params:
% dmesg
...
Kernel command line: ro root=/dev/mapper/vg_marzipan2-lv_root noiswmd LANG=en_US.UTF-8 SYSFONT=latarcyrheb-sun16 KEYBOARDTYPE=pc KEYTABLE=us rhgb quiet i915.hotplug_mask=0x38000000
...
and I also checked that I'm running a kernel that actually contains the patch in question:
% modinfo i915
...
parm: hotplug_mask:Disable these hotplug bits (non-Ironlake) (uint)
...
Eager to help get to the bottom of this. Feels like we're close. (There are just a few other places that write to PORT_HOTPLUG_EN.) Let me know what else I can do.
You probably need the fix in commit 6e0032f0ae4440e75256bee11b163552cae21962, which you can find here: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=6e0032f0ae4440e75256bee11b163552cae21962 I'll be out of town for a few days, so good luck :)
Thanks Andy. I applied that diff and built another kernel, and this time:
% sudo intel_reg_read 0x61110
0x61110 : 0x320
I'll do what I can to provoke a udev storm and will report back in a few days.
A few days later: the issue is completely resolved. Thanks again!
In which version of the Fedora Linux kernel is this bug fixed?
Valent: none. The bug seems to be resolved if you apply my patch and boot with i915.hotplug_mask=0x38000000.
That causes hotplug of digital outputs to work considerably less well (the kernel won't notice in a timely manner), so it's unlikely to get applied anywhere.
The patch has been working nicely so far. I only rebuilt the patched i915.ko module. I am wondering if someone would be kind enough to make a fresh i915.ko module available whenever a kernel update comes out, as this would be an enormous help to people like me (not that the patch itself isn't useful on its own).
Any news on a patch in Fedora, not just the workarounds above? We're still using one of those workarounds, which isn't perfect, and this bug continues to be an issue on a large number of workstations.
I'm also hit by this bug on a Dell E4200 laptop (Intel chipset, x86-64, F13 up to date, KDE). A couple of minutes after most resumes, the system hangs for about one minute with X taking all the CPU, then everything is back to normal. /var/log/Xorg.0.log grows a lot during one of these storms. On the last instance, I had this
[ 17359.712] (II) intel(0): EDID for output LVDS1
[ 17359.712] (II) intel(0): Manufacturer: LCD Model: 2109 Serial#: 909718585
[ 17359.712] (II) intel(0): Year: 2010 Week: 12
[ 17359.712] (II) intel(0): EDID Version: 1.3
[ 17359.712] (II) intel(0): Digital Display Input
[ 17359.712] (II) intel(0): Max Image Size [cm]: horiz.: 26 vert.: 16
[ 17359.712] (II) intel(0): Gamma: 2.20
[ 17359.712] (II) intel(0): No DPMS capabilities specified
[ 17359.712] (II) intel(0): Supported color encodings: RGB 4:4:4 YCrCb 4:4:4
[ 17359.712] (II) intel(0): First detailed timing is preferred mode
[ 17359.712] (II) intel(0): redX: 0.580 redY: 0.340 greenX: 0.310 greenY: 0.550
[ 17359.712] (II) intel(0): blueX: 0.155 blueY: 0.155 whiteX: 0.313 whiteY: 0.329
[ 17359.712] (II) intel(0): Manufacturer's mask: 0
[ 17359.713] (II) intel(0): Supported detailed timing:
[ 17359.713] (II) intel(0): clock: 82.0 MHz Image Size: 261 x 163 mm
[ 17359.713] (II) intel(0): h_active: 1280 h_sync: 1352 h_sync_end 1480 h_blank_end 1660 h_border: 0
[ 17359.713] (II) intel(0): v_active: 800 v_sync: 803 v_sync_end 809 v_blanking: 823 v_border: 0
[ 17359.713] (II) intel(0): Supported detailed timing:
[ 17359.713] (II) intel(0): clock: 56.3 MHz Image Size: 261 x 163 mm
[ 17359.714] (II) intel(0): h_active: 1280 h_sync: 1352 h_sync_end 1480 h_blank_end 1694 h_border: 0
[ 17359.714] (II) intel(0): v_active: 800 v_sync: 803 v_sync_end 809 v_blanking: 831 v_border: 0
[ 17359.714] (II) intel(0): HMW1K@121EWU
[ 17359.714] (II) intel(0):
[ 17359.714] (II) intel(0): EDID (in hex):
[ 17359.714] (II) intel(0): 00ffffffffffff003064092139343936
[ 17359.715] (II) intel(0): 0c140103901a10780a87f594574f8c27
[ 17359.715] (II) intel(0): 27505400000001010101010101010101
[ 17359.715] (II) intel(0): 0101010101010820007c512017304880
[ 17359.715] (II) intel(0): 360005a31000001afe15009e51201f30
[ 17359.715] (II) intel(0): 4880360005a31000001a000000fe0048
[ 17359.715] (II) intel(0): 4d57314b403132314557550a000000fe
[ 17359.715] (II) intel(0): 00000000000000000001010a202000dc
[ 17359.715] (II) intel(0): EDID vendor "LCD", prod id 8457
[ 17359.716] (II) intel(0): Printing DDC gathered Modelines:
[ 17359.716] (II) intel(0): Modeline "1280x800"x0.0 82.00 1280 1352 1480 1660 800 803 809 823 +hsync -vsync (49.4 kHz)
[ 17359.716] (II) intel(0): Modeline "1280x800"x0.0 56.30 1280 1352 1480 1694 800 803 809 831 +hsync -vsync (33.2 kHz)
[ 17359.717] (II) intel(0): Not using default mode "320x240" (doublescan mode not supported)
[ 17359.717] (II) intel(0): Not using default mode "400x300" (doublescan mode not supported)
[ 17359.717] (II) intel(0): Not using default mode "400x300" (doublescan mode not supported)
[ 17359.717] (II) intel(0): Not using default mode "512x384" (doublescan mode not supported)
[ 17359.717] (II) intel(0): Not using default mode "640x480" (doublescan mode not supported)
[ 17359.717] (II) intel(0): Not using default mode "640x512" (doublescan mode not supported)
[ 17359.717] (II) intel(0): Not using default mode "800x600" (doublescan mode not supported)
[ 17359.717] (II) intel(0): Not using default mode "896x672" (doublescan mode not supported)
[ 17359.717] (II) intel(0): Not using default mode "928x696" (doublescan mode not supported)
[ 17359.717] (II) intel(0): Not using default mode "960x720" (doublescan mode not supported)
[ 17359.717] (II) intel(0): Not using default mode "700x525" (doublescan mode not supported)
[ 17359.717] (II) intel(0): Not using default mode "1024x768" (doublescan mode not supported)
[ 17359.717] (II) intel(0): Printing probed modes for output LVDS1
[ 17359.717] (II) intel(0): Modeline "1280x800"x60.0 82.00 1280 1352 1480 1660 800 803 809 823 +hsync -vsync (49.4 kHz)
[ 17359.717] (II) intel(0): Modeline "1280x800"x40.0 56.30 1280 1352 1480 1694 800 803 809 831 +hsync -vsync (33.2 kHz)
[ 17359.717] (II) intel(0): Modeline "1024x768"x60.0 65.00 1024 1048 1184 1344 768 771 777 806 -hsync -vsync (48.4 kHz)
[ 17359.718] (II) intel(0): Modeline "800x600"x60.3 40.00 800 840 968 1056 600 601 605 628 +hsync +vsync (37.9 kHz)
[ 17359.718] (II) intel(0): Modeline "800x600"x56.2 36.00 800 824 896 1024 600 601 603 625 +hsync +vsync (35.2 kHz)
[ 17359.718] (II) intel(0): Modeline "640x480"x59.9 25.18 640 656 752 800 480 490 492 525 -hsync -vsync (31.5 kHz)
[ 17359.744] (II) intel(0): EDID for output VGA1
[ 17359.753] (II) intel(0): EDID for output HDMI1
[ 17359.753] (II) intel(0): EDID for output DP1
[ 17359.762] (II) intel(0): EDID for output HDMI2
[ 17359.762] (II) intel(0): EDID for output DP2
[ 17359.762] (II) intel(0): EDID for output DP3
repeated 276 times, for a total duration of 40 seconds. I have the impression that it happens nearly always after long suspends (more than 20 minutes) but not always after short suspends (a couple of seconds, for testing).
I have the impression that the storm starts when there is some high system activity (when I launch konqueror or bring an open oowriter window to the front), but it might just be an impression. It never happens on a cold boot. In top, I see Xorg taking the CPU, but I don't see udevd. This is an incredibly annoying bug.
I get the excessive udev output currently only with a certain HDMI-DVI converter. Using another one works.
*** Bug 640884 has been marked as a duplicate of this bug. ***
This bug seems to have been eliminated in F14. (There is a MUCH more annoying bug instead: 632031, but that's a different story. Hopefully this other one won't affect you.)
I have what is to become F14 here and just had an occurrence of this bug yesterday. And a very bad one: it didn't go away for quite a long time, and even changing VTs, moving windows around, etc. wasn't helping. I had an office full of students so I didn't have time to investigate. Finally I just shut down and rebooted. The machine is up to date. If anything I would say the problems got more common and more severe since I moved from F13 to F14. Yesterday's storm generated an X log file of 114 megabytes. Kernel is:
2.6.35.6-39.fc14.x86_64
udev-161-4.fc14.x86_64
xorg-x11-drv-intel-2.12.0-6.fc14.1.x86_64
(In reply to comment #141)
> This bug seems to have been eliminated in F14. (There is a MUCH more annoying
> bug instead: 632031, but that's a different story. Hopefully this other one
> won't affect you.)
On my system (Dell E4200 laptop, Intel chipset, x86-64, KDE), I could indeed fix the bug (it didn't show up after 5 suspend/resume cycles) by simply installing and running the F14 kernel kernel-2.6.35.6-39.fc14.x86_64 without changing anything else. And I can still suspend!
This message is a reminder that Fedora 13 is nearing its end of life. Approximately 30 (thirty) days from now Fedora will stop maintaining and issuing updates for Fedora 13. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as WONTFIX if it remains open with a Fedora 'version' of '13'.
Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version prior to Fedora 13's end of life.
Bug Reporter: Thank you for reporting this issue and we are sorry that we may not be able to fix it before Fedora 13 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora please change the 'version' of this bug to the applicable version. If you are unable to change the version, please add a comment here and someone will do it for you.
Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete.
The process we are following is described here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping
Please change version to 14 as the problem still exists. Probably even in fc15, not tested yet.
I'm not using my affected laptop much anymore, so I've stopped really thinking about this bug. The real fix is known but it's complicated. Maybe someone can be persuaded to do it some day :)
-- Fedora Bugzappers volunteer triage team https://fedoraproject.org/wiki/BugZappers
This bug is well into tl;dr land, F14 is EOL very soon, and I've not seen any instance of this in ages. Closing. Please file new bugs if this is still an issue in F15 or later.
Fair enough. I'm about to resurrect the affected laptop (probably on Ubuntu LTS this time), and I'll reopen a bug somewhere if the problem is still there. My newer Intel machines are unaffected.
I no longer have an affected notebook either.
I still have an affected notebook, but it is fine with F15 and F16. I have a (formerly) affected desktop that's still running F14, but I haven't experienced the bug in a long time. The bug still exists on Acer TravelMate 1810TZ on F15 using kernel 2.6.41.1-1.fc15.x86_64 and disappears when applying the patch from comment #118. Sorry, meant Acer *Aspire* 1810TZ. And nothing changed for 3.4.0... losing my hope to boot with a standard kernel some day ;-) Created attachment 612185 [details]
Ported the workaround patch to kernel 3.5.3
Since this bug is still valid for me I ported this patch to the current kernel of fc17, 3.5.3.
It seems that this bug has been solved in the kernel; I haven't experienced it anymore. Maybe it was commit d1757408bfe3adca81ff1c88fcb2d578864f8e9d by Jani Nikula:
> drm/i915: only enable sdvo hotplug irq if needed
or 768b107e4b3be0acf6f58e914afe4f337c00932b by Daniel Vetter:
> drm/i915: disable sdvo hotplug on i945g/gm
> v2: While at it, remove the bogus hotplug_active read, and do not mask
> hotplug_active[0] before checking whether the irq is needed
Since I seem to have been the only one still struggling with this, I think this bug can be closed now. |
Created attachment 364366 [details]
/var/log/messages
Description of problem:
After resuming the computer from suspend-to-RAM, following a brief pause, udevd takes 100% CPU and makes the computer unusable for a couple of seconds. Then the computer works as it is supposed to.
Version-Release number of selected component (if applicable):
udev-debuginfo-145-9.fc12.x86_64
udev-145-10.fc12.x86_64
How reproducible:
75% (it mostly happens, but not always)
Steps to Reproduce:
1. suspend/resume notebook
2.
3.
Actual results:
see above
Expected results:
computer should just work upon resume
Additional info: