
Bug 80779

Summary: SMP kernel hangs solid, non-smp is fine
Product: [Retired] Red Hat Linux
Reporter: Need Real Name <egil>
Component: kernel
Assignee: Jeff Garzik <jgarzik>
Status: CLOSED DUPLICATE
QA Contact: Brian Brock <bbrock>
Severity: high
Priority: medium
Version: 9
CC: peterm
Hardware: i386
OS: Linux
Doc Type: Bug Fix
Last Closed: 2006-02-21 18:50:48 UTC
Bug Blocks: 79578
Attachments:
lsmod of running smp kernel (flags: none)

Description Need Real Name 2002-12-31 08:45:13 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 Galeon/1.2.7 (X11; Linux i686; U;) Gecko/20021216

Description of problem:
After typically an hour or so, the 2.4.20-2.2smp kernel pretty
consistently freezes solid in X. I presume it hangs at the kernel level, since
even the keyboard numlock etc is totally frozen. I'm running a dual Celeron
system, and it has behaved perfectly on smp with all previous versions,
including the latest rawhide before Xmas. 

The freeze happens under normal light use - when moving mouse cursor etc. I have
found no particular correlation to user actions or programs.

Booting the 2.4.20-2.2 non-smp kernel, everything seems to work fine.

Note that I just now changed the X server to the latest Phoebe variant -- before
this I had the standard 8.0 variant.

Everything freezes solid, so I don't know how to extract debug information...

Version-Release number of selected component (if applicable):


How reproducible:
Always

Steps to Reproduce:
1. Boot phoebe smp kernel
2. Use machine normally for an hour or so
3. Everything freezes solid (inc. keyboard, network etc.)
    

Actual Results:  Total hang

Additional info: Please give me some hints on how to find where the crash is, and
I will assist as much as I can.

Comment 1 Arjan van de Ven 2002-12-31 10:23:06 UTC
can you paste your lsmod information to this bug (so that I can make a list of
suspects); in addition can you try to add "acpi=off" to the kernel commandline
("a" in grub, or the vmlinuz line in /boot/grub/grub.conf)
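After rebooting, it can help to confirm that the option actually reached the kernel before drawing conclusions about stability. A minimal sketch, assuming the booted command line is readable from /proc/cmdline; the function names and the sample command line in the usage note are hypothetical:

```python
def has_kernel_option(cmdline: str, option: str) -> bool:
    """Return True if `option` appears as a whole token on the kernel command line."""
    # Split on whitespace so that e.g. "noacpi=off" does not match "acpi=off".
    return option in cmdline.split()


def read_cmdline(path: str = "/proc/cmdline") -> str:
    """Read the command line the running kernel was booted with."""
    with open(path) as f:
        return f.read().strip()
```

For example, has_kernel_option(read_cmdline(), "acpi=off") should be true on a machine booted with the option; has_kernel_option("ro root=/dev/hda1", "acpi=off") is false.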

Comment 2 Need Real Name 2002-12-31 12:17:07 UTC
Created attachment 89002 [details]
lsmod of running smp kernel

Comment 3 Need Real Name 2002-12-31 12:21:33 UTC
Comment on attachment 89002 [details]
lsmod of running smp kernel

This is with acpi=off in grub.conf

Will report how this switch affects stability when I know more...

Comment 4 Need Real Name 2003-01-01 10:16:03 UTC
With acpi=off in grub.conf, the SMP kernel seems to be stable as well, so it
seems reasonable to conclude that the problem is related to the combination of
ACPI and SMP.

Is there something that could be done to isolate the problem further?

Comment 5 Need Real Name 2003-01-01 14:36:14 UTC
Seems I jumped to conclusions:

The SMP system without ACPI has now hung after 26 hours and 3 minutes.

Symptoms just as before: Keyboard/screen/everything completely dead.

I had an external machine logged in via telnet over Ethernet, running "top". This
display also froze.

An interesting observation: Ping from the remote machine functioned flawlessly!
Perhaps one CPU was frozen, but the other was still active, being able to serve
the ping requests. Trying to log in via telnet failed, though.
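The ping-versus-telnet observation can be turned into a repeatable probe run from the remote machine. ICMP echo replies are generated inside the kernel, while a login banner needs a user-space process to run, so comparing the two gives a rough hint about which part is still alive. A sketch only: the function names are hypothetical, the ping flags assume a Linux-style ping, and the service probe assumes the server sends data first after connecting (telnet and SSH do).

```python
import socket
import subprocess


def ping_ok(host: str) -> bool:
    """Send one ICMP echo request; assumes Linux-style `ping -c COUNT -W SECONDS`."""
    try:
        result = subprocess.run(
            ["ping", "-c", "1", "-W", "2", host],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except OSError:
        return False
    return result.returncode == 0


def service_answers(host: str, port: int, timeout: float = 5.0) -> bool:
    """Connect and wait for the server to send at least one byte.

    A bare TCP connect is not enough: the kernel can complete the handshake
    for a listening socket even when the user-space daemon is frozen.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            return len(sock.recv(1)) > 0
    except OSError:  # refused, unreachable, or timed out
        return False


def classify(ping_alive: bool, service_alive: bool) -> str:
    """Summarize what the two probes suggest about the remote machine."""
    if ping_alive and service_alive:
        return "responsive"
    if ping_alive:
        return "kernel answers ICMP, user space not responding"
    if service_alive:
        return "service answers, ICMP not (ping may be filtered)"
    return "unreachable"
```

The state described above, where ping works but telnet logins fail, corresponds to classify(True, False).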

The last "top" status showed a CPU load of 21% user on both, and a system load
of 5 and 7%. 368M memory used, 58M free. The top processes were X at 28%,
bubblemon-gnome 8%, galeon-bin 6%, gnome-panel 4%, metacity 4%, top 1%,
evolution-mail 1%, evolution 0.5%, mixer_applet2 0.6%, gnome-session 0.1%,
xscreensaver 0.1%, magicdev 0.1%, eggcups 0.1%, yank 0.1%, init 0.0%

Comment 6 Need Real Name 2003-01-02 11:10:04 UTC
It happened again, this time after, say, 20 hours. Same circumstances, same thing.

Comment 7 Need Real Name 2003-01-07 06:45:39 UTC
Just for the record, I'm running the system with the single CPU kernel now, and
the system has been stable for the last 5 days. I think it is pretty safe to
conclude that the problem only occurs with SMP.

If there is anything at all I can do to extract more information from the crash
situation, then please let me know, and I will switch back to SMP mode again.
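One low-tech way to narrow down when a hang occurs is a heartbeat log that is forced to disk on every write, so that the last line surviving a reboot approximates the time of the freeze. This is a hypothetical sketch, not a tool mentioned in this report; the file path, interval, and function names are assumptions.

```python
import os
import time


def format_heartbeat(now: float, load: tuple) -> str:
    """Build one log line: UTC timestamp plus the 1/5/15-minute load averages."""
    stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.gmtime(now))
    return "%s UTC load=%.2f %.2f %.2f" % (stamp, load[0], load[1], load[2])


def write_heartbeat(path: str) -> str:
    """Append one heartbeat line and fsync it so it survives a hard hang."""
    try:
        load = os.getloadavg()
    except OSError:  # load average unavailable on this platform
        load = (0.0, 0.0, 0.0)
    line = format_heartbeat(time.time(), load)
    with open(path, "a") as f:
        f.write(line + "\n")
        f.flush()
        os.fsync(f.fileno())
    return line


def run(path: str, interval: float = 10.0, count: int = 0) -> None:
    """Write a heartbeat every `interval` seconds; count=0 means run until killed."""
    written = 0
    while count == 0 or written < count:
        write_heartbeat(path)
        written += 1
        time.sleep(interval)
```

Calling run("/var/tmp/heartbeat.log") in a background shell would leave a trail of timestamps; after a freeze and reboot, the final line bounds the time of the hang to within one interval.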

Comment 8 Jeff Garzik 2003-01-18 23:54:15 UTC

*** This bug has been marked as a duplicate of 82123 ***

Comment 9 Red Hat Bugzilla 2006-02-21 18:50:48 UTC
Changed to 'CLOSED' state since 'RESOLVED' has been deprecated.