
Bug 116494

Summary:

Kernel panics for fatal exception in interrupt when using ipsec

Product:

[Fedora] Fedora

Reporter:

Kimmo Koivisto <kimmo.koivisto>

Component:

kernel

Assignee:

David Miller <davem>

Status:

CLOSED CURRENTRELEASE

QA Contact:

Brian Brock <bbrock>

Severity:

high

Docs Contact:

Priority:

medium

Version:

rawhide

CC:

ckjohnson

Target Milestone:

---

Target Release:

---

Hardware:

athlon

OS:

Linux

Whiteboard:

Fixed In Version:

Doc Type:

Bug Fix

Doc Text:

Story Points:

---

Clone Of:

Environment:

Last Closed:

2004-03-08 19:11:29 UTC

Attachments:

Description	Flags
screen shot from kernel panic from 2.6.3	none
screen shot from kernel panic 2.6.1-1.newer-than-37	none
rpm -qa	none
Full oops from serial console	none
oops with lucent wlan and ipsec	none
oops with via rhine eth and ipsec	none

Description Kimmo Koivisto 2004-02-21 19:31:42 UTC

From Bugzilla Helper:
User-Agent: Mozilla/5.0 (compatible; Konqueror/3.2; fi, fi_FI@euro, fi_FI, fi_FI.UTF-8) (KHTML, like Gecko)

Description of problem:
Kernel panics after some time (from 30 seconds to a couple of minutes) when using the network (for example "wget something-big").

Fedora Core 1, some parts from fc1 development, some from test1.
Athlon 2100+, Motherboard with VIA chipset, integrated audio and LAN
Prism2 based PCMCIA WLAN
Buffalo PCMCIA/PCI adapter
Nova-t pci dvb-t tv-card
Bt878 based pci analog tv-card
IP: 192.168.2.2 (IPSec gateway is 192.168.2.1)

I have a WLAN and I'm using ipsec to protect all traffic. I think that these panics could be related to ipsec, but I haven't tested this enough to be sure.
I've also tested this on an IBM T23 laptop (prism2-wlan) with FC1 and kernel-2.6.1-newer-than-37, with the same results.

Screenshots from panics, rpm -qa and other system files attached.

Version-Release number of selected component (if applicable):
kernel-2.6.1-1.37, kernel-2.6.3-1.91, ipsec-tools-0.2.2-8

How reproducible:
Always

Steps to Reproduce:
1. Configure ipsec to encrypt all traffic
2. Make some traffic

    

Actual Results:  Kernel panic - Fatal exception in interrupt

Additional info:

Comment 1 Kimmo Koivisto 2004-02-21 19:35:33 UTC

Created attachment 97914 [details]
screen shot from kernel panic from 2.6.3

Comment 2 Kimmo Koivisto 2004-02-21 19:38:46 UTC

Created attachment 97915 [details]
screen shot from kernel panic 2.6.1-1.newer-than-37

Comment 3 Kimmo Koivisto 2004-02-21 19:39:14 UTC

Created attachment 97916 [details]
rpm -qa

Comment 4 David Miller 2004-02-21 20:13:59 UTC

And if you do such a transfer without using IPSEC, it works just
fine right?

Unfortunately, the top of the OOPS log scrolled off the screen by the
time you took the screenshots, so the most important information is not
there.  If there is any chance to get the rest of the OOPS that would
help a lot, maybe even by making use of serial console.

Also, if you can try with something other than a prism2 card, or over
ethernet, that would eliminate the prism2 card driver as a culprit as
well.

The more we narrow this down, the better chance it will get fixed.

Comment 5 Kimmo Koivisto 2004-02-22 11:54:13 UTC

I'm not 100% sure about the IPSec, I made some tests earlier (a month ago)
but don't remember all the details.
 
I found the null modem cable and was able to get the full oops with serial 
console, I'll attach it to the bug.  
 
I'll try with Lucent WLAN card and report if it made any difference.

Comment 6 Kimmo Koivisto 2004-02-22 11:56:33 UTC

Created attachment 97923 [details]
Full oops from serial console

Comment 7 Kimmo Koivisto 2004-02-22 12:18:07 UTC

And the Lucent WLAN card was no exception, same results (it's using the 
same orinoco driver... maybe I should try other cards). 
 
Oops with lucent card is attached.  
 
To be more exact, I'm using only ESP with AES128/SHA1, not AH at all.

Comment 8 Kimmo Koivisto 2004-02-22 12:19:20 UTC

Created attachment 97924 [details]
oops with lucent wlan and ipsec

Comment 9 David Miller 2004-02-29 05:41:48 UTC

Both of those cards use the orinoco driver, so we have not yet ruled the
orinoco driver out.  I still suspect it, since I've seen people run this
kind of setup successfully over other drivers without crashes.

Could you please try this over normal ethernet?

Comment 10 Christopher Johnson 2004-03-06 14:13:00 UTC

Here's your normal ethernet case.

I have recently been testing ipsec on 2.6 and hit this problem as
well. In my case it occurred on an athlon server with ethernet, as well
as on an athlon laptop with wireless.

kernel-2.6.3-1.91
ipsec-tools-0.2.2-8

This is a home test lab, so I don't have all kinds of resources but I
do have flexibility.  I can attest that with ipsec turned down on both
systems that kernel panics do not occur, and with corresponding ipsec
interfaces up in transport mode so all traffic is encrypted, a kernel
panic does occur after a brief time.

The ethernet interface on server uses r8169 module (RealTek RTL8169
Gigabit Ethernet).
The wireless interface on laptop uses orinoco and orinoco_cs modules
(Intersil PRISM2 11 Mbps Wireless Adapter).

I don't have a serial console configured, but I wrote down a few
messages from a panic on the server:
kernel/sched.c:1799:spin_lock(kernel/sched.c:c035f140) already locked
by kernel/sched.c:1799
(several lines of the same)
kernel/sched.c:291:spin_lock(kernel/sched.c:c04177e0) already locked
by kernel/sched.c:1634
(several lines of the same)
Kernel panic: Fatal exception in interrupt
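
When messages like these have to be copied by hand, it can help to reduce them to (acquire site, lock, holder site) triples before comparing them across panics. The following is a minimal sketch that assumes the message layout is exactly as transcribed above; the regular expression and the function name `parse_lock_messages` are illustrative and not part of any kernel tooling.

```python
import re
from collections import Counter

# Assumed layout, taken from the hand-copied messages in this comment:
#   <file>:<line>:spin_lock(<file>:<addr>) already locked by <file>:<line>
LOCK_RE = re.compile(
    r"(?P<site>\S+:\d+):spin_lock\((?P<lock>[^)]+)\)"
    r"\s+already locked\s+by\s+(?P<holder>\S+:\d+)"
)

def parse_lock_messages(text):
    """Return (acquire_site, lock, holder_site) tuples found in text."""
    # Collapse whitespace first so messages wrapped across lines still match.
    flat = " ".join(text.split())
    return [(m["site"], m["lock"], m["holder"]) for m in LOCK_RE.finditer(flat)]

sample = """\
kernel/sched.c:1799:spin_lock(kernel/sched.c:c035f140) already locked
by kernel/sched.c:1799
kernel/sched.c:291:spin_lock(kernel/sched.c:c04177e0) already locked
by kernel/sched.c:1634
"""

events = parse_lock_messages(sample)
# Count identical triples so repeated lines collapse into one summary row.
for (site, lock, holder), count in Counter(events).items():
    print(f"{lock}: taken at {site}, already held from {holder} (x{count})")
```

In the first message the acquire site and the holder site are the same source line; the sketch only makes that visible and does not say why it happens.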

Comment 11 Kimmo Koivisto 2004-03-07 13:38:49 UTC

I now managed to test this with wired cards and ipsec, it still panics.  
I used Via Rhine 2 (VIA Technologies, Inc. VT6102 [Rhine-II] (rev 74)) and 
exactly the same config. 
 
kernel-2.6.1-1.37 seems to work okay, as it did with wireless cards, but
kernel-2.6.3-1.91 panics. Capture from the panic is attached; the panic looks
much the same as with wireless cards.

Comment 12 Kimmo Koivisto 2004-03-07 13:39:59 UTC

Created attachment 98357 [details]
oops with via rhine eth and ipsec

Comment 13 Christopher Johnson 2004-03-07 19:30:40 UTC

Check out https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=117171

Per the posting on bug 117171 this appears to be fixed as of
kernel-2.6.3-2.1.238.

I updated to kernel-2.6.3-2.1.240 and have been testing for 5 hours
with no panic. To stress it a little I copied the .240 kernel rpm,
achieving ~3.8Mbps effective transfer rate over a wireless connection
encrypted by transport mode ipsec.  No panic.

Comment 14 Kimmo Koivisto 2004-03-07 21:13:38 UTC

Yes, now it seems to work ok with kernel-2.6.3-2.1.242.
I have now tested this for more than an hour without any problems. With
2.6.3-1.91 it took less than a minute to oops, so I think the bug is fixed.

Any idea where the bug was and what was fixed?
 
I noticed that tcpdump works quite differently now with 2.6.3-2.1.242. 
tcpdump used to look like this: 
192.168.2.2 > 192.168.2.1: ESP(spi=0x71412363,seq=0x217e) (DF) 
192.168.2.2 > 192.168.2.1: ESP(spi=0x71412363,seq=0x217f) (DF) 
192.168.2.1 > 192.168.2.2: ESP(spi=0x05989fc0,seq=0x211e) 
truncated-ip - 24 bytes missing! 192.168.2.1 > 192.168.2.2: truncated-ip - 40764 bytes missing! 240.4.249.206 > 192.168.2.1: udp (frag 17664:40876@672) [tos 0x98]  (ipip-proto-4)
192.168.2.2 > 192.168.2.1: ESP(spi=0x71412363,seq=0x2180) (DF) 
192.168.2.1 > 192.168.2.2: ESP(spi=0x05989fc0,seq=0x211e) 
192.168.2.2 > 192.168.2.1: ESP(spi=0x71412363,seq=0x2181) (DF) 
 
but now the tcpdump looks like this: 
192.168.2.1 > 192.168.2.2: ESP(spi=0x04a44fad,seq=0x9c4) 
192.168.123.123 > 192.168.2.2: icmp 9: echo request seq 35841 
192.168.2.2 > 192.168.2.1: ESP(spi=0x3b2cdf1f,seq=0x989) 
192.168.2.1 > 192.168.2.2: ESP(spi=0x04a44fad,seq=0x9c5) 
192.168.123.123 > 192.168.2.2: icmp 9: echo request seq 36097 
192.168.2.2 > 192.168.2.1: ESP(spi=0x3b2cdf1f,seq=0x98a) 
 
I can see some of the traffic as plain text, but all the traffic is encrypted (I
verified this with an external sniffer, everything was ok).
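
To compare the two captures more systematically, the ESP lines can be grouped by SPI and their sequence numbers listed. This is a minimal sketch that assumes the tcpdump summary format shown above; the helper names `esp_sequences` and `repeated` are made up for illustration. Run on the ESP lines of the older capture, it shows that seq=0x211e appears twice for spi=0x05989fc0. Whether that repetition has anything to do with the panic is not established in this report.

```python
import re
from collections import defaultdict

# Assumed line format, copied from the tcpdump output quoted in this comment.
ESP_RE = re.compile(
    r"^(?P<src>\d+(?:\.\d+){3}) > (?P<dst>\d+(?:\.\d+){3}): "
    r"ESP\(spi=(?P<spi>0x[0-9a-fA-F]+),seq=(?P<seq>0x[0-9a-fA-F]+)\)"
)

def esp_sequences(lines):
    """Map (src, dst, spi) to the sequence numbers seen, in capture order."""
    flows = defaultdict(list)
    for line in lines:
        m = ESP_RE.match(line.strip())
        if m:  # non-ESP lines are skipped
            flows[(m["src"], m["dst"], m["spi"])].append(int(m["seq"], 16))
    return dict(flows)

def repeated(seqs):
    """Return the sequence numbers that occur more than once, in first-repeat order."""
    seen, dupes = set(), []
    for s in seqs:
        if s in seen and s not in dupes:
            dupes.append(s)
        seen.add(s)
    return dupes

# ESP lines from the older capture (the truncated-ip line is left out).
old_capture = [
    "192.168.2.2 > 192.168.2.1: ESP(spi=0x71412363,seq=0x217e) (DF)",
    "192.168.2.2 > 192.168.2.1: ESP(spi=0x71412363,seq=0x217f) (DF)",
    "192.168.2.1 > 192.168.2.2: ESP(spi=0x05989fc0,seq=0x211e)",
    "192.168.2.2 > 192.168.2.1: ESP(spi=0x71412363,seq=0x2180) (DF)",
    "192.168.2.1 > 192.168.2.2: ESP(spi=0x05989fc0,seq=0x211e)",
    "192.168.2.2 > 192.168.2.1: ESP(spi=0x71412363,seq=0x2181) (DF)",
]

flows = esp_sequences(old_capture)
for (src, dst, spi), seqs in flows.items():
    print(src, "->", dst, spi,
          [hex(s) for s in seqs], "repeated:", [hex(s) for s in repeated(seqs)])
```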

Comment 15 David Miller 2004-03-08 19:11:29 UTC

Traffic over tunnels is seen twice, and tcpdump is able to see the
traffic in both instances.  In the first case, pre-tunnel, the traffic
is not encrypted yet.  In the second case, after going into the tunnel,
the traffic is encrypted.
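
A quick way to see this split in a capture is to label each tcpdump summary line as encapsulated or cleartext. The sketch below is only an illustration: it assumes that ESP packets are printed with `ESP(` immediately after the `src > dst: ` prefix, as in the output quoted in comment 14, and the function name `classify` is hypothetical.

```python
from collections import Counter

def classify(line):
    """Label a tcpdump summary line as 'esp' (encrypted) or 'clear'."""
    # Assumption: the summary starts with "src > dst: " and ESP packets
    # continue with "ESP(" right after that prefix.
    _, _, payload = line.partition(": ")
    return "esp" if payload.startswith("ESP(") else "clear"

# Lines from the newer capture quoted in comment 14.
new_capture = [
    "192.168.2.1 > 192.168.2.2: ESP(spi=0x04a44fad,seq=0x9c4)",
    "192.168.123.123 > 192.168.2.2: icmp 9: echo request seq 35841",
    "192.168.2.2 > 192.168.2.1: ESP(spi=0x3b2cdf1f,seq=0x989)",
    "192.168.2.1 > 192.168.2.2: ESP(spi=0x04a44fad,seq=0x9c5)",
    "192.168.123.123 > 192.168.2.2: icmp 9: echo request seq 36097",
    "192.168.2.2 > 192.168.2.1: ESP(spi=0x3b2cdf1f,seq=0x98a)",
]

labels = [classify(line) for line in new_capture]
print(Counter(labels))
```

The cleartext lines here are what tcpdump sees on the local host; a sniffer on the wire, as used in comment 14, is still the way to confirm that only ESP leaves the machine.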

Anyway, I'm closing this bug now that it is fixed.  And no, I have
no idea what fixed it; probably some random change that occurred
in 2.6.x development.