Bug 116494
Field | Value
---|---
Summary | Kernel panics for fatal exception in interrupt when using ipsec
Product | [Fedora] Fedora
Component | kernel
Version | rawhide
Hardware | athlon
OS | Linux
Status | CLOSED CURRENTRELEASE
Severity | high
Priority | medium
Reporter | Kimmo Koivisto <kimmo.koivisto>
Assignee | David Miller <davem>
QA Contact | Brian Brock <bbrock>
CC | ckjohnson
Doc Type | Bug Fix
Last Closed | 2004-03-08 19:11:29 UTC
Description
Kimmo Koivisto
2004-02-21 19:31:42 UTC
Created attachment 97914 [details]
screen shot from kernel panic from 2.6.3
Created attachment 97915 [details]
screen shot from kernel panic 2.6.1-1.newer-than-37
Created attachment 97916 [details]
rpm -qa
And if you do such a transfer without using IPSEC, it works just fine, right? Unfortunately, the top of the OOPS log had scrolled off the screen by the time you took the screenshots, so the most important information is not there. If there is any chance of getting the rest of the OOPS, that would help a lot, perhaps by making use of a serial console. Also, if you can try with something other than a prism2 card, or over ethernet, that would eliminate the prism2 card driver as a culprit as well. The more we narrow this down, the better the chance it will get fixed.

I'm not 100% sure about the IPSec case; I ran some tests earlier (a month ago) but don't remember all the details. I found the null modem cable and was able to get the full oops over the serial console; I'll attach it to the bug. I'll try with a Lucent WLAN card and report whether it makes any difference.

Created attachment 97923 [details]
Full oops from serial console
And the Lucent WLAN card was no exception: same results (it uses the same orinoco driver... maybe I should try other cards). The oops with the Lucent card is attached. To be more exact, I'm using only ESP with AES128/SHA1, no AH at all.

Created attachment 97924 [details]
oops with lucent wlan and ipsec
Both of those cards use the orinoco driver, so we're not yet at the point where the orinoco driver can be ruled out. I somewhat suspect it, since I've seen people using this kind of setup successfully, without crashes, over other drivers. Could you please try this over normal ethernet?

Here's your normal ethernet case. I have recently been testing ipsec on 2.6 and hit this problem as well. In my case it occurred on an athlon server with ethernet, as well as on an athlon laptop with wireless.

kernel-2.6.3-1.91
ipsec-tools-0.2.2-8

This is a home test lab, so I don't have all kinds of resources, but I do have flexibility. I can attest that with ipsec turned down on both systems kernel panics do not occur, and that with the corresponding ipsec interfaces up in transport mode, so that all traffic is encrypted, a kernel panic occurs after a brief time.

The ethernet interface on the server uses the r8169 module (RealTek RTL8169 Gigabit Ethernet). The wireless interface on the laptop uses the orinoco and orinoco_cs modules (Intersil PRISM2 11 Mbps Wireless Adapter).

I don't have a serial console configured, but I wrote down a few messages from a panic on the server:

kernel/sched.c:1799:spin_lock(kernel/sched.c:c035f140) already locked by kernel/sched.c:1799
(several lines of the same)
kernel/sched.c:291:spin_lock(kernel/sched.c:c04177e0) already locked by kernel/sched.c:1634
(several lines of the same)
Kernel panic: Fatal exception in interrupt

I have now managed to test this with wired cards and ipsec, and it still panics. I used a Via Rhine 2 (VIA Technologies, Inc. VT6102 [Rhine-II] (rev 74)) and exactly the same config. kernel-2.6.1-1.37 seems to work okay, as it did with the wireless cards, but kernel-2.6.3-1.91 panics. A capture of the panic is attached; it looks much the same as with the wireless cards.

Created attachment 98357 [details]
oops with via rhine eth and ipsec
Check out https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=117171

Per the posting on bug 117171 this appears to be fixed as of kernel-2.6.3-2.1.238. I updated to kernel-2.6.3-2.1.240 and have been testing for 5 hours with no panic. To stress it a little I copied the .240 kernel rpm, achieving ~3.8Mbps effective transfer rate over a wireless connection encrypted by transport mode ipsec. No panic.

Yes, now it seems to work ok with kernel-2.6.3-2.1.242. I have now tested this for more than an hour without any problems. With 2.6.3-1.91 it took less than a minute to oops, so I think the bug is fixed. Any idea where the bug was and what was fixed?

I noticed that tcpdump works quite differently now with 2.6.3-2.1.242. tcpdump used to look like this:

192.168.2.2 > 192.168.2.1: ESP(spi=0x71412363,seq=0x217e) (DF)
192.168.2.2 > 192.168.2.1: ESP(spi=0x71412363,seq=0x217f) (DF)
192.168.2.1 > 192.168.2.2: ESP(spi=0x05989fc0,seq=0x211e) truncated-ip - 24 bytes missing!
192.168.2.1 > 192.168.2.2: truncated-ip - 40764 bytes missing! 240.4.249.206 > 192.168.2.1: udp (frag 17664:40876@672) [tos 0x98] (ipip-proto-4)
192.168.2.2 > 192.168.2.1: ESP(spi=0x71412363,seq=0x2180) (DF)
192.168.2.1 > 192.168.2.2: ESP(spi=0x05989fc0,seq=0x211e)
192.168.2.2 > 192.168.2.1: ESP(spi=0x71412363,seq=0x2181) (DF)

but now the tcpdump looks like this:

192.168.2.1 > 192.168.2.2: ESP(spi=0x04a44fad,seq=0x9c4)
192.168.123.123 > 192.168.2.2: icmp 9: echo request seq 35841
192.168.2.2 > 192.168.2.1: ESP(spi=0x3b2cdf1f,seq=0x989)
192.168.2.1 > 192.168.2.2: ESP(spi=0x04a44fad,seq=0x9c5)
192.168.123.123 > 192.168.2.2: icmp 9: echo request seq 36097
192.168.2.2 > 192.168.2.1: ESP(spi=0x3b2cdf1f,seq=0x98a)

I can see some of the traffic as plain text, but all the traffic is encrypted (I verified this with an external sniffer; everything was ok).

Traffic over tunnels is seen twice, and tcpdump is able to see the traffic in both instances. In the first case, pre-tunnel, the traffic is not encrypted yet. In the second case, after going into the tunnel, the traffic is encrypted.

Anyway, I'm closing this bug now that it is fixed, and no, I have no idea what fixed it; probably some random change that occurred in 2.6.x development.
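
For anyone who wants to compare captures like the two above, here is a minimal Python sketch that counts ESP lines versus lines shown in the clear in tcpdump text output. The function names (`classify`, `summarize`) are hypothetical, and the regular expression only assumes the `ESP(spi=0x...,seq=0x...)` form seen in this report; other tcpdump versions may print ESP differently. Note that, as explained above, a cleartext line in a capture taken on the IPsec endpoint itself does not mean the packet went out unencrypted; only an external sniffer can confirm that.

```python
import re

# Assumption: ESP packets appear in the form printed in this report,
# e.g. "ESP(spi=0x04a44fad,seq=0x9c4)".
ESP_RE = re.compile(r"ESP\(spi=0x([0-9a-fA-F]+),seq=0x([0-9a-fA-F]+)\)")


def classify(line):
    """Return ("esp", spi, seq) for an ESP line, else ("clear", None, None)."""
    match = ESP_RE.search(line)
    if match:
        return ("esp", int(match.group(1), 16), int(match.group(2), 16))
    return ("clear", None, None)


def summarize(lines):
    """Count ESP and cleartext lines, skipping blank lines."""
    counts = {"esp": 0, "clear": 0}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        counts[classify(line)[0]] += 1
    return counts


# The second capture quoted in this report (kernel-2.6.3-2.1.242).
sample = [
    "192.168.2.1 > 192.168.2.2: ESP(spi=0x04a44fad,seq=0x9c4)",
    "192.168.123.123 > 192.168.2.2: icmp 9: echo request seq 35841",
    "192.168.2.2 > 192.168.2.1: ESP(spi=0x3b2cdf1f,seq=0x989)",
    "192.168.2.1 > 192.168.2.2: ESP(spi=0x04a44fad,seq=0x9c5)",
    "192.168.123.123 > 192.168.2.2: icmp 9: echo request seq 36097",
    "192.168.2.2 > 192.168.2.1: ESP(spi=0x3b2cdf1f,seq=0x98a)",
]

if __name__ == "__main__":
    print(summarize(sample))
```

Running the same summary on a capture from an external sniffer should show no cleartext lines for the protected hosts if everything is encrypted on the wire.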