Bug 865022

Summary: soft lockup bug in kernel-kirkwood on seagate dockstar
Product: [Fedora] Fedora Reporter: Till Maas <opensource>
Component: kernel Assignee: Jon Masters <jcm>
Status: CLOSED CURRENTRELEASE QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 17 CC: gansalmon, itamar, jonathan, kernel-maint, madhu.chinakonda, opensource, pbrobinson
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2013-04-13 08:16:40 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 245418    
Attachments:
Description: bug reproduced with Fedora 18 Beta Flags: none

Description Till Maas 2012-10-10 16:11:50 UTC
Description of problem:
I am trying to run cnucnu (the current implementation of the upstream release monitoring) on my Seagate DockStar running Fedora ARM 17. At one point it seemed to have crashed the system, i.e. ssh died. I rebooted and restarted it with a serial console attached. There I see these bug messages from the kernel:
fedora-arm login: [ 1243.004901] BUG: soft lockup - CPU#0 stuck for 22s! [cnucnu:586]
[ 1243.010934] Modules linked in: lockd sunrpc vfat fat mtdchar ofpart orion_nand nand nand_ecc nand_ids mtd mv643xx_eth mv_cesa usb_storage [last unloaded: scsi_wait_scan]
[ 1243.026253]
[ 1243.027747] Pid: 586, comm:               cnucnu
[ 1243.032388] CPU: 0    Not tainted  (3.4.2-3.fc17.armv5tel.kirkwood #1)
[ 1243.038955] PC is at feroceon_l2_inv_range+0x18/0xb8
[ 1243.043950] LR is at ___dma_page_dev_to_cpu+0x5c/0xcc
[ 1243.049027] pc : [<c00137c4>]    lr : [<c000fb9c>]    psr: 20000013
[ 1243.049032] sp : c6b5bc68  ip : c022e7c8  fp : 05a80864
[ 1243.060569] r10: 00000db4  r9 : c71209e8  r8 : 03beadb4
[ 1243.065819] r7 : 00000002  r6 : 00000db4  r5 : 03beadf5  r4 : 03beadb4
[ 1243.072377] r3 : c00137ac  r2 : 00000000  r1 : 03beadf5  r0 : 03beadb4
[ 1243.078934] Flags: nzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
[ 1243.086102] Control: 0005397f  Table: 06b4c000  DAC: 00000015
[ 1243.091893] [<c000f1e4>] (unwind_backtrace+0x0/0x124) from [<c0077a18>] (watchdog_timer_fn+0xf0/0x144)
[ 1243.101248] [<c0077a18>] (watchdog_timer_fn+0xf0/0x144) from [<c00388fc>] (__run_hrtimer+0xb0/0x1d0)
[ 1243.110430] [<c00388fc>] (__run_hrtimer+0xb0/0x1d0) from [<c0039138>] (hrtimer_interrupt+0x104/0x248)
[ 1243.119701] [<c0039138>] (hrtimer_interrupt+0x104/0x248) from [<c001602c>] (orion_timer_interrupt+0x24/0x34)
[ 1243.129581] [<c001602c>] (orion_timer_interrupt+0x24/0x34) from [<c00781a4>] (handle_irq_event_percpu+0x38/0x240)
[ 1243.139894] [<c00781a4>] (handle_irq_event_percpu+0x38/0x240) from [<c00783dc>] (handle_irq_event+0x30/0x40)
[ 1243.149775] [<c00783dc>] (handle_irq_event+0x30/0x40) from [<c007aa1c>] (handle_level_irq+0xbc/0xd0)
[ 1243.158957] [<c007aa1c>] (handle_level_irq+0xbc/0xd0) from [<c0077bd0>] (generic_handle_irq+0x28/0x38)
[ 1243.168312] [<c0077bd0>] (generic_handle_irq+0x28/0x38) from [<c0009b84>] (handle_IRQ+0x68/0x8c)
[ 1243.177145] [<c0009b84>] (handle_IRQ+0x68/0x8c) from [<c042bcf4>] (__irq_svc+0x34/0x80)
[ 1243.185195] [<c042bcf4>] (__irq_svc+0x34/0x80) from [<c00137c4>] (feroceon_l2_inv_range+0x18/0xb8)
[ 1243.194205] [<c00137c4>] (feroceon_l2_inv_range+0x18/0xb8) from [<c000fb9c>] (___dma_page_dev_to_cpu+0x5c/0xcc)
[ 1243.204350] [<c000fb9c>] (___dma_page_dev_to_cpu+0x5c/0xcc) from [<c022c6c8>] (dma_async_memcpy_buf_to_pg+0x15c/0x1a8)
[ 1243.215106] [<c022c6c8>] (dma_async_memcpy_buf_to_pg+0x15c/0x1a8) from [<c022daf0>] (dma_memcpy_to_iovec+0xd4/0x158)
[ 1243.225688] [<c022daf0>] (dma_memcpy_to_iovec+0xd4/0x158) from [<c036bf10>] (dma_skb_copy_datagram_iovec+0x5c/0x1d4)
[ 1243.236273] [<c036bf10>] (dma_skb_copy_datagram_iovec+0x5c/0x1d4) from [<c0392654>] (tcp_recvmsg+0x5a8/0x9f4)
[ 1243.246249] [<c0392654>] (tcp_recvmsg+0x5a8/0x9f4) from [<c03b0298>] (inet_recvmsg+0x48/0x5c)
[ 1243.254822] [<c03b0298>] (inet_recvmsg+0x48/0x5c) from [<c03455c0>] (sock_recvmsg+0xc0/0xe8)
[ 1243.263311] [<c03455c0>] (sock_recvmsg+0xc0/0xe8) from [<c0347010>] (sys_recvfrom+0xa0/0x10c)
[ 1243.271882] [<c0347010>] (sys_recvfrom+0xa0/0x10c) from [<c0347098>] (sys_recv+0x1c/0x20)
[ 1243.280103] [<c0347098>] (sys_recv+0x1c/0x20) from [<c0008c40>] (ret_fast_syscall+0x0/0x2c)
[ 1271.003110] BUG: soft lockup - CPU#0 stuck for 22s! [cnucnu:586]
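
For anyone triaging this, the backtrace above reads more easily as a plain call chain. The following is only a rough sketch for condensing it, not a tool that was used in this report: the names FRAME_RE, call_chain and SAMPLE are made up for illustration, and the regular expression assumes the "[<addr>] (callee+0xOFF/0xSIZE) from [<addr>] (caller+0xOFF/0xSIZE)" layout seen in this particular log.

```python
import re

# Assumed layout of one ARM unwinder line (taken from the log in this report):
#   [<addr>] (callee+0xOFF/0xSIZE) from [<addr>] (caller+0xOFF/0xSIZE)
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\] \(([\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+\) from "
    r"\[<[0-9a-f]+>\] \(([\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+\)"
)


def call_chain(log_text):
    """Return the function names in the trace, outermost caller first."""
    frames = []
    for line in log_text.splitlines():
        match = FRAME_RE.search(line)
        if match is None:
            continue  # register dumps, module lists, etc.
        callee, caller = match.groups()
        # Consecutive lines repeat the previous caller as the new callee,
        # so only add the callee when it is not already the last frame.
        if not frames or frames[-1] != callee:
            frames.append(callee)
        frames.append(caller)
    frames.reverse()  # the log lists the innermost frame first
    return frames


# Three consecutive lines copied from the trace above.
SAMPLE = """\
[ 1243.185195] [<c042bcf4>] (__irq_svc+0x34/0x80) from [<c00137c4>] (feroceon_l2_inv_range+0x18/0xb8)
[ 1243.194205] [<c00137c4>] (feroceon_l2_inv_range+0x18/0xb8) from [<c000fb9c>] (___dma_page_dev_to_cpu+0x5c/0xcc)
[ 1243.204350] [<c000fb9c>] (___dma_page_dev_to_cpu+0x5c/0xcc) from [<c022c6c8>] (dma_async_memcpy_buf_to_pg+0x15c/0x1a8)
"""

if __name__ == "__main__":
    print(" -> ".join(call_chain(SAMPLE)))
```

Run over the full log, this shows the interrupted code path going from sys_recv through tcp_recvmsg and the DMA copy helpers down to feroceon_l2_inv_range, which is where the timer interrupt (and with it the watchdog) fired.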

Version-Release number of selected component (if applicable):
3.4.2-3.fc17.armv5tel.kirkwood 

How reproducible:
always

Steps to Reproduce:
1. Install Fedora ARM 17 on a seagate dockstar
2. Install cnucnu from Fedora 18 (it is a noarch python package)
3. run cnucnu:
cnucnu.py report-outdated  |&tee -a cnucnu.log | tee -a cnucnu-last.log
  
Actual results:
ssh dies after a few seconds, and the kernel prints soft lockup bug messages.

Expected results:
cnucnu should just run

Additional info:
The newer kernel-kirkwood package does not allow the dockstar to boot, therefore I could not test it.

Comment 1 Peter Robinson 2013-01-09 10:45:58 UTC
Can you retest with a F-18 3.6.x kernel (or image) or even a rawhide 3.7 kernel?

Comment 2 Till Maas 2013-01-09 23:11:13 UTC
(In reply to comment #1)
> Can you retest with a F-18 3.6.x kernel (or image) or even a rawhide 3.7
> kernel?

I cannot run the tool that triggered the bug on F18:
$ cnucnu report-outdated
Illegal instruction

$ uname -a
Linux kirkwood-f18-v5tel 3.6.10-6.fc18.armv5tel.kirkwood #1 Mon Dec 17 14:58:08 EST 2012 armv5tel armv5tel armv5tel GNU/Linux

Comment 3 Till Maas 2013-01-11 20:28:09 UTC
Created attachment 677084 [details]
bug reproduced with Fedora 18 Beta

The bug is still present in 3.6.10-6.fc18.armv5tel.kirkwood.

Comment 4 Peter Robinson 2013-03-31 18:56:14 UTC
Have you tested a 3.8.x kernel?

Comment 5 Peter Robinson 2013-04-13 08:16:40 UTC
Closing as there has been no response and this is likely fixed in 3.8.x.