Bug 1194366
| Summary: | ftrace writes to random memory when loading a module | | |
|---|---|---|---|
| Product: | [Fedora] Fedora | Reporter: | Richard W.M. Jones <rjones> |
| Component: | kernel | Assignee: | Kernel Maintainer List <kernel-maint> |
| Status: | CLOSED RAWHIDE | QA Contact: | Fedora Extras Quality Assurance <extras-qa> |
| Severity: | unspecified | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | rawhide | CC: | drjones, gansalmon, itamar, jonathan, kernel-maint, madhu.chinakonda, mchehab, msalter |
| Target Milestone: | --- | | |
| Target Release: | --- | | |
| Hardware: | aarch64 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | kernel-4.0.0-0.rc1.git0.2.fc23 | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2015-03-25 12:24:59 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 910269 | | |
| Attachments: | | | |

Description
Richard W.M. Jones
2015-02-19 16:28:23 UTC
I should note that what specifically happens is the guest starts up, and then suddenly aborts (when inserting the guest kernel module).

In the test above, I was using guest kernel == host kernel == 3.20.0-0.rc0.git7.3.bz1193875.fc23.

I tried this again, with:

guest kernel == 3.20.0-0.rc0.git7.3.bz1193875.fc23
host kernel == 3.19.0-0.rc7.git1.1.fc22

This *also* crashes when loading the kernel module in the guest. So it seems as if the problem is that some new processor instruction is used in a kernel module (possibly crc32-arm64.ko), which KVM is unable to emulate.

A couple of other observations:

(1) Doesn't fail with host == guest == 3.19.0-0.rc7.git1.1.fc22.aarch64

(2) I'm very certain the troublesome kernel module is either `crc32-arm64.ko' or `crc32.ko', and I'm about 90% certain it is `crc32-arm64.ko'.

Created attachment 993700
crc32-arm64.ko

It occurred to me that maybe people wouldn't be able to download the suspected modules, so I've attached them here.

Created attachment 993701
crc32.ko

Here is a diff of the instructions used in crc32-arm64 3.19 vs 3.20.

 ldrh
 mov
 mvn
+nop
 orr
 ret

Seems strange. I checked the 3.20 module and there is a nop instruction inserted after every ret or unconditional branch. I have no idea if nop would be a problem on aarch64. Maybe this is a wild goose chase.

I noticed that qemu dumps the registers on stderr before exiting:

error: kvm run failed Function not implemented
PC=fffffe000046cf4c SP=fffffe0028383ba0
X00=fffffdfffaa20020 X01=fffffe0028383c00 X02=fffffffffffffffc X03=00000000d503201f
X04=fffffdfffaa20024 X05=ffffffffffffffff X06=0000000000000bb0 X07=fffffe0001a3c3b8
X08=fffffe0028380000 X09=fffffe0000f91000 X10=fffffe0001cfc000 X11=fffffe000123b000
X12=0000000000000000 X13=fffffe0001a3b808 X14=ffff000000000000 X15=ffffffffffffffff
X16=fffffe0000165898 X17=0000000000000001 X18=0000000000000d71 X19=0000040000000000
X20=fffffdfffc000020 X21=0000000000000140 X22=fffffe0029471180 X23=fffffe0000f218e8
X24=0000000000000000 X25=0000000000000000 X26=fffffe0001d6d000 X27=fffffdfffc0007a8
X28=fffffe0029660000 X29=fffffe0028383ba0 X30=fffffe00001e1bb0
PSTATE=600001c5 (flags -ZC-)

Not very helpful without knowing the address space layout of the guest kernel.

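One detail in the qemu dump can be checked mechanically: X03 holds 0xd503201f, which is the A64 encoding of NOP, and X04 is exactly 4 bytes past X00. The Python sketch below parses an excerpt of the dump and confirms both. `parse_qemu_regs` is a hypothetical helper written for this illustration and is not part of qemu. Reading the result as "a single NOP-sized word was being stored at X00" is an inference from the register values, not something the dump states.

```python
import re

A64_NOP = 0xD503201F  # architectural A64 encoding of NOP (HINT #0)

# Excerpt of the qemu register dump quoted above.
DUMP = """
PC=fffffe000046cf4c SP=fffffe0028383ba0
X00=fffffdfffaa20020 X01=fffffe0028383c00 X02=fffffffffffffffc X03=00000000d503201f
X04=fffffdfffaa20024 X05=ffffffffffffffff X06=0000000000000bb0 X07=fffffe0001a3c3b8
"""

def parse_qemu_regs(text):
    """Hypothetical helper: map each NAME=hex pair in the dump to an int."""
    pairs = re.findall(r"\b([A-Z]+\d*)=([0-9a-fA-F]+)", text)
    return {name: int(value, 16) for name, value in pairs}

regs = parse_qemu_regs(DUMP)

# The low 32 bits of X03 are what a 32-bit store of w3 would write.
low_word = regs["X03"] & 0xFFFFFFFF
holds_nop = low_word == A64_NOP

# Distance between the two pointer-like registers X00 and X04.
span = regs["X04"] - regs["X00"]

print(f"X03 low word = {low_word:#010x}, equals NOP encoding: {holds_nop}")
print(f"X04 - X00 = {span} bytes")
```

If the value in X03 were anything other than an instruction encoding, the "new unemulated instruction" theory would look more plausible; a NOP word in a data register suggests code is being written rather than executed.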
I resolved PC against the symbol table, and it happens in the guest kernel function '__copy_to_user', at the place marked with <<< below:

fffffe00003e3040 <__copy_to_user>:
fffffe00003e3040:  8b020004  add   x4, x0, x2
fffffe00003e3044:  f1002042  subs  x2, x2, #0x8
fffffe00003e3048:  540000a4  b.mi  fffffe00003e305c <__copy_to_user+0x1c>
fffffe00003e304c:  f8408423  ldr   x3, [x1],#8
fffffe00003e3050:  f1002042  subs  x2, x2, #0x8
fffffe00003e3054:  f8008403  str   x3, [x0],#8
fffffe00003e3058:  54ffffa5  b.pl  fffffe00003e304c <__copy_to_user+0xc>
fffffe00003e305c:  b1001042  adds  x2, x2, #0x4
fffffe00003e3060:  54000084  b.mi  fffffe00003e3070 <__copy_to_user+0x30>
fffffe00003e3064:  b8404423  ldr   w3, [x1],#4
fffffe00003e3068:  d1001042  sub   x2, x2, #0x4
fffffe00003e306c:  b8004403  str   w3, [x0],#4    <<<<<<<
fffffe00003e3070:  b1000842  adds  x2, x2, #0x2
fffffe00003e3074:  54000084  b.mi  fffffe00003e3084 <__copy_to_user+0x44>
fffffe00003e3078:  78402423  ldrh  w3, [x1],#2
fffffe00003e307c:  d1000842  sub   x2, x2, #0x2
fffffe00003e3080:  78002403  strh  w3, [x0],#2
fffffe00003e3084:  b1000442  adds  x2, x2, #0x1
fffffe00003e3088:  54000064  b.mi  fffffe00003e3094 <__copy_to_user+0x54>
fffffe00003e308c:  39400023  ldrb  w3, [x1]
fffffe00003e3090:  39000003  strb  w3, [x0]
fffffe00003e3094:  d2800000  mov   x0, #0x0    // #0
fffffe00003e3098:  d65f03c0  ret

Unfortunately qemu doesn't dump a stack trace before it exits. I will try to attach gdb to see if that gives any extra information.

gdb gives this stack trace, which looks bogus to me:

Program received signal SIGABRT, Aborted.
__copy_to_user () at arch/arm64/lib/copy_to_user.S:43
43      USER(9f, str w3, [x0], #4 )
(gdb) bt
#0  __copy_to_user () at arch/arm64/lib/copy_to_user.S:43
#1  0xfffffe00001a6558 in __probe_kernel_write (dst=<optimized out>, src=<optimized out>, size=<optimized out>) at mm/maccess.c:56
#2  0x0000000000000000 in ?? ()

More gdb information:

(gdb) info registers
x0    0xfffffdfffaa20020  -2199113301984
x1    0xfffffe0028343c20  -2198348743648
x2    0xfffffffffffffffc  -4
x3    0xd503201f          3573751839
x4    0xfffffdfffaa20024  -2199113301980
x5    0xffffffffffffffff  -1
x6    0xfffffe0000a1b588  -2199012657784
x7    0xfffffe0000a1b570  -2199012657808
x8    0xfffffe0000a1b558  -2199012657832
x9    0xfffffdfee01a4480  -2203853372288
x10   0x101010101010101   72340172838076673
x11   0x6                 6
x12   0x0                 0
x13   0xffffffffffffffff  -1
x14   0xffff000000000000  -281474976710656
x15   0xffffffffffffffff  -1
x16   0xfffffe000013a5e0  -2199021967904
x17   0x1                 1
x18   0x0                 0
x19   0x40000000000       4398046511104
x20   0xfffffdfffc000020  -2199090364384
x21   0x140               320
x22   0x0                 0
x23   0xfffffe0000dc17d8  -2199008831528
x24   0xfffffe000009c5b0  -2199022615120
x25   0xfffffe0000f65000  -2199007113216
x26   0x0                 0
x27   0x0                 0
x28   0xfffffe0029120000  -2198334210048
x29   0xfffffe0028343bc0  -2198348743744
x30   0xfffffe00001a6558  -2199021525672
sp    0xfffffe0028343bc0  0xfffffe0028343bc0
pc    0xfffffe00003e306c  0xfffffe00003e306c <__copy_to_user+44>
cpsr  0x600001c5          1610613189
fpsr  0x0                 0
fpcr  0x0                 0
(gdb) frame 1
#1  0xfffffe00001a6558 in __probe_kernel_write (dst=<optimized out>, src=<optimized out>, size=<optimized out>) at mm/maccess.c:56
56      ret = __copy_to_user_inatomic((__force void __user *)dst, src, size);

ftrace is implicated:

https://lists.cs.columbia.edu/pipermail/kvmarm/2015-February/013652.html

Marc Zyngier posted a patch here which works for me:

http://lists.infradead.org/pipermail/linux-arm-kernel/2015-February/325445.html

I intend to add this to the kernel package in Rawhide unless someone gets there first.

Similar new bug in 4.2.0:

https://bugzilla.redhat.com/show_bug.cgi?id=1269779
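
For anyone repeating the "resolve PC against the symbol table" step from the earlier comment, it amounts to finding the last symbol whose address is at or below the PC. A minimal Python sketch follows, assuming a sorted list of (address, name) pairs such as one might build from System.map. Only the `__copy_to_user` address is taken from the disassembly in this bug; the two neighbouring entries and the `resolve` function are hypothetical placeholders for illustration.

```python
import bisect

# Sorted (address, name) pairs. Only __copy_to_user's address comes from the
# disassembly in this bug; the other two entries are made-up placeholders.
SYMBOLS = [
    (0xfffffe00003e3000, "placeholder_before"),
    (0xfffffe00003e3040, "__copy_to_user"),
    (0xfffffe00003e30c0, "placeholder_after"),
]

def resolve(pc, symbols):
    """Return (name, offset) for the symbol containing pc, or None."""
    addrs = [addr for addr, _ in symbols]
    # Index of the last symbol whose start address is <= pc.
    i = bisect.bisect_right(addrs, pc) - 1
    if i < 0:
        return None  # pc lies below the first known symbol
    addr, name = symbols[i]
    return name, pc - addr

name, offset = resolve(0xfffffe00003e306c, SYMBOLS)
print(f"{name}+{offset}")  # prints "__copy_to_user+44"
```

The offset of 44 (0x2c) agrees with the `<__copy_to_user+44>` annotation gdb prints for pc in the register listing above.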