Bug 1769600
Summary: | power9 boxes cannot successfully boot any Fedora image with qemu-4.1.0-2.fc31 (pseries-4.1 machine) | ||||||
---|---|---|---|---|---|---|---|
Product: | [Fedora] Fedora | Reporter: | Adam Williamson <awilliam> | ||||
Component: | kernel | Assignee: | Kernel Maintainer List <kernel-maint> | ||||
Status: | CLOSED ERRATA | QA Contact: | Fedora Extras Quality Assurance <extras-qa> | ||||
Severity: | urgent | Docs Contact: | |||||
Priority: | unspecified | ||||||
Version: | 31 | CC: | airlied, amit, berrange, bskeggs, cfergeau, clg, dan, dwmw2, hdegoede, ichavero, itamar, jarodwilson, jeremy, jglisse, john.j5live, jonathan, josef, kernel-maint, kevin, linville, lvivier, masami256, mchehab, menantea, mjg59, normand, pbonzini, rhcn, rjones, steved, virt-maint | ||||
Target Milestone: | --- | ||||||
Target Release: | --- | ||||||
Hardware: | ppc64le | ||||||
OS: | Linux | ||||||
Whiteboard: | |||||||
Fixed In Version: | kernel-5.3.15-300.fc31 kernel-5.3.15-200.fc30 | Doc Type: | If docs needed, set a value | ||||
Doc Text: | Story Points: | --- | |||||
Clone Of: | Environment: | ||||||
Last Closed: | 2019-12-10 02:55:05 UTC | Type: | Bug | ||||
Regression: | --- | Mount Type: | --- | ||||
Documentation: | --- | CRM: | |||||
Verified Versions: | Category: | --- | |||||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
Cloudforms Team: | --- | Target Upstream Version: | |||||
Embargoed: | |||||||
Bug Depends On: | |||||||
Bug Blocks: | 1071880 | ||||||
Description
Adam Williamson
2019-11-06 22:56:38 UTC
So I rebuilt the current F30 qemu - qemu-3.1.1-2.fc30 - for F31 (I had to disable tests and backport a couple of build fix patches). With that qemu, things work again. So the problem is something between that version of qemu and the version in F31 (qemu-4.1.0-2.fc31).

I just tested running qemu directly at a console without a graphical device, and that gets me a traceback:

[ 0.015468] ------------[ cut here ]------------
[ 0.015518] kernel BUG at arch/powerpc/include/asm/book3s/64/pgtable.h:612!
[ 0.015578] Oops: Exception in kernel mode, sig: 5 [#1]
[ 0.015627] LE PAGE_SIZE=64K MMU=Radix MMU=Hash SMP NR_CPUS=1024 NUMA pSeries
[ 0.015697] Modules linked in:
[ 0.015739] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.4.0-0.rc6.git0.1.fc32.ppc64le #1
[ 0.015812] NIP: c000000000f63294 LR: c000000000f62e44 CTR: 0000000000000000
[ 0.015889] REGS: c0000000fa45f0d0 TRAP: 0700 Not tainted (5.4.0-0.rc6.git0.1.fc32.ppc64le)
[ 0.015971] MSR: 8000000002029033 <SF,VEC,EE,ME,IR,DR,RI,LE> CR: 44000424 XER: 00000000
[ 0.016050] CFAR: c000000000f63128 IRQMASK: 0
[ 0.016050] GPR00: c000000000f62e44 c0000000fa45f360 c000000001be5400 0000000000000000
[ 0.016050] GPR04: c0000000019c7d38 c0000000fa340030 00000000fa330009 c000000001c15e18
[ 0.016050] GPR08: 0000000000000040 ffe0000000000000 0000000000000000 8418dd352dbd190f
[ 0.016050] GPR12: 0000000000000000 c000000001e00000 c00a000080060000 c00a000080060000
[ 0.016050] GPR16: 0000ffffffffffff 80000000000001ae c000000001c24d98 ffffffffffff0000
[ 0.016050] GPR20: c00a00008007ffff c000000001cafca0 c00a00008007ffff ffffffffffff0000
[ 0.016050] GPR24: c00a000080080000 c00a000080080000 c000000001cafca8 c00a000080080000
[ 0.016050] GPR28: c0000000fa32e010 c00a000080060000 ffffffffffff0000 c0000000fa330000
[ 0.016711] NIP [c000000000f63294] ioremap_page_range+0x4c4/0x6e0
[ 0.016778] LR [c000000000f62e44] ioremap_page_range+0x74/0x6e0
[ 0.016846] Call Trace:
[ 0.016876] [c0000000fa45f360] [c000000000f62e44] ioremap_page_range+0x74/0x6e0 (unreliable)
[ 0.016969] [c0000000fa45f460] [c0000000000934bc] do_ioremap+0x8c/0x120
[ 0.017037] [c0000000fa45f4b0] [c0000000000938e8] __ioremap_caller+0x128/0x140
[ 0.017116] [c0000000fa45f500] [c0000000000931a0] ioremap+0x30/0x50
[ 0.017184] [c0000000fa45f520] [c0000000000d1380] xive_spapr_populate_irq_data+0x170/0x260
[ 0.017263] [c0000000fa45f5c0] [c0000000000cc90c] xive_irq_domain_map+0x8c/0x170
[ 0.017344] [c0000000fa45f600] [c000000000219124] irq_domain_associate+0xb4/0x2d0
[ 0.017424] [c0000000fa45f690] [c000000000219fe0] irq_create_mapping+0x1e0/0x3b0
[ 0.017506] [c0000000fa45f730] [c00000000021ad6c] irq_create_fwspec_mapping+0x27c/0x3e0
[ 0.017586] [c0000000fa45f7c0] [c00000000021af68] irq_create_of_mapping+0x98/0xb0
[ 0.017666] [c0000000fa45f830] [c0000000008d4e48] of_irq_parse_and_map_pci+0x168/0x230
[ 0.017746] [c0000000fa45f910] [c000000000075428] pcibios_setup_device+0x88/0x250
[ 0.017826] [c0000000fa45f9a0] [c000000000077b84] pcibios_setup_bus_devices+0x54/0x100
[ 0.017906] [c0000000fa45fa10] [c0000000000793f0] __of_scan_bus+0x160/0x310
[ 0.017973] [c0000000fa45faf0] [c000000000075fc0] pcibios_scan_phb+0x330/0x390
[ 0.018054] [c0000000fa45fba0] [c00000000139217c] pcibios_init+0x8c/0x128
[ 0.018121] [c0000000fa45fc20] [c0000000000107b0] do_one_initcall+0x60/0x2c0
[ 0.018201] [c0000000fa45fcf0] [c000000001384624] kernel_init_freeable+0x290/0x378
[ 0.018280] [c0000000fa45fdb0] [c000000000010d24] kernel_init+0x2c/0x148
[ 0.018348] [c0000000fa45fe20] [c00000000000bdbc] ret_from_kernel_thread+0x5c/0x80
[ 0.018427] Instruction dump:
[ 0.018468] 41820014 3920fe7f 7d494838 7d290074 7929d182 f8e10038 69290001 0b090000
[ 0.018552] 7a098420 0b090000 7bc95960 7929a802 <0b090000> 7fc68b78 e8610048 7dc47378
[ 0.018636] ---[ end trace 85d1e7e46925cee9 ]---

Using machine type pseries-3.1 or pseries-4.0 - instead of the default pseries-4.1 - works. So this is something to do with the pseries-4.1 machine type.
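One detail worth pulling out of the trace: the word in angle brackets in the instruction dump, 0b090000, is the instruction at NIP. The following is a minimal Python sketch for decoding it, assuming the standard PowerPC D-form layout for trap-immediate instructions (6-bit primary opcode, 5-bit TO, 5-bit RA, 16-bit SI). The function names are made up for illustration and are not part of any kernel or QEMU tooling.

```python
import re


def faulting_word(dump_line):
    """Return the <...>-bracketed word of an 'Instruction dump' line as an int."""
    match = re.search(r"<([0-9a-fA-F]{8})>", dump_line)
    if match is None:
        raise ValueError("no bracketed instruction word found")
    return int(match.group(1), 16)


def decode_trap_immediate(word):
    """Decode a 32-bit PowerPC trap-immediate word into (mnemonic, TO, RA, SI)."""
    opcode = (word >> 26) & 0x3F  # primary opcode: top 6 bits
    to = (word >> 21) & 0x1F      # TO: 5-bit trap condition mask
    ra = (word >> 16) & 0x1F      # RA: register being compared
    si = word & 0xFFFF            # SI: 16-bit signed immediate
    if si & 0x8000:
        si -= 0x10000
    mnemonics = {2: "tdi", 3: "twi"}  # doubleword / word trap immediate
    if opcode not in mnemonics:
        raise ValueError("not a trap-immediate instruction")
    return mnemonics[opcode], to, ra, si


line = "7a098420 0b090000 7bc95960 7929a802 <0b090000> 7fc68b78 e8610048 7dc47378"
print(decode_trap_immediate(faulting_word(line)))
```

For 0b090000 this gives tdi with TO=24 (the "less than" and "greater than" bits, i.e. "not equal"), RA=9 and SI=0: trap if r9 is non-zero. That would be consistent with a failed BUG_ON-style check and with the "TRAP: 0700" and "sig: 5" lines, though the trace itself does not show which assertion at pgtable.h:612 fired.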
*** Bug 1769445 has been marked as a duplicate of this bug. ***

This happens because by default interrupt mode is dual with pseries-4.1 and on POWER9 it will switch to xive. You can try starting the default machine forcing the interrupt mode with "-M pseries,ic-mode=xics"

Cédric, any idea about this problem with XIVE?

What is the host kernel, and which firmware is being used on the system?

[root@openqa-ppc64le-02 adamwill][PROD]# uname -r
5.3.7-301.fc31.ppc64le
[root@openqa-ppc64le-02 adamwill][PROD]# lsmcode
Version of System Firmware :
Product Name : OpenPOWER Firmware
Product Version : SUPERMICRO-P9DSU-V1.16-20180531-prod
Product Extra : skiboot-v6.0-p1da203b
Product Extra : bmc-firmware-version-1.27
Product Extra : occ-77bb5e6-p623d1cd
Product Extra : hostboot-f911e5c-pda8239f
Product Extra : machine-xml-218a77a
Product Extra : sbe-8e0105e
Product Extra : hcode-hw051018a.op920
Product Extra : petitboot-v1.7.1-pf773c0d
Product Extra : linux-4.16.7-openpower2-pbc45895

This is a Boston system. KVM XIVE native support on these systems is partial because the FW is a little old and QEMU runs with an equivalent of kernel_irqchip=off.

Could you attach the .config file of the guest kernel please?

I am seeing this also on another power9 box.
It had:

Version of System Firmware :
Product Name : OpenPOWER Firmware
Product Version : SUPERMICRO-P9DSU-V2.10-20190208-prod
Product Extra : skiboot-v6.0.16
Product Extra : bmc-firmware-version-2.04
Product Extra : occ-39d7745
Product Extra : hostboot-3c093dc-pc0ab4f8
Product Extra : buildroot-2018.05.1-9-gc99f2ee
Product Extra : capp-ucode-p9-dd2-v4
Product Extra : machine-xml-218a77a
Product Extra : hostboot-binaries-hw020419a.op920
Product Extra : sbe-9515af0
Product Extra : hcode-hw020719a.op920
Product Extra : petitboot-v1.7.5-p79ec4a8
Product Extra : linux-4.17.12-openpower1-ped131c9

and I updated to the latest firmware I could find:

Product Name : OpenPOWER Firmware
Product Version : SUPERMICRO-P9DSU-V2.14-20190807-prod
Product Extra : skiboot-v6.0.20
Product Extra : bmc-firmware-version-2.07
Product Extra : occ-8fa3854
Product Extra : hostboot-8591ded-p4f715ce
Product Extra : buildroot-2018.11.3-12-g222837a
Product Extra : capp-ucode-p9-dd2-v4
Product Extra : machine-xml-734a35e
Product Extra : hostboot-binaries-hw072719a.op920
Product Extra : sbe-b6ee17b
Product Extra : hcode-hw072719a.op920
Product Extra : petitboot-v1.7.5-p11ed908
Product Extra : linux-4.19.57-openpower1-p48ee860

no change. Passing -machine pseries-4.0 works fine.

Any news here? This is causing some rawhide images to fail... perhaps the default could be moved back to pseries-4.0 in f31 on ppc64le for now? Or is there any workaround that would let us change that default globally? (passing -machine is not really an option since we would need to modify all the various things that call qemu: imagefactory/oz/virt-install/etc).

I cannot reproduce with mainline. Could you share the kernel .config file please?

I am using 5.3.11-300.fc31.ppc64le stock fedora kernel here.

I could reproduce with these guest kernels: https://dl.fedoraproject.org/pub/alt/rawhide-kernel-nodebug/ppc64le/

I suspect an ioremap(-1) in the kernel which was not failing before.
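For anyone scripting around this in the meantime, the two per-invocation workarounds mentioned so far (pinning an older machine type, or forcing the interrupt mode with ic-mode=xics) amount to a different value for QEMU's machine option. A minimal Python sketch of building that argument follows; the helper name is hypothetical, the accepted ic-mode values are the three named in this bug (xics, xive, dual), and it does nothing about changing the default globally.

```python
def pseries_machine_args(machine_type="pseries", ic_mode=None):
    """Return the QEMU argv fragment that selects a pseries machine.

    machine_type: e.g. "pseries" (the default alias) or "pseries-4.0".
    ic_mode: None to leave the interrupt mode alone, else one of the
             modes named in this bug report.
    """
    if ic_mode is not None and ic_mode not in ("xics", "xive", "dual"):
        raise ValueError("unexpected ic-mode: %r" % (ic_mode,))
    value = machine_type if ic_mode is None else "%s,ic-mode=%s" % (machine_type, ic_mode)
    return ["-machine", value]


# Workaround 1: fall back to the older machine type.
print(pseries_machine_args("pseries-4.0"))
# Workaround 2: keep the default machine but force XICS.
print(pseries_machine_args(ic_mode="xics"))
```

The returned list is meant to be spliced into whatever argv a caller already passes to qemu-system-ppc64; "-M" used elsewhere in this bug is the short form of the same option.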
This is a kernel bug in the XIVE sPAPR driver for the INTx PCI interrupts which are LSI. These are special and we should not be doing the ioremap. This is failing in Linux 5.4 (+CONFIG_DEBUG_VM).

Created attachment 1641728
powerpc/xive: skip ioremap() of ESB pages for LSI interrupts
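The attached patch is not reproduced in this report. As a rough illustration of the idea in its title only, here is a toy Python model (not kernel code): the mapping step is skipped for LSI sources, so the invalid address suspected earlier never reaches the mapping routine. Every name here, including the is_lsi flag and the all-ones "invalid" address, is an assumption made for the sketch.

```python
INVALID_ADDR = 0xFFFFFFFFFFFFFFFF  # models the "-1" address suspected earlier


class MappingError(Exception):
    """Raised by the toy ioremap() when asked to map an invalid address."""


def ioremap(addr, size):
    """Toy stand-in for ioremap(): refuses the all-ones address."""
    if addr == INVALID_ADDR:
        raise MappingError("refusing to map invalid address")
    return ("mmio", addr, size)


def populate_irq_data(eoi_page, esb_shift, is_lsi):
    """Fill in per-interrupt data, skipping the ESB mapping for LSIs."""
    data = {"eoi_page": eoi_page, "esb_shift": esb_shift, "eoi_mmio": None}
    if is_lsi:
        # LSI sources have no usable ESB pages in this model: do not map.
        return data
    data["eoi_mmio"] = ioremap(eoi_page, 1 << esb_shift)
    return data


# An LSI source with an invalid page address no longer reaches ioremap().
print(populate_irq_data(INVALID_ADDR, 16, is_lsi=True))
# A regular source is still mapped.
print(populate_irq_data(0x10000, 16, is_lsi=False))
```

Without the is_lsi check, the first call would raise MappingError, which is the toy equivalent of the BUG in the traceback.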
Thanks, Cedric. The patch has been posted as https://lists.ozlabs.org/pipermail/linuxppc-dev/2019-December/201480.html

FEDORA-2019-7795371386 has been submitted as an update to Fedora 30. https://bodhi.fedoraproject.org/updates/FEDORA-2019-7795371386

Two questions here... Does this fix need to be in the host? Or the guest? or both? Should this work for stable kernels too? or is there something in newer kernels that would cause it to work, but not work backported to older releases?

kernel-5.3.15-200.fc30 has been pushed to the Fedora 30 testing repository. If problems still persist, please make note of it in this bug report. See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates. You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2019-7795371386

kernel-5.3.15-300.fc31 has been pushed to the Fedora 31 testing repository. If problems still persist, please make note of it in this bug report. See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates. You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2019-985cb39611

> Does this fix need to be in the host? Or the guest? or both?

Guest side only.

> Should this work for stable kernels too?

Yes. I sent the patch to stable@ also.

> or is there something in newer kernels that would cause it to
> work, but not work backported to older releases?

Backports should be fine. The issue only shows up on Linux 5.4 plus CONFIG_DEBUG_VM.

kernel-5.3.15-300.fc31 has been pushed to the Fedora 31 stable repository. If problems still persist, please make note of it in this bug report.

kernel-5.3.15-200.fc30 has been pushed to the Fedora 30 stable repository. If problems still persist, please make note of it in this bug report.

Kevin asked me to test this, but the easiest thing for me to test with is Rawhide images.
Two issues there: I am not 100% sure whether the fix for this is actually in the Rawhide kernel yet (it seems clear it was specifically backported to f30 and f31 kernels, but I cannot tell for sure if it's in Rawhide kernel too), and we haven't had ppc64le images in Rawhide composes since 20191205.n.0. This seems to be because ppc64le kernel builds were turned off for some reason around then and only turned back on yesterday. The next Rawhide compose should get ppc64le images, I'll see if those work when they show up.

Don't know if it helps here, but libguestfs has started working again on Rawhide ppc64le, whereas it was broken until yesterday because of (variously) missing kernel or kernel didn't boot on qemu TCG. For example this build uses libguestfs for some testing: https://koji.fedoraproject.org/koji/buildinfo?buildID=1421132

could be indicative, yeah. I know why the kernels went missing and it doesn't have anything to do with this bug besides making it harder to verify the fix, but if cases where we *had* kernels were previously failing to boot but are now booting with the recently-completed kernel build, that's a good sign.

This does seem to be fixed for me, at least - I booted today's Rawhide Server netinst image on openqa-ppc64le-02 with `-M pseries-4.1` and it seems to have booted fine, didn't hit the traceback.

I guess I'm hitting some other issue...
libguestfs-test-tool doesn't work either on host or in guests:

Preparing to boot Linux version 5.3.15-300.fc31.ppc64le (mockbuild.fedoraproject.org) (gcc version 9.2.1 20190827 (Red Hat 9.2.1-1) (GCC)) #1 SMP Thu Dec 5 14:47:38 UTC 2019
Detected machine type: 0000000000000101
command line: panic=1 console=hvc0 console=ttyS0 edd=off udevtimeout=6000 udev.event-timeout=6000 no_timer_check printk.time=1 cgroup_disable=memory usbcore.nousb cryptomgr.notests tsc=reliable 8250.nr_uarts=1 root=/dev/sdb selinux=0 guestfs_verbose=1 TERM=screen
Max number of cores passed to firmware: 1024 (NR_CPUS = 1024)
Calling ibm,client-architecture-support...libguestfs: error: appliance closed the connection unexpectedly, see earlier error messages
libguestfs: child_cleanup: 0x14089dad0: child process died
libguestfs: error: guestfs_launch failed, see earlier error messages
libguestfs: trace: launch = -1 (error)
libguestfs: trace: close
libguestfs: closing guestfs handle 0x14089dad0 (state 0)

The cloud and container images are failing in a step where it runs libguestfs on the image... https://koji.fedoraproject.org/koji/taskinfo?taskID=39510967

Shall I file a new libguestfs bug on that and we can debug further?

I think so? At least, I'm pretty sure the issue as I first filed it is fixed. A new bug can't hurt.

FYI, I've filed: https://bugzilla.redhat.com/show_bug.cgi?id=1784961 on the guestfs issues. This is what is preventing f30/f31/rawhide cloud and containers from composing on ppc64le.

It might be related to qemu starting, erroring and restarting:

qemu-system-ppc64: warning: kernel_irqchip allowed but unavailable: IRQ_XIVE capability must be present for KVM
Falling back to kernel-irqchip=off

QEMU warns that it is using the XIVE emulated device and not the KVM XIVE device because the support is not available on the host, the reason being the lack of migration support in the FW, like on Boston systems. It should work just the same, a little slower if you measure performance.
Cedric, the problem we are experiencing is the silent restart of the VM after it warns about "IRQ_XIVE capability". Some tools like libguestfs don't expect such behaviour and fail, see bug 1784961.

When a new interrupt mode is negotiated (XICS -> XIVE) between the guest OS and the hypervisor, the device tree is updated and the machine is reset. This is a "standard" procedure in the PAPR environment but yes, it can be a problem for the libvirt tools. QEMU 5.0 has a set of changes that get rid of this reset.

Thanks, Cedric, makes sense.
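The reset described above can be pictured with a small Python model of the boot sequence as seen from outside the VM. It is only a sketch of the explanation in this comment: the event names are invented, and the reset_on_change=False branch merely models "no reset" without claiming anything about how QEMU 5.0 achieves it.

```python
def boot_events(start_mode, negotiated_mode, reset_on_change=True):
    """List the externally visible steps of a guest boot in this toy model."""
    events = ["boot:" + start_mode]
    if negotiated_mode == start_mode:
        return events  # nothing to negotiate, single boot
    events.append("negotiate:" + start_mode + "->" + negotiated_mode)
    events.append("update-device-tree")
    if reset_on_change:
        # The behaviour described above: machine reset, then a second boot.
        events.append("machine-reset")
        events.append("boot:" + negotiated_mode)
    return events


def count_boots(events):
    """Number of times the guest starts booting."""
    return sum(1 for event in events if event.startswith("boot:"))


print(boot_events("xics", "xive"))
print(count_boots(boot_events("xics", "xive")))
print(count_boots(boot_events("xics", "xive", reset_on_change=False)))
```

A tool that assumes exactly one boot per launch sees two in the first case, which matches the "silent restart" that trips up libguestfs here.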