Bug 1652845 - VM installation hangs with error "Fixing recursive fault but reboot is needed!" on F29 ppc64le box
Summary: VM installation hangs with error "Fixing recursive fault but reboot is needed!" on F29 ppc64le box
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Fedora
Classification: Fedora
Component: qemu
Version: 29
Hardware: ppc64le
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Fedora Virtualization Maintainers
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2018-11-23 09:15 UTC by Sinny Kumari
Modified: 2019-02-21 05:15 UTC
CC List: 30 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-02-21 02:01:05 UTC
Type: Bug
Embargoed:


Attachments (Terms of Use)
virt-install log (26.20 KB, text/plain)
2018-11-23 09:16 UTC, Sinny Kumari

Description Sinny Kumari 2018-11-23 09:15:40 UTC
Description of problem:
Installing a vm on Fedora 29 hangs with error message "Fixing recursive fault but reboot is needed!"

# virt-install --name fah29-20181123 --ram 2048 --vcpus 1 --os-type=linux --os-variant=fedora26 --disk path=/var/lib/libvirt/images/f29_20181123.qcow2,size=6,bus=virtio,format=qcow2 --network bridge=virbr0 --nographic -c  Fedora-AtomicHost-ostree-ppc64le-29-20181123.0.iso

Starting install...
Allocating 'f29_20181123.qcow2'                                                                                                    | 6.0 GB  00:00:00     
Connected to domain fah29-20181123
Escape character is ^]
 26
ERROR: Unhandled relocation (A) type 26

...

[    2.027490] Instruction dump:
[    2.027519] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX 
[    2.027574] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX 
[    2.027632] ---[ end trace ed4dce15a6022c16 ]---
[    2.027668] 
[    3.027692] Fixing recursive fault but reboot is needed!


Note: Same command works fine on Fedora 28
ISO Link - https://kojipkgs.fedoraproject.org/compose/updates/Fedora-29-updates-20181123.0/compose/AtomicHost/ppc64le/iso/Fedora-AtomicHost-ostree-ppc64le-29-20181123.0.iso

Comment 1 Sinny Kumari 2018-11-23 09:16:37 UTC
Created attachment 1508237 [details]
virt-install log

Comment 2 Richard W.M. Jones 2018-11-23 09:17:20 UTC
Which version of qemu?  Which host?

Comment 3 Dan Horák 2018-11-23 09:32:48 UTC
for the record - works fine with F-28 ppc64le host and latest virt stack from https://copr.fedorainfracloud.org/coprs/g/virtmaint-sig/virt-preview/ (qemu-system-ppc-3.1.0-0.1.rc1.fc28.ppc64le)

Comment 4 Sinny Kumari 2018-11-23 09:34:42 UTC
qemu version - 3.0.0-1.fc29
Kernel - 4.18.16-300.fc29.ppc64le

I assume by "host" you mean the kernel version on which virt-install is running.

Comment 5 Richard W.M. Jones 2018-11-23 10:05:06 UTC
I was also thinking about whether it's POWER 8/9/etc.

Comment 6 Sinny Kumari 2018-11-23 10:08:38 UTC
(In reply to Richard W.M. Jones from comment #5)
> I was also thinking about whether it's POWER 8/9/etc.

It's Power 8

Comment 7 Laurent Vivier 2018-11-23 10:17:24 UTC
(In reply to Sinny Kumari from comment #0)
...
> Connected to domain fah29-20181123
> Escape character is ^]
>  26
> ERROR: Unhandled relocation (A) type 26


Looks like a SLOF bug fixed by:

commit 031cb1b921694f0c8676e0b478a3dccbae9d1639
Author: Thomas Huth <thuth>
Date:   Wed Jun 13 17:58:16 2018 +0200

    libelf: Add REL32 to the list of ignored relocations
    
    When compiling SLOF with GCC 8.1, I currently get a lot of these errors:
    
    ERROR: Unhandled relocation (A) type 26
    
    Type 26 is the "relative 32-bit" relocation. We can simply ignore it
    (like the other relative relocations - REL14, REL24 and REL64) to
    shut up these error messages.
    
    Signed-off-by: Thomas Huth <thuth>
    Reviewed-by: Laurent Vivier <lvivier>
    Signed-off-by: Alexey Kardashevskiy <aik>

Comment 8 Laurent Vivier 2018-11-23 10:24:31 UTC
(In reply to Laurent Vivier from comment #7)
> (In reply to Sinny Kumari from comment #0)
> ...
> > Connected to domain fah29-20181123
> > Escape character is ^]
> >  26
> > ERROR: Unhandled relocation (A) type 26
> 
> 
> Looks like a SLOF bug fixed by:
> 
> commit 031cb1b921694f0c8676e0b478a3dccbae9d1639
> Author: Thomas Huth <thuth>
> Date:   Wed Jun 13 17:58:16 2018 +0200
> 
>     libelf: Add REL32 to the list of ignored relocations

But this should not fix the crash as this commit only mutes the error message.

But perhaps you could try at least a more up-to-date SLOF.

Comment 9 Sinny Kumari 2018-11-26 06:52:52 UTC
(In reply to Laurent Vivier from comment #8)
> (In reply to Laurent Vivier from comment #7)
> > (In reply to Sinny Kumari from comment #0)
> > ...
> > > Connected to domain fah29-20181123
> > > Escape character is ^]
> > >  26
> > > ERROR: Unhandled relocation (A) type 26
> > 
> > 
> > Looks like a SLOF bug fixed by:
> > 
> > commit 031cb1b921694f0c8676e0b478a3dccbae9d1639
> > Author: Thomas Huth <thuth>
> > Date:   Wed Jun 13 17:58:16 2018 +0200
> > 
> >     libelf: Add REL32 to the list of ignored relocations
> 
> But this should not fix the crash as this commit only mutes the error
> message.
> 
> But perhaps you could try at least a more up-to-date SLOF.

Did a scratch build of SLOF from the latest master branch (0198ba7) - https://koji.fedoraproject.org/koji/taskinfo?taskID=31121042 - and installed it. It only suppresses the relocation (A) type 26 error; VM creation still hangs at "[    3.027284] Fixing recursive fault but reboot is needed!"

Comment 10 Sinny Kumari 2018-11-26 08:15:49 UTC
(In reply to Dan Horák from comment #3)
> for the record - works fine with F-28 ppc64le host and latest virt stack
> from https://copr.fedorainfracloud.org/coprs/g/virtmaint-sig/virt-preview/
> (qemu-system-ppc-3.1.0-0.1.rc1.fc28.ppc64le)

Using the virt stack from the copr repo doesn't help on F29.

Comment 11 David Gibson 2018-11-30 05:50:39 UTC
Can this be reproduced with an upstream kernel?

Comment 12 Paul Mackerras 2018-11-30 08:28:39 UTC
The real problem seems to be:

[    0.021906] kernel tried to execute exec-protected page (c000000001600a44) -exploit attempt? (uid: 0)
[    0.021972] Unable to handle kernel paging request for instruction fetch
[    0.022018] Faulting instruction address: 0xc000000001600a44

and several similar faults following.  No idea why that would be happening, though.

Comment 13 Laurent Vivier 2018-11-30 10:53:35 UTC
I've not been able to reproduce the problem with the same ISO and the same command line.

virt-install --name fah29-20181123 --ram 2048 --vcpus 1 --os-type=linux \
             --os-variant=fedora26 \
             --disk path=/var/lib/libvirt/images/f29_20181123.qcow2,size=6,bus=virtio,format=qcow2 \
             --network bridge=virbr0 --nographic \
             -c  /tmp/Fedora-AtomicHost-ostree-ppc64le-29-20181123.0.iso

qemu-system-ppc-core-3.0.0-0.2.rc3.fc29.ppc64le
SLOF-0.1.git20180621-1.fc29.noarch
kernel-4.19.4-300.fc29.ppc64le
virt-install-1.6.0-0.3.git3bc7ff24c.fc29.noarch

# lscpu
Architecture:         ppc64le
Byte Order:           Little Endian
CPU(s):               88
On-line CPU(s) list:  0,8,16,24,32,48,56,64,72,80,88
Off-line CPU(s) list: 1-7,9-15,17-23,25-31,33-39,49-55,57-63,65-71,73-79,81-87,89-95
Thread(s) per core:   1
Core(s) per socket:   5
Socket(s):            2
NUMA node(s):         2
Model:                2.1 (pvr 004b 0201)
Model name:           POWER8E (raw), altivec supported
CPU max MHz:          3325.0000
CPU min MHz:          2061.0000
L1d cache:            64K
L1i cache:            32K
L2 cache:             512K
L3 cache:             8192K
NUMA node0 CPU(s):    0,8,16,24,32
NUMA node1 CPU(s):    48,56,64,72,80,88

Comment 14 Laurent Vivier 2018-11-30 10:58:01 UTC
The problem happens only if I use kvm_pr instead of kvm_hv.
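A quick way to see which flavour a host actually loaded (a sketch; kvm_hv and kvm_pr are the standard module names on ppc64le, and the fallback message is mine):

```shell
# List whichever KVM implementation module is loaded on the host;
# on the reporter's nested setup this would show kvm_pr, not kvm_hv.
grep -E '^kvm_(hv|pr) ' /proc/modules 2>/dev/null \
    || echo "neither kvm_hv nor kvm_pr is loaded"
```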

Comment 15 Dan Horák 2018-11-30 11:04:38 UTC
and kvm_pr is used for nested virt, which is what Sinny used AFAIK.

Comment 16 Sinny Kumari 2018-11-30 13:45:30 UTC
(In reply to Dan Horák from comment #15)
> and kvm_pr is used for nested virt, which is what Sinny used AFAIK.

Thanks Dan! Yes, it is using kvm_pr.

Comment 18 Laurent Vivier 2018-11-30 14:32:01 UTC
It seems the problem happens with kvm_pr but only with qemu 3.0.0 not with qemu-2.12.0

Comment 19 Sinny Kumari 2018-11-30 15:00:01 UTC
(In reply to Laurent Vivier from comment #18)
> It seems the problem happens with kvm_pr but only with qemu 3.0.0 not with
> qemu-2.12.0

yeah, downgrading to 2.12.0 seems to work fine for me as well

Comment 20 Laurent Vivier 2018-11-30 15:19:55 UTC
Bisected to:

commit 9dceda5fc34a5868012260ee7271c7a6f36cc1f4
Author: David Gibson <david.id.au>
Date:   Mon Apr 16 16:47:19 2018 +1000

    spapr: Limit available pagesizes to provide a consistent guest environment
    
    KVM HV has some limitations (deriving from the hardware) that mean not all
    host-cpu supported pagesizes may be usable in the guest.  At present this
    means that KVM guests and TCG guests may see different available page sizes
    even if they notionally have the same vcpu model.  This is confusing and
    also prevents migration between TCG and KVM.
    
    This patch makes the environment consistent by always allowing the same set
    of pagesizes.  Since we can't remove the KVM limitations, we do this by
    always applying the same limitations it has, even to TCG guests.
    
    Signed-off-by: David Gibson <david.id.au>
    Reviewed-by: Cédric Le Goater <clg>
    Reviewed-by: Greg Kurz <groug>

Comment 21 David Gibson 2018-12-02 23:05:40 UTC
Ah, right.  I'd forgotten about that.

Basically, I sacrificed KVM PR support in order to get consistent guest behaviour.  To fix this we'd have to fix PR in a bunch of places - which means finding someone with the time to do it.

Comment 22 Sinny Kumari 2018-12-03 07:06:01 UTC
Removing needinfo since we know this issue was introduced by the qemu change in commit 9dceda5fc34a. Testing with an upstream kernel shouldn't be necessary now.

Comment 23 Sinny Kumari 2019-01-03 09:07:09 UTC
This bug is also affecting our ppc64le Fedora imagebuilder - https://pagure.io/fedora-infrastructure/issue/7463
It will be nice to have this issue fixed sooner.

Comment 24 David Gibson 2019-01-04 00:59:52 UTC
It'd be nice, yes, but no-one is currently working on it AFAIK: people with the knowledge to work on KVM PR are few, and those with both the knowledge and the time are basically non-existent.

Comment 25 Dan Horák 2019-01-04 12:19:52 UTC
David, do you know the status of nested KVM HV? I saw a presentation recently saying it's coming very soon :-)

Comment 26 David Gibson 2019-01-07 04:37:00 UTC
Nested HV is already merged upstream (as of 4.19, though some important bug fixes may still only be in rcs), and is in RHEL8.

Note however that it will only work on POWER9 hosts, and only supports RPT at all levels.  There are plans to support HPT (and thereby POWER8 compat mode) guests at the bottom level, but every other level will need to have a POWER9 RPT kernel.

Comment 27 David Gibson 2019-01-07 04:37:24 UTC
Correction, nested HV was merged upstream in 4.20, not 4.19.

Comment 28 Dan Horák 2019-01-07 08:53:57 UTC
The Power9 requirement should be OK, as Fedora infra has Power9 hardware. Kernel 4.20 will go to the stable releases at some point; qemu 3.1 is in Rawhide/F-30 only, but it could be upgraded on the builders similarly to how it's downgraded now. I think we are almost ready to switch :-)

Comment 29 Dan Horák 2019-01-07 09:17:28 UTC
Opened https://pagure.io/fedora-infrastructure/issue/7475 for the infra team.

Comment 30 Dan Horák 2019-01-07 10:14:42 UTC
So with kernel 4.20 and qemu 3.1 (from the virt preview repo) in both host and guest I see

[dan@localhost ~]$ dmesg | grep kvm
[    0.780384] systemd[1]: Detected virtualization kvm.
[    6.031295] systemd[1]: Detected virtualization kvm.
[   26.595043] kvm-hv: Parent hypervisor does not support nesting (rc=-2)

in the guest.

Should comment #26 read as "important fixes still only in 4.21/5.0 rc"?

Comment 31 David Gibson 2019-01-08 03:09:09 UTC
So, I think there are some fairly important bugs which might only be fixed later than 4.20, but those are mostly with dirty map / migration and some edge cases.  It should basically work in 4.20.

I think what's probably going on here is that you need to enable nesting for each specific L1 guest when you start it on the L0.  Specifically you need to add -machine cap-nested-hv=on to the qemu command line for your L1 (or any level you want to be able to run nested guests).

Without that the L0 won't advertise capability to run nested guests to the L1, which I think is what you're seeing here.
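For illustration, a bare qemu invocation for an L1 guest with that capability enabled could look roughly like the following; apart from the -machine pseries,cap-nested-hv=on option quoted above, the memory size and disk path are placeholders, not taken from this thread:

```shell
# Sketch only: start an L1 guest that advertises nested-HV capability.
# Everything except the cap-nested-hv=on machine option is illustrative.
qemu-system-ppc64 \
    -machine pseries,accel=kvm,cap-nested-hv=on \
    -m 2048 -nographic \
    -drive file=/var/lib/libvirt/images/l1.qcow2,if=virtio
```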

Comment 32 Dan Horák 2019-01-08 10:52:00 UTC
yes, that was it :-)

I've added the following snippet to the L1 domain XML and then it boots an L2 guest
  <features>
    <nested-hv state='on'/>
  </features>

Comment 33 David Gibson 2019-02-20 01:49:34 UTC
Dan, sounds like we could close this with NOTABUG, or maybe WONTFIX, yes?

Comment 34 Dan Horák 2019-02-20 08:35:04 UTC
I would say either WONTFIX or CANTFIX would be appropriate.

Comment 35 Michel Normand 2019-02-20 09:23:54 UTC
At the least, we should add some documentation to F29 about this restriction. Where should we do that?

Comment 36 Suraj 2019-02-21 05:15:29 UTC
If you boot the KVM-PR guest with the following added to the qemu command line:
-machine pseries,cap-hpt-max-page-size=16777216
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Does it make any difference?
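If the guest is managed through libvirt rather than raw qemu, the same cap can presumably be set in the domain XML; this assumes libvirt 4.5.0 or newer, which added the hpt maxpagesize element (16777216 bytes = 16384 KiB = 16 MiB):

```xml
<!-- Hypothetical libvirt equivalent of cap-hpt-max-page-size=16777216 -->
<features>
  <hpt>
    <maxpagesize unit='KiB'>16384</maxpagesize>
  </hpt>
</features>
```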

