Bug 1271387

Summary: systemd segfaults when starting up, possibly in 'detect_virtualization'
Product: [Fedora] Fedora
Component: binutils
Version: 24
Hardware: aarch64
OS: Unspecified
Status: CLOSED RAWHIDE
Severity: unspecified
Priority: unspecified
Reporter: Richard W.M. Jones <rjones>
Assignee: Nick Clifton <nickc>
QA Contact: Fedora Extras Quality Assurance <extras-qa>
CC: jakub, johannbg, jsynacek, lnykryn, mjuszkie, msekleta, nickc, pbrobinson, riku.voipio, s, systemd-maint, zbyszek
Doc Type: Bug Fix
Type: Bug
Last Closed: 2016-03-08 11:44:20 UTC
Bug Blocks: 922257
Attachments: serial port log

Description Richard W.M. Jones 2015-10-13 19:53:14 UTC
Created attachment 1082618
serial port log

Description of problem:

I don't have much detail, but my Fedora Rawhide/aarch64 machine
is now unbootable.  The last log messages are:

         Starting Switch Root...
[    4.937969] systemd-journald[225]: Received SIGTERM from PID 1 (systemd).
[    5.824280] audit_printk_skb: 105 callbacks suppressed
[    5.829396] audit: type=1403 audit(1444765672.070:46): policy loaded auid=4294967295 ses=4294967295
[    5.856387] systemd[1]: Successfully loaded SELinux policy in 245.214ms.
[    6.021822] systemd[1]: Relabelled /dev and /run in 44.831ms.
[    6.062282] systemd[1]: unhandled level 0 translation fault (11) at 0x6aa484d8a50, esr 0x92000004
[    6.071117] pgd = fffffe00c81f0000
[    6.074499] [6aa484d8a50] *pgd=0000000000000000, *pud=0000000000000000, *pmd=0000000000000000
[    6.083013] 
[    6.084498] CPU: 7 PID: 1 Comm: systemd Tainted: G        W       4.2.0-1.fc24.aarch64 #1
[    6.092634] Hardware name: AppliedMicro Mustang/Mustang, BIOS 1.1.0 Aug 26 2015
[    6.099903] task: fffffe03dc080000 ti: fffffe03dc100000 task.ti: fffffe03dc100000
[    6.107351] PC is at 0x2aabcc03d98
[    6.110734] LR is at 0x2aabcc03d80
[    6.114116] pc : [<000002aabcc03d98>] lr : [<000002aabcc03d80>] pstate: a0000000
[    6.121475] sp : 000003ffe90b1ba0
[    6.124770] x29: 000003ffe90b1bc0 x28: 000002aabcd30000 
[    6.130075] x27: 000002aabcd2f000 x26: 000003ffe90b1fc8 
[    6.135382] x25: 00052201b8524844 x24: 0000000000000005 
[    6.140687] x23: 000002aaeee79ea0 x22: 000003ffe90b1d08 
[    6.145990] x21: 000002aabcd31000 x20: 000002aabcd30000 
[    6.151295] x19: 000002aaeee79ea0 x18: 000002aabccbae38 
[    6.156599] x17: 000003ff8b38f0e0 x16: 000002aabcd2f4a0 
[    6.161904] x15: 000002aabcca3353 x14: 000002aabccc6a10 
[    6.167204] x13: 000002aabccbae38 x12: 000002aabcca3353 
[    6.172508] x11: 000002aabcca3353 x10: 000002aabcca3353 
[    6.177811] x9 : 000003ffe90b0700 x8 : 00000000000000d3 
[    6.183119] x7 : 7f7f7f7f7f7f7f7f x6 : fefefeff7dff284d 
[    6.188420] x5 : 00000000000000b0 x4 : 0000000000000000 
[    6.193724] x3 : 0000000000000004 x2 : 0000000000000014 
[    6.199027] x1 : 000003ff8b7a9a50 x0 : 000002aabcd2f000 
[    6.204332] 
[    6.206064] audit: type=1701 audit(1444765672.450:47): auid=4294967295 uid=0 gid=0 ses=4294967295 subj=system_u:system_r:init_t:s0 pid=505 comm="systemd" exe="/usr/lib/systemd/systemd" sig=11

The full messages are attached.

Version-Release number of selected component (if applicable):

Probably systemd-227-1.fc24

How reproducible:

100%

Steps to Reproduce:
1. Install systemd, reboot.

Comment 1 Richard W.M. Jones 2015-10-14 09:19:09 UTC
It is systemd-227-1.fc24 that is broken.

I recovered the system by booting it with 'init=/bin/bash', manually
bringing up LVM, network etc., and then dnf downgrading to the previous
working version of systemd.

It looks as if systemd-coredump collected a core file from when
dnf installed the broken systemd; it seems it also dumped core
during the service reload.  Hopefully this has the same root cause
as the crash on startup.  Here is the stack trace:

#0  0x000003ff935044d8 in kill () from /lib64/libc.so.6
#1  0x000002aaae9c616c in crash.lto_priv.246 (sig=11) at src/core/main.c:185
#2  <signal handler called>
#3  0x000002aaae993d98 in detect_vm () at src/basic/virt.c:263
#4  detect_virtualization () at src/basic/virt.c:410
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Comment 2 Richard W.M. Jones 2015-10-14 09:25:21 UTC
Crash happens here:

int detect_vm(void) {
        static thread_local int cached_found = _VIRTUALIZATION_INVALID;
        int r;

        if (cached_found >= 0)        /* <--- line 263 */
                return cached_found;

So it could be something to do with thread-local variables on aarch64.
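
For readers unfamiliar with the pattern, here is a minimal, standalone sketch of a function that caches its result in a function-local thread-local variable, as the excerpt above does. It is not systemd's code: detect_vm_sketch, probe_virtualization, probe_count and the VIRT_* constants are hypothetical stand-ins, and the C11 keyword _Thread_local is assumed in place of systemd's thread_local macro. Reading such a variable makes the compiler emit a TLS access sequence whose relocations the linker has to resolve, which is why a thread-local access is a plausible place for a toolchain problem to show up.

```c
#include <assert.h>

/* Hypothetical stand-ins, not systemd's real definitions. */
enum { VIRT_INVALID = -1, VIRT_NONE = 0 };

/* Counts how often the slow path runs (single-threaded demo only). */
static int probe_count = 0;

/* Placeholder for the real, expensive detection work. */
static int probe_virtualization(void) {
        probe_count++;
        return VIRT_NONE;
}

int detect_vm_sketch(void) {
        /* One copy per thread; every read or write goes through TLS. */
        static _Thread_local int cached_found = VIRT_INVALID;

        /* Fast path: a previous call on this thread already stored a result. */
        if (cached_found >= 0)
                return cached_found;

        cached_found = probe_virtualization();
        assert(cached_found >= 0);   /* the probe must return a valid value */
        return cached_found;
}
```

Calling detect_vm_sketch() twice on the same thread runs the probe only once; the second call returns from the cached_found check, which corresponds to the line marked 263 in the excerpt.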

Comment 3 Marcin Juszkiewicz 2015-11-19 14:01:49 UTC
*** Bug 1282392 has been marked as a duplicate of this bug. ***

Comment 4 Marcin Juszkiewicz 2015-11-19 19:41:13 UTC
The problem is even older: with systemd 226-3, "sudo systemd-nspawn -D $PWD/ROOTFS/ -b" also ends with an error:

[ 2251.691640] systemd-nspawn[1109]: unhandled level 0 translation fault (11) at 0x6aa70af64a0, esr 0x92000044
[ 2251.701357] pgd = fffffe03dfe10000
[ 2251.704739] [6aa70af64a0] *pgd=0000000000000000, *pud=0000000000000000, *pmd=0000000000000000
[ 2251.713258] 
[ 2251.714742] CPU: 0 PID: 1109 Comm: systemd-nspawn Tainted: G        W       4.3.0-1.fc24.aarch64 #1
[ 2251.723743] Hardware name: AppliedMicro Mustang/Mustang, BIOS 1.1.0 Oct 20 2015
[ 2251.731016] task: fffffe03dfcb8980 ti: fffffe03dade8000 task.ti: fffffe03dade8000
[ 2251.738464] PC is at 0x2aac8080fcc
[ 2251.741845] LR is at 0x2aac80872d4
[ 2251.745227] pc : [<000002aac8080fcc>] lr : [<000002aac80872d4>] pstate: 00000000
[ 2251.752586] sp : 000003fff808b810
[ 2251.755882] x29: 000003fff808ba80 x28: 000002aaf81210d0 
[ 2251.761189] x27: 0000000000000001 x26: 0000000000000001 
[ 2251.766493] x25: 000002aac810067c x24: 000002aaf81210d0 
[ 2251.771799] x23: 0000000000000001 x22: 0000000000000000 
[ 2251.777100] x21: 0000000000000000 x20: 0000000000000000 
[ 2251.782407] x19: 000002aaf8120030 x18: 0000000000000001 
[ 2251.787711] x17: 000003ffa86444d8 x16: 000002aac80ff940 
[ 2251.793014] x15: 0000000000000060 x14: 0000000000000000 
[ 2251.798321] x13: 0000000000000000 x12: 0000000000000000 
[ 2251.803623] x11: 0000000000000000 x10: 0000000000000000 
[ 2251.808930] x9 : 0000000000000000 x8 : 0000000000000087 
[ 2251.814232] x7 : 0000000000000000 x6 : 0000000000000001 
[ 2251.819539] x5 : 0000000000000000 x4 : 0000000000000001 
[ 2251.824842] x3 : 0000000000000000 x2 : 00000000ffffffff 
[ 2251.830151] x1 : 000003ffa89f74a0 x0 : 000002aac80ff000 
[ 2251.835454]
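
A quick arithmetic check on the two register dumps so far (the one in the description and the one above): in both, the faulting address equals x0 + x1. The sketch below only verifies that sum from the logged values. Reading x0 as a page-aligned address and x1 as a thread-local address that were wrongly added together is an assumption, since no disassembly appears in this report. The struct and function names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Values copied from the kernel fault messages and register dumps. */
struct fault_dump {
        uint64_t fault_addr;   /* "translation fault (11) at ..." */
        uint64_t x0;
        uint64_t x1;
};

static const struct fault_dump dumps[] = {
        { 0x6aa484d8a50ULL, 0x2aabcd2f000ULL, 0x3ff8b7a9a50ULL },  /* systemd, PID 1 */
        { 0x6aa70af64a0ULL, 0x2aac80ff000ULL, 0x3ffa89f74a0ULL },  /* systemd-nspawn */
};

#define N_DUMPS (sizeof(dumps) / sizeof(dumps[0]))

/* Returns 1 if the faulting address of dump i equals x0 + x1, else 0. */
int fault_is_x0_plus_x1(size_t i) {
        assert(i < N_DUMPS);
        return dumps[i].x0 + dumps[i].x1 == dumps[i].fault_addr;
}
```

Both entries satisfy the check, so the two crashes look like the same bad address computation rather than two unrelated faults.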

Comment 5 Zbigniew Jędrzejewski-Szmek 2015-12-07 04:32:59 UTC
Is there a fedora-developer-accessible aarch64 machine for debugging?

Comment 6 Peter Robinson 2015-12-07 06:52:30 UTC
(In reply to Zbigniew Jędrzejewski-Szmek from comment #5)
> Is there a fedora-developer-accessible aarch64 machine for debugging?

https://lists.fedoraproject.org/pipermail/arm/2015-November/010142.html

Comment 7 Richard W.M. Jones 2016-01-04 09:59:40 UTC
This post shows you how to boot an aarch64 VM on x86-64:

  https://rwmj.wordpress.com/2015/05/26/fedora-22-aarch64-virt-builder-image/

Replace 22 with 23 (s/22/23/) in that; everything else should work as written.

However it's super-slow.  Do we have aarch64 remote servers
available for debugging?  I know we have power machines for a
similar purpose.

Comment 8 Peter Robinson 2016-01-04 10:50:25 UTC
There are aarch64 machines in Beaker, just as there are for ppc.

Comment 9 Riku Voipio 2016-01-04 14:16:37 UTC
I've just debugged a similar issue in Debian. The bug appears to be in binutils. systemd compiled with:

binutils_2.25.1-7 -> crash
binutils_2.25.51.20151113-1 -> boots fine

It would appear that somewhere between 2.25.1 and Git master on 20151113 a fix was applied to binutils, probably related to TLS relocations...

I reverified locally that systemd 228 compiled with 2.25.1 crashed and that today's snapshot of git HEAD worked. Unfortunately I don't have time to dig out the commit for backporting.

Comment 10 Marcin Juszkiewicz 2016-01-05 10:47:10 UTC
Built binutils 2.26.51 for Fedora. Rebuilt systemd 228 with it. Installed it in an F23 VM, updated the initramfs, rebooted.

Works.

Now the question is when binutils 2.26 will be released...

Comment 11 Marcin Juszkiewicz 2016-01-13 13:50:00 UTC
Looks like it is not only systemd ;(

Is there any chance of a binutils 2.26 snapshot before the mass rebuild takes place?

[26739.411673] libvirtd[4669]: unhandled level 2 translation fault (11) at 0x2ae1e70fdda, esr 0x92000006
[26739.420907] pgd = fffffe00bedc0000
[26739.424316] [2ae1e70fdda] *pgd=0000000000000000, *pud=0000000000000000, *pmd=0000000000000000
[26739.432880] 
[26739.434386] CPU: 0 PID: 4669 Comm: libvirtd Tainted: G        W       4.4.0-0.rc8.git1.1.fc24.aarch64 #1
[26739.443833] Hardware name: AppliedMicro Mustang/Mustang, BIOS 1.1.0 Oct 20 2015
[26739.451124] task: fffffe03d52ad700 ti: fffffe03d6778000 task.ti: fffffe03d6778000
[26739.458584] PC is at 0x3ff802c1964
[26739.461975] LR is at 0x3ff802c18bc
[26739.465371] pc : [<000003ff802c1964>] lr : [<000003ff802c18bc>] pstate: 20000000
[26739.472756] sp : 000003ff76dfdc20
[26739.476065] x29: 000003ff76dfdc20 x28: 000002ab0e70fdd0 
[26739.481385] x27: 0000000031000000 x26: 000003ff76dfdd70 
[26739.486718] x25: 0000000000000000 x24: 0000000000000000 
[26739.492050] x23: 000003ff5400b3e0 x22: 0000000000000001 
[26739.497367] x21: 0000000000000003 x20: 0000000000100006 
[26739.502703] x19: 000002ab0e70fa70 x18: 000003ff540086f2 
[26739.508037] x17: 0000000000000001 x16: 0000000000000001 
[26739.513358] x15: 0000000000000004 x14: 000002ab0e70f2a0 
[26739.518694] x13: 000003ff5400a110 x12: 0000000000000006 
[26739.524030] x11: 0000000000000006 x10: 000003ff76dfde20 
[26739.529356] x9 : 000003ff76dfdae8 x8 : 000003ff540086f1 
[26739.534688] x7 : 0000000000000004 x6 : 0000000000000030 
[26739.540024] x5 : 0000000000000113 x4 : 0000000000000000 
[26739.545349] x3 : 000003ff76dfdd70 x2 : 000003ff5400b3e0 
[26739.550675] x1 : 0000000000000002 x0 : 000002ae1e70fdd0

Comment 12 Peter Robinson 2016-01-13 14:48:36 UTC
Hi Nick, any chance you could take a look at this for us?

Comment 13 Nick Clifton 2016-01-13 16:52:05 UTC
Hi Marcin,

> Is there any chance of a binutils 2.26 snapshot before the mass rebuild
> takes place?

2.26 should be coming out next week.  Will that be OK, or would you prefer me to create a tarball from today's current sources and upload that?

Cheers
  Nick

Comment 14 Marcin Juszkiewicz 2016-01-13 17:21:43 UTC
Yes, we can wait. We just need to have it before any mass rebuilds take place.

Comment 15 Richard W.M. Jones 2016-01-26 10:21:24 UTC
FWIW binutils 2.26 has been released:

https://release-monitoring.org/project/7981/

Comment 16 Marcin Juszkiewicz 2016-01-26 10:35:47 UTC
APM Mustang boots fine with systemd 228-7 built using binutils 2.26-2 (both built locally).

Comment 17 Marcin Juszkiewicz 2016-01-26 10:55:08 UTC
https://fedora.juszkiewicz.com.pl/20160105-systemd-binutils/ has both binutils 2.26-2 packages and systemd 228-7 built with them.

Comment 18 Nick Clifton 2016-01-29 12:48:34 UTC
Rawhide binutils appears to fix the problem.

Comment 19 Richard W.M. Jones 2016-02-03 13:52:20 UTC
(In reply to Marcin Juszkiewicz from comment #17)
> https://fedora.juszkiewicz.com.pl/20160105-systemd-binutils/ has both
> binutils 2.26-2 packages and systemd 228-7 built with them.

Can confirm that this systemd package works.

The binutils package is no longer needed since arm.koji provides
a newer version.

Comment 20 Jan Kurik 2016-02-24 13:50:19 UTC
This bug appears to have been reported against 'rawhide' during the Fedora 24 development cycle.
Changing version to '24'.

More information and the reason for this action are here:
https://fedoraproject.org/wiki/Fedora_Program_Management/HouseKeeping/Fedora24#Rawhide_Rebase

Comment 21 Peter Robinson 2016-03-08 11:44:20 UTC
New binutils is done and the new systemd is now built.