
Bug 1058991

Summary: gcc PCH bug causes segfaults on aarch64
Product: [Fedora] Fedora
Reporter: Brendan Conoboy <blc>
Component: gcc
Assignee: Jakub Jelinek <jakub>
Status: CLOSED UPSTREAM
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: rawhide
CC: jakub, jcm, kmcmartin, law, msalter, pbrobinson
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: gcc-4.8.2-14.fc21
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2014-02-04 15:39:28 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 922257
Attachments:
Description Flags
define TRY_EMPTY_VM_SPACE on aarch64 none

Description Brendan Conoboy 2014-01-28 23:14:28 UTC
Description of problem:

Building large packages such as java-1.7.0-openjdk and wxGTK generally results in an internal compiler error (ICE).


Version-Release number of selected component (if applicable):
This appears to happen with any gcc 4.8, from Fedora 19 through rawhide.  The kernel records a message like the following:

[362391.181971] cc1plus[12825]: unhandled level 3 translation fault (11) at 0x7fae518488, esr 0x92000007
[362391.191200] pgd = ffffffc342e62000
[362391.194682] [7fae518488] *pgd=00000041a9c66003, *pmd=00000042ded58003, *pte=0000000000000000

[362391.204835] Pid: 12825, comm:              cc1plus
[362391.209733] CPU: 7    Not tainted  (3.8.0-mustang_sw_1.08.12-beta_rc.jkkm4 #9)
[362391.217019] PC is at 0xd30ee0
[362391.220116] LR is at 0xc6f474
[362391.223166] pc : [<0000000000d30ee0>] lr : [<0000000000c6f474>] pstate: 60000000
[362391.230656] sp : 0000007ff6a2b700
[362391.234048] x29: 0000007ff6a2b700 x28: 0000000024636b68 
[362391.239491] x27: 0000000000000000 x26: 0000000000e271e8 
[362391.244904] x25: 0000000000000000 x24: 0000000024658650 
[362391.250346] x23: 0000007facf26000 x22: 0000000000000003 
[362391.255757] x21: 0000000000001d22 x20: 0000007fae518450 
[362391.261193] x19: 0000000000001d22 x18: 000000000000000f 
[362391.266605] x17: 0000000000001d22 x16: 0000000000000000 
[362391.272049] x15: 0000000024603650 x14: 0000000000000fe0 
[362391.277460] x13: 0000000024603650 x12: 0000007ff6a2b6d0 
[362391.282905] x11: 00000000fffffff0 x10: 0000000024603200 
[362391.288343] x9 : fefefefefeff736b x8 : 746c756166206e6f 
[362391.293753] x7 : 0000007ff6a2b6d0 x6 : 0000000024603643 
[362391.299190] x5 : 00000000246041e0 x4 : 000000000071f118 
[362391.304600] x3 : 0000000000001d22 x2 : 0000007fae518450 
[362391.310036] x1 : 0000000000001d22 x0 : 0000007fae518450 

Here are two example builds of wxGTK which failed:
http://arm.koji.fedoraproject.org/koji/taskinfo?taskID=2225239
...
g++: internal compiler error: Segmentation fault (program cc1plus)
Please submit a full bug report,
with preprocessed source if appropriate.
See <http://bugzilla.redhat.com/bugzilla> for instructions.
make: *** [basedll_convauto.o] Error 4
...

http://arm.koji.fedoraproject.org/koji/taskinfo?taskID=2225210
...
g++ -c -o basedll_fs_arc.o -I./.pch/wxprec_basedll -D__WXGTK__     -DWXBUILDING      -I./src/regex  -DwxUSE_GUI=0 -DWXMAKINGDLL_BASE -DwxUSE_BASE=1 -fPIC -DPIC -D_FILE_OFFSET_BITS=64 -D_LARGE_FILES -I/builddir/build/BUILD/wxGTK-2.8.12/lib/wx/include/gtk2-unicode-release-2.8 -I./include -pthread -I/usr/include/gtk-2.0 -I/usr/lib64/gtk-2.0/include -I/usr/include/pango-1.0 -I/usr/include/atk-1.0 -I/usr/include/cairo -I/usr/include/pixman-1 -I/usr/include/libdrm -I/usr/include/libpng16 -I/usr/include/gdk-pixbuf-2.0 -I/usr/include/libpng16 -I/usr/include/pango-1.0 -I/usr/include/harfbuzz -I/usr/include/pango-1.0 -I/usr/include/freetype2 -I/usr/include/glib-2.0 -I/usr/lib64/glib-2.0/include -pthread -I/usr/include/gstreamer-0.10 -I/usr/include/libxml2 -I/usr/include/gconf/2 -I/usr/include/dbus-1.0 -I/usr/lib64/dbus-1.0/include -I/usr/include/glib-2.0 -I/usr/lib64/glib-2.0/include -DWX_PRECOMP -pthread -Wall -Wundef -Wno-ctor-dtor-privacy -g -O0 -pthread -I/usr/include/libgnomeprintui-2.2 -I/usr/include/libgnomeprint-2.2 -I/usr/include/libxml2 -I/usr/include/libgnomecanvas-2.0 -I/usr/include/gail-1.0 -I/usr/include/libart-2.0 -I/usr/include/gtk-2.0 -I/usr/lib64/gtk-2.0/include -I/usr/include/pango-1.0 -I/usr/include/atk-1.0 -I/usr/include/cairo -I/usr/include/pixman-1 -I/usr/include/libdrm -I/usr/include/libpng16 -I/usr/include/gdk-pixbuf-2.0 -I/usr/include/libpng16 -I/usr/include/pango-1.0 -I/usr/include/harfbuzz -I/usr/include/pango-1.0 -I/usr/include/glib-2.0 -I/usr/lib64/glib-2.0/include -I/usr/include/freetype2 -I/usr/include/SDL -D_GNU_SOURCE=1 -D_REENTRANT -O2 -g -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -fno-stack-protector -fno-strict-aliasing ./src/common/fs_arc.cpp
g++: internal compiler error: Segmentation fault (program cc1plus)
Please submit a full bug report,
with preprocessed source if appropriate.
See <http://bugzilla.redhat.com/bugzilla> for instructions.
make: *** [basedll_fs_arc.o] Error 4
...

Note during these failed builds the value of /proc/sys/kernel/randomize_va_space is "2".  If set to 0 the build will succeed:

http://arm.koji.fedoraproject.org/koji/buildinfo?buildID=183839
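
For anyone checking which ASLR mode a builder is in before reproducing, here is a minimal sketch in C that reads the sysctl file mentioned above. The function name `read_aslr_mode` is made up for illustration; only the procfs path comes from this report.

```c
#include <stdio.h>

/* Hypothetical helper: returns the integer stored in the given procfs
 * file (0, 1 or 2 for randomize_va_space), or -1 if the file cannot be
 * opened or does not start with an integer. */
int read_aslr_mode(const char *path)
{
    FILE *f = fopen(path, "r");
    int mode = -1;

    if (f == NULL)
        return -1;
    if (fscanf(f, "%d", &mode) != 1)
        mode = -1;
    fclose(f);
    return mode;
}
```

Calling `read_aslr_mode("/proc/sys/kernel/randomize_va_space")` on the failing builders would be expected to return 2, and 0 on the builder where the build succeeded.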

How reproducible:
It happens every time, but not in the same place every time.

Steps to Reproduce:
1. Set /proc/sys/kernel/randomize_va_space to 2
2. Build wxGTK with rawhide
3. Boom

Actual results:

ICE

Expected results:

Successful build.

Additional info:

This issue was originally discovered building java-1.7.0-openjdk, but wxGTK produces the error faster and doesn't pull Java into the picture.

Comment 1 Jakub Jelinek 2014-01-28 23:22:20 UTC
First of all, that looks like a kernel bug; nothing a normal process does should result in such messages.

Second, supposedly aarch64 should add its own define to gcc/config/host-linux.c, but you really want to file/discuss this upstream. I have no idea what address would be appropriate for that, no idea what the memory layout on aarch64 is, etc.
If it is added upstream, I can consider backporting it.

Comment 2 Kyle McMartin 2014-01-29 23:45:55 UTC
The kernel message is just the result of the SIGSEGV... it means we had a valid translation for the vaddr through two levels of the page table, but not the third... It'll probably not generate such descriptive fault messages in production when print-fatal-signals is off, but since it's such a new port, such messages are a bit instructive.
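
To make the "level 3 translation fault" reading concrete, here is a small sketch in C that decodes the `esr 0x92000007` value from the log in the description. It assumes the ARMv8 ESR layout (exception class in bits [31:26], data fault status code in bits [5:0]); the function names are invented for illustration and are not kernel APIs.

```c
#include <stdint.h>

/* Exception class: ESR bits [31:26]. 0x24 is a data abort taken from a
 * lower exception level, i.e. from user space. */
unsigned esr_exception_class(uint32_t esr)
{
    return (esr >> 26) & 0x3f;
}

/* Data fault status code: ESR bits [5:0] (meaningful for data aborts). */
unsigned esr_fault_status(uint32_t esr)
{
    return esr & 0x3f;
}

/* Translation faults are encoded as 0b0001LL, where LL is the page-table
 * level at which the walk failed. Returns that level, or -1 if the status
 * code is not a translation fault. */
int esr_translation_fault_level(uint32_t esr)
{
    unsigned dfsc = esr_fault_status(esr);

    if ((dfsc & 0x3c) != 0x04)
        return -1;
    return (int)(dfsc & 0x3);
}
```

For 0x92000007 this gives exception class 0x24 and fault level 3, which matches the `*pte=0000000000000000` line in the log: the pgd and pmd entries are populated, the final-level entry is not.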

In any event, thanks for the hint at looking at host-linux.c, I think I've got a theory as to why this is occurring as a result of it!

Comment 3 Kyle McMartin 2014-01-30 00:17:45 UTC
Created attachment 857311 [details]
define TRY_EMPTY_VM_SPACE on aarch64

http://gcc.gnu.org/bugzilla//show_bug.cgi?id=45979
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=14940

looks like the same issue as in these PRs...

I've attached a ``fix''. It looks like PCH is basically relying on our mmap being effectively a MAP_FIXED and on the address being unused. On AArch64 we're mapping executables at 4MB, so the attempt to mmap at 0 would fail for two reasons (we also disallow mmap of the first page or so via CONFIG_MMAP_MIN_ADDR).

Anyway, things look hunky dory in my testing when using 0x100000000 as X86_64 and others do.
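
As a rough illustration of the behaviour described in this comment, the following C sketch asks the kernel for an anonymous mapping at a hint address without MAP_FIXED and reports whether the hint was honoured. This is not the code in gcc/config/host-linux.c; `HINT_ADDR` and `hint_is_honoured` are hypothetical names, the address is simply the one quoted in the comment, and a 64-bit host is assumed.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1   /* for MAP_ANONYMOUS under strict -std modes */
#endif
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/* Illustrative hint address, copied from the comment above (assumption:
 * 64-bit address space, region otherwise unused). */
#define HINT_ADDR ((uintptr_t)0x100000000ULL)

/* Request `size` bytes at `hint` WITHOUT MAP_FIXED, so the kernel is free
 * to place the mapping elsewhere.
 * Returns 1 if the mapping landed exactly at `hint`, 0 if the kernel chose
 * a different address, and -1 if mmap itself failed. The mapping is
 * released before returning. */
int hint_is_honoured(uintptr_t hint, size_t size)
{
    void *p = mmap((void *)hint, size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    int ok;

    if (p == MAP_FAILED)
        return -1;
    ok = ((uintptr_t)p == hint);
    munmap(p, size);
    return ok;
}
```

A hint of 0 is treated as "no preference", so `hint_is_honoured(0, 4096)` returns 0: the mapping never ends up at address 0. Whether `hint_is_honoured(HINT_ADDR, 4096)` returns 1 depends on the process layout, which is the property a PCH reload relies on.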

Comment 5 Peter Robinson 2014-02-04 15:39:28 UTC
Now patched locally and sent upstream.