Bug 1167501 - glibc-2.20.90-9.fc22 is FTBFS on aarch64
Summary: glibc-2.20.90-9.fc22 is FTBFS on aarch64
Keywords:
Status: CLOSED RAWHIDE
Alias: None
Product: Fedora
Classification: Fedora
Component: glibc
Version: rawhide
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: urgent
Target Milestone: ---
Assignee: Carlos O'Donell
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks: ARM64, F-ExcludeArch-aarch64
 
Reported: 2014-11-24 23:43 UTC by Peter Robinson
Modified: 2016-11-24 12:32 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2014-12-18 15:34:54 UTC
Type: Bug
Embargoed:



Description Peter Robinson 2014-11-24 23:43:05 UTC
glibc-2.20.90-9.fc22 is FTBFS on aarch64. glibc-2.20.90-8.fc22 was OK.

http://arm.koji.fedoraproject.org/koji/taskinfo?taskID=2802673

There are lots of segfaults right through the build. A few examples:

aa_DJ.ISO-8859-1aa_DJ.UTF-8aa_ER.UTF-8aa_ER.UTF-8aa_ET.UTF-8af_ZA.UTF-8af_ZA.ISO-8859-1.........ak_GH.UTF-8@saaho.............../bin/sh: line 13:   713 Segmentation fault      I18NPATH=. GCONV_PATH=/builddir/build/BUILD/glibc-2.20-205-ga39208b/build-aarch64-redhat-linux/iconvdata LC_ALL=C /builddir/build/BUILD/glibc-2.20-205-ga39208b/build-aarch64-redhat-linux/elf/ld-linux-aarch64.so.1 --library-path /builddir/build/BUILD/glibc-2.20-205-ga39208b/build-aarch64-redhat-linux:/builddir/build/BUILD/glibc-2.20-205-ga39208b/build-aarch64-redhat-linux/math:/builddir/build/BUILD/glibc-2.20-205-ga39208b/build-aarch64-redhat-linux/elf:/builddir/build/BUILD/glibc-2.20-205-ga39208b/build-aarch64-redhat-linux/dlfcn:/builddir/build/BUILD/glibc-2.20-205-ga39208b/build-aarch64-redhat-linux/nss:/builddir/build/BUILD/glibc-2.20-205-ga39208b/build-aarch64-redhat-linux/nis:/builddir/build/BUILD/glibc-2.20-205-ga39208b/build-aarch64-redhat-linux/rt:/builddir/build/BUILD/glibc-2.20-205-ga39208b/build-aarch64-redhat-linux/resolv:/builddir/build/BUILD/glibc-2.20-205-ga39208b/build-aarch64-redhat-linux/crypt:/builddir/build/BUILD/glibc-2.20-205-ga39208b/build-aarch64-redhat-linux/nptl /builddir/build/BUILD/glibc-2.20-205-ga39208b/build-aarch64-redhat-linux/locale/localedef --alias-file=../intl/locale.alias --no-archive -i locales/$input -c -f charmaps/$charset --prefix=/builddir/build/BUILDROOT/glibc-2.20.90-9.fc22.aarch64 $locale



/bin/sh: line 13:   711 Segmentation fault      (core dumped) I18NPATH=. GCONV_PATH=/builddir/build/BUILD/glibc-2.20-205-ga39208b/build-aarch64-redhat-linux/iconvdata LC_ALL=C /builddir/build/BUILD/glibc-2.20-205-ga39208b/build-aarch64-redhat-linux/elf/ld-linux-aarch64.so.1 --library-path /builddir/build/BUILD/glibc-2.20-205-ga39208b/build-aarch64-redhat-linux:/builddir/build/BUILD/glibc-2.20-205-ga39208b/build-aarch64-redhat-linux/math:/builddir/build/BUILD/glibc-2.20-205-ga39208b/build-aarch64-redhat-linux/elf:/builddir/build/BUILD/glibc-2.20-205-ga39208b/build-aarch64-redhat-linux/dlfcn:/builddir/build/BUILD/glibc-2.20-205-ga39208b/build-aarch64-redhat-linux/nss:/builddir/build/BUILD/glibc-2.20-205-ga39208b/build-aarch64-redhat-linux/nis:/builddir/build/BUILD/glibc-2.20-205-ga39208b/build-aarch64-redhat-linux/rt:/builddir/build/BUILD/glibc-2.20-205-ga39208b/build-aarch64-redhat-linux/resolv:/builddir/build/BUILD/glibc-2.20-205-ga39208b/build-aarch64-redhat-linux/crypt:/builddir/build/BUILD/glibc-2.20-205-ga39208b/build-aarch64-redhat-linux/nptl /builddir/build/BUILD/glibc-2.20-205-ga39208b/build-aarch64-redhat-linux/locale/localedef --alias-file=../intl/locale.alias --no-archive -i locales/$input -c -f charmaps/$charset --prefix=/builddir/build/BUILDROOT/glibc-2.20.90-9.fc22.aarch64 $locale
 done
/bin/sh: line 13:  1340 Segmentation fault      (core dumped) I18NPATH=. GCONV_PATH=/builddir/build/BUILD/glibc-2.20-205-ga39208b/build-aarch64-redhat-linux/iconvdata LC_ALL=C /builddir/build/BUILD/glibc-2.20-205-ga39208b/build-aarch64-redhat-linux/elf/ld-linux-aarch64.so.1 --library-path /builddir/build/BUILD/glibc-2.20-205-ga39208b/build-aarch64-redhat-linux:/builddir/build/BUILD/glibc-2.20-205-ga39208b/build-aarch64-redhat-linux/math:/builddir/build/BUILD/glibc-2.20-205-ga39208b/build-aarch64-redhat-linux/elf:/builddir/build/BUILD/glibc-2.20-205-ga39208b/build-aarch64-redhat-linux/dlfcn:/builddir/build/BUILD/glibc-2.20-205-ga39208b/build-aarch64-redhat-linux/nss:/builddir/build/BUILD/glibc-2.20-205-ga39208b/build-aarch64-redhat-linux/nis:/builddir/build/BUILD/glibc-2.20-205-ga39208b/build-aarch64-redhat-linux/rt:/builddir/build/BUILD/glibc-2.20-205-ga39208b/build-aarch64-redhat-linux/resolv:/builddir/build/BUILD/glibc-2.20-205-ga39208b/build-aarch64-redhat-linux/crypt:/builddir/build/BUILD/glibc-2.20-205-ga39208b/build-aarch64-redhat-linux/nptl /builddir/build/BUILD/glibc-2.20-205-ga39208b/build-aarch64-redhat-linux/locale/localedef --alias-file=../intl/locale.alias --no-archive -i locales/$input -c -f charmaps/$charset --prefix=/builddir/build/BUILDROOT/glibc-2.20.90-9.fc22.aarch64 $locale
ar_SY.UTF-8... done

Comment 1 Carlos O'Donell 2014-11-26 16:24:42 UTC
I've started a build on an aarch64 box, and will continue to push this forward to analyze what went wrong.

Rawhide is effectively upstream glibc master, so I wonder why ARM or Linaro didn't see this failure, thus I suspect our tools are slightly out of sync with upstream.

Comment 2 Peter Robinson 2014-11-26 16:27:55 UTC
> Rawhide is effectively upstream glibc master, so I wonder why ARM or Linaro
> didn't see this failure, thus I suspect our tools are slightly out of sync
> with upstream.

It isn't the first time they've submitted something upstream that has seen little to no testing to see if it actually works :-/

Comment 3 Carlos O'Donell 2014-11-27 20:33:19 UTC
OK, so upstream build on an aarch64 box is just fine with no problems.

I need to look at the builder and do a mock chroot build to see what's possibly wrong.

Comment 4 Carlos O'Donell 2014-11-27 20:34:19 UTC
(In reply to Peter Robinson from comment #2)
> > Rawhide is effectively upstream glibc master, so I wonder why ARM or Linaro
> > didn't see this failure, thus I suspect our tools are slightly out of sync
> > with upstream.
> 
> It isn't the first time they've submitted something upstream that has seen
> little to no testing to see if it actually works :-/

Well, I always do a `fedpkg build --scratch --srpm foo.src.rpm` and then look at the logs before a final push. The problem is that in this process I don't get to peek at the secondary arch builders, and often forget.

Comment 5 Peter Robinson 2014-11-28 11:09:54 UTC
> Well, I always do a `fedpkg build --scratch --srpm foo.src.rpm` and then
> look at the logs before a final push. The problem is that in this process I
> don't get to peek at the secondary arch builders, and often forget.

"<arch>-koji build --scratch f22 foo.src.rpm" where <arch> is arm/ppc/s390 and that will give you the same for the secondary arches

Comment 6 Carlos O'Donell 2014-11-28 14:56:51 UTC
(In reply to Peter Robinson from comment #5)
> > Well, I always do a `fedpkg build --scratch --srpm foo.src.rpm` and then
> > look at the logs before a final push. The problem is that in this process I
> > don't get to peek at the secondary arch builders, and often forget.
> 
> "<arch>-koji build --scratch f22 foo.src.rpm" where <arch> is arm/ppc/s390
> and that will give you the same for the secondary arches

Dear RCM,

Please collate all secondary build results in a master review page for the build ;-)

Sincerely,
Carlos.

Comment 7 Carlos O'Donell 2014-11-29 06:16:49 UTC
I can reproduce it at will on the builders.

Switching to mock builds.

Comment 8 Kyle McMartin 2014-12-03 15:15:56 UTC
Reading symbols from /builddir/build/BUILD/glibc-2.20-276-g0e7e69b/build-aarch64-redhat-linux/elf/ld-linux-aarch64.so.1...done.
(gdb) run
Starting program: /builddir/build/BUILD/glibc-2.20-276-g0e7e69b/build-aarch64-redhat-linux/elf/ld-linux-aarch64.so.1 --library-path /builddir/build/BUILD/glibc-2.20-276-g0e7e69b/build-aarch64-redhat-linux:/builddir/build/BUILD/glibc-2.20-276-g0e7e69b/build-aarch64-redhat-linux/math:/builddir/build/BUILD/glibc-2.20-276-g0e7e69b/build-aarch64-redhat-linux/elf:/builddir/build/BUILD/glibc-2.20-276-g0e7e69b/build-aarch64-redhat-linux/dlfcn:/builddir/build/BUILD/glibc-2.20-276-g0e7e69b/build-aarch64-redhat-linux/nss:/builddir/build/BUILD/glibc-2.20-276-g0e7e69b/build-aarch64-redhat-linux/nis:/builddir/build/BUILD/glibc-2.20-276-g0e7e69b/build-aarch64-redhat-linux/rt:/builddir/build/BUILD/glibc-2.20-276-g0e7e69b/build-aarch64-redhat-linux/resolv:/builddir/build/BUILD/glibc-2.20-276-g0e7e69b/build-aarch64-redhat-linux/crypt:/builddir/build/BUILD/glibc-2.20-276-g0e7e69b/build-aarch64-redhat-linux/nptl /builddir/build/BUILD/glibc-2.20-276-g0e7e69b/build-aarch64-redhat-linux/locale/localedef --alias-file=../intl/locale.alias --no-archive -i locales/en_US -c -f charmaps/UTF-8 --prefix=/builddir/build/BUILDROOT/glibc-2.20.90-10.fc22.aarch64 en_US.UTF-8

Program received signal SIGSEGV, Segmentation fault.
0x000003ffb7eab55c in _IO_vfprintf_internal (s=s@entry=0x3ffffffec00, format=format@entry=0x43d940 "%s%s/%.*s%s%s%c", ap=...) at vfprintf.c:1642
1642              process_string_arg (((struct printf_spec *) NULL));
(gdb) bt
#0  0x000003ffb7eab55c in _IO_vfprintf_internal (s=s@entry=0x3ffffffec00, format=format@entry=0x43d940 "%s%s/%.*s%s%s%c", ap=...) at vfprintf.c:1642
#1  0x000003ffb7ed1ec0 in _IO_vasprintf (result_ptr=0x3ffffffee88, format=0x43d940 "%s%s/%.*s%s%s%c", args=...) at vasprintf.c:62
#2  0x000003ffb7eb2070 in ___asprintf (string_ptr=<optimized out>, format=<optimized out>) at asprintf.c:35
#3  0x0000000000403bd0 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
(gdb) 

fun.

Comment 9 Kyle McMartin 2014-12-03 15:29:31 UTC
      /* We put an additional '\0' at the end of the string because at
	 the end of the function we need another byte for the trailing
	 '/'.  */
      ssize_t n;
      if (normal == NULL)
	n = asprintf (&result, "%s%s/%s%c",
		      output_prefix ?: "", LOCALEDIR, path, '\0');
      else
	n = asprintf (&result, "%s%s/%.*s%s%s%c",
		      output_prefix ?: "", LOCALEDIR,
		      (int) (startp - path), path, normal, endp, '\0');

looks like construct_output_path line 463.
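
For anyone reading along, here is a minimal standalone sketch of that second asprintf call, so the "%s%s/%.*s%s%s%c" format from the backtrace is easier to follow. The helper name build_locale_path and its parameter names are hypothetical, not from localedef; the sample values in the note below are assumptions picked to resemble an en_US.UTF-8 run.

```c
#define _GNU_SOURCE /* asprintf is a GNU extension */
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper mirroring the second asprintf call quoted above.
   It concatenates: prefix, dir, "/", the first head_len bytes of path,
   normal, tail, and finally one explicit '\0' byte that reserves room
   for a trailing '/' to be written later.
   Returns what asprintf returns: the number of bytes produced, which
   counts the explicit '\0' but not the terminating NUL, or -1 on
   failure.  On success *result must be freed by the caller.  */
static int
build_locale_path (char **result, const char *prefix, const char *dir,
                   const char *path, int head_len,
                   const char *normal, const char *tail)
{
  /* A negative precision would make "%.*s" print the whole string,
     which is not what the caller wants here.  */
  assert (head_len >= 0);
  return asprintf (result, "%s%s/%.*s%s%s%c",
                   prefix ? prefix : "", dir,
                   head_len, path, normal, tail, '\0');
}
```

With prefix "/root", dir "/usr/lib/locale", path "en_US.UTF-8", head_len 6, normal "utf8" and an empty tail, this should yield "/root/usr/lib/locale/en_US.utf8" followed by the extra '\0' byte. Nothing in the call itself looks wrong, which fits the suspicion that the fault is somewhere below vfprintf rather than in localedef.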

Comment 10 Carlos O'Donell 2014-12-03 15:45:33 UTC
(In reply to Kyle McMartin from comment #9)
>       /* We put an additional '\0' at the end of the string because at
> 	 the end of the function we need another byte for the trailing
> 	 '/'.  */
>       ssize_t n;
>       if (normal == NULL)
> 	n = asprintf (&result, "%s%s/%s%c",
> 		      output_prefix ?: "", LOCALEDIR, path, '\0');
>       else
> 	n = asprintf (&result, "%s%s/%.*s%s%s%c",
> 		      output_prefix ?: "", LOCALEDIR,
> 		      (int) (startp - path), path, normal, endp, '\0');
> 
> looks like construct_output_path line 463.

This really looks like a compiler issue. You can swap out vfprintf.os from another build, and relink, and see if the newly relinked version works properly.

Comment 11 Kyle McMartin 2014-12-03 16:08:23 UTC
inclined to agree, yes.

Comment 12 Kyle McMartin 2014-12-03 17:50:43 UTC
i've bisected this down to:

master@glibc:.% git log sysdeps/aarch64/strchrnul.S    (kyle@dreadnought:~/src/glibc)
commit be9d4ccc7fe62751db1a5fdcb31958561dbbda9a
Author: Richard Earnshaw <rearnsha>
Date:   Wed Nov 5 13:51:56 2014 +0000

    [AArch64] Add optimized strchrnul.
    
    Here is an optimized implementation of __strchrnul.  The
    simplification that we don't have to track precisely why the loop
    terminates (match or end-of-string) means we have to do less work in
    both setup and the core inner loop.  That means this should never be
    slower than strchr.
    
    As with strchr, the use of LD1 means we do not need different versions
    for big-/little-endian.
master@glibc:.%    

which is more than a little strange...

Comment 13 Carlos O'Donell 2014-12-03 18:48:33 UTC
(In reply to Kyle McMartin from comment #12)
> i've bisected this down to:
> 
> master@glibc:.% git log sysdeps/aarch64/strchrnul.S                         
> (kyle@dreadnought:~/src/glibc)
> commit be9d4ccc7fe62751db1a5fdcb31958561dbbda9a
> Author: Richard Earnshaw <rearnsha>
> Date:   Wed Nov 5 13:51:56 2014 +0000
> 
>     [AArch64] Add optimized strchrnul.
>     
>     Here is an optimized implementation of __strchrnul.  The
>     simplification that we don't have to track precisely why the loop
>     terminates (match or end-of-string) means we have to do less work in
>     both setup and the core inner loop.  That means this should never be
>     slower than strchr.
>     
>     As with strchr, the use of LD1 means we do not need different versions
>     for big-/little-endian.
> master@glibc:.%    
> 
> which is more than a little strange...

Could be an assembly bug in that it violates the EABI, or a compiler bug in calling this assembly function with incorrect EABI invariants e.g. stack alignment etc.

I'm inclined to look for the caller violating the EABI, the assembly expecting something and not getting it, and stomping on the stack, and a failure happening later in the chain of events.

Comment 14 Kyle McMartin 2014-12-03 18:54:23 UTC
indeed... current hypothesis is that __find_specmb is failing interestingly. following that line of thinking now.
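
For context on why a strchrnul change would show up inside vfprintf at all: as far as I recall, __find_specmb is a thin wrapper that calls __strchrnul (format, '%') to skip ahead to the next conversion specifier. The sketch below shows those semantics with a plain C stand-in. ref_strchrnul, find_spec and count_percent are hypothetical names for illustration only, not glibc code, and the stand-in is a byte loop rather than the optimized assembly under suspicion.

```c
#include <assert.h>
#include <stddef.h>

/* Reference semantics of strchrnul: return a pointer to the first
   occurrence of c in s, or to the terminating NUL if c is absent.
   Unlike strchr it never returns NULL.  */
static const char *
ref_strchrnul (const char *s, int c)
{
  while (*s != '\0' && *s != (char) c)
    ++s;
  return s;
}

/* Stand-in for what the printf code does to locate the next
   specifier: jump to the next '%' or to the end of the format.  */
static const char *
find_spec (const char *format)
{
  assert (format != NULL);
  return ref_strchrnul (format, '%');
}

/* Count every '%' in format by repeatedly calling find_spec.
   This counts "%%" as two, so it is only a rough illustration.  */
static size_t
count_percent (const char *format)
{
  size_t n = 0;
  for (const char *p = find_spec (format); *p != '\0';
       p = find_spec (p + 1))
    ++n;
  return n;
}
```

For the format string in the backtrace, "%s%s/%.*s%s%s%c", count_percent returns 6, so the real strchrnul is called once per specifier (plus once more to reach the end) while vfprintf is in the middle of processing arguments. That makes any register state it disturbs visible to the caller.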

Comment 17 Kyle McMartin 2014-12-18 15:34:54 UTC
this ended up being a callee-saved vector register getting clobbered by strchrnul. fixed in fd8c9e7.

