Bug 1799408 - git: FTBFS in Fedora rawhide/f32
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: git
Version: 32
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Todd Zullinger
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Duplicates: 1799087
Depends On:
Blocks: ZedoraTracker F32FTBFS GCC10 1799531
Reported: 2020-02-06 16:59 UTC by Fedora Release Engineering
Modified: 2020-03-26 08:05 UTC (History)
CC List: 12 users

Fixed In Version: gcc-10.0.1-0.9.fc33
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-03-16 20:39:04 UTC
Type: ---
Embargoed:


Attachments
build.log (deleted): 2020-02-06 16:59 UTC, Fedora Release Engineering
root.log (deleted): 2020-02-06 16:59 UTC, Fedora Release Engineering
state.log (deleted): 2020-02-06 16:59 UTC, Fedora Release Engineering
changes to build git with clang (deleted): 2020-02-12 01:33 UTC, Todd Zullinger
root.log (deleted): 2020-03-25 19:40 UTC, IBM Bug Proxy
build.log (deleted): 2020-03-25 19:41 UTC, IBM Bug Proxy
state.log (deleted): 2020-03-25 19:41 UTC, IBM Bug Proxy


Links
System: GNU Compiler Collection
ID: 93908
Private: 0
Priority: P2
Status: RESOLVED
Summary: [8/9/10 Regression] git miscompilation on s390x-linux with -O2 -march=zEC12 -mtune=z13 starting with r8-1288
Last Updated: 2020-03-24 13:03:06 UTC

Description Fedora Release Engineering 2020-02-06 16:59:33 UTC
git failed to build from source in Fedora rawhide/f32

https://koji.fedoraproject.org/koji/taskinfo?taskID=41317422


For details on the mass rebuild see:

https://fedoraproject.org/wiki/Fedora_32_Mass_Rebuild
Please fix git at your earliest convenience and set the bug's status to
ASSIGNED when you start fixing it. If the bug remains in NEW state for 8 weeks,
git will be orphaned. Before branching of Fedora 33,
git will be retired, if it still fails to build.

For more details on the FTBFS policy, please visit:
https://fedoraproject.org/wiki/Fails_to_build_from_source

Comment 1 Fedora Release Engineering 2020-02-06 16:59:39 UTC
Created attachment 1659041 [details]
build.log

file build.log too big, will only attach last 32768 bytes

Comment 2 Fedora Release Engineering 2020-02-06 16:59:41 UTC
Created attachment 1659042 [details]
root.log

file root.log too big, will only attach last 32768 bytes

Comment 3 Fedora Release Engineering 2020-02-06 16:59:43 UTC
Created attachment 1659043 [details]
state.log

Comment 4 Ben Cotton 2020-02-11 17:08:23 UTC
This bug appears to have been reported against 'rawhide' during the Fedora 32 development cycle.
Changing version to 32.

Comment 5 Todd Zullinger 2020-02-12 01:31:51 UTC
This appears to be a problem caused or exposed by gcc-10 on s390x.  The same git packages built fine before the mass rebuild on all architectures.

Building with clang works fine on s390x as well.  Here's a scratch build with clang:

https://koji.fedoraproject.org/koji/taskinfo?taskID=41462126

There aren't any errors during compilation with gcc, but there are a number of warnings which I suspect are at the root of the issue:

$ egrep '[0-9]: (warning|error):' /tmp/build.log | sort | uniq -c | sort -rn
     13 revision.c:322:22: warning: array subscript [1, 2147483647] is outside 
        array bounds of 'char[1]' [-Warray-bounds]
      1 trace2/tr2_dst.c:296:10: warning: 'fd' may be used uninitialized in 
        this function [-Wmaybe-uninitialized]
      1 read-cache.c:2661:18: warning: 'saved_namelen' may be used 
        uninitialized in this function [-Wmaybe-uninitialized]
      1 parse-options.c:218:8: warning: 'arg' may be used uninitialized in this
        function [-Wmaybe-uninitialized]
      1 ll-merge.c:74:4: warning: '%s' directive argument is null 
        [-Wformat-overflow=]
      1 commit.h:144:35: warning: 'commit' may be used uninitialized in this 
        function [-Wmaybe-uninitialized]

Running the test suite, there are many, many segfaults from git builtin commands:

$ grep -aic 'segmentation fault .*(core dumped).* git' /tmp/build.log 
1413

I'm not sure how to best proceed here. If these are truly issues in the git code, shouldn't the compiler toss errors rather than merely warning and producing broken binaries?
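
For anyone repeating this triage, the two log checks above can be wrapped into small helpers. This is only a sketch: the function names `tally_diagnostics` and `count_git_segfaults` are made up here, `grep -E` stands in for `egrep`, and the sample log is illustrative rather than taken from the real build.log (the segfault line in particular is invented to match the pattern).

```shell
#!/bin/sh
# Tally compiler diagnostics by message, most frequent first.
# (Hypothetical helper name; same pipeline as in the comment, using grep -E.)
tally_diagnostics() {
    grep -E '[0-9]: (warning|error):' "$1" | sort | uniq -c | sort -rn
}

# Count log lines that report a git command dying with a segfault.
# (Hypothetical helper name.)
count_git_segfaults() {
    grep -aic 'segmentation fault .*(core dumped).* git' "$1"
}

# Illustrative sample log; NOT the real build.log from this bug.
log=$(mktemp)
cat > "$log" <<'EOF'
revision.c:322:22: warning: array subscript [1, 2147483647] is outside array bounds of 'char[1]' [-Warray-bounds]
revision.c:322:22: warning: array subscript [1, 2147483647] is outside array bounds of 'char[1]' [-Warray-bounds]
ll-merge.c:74:4: warning: '%s' directive argument is null [-Wformat-overflow=]
line 1: 4242 Segmentation fault      (core dumped) git status
EOF

top=$(tally_diagnostics "$log" | head -n 1)
segfaults=$(count_git_segfaults "$log")
rm -f "$log"

echo "most frequent diagnostic: $top"
echo "git segfaults: $segfaults"
```

Keeping the two counts side by side makes it easier to see whether a change in compiler flags moves the warnings, the crashes, or both.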

Comment 6 Todd Zullinger 2020-02-12 01:33:26 UTC
Created attachment 1662549 [details]
changes to build git with clang

Here are the changes I used to build with clang for testing, in case anyone is curious.

Comment 7 Jakub Jelinek 2020-02-12 19:47:51 UTC
I'd recommend trying gcc with -O0 instead of -O2 (or whatever git is normally built with). If that works, bisect between the -O0 and -O2 built objects to narrow down the problematic one; a similar bisection can be done among the functions in the TU if needed.
And/or try -fsanitize=address,undefined.
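
The object-level bisection suggested above might look roughly like the following. It is a sketch under stated assumptions: exactly one object is miscompiled, object names contain no spaces, and `mix_passes` is a hypothetical stand-in for "link the listed objects from the -O2 build, take the rest from the -O0 build, and run one test". Here it only simulates a bad object (named by the made-up `BAD_OBJECT` variable) so the loop can run without a build tree.

```shell
#!/bin/sh
# Name of the simulated culprit; purely hypothetical.
BAD_OBJECT=${BAD_OBJECT:-culprit.o}

# Hypothetical stand-in for: link "$@" from the -O2 tree, everything else
# from the -O0 tree, run one test, and return 0 if the test passes.
# This version only simulates the outcome.
mix_passes() {
    for o in "$@"; do
        if [ "$o" = "$BAD_OBJECT" ]; then
            return 1
        fi
    done
    return 0
}

# Narrow the argument list down to the one object whose -O2 build breaks
# the test. Assumes the full list fails and exactly one object is at fault.
bisect_objects() {
    while [ "$#" -gt 1 ]; do
        half=$(( $# / 2 ))
        first=""
        second=""
        i=0
        for obj in "$@"; do
            i=$(( i + 1 ))
            if [ "$i" -le "$half" ]; then
                first="$first $obj"
            else
                second="$second $obj"
            fi
        done
        # Word splitting of $first / $second is intentional here.
        if mix_passes $first; then
            set -- $second   # first half is clean, culprit is in the rest
        else
            set -- $first    # first half alone already fails
        fi
    done
    echo "$1"
}

bisect_objects apply.o culprit.o revision.o read-cache.o diff.o
```

Each round halves the candidate set, so even a few hundred objects need fewer than ten relink-and-test cycles. The same loop could be applied to functions within one translation unit once a single object has been isolated.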

Comment 8 Todd Zullinger 2020-02-13 15:37:04 UTC
Thanks Jakub!

I don't have shell access to an s390x system (if I did, I'd quickly be out of my depth trying to bisect the differences in the compiled binaries).  I suspect that it's revision.o that's broken, just based on the amount and type of warnings from gcc for revision.c.  That's a pretty core part of git, so it would affect many commands, like we see failing the test suite.

I built git with -O0 and it passes the test suite:

https://koji.fedoraproject.org/koji/taskinfo?taskID=41472527

If I adjust the git spec to continue despite the test suite failures with the normal -O2 optimization, will it be feasible to compare the binaries from the packages on a non-s390x host?

Such a build is here:

https://koji.fedoraproject.org/koji/taskinfo?taskID=41480185

Thank you for your help.

Comment 9 Jakub Jelinek 2020-02-22 16:59:59 UTC
Sorry for the delay, I've managed to reproduce this now and bisected a little bit.
The result (at least for the t0020-crlf.sh test I have been testing with) seems to depend on how diff.o is compiled:
if it is compiled with -O2 -march=zEC12 -mtune=zEC12, it works fine; if it is compiled with
-O2 -march=zEC12 -mtune=z13, the test FAILs.  The change started with GCC http://gcc.gnu.org/r278218 + http://gcc.gnu.org/r278219 ;
r278217 still works fine even with -mtune=z13, while r278219 fails.  Unfortunately that change was to inlining decisions, so finding out where exactly the problem is, and whether it is a GCC problem or a git problem, will be harder.
I guess I'll try to do some bisection within diff.i.

Comment 10 Jakub Jelinek 2020-02-22 17:00:55 UTC
And a workaround would be -mtune=zEC12, either for just diff.c in the toplevel directory, or for all files.

Comment 11 Todd Zullinger 2020-02-23 00:17:46 UTC
Thank you very much for digging into this, Jakub!  Your time and expertise are very much appreciated.

I updated the git spec file to do `s/-mtune=z13/-mtune=zEC12/` on the %build_cflags macro for s390x builds (https://src.fedoraproject.org/rpms/git/c/9a7edd).  That was easier and less invasive than it would be to adjust only the diff.c compiler options.
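
The effect of that substitution can be checked in isolation. A minimal sketch, assuming a flag string that merely resembles an s390x %build_cflags value (the exact Fedora flags are not shown in this bug, and the variable names are made up):

```shell
#!/bin/sh
# Illustrative flags only; not the real Fedora %build_cflags.
cflags="-O2 -g -march=zEC12 -mtune=z13"

# Same substitution as described above: retune for zEC12, leave -march alone.
fixed=$(printf '%s\n' "$cflags" | sed 's/-mtune=z13/-mtune=zEC12/')

echo "before: $cflags"
echo "after:  $fixed"
```

Because only the -mtune value changes, the generated code still targets the same instruction set; only the tuning that triggered the failure is swapped out.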

I'll be interested to know whether this turns out to be an issue in gcc or git -- though I'll likely barely understand the results. :)

Comment 12 Jakub Jelinek 2020-03-12 10:34:58 UTC
Should be fixed in gcc-10.0.1-0.9.fc{32,33}; the f32 version hasn't finished building yet.

Comment 13 Todd Zullinger 2020-03-12 14:14:59 UTC
I started a scratch build when I saw your post on the devel list. That completed successfully for f33 (https://koji.fedoraproject.org/koji/taskinfo?taskID=42431001).  I expect that'll be the same for f32, but just to be sure I'll wait for the gcc build to complete and run a scratch build there before pushing the change to remove the workaround.  Thanks Jakub!

Comment 14 Fedora Update System 2020-03-13 10:48:36 UTC
FEDORA-2020-d927e07eb1 has been submitted as an update to Fedora 32. https://bodhi.fedoraproject.org/updates/FEDORA-2020-d927e07eb1

Comment 15 Fedora Update System 2020-03-13 18:33:38 UTC
annobin-9.06-4.fc32 and gcc-10.0.1-0.9.fc32 have been pushed to the Fedora 32 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2020-d927e07eb1

Comment 16 Fedora Update System 2020-03-16 20:39:04 UTC
annobin-9.06-4.fc32 and gcc-10.0.1-0.9.fc32 have been pushed to the Fedora 32 stable repository. If problems still persist, please make note of it in this bug report.

Comment 17 Severin Gehwolf 2020-03-25 19:25:17 UTC
*** Bug 1799087 has been marked as a duplicate of this bug. ***

Comment 18 IBM Bug Proxy 2020-03-25 19:40:56 UTC
------- Comment From Andreas.Krebbel.com 2020-03-12 11:20 EDT-------
Jakub recently fixed a nasty bug in combine which triggered miscompiles on S/390 (e.g. git). Was that patch already part of your GCC when you tried to build the package?

------- Comment From Andreas.Krebbel.com 2020-03-12 11:28 EDT-------
This was the patch from Jakub:

commit 73dc4ae47418aef2eb470b8f71cef57dce37349e
Author: Jakub Jelinek <jakub>
Date:   Tue Feb 25 13:56:47 2020 +0100

combine: Fix find_split_point handling of constant store into ZERO_EXTRACT [PR93908]

Comment 19 IBM Bug Proxy 2020-03-25 19:40:58 UTC
Created attachment 1673611 [details]
root.log

Comment 20 IBM Bug Proxy 2020-03-25 19:41:01 UTC
Created attachment 1673612 [details]
build.log

Comment 21 IBM Bug Proxy 2020-03-25 19:41:02 UTC
Created attachment 1673613 [details]
state.log

