Bug 1940964
Summary: | FTBFS: LLVM JIT related tests fail miserably on s390x: incompatible data layouts | ||
---|---|---|---|
Product: | [Fedora] Fedora | Reporter: | Honza Horak <hhorak> |
Component: | postgresql | Assignee: | Filip Januš <fjanus> |
Status: | ASSIGNED --- | QA Contact: | Fedora Extras Quality Assurance <extras-qa> |
Severity: | unspecified | Docs Contact: | |
Priority: | unspecified | ||
Version: | 34 | CC: | anon.amish, devrim, fjanus, hhorak, jmlich83, panovotn, pkubat, praiskup, sguelton, tgl, tstellar |
Target Milestone: | --- | Flags: | hhorak: needinfo? (sguelton) |
Target Release: | --- | ||
Hardware: | s390x | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | If docs needed, set a value | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | Type: | Bug | |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | |||
Bug Depends On: | |||
Bug Blocks: | 467765, 1883001, 1571215 |
Description
Honza Horak
2021-03-19 16:24:51 UTC
@sguelton @tstellar Does it sound familiar? Any tips on how to debug or fix it? (I'm personally clueless so far)

FWIW, this is thought to work upstream, though not with PG releases older than 13.2. Our resident expert thinks it sounds like a mismatch between libllvm and clang versions:
https://www.postgresql.org/message-id/20210319190047.7o4bwhbp5dzkqif3%40alap3.anarazel.de

(In reply to Tom Lane from comment #3)
> FWIW, this is thought to work upstream, though not with PG releases older
> than 13.2. Our resident expert thinks it sounds like a mismatch between
> libllvm and clang versions:
>
> https://www.postgresql.org/message-id/20210319190047.7o4bwhbp5dzkqif3%40alap3.anarazel.de

Thanks for the pointer, Tom.

However, I see it with these versions that do not seem to be in a mismatch:
$> rpm -q clang llvm
clang-12.0.0-0.7.rc3.fc35.s390x
llvm-12.0.0-0.7.rc3.fc35.s390x

As F34 is getting close and the plpython2 removal (https://src.fedoraproject.org/rpms/postgresql/pull-request/28) is now blocked by this, I think we can disable llvmjit for s390x until this is solved, as removing plpython2 later will not be possible.

(In reply to Honza Horak from comment #4)
> (In reply to Tom Lane from comment #3)
> > FWIW, this is thought to work upstream, though not with PG releases older
> > than 13.2. Our resident expert thinks it sounds like a mismatch between
> > libllvm and clang versions:
> >
> > https://www.postgresql.org/message-id/20210319190047.7o4bwhbp5dzkqif3%40alap3.anarazel.de
>
> Thanks for the pointer, Tom.
>
> However, I see it with these versions that do not seem to be in a mismatch:
> $> rpm -q clang llvm
> clang-12.0.0-0.7.rc3.fc35.s390x
> llvm-12.0.0-0.7.rc3.fc35.s390x

Actually, I do see an llvm v11 artifact left in the buildroot: llvm11-libs. annobin pulls it in, and annobin is pulled in by redhat-rpm-config. I haven't investigated properly yet, but hopefully a successful rebuild of annobin will help. I will test this in copr.
Disabling JIT until this is fixed seems like a reasonable idea to me. I'll update here when I have this tested.

I tried to rebuild annobin in copr to get rid of llvm11-libs, and while postgresql is still failing on s390x, the failures look different:
https://copr.fedorainfracloud.org/coprs/hhorak/test-pgsql-llvmjit/build/2092931/
FAILED (test process exited with exit code 2)
https://download.copr.fedorainfracloud.org/results/hhorak/test-pgsql-llvmjit/fedora-34-s390x/02092931-postgresql/build.log.gz

Even after getting rid of llvm11-libs from the buildroot (it is not pulled in in F34 any more) it does not work, still same error.

(In reply to Honza Horak from comment #8)
> Even after getting rid of llvm11-libs from the buildroot (it is not pulled
> in in F34 any more) it does not work, still same error.

Visible on the scratch build: https://koji.fedoraproject.org/koji/taskinfo?taskID=66082182

From what I can tell, this is a bug in postgresql. At runtime, it creates a JIT instance using the host CPU target, which has the DataLayout of the host. However, when compiling JIT code, it pulls the DataLayout from a bitcode file that is compiled at build time with no specific CPU target and thus has a different DataLayout.

Proposed fix for Fedora: https://src.fedoraproject.org/rpms/postgresql/pull-request/29

Related upstream discussion on the bugs list: https://www.postgresql.org/message-id/20210420225228.qr4x6zv3hqjorh5t%40alap3.anarazel.de

Same issue with postgresql 12.7:
+ERROR: failed to JIT module: Added modules have incompatible data layouts: E-m:e-i1:8:16-i8:8:16-i64:64-f128:64-a:8:16-n32:64 (module) vs E-m:e-i1:8:16-i8:8:16-i64:64-f128:64-v128:64-a:8:16-n32:64 (jit)
https://koji.fedoraproject.org/koji/taskinfo?taskID=68313568

The proposed workaround [1] needs to be updated to be suitable for postgresql 12.7.

[1] https://src.fedoraproject.org/rpms/postgresql/blob/41cd60000b91c121e1286c194284bffec770081b/f/postgresql-datalayout-mismatch-on-s390.patch
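
For anyone comparing the two layout strings from the error above by eye, here is a minimal sketch that splits each string on "-" and reports the components present on only one side. The helper name layout_diff is made up for illustration and is not part of PostgreSQL or LLVM. The two strings are copied verbatim from the error message.

```python
# Hypothetical helper (not part of PostgreSQL or LLVM): compare two LLVM
# data layout strings component by component.

def layout_diff(module_layout, jit_layout):
    """Return (only_in_module, only_in_jit) lists of layout components."""
    module_parts = module_layout.split("-")
    jit_parts = jit_layout.split("-")
    only_in_module = [p for p in module_parts if p not in jit_parts]
    only_in_jit = [p for p in jit_parts if p not in module_parts]
    return only_in_module, only_in_jit


# Strings copied from the "incompatible data layouts" error in this bug.
MODULE_LAYOUT = "E-m:e-i1:8:16-i8:8:16-i64:64-f128:64-a:8:16-n32:64"
JIT_LAYOUT = "E-m:e-i1:8:16-i8:8:16-i64:64-f128:64-v128:64-a:8:16-n32:64"

if __name__ == "__main__":
    only_module, only_jit = layout_diff(MODULE_LAYOUT, JIT_LAYOUT)
    print("only in module:", only_module)
    print("only in jit:   ", only_jit)
```

The only difference is the v128:64 component on the JIT side. In LLVM's layout syntax that component sets the alignment of 128-bit vector types. This is consistent with the analysis above that the JIT is created for the host CPU while the bitcode was built with no specific CPU target, though the sketch itself only shows where the strings differ, not why.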