Bug 1499260
Summary: | Failing HTM tbegin for z Series guests despite claiming support.
---|---
Product: | [Fedora] Fedora
Component: | glibc
Version: | 27
Hardware: | s390x
OS: | Unspecified
Status: | CLOSED ERRATA
Severity: | unspecified
Priority: | unspecified
Reporter: | Dan Horák <dan>
Assignee: | Carlos O'Donell <codonell>
QA Contact: | Fedora Extras Quality Assurance <extras-qa>
CC: | admiller, aoliva, arjun.is, awilliam, codonell, dan, dj, fweimer, gmarr, hannsj_uhl, jakub, law, mboddu, mfabian, pfrankli, rth, siddhesh
Whiteboard: | AcceptedFreezeException
: | 1509162
Last Closed: | 2017-10-26 22:30:44 UTC
Type: | Bug
Bug Blocks: | 467765, 1396705, 1509162
Description
Dan Horák
2017-10-06 13:42:37 UTC
I find it suspicious that this is after a 'tbegin' instruction has started executing a transactional region.

Exactly what hardware is this and does it claim to support HWCAP_S390_TE?

(In reply to Carlos O'Donell from comment #1)
> I find it suspicious that this is after a 'tbegin' instruction has started
> executing a transactional region.
>
> Exactly what hardware is this and does it claim to support HWCAP_S390_TE?

It's RH zEC12 which supports TE, and I read in z/VM 6.4 news that it brings "Guest Transactional Execution support". That's a change in our environment since last week.

(In reply to Dan Horák from comment #2)
> It's RH zEC12 which supports TE, and I read in z/VM 6.4 news that it brings
> "Guest Transactional Execution support". That's a change in our environment
> since last week.

Is there any way to disable TE at the hardware level so the kernel doesn't report it and then see if this fixes the boot issue?

Otherwise I will have to rebuild F27 glibc for s390x with elision turned off until I get the upstream tunables in place.

(In reply to Carlos O'Donell from comment #3)
> Is there any way to disable TE at the hardware level so the kernel doesn't
> report it and then see if this fixes the boot issue?
>
> Otherwise I will have to rebuild F27 glibc for s390x with elision turned off
> until I get the upstream tunables in place.

... if that's the issue.

So with a glibc that correctly disables the lock elision (https://koji.fedoraproject.org/koji/taskinfo?taskID=22348456) the boot of the installation image continues correctly, without the segfault ...

The installation then succeeds: it installs glibc-2.26-8.fc27.s390x from the Fedora repos and the installed system boots without an issue.

For the record, rawhide compose has the same problem (https://kojipkgs.fedoraproject.org/compose/rawhide/Fedora-Rawhide-20171011.n.0/compose/Server/s390x/os/images/)

Fixed scratch build with --disable-lock-elision: https://koji.fedoraproject.org/koji/taskinfo?taskID=22424738

It goes without saying that we are very interested in the root cause analysis of this issue with input from IBM, since this should "just work" (tm).

Scratch build passes and final libpthread.so.0 has no tbegin/tend.

Final F27 build here: https://koji.fedoraproject.org/koji/taskinfo?taskID=22435457

(In reply to Carlos O'Donell from comment #10)
> Scratch build passes and final libpthread.so.0 has no tbegin/tend.
>
> Final F27 build here:
> https://koji.fedoraproject.org/koji/taskinfo?taskID=22435457

Dan,

Can you please check these builds and see if they work and get back to me quickly? The sooner I hear back the faster I'll put this into a Bodhi update for F27.

Thanks!

glibc-2.26-14.fc27 has been submitted as an update to Fedora 27. https://bodhi.fedoraproject.org/updates/FEDORA-2017-0d3fdd3d1f

There is (only) one difference I'm aware of between the working and failing scenario - the installer boot is initiated from the CMS shell using the virtual card reader, while the installed system starts from CP using a DASD.
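The question raised earlier in this report, whether the guest claims HWCAP_S390_TE, can be checked from userspace. The following is a minimal sketch, not the method used by anyone in this report: it parses the auxiliary vector (as exposed by Linux in /proc/self/auxv) and tests the TE bit. The helper names `hwcap_from_auxv`, `claims_te` and `read_hwcap` are hypothetical; the constants are the values from the Linux and glibc s390 headers as far as I know, and the bit is only meaningful on s390x.

```python
import struct

AT_HWCAP = 16          # auxiliary-vector key for the hardware capability mask
HWCAP_S390_TE = 1024   # s390 transactional-execution bit; meaningless on other arches


def hwcap_from_auxv(raw):
    """Return the AT_HWCAP value found in raw auxv bytes, or 0 if it is absent."""
    entry = struct.Struct("@LL")  # (key, value) pairs of native unsigned long
    usable = len(raw) - len(raw) % entry.size  # ignore any trailing partial entry
    for key, value in entry.iter_unpack(raw[:usable]):
        if key == AT_HWCAP:
            return value
    return 0


def claims_te(hwcap):
    """True if the capability mask advertises transactional execution."""
    return bool(hwcap & HWCAP_S390_TE)


def read_hwcap(path="/proc/self/auxv"):
    """Read the capability mask of the current process (Linux only)."""
    with open(path, "rb") as f:
        return hwcap_from_auxv(f.read())
```

On an s390x guest, `claims_te(read_hwcap())` would show what the kernel reports to userspace. It says nothing about whether a `tbegin` actually succeeds, which is the discrepancy this bug is about.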
Proposed as a Freeze Exception for 27-final by Fedora user sharkcz using the blocker tracking app because:

The installer doesn't boot on an s390x system when glibc with lock elision is used.

(In reply to Carlos O'Donell from comment #11)
> Dan,
>
> Can you please check these builds and see if they work and get back to me
> quickly? The sooner I hear back the faster I'll put this into a Bodhi update
> for F27.

Thanks for the update, the installer image boots with your glibc build, karma will follow.

glibc-2.26-14.fc27 has been pushed to the Fedora 27 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2017-0d3fdd3d1f

+1 FE

+1 FE

glibc-2.26-15.fc27 has been submitted as an update to Fedora 27. https://bodhi.fedoraproject.org/updates/FEDORA-2017-0d3fdd3d1f

glibc-2.26-15.fc27 has been pushed to the Fedora 27 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2017-0d3fdd3d1f

Discussed during the 2017-10-23 blocker review meeting: [1]

The decision to classify this bug as an AcceptedFreezeException was made as this breaks install boot on a non-blocking arch and can't be fixed via an update.

[1] https://meetbot.fedoraproject.org/fedora-blocker-review/2017-10-23/f27-blocker-review.2017-10-23-16.00.txt

glibc-2.26-15.fc27 has been pushed to the Fedora 27 stable repository. If problems still persist, please make note of it in this bug report.

Went stable, closing bug. Please re-open if anything somehow still needs doing here.
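The check mentioned in this report, that the rebuilt libpthread.so.0 contains no tbegin/tend, could be automated roughly as follows. This is only a sketch under assumptions: it expects the tab-separated layout that GNU `objdump -d` prints (address, raw bytes, instruction), the function name `te_instructions` is hypothetical, and the mnemonic list adds `tbeginc` and `tabort` to the two instructions named in the report. How the check was actually performed is not stated in this bug.

```python
from collections import Counter

# Transactional-execution mnemonics on s390x; the report names tbegin and tend.
TE_MNEMONICS = ("tbegin", "tbeginc", "tend", "tabort")


def te_instructions(disassembly):
    """Count TE instructions in the text output of `objdump -d`.

    Assumes each instruction line has the form
    '<address>:\\t<hex bytes>\\t<mnemonic>[\\t<operands>]'.
    """
    counts = Counter()
    for line in disassembly.splitlines():
        fields = line.split("\t")
        if len(fields) < 3 or not fields[2].strip():
            continue  # symbol header, blank line, or byte-continuation line
        mnemonic = fields[2].split()[0]
        if mnemonic in TE_MNEMONICS:
            counts[mnemonic] += 1
    return dict(counts)
```

An empty result for the output of `objdump -d libpthread.so.0` would be consistent with a build made with --disable-lock-elision; a non-empty result lists which TE instructions remain and how often.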