Note: This is a public test instance of Red Hat Bugzilla. The data contained within is a snapshot of the live data so any changes you make will not be reflected in the production Bugzilla. Email is disabled so feel free to test any aspect of the site that you want. File any problems you find or give feedback at bugzilla.redhat.com.
Bug 1582555
Summary: | regression in zlib-1.2.11-8: ARM optimizations broke git log on aarch64 | ||
---|---|---|---|
Product: | [Fedora] Fedora | Reporter: | Pavel Cahyna <pcahyna> |
Component: | git | Assignee: | Petr Stodulka <pstodulk> |
Status: | CLOSED ERRATA | QA Contact: | Fedora Extras Quality Assurance <extras-qa> |
Severity: | urgent | Docs Contact: | |
Priority: | urgent | ||
Version: | 28 | CC: | adenilson.cavalcanti, amahdal, besser82, c.david86, chrisw, dave.rodgman, jaromir.capik, jbowes, jchaloup, jeremy.linton, jlinton, pbrobinson, perobins, praiskup, pstodulk, skisela, tmz |
Target Milestone: | --- | ||
Target Release: | --- | ||
Hardware: | aarch64 | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | git-2.17.1-3.fc28 | Doc Type: | If docs needed, set a value |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2018-06-22 15:04:44 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: |
Description
Pavel Cahyna
2018-05-25 14:49:27 UTC
Thanks for tracking this down Pavel. I hit this last night while trying to build a minor git bugfix release, as it breaks the test suite badly. I would imagine this causes breakage in many other packages too, many which may not have test suites that catch the issue before it reaches users. Jeremy can you take a look at this? Yes, I will get the patch authors to take a look. Thanks, Jeremy, can we please also have the links to the upstream pull requests? Or all the links pointing to (attempts for) upstream discussions? (In reply to Pavel Raiskup from comment #4) > Jeremy, can we please also have the links to the upstream pull requests? > Or all the links pointing to (attempts for) upstream discussions? They're all AFAIK at the upstream maintainers githhub (along with the 100 odd others): https://github.com/madler/zlib/pulls Joining the discussion, I will start looking into it ASAP. I thought this sounded like a zlib bug, but I think I've tracked it down to what appears to be a bug in git, which i've fixed and now all the unit tests have passed. I'm going to get someone else to take a quick peek and doublecheck the patch. Interesting. Thanks for poking at this Jeremy and adenilson! So, I think I fully understand what is happening now. "Bug" is a bit strong, more a misinterpretation of whats allowed. So git, in unpack_compressed_entry() is allocating a buffer of 'size' using xmallocz_gently() which actually allocates a slightly larger buffer and null terminates it. The code in unpack_compressed_entry() is utilizing this feature to go ahead and pass size+1 to zlib as the size of the buffer. To zlib it considers that it owns a buffer of size+1 and does a correct decompression (sets consumed/remaining sizes correctlt) but poisons the remainder of the buffer. The zlib code then gets the return it expects and continues on its merry way not considering that the null may have been poisoned. Initially a quck hack to set the buffer passed to zlib as 'size' cleans up all the unit test failures, despite the fact that zlib is triggering an error case (which is ignored due to a successful return) due to the extra byte being consumed (AKA git's overconsumption check fails if the overconsumption is exactly 1 byte). Posted a patch here: https://public-inbox.org/git/20180525231713.23047-1-lintonrjeremy@gmail.com/T/#u I'm changing the component. Comment #9 above should read "The git code.." Thanks for the quick diagnosis and patch Jeremy! I have a scratch build running now: https://koji.fedoraproject.org/koji/taskinfo?taskID=27194025 I'll check back on that shortly and follow up on your patch on the git list. Pavel, can you give test the scratch build on aarch64 and confirm it fixes things? The test suite passes so I'm confident it will, but it's always good to have extra confirmation. Plus this has been broken on f28 aarch64 for about a month. Another day or two won't hurt at this point. I pushed this to the master branch of my fork: https://src.fedoraproject.org/fork/tmz/rpms/git Please, consider waiting for upstream git fix till we break the upstream-first rule again. Yeah, I don't intend to push anything until upstream has a consensus. I'm following that discussion closely. I built those scratch packages just for testing to help confirm that it resolved the issue. I don't have access to any aarch64 hardware, so the only testing I'm able to do is running the test suite in koji. That's why I asked about testing the case where you noticed the issue. :) In unfortunate timing, a git released an update today which fixes 2 security issues (CVE-2018-11233 & CVE-2018-11235)¹. This forces our hand a bit with regard to fixing the aarch64 builds. We could disable the tests on aarch64 to allow a successful git build (which we know to be broken). That's certainly not ideal. Instead, I've applied Jeremy's packfile patch to aarch64 only. Upstream has some questions about the patch, which can hopefully be resolved soon and the patch (or some version of it) can be accepted. (I wish I had seen the broken git builds in koschei when they showed up weeks ago. I don't know if those failed builds didn't generate any notification or if I've disabled the notifications because they too frequently were due to transient issues.) ¹ I've alerted the Red Hat security team so they can create tracking bugs for the issues git-2.17.1-2.fc28 has been submitted as an update to Fedora 28. https://bodhi.fedoraproject.org/updates/FEDORA-2018-75f7624a9f Still there's possibility that our zlib is broken; could we fix the optimization patches so the input/output API isn't changed at all? I hope that the conversation on the git list continues and a consensus is reached on the proper changes and in which components, of course. Since there is a relatively high risk vulnerability requiring a git build (#1583862), I didn't feel that we had much choice but to apply the git patch on aarch64. Seeing that git was fairly well broken on aarch64 already, it's unlikely that it's going to be worse with the patch. :) So far as I can tell, upstream is (rightly) concerned that the patch undoes a previous change to the deflate handling in packfile.c and wants to be sure that the fix isn't going to cause any regressions. The tests suite still passes, including the test which was added at that time (in 39eea7bdd9 ("Fix incorrect error check while reading deflated pack data", 2009-10-21)). That makes me think that even if the patch isn't the ideal fix, it's not going to cause problems. I only applied the current git patches on aarch64 and I don't have the update set to close this bug when the update goes stable. I'm a bit split about the "correct" fix. The zlib changes are as I understand it intentionally poisioning the buffer, but I don't really think the git code is correct either (despite the attempt at defensive coding, it needs to protect that null and its not doing it). I would have a tendency to fix both of them... git-2.17.1-2.fc28 has been pushed to the Fedora 28 testing repository. If problems still persist, please make note of it in this bug report. See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates. You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2018-75f7624a9f git-2.17.1-2.fc28 has been pushed to the Fedora 28 stable repository. If problems still persist, please make note of it in this bug report. It may well be that changes are needed in both git and zlib. I don't think anyone upstream at git is opposed to that idea, they just want to ensure that the changes don't lose a fix for a previous issue. Without the zlib patches upstream, the only people affected by this are us, and only on aarch64. That makes it harder to get much traction or interest and leaves us in the less-than-ideal position of carrying non-upstream patches in both git and zlib. It would be great if the folks who best understand these changes could work with git and zlib upstream to get any necessary patches accepted.
> It would be great if the folks who best understand these changes could work
> with git and zlib upstream to get any necessary patches accepted.
They are, jlinton is the one that debugged and sent the git patch, the upstream zlib maintainer is non responsive, these same patches are in chromeOS/chromium and other places too.
Alright. I asked because upstream git had some questions and I haven't seen any responses there. yes, last questions in the thread were in the message from Junio C Hamano on the 28th and there was no reply. Hi Pavel, do you have a link to the upstream thread you mention? I'll check that we're following this up with upstream. I Dave, the thread is at https://public-inbox.org/git/20180525231713.23047-1-lintonrjeremy@gmail.com/T/#u (from comment 10). :) Thanks for looking into this further! Yes, I'm going to probably just post the null terminator version RSN.. AFAIK, I posted a response last week to that affect although its not showing up in that public-inbox thread for whatever reason. Thanks Jeremy. FWIW, the message never arrived in my mailbox either, so I imagine it didn't reach the git list (or was dropped by the list server itself). git-2.17.1-3.fc28 has been submitted as an update to Fedora 28. https://bodhi.fedoraproject.org/updates/FEDORA-2018-972fa3841b I replaced the initial patch with what's now queued upstream (https://github.com/git/git/commit/b611396e97), and applied it to all architectures. I set the update to close this ticket when it reaches stable. If there's any reason to keep it open, let me know and I'll adjust the update settings. Thanks to Jeremy and everyone else that helped get this fixed. git-2.17.1-3.fc28 has been pushed to the Fedora 28 testing repository. If problems still persist, please make note of it in this bug report. See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates. You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2018-972fa3841b git-2.17.1-3.fc28 has been pushed to the Fedora 28 stable repository. If problems still persist, please make note of it in this bug report. |