
Bug 1859689

Summary: cr_xml_parser_generic_from_string fails on large inputs
Product: Red Hat Enterprise Linux 8
Reporter: Daniel Alley <dalley>
Component: createrepo_c
Assignee: amatej
Status: CLOSED ERRATA
QA Contact: Eva Mrakova <emrakova>
Severity: unspecified
Docs Contact:
Priority: high
Version: CentOS Stream
CC: amatej, bstinson, carl, dmach, jmracek, jrohel, jwboyer, mblaha, pkratoch, rpm-software-management, tmlcoch
Target Milestone: rc
Keywords: Triaged
Target Release: 8.3
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: createrepo_c-0.15.11-2.el8
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2020-11-04 03:09:16 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
Description: error message
Flags: none

Description Daniel Alley 2020-07-22 17:13:27 UTC
Created attachment 1702118
error message

Description of problem:

When a very large XML string is passed to cr_xml_parser_generic_from_string, parsing fails and an error message is printed (see the attachment). In this case the XML comes from the filelists for the "flat-remix-icon-theme" package present in Fedora 30 and 31 (and probably other releases, but I haven't looked). This package contains many tens of thousands of icon files.

Version-Release number of selected component (if applicable):

0.16.0

How reproducible:

Always

Additional info:

It looks like the file-based XML parsing function reads the input through a buffer of bounded size, whereas the string-based XML parsing function simply passes the entire string to xmlParseChunk() in a single call.

https://github.com/rpm-software-management/createrepo_c/blob/7fbb4f9258e6d2f00b4add4da05f34adf43078db/src/xml_parser.c#L252
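
To illustrate the difference, here is a minimal sketch of feeding a string to a push parser in bounded pieces, the way the file-based path effectively does. This is not the createrepo_c code and not the eventual fix. The names feed_in_chunks, chunk_fn, feed_stats, count_chunk and SKETCH_CHUNK_SIZE are hypothetical, and the chunk size is an arbitrary assumption. The callback only mirrors the (context, chunk, size, terminate) shape of xmlParseChunk(); a counting stand-in is used so the sketch builds without libxml2.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical callback with the same shape as libxml2's
 * xmlParseChunk(ctxt, chunk, size, terminate). Returns 0 on success. */
typedef int (*chunk_fn)(void *ctx, const char *chunk, int size, int terminate);

/* Assumed chunk size for illustration only; not taken from createrepo_c. */
#define SKETCH_CHUNK_SIZE 8192

/* Feed `len` bytes of `data` to `feed` in pieces of at most `chunk_size`
 * bytes. The final piece is passed with terminate = 1. Returns 0 on
 * success, or the first non-zero value returned by `feed`. */
int feed_in_chunks(const char *data, size_t len, size_t chunk_size,
                   chunk_fn feed, void *ctx)
{
    /* The size argument is an int, so each piece must fit in one. */
    assert(chunk_size > 0 && chunk_size <= (size_t)INT_MAX);

    size_t off = 0;
    do {
        size_t n = len - off;
        if (n > chunk_size)
            n = chunk_size;
        int last = (off + n == len);
        int rc = feed(ctx, data + off, (int)n, last);
        if (rc != 0)
            return rc;
        off += n;
    } while (off < len);
    return 0;
}

/* Stand-in for the parser: records how the input was delivered. */
struct feed_stats {
    int calls;       /* number of callback invocations */
    size_t bytes;    /* total bytes delivered */
    int max_size;    /* largest single piece seen */
    int terminated;  /* number of calls with terminate != 0 */
};

int count_chunk(void *ctx, const char *chunk, int size, int terminate)
{
    struct feed_stats *st = ctx;
    (void)chunk; /* contents are irrelevant for the stand-in */
    st->calls++;
    st->bytes += (size_t)size;
    if (size > st->max_size)
        st->max_size = size;
    if (terminate)
        st->terminated++;
    return 0;
}
```

With a real parser, the callback would be a thin wrapper around xmlParseChunk(); the point is only that no single call ever receives more than the chosen bound, regardless of how large the input string is.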

Comment 1 amatej 2020-07-24 11:33:11 UTC
You are correct, the buffer size is the problem. Thanks for the investigation!

Here is a PR that should help: https://github.com/rpm-software-management/createrepo_c/pull/225
It also contains a Python unit test.

Comment 2 Daniel Alley 2020-07-24 13:33:34 UTC
I tested your branch against the case where we initially discovered this, and I can confirm that the patch fixes the issue for us. Thank you!

Comment 9 errata-xmlrpc 2020-11-04 03:09:16 UTC
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (createrepo_c bug fix and enhancement update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2020:4700