Bug 1959006 - python-lz4 fails to build with Python 3.10: MemoryError in tests
Summary: python-lz4 fails to build with Python 3.10: MemoryError in tests
Keywords:
Status: CLOSED RAWHIDE
Alias: None
Product: Fedora
Classification: Fedora
Component: python-lz4
Version: rawhide
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: unspecified
Target Milestone: ---
Assignee: Miro Hrončok
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Duplicates: 1969020
Depends On:
Blocks: PYTHON3.10, F35FTBFS / RAWHIDEFTBFS, F35FailsToInstall / RAWHIDEFailsToInstall, 1927646, 1959011, 1968973, 1968984
 
Reported: 2021-05-10 14:49 UTC by Miro Hrončok
Modified: 2021-06-12 23:27 UTC
CC List: 6 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-06-12 23:27:47 UTC
Type: Bug
Embargoed:


Attachments: none

Description Miro Hrončok 2021-05-10 14:49:36 UTC
python-lz4 fails to build with Python 3.10.0b1:
=================================== FAILURES ===================================
_________________ test_frame_open_decompress_mem_usage[data0] __________________
data = b'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa...aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa'
    def test_frame_open_decompress_mem_usage(data):
        tracemalloc = pytest.importorskip('tracemalloc')
        tracemalloc.start()
    
        with lz4.frame.open('test.lz4', 'w') as f:
            f.write(data)
    
        prev_snapshot = None
    
        for i in range(1000):
            with lz4.frame.open('test.lz4', 'r') as f:
>               decompressed = f.read()  # noqa: F841
data       = b'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa...aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa'
f          = <lz4.frame.LZ4FrameFile object at 0x7faf48083e80>
i          = 0
prev_snapshot = None
tracemalloc = <module 'tracemalloc' from '/usr/lib64/python3.10/tracemalloc.py'>
tests/frame/test_frame_5.py:80: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
self = <lz4.frame.LZ4FrameFile object at 0x7faf48083e80>, size = -1
    def read(self, size=-1):
        """Read up to ``size`` uncompressed bytes from the file.
    
        If ``size`` is negative or omitted, read until ``EOF`` is reached.
        Returns ``b''`` if the file is already at ``EOF``.
    
        Args:
            size(int): If non-negative, specifies the maximum number of
                uncompressed bytes to return.
    
        Returns:
            bytes: uncompressed data
    
        """
        self._check_can_read()
>       return self._buffer.read(size)
self       = <lz4.frame.LZ4FrameFile object at 0x7faf48083e80>
size       = -1
lz4/frame/__init__.py:635: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
self = <_compression.DecompressReader object at 0x7faf48080a30>
    def readall(self):
        chunks = []
        # sys.maxsize means the max length of output buffer is unlimited,
        # so that the whole input buffer can be decompressed within one
        # .decompress() call.
>       while data := self.read(sys.maxsize):
chunks     = []
self       = <_compression.DecompressReader object at 0x7faf48080a30>
/usr/lib64/python3.10/_compression.py:118: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
self = <_compression.DecompressReader object at 0x7faf48080a30>
size = 9223372036854775807
    def read(self, size=-1):
        if size < 0:
            return self.readall()
    
        if not size or self._eof:
            return b""
        data = None  # Default if EOF is encountered
        # Depending on the input data, our call to the decompressor may not
        # return any data. In this case, try again after reading another block.
        while True:
            if self._decompressor.eof:
                rawblock = (self._decompressor.unused_data or
                            self._fp.read(BUFFER_SIZE))
                if not rawblock:
                    break
                # Continue to next stream.
                self._decompressor = self._decomp_factory(
                    **self._decomp_args)
                try:
                    data = self._decompressor.decompress(rawblock, size)
                except self._trailing_error:
                    # Trailing data isn't a valid compressed stream; ignore it.
                    break
            else:
                if self._decompressor.needs_input:
                    rawblock = self._fp.read(BUFFER_SIZE)
                    if not rawblock:
                        raise EOFError("Compressed file ended before the "
                                       "end-of-stream marker was reached")
                else:
                    rawblock = b""
>               data = self._decompressor.decompress(rawblock, size)
data       = None
rawblock   = b'\x04"M\x18@@\xc0\x0b\x01\x00\x00\x1fa\x01\x00\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xf...\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xe8Paaaaa\x00\x00\x00\x00'
self       = <_compression.DecompressReader object at 0x7faf48080a30>
size       = 9223372036854775807
/usr/lib64/python3.10/_compression.py:103: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
self = <lz4.frame.LZ4FrameDecompressor object at 0x7faf480801f0>
data = b'\x04"M\x18@@\xc0\x0b\x01\x00\x00\x1fa\x01\x00\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xf...\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xe8Paaaaa\x00\x00\x00\x00'
max_length = 9223372036854775807
    def decompress(self, data, max_length=-1):  # noqa: F811
        """Decompresses part or all of an LZ4 frame of compressed data.
    
        The returned data should be concatenated with the output of any
        previous calls to `decompress()`.
    
        If ``max_length`` is non-negative, returns at most ``max_length`` bytes
        of decompressed data. If this limit is reached and further output can
        be produced, the `needs_input` attribute will be set to ``False``. In
        this case, the next call to `decompress()` may provide data as
        ``b''`` to obtain more of the output. In all cases, any unconsumed data
        from previous calls will be prepended to the input data.
    
        If all of the input ``data`` was decompressed and returned (either
        because this was less than ``max_length`` bytes, or because
        ``max_length`` was negative), the `needs_input` attribute will be set
        to ``True``.
    
        If an end of frame marker is encountered in the data during
        decompression, decompression will stop at the end of the frame, and any
        data after the end of frame is available from the `unused_data`
        attribute. In this case, the `LZ4FrameDecompressor` instance is reset
        and can be used for further decompression.
    
        Args:
            data (str, bytes or buffer-compatible object): compressed data to
                decompress
    
        Keyword Args:
            max_length (int): If this is non-negative, this method returns at
                most ``max_length`` bytes of decompressed data.
    
        Returns:
            bytes: Uncompressed data
    
        """
    
        if self._unconsumed_data:
            data = self._unconsumed_data + data
    
>       decompressed, bytes_read, eoframe = decompress_chunk(
            self._context,
            data,
            max_length=max_length,
            return_bytearray=self._return_bytearray,
        )
E       MemoryError
data       = b'\x04"M\x18@@\xc0\x0b\x01\x00\x00\x1fa\x01\x00\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xf...\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xe8Paaaaa\x00\x00\x00\x00'
max_length = 9223372036854775807
self       = <lz4.frame.LZ4FrameDecompressor object at 0x7faf480801f0>
lz4/frame/__init__.py:394: MemoryError
=========================== short test summary info ============================
FAILED tests/frame/test_frame_5.py::test_frame_open_decompress_mem_usage[data0]
!!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!!
================= 1 failed, 19015 passed in 160.81s (0:02:40) ==================
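
The traceback shows the mechanism: Python 3.10's _compression.DecompressReader.readall() now calls self.read(sys.maxsize), so max_length=sys.maxsize reaches LZ4FrameDecompressor.decompress(), and decompress_chunk() raises MemoryError, apparently while sizing its output buffer from max_length. A minimal standalone reproducer sketch, assuming python-lz4 is importable under Python 3.10 (the file name is illustrative):

    import lz4.frame

    # Write a small, highly compressible frame, then read it back.
    payload = b'a' * 1024
    with lz4.frame.open('test.lz4', 'w') as f:
        f.write(payload)

    # Under Python 3.10, f.read() reaches the decompressor with
    # max_length=sys.maxsize (see _compression.py:118 in the traceback),
    # which ends in MemoryError inside decompress_chunk().
    with lz4.frame.open('test.lz4', 'r') as f:
        decompressed = f.read()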

For the build logs, see:
https://copr-be.cloud.fedoraproject.org/results/@python/python3.10/fedora-rawhide-x86_64/02174997-python-lz4/

For all our attempts to build python-lz4 with Python 3.10, see:
https://copr.fedorainfracloud.org/coprs/g/python/python3.10/package/python-lz4/

Testing and the mass rebuild of packages are happening in copr. You can follow these instructions to test locally in mock whether your package builds with Python 3.10:
https://copr.fedorainfracloud.org/coprs/g/python/python3.10/

Let us know here if you have any questions.

Python 3.10 will be included in Fedora 35. To make that update smoother, we're building Fedora packages with early pre-releases of Python 3.10.
A build failure prevents us from testing all dependent packages (transitive [Build]Requires), so if this package is widely required, it is important for us to get it fixed soon.
We'd appreciate help from the people who know this package best, but if you don't want to work on this now, let us know so we can try to work around it on our side.

Comment 1 Miro Hrončok 2021-06-04 20:14:52 UTC
This is a mass-posted update. Sorry if it is not 100% accurate for this bug.


The Python 3.10 rebuild is in progress in a Koji side tag. If you manage to fix the problem, please commit the fix in the rawhide branch, but don't build the package in regular rawhide.

You can either build the package in the side tag, with:

    $ fedpkg build --target=f35-python

Or you can skip the build and we will eventually build it for you.

Note that the rebuild is still in progress, so not all (build) dependencies of this package might be available right away.

Thanks.

See also https://fedoraproject.org/wiki/Changes/Python3.10

If you have general questions about the rebuild, please use this mailing list thread: https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org/thread/G47SGOYIQLRDTWGOSLSWERZSSHXDEDH5/

Comment 2 Miro Hrončok 2021-06-07 22:59:24 UTC
The f35-python side tag has been merged to Rawhide. From now on, build as you normally would.

Comment 3 Miro Hrončok 2021-06-08 11:27:21 UTC
*** Bug 1969020 has been marked as a duplicate of this bug. ***

Comment 4 Zbigniew Jędrzejewski-Szmek 2021-06-10 19:25:59 UTC
This is blocking a whole chain of packages, hence raising priority.

Comment 5 Jos de Kloe 2021-06-12 10:42:40 UTC
Might this be related to this upstream issue?
https://github.com/python-lz4/python-lz4/issues/219

Comment 6 Miro Hrončok 2021-06-12 11:08:29 UTC
Likely. Let's see: https://src.fedoraproject.org/rpms/python-lz4/pull-request/2

Comment 7 Miro Hrončok 2021-06-12 20:20:52 UTC
ppc64le still has MemoryError :/

=================================== FAILURES ===================================
__________________ test_invalid_config_d_4[store_comp_size2] ___________________
store_comp_size = {'store_comp_size': 4}
    def test_invalid_config_d_4(store_comp_size):
        d_kwargs = {}
        d_kwargs['strategy'] = "double_buffer"
        d_kwargs['buffer_size'] = 1 << (8 * store_comp_size['store_comp_size'])
        d_kwargs.update(store_comp_size)
    
        if store_comp_size['store_comp_size'] >= 4:
    
            if os.environ.get('TRAVIS') is not None:
                pytest.skip('Skipping test on Travis due to insufficient memory')
    
            if sys.maxsize < 0xffffffff:
                pytest.skip('Py_ssize_t too small for this test')
    
            if psutil.virtual_memory().available < 4 * d_kwargs['buffer_size']:
                # The internal LZ4 context will request at least 3 times buffer_size
                # as memory (2 buffer_size for the double-buffer, and 1.x buffer_size
                # for the output buffer), so round up to 4 buffer_size.
                pytest.skip('Insufficient system memory for this test')
    
            # Make sure the page size is larger than what the input bound will be,
            # but still fit in 4 bytes
            d_kwargs['buffer_size'] -= 1
    
        # No failure expected during instantiation/initialization
>       lz4.stream.LZ4StreamDecompressor(**d_kwargs)
d_kwargs   = {'buffer_size': 4294967295, 'store_comp_size': 4, 'strategy': 'double_buffer'}
store_comp_size = {'store_comp_size': 4}
tests/stream/test_stream_1.py:199: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
self = <lz4.stream.LZ4StreamDecompressor object at 0x7ffe344a7bb0>
strategy = 'double_buffer', buffer_size = 4294967295, return_bytearray = 0
store_comp_size = 4, dictionary = ''
    def __init__(self, strategy, buffer_size, return_bytearray=False, store_comp_size=4, dictionary=""):
        """ Instantiates and initializes a LZ4 stream decompression context.
    
            Args:
                strategy (str): Buffer management strategy. Can be: ``double_buffer``.
                buffer_size (int): Size of one buffer of the double-buffer used
                    internally for stream decompression in the case of ``double_buffer``
                    strategy.
    
            Keyword Args:
                return_bytearray (bool): If ``False`` (the default) then the function
                    will return a ``bytes`` object. If ``True``, then the function will
                    return a ``bytearray`` object.
                store_comp_size (int): Specify the size in bytes of the following
                    compressed block. Can be: ``1``, ``2`` or ``4`` (default: ``4``).
                dictionary (str, bytes or buffer-compatible object): If specified,
                    perform decompression using this initial dictionary.
    
            Raises:
                Exceptions occurring during the context initialization.
    
                OverflowError: raised if the ``dictionary`` parameter is too large
                    for the LZ4 context.
                ValueError: raised if some parameters are invalid.
                MemoryError: raised if some internal resources cannot be allocated.
                RuntimeError: raised if some internal resources cannot be initialized.
    
        """
        return_bytearray = 1 if return_bytearray else 0
    
>       self._context = _create_context(strategy, "decompress", buffer_size,
                                        return_bytearray=return_bytearray,
                                        store_comp_size=store_comp_size,
                                        dictionary=dictionary)
E       MemoryError: Could not allocate output buffer
buffer_size = 4294967295
dictionary = ''
return_bytearray = 0
self       = <lz4.stream.LZ4StreamDecompressor object at 0x7ffe344a7bb0>
store_comp_size = 4
strategy   = 'double_buffer'
lz4/stream/__init__.py:45: MemoryError

Comment 8 Miro Hrončok 2021-06-12 20:23:08 UTC
I think we are OK to skip this test. It is also skipped on Travis, etc.
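
For scale: the test requests buffer_size = (1 << 32) - 1 = 4294967295 bytes (~4 GiB), and per the comment in the test the LZ4 context wants roughly 3-4x buffer_size, i.e. around 16 GiB, which the builders cannot provide. One way to deselect it without patching the test file would be a collection hook; a minimal sketch, assuming a conftest.py dropped in for %check (not necessarily how the package ends up doing it):

    # conftest.py
    import pytest

    def pytest_collection_modifyitems(config, items):
        # Skip the stream test that allocates a ~4 GiB double buffer
        # (~16 GiB once the LZ4 context sizes its internal buffers).
        skip_big = pytest.mark.skip(
            reason='needs ~16 GiB; known to MemoryError on the builders'
        )
        for item in items:
            if 'test_invalid_config_d_4' in item.name:
                item.add_marker(skip_big)

Equivalently, the test could be deselected at pytest invocation time with a -k 'not test_invalid_config_d_4' expression.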

Comment 9 Miro Hrončok 2021-06-12 21:19:49 UTC
I see that some arches have the tests disabled entirely. I'm running a fully tested scratch build to see what to skip on which arches.

Comment 10 Miro Hrončok 2021-06-12 22:04:47 UTC
I've got the same MemoryError on aarch64 and even on x86_64, so it is not ppc64le-specific. Skipping the test appears to make the suite pass on all little-endian architectures. I'm now trying to run at least some tests on s390x, but they were already skipped entirely there, so if I don't succeed, I'll keep it that way.

