
Bug 1959011

Summary: python-fsspec fails to build with Python 3.10: MemoryError in lz4
Product: [Fedora] Fedora
Reporter: Miro Hrončok <mhroncok>
Component: python-fsspec
Assignee: Elliott Sales de Andrade <quantum.analyst>
Status: CLOSED RAWHIDE
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: rawhide
CC: mhroncok, python-sig, quantum.analyst, thrnciar
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: python-fsspec-2021.6.0-1.fc35~bootstrap
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2021-06-13 03:43:16 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1959006
Bug Blocks: 1890881, 1927309, 1927313, 1968947, 1968988

Description Miro Hrončok 2021-05-10 14:55:34 UTC
python-fsspec fails to build with Python 3.10.0b1:

=================================== FAILURES ===================================
__________________________ test_compressions[lz4-rt] ___________________________
fmt = 'lz4', mode = 'rt'
tmpdir = '/tmp/pytest-of-mockbuild/pytest-0/test_compressions_lz4_rt_0'
    @pytest.mark.parametrize("mode", ["rt", "rb"])
    @pytest.mark.parametrize("fmt", list(compression.compr))
    def test_compressions(fmt, mode, tmpdir):
        if fmt == "zip" and sys.version_info < (3, 6):
            pytest.xfail("zip compression requires python3.6 or higher")
    
        tmpdir = str(tmpdir)
        fn = os.path.join(tmpdir, ".tmp.getsize")
        fs = LocalFileSystem()
        f = OpenFile(fs, fn, compression=fmt, mode="wb")
        data = b"Long line of readily compressible text"
        with f as fo:
            fo.write(data)
        if fmt is None:
            assert fs.size(fn) == len(data)
        else:
            assert fs.size(fn) != len(data)
    
        f = OpenFile(fs, fn, compression=fmt, mode=mode)
        with f as fo:
            if mode == "rb":
                assert fo.read() == data
            else:
>               assert fo.read() == data.decode()
fsspec/implementations/tests/test_local.py:184: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
/usr/lib64/python3.10/site-packages/lz4/frame/__init__.py:635: in read
    return self._buffer.read(size)
/usr/lib64/python3.10/_compression.py:118: in readall
    while data := self.read(sys.maxsize):
/usr/lib64/python3.10/_compression.py:103: in read
    data = self._decompressor.decompress(rawblock, size)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
self = <lz4.frame.LZ4FrameDecompressor object at 0x7f61d8b00550>
data = b'\x04"M\x18@@\xc0&\x00\x00\x80Long line of readily compressible text\x00\x00\x00\x00'
max_length = 9223372036854775807
    def decompress(self, data, max_length=-1):  # noqa: F811
        """Decompresses part or all of an LZ4 frame of compressed data.
    
        The returned data should be concatenated with the output of any
        previous calls to `decompress()`.
    
        If ``max_length`` is non-negative, returns at most ``max_length`` bytes
        of decompressed data. If this limit is reached and further output can
        be produced, the `needs_input` attribute will be set to ``False``. In
        this case, the next call to `decompress()` may provide data as
        ``b''`` to obtain more of the output. In all cases, any unconsumed data
        from previous calls will be prepended to the input data.
    
        If all of the input ``data`` was decompressed and returned (either
        because this was less than ``max_length`` bytes, or because
        ``max_length`` was negative), the `needs_input` attribute will be set
        to ``True``.
    
        If an end of frame marker is encountered in the data during
        decompression, decompression will stop at the end of the frame, and any
        data after the end of frame is available from the `unused_data`
        attribute. In this case, the `LZ4FrameDecompressor` instance is reset
        and can be used for further decompression.
    
        Args:
            data (str, bytes or buffer-compatible object): compressed data to
                decompress
    
        Keyword Args:
            max_length (int): If this is non-negative, this method returns at
                most ``max_length`` bytes of decompressed data.
    
        Returns:
            bytes: Uncompressed data
    
        """
    
        if self._unconsumed_data:
            data = self._unconsumed_data + data
    
>       decompressed, bytes_read, eoframe = decompress_chunk(
            self._context,
            data,
            max_length=max_length,
            return_bytearray=self._return_bytearray,
        )
E       MemoryError
/usr/lib64/python3.10/site-packages/lz4/frame/__init__.py:394: MemoryError
__________________________ test_compressions[lz4-rb] ___________________________
fmt = 'lz4', mode = 'rb'
tmpdir = '/tmp/pytest-of-mockbuild/pytest-0/test_compressions_lz4_rb_0'
    @pytest.mark.parametrize("mode", ["rt", "rb"])
    @pytest.mark.parametrize("fmt", list(compression.compr))
    def test_compressions(fmt, mode, tmpdir):
        if fmt == "zip" and sys.version_info < (3, 6):
            pytest.xfail("zip compression requires python3.6 or higher")
    
        tmpdir = str(tmpdir)
        fn = os.path.join(tmpdir, ".tmp.getsize")
        fs = LocalFileSystem()
        f = OpenFile(fs, fn, compression=fmt, mode="wb")
        data = b"Long line of readily compressible text"
        with f as fo:
            fo.write(data)
        if fmt is None:
            assert fs.size(fn) == len(data)
        else:
            assert fs.size(fn) != len(data)
    
        f = OpenFile(fs, fn, compression=fmt, mode=mode)
        with f as fo:
            if mode == "rb":
>               assert fo.read() == data
fsspec/implementations/tests/test_local.py:182: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
/usr/lib64/python3.10/site-packages/lz4/frame/__init__.py:635: in read
    return self._buffer.read(size)
/usr/lib64/python3.10/_compression.py:118: in readall
    while data := self.read(sys.maxsize):
/usr/lib64/python3.10/_compression.py:103: in read
    data = self._decompressor.decompress(rawblock, size)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
self = <lz4.frame.LZ4FrameDecompressor object at 0x7f61d8b7be20>
data = b'\x04"M\x18@@\xc0&\x00\x00\x80Long line of readily compressible text\x00\x00\x00\x00'
max_length = 9223372036854775807
    def decompress(self, data, max_length=-1):  # noqa: F811
        """Decompresses part or all of an LZ4 frame of compressed data.
    
        The returned data should be concatenated with the output of any
        previous calls to `decompress()`.
    
        If ``max_length`` is non-negative, returns at most ``max_length`` bytes
        of decompressed data. If this limit is reached and further output can
        be produced, the `needs_input` attribute will be set to ``False``. In
        this case, the next call to `decompress()` may provide data as
        ``b''`` to obtain more of the output. In all cases, any unconsumed data
        from previous calls will be prepended to the input data.
    
        If all of the input ``data`` was decompressed and returned (either
        because this was less than ``max_length`` bytes, or because
        ``max_length`` was negative), the `needs_input` attribute will be set
        to ``True``.
    
        If an end of frame marker is encountered in the data during
        decompression, decompression will stop at the end of the frame, and any
        data after the end of frame is available from the `unused_data`
        attribute. In this case, the `LZ4FrameDecompressor` instance is reset
        and can be used for further decompression.
    
        Args:
            data (str, bytes or buffer-compatible object): compressed data to
                decompress
    
        Keyword Args:
            max_length (int): If this is non-negative, this method returns at
                most ``max_length`` bytes of decompressed data.
    
        Returns:
            bytes: Uncompressed data
    
        """
    
        if self._unconsumed_data:
            data = self._unconsumed_data + data
    
>       decompressed, bytes_read, eoframe = decompress_chunk(
            self._context,
            data,
            max_length=max_length,
            return_bytearray=self._return_bytearray,
        )
E       MemoryError
/usr/lib64/python3.10/site-packages/lz4/frame/__init__.py:394: MemoryError
_____________________________ test_lz4_compression _____________________________
tmpdir = local('/tmp/pytest-of-mockbuild/pytest-0/test_lz4_compression0')
    def test_lz4_compression(tmpdir):
        """Infer lz4 compression for .lz4 files if lz4 is available."""
        tmp_path = pathlib.Path(str(tmpdir))
    
        lz4 = pytest.importorskip("lz4")
    
        tmp_path.mkdir(exist_ok=True)
    
        tdat = "foobar" * 100
    
        with fsspec.core.open(
            str(tmp_path / "out.lz4"), mode="wt", compression="infer"
        ) as outfile:
            outfile.write(tdat)
    
        compressed = (tmp_path / "out.lz4").open("rb").read()
        assert lz4.frame.decompress(compressed).decode() == tdat
    
        with fsspec.core.open(
            str(tmp_path / "out.lz4"), mode="rt", compression="infer"
        ) as infile:
>           assert infile.read() == tdat
fsspec/tests/test_compression.py:85: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
/usr/lib64/python3.10/site-packages/lz4/frame/__init__.py:635: in read
    return self._buffer.read(size)
/usr/lib64/python3.10/_compression.py:118: in readall
    while data := self.read(sys.maxsize):
/usr/lib64/python3.10/_compression.py:103: in read
    data = self._decompressor.decompress(rawblock, size)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
self = <lz4.frame.LZ4FrameDecompressor object at 0x7f61d8b7aef0>
data = b'\x04"M\x18@@\xc0\x12\x00\x00\x00ofoobar\x06\x00\xff\xff<Poobar\x00\x00\x00\x00'
max_length = 9223372036854775807
    def decompress(self, data, max_length=-1):  # noqa: F811
        """Decompresses part or all of an LZ4 frame of compressed data.
    
        The returned data should be concatenated with the output of any
        previous calls to `decompress()`.
    
        If ``max_length`` is non-negative, returns at most ``max_length`` bytes
        of decompressed data. If this limit is reached and further output can
        be produced, the `needs_input` attribute will be set to ``False``. In
        this case, the next call to `decompress()` may provide data as
        ``b''`` to obtain more of the output. In all cases, any unconsumed data
        from previous calls will be prepended to the input data.
    
        If all of the input ``data`` was decompressed and returned (either
        because this was less than ``max_length`` bytes, or because
        ``max_length`` was negative), the `needs_input` attribute will be set
        to ``True``.
    
        If an end of frame marker is encountered in the data during
        decompression, decompression will stop at the end of the frame, and any
        data after the end of frame is available from the `unused_data`
        attribute. In this case, the `LZ4FrameDecompressor` instance is reset
        and can be used for further decompression.
    
        Args:
            data (str, bytes or buffer-compatible object): compressed data to
                decompress
    
        Keyword Args:
            max_length (int): If this is non-negative, this method returns at
                most ``max_length`` bytes of decompressed data.
    
        Returns:
            bytes: Uncompressed data
    
        """
    
        if self._unconsumed_data:
            data = self._unconsumed_data + data
    
>       decompressed, bytes_read, eoframe = decompress_chunk(
            self._context,
            data,
            max_length=max_length,
            return_bytearray=self._return_bytearray,
        )
E       MemoryError
/usr/lib64/python3.10/site-packages/lz4/frame/__init__.py:394: MemoryError
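
All three failures share one call path: `_compression.py` `readall()` calls `self.read(sys.maxsize)`, so the decompressor receives `max_length = 9223372036854775807`. The sketch below illustrates one plausible mechanism for the resulting MemoryError, assuming the decompressor sizes its output buffer from `max_length` up front. That is an assumption, not something the traceback confirms about python-lz4. The names `eager_decompress`, `capped_decompress` and `CHUNK_LIMIT` are hypothetical, and the "decompression" is only a byte copy.

```python
import sys

# Hypothetical cap on a single output allocation (not a python-lz4 setting).
CHUNK_LIMIT = 1 << 20


def eager_decompress(data: bytes, max_length: int = -1) -> bytes:
    """Stand-in decompressor that allocates max_length bytes up front.

    This models the assumed failure mode only; it is not python-lz4 code.
    """
    size = len(data) if max_length < 0 else max_length
    out = bytearray(size)  # bytearray(sys.maxsize) raises MemoryError
    n = min(len(data), size)
    out[:n] = data[:n]  # the "decompression" here is just a copy
    return bytes(out[:n])


def capped_decompress(data: bytes, max_length: int = -1) -> bytes:
    """Clamp the requested length so the allocation stays bounded.

    Returns at most CHUNK_LIMIT bytes per call; a real caller would loop.
    """
    limit = CHUNK_LIMIT if max_length < 0 else min(max_length, CHUNK_LIMIT)
    return eager_decompress(data, limit)


if __name__ == "__main__":
    payload = b"Long line of readily compressible text"
    try:
        eager_decompress(payload, sys.maxsize)
    except MemoryError:
        print("eager allocation failed with MemoryError")
    print(capped_decompress(payload, sys.maxsize))
```

If that assumption holds, the fix belongs in the decompressor (see the bug this one depends on) rather than in python-fsspec, since the `sys.maxsize` argument comes from the standard library's `readall()`.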

For the build logs, see:
https://copr-be.cloud.fedoraproject.org/results/@python/python3.10/fedora-rawhide-x86_64/02174947-python-fsspec/

For all our attempts to build python-fsspec with Python 3.10, see:
https://copr.fedorainfracloud.org/coprs/g/python/python3.10/package/python-fsspec/

Testing and mass rebuild of packages is happening in copr. You can follow these instructions to test locally in mock whether your package builds with Python 3.10:
https://copr.fedorainfracloud.org/coprs/g/python/python3.10/

Let us know here if you have any questions.

Python 3.10 will be included in Fedora 35. To make that update smoother, we're building Fedora packages with early pre-releases of Python 3.10.
A build failure prevents us from testing all dependent packages (transitive [Build]Requires), so if this package is required a lot, it's important for us to get it fixed soon.
We'd appreciate help from the people who know this package best, but if you don't want to work on this now, let us know so we can try to work around it on our side.

Comment 1 Miro Hrončok 2021-06-04 20:13:04 UTC
This is a mass-posted update. Sorry if it is not 100% accurate to this bugzilla.


The Python 3.10 rebuild is in progress in a Koji side tag. If you manage to fix the problem, please commit the fix in the rawhide branch, but don't build the package in regular rawhide.

You can either build the package in the side tag, with:

    $ fedpkg build --target=f35-python

Or you can skip the build and we will eventually build it for you.

Note that the rebuild is still in progress, so not all (build) dependencies of this package might be available right away.

Thanks.

See also https://fedoraproject.org/wiki/Changes/Python3.10

If you have general questions about the rebuild, please use this mailing list thread: https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org/thread/G47SGOYIQLRDTWGOSLSWERZSSHXDEDH5/

Comment 2 Miro Hrončok 2021-06-07 22:57:58 UTC
The f35-python side tag has been merged to Rawhide. From now on, build as you would normally build.

Comment 3 Miro Hrončok 2021-06-08 11:25:34 UTC
*** Bug 1968984 has been marked as a duplicate of this bug. ***