Bug 1959011 - python-fsspec fails to build with Python 3.10: MemoryError in lz4
Keywords:
Status: CLOSED RAWHIDE
Alias: None
Product: Fedora
Classification: Fedora
Component: python-fsspec
Version: rawhide
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Elliott Sales de Andrade
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Duplicates: 1968984
Depends On: 1959006
Blocks: PYTHON3.10 F35FTBFS F35FailsToInstall 1968947 1968988
 
Reported: 2021-05-10 14:55 UTC by Miro Hrončok
Modified: 2021-06-13 03:43 UTC (History)
CC List: 4 users

Fixed In Version: python-fsspec-2021.6.0-1.fc35~bootstrap
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-06-13 03:43:16 UTC
Type: Bug
Embargoed:



Description Miro Hrončok 2021-05-10 14:55:34 UTC
python-fsspec fails to build with Python 3.10.0b1:

=================================== FAILURES ===================================
__________________________ test_compressions[lz4-rt] ___________________________
fmt = 'lz4', mode = 'rt'
tmpdir = '/tmp/pytest-of-mockbuild/pytest-0/test_compressions_lz4_rt_0'
    @pytest.mark.parametrize("mode", ["rt", "rb"])
    @pytest.mark.parametrize("fmt", list(compression.compr))
    def test_compressions(fmt, mode, tmpdir):
        if fmt == "zip" and sys.version_info < (3, 6):
            pytest.xfail("zip compression requires python3.6 or higher")
    
        tmpdir = str(tmpdir)
        fn = os.path.join(tmpdir, ".tmp.getsize")
        fs = LocalFileSystem()
        f = OpenFile(fs, fn, compression=fmt, mode="wb")
        data = b"Long line of readily compressible text"
        with f as fo:
            fo.write(data)
        if fmt is None:
            assert fs.size(fn) == len(data)
        else:
            assert fs.size(fn) != len(data)
    
        f = OpenFile(fs, fn, compression=fmt, mode=mode)
        with f as fo:
            if mode == "rb":
                assert fo.read() == data
            else:
>               assert fo.read() == data.decode()
fsspec/implementations/tests/test_local.py:184: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
/usr/lib64/python3.10/site-packages/lz4/frame/__init__.py:635: in read
    return self._buffer.read(size)
/usr/lib64/python3.10/_compression.py:118: in readall
    while data := self.read(sys.maxsize):
/usr/lib64/python3.10/_compression.py:103: in read
    data = self._decompressor.decompress(rawblock, size)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
self = <lz4.frame.LZ4FrameDecompressor object at 0x7f61d8b00550>
data = b'\x04"M\x18@@\xc0&\x00\x00\x80Long line of readily compressible text\x00\x00\x00\x00'
max_length = 9223372036854775807
    def decompress(self, data, max_length=-1):  # noqa: F811
        """Decompresses part or all of an LZ4 frame of compressed data.
    
        The returned data should be concatenated with the output of any
        previous calls to `decompress()`.
    
        If ``max_length`` is non-negative, returns at most ``max_length`` bytes
        of decompressed data. If this limit is reached and further output can
        be produced, the `needs_input` attribute will be set to ``False``. In
        this case, the next call to `decompress()` may provide data as
        ``b''`` to obtain more of the output. In all cases, any unconsumed data
        from previous calls will be prepended to the input data.
    
        If all of the input ``data`` was decompressed and returned (either
        because this was less than ``max_length`` bytes, or because
        ``max_length`` was negative), the `needs_input` attribute will be set
        to ``True``.
    
        If an end of frame marker is encountered in the data during
        decompression, decompression will stop at the end of the frame, and any
        data after the end of frame is available from the `unused_data`
        attribute. In this case, the `LZ4FrameDecompressor` instance is reset
        and can be used for further decompression.
    
        Args:
            data (str, bytes or buffer-compatible object): compressed data to
                decompress
    
        Keyword Args:
            max_length (int): If this is non-negative, this method returns at
                most ``max_length`` bytes of decompressed data.
    
        Returns:
            bytes: Uncompressed data
    
        """
    
        if self._unconsumed_data:
            data = self._unconsumed_data + data
    
>       decompressed, bytes_read, eoframe = decompress_chunk(
            self._context,
            data,
            max_length=max_length,
            return_bytearray=self._return_bytearray,
        )
E       MemoryError
/usr/lib64/python3.10/site-packages/lz4/frame/__init__.py:394: MemoryError
__________________________ test_compressions[lz4-rb] ___________________________
fmt = 'lz4', mode = 'rb'
tmpdir = '/tmp/pytest-of-mockbuild/pytest-0/test_compressions_lz4_rb_0'
    @pytest.mark.parametrize("mode", ["rt", "rb"])
    @pytest.mark.parametrize("fmt", list(compression.compr))
    def test_compressions(fmt, mode, tmpdir):
        if fmt == "zip" and sys.version_info < (3, 6):
            pytest.xfail("zip compression requires python3.6 or higher")
    
        tmpdir = str(tmpdir)
        fn = os.path.join(tmpdir, ".tmp.getsize")
        fs = LocalFileSystem()
        f = OpenFile(fs, fn, compression=fmt, mode="wb")
        data = b"Long line of readily compressible text"
        with f as fo:
            fo.write(data)
        if fmt is None:
            assert fs.size(fn) == len(data)
        else:
            assert fs.size(fn) != len(data)
    
        f = OpenFile(fs, fn, compression=fmt, mode=mode)
        with f as fo:
            if mode == "rb":
>               assert fo.read() == data
fsspec/implementations/tests/test_local.py:182: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
/usr/lib64/python3.10/site-packages/lz4/frame/__init__.py:635: in read
    return self._buffer.read(size)
/usr/lib64/python3.10/_compression.py:118: in readall
    while data := self.read(sys.maxsize):
/usr/lib64/python3.10/_compression.py:103: in read
    data = self._decompressor.decompress(rawblock, size)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
self = <lz4.frame.LZ4FrameDecompressor object at 0x7f61d8b7be20>
data = b'\x04"M\x18@@\xc0&\x00\x00\x80Long line of readily compressible text\x00\x00\x00\x00'
max_length = 9223372036854775807
    def decompress(self, data, max_length=-1):  # noqa: F811
        """Decompresses part or all of an LZ4 frame of compressed data.
    
        The returned data should be concatenated with the output of any
        previous calls to `decompress()`.
    
        If ``max_length`` is non-negative, returns at most ``max_length`` bytes
        of decompressed data. If this limit is reached and further output can
        be produced, the `needs_input` attribute will be set to ``False``. In
        this case, the next call to `decompress()` may provide data as
        ``b''`` to obtain more of the output. In all cases, any unconsumed data
        from previous calls will be prepended to the input data.
    
        If all of the input ``data`` was decompressed and returned (either
        because this was less than ``max_length`` bytes, or because
        ``max_length`` was negative), the `needs_input` attribute will be set
        to ``True``.
    
        If an end of frame marker is encountered in the data during
        decompression, decompression will stop at the end of the frame, and any
        data after the end of frame is available from the `unused_data`
        attribute. In this case, the `LZ4FrameDecompressor` instance is reset
        and can be used for further decompression.
    
        Args:
            data (str, bytes or buffer-compatible object): compressed data to
                decompress
    
        Keyword Args:
            max_length (int): If this is non-negative, this method returns at
                most ``max_length`` bytes of decompressed data.
    
        Returns:
            bytes: Uncompressed data
    
        """
    
        if self._unconsumed_data:
            data = self._unconsumed_data + data
    
>       decompressed, bytes_read, eoframe = decompress_chunk(
            self._context,
            data,
            max_length=max_length,
            return_bytearray=self._return_bytearray,
        )
E       MemoryError
/usr/lib64/python3.10/site-packages/lz4/frame/__init__.py:394: MemoryError
_____________________________ test_lz4_compression _____________________________
tmpdir = local('/tmp/pytest-of-mockbuild/pytest-0/test_lz4_compression0')
    def test_lz4_compression(tmpdir):
        """Infer lz4 compression for .lz4 files if lz4 is available."""
        tmp_path = pathlib.Path(str(tmpdir))
    
        lz4 = pytest.importorskip("lz4")
    
        tmp_path.mkdir(exist_ok=True)
    
        tdat = "foobar" * 100
    
        with fsspec.core.open(
            str(tmp_path / "out.lz4"), mode="wt", compression="infer"
        ) as outfile:
            outfile.write(tdat)
    
        compressed = (tmp_path / "out.lz4").open("rb").read()
        assert lz4.frame.decompress(compressed).decode() == tdat
    
        with fsspec.core.open(
            str(tmp_path / "out.lz4"), mode="rt", compression="infer"
        ) as infile:
>           assert infile.read() == tdat
fsspec/tests/test_compression.py:85: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
/usr/lib64/python3.10/site-packages/lz4/frame/__init__.py:635: in read
    return self._buffer.read(size)
/usr/lib64/python3.10/_compression.py:118: in readall
    while data := self.read(sys.maxsize):
/usr/lib64/python3.10/_compression.py:103: in read
    data = self._decompressor.decompress(rawblock, size)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
self = <lz4.frame.LZ4FrameDecompressor object at 0x7f61d8b7aef0>
data = b'\x04"M\x18@@\xc0\x12\x00\x00\x00ofoobar\x06\x00\xff\xff<Poobar\x00\x00\x00\x00'
max_length = 9223372036854775807
    def decompress(self, data, max_length=-1):  # noqa: F811
        """Decompresses part or all of an LZ4 frame of compressed data.
    
        The returned data should be concatenated with the output of any
        previous calls to `decompress()`.
    
        If ``max_length`` is non-negative, returns at most ``max_length`` bytes
        of decompressed data. If this limit is reached and further output can
        be produced, the `needs_input` attribute will be set to ``False``. In
        this case, the next call to `decompress()` may provide data as
        ``b''`` to obtain more of the output. In all cases, any unconsumed data
        from previous calls will be prepended to the input data.
    
        If all of the input ``data`` was decompressed and returned (either
        because this was less than ``max_length`` bytes, or because
        ``max_length`` was negative), the `needs_input` attribute will be set
        to ``True``.
    
        If an end of frame marker is encountered in the data during
        decompression, decompression will stop at the end of the frame, and any
        data after the end of frame is available from the `unused_data`
        attribute. In this case, the `LZ4FrameDecompressor` instance is reset
        and can be used for further decompression.
    
        Args:
            data (str, bytes or buffer-compatible object): compressed data to
                decompress
    
        Keyword Args:
            max_length (int): If this is non-negative, this method returns at
                most ``max_length`` bytes of decompressed data.
    
        Returns:
            bytes: Uncompressed data
    
        """
    
        if self._unconsumed_data:
            data = self._unconsumed_data + data
    
>       decompressed, bytes_read, eoframe = decompress_chunk(
            self._context,
            data,
            max_length=max_length,
            return_bytearray=self._return_bytearray,
        )
E       MemoryError
/usr/lib64/python3.10/site-packages/lz4/frame/__init__.py:394: MemoryError
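
All three tracebacks share one call path: readall() in Python 3.10's _compression.py calls read(sys.maxsize), and that size reaches LZ4FrameDecompressor.decompress() as max_length = 9223372036854775807 (2**63 - 1, the value of sys.maxsize on 64-bit builds). One plausible reading, which this report does not confirm, is that lz4 sizes its output buffer from max_length. The sketch below is not the fix applied to either package. It uses the standard library's bz2.BZ2Decompressor as a stand-in for lz4, since it offers the same max_length / needs_input / eof protocol described in the docstring above, to show what a bounded read loop looks like. The names read_all_bounded and CHUNK are hypothetical.

```python
import bz2

CHUNK = 64 * 1024  # hypothetical cap on bytes requested per decompress() call


def read_all_bounded(compressed: bytes, chunk: int = CHUNK) -> bytes:
    """Decompress one bz2 stream, never requesting more than `chunk` bytes at once."""
    decomp = bz2.BZ2Decompressor()
    pieces = []
    data = compressed
    while not decomp.eof:
        piece = decomp.decompress(data, max_length=chunk)
        # All input was handed over on the first call; later calls pass b""
        # so the decompressor drains what it has already buffered.
        data = b""
        if piece:
            pieces.append(piece)
        elif not decomp.eof:
            # No output and no end-of-stream marker: the input was truncated.
            raise EOFError("compressed stream ended before the end-of-stream marker")
    return b"".join(pieces)


if __name__ == "__main__":
    payload = b"Long line of readily compressible text" * 1000
    restored = read_all_bounded(bz2.compress(payload), chunk=4096)
    print(len(restored) == len(payload))
```

Each call asks for at most chunk bytes, so the size of any single output buffer stays bounded regardless of how the decompressor allocates it.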

For the build logs, see:
https://copr-be.cloud.fedoraproject.org/results/@python/python3.10/fedora-rawhide-x86_64/02174947-python-fsspec/

For all our attempts to build python-fsspec with Python 3.10, see:
https://copr.fedorainfracloud.org/coprs/g/python/python3.10/package/python-fsspec/

Testing and the mass rebuild of packages are happening in Copr. You can follow these instructions to test locally in mock whether your package builds with Python 3.10:
https://copr.fedorainfracloud.org/coprs/g/python/python3.10/

Let us know here if you have any questions.

Python 3.10 will be included in Fedora 35. To make that update smoother, we're building Fedora packages with early pre-releases of Python 3.10.
A build failure prevents us from testing all dependent packages (transitive [Build]Requires), so if this package is required a lot, it's important for us to get it fixed soon.
We'd appreciate help from the people who know this package best, but if you don't want to work on this now, let us know so we can try to work around it on our side.

Comment 1 Miro Hrončok 2021-06-04 20:13:04 UTC
This is a mass-posted update. Sorry if it is not 100% accurate to this bugzilla.


The Python 3.10 rebuild is in progress in a Koji side tag. If you manage to fix the problem, please commit the fix in the rawhide branch, but don't build the package in regular rawhide.

You can either build the package in the side tag, with:

    $ fedpkg build --target=f35-python

Or you can skip the build and we will eventually build it for you.

Note that the rebuild is still in progress, so not all (build) dependencies of this package might be available right away.

Thanks.

See also https://fedoraproject.org/wiki/Changes/Python3.10

If you have general questions about the rebuild, please use this mailing list thread: https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org/thread/G47SGOYIQLRDTWGOSLSWERZSSHXDEDH5/

Comment 2 Miro Hrončok 2021-06-07 22:57:58 UTC
The f35-python side tag has been merged to Rawhide. From now on, build as you would normally build.

Comment 3 Miro Hrončok 2021-06-08 11:25:34 UTC
*** Bug 1968984 has been marked as a duplicate of this bug. ***

