Bug 1728060 - mpi4py fails to build in rawhide
Summary: mpi4py fails to build in rawhide
Keywords:
Status: CLOSED UPSTREAM
Alias: None
Product: Fedora
Classification: Fedora
Component: mpi4py
Version: rawhide
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: Thomas Spura
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks: PYTHON38
 
Reported: 2019-07-08 23:14 UTC by Miro Hrončok
Modified: 2019-07-31 05:48 UTC (History)
CC List: 4 users

Fixed In Version: mpi4py-3.0.2-2.fc31
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-07-31 05:48:32 UTC
Type: Bug
Embargoed:



Description Miro Hrončok 2019-07-08 23:14:36 UTC
mpi4py fails to build with Python 3.8.0b1.


======================================================================
FAIL: testCompareAndSwap (test_rma.TestRMASelf)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "test/test_rma.py", line 228, in testCompareAndSwap
    self.assertEqual(rbuf[1], -1)
AssertionError: 0 != -1

======================================================================
FAIL: testFetchAndOp (test_rma.TestRMASelf)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "test/test_rma.py", line 190, in testFetchAndOp
    self.assertEqual(rbuf[1], -1)
AssertionError: 47 != -1

======================================================================
FAIL: testCompareAndSwap (test_rma.TestRMAWorld)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "test/test_rma.py", line 228, in testCompareAndSwap
    self.assertEqual(rbuf[1], -1)
AssertionError: 0 != -1

======================================================================
FAIL: testFetchAndOp (test_rma.TestRMAWorld)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "test/test_rma.py", line 190, in testFetchAndOp
    self.assertEqual(rbuf[1], -1)
AssertionError: -69 != -1

----------------------------------------------------------------------
Ran 1102 tests in 7.053s

FAILED (failures=4, skipped=46)
--------------------------------------------------------------------------
Primary job  terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpiexec detected that one or more processes exited with non-zero status, thus causing
the job to be terminated. The first process to do so was:

  Process name: [[14054,1],0]
  Exit code:    1
--------------------------------------------------------------------------
[1562606212.394756] [22b764337d874def8754761c8bb283ea:4889 :0]            sys.c:618  UCX  ERROR shmget(size=2097152 flags=0xfb0) for mm_recv_desc failed: Operation not permitted, please check shared memory limits by 'ipcs -l'
[1562606212.540543] [22b764337d874def8754761c8bb283ea:4889 :0]            sys.c:618  UCX  ERROR shmget(size=2097152 flags=0xfb0) for mm_recv_desc failed: Operation not permitted, please check shared memory limits by 'ipcs -l'
[1562606213.398255] [22b764337d874def8754761c8bb283ea:4889 :0]            sys.c:618  UCX  ERROR shmget(size=2097152 flags=0xb80) for ucp_am_bufs failed: Operation not permitted, please check shared memory limits by 'ipcs -l'

This might actually be a Copr problem; I'm not sure. Let me know if you cannot reproduce it outside of mock.

For the build logs, see:
https://copr-be.cloud.fedoraproject.org/results/@python/python3.8/fedora-rawhide-x86_64/00964785-mpi4py/

For all our attempts to build mpi4py with Python 3.8, see:
https://copr.fedorainfracloud.org/coprs/g/python/python3.8/package/mpi4py/

The testing and mass rebuild of packages is happening in Copr. You can follow these instructions to test locally in mock whether your package builds with Python 3.8:
https://copr.fedorainfracloud.org/coprs/g/python/python3.8/

Let us know here if you have any questions.
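For context, all four failing assertions follow the same sentinel pattern: the receive buffer carries one extra element preset to -1, and after the RMA operation only the first element is allowed to change. A failure like "0 != -1" therefore means the operation wrote past its target slot. A minimal pure-Python sketch of that check (illustrative only, no MPI involved):

```python
# Sketch of the sentinel check used in test_rma.py (illustrative, no MPI).
# rbuf[0] is the slot the RMA operation may write; rbuf[1] is a guard
# value that must survive the operation untouched.
rbuf = [0, -1]          # result slot plus -1 sentinel

rbuf[0] = 42            # stands in for the RMA op (e.g. Fetch_and_op) writing its result

assert rbuf[1] == -1    # the assertion that fails in the build log
```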

Comment 1 Miro Hrončok 2019-07-30 15:02:19 UTC
Zbyszek, would you be able to help here?

Comment 2 Zbigniew Jędrzejewski-Szmek 2019-07-30 17:25:26 UTC
It fails the same way in normal rawhide on amd64. No idea why.
I'll update mpich to the latest version; maybe that'll help.

Comment 3 Zbigniew Jędrzejewski-Szmek 2019-07-30 18:09:20 UTC
[1564354515.215559] [08dfc006c2a24ed0bf7d9276d6077ef3:4889 :0]            sys.c:618  UCX  ERROR shmget(size=2097152 flags=0xfb0) for mm_recv_desc failed: Operation not permitted, please check shared memory limits by 'ipcs -l'
[1564354515.363010] [08dfc006c2a24ed0bf7d9276d6077ef3:4889 :0]            sys.c:618  UCX  ERROR shmget(size=2097152 flags=0xfb0) for mm_recv_desc failed: Operation not permitted, please check shared memory limits by 'ipcs -l'
[1564354516.244026] [08dfc006c2a24ed0bf7d9276d6077ef3:4889 :0]            sys.c:618  UCX  ERROR shmget(size=2097152 flags=0xb80) for ucp_am_bufs failed: Operation not permitted, please check shared memory limits by 'ipcs -l'

This might be the cause. But I get the same failure on my machine, and it seems the limits are very high:
$ ipcs -l

------ Messages Limits --------
max queues system wide = 32000
max size of message (bytes) = 8192
default max size of queue (bytes) = 16384

------ Shared Memory Limits --------
max number of segments = 4096
max seg size (kbytes) = 18014398509465599
max total shared memory (kbytes) = 18014398509481980
min seg size (bytes) = 1

------ Semaphore Limits --------
max number of arrays = 32000
max semaphores per array = 32000
max semaphores system wide = 1024000000
max ops per semop call = 500
semaphore max value = 32767
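Since ipcs reports effectively unlimited segment sizes here, the EPERM from shmget is more likely imposed by the build sandbox (mock/Copr namespacing) than by the kernel limits themselves, though that is speculation. The same limits can be cross-checked straight from procfs (Linux only; these are the standard sysctl files behind the ipcs output):

```python
# Read the SysV shared memory limits directly from procfs (Linux).
# These are the same values 'ipcs -l' reports, before unit conversion.
from pathlib import Path

for name in ("shmmax", "shmall", "shmmni"):
    value = int(Path("/proc/sys/kernel", name).read_text())
    print(f"{name} = {value}")

# UCX asked for a 2 MiB segment (size=2097152); compare against shmmax (bytes).
assert int(Path("/proc/sys/kernel/shmmax").read_text()) >= 2 * 1024 * 1024
```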

Comment 4 Zbigniew Jędrzejewski-Szmek 2019-07-30 18:16:04 UTC
python3-mpich-3.1.1-1.fc31.x86_64 makes no difference ;(

Comment 6 Zbigniew Jędrzejewski-Szmek 2019-07-31 05:48:32 UTC
I made the build pass by ignoring the test failures. I don't think we gain much by keeping the package
in the FTBFS state. Maybe upstream will know how to fix this.
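For the record, one hypothetical way to do this in the spec's %check section (the actual change landed in mpi4py-3.0.2-2.fc31; the exact line lives in dist-git, so treat this as a sketch):

```spec
%check
# Hypothetical sketch: run the test suite but ignore its exit status
# ('|| :') so the RMA failures no longer fail the build while they are
# reported upstream.
%{python3} test/runtests.py || :
```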

