Bug 1705301 - mpi4py FTBFS with Python 3.8
Summary: mpi4py FTBFS with Python 3.8
Keywords:
Status: CLOSED RAWHIDE
Alias: None
Product: Fedora
Classification: Fedora
Component: mpi4py
Version: rawhide
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: Thomas Spura
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks: PYTHON38 1705296
Reported: 2019-05-01 23:52 UTC by Miro Hrončok
Modified: 2019-06-03 15:47 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-06-03 15:47:07 UTC
Type: Bug
Embargoed:


Attachments
Full log from Copr (342.27 KB, text/plain)
2019-05-01 23:52 UTC, Miro Hrončok

Description Miro Hrončok 2019-05-01 23:52:00 UTC
Created attachment 1561224 [details]
Full log from Copr

After the symptoms described in bz1705296, I rebuilt mpich and openmpi, but now mpi4py no longer builds. The affected build is mpi4py-3.0.1-4.fc31:


======================================================================
FAIL: testCompareAndSwap (test_rma.TestRMASelf)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "test/test_rma.py", line 228, in testCompareAndSwap
    self.assertEqual(rbuf[1], -1)
AssertionError: 0 != -1

======================================================================
FAIL: testFetchAndOp (test_rma.TestRMASelf)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "test/test_rma.py", line 190, in testFetchAndOp
    self.assertEqual(rbuf[1], -1)
AssertionError: -116 != -1

======================================================================
FAIL: testCompareAndSwap (test_rma.TestRMAWorld)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "test/test_rma.py", line 228, in testCompareAndSwap
    self.assertEqual(rbuf[1], -1)
AssertionError: 0 != -1

======================================================================
FAIL: testFetchAndOp (test_rma.TestRMAWorld)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "test/test_rma.py", line 190, in testFetchAndOp
    self.assertEqual(rbuf[1], -1)
AssertionError: -124 != -1

----------------------------------------------------------------------
Ran 1100 tests in 3.549s

FAILED (failures=4, skipped=61)
--------------------------------------------------------------------------
Primary job  terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpiexec detected that one or more processes exited with non-zero status, thus causing
the job to be terminated. The first process to do so was:

  Process name: [[9253,1],0]
  Exit code:    1
--------------------------------------------------------------------------
error: Bad exit status from /var/tmp/rpm-tmp.VaRIOu (%check)

Full log attached.

Comment 1 Zbigniew Jędrzejewski-Szmek 2019-05-02 18:32:33 UTC
I opened https://bitbucket.org/mpi4py/mpi4py/issues/124/test-failure-with-openmpi-401.

Comment 2 Miro Hrončok 2019-05-27 10:20:02 UTC
There is a new failure after 3.8.0a4:

src/mpi4py.MPI.c:314:11: error: too few arguments to function ‘PyCode_New’
  314 |           PyCode_New(a, k, l, s, f, code, c, n, v, fv, cell, fn, name, fline, lnos)
      |           ^~~~~~~~~~

This means that the pregenerated C sources no longer match the PyCode_New signature in the Python headers, so the sources need to be recythonized.

Comment 3 Miro Hrončok 2019-05-27 10:23:57 UTC
Adding this to %prep seems to help:

# Remove precythonized C sources
rm $(grep -rl '/\* Generated by Cython')



Building in Copr to see if the previous failure is still there.
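
For anyone who wants to try the same cleanup outside of a spec file, here is a minimal Python sketch of the idea behind the grep/rm line above: walk a source tree and delete every file that contains the "/* Generated by Cython" marker, so the build has to regenerate the C sources. The function names find_cythonized_sources and remove_cythonized_sources are hypothetical and only for illustration; they are not part of mpi4py or Cython, and this is not what the Fedora package actually runs.

```python
from pathlib import Path

# Marker string taken from the grep pattern in the %prep snippet above.
CYTHON_MARKER = b"/* Generated by Cython"


def find_cythonized_sources(root):
    """Return sorted paths of files under root that contain the Cython marker.

    Hypothetical helper: like `grep -rl`, it matches the marker anywhere in
    the file, not only on the first line.
    """
    hits = []
    for path in Path(root).rglob("*"):
        if not path.is_file():
            continue
        try:
            data = path.read_bytes()
        except OSError:
            # Unreadable files are skipped rather than treated as matches.
            continue
        if CYTHON_MARKER in data:
            hits.append(path)
    return sorted(hits)


def remove_cythonized_sources(root):
    """Delete every file found by find_cythonized_sources; return what was removed."""
    removed = find_cythonized_sources(root)
    for path in removed:
        path.unlink()
    return removed
```

Calling find_cythonized_sources(".") first from the unpacked source directory gives a dry run of what remove_cythonized_sources(".") would delete. Whether the build then regenerates the files depends on Cython being available at build time, which this sketch assumes.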

Comment 4 Miro Hrončok 2019-05-27 10:39:50 UTC
Recythonizing the sources leads to:

+ mpiexec -np 1 python3 test/runtests.py -v --no-builddir --thread-level=serialized -e spawn
[41f0acf557e440989184fec990a11425:4660 :0:4660] Caught signal 11 (Segmentation fault: address not mapped to object at address 0x7ff83a90b948)
==== backtrace ====
    0  /lib64/libucs.so.0(+0x194a3) [0x7ff83a8934a3]
    1  /lib64/libucs.so.0(+0x1965a) [0x7ff83a89365a]
    2  /lib64/libuct.so.0(+0x1b72b) [0x7ff83aa4172b]
    3  /lib64/ld-linux-x86-64.so.2(+0xfe4a) [0x7ff83da9fe4a]
    4  /lib64/ld-linux-x86-64.so.2(+0xff51) [0x7ff83da9ff51]
    5  /lib64/ld-linux-x86-64.so.2(+0x13eae) [0x7ff83daa3eae]
    6  /lib64/libc.so.6(_dl_catch_exception+0x79) [0x7ff83d9ff1f9]
    7  /lib64/ld-linux-x86-64.so.2(+0x1372e) [0x7ff83daa372e]
    8  /lib64/libdl.so.2(+0x239c) [0x7ff83d53739c]
    9  /lib64/libc.so.6(_dl_catch_exception+0x79) [0x7ff83d9ff1f9]
   10  /lib64/libc.so.6(_dl_catch_error+0x33) [0x7ff83d9ff293]
   11  /lib64/libdl.so.2(+0x2b09) [0x7ff83d537b09]
   12  /lib64/libdl.so.2(dlopen+0x4a) [0x7ff83d53742a]
   13  /usr/lib64/openmpi/lib/libopen-pal.so.40(+0x6ead7) [0x7ff83cb23ad7]
   14  /usr/lib64/openmpi/lib/libopen-pal.so.40(mca_base_component_repository_open+0x1f4) [0x7ff83cb01524]
   15  /usr/lib64/openmpi/lib/libopen-pal.so.40(mca_base_component_find+0x35b) [0x7ff83cb004eb]
   16  /usr/lib64/openmpi/lib/libopen-pal.so.40(mca_base_framework_components_register+0x2e) [0x7ff83cb0bdfe]
   17  /usr/lib64/openmpi/lib/libopen-pal.so.40(mca_base_framework_register+0x256) [0x7ff83cb0c2e6]
   18  /usr/lib64/openmpi/lib/libopen-pal.so.40(mca_base_framework_open+0x14) [0x7ff83cb0c344]
   19  /usr/lib64/openmpi/lib/libmpi.so.40(ompi_mpi_init+0x695) [0x7ff83cc76795]
   20  /usr/lib64/openmpi/lib/libmpi.so.40(PMPI_Init_thread+0x99) [0x7ff83cca6bf9]
   21  /builddir/build/BUILDROOT/mpi4py-3.0.1-2.fc31.x86_64/usr/lib64/python3.8/site-packages/openmpi/mpi4py/MPI.cpython-38-x86_64-linux-gnu.so(+0x329bc) [0x7ff83cd849bc]
   22  /lib64/libpython3.8.so.1.0(PyModule_ExecDef+0x77) [0x7ff83d724b27]
   23  /lib64/libpython3.8.so.1.0(+0x1c7b93) [0x7ff83d724b93]
   24  /lib64/libpython3.8.so.1.0(_PyMethodDef_RawFastCallDict+0x350) [0x7ff83d67f9e0]
   25  /lib64/libpython3.8.so.1.0(_PyCFunction_FastCallDict+0x23) [0x7ff83d67fa93]
   26  /lib64/libpython3.8.so.1.0(_PyEval_EvalFrameDefault+0x640d) [0x7ff83d6e82ad]
   27  /lib64/libpython3.8.so.1.0(_PyEval_EvalCodeWithName+0x311) [0x7ff83d66e721]
   28  /lib64/libpython3.8.so.1.0(_PyFunction_FastCallKeywords+0x196) [0x7ff83d6ac346]
   29  /lib64/libpython3.8.so.1.0(+0x159bbf) [0x7ff83d6b6bbf]
   30  /lib64/libpython3.8.so.1.0(_PyEval_EvalFrameDefault+0x57ea) [0x7ff83d6e768a]
   31  /lib64/libpython3.8.so.1.0(_PyFunction_FastCallKeywords+0xfa) [0x7ff83d6ac2aa]
   32  /lib64/libpython3.8.so.1.0(+0x159bbf) [0x7ff83d6b6bbf]
   33  /lib64/libpython3.8.so.1.0(_PyEval_EvalFrameDefault+0xd7d) [0x7ff83d6e2c1d]
   34  /lib64/libpython3.8.so.1.0(_PyFunction_FastCallKeywords+0xfa) [0x7ff83d6ac2aa]
   35  /lib64/libpython3.8.so.1.0(+0x159bbf) [0x7ff83d6b6bbf]
   36  /lib64/libpython3.8.so.1.0(_PyEval_EvalFrameDefault+0xc1c) [0x7ff83d6e2abc]
   37  /lib64/libpython3.8.so.1.0(_PyFunction_FastCallKeywords+0xfa) [0x7ff83d6ac2aa]
   38  /lib64/libpython3.8.so.1.0(+0x159bbf) [0x7ff83d6b6bbf]
   39  /lib64/libpython3.8.so.1.0(_PyEval_EvalFrameDefault+0xc1c) [0x7ff83d6e2abc]
   40  /lib64/libpython3.8.so.1.0(_PyFunction_FastCallDict+0x11a) [0x7ff83d66f44a]
   41  /lib64/libpython3.8.so.1.0(+0x121787) [0x7ff83d67e787]
   42  /lib64/libpython3.8.so.1.0(_PyObject_CallMethodIdObjArgs+0xb9) [0x7ff83d6a65d9]
   43  /lib64/libpython3.8.so.1.0(PyImport_ImportModuleLevelObject+0x26b) [0x7ff83d67263b]
   44  /lib64/libpython3.8.so.1.0(_PyEval_EvalFrameDefault+0x3219) [0x7ff83d6e50b9]
   45  /lib64/libpython3.8.so.1.0(_PyFunction_FastCallKeywords+0xfa) [0x7ff83d6ac2aa]
   46  /lib64/libpython3.8.so.1.0(+0x159bbf) [0x7ff83d6b6bbf]
   47  /lib64/libpython3.8.so.1.0(_PyEval_EvalFrameDefault+0xc1c) [0x7ff83d6e2abc]
   48  /lib64/libpython3.8.so.1.0(+0x1da7df) [0x7ff83d7377df]
   49  /lib64/libpython3.8.so.1.0(+0x159bbf) [0x7ff83d6b6bbf]
   50  /lib64/libpython3.8.so.1.0(_PyEval_EvalFrameDefault+0xc1c) [0x7ff83d6e2abc]
   51  /lib64/libpython3.8.so.1.0(_PyEval_EvalCodeWithName+0x311) [0x7ff83d66e721]
   52  /lib64/libpython3.8.so.1.0(PyEval_EvalCodeEx+0x39) [0x7ff83d66f329]
   53  /lib64/libpython3.8.so.1.0(PyEval_EvalCode+0x1b) [0x7ff83d6ff84b]
   54  /lib64/libpython3.8.so.1.0(+0x20ee30) [0x7ff83d76be30]
   55  /lib64/libpython3.8.so.1.0(PyRun_FileExFlags+0x97) [0x7ff83d76c3b7]
   56  /lib64/libpython3.8.so.1.0(PyRun_SimpleFileExFlags+0x19a) [0x7ff83d7736da]
   57  /lib64/libpython3.8.so.1.0(_Py_RunMain+0x353) [0x7ff83d774d13]
   58  /lib64/libpython3.8.so.1.0(+0x217eb6) [0x7ff83d774eb6]
   59  /lib64/libpython3.8.so.1.0(_Py_UnixMain+0x35) [0x7ff83d774f55]
   60  /lib64/libc.so.6(__libc_start_main+0xf3) [0x7ff83d8eb193]
===================
--------------------------------------------------------------------------
Primary job  terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpiexec noticed that process rank 0 with PID 0 on node 41f0acf557e440989184fec990a11425 exited on signal 11 (Segmentation fault).
--------------------------------------------------------------------------

Comment 5 Orion Poplawski 2019-05-27 13:31:49 UTC
The segfault is a current issue with openmpi 4/UCX that has yet to be resolved.

Comment 6 Miro Hrončok 2019-06-03 11:46:30 UTC
Orion, do you happen to have some pointers for that segfault?

Comment 7 Orion Poplawski 2019-06-03 14:30:10 UTC
I'm hoping that it's been resolved with the latest openmpi build - can you try another build?

Comment 8 Miro Hrončok 2019-06-03 14:45:21 UTC
OK. Rebuilding updated openmpi first.

Comment 9 Miro Hrončok 2019-06-03 15:47:07 UTC
mpi4py builds.

