Bug 1746564
| Field | Value |
|---|---|
| Summary | incorrect MPI_TAG_UB, throws "'boost::wrapexcept<boost::mpi::exception>' what(): MPI_Recv: MPI_ERR_TAG: invalid tag" |
| Product | Fedora |
| Component | openmpi |
| Version | 31 |
| Status | CLOSED RAWHIDE |
| Severity | medium |
| Priority | unspecified |
| Reporter | Jean-Noël Grad <jgrad> |
| Assignee | Philip Kovacs <pkdevel> |
| QA Contact | Fedora Extras Quality Assurance <extras-qa> |
| CC | dakingun, dledford, hjelmn, hladky.jiri, junghans, orion, pkdevel |
| Keywords | Reopened |
| Hardware | Unspecified |
| OS | Unspecified |
| Type | Bug |
| Last Closed | 2019-08-31 20:05:54 UTC |
| Bug Blocks | 1728057 |
Description (Jean-Noël Grad, 2019-08-28 19:04:56 UTC)

I took some time to look at this -- the problem is somewhere in the ucx layer. I don't have f31, but I do have f32 rawhide, and I did observe the max tag = 8388608 problem using the get_tag.cc sample provided. I reconfigured openmpi 4.0.1 without ucx, and that resolves the problem:

    mpirun -np 4 ./get_tag
    MPI_TAG_UB = 2147483647
    MPI_TAG_UB = 2147483647
    MPI_TAG_UB = 2147483647
    MPI_TAG_UB = 2147483647

Cheers,
Phil

Not a bug in Open MPI. Read the MPI standard. The MPI implementation is free to pick any upper bound for the tag range; pml/ucx just happens to have a smaller UB than pml/ob1. If boost is using a tag outside the allowed range, then this is a bug in boost.

At least we now understand why the OP is seeing this starting with f31: the addition of ucx support reduced the UB and revealed a bug elsewhere. The path seems clear now to finding the root cause, either in boost upstream or in Fedora's packaging of same downstream.

Created attachment 1609468 [details]
Minimum working example without boost
Thanks to both of you for these clarifications! I can now reproduce the bug when compiling ucx + openmpi + boost from source, which finally allowed me to work on a clean debug build in GDB.

The incorrect tag used by boost::mpi::reduce() has the value -8388608 and is read from the MPI_Status object generated by an MPI_Recv() call (https://github.com/boostorg/mpi/blob/48879409552179b2d830740d0f44dcbfa8890aec/src/point_to_point.cpp#L88-L90). It is immediately used as the tag argument of another MPI_Recv() call (https://github.com/boostorg/mpi/blob/48879409552179b2d830740d0f44dcbfa8890aec/src/point_to_point.cpp#L94-L97), which returns an error code, causing boost::mpi to throw the exception.

When the body of the boost::mpi::environment::max_tag() method (https://github.com/boostorg/mpi/blob/48879409552179b2d830740d0f44dcbfa8890aec/src/environment.cpp#L169-L183) is replaced by `return 8388602;`, the sample program no longer throws an exception, and the value of MPI_Status.MPI_TAG is the expected 8388603 (8388602 + num_reserved_tags). This suggests that using MPI_TAG_UB as a tag is not possible in openmpi 4.0.1 with ucx, even though the MPI 3.1 standard specifically states that MPI_TAG_UB is a valid tag.

The attached send_recv.cpp file shows the bare minimum of MPI communication and fails when the tag is MPI_TAG_UB (note how the received tag got a sign flip):

    [user@300292f4c8e3 ~]$ mpicxx -std=c++11 send_tag.cpp
    [user@300292f4c8e3 ~]$ mpiexec -n 2 a.out
    MPI_TAG_UB = 8388608
    Sent 7 with error 0 and MPI tag 8388608
    Received 7 with error 0 and MPI tag -8388608

This has all the characteristics of an integer overflow on a signed 24-bit integer type, which is undefined behavior. The bug already has a fix in 4.0.2 (https://github.com/open-mpi/ompi/pull/6792). When compiling ucx + openmpi + boost from source with this 4.0.2 fix applied as a patch to the openmpi 4.0.1 sources, I obtain the expected behavior for my sample. So it was an openmpi bug.
That means we need to bump openmpi to 4.0.2 if we intend to continue building with ucx.

Indeed. The UCX devs had an off-by-one error. We (Open MPI) should have test coverage for sending the max tag; apparently we don't. It is fixed in 4.0.2, so that is the best path forward.

Erratum: the bash commands in Comment 5 should read send_recv.cpp instead of send_tag.cpp.

Fixing this bug should also fix Bug 1728057. I've just compiled the espresso package with the patched openmpi 4.0.1 library, and it passed all the tests in a Docker container with Fedora 31.

@Nathan, when can we expect to see openmpi-4.0.2 in rawhide?

I have 4.0.2rc1 done already for rawhide. There is an unrelated problem in rawhide with one of the arches, aarch64, but I can push nevertheless. OK, 4.0.2rc1 is now in rawhide.

Thanks!

Huh?

    DEBUG util.py:585: BUILDSTDERR: Error: transaction check vs depsolve:
    DEBUG util.py:585: BUILDSTDERR: libc.so.6(GLIBC_PRIVATE)(64bit) is needed by openmpi-4.0.2-0.1.rc1.fc32.x86_64
    DEBUG util.py:585: BUILDSTDERR: To diagnose the problem, try running: 'rpm -Va --nofiles --nodigest'.
    DEBUG util.py:585: BUILDSTDERR: You probably have corrupted RPMDB, running 'rpm --rebuilddb' might fix the issue.
    DEBUG util.py:734: Child return code was: 1

Odd, I see it. Perhaps the machine it was built on had a problem. Let me look.

Can we get a bump for f31 as well?

I see the problem. The issue appears to be that the Open MPI compilation unit below is using glibc's private function __mmap:

    openmpi-4.0.2rc1/opal/mca/memory/patcher/memory_patcher_component.c:
        result = __mmap (start, length, prot, flags, fd, offset)   <== ouch, we can't do that!

The configure checks are probably doing compile checks but not link checks, and so do not see that __mmap is a private glibc function. The outcome is a libopen-pal.so.40.20.2 with an unusable symbol:

    readelf -s libopen-pal.so.40.20.2 | grep PRIVATE
    20: 0000000000000000  0 FUNC  GLOBAL DEFAULT  UND __mmap@GLIBC_PRIVATE (4)   <=== No! We mustn't do this.
The ompi folks may have already caught this problem in their source, but the 4.0.2rc1 tarball I am using from their download page either has to be patched or updated. I think the upstream commit for https://github.com/open-mpi/ompi/issues/6853 may help. I'm going to try that.

I'm building the fix now in rawhide. openmpi 4.0.2-0.2.rc1.fc32 is now in rawhide. It should have no private symbol problems. Try it when it hits the mirrors and let me know how it goes.

4.0.2 is now in an acceptable state, closing this. I will merge and build for 31 and pass it to Zbigniew so he can replace 4.0.1 in bodhi with 4.0.2.