Bug 1366328
Summary: | libosmium 2.6+ is FTBFS on aarch64/ppc64le due to failing tests | ||
---|---|---|---|
Product: | [Fedora] Fedora | Reporter: | Peter Robinson <pbrobinson> |
Component: | libosmium | Assignee: | Tom Hughes <tom> |
Status: | CLOSED RAWHIDE | QA Contact: | Fedora Extras Quality Assurance <extras-qa> |
Severity: | unspecified | Docs Contact: | |
Priority: | unspecified | ||
Version: | rawhide | CC: | tom |
Target Milestone: | --- | ||
Target Release: | --- | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | If docs needed, set a value | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2017-01-17 22:30:47 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | |||
Bug Depends On: | |||
Bug Blocks: | 245418, 1071880, 922257, 1051573 |
Description
Peter Robinson
2016-08-11 15:58:19 UTC
I'm struggling to find any way to figure this out without access to aarch64 hardware.

They mostly seem to be heap corruptions so I assumed they would be easy to find with valgrind or ASAN on another platform but I've tried both and found nothing so far.

(In reply to Tom Hughes from comment #1)
> I'm struggling to find any way to figure this out without access to aarch64
> hardware.

You can run a aarch64 install in a VM on x86_64

https://fedoraproject.org/wiki/Architectures/AArch64/F24/Installation#Install_with_QEMU

Yeah I've done that for ARM32 in the past and it's not an experience I particularly want to repeat...

Maybe when I've got a week or two spare I'll take a look.

So I got an F24 VM up and running, distro synced to rawhide and rebooted ready to try a build only to find it no longer boots and bombs out before reaching grub with:

Failed to set MokListRT: Invalid Parameter
FSOpen: Open '\EFI\fedora\grubaa64.efi' Success
Synchronous Exception at 0x00000000B834C498
X0 0x1100069417FFFFE2 X1 0x00000000BBFF0018 X2 0x00000000B83526E8 X3 0x00000000000FD000
X4 0x0000000000000000 X5 0x0000000000000007 X6 0x0000000000000000 X7 0x00000000BBBE24D4
X8 0x0000000000000208 X9 0x00000000BF02AA00 X10 0x0000000000000023 X11 0x00000000000000AB
X12 0x0000000070FFE07A X13 0x0000000000000000 X14 0x0000000000000000 X15 0x0000000000000000
X16 0x00000000BF02AC80 X17 0x0000000000000000 X18 0x0000000000000000 X19 0x00000000BBFF0018
X20 0x0000000000000000 X21 0x00000000B8352000 X22 0x0000000000000000 X23 0xAA1303E4F940E022
X24 0x0000000000000000 X25 0x0000000000000000 X26 0x0000000000000000 X27 0x0000000000000000
X28 0x0000000000000000 FP 0x00000000BF02AA10 LR 0x00000000B834D040
V0 0x0000000000000000 0000000000000000 V1 0x0000000000000000 0000000000000000
V2 0x0000000000000000 0000000000000000 V3 0x0000000000000000 0000000000000000
V4 0x0000000000000000 0000000000000000 V5 0x0000000000000000 0000000000000000
V6 0x0000000000000000 0000000000000000 V7 0x0000000000000000 0000000000000000
V8 0x0000000000000000 0000000000000000 V9 0x0000000000000000 0000000000000000
V10 0x0000000000000000 0000000000000000 V11 0x0000000000000000 0000000000000000
V12 0x0000000000000000 0000000000000000 V13 0x0000000000000000 0000000000000000
V14 0x0000000000000000 0000000000000000 V15 0x0000000000000000 0000000000000000
V16 0x0000000000000000 0000000000000000 V17 0x0000000000000000 0000000000000000
V18 0x0000000000000000 0000000000000000 V19 0x0000000000000000 0000000000000000
V20 0x0000000000000000 0000000000000000 V21 0x0000000000000000 0000000000000000
V22 0x0000000000000000 0000000000000000 V23 0x0000000000000000 0000000000000000
V24 0x0000000000000000 0000000000000000 V25 0x0000000000000000 0000000000000000
V26 0x0000000000000000 0000000000000000 V27 0x0000000000000000 0000000000000000
V28 0x0000000000000000 0000000000000000 V29 0x0000000000000000 0000000000000000
V30 0x0000000000000000 0000000000000000 V31 0x0000000000000000 0000000000000000
SP 0x00000000BF02AA10 ELR 0x00000000B834C498 SPSR 0x60000305 FPSR 0x00000000
ESR 0x94000004 FAR 0x1100069417FFFFE2
ESR : EC 0x25 IL 0x0 ISS 0x00000004
Data abort: Translation fault, zeroth level
ASSERT [ArmCpuDxe] /builddir/build/BUILD/tianocore-edk2-a8c39ba/ArmPkg/Library/DefaultExceptionHandlerLib/AArch64/DefaultExceptionHandler.c(184): ((BOOLEAN)(0==1))

So I've managed to build this on aarch64 now and run valgrind on one of the failing tests and the first report is:

==11158== Thread 2 _osmium_write:
==11158== Invalid read of size 8
==11158==    at 0x1B5674: wait (future:325)
==11158==    by 0x1B5674: _M_get_result (future:687)
==11158==    by 0x1B5674: get (future:766)
==11158==    by 0x1B5674: pop (queue_util.hpp:142)
==11158==    by 0x1B5674: operator() (write_thread.hpp:85)
==11158==    by 0x1B5674: osmium::io::Writer::write_thread(osmium::thread::Queue<std::future<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >&, std::unique_ptr<osmium::io::Compressor, std::default_delete<osmium::io::Compressor> >&&, std::promise<bool>&&) (writer.hpp:124)
==11158==    by 0x49BABFB: ??? (in /usr/lib64/libstdc++.so.6.0.22)
==11158==    by 0x48C7173: start_thread (in /usr/lib64/libpthread-2.24.90.so)
==11158==    by 0x4C93F87: thread_start (in /usr/lib64/libc-2.24.90.so)
==11158== Address 0x4d6cba0 is 16 bytes inside a block of size 48 free'd
==11158==    at 0x48854F4: operator delete(void*) (vg_replace_malloc.c:576)
==11158==    by 0x1B58A3: _M_release (shared_ptr_base.h:166)
==11158==    by 0x1B58A3: ~__shared_count (shared_ptr_base.h:662)
==11158==    by 0x1B58A3: ~__shared_ptr (shared_ptr_base.h:928)
==11158==    by 0x1B58A3: ~shared_ptr (shared_ptr.h:93)
==11158==    by 0x1B58A3: ~__basic_future (future:641)
==11158==    by 0x1B58A3: ~future (future:731)
==11158==    by 0x1B58A3: destroy<std::future<std::__cxx11::basic_string<char> > > (new_allocator.h:124)
==11158==    by 0x1B58A3: destroy<std::future<std::__cxx11::basic_string<char> > > (alloc_traits.h:467)
==11158==    by 0x1B58A3: pop_front (stl_deque.h:1554)
==11158==    by 0x1B58A3: pop (stl_queue.h:271)
==11158==    by 0x1B58A3: wait_and_pop (queue.hpp:171)
==11158==    by 0x1B58A3: pop (queue_util.hpp:141)
==11158==    by 0x1B58A3: operator() (write_thread.hpp:85)
==11158==    by 0x1B58A3: osmium::io::Writer::write_thread(osmium::thread::Queue<std::future<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >&, std::unique_ptr<osmium::io::Compressor, std::default_delete<osmium::io::Compressor> >&&, std::promise<bool>&&) (writer.hpp:124)
==11158==    by 0x49BABFB: ??? (in /usr/lib64/libstdc++.so.6.0.22)
==11158==    by 0x48C7173: start_thread (in /usr/lib64/libpthread-2.24.90.so)
==11158==    by 0x4C93F87: thread_start (in /usr/lib64/libc-2.24.90.so)
==11158== Block was alloc'd at
==11158==    at 0x48843D4: operator new(unsigned long) (vg_replace_malloc.c:334)
==11158==    by 0x1B94AF: allocate (new_allocator.h:104)
==11158==    by 0x1B94AF: allocate (alloc_traits.h:416)
==11158==    by 0x1B94AF: __allocate_guarded<std::allocator<std::_Sp_counted_ptr_inplace<std::__future_base::_State_baseV2, std::allocator<std::__future_base::_State_baseV2>, (__gnu_cxx::_Lock_policy)2u> > > (allocated_ptr.h:103)
==11158==    by 0x1B94AF: __shared_count<std::__future_base::_State_baseV2, std::allocator<std::__future_base::_State_baseV2> > (shared_ptr_base.h:613)
==11158==    by 0x1B94AF: __shared_ptr<std::allocator<std::__future_base::_State_baseV2> > (shared_ptr_base.h:1100)
==11158==    by 0x1B94AF: shared_ptr<std::allocator<std::__future_base::_State_baseV2> > (shared_ptr.h:319)
==11158==    by 0x1B94AF: allocate_shared<std::__future_base::_State_baseV2, std::allocator<std::__future_base::_State_baseV2> > (shared_ptr.h:620)
==11158==    by 0x1B94AF: make_shared<std::__future_base::_State_baseV2> (shared_ptr.h:636)
==11158==    by 0x1B94AF: promise (future:1024)
==11158==    by 0x1B94AF: void osmium::io::detail::add_to_queue<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >(osmium::thread::Queue<std::future<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&&) (queue_util.hpp:81)
==11158==    by 0x1B98C7: send_to_output_queue (output_format.hpp:117)
==11158==    by 0x1B98C7: osmium::io::detail::XMLOutputFormat::write_header(osmium::io::Header const&) (xml_output_format.hpp:475)
==11158==    by 0x1BA46F: operator() (writer.hpp:231)
==11158==    by 0x1BA46F: ensure_cleanup<osmium::io::Writer::Writer(const osmium::io::File&, TArgs&& ...) [with TArgs = {osmium::io::Header&, osmium::io::overwrite}]::<lambda()> > (writer.hpp:152)
==11158==    by 0x1BA46F: osmium::io::Writer::Writer<osmium::io::Header&, osmium::io::overwrite>(osmium::io::File const&, osmium::io::Header&, osmium::io::overwrite&&) (writer.hpp:230)
==11158==    by 0x1BAA9F: osmium::io::Writer::Writer<osmium::io::Header&, osmium::io::overwrite>(char const*, osmium::io::Header&, osmium::io::overwrite&&) (writer.hpp:242)
==11158==    by 0x1B048B: ____C_A_T_C_H____T_E_S_T____7() (test_output_iterator.cpp:11)
==11158==    by 0x1C81EB: invoke (catch.hpp:6582)
==11158==    by 0x1C81EB: invoke (catch.hpp:7519)
==11158==    by 0x1C81EB: invokeActiveTestCase (catch.hpp:6158)
==11158==    by 0x1C81EB: runCurrentTest (catch.hpp:6129)
==11158==    by 0x1C81EB: runTest (catch.hpp:5949)
==11158==    by 0x1C81EB: Catch::runTests(Catch::Ptr<Catch::Config> const&) (catch.hpp:6297)
==11158==    by 0x1AF597: run (catch.hpp:6405)
==11158==    by 0x1AF597: run (catch.hpp:6384)
==11158==    by 0x1AF597: main (catch.hpp:10333)

which relates to this piece of code:

std::future<T> data_future;
m_queue.wait_and_pop(data_future);
data = std::move(data_future.get());

and specifically seems to say that the attempt to get the value of the future on the third line is accessing memory that was freed while popping the future from the queue on the previous line.

That doesn't seem to make much sense though, because wait_and_pop is doing:

value = std::move(m_queue.front());
m_queue.pop();

so the returned value is moved out of the queue before the pop and hence the pop should be destroying an empty value, not the returned value.
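For illustration, the move-then-pop pattern described above can be reproduced in isolation. This is only a minimal sketch under stated assumptions, not libosmium's code: `SketchQueue`, `front_is_empty_after_move` and `pop_one` are hypothetical names, and the queue is a simplified stand-in for `osmium::thread::Queue`. It shows what the standard guarantees on a correct toolchain: a moved-from `std::future` reports `valid() == false`, so the `pop()` destroys an empty future and the shared state stays owned by the caller's future.

```cpp
#include <cassert>
#include <condition_variable>
#include <future>
#include <mutex>
#include <queue>
#include <string>
#include <utility>

// Hypothetical, simplified stand-in for osmium::thread::Queue.
template <typename T>
class SketchQueue {
    std::mutex m_mutex;
    std::condition_variable m_data_available;
    std::queue<T> m_queue;

public:
    void push(T value) {
        {
            std::lock_guard<std::mutex> lock{m_mutex};
            m_queue.push(std::move(value));
        }
        m_data_available.notify_one();
    }

    // Same two steps as quoted in the comment: move the front out, then pop.
    void wait_and_pop(T& value) {
        std::unique_lock<std::mutex> lock{m_mutex};
        m_data_available.wait(lock, [this] { return !m_queue.empty(); });
        value = std::move(m_queue.front());
        m_queue.pop();
    }
};

// Checks that the element left in the queue is empty before it is popped.
bool front_is_empty_after_move() {
    std::queue<std::future<std::string>> q;
    std::promise<std::string> p;
    q.push(p.get_future());
    std::future<std::string> out = std::move(q.front());
    const bool emptied = !q.front().valid(); // moved-from future has no state
    q.pop();                                 // destroys only the empty future
    return emptied && out.valid();           // the state now lives in 'out'
}

// Pushes one ready future through the queue and reads its value back.
std::string pop_one(const std::string& payload) {
    SketchQueue<std::future<std::string>> queue;
    std::promise<std::string> promise;
    queue.push(promise.get_future());
    promise.set_value(payload);

    std::future<std::string> data_future;
    queue.wait_and_pop(data_future);
    assert(data_future.valid());
    // get() returns the string by value, so no std::move is needed here.
    return data_future.get();
}
```

Calling `front_is_empty_after_move()` and `pop_one("header")` from a test program built with thread support (for example `-pthread` with GCC) exercises both paths. If the invalid read still appears on such a reduced case, that would point away from libosmium and towards the compiler or libstdc++.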
> which relates to this piece of code:
>
> std::future<T> data_future;
> m_queue.wait_and_pop(data_future);
> data = std::move(data_future.get());
>
> and specifically seems to say that the attempt to get the value of the
> future on the third line is accessing memory that was freed while popping
> the future from the queue on the previous line.
>
> That doesn't seem to make much sense though, because wait_and_pop is doing:
>
> value = std::move(m_queue.front());
> m_queue.pop();
>
> so the returned value is moved out of the queue before the pop and hence the
> pop should be destroying an empty value, not the returned value.
Compiler error?
I'm wondering about a possible compiler or libstdc++ issue yes...

This also seems to fail in the same way on ppc64le (but not ppc64) so I have excluded that as well for now.

Failed scratch build logs for 2.10.2 on each:

https://kojipkgs.fedoraproject.org/work/tasks/2495/16472495/build.log
https://kojipkgs.fedoraproject.org/work/tasks/2499/16472499/build.log

Also reported upstream now:

https://github.com/osmcode/libosmium/issues/176

Restoring correct trackers per current ExcludeArch policy (see https://fedoraproject.org/wiki/Packaging:Guidelines#Architecture_Build_Failures).

I have no idea what changed but libosmium 2.11.0 has built successfully and passed tests on all architectures so either something has changed in libosmium or some bug has been fixed in the toolchain or system libraries.