Note: This is a public test instance of Red Hat Bugzilla. The data contained within is a snapshot of the live data so any changes you make will not be reflected in the production Bugzilla. Email is disabled so feel free to test any aspect of the site that you want. File any problems you find or give feedback at bugzilla.redhat.com.
Bug 1477354 - ld.bfd: binutils 2.29 dropped global visibility of implicit symbols pertaining the (orphaned) sections (__{start,stop}_S) on some architectures (not ppc64*)
Summary: ld.bfd: binutils 2.29 dropped global visibility of implicit symbols pertainin...
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Fedora
Classification: Fedora
Component: binutils
Version: rawhide
Hardware: Unspecified
OS: Unspecified
unspecified
unspecified
Target Milestone: ---
Assignee: Nick Clifton
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2017-08-01 21:03 UTC by Jan Pokorný [poki]
Modified: 2018-02-07 11:15 UTC (History)
3 users (show)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-08-04 07:18:03 UTC
Type: Bug
Embargoed:


Attachments (Terms of Use)


Links
System ID Private Priority Status Summary Last Updated
Red Hat Bugzilla 1478089 0 unspecified CLOSED logging feature silently severed due to changed treatment of orphaned sections in ld.bfd/binutils-2.29 2021-02-22 00:41:40 UTC
Red Hat Bugzilla 1494598 0 unspecified CLOSED Deprecation of liblvm API and python bindings 2021-02-22 00:41:40 UTC
Red Hat Bugzilla 1500898 0 unspecified CLOSED [ld.gold] message-less compilation results in "error while loading shared libraries: unexpected reloc type 0x0b" in run-... 2021-02-22 00:41:40 UTC
Sourceware 21964 0 None None None 2017-10-04 11:47:33 UTC

Internal Links: 1478089 1494598 1500898

Description Jan Pokorný [poki] 2017-08-01 21:03:36 UTC
All started by unexpected libqb build failure in recent mass rebuild
in rawhide and x86_64 architecture, which should not have been affected
by the commonly observed failure elsewhere in that round:

https://koji.fedoraproject.org/koji/getfile?taskID=20791411&volume=DEFAULT&name=build.log&offset=-1824

> PASS: array.test
> PASS: map.test
> PASS: rb.test
> FAIL: log.test
> PASS: blackbox-segfault.sh
> PASS: loop.test
> PASS: ipc.test
> PASS: resources.test

Log test failed.  After some investigation, I realized this is due to
some linker change between 2.28 an 2.29, and even identified commits
that likely introduced the change leading to the failed tests, as
reverting them restores the expected outcome[1]:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;a=commit;h=7dba9362c172f1073487536eb137feb2da30b0ff
https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;a=commit;h=b27685f2016c510d03ac9a64f7b04ce8efcf95c4
https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;a=commit;h=cbd0eecf261c2447781f8c89b0d955ee66fae7e9

Indeed, libqb is using orphaned section heavily when it detects the
compiler supports __attribute__((section("foo"))) construct.
My understanding is that it's so as to have the static data stored
in compile-time, with an alternative approach of extracting them
in run-time, which is less efficient, indeed.

Sadly, I was unable to come up with a better minimal case that would also
fit the libqb pattern, but at least it shows something has changed:

=== dummylib.c ===

#include <stdio.h>
#define JOIN_PASS(a, b) a##b
#define JOIN(a, b)  JOIN_PASS(a, b)
#define STRINGIFY_PASS(a) #a
#define STRINGIFY(a) STRINGIFY_PASS(a)
#define SECTION_NAME __foo
#define SECTION_START JOIN(__start_, SECTION_NAME)
#define SECTION_STR STRINGIFY(SECTION_NAME)
extern int SECTION_START[];
static void dummy(void) {
        { static int __attribute__((section(SECTION_STR))) z = 13; }
        { static int __attribute__((section(SECTION_STR))) z = 17; }
        printf("dummy finished %d %d\n",
               *((int *) SECTION_START), *((int *) SECTION_START + 1));
}       
int do_something(int n) { dummy(); return n*n; }

=== test.c ===

#include <stdio.h>
#define JOIN_PASS(a, b) a##b
#define JOIN(a, b)  JOIN_PASS(a, b)
#define STRINGIFY_PASS(a) #a
#define STRINGIFY(a) STRINGIFY_PASS(a)
#define SECTION_NAME __foo
#define SECTION_START JOIN(__start_, SECTION_NAME)
#define SECTION_STR STRINGIFY(SECTION_NAME)
extern int SECTION_START[];
int do_something(int n);
static void test(void) {
        { static int __attribute__((section(SECTION_STR))) z = 42; }
        { static int __attribute__((section(SECTION_STR))) z = 10; }
        printf("test finished %d %d\n",
               *((int *) SECTION_START), *((int *) SECTION_START + 1));
}
int main(char argc, char *argv[]) {
        int ret = do_something(argc);
        test();
        printf("got it? %d %d\n",
              *((int *) SECTION_START), *((int *) SECTION_START + 1));
        return ret;
}

===

# gcc -shared -fPIC dummylib.c -o libdummy.so
# gcc -o test -L. -ldummy test.c
# LD_LIBRARY_PATH=. ./test 

ld.bfd @ 2.28:
> dummy finished 42 10
> test finished 42 10
> got it? 42 10

ld.bfd @ 2.29:
> dummy finished 13 17
> test finished 42 10
> got it? 42 10


In libqb unit test that now fails, the problem may be similar:
the library's section "wins" over binary's proper one within the
in-library-call.  As an aside, ld.gold was likely never applied
with libqb, as that won't work regardless of binutils version;
the output with the above minimal example:

> ./test: Symbol `__start___foo' causes overflow in R_X86_64_PC32 relocation
> ./test: Symbol `__start___foo' causes overflow in R_X86_64_PC32 relocation
> ./test: Symbol `__start___foo' causes overflow in R_X86_64_PC32 relocation
> ./test: Symbol `__start___foo' causes overflow in R_X86_64_PC32 relocation

On the other hand, the established arrangement within libqb served well
for the past 6+ years, so I'd dare to call the change in ld.bfd behaviour
a regression.  What's worse, any inflicted loss of logging output happens
silently, making people realize there's a problem when they would need
some clues as to what happens the most :-/



[1] http://oss.clusterlabs.org/pipermail/developers/2017-July/000504.html

Comment 1 Jan Pokorný [poki] 2017-08-02 08:20:59 UTC
re [comment 0]:

> In libqb unit test that now fails, the problem may be similar:
> the library's section "wins" over binary's proper one within the
> in-library-call.

Indeed, a probable root cause in libqb is that qb_log_init
(in-library-call) refers to the pointers for section boundaries that
are expected to refer to the context of "main compilation unit" (possibly
out-of-library), but with ld.bfd @ 2.29, they refer to the library
context uncondtionally:

https://github.com/ClusterLabs/libqb/blob/v1.0.2/lib/log.c#L873

Trying to set various visibility attributes at the extern declaration for
the section boundary symbols was fruitless:

https://github.com/ClusterLabs/libqb/blob/v1.0.2/include/qb/qblog.h#L268


Is there any apparent solution (barring a complete rethinking) for libqb
to stay compatible with unpatched binutils 2.29?

Any idea on a shortest test possible (rather than being fixed on
ld.bfd --version) for a configure script to detect misbehaving linker
in this sense?  The example in [comment 0] feels still too large.

Can this indeed be considered a bug in ld.bfd?

Comment 2 Nick Clifton 2017-08-02 12:00:29 UTC
Hi Jan,

  Relying upon the linker's orphan section placement algorithm to remain
  constant is a very bad idea IMHO.  As an alternative, why not tell the
  linker exactly what you want ?  For example with the small test case you
  reported, if you change the second gcc command to this:

    gcc -Wl,foo.t -o test -L. -ldummy test.c 

  then I think that the test will work.  The foo.t file looks like this:

    # cat foo.t
    SECTIONS
    {
      __foo :
      {
        PROVIDE (__start___foo = .);
        *(__foo);
      }
    }

  This just tells the linker to accumulate all __foo sections together and
  to define a symbol called __start___foo at the beginning of the section.
  Since the file is included on the command line as an input file, rather
  than being the argument of a -T or --script option, the linker will use
  the contents to *augment* the built-in linker script, rather than override
  it.  This solution also has the advantage of being backwards compatible,
  so it should work the 2.28 linker as well.

  Does this solve the problem for you ?

Cheers
  Nick

Comment 3 Jan Pokorný [poki] 2017-08-02 12:53:50 UTC
Nick, thanks for the response.

Yes, it's entirely possible that previously libqb was relying on some
behaviour that was working just ouf of pure luck (though it lasted for
at least 6 recent years).

And sadly, nope, the proposed update to the build recipe didn't help,
furthermore I am getting this complaint:

> /usr/bin/ld: warning: foo.t contains output sections; did you forget -T?


Easy way out would be to extend the interface of the library function:

=== dummylib.c ===

#include <stdio.h>
#define JOIN_PASS(a, b) a##b
#define JOIN(a, b)  JOIN_PASS(a, b)
#define STRINGIFY_PASS(a) #a
#define STRINGIFY(a) STRINGIFY_PASS(a)
#define SECTION_NAME __foo
#define SECTION_START JOIN(__start_, SECTION_NAME)
#define SECTION_STR STRINGIFY(SECTION_NAME)
extern int SECTION_START[];
static void dummy(const int *start) {
        { static int __attribute__((section(SECTION_STR))) z = 13; }
        { static int __attribute__((section(SECTION_STR))) z = 17; }
        printf("dummy finished %d %d\n",
               *((int *) start), *((int *) start + 1));
}       
int do_something(int n, const int *start) { dummy(start); return n*n; }

=== test.c ===

#include <stdio.h>
#define JOIN_PASS(a, b) a##b
#define JOIN(a, b)  JOIN_PASS(a, b)
#define STRINGIFY_PASS(a) #a
#define STRINGIFY(a) STRINGIFY_PASS(a)
#define SECTION_NAME __foo
#define SECTION_START JOIN(__start_, SECTION_NAME)
#define SECTION_STR STRINGIFY(SECTION_NAME)
extern int SECTION_START[];
int do_something(int n, const int *start);
static void test(void) {
        { static int __attribute__((section(SECTION_STR))) z = 42; }
        { static int __attribute__((section(SECTION_STR))) z = 10; }
        printf("test finished %d %d\n",
               *((int *) SECTION_START), *((int *) SECTION_START + 1));
}
int main(char argc, char *argv[]) {
        int ret = do_something(argc, SECTION_START);
        test();
        printf("got it? %d %d\n",
              *((int *) SECTION_START), *((int *) SECTION_START + 1));
        return ret;
}

===

But what if we don't want to change do_something interface at any cost?
Can we somehow arrange for programmatically (dlopen + dlsym?) accessing
SECTION_START of test.c from within do_something (within dummylib.c)?

Comment 4 Nick Clifton 2017-08-02 15:03:22 UTC
(In reply to Jan Pokorný from comment #3)

Hi Jan,

> And sadly, nope, the proposed update to the build recipe didn't help,

What happened - did the behaviour change at all ?


> furthermore I am getting this complaint:
> 
> > /usr/bin/ld: warning: foo.t contains output sections; did you forget -T?

Yes, I was hoping that that would not matter to you.  (You can ignore the
message, it does not effect the output).  I also note that when I tried
your original test case I received this warning from the linker:

 ld: warning: type and size of dynamic symbol `__start___foo' are not defined

Do you get this too ?

> But what if we don't want to change do_something interface at any cost?
> Can we somehow arrange for programmatically (dlopen + dlsym?) accessing
> SECTION_START of test.c from within do_something (within dummylib.c)?

I must admit that I still do not really understand why libqb is doing this.  What is wrong with allowing the compiler and linker to do their thing and arrange for access to data in the normal way ?

One idea - untested - is that in the library, instead of referring to __start___foo, you refer to the address of the first declared symbol ?  
(Hoping that the linker does not rearrange the order of the symbols, which
of course it may well do).

Cheers
  Nick

Comment 5 Jan Pokorný [poki] 2017-08-02 17:13:57 UTC
re [comment 4]

> What happened - did the behaviour change at all ?

Yep, mere combination of files/commands from [comment 0] and [comment 2]
did not change the behaviour at all.


>> furthermore I am getting this complaint:
>> 
>> > /usr/bin/ld: warning: foo.t contains output sections; did you forget -T?
> 
> Yes, I was hoping that that would not matter to you.  (You can ignore the
> message, it does not effect the output).

It doesn't matter to me in any other way than, frankly, I don't understand
low-level linker magic (and barely some user-facing options), so whatever is
complained about can have consequences I cannot predict :)

> I also note that when I tried
> your original test case I received this warning from the linker:
> 
>  ld: warning: type and size of dynamic symbol `__start___foo' are not defined
> 
> Do you get this too ?

No, haven't seen this with binutils-2.29-5.fc27.x86_64,
gcc-7.1.1-6.fc27.x86_64.

> I must admit that I still do not really understand why libqb is
> doing this.  What is wrong with allowing the compiler and linker to
> do their thing and arrange for access to data in the normal way ?

Original author is no longer active (or with RH, for that matter).
As noted above, I am only coincidentally getting to explore this
rabbit hole.  I guess there could be two reasons why qb_log_init is
not using the pattern I sketched in [comment 3]:

- a goal to maintain ABI/API compatibility from times there wasn't
  this compile-time offloading of run-time work
- unified approach (single header to do the right job) used for
  both intra-library and actual user logging, and it was working
  well with ld.bfd prior to 2.29

(rhetorical question: is there only a single project affected by this
change?  I would doubt that)


> One idea - untested - is that in the library, instead of referring to
> __start___foo, you refer to the address of the first declared symbol ?
> (Hoping that the linker does not rearrange the order of the symbols,
> which of course it may well do).

I am not sure I follow, but I tried to workaround the section aliasing
issue by adding one more section that's expected to be present only
in binary, not in the library:

--- dummylib.c.orig	2017-08-02 18:28:53.252179427 +0200
+++ dummylib.c	2017-08-02 19:07:35.888675684 +0200
@@ -7,10 +7,18 @@
 #define SECTION_START JOIN(__start_, SECTION_NAME)
 #define SECTION_STR STRINGIFY(SECTION_NAME)
 extern int SECTION_START[];
+#define CROSSUNIT 1
+#if CROSSUNIT
+extern int *__start___bar[];
+#endif
 static void dummy(void) {
         { static int __attribute__((section(SECTION_STR))) z = 13; }
         { static int __attribute__((section(SECTION_STR))) z = 17; }
         printf("dummy finished %d %d\n",
+#if CROSSUNIT
+               **((int **) __start___bar), *(*((int **) __start___bar) + 1));
+#else
                *((int *) SECTION_START), *((int *) SECTION_START + 1));
+#endif
 }       
 int do_something(int n) { dummy(); return n*n; }

--- test.c.orig	2017-08-02 18:36:24.460664287 +0200
+++ test.c	2017-08-02 19:06:31.324606291 +0200
@@ -7,14 +7,17 @@
 #define SECTION_START JOIN(__start_, SECTION_NAME)
 #define SECTION_STR STRINGIFY(SECTION_NAME)
 extern int SECTION_START[];
+extern int *__start___bar[];
 int do_something(int n);
 static void test(void) {
-        { static int __attribute__((section(SECTION_STR))) z = 42; }
+        { static int __attribute__((section(SECTION_STR))) z = 42;
+          static int __attribute__((section("__bar"))) *y = &z; }
         { static int __attribute__((section(SECTION_STR))) z = 10; }
         printf("test finished %d %d\n",
                *((int *) SECTION_START), *((int *) SECTION_START + 1));
 }
 int main(char argc, char *argv[]) {
+        printf("**__bar_start == %d\n", **((int **) __start___bar));
         int ret = do_something(argc);
         test();
         printf("got it? %d %d\n",


With that "define CROSSUNIT 1", I am getting this linker error:

> /usr/bin/ld: test: hidden symbol `__start___bar' in test is referenced by DSO
> /usr/bin/ld: final link failed: Bad value
> collect2: error: ld returned 1 exit status

When I "define CROSSUNIT 0", the binary is created, with output:

> **__bar_start == 42
> dummy finished 13 17
> test finished 42 10
> got it? 42 10

(**__bar_start is per expectation there)

Comment 6 Jan Pokorný [poki] 2017-08-02 17:23:25 UTC
Furthermore, "section start equals address of the first item being put
here" seems pretty fragile, because I can imagine more complex cases
where it is hard to say which item will be "first" from the buildchain
perspective.

Comment 7 Jan Pokorný [poki] 2017-08-02 19:15:59 UTC
Ok, I think I have a sketch of possible solution, using sort of
out-of-band-signaling (extraneous exported symbols).

=== dummylib.c ===

#include <stdio.h>                                                                                    
#define JOIN_PASS(a, b) a##b                                                                          
#define JOIN(a, b)  JOIN_PASS(a, b)                                                                   
#define STRINGIFY_PASS(a) #a                                                                          
#define STRINGIFY(a) STRINGIFY_PASS(a)                                                                
#define SECTION_NAME __foo                                                                            
#define SECTION_START JOIN(__start_, SECTION_NAME)                                                    
#define SECTION_STOP JOIN(__stop_, SECTION_NAME)                                                      
#define SECTION_STR STRINGIFY(SECTION_NAME)                                                           
extern int SECTION_START[];                                                                           
extern int SECTION_STOP[];                                                                            
extern int *main_section_first;
extern int *main_section_last;
int *main_section_first __attribute__((weak)) = NULL;
int *main_section_last __attribute__((weak)) = NULL;
static void dummy(void) {
        { static int __attribute__((section(SECTION_STR))) z = 13; }
        { static int __attribute__((section(SECTION_STR))) z = 17; }                                  
        { static int __attribute__((section(SECTION_STR))) z = 23; }                                  
        if (main_section_first == NULL || main_section_last == NULL) {
                main_section_first = SECTION_START;
                main_section_last = SECTION_STOP;
        }
        for (int *i = main_section_first; i < main_section_last; i++) {
                printf("dummy: %d\n", *i);
        }
}       
int do_something(int n) { dummy(); return n*n; }

=== test.c ===

#include <stdio.h>                                                                                    
#define JOIN_PASS(a, b) a##b                                                                          
#define JOIN(a, b)  JOIN_PASS(a, b)                                                                   
#define STRINGIFY_PASS(a) #a                                                                          
#define STRINGIFY(a) STRINGIFY_PASS(a)                                                                
#define SECTION_NAME __foo                                                                            
#define SECTION_START JOIN(__start_, SECTION_NAME)                                                    
#define SECTION_STOP JOIN(__stop_, SECTION_NAME)                                                      
#define SECTION_STR STRINGIFY(SECTION_NAME)                                                           
extern int SECTION_START[];                                                                           
extern int SECTION_STOP[];                                                                            
int *main_section_first = SECTION_START;
#if 1
int *main_section_last = SECTION_STOP;
#endif
int do_something(int n);
static void test(void) {
        { static int __attribute__((section(SECTION_STR))) z = 42; }                                  
        { static int __attribute__((section(SECTION_STR))) z = 10; }                                  
        { static int __attribute__((section(SECTION_STR))) z = 1; }
        printf("test finished %d %d\n",
               *((int *) SECTION_START), *((int *) SECTION_START + 1));
}       
int main(char argc, char *argv[]) {
        printf("*main_section_first == %d\n", *((int *) main_section_first));
        int ret = do_something(argc);
        test();
        printf("got it? %d %d\n",
              *((int *) SECTION_START), *((int *) SECTION_START + 1));
        return ret;
}

===

When "#if 1":

> *main_section_first == 42
> dummy: 42
> dummy: 10
> dummy: 1
> test finished 42 10
> got it? 42 10

and when changed to "#if 0":

> *main_section_first == 42
> dummy: 13
> dummy: 17
> dummy: 23
> test finished 42 10
> got it? 42 10

That could achieve basic forward/backward compatibility (indeed, the
reported case, unpatched libqb vs. ld.bfd @ 2.29 won't work, perhaps
there could be some additional one-off self-test for libqb users
compiled against newer libqb when the older one is actually in use
to prevent mentioned no-logs issue?).  Plus some library-specific
tweaking.

More importantly, such a change would also bring a compatibility with
ld.gold -> portability++.

That being said, I'd take any possibly further discussion regarding
the solution on the libqb side out of this bug (except if you have
anything immediately to add, Nick, anyway, thanks for the assistance
so far), I'll only post final commit(s) link when that's ready.

It's up to consideration what the risk of other SW being broken by
this unexpected change is in the wild.  Probably only time will show.

Comment 8 Jan Pokorný [poki] 2017-08-03 12:29:56 UTC
After further contemplation, here's the actually working solution across
both ld.bfd {2.28,2.29}, but not with ld.gold, which still might be
solvable when -Wl,<file> is turned into proper "-T <file>":

=== dummylib.c ===

#include <stdio.h>
#define JOIN_PASS(a, b) a##b
#define JOIN(a, b)  JOIN_PASS(a, b)
#define STRINGIFY_PASS(a) #a
#define STRINGIFY(a) STRINGIFY_PASS(a)
#define SECTION_NAME __foo
#define SECTION_START JOIN(__start_, SECTION_NAME)
#define SECTION_STOP JOIN(__stop_, SECTION_NAME)
#define SECTION_STR STRINGIFY(SECTION_NAME)
extern int SECTION_START[];
extern int SECTION_STOP[];

// following only needed to force SECTION_{START,STOP} be global with ld 2.29
int __attribute__((weak)) SECTION_START[] = { 0 };
int __attribute__((weak)) SECTION_STOP[] = { 0 };

static void dummy(void) {
        { static int __attribute__((section(SECTION_STR))) z = 13; }
        { static int __attribute__((section(SECTION_STR))) z = 17; }
        printf("dummy finished %d %d\n",
               *((int *) SECTION_START), *((int *) SECTION_STOP - 1));
}       
int do_something(int n) { dummy(); return n*n; }

=== test.c ===

#include <stdio.h>
#define JOIN_PASS(a, b) a##b
#define JOIN(a, b)  JOIN_PASS(a, b)
#define STRINGIFY_PASS(a) #a
#define STRINGIFY(a) STRINGIFY_PASS(a)
#define SECTION_NAME __foo
#define SECTION_START JOIN(__start_, SECTION_NAME)
#define SECTION_STOP JOIN(__stop_, SECTION_NAME)
#define SECTION_STR STRINGIFY(SECTION_NAME)
extern int SECTION_START[];
extern int SECTION_STOP[];
int do_something(int n);
static void test(void) {
        { static int __attribute__((section(SECTION_STR))) z = 42; }
        { static int __attribute__((section(SECTION_STR))) z = 10; }
        printf("test finished %d %d\n",
               *((int *) SECTION_START), *((int *) SECTION_STOP - 1));
}
int main(char argc, char *argv[]) {
        int ret = do_something(argc);
        test();
        printf("got it? %d %d\n",
              *((int *) SECTION_START), *((int *) SECTION_STOP - 1));
        return ret;
}

=== foo.t ===

SECTIONS
{
  __foo :
  {
    __start___foo = .;
    *(__foo);
    __stop___foo = .;
  }
}

===

# gcc -Wl,foo.t -shared -fPIC dummylib.c -o libdummy.so
# gcc -Wl,foo.t -o test -L. -ldummy test.c
# LD_LIBRARY_PATH=. ./test
> dummy finished 42 10
> test finished 42 10
> got it? 42 10


Node the following readelf outputs, similar for both 2.28 and 2.29, were
very similar out of the box with files and recipe in [comment 0] with
2.28:

# readelf -s libdummy.so | grep foo
>     7: 0000000000601030     4 OBJECT  GLOBAL DEFAULT   25 __start___foo
>     9: 0000000000601038     4 OBJECT  GLOBAL DEFAULT   25 __stop___foo
>    48: 0000000000601038     4 OBJECT  GLOBAL DEFAULT   25 __stop___foo
>    52: 0000000000601030     4 OBJECT  GLOBAL DEFAULT   25 __start___foo

# readelf -s test | grep foo
>     6: 0000000000a01030     4 OBJECT  GLOBAL DEFAULT   26 __start___foo
>     8: 0000000000a01038     4 OBJECT  GLOBAL DEFAULT   26 __stop___foo
>    50: 0000000000a01038     4 OBJECT  GLOBAL DEFAULT   26 __stop___foo
>    56: 0000000000a01030     4 OBJECT  GLOBAL DEFAULT   26 __start___foo

whereas those original files + recipe with 2.29 resulted in:

# readelf -s test | grep foo
>    42: 0000000000601034     0 NOTYPE  LOCAL  DEFAULT   24 __stop___foo
>    43: 000000000060102c     0 NOTYPE  LOCAL  DEFAULT   24 __start___foo
# readelf -s libdummy.so | grep foo
>    41: 0000000000201028     0 NOTYPE  LOCAL  DEFAULT   23 __stop___foo
>    42: 0000000000201020     0 NOTYPE  LOCAL  DEFAULT   23 __start___foo

which could, arguably, have been a regression.


Now, what bothers me, libqb library users might forget this "-Wl,foo.t"
equivalent easily, even if it will be provided through
"pkg-config --libs (or rather --cflags?) libqb".  Partial solution would
be that one-off run-time self-test ([comment 7]).  I guess there's no
"when you are linking me to a program, apply this script snippet
automatically" facility carried within the shared library directly.

Comment 9 Nick Clifton 2017-08-03 16:06:47 UTC
(In reply to Jan Pokorný from comment #8)

Hi Jan,

  I am glad that you have found a solution.

> Now, what bothers me, libqb library users might forget this "-Wl,foo.t"
> equivalent easily, even if it will be provided through
> "pkg-config --libs (or rather --cflags?) libqb".  Partial solution would
> be that one-off run-time self-test ([comment 7]).  I guess there's no
> "when you are linking me to a program, apply this script snippet
> automatically" facility carried within the shared library directly.

Actually there may be ...  have a look at, for example /usr/lib64/libncurses.so
which rather than being a shared library is actually a linker script fragment
that uses the INPUT directive to include both the real, versioned, libncurses.so.6 library and another library.

So you could rename foo.t to libgb.so, and add INPUT(libqb.so.6) at the start
then everything should work automatically without the users having to change
their command lines at all.  (Well in theory anyway.  I have not actually 
tested this idea.  Plus of course you would have to use the correct name of 
the real libfo.so.xxx shared library).

Cheers
  Nick

Comment 10 Nick Clifton 2017-08-03 16:08:41 UTC
Please excuse the typos - I was typing too fast.

For "libgb.so" I of course meant "libqb.so", and "libfo.so.xxx" was meant to be
"libqb.so.xxx".

Cheers
  Nick

Comment 11 Jan Pokorný [poki] 2017-08-03 16:44:28 UTC
Nick, thanks, that looks pretty cool!  Will try it for sure.

In the meantime, preliminary patch solves the issue in libqb,
looking at x86_64 scratch build:
https://koji.fedoraproject.org/koji/taskinfo?taskID=21019015

My custom WIP branch:
https://github.com/jnpkrn/libqb/commits/workaround-ld-2.29


To conclude, the main problem is the *changed visibility* of the
boundaries denoting (start, stop) symbols for orphaned sections.
Might be still worth considering to restore original behaviour.

Comment 12 Nick Clifton 2017-08-04 07:18:03 UTC
Hi Jan,

(In reply to Jan Pokorný from comment #11)
> To conclude, the main problem is the *changed visibility* of the
> boundaries denoting (start, stop) symbols for orphaned sections.
> Might be still worth considering to restore original behaviour.

Thanks.  I will keep that in mind.  If other developers encounter
similar problems then I will consider creating a patch for the linker.
If it is just the libqb project however, I will probably decide to
leave things as they are. :-}

Cheers
  Nick

Comment 13 Jan Pokorný [poki] 2017-08-04 12:42:16 UTC
Please note, though, this behaviour of the linker is moreover
inconsistent amongst platforms (not just between versions), which
may be bad on its own merit:
https://koji.fedoraproject.org/koji/taskinfo?taskID=20995093

Comment 14 Jan Pokorný [poki] 2017-10-04 11:47:33 UTC
Hmm, apparently I wasn't the only to notice the change in behaviour,
and the proper upstream was more open to do some reconciling
(as opposed to CLOSED NOTABUG):

https://sourceware.org/bugzilla/show_bug.cgi?id=21964

Anyway, while I fixed the issue in libqb for 2.29, there're new
issues with 2.29.1, will share the results of my investigation.

Comment 15 Jan Pokorný [poki] 2017-10-09 14:58:04 UTC
re [comment 14]:

> Anyway, while I fixed the issue in libqb for 2.29, there're new
> issues with 2.29.1, will share the results of my investigation.

Turned to be a matter of partially reconciled change making the
configure check consider no workaround is needed, while in fact
it remains to be the case.  So I just hardened this check to
hit a boundary condition that hasn't changed between 2.29
and 2.29.1 (but has between 2.28 and 2.29+).

Comment 16 Jan Pokorný [poki] 2017-10-11 17:18:10 UTC
re [comment 15]:

For how I extended the check, see the 2.29.1 mentioning assertion
line at reproducer from related [bug 1500898 comment 0].

Comment 17 Jan Pokorný [poki] 2018-02-07 09:39:58 UTC
Just noting down that I was pointed out the topic keep re-emerging
in upstream:

https://sourceware.org/ml/binutils/2018-02/msg00038.html

Also related code keeps (at least in outlook) shuffling:
https://sourceware.org/ml/binutils/2018-01/msg00347.html

Luckily, binutils 2.30 doesn't seem to break anything based on
libqb self-checks.

Comment 18 Jan Pokorný [poki] 2018-02-07 11:15:57 UTC
On the bright side, libqb's (workaround) mechanism keeps working
with the current master as of now, 407aa07cee - 2.30.0.20180207.

(https://github.com/ClusterLabs/libqb/pull/266#issuecomment-363737275)


Note You need to log in before you can comment on or make changes to this bug.