Bug 19520 - protected function pointer and copy relocation don't work right
Status: NEW
Alias: None
Product: gcc
Classification: Unclassified
Component: target
Version: 4.0.0
Importance: P2 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords:
Duplicates: 51880
Depends on:
Blocks: 55012
Reported: 2005-01-19 00:24 UTC by H.J. Lu
Modified: 2012-12-11 14:44 UTC (History)

See Also:
Host:
Target: x86_64-*-*, i?86-*-*
Build:
Known to work:
Known to fail:
Last reconfirmed: 2005-02-02 13:15:45


Attachments
A testcase (703 bytes, application/octet-stream)
2005-01-19 00:27 UTC, H.J. Lu

Description H.J. Lu 2005-01-19 00:24:32 UTC
Pointers to protected functions don't work right. When taking a pointer to a
protected function, gcc should treat the function as if it were a normal one.
Comment 1 H.J. Lu 2005-01-19 00:27:03 UTC
Created attachment 7985
A testcase

With the new linker, I got

[hjl@gnu-20 x86_64-3]$ make
gcc -fPIC   -c -o x.o x.c
gcc -shared -o libx.so x.o
/usr/local/bin/ld: x.o: relocation R_X86_64_PC32 against `foo' can not be used
when making a shared object; recompile with -fPIC
/usr/local/bin/ld: final link failed: Bad value
collect2: ld returned 1 exit status
make: *** [libx.so] Error 1

With the old linker, I got
[hjl@gnu-20 x86_64-3]$ make CC="gcc -B/usr/bin/"
gcc -B/usr/bin/ -fPIC	-c -o x.o x.c
gcc -B/usr/bin/ -shared -o libx.so x.o
gcc -B/usr/bin/ -o foo m.c libx.so -Wl,-rpath,.
./foo
called from main foo_p: 0x400610
called from shared foo: 0x2a9566d8d8
shared foo: 0x2a9566d8d8
shared foo: 0x2a9566d8d8
called from shared foo_p: 0x400610
shared foo: 0x2a9566d8d8
shared foo: 0x2a9566d8d8
called from main foo: 0x400610
got from main foo: 0x2a9566d8d8
Function pointer `foo' are't the same in DSO and main
Comment 2 Andrew Pinski 2005-01-19 00:34:15 UTC
Isn't this just binutils ld/584?
http://sources.redhat.com/bugzilla/show_bug.cgi?id=584
Alan M. claims this is a ld bug rather than a gcc bug.
Comment 3 H.J. Lu 2005-01-19 00:35:54 UTC
The same bug also happens on i686-pc-linux-gnu:

gcc -fPIC   -c -o x.o x.c
gcc -shared -o libx.so x.o
gcc -o foo m.c libx.so -Wl,-rpath,.
./foo
called from main foo_p: 0x80483e4
called from shared foo: 0x111524
shared foo: 0x111524
shared foo: 0x111524
called from shared foo_p: 0x80483e4
shared foo: 0x111524
shared foo: 0x111524
called from main foo: 0x80483e4
got from main foo: 0x111524
Function pointer `foo' are't the same in DSO and main
Comment 4 H.J. Lu 2005-01-19 00:41:06 UTC
They aren't the same. This bug is about a function pointer; the other is about
a function and looks like a linker bug.
Comment 5 Andrew Pinski 2005-01-19 00:47:23 UTC
This is really a dup of bug 10908.  
Comment 6 Andrew Pinski 2005-01-19 00:56:18 UTC
Protected symbols always bind locally, since you cannot override them, so the bug is in the linker/asm.

*** This bug has been marked as a duplicate of 10908 ***
Comment 7 H.J. Lu 2005-01-19 01:47:57 UTC
Please take a closer look at the testcase. It is different from
bug 10908. Basically, main executable and DSO see different
function pointer values for the SAME function. From the linker source:

/* Will references to this symbol always reference the symbol
   in this object?  STV_PROTECTED is excluded from the visibility test
   here so that function pointer comparisons work properly.  Since
   function symbols not defined in an app are set to their .plt entry,
   it's necessary for shared libs to also reference the .plt even
   though the symbol is really local to the shared lib.  */

On many architectures, the function pointer != the address of the function
body.
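
A minimal single-file sketch of the comparison at stake. This is not the attached testcase: the names `lib_view_of_foo`, `main_view_of_foo`, and `check_same_function` are hypothetical, and one translation unit cannot reproduce the mismatch, which needs `foo` defined in a DSO and referenced from a non-PIC executable. It only shows the invariant that has to hold: every pointer to the same function compares equal, whichever module computed it.

```c
#include <assert.h>
#include <stdio.h>

typedef void (*fn_ptr)(void);

/* Hypothetical stand-in for the protected function defined in the DSO. */
__attribute__((visibility("protected")))
void foo(void)
{
}

/* In the real scenario this would be compiled with -fPIC into libx.so. */
fn_ptr lib_view_of_foo(void)
{
    return foo;
}

/* In the real scenario this would be compiled into the main executable. */
fn_ptr main_view_of_foo(void)
{
    return foo;
}

/* Prints both views and asserts the equality that must hold. */
void check_same_function(void)
{
    fn_ptr from_lib = lib_view_of_foo();
    fn_ptr from_main = main_view_of_foo();

    printf("lib sees foo at %p, main sees foo at %p\n",
           (void *)from_lib, (void *)from_main);
    assert(from_lib == from_main);
}
```

Calling `check_same_function()` from `main` runs the check. Splitting the two "view" functions across a shared library and an executable is what would expose the behaviour reported here.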
Comment 8 Andrew Pinski 2005-01-19 03:11:37 UTC
The difference in the asm between non-protected (first line) and protected (second line) functions is the following:
        movl    foo@GOT(%ebx), %eax
        leal    foo@GOTOFF(%ebx), %eax

but adding -fPIC to m.c makes this work, so again this looks like an ld bug (maybe it is keeping the
symbol protected or something).

Or gcc is doing:
        cmpl    $foo, -4(%ebp)

which is not wrong in the non-PIC case.
Comment 9 Andrew Pinski 2005-01-19 03:31:40 UTC
So help out here: which is more correct, the GOT or the GOTOFF?

(In reply to comment #7)
> Please take a closer look at the testcase. It is different from
> bug 10908. Basically, main executable and DSO see different
> function pointer values for the SAME function. From the linker

That comment is only for the PPC bfd so it cannot apply to x86 :).
Comment 10 Andrew Pinski 2005-01-19 03:41:17 UTC
Well, I think there is a wrong reloc somewhere, or a reloc being resolved wrongly,
because foo binds locally in x.c. Otherwise protected visibility is really useless (except
maybe to make sure that the symbol does not get overridden).
Comment 11 H.J. Lu 2005-01-20 19:28:16 UTC
Depending on the psABI, because of copy relocation on data symbols and
function pointer on function symbols, a protected symbol has to be
treated very carefully. We have to check 2 things:

1. If the psABI uses copy relocation, a protected data symbol is the same
as a normal symbol.
2. If the psABI doesn't support the "official function address", that is
the psABI guarantees there is one and only one function address, only
branches to functions can be treated as local.
Comment 12 H.J. Lu 2005-01-20 22:34:29 UTC
Ignore the copy relocation. There is not much a compiler can do when the psABI
doesn't support protected symbols with copy relocation. See:

http://sources.redhat.com/ml/binutils/2003-03/msg00413.html
Comment 13 Ian Lance Taylor 2005-01-21 06:35:45 UTC
I think this bug report is reporting an actual bug.  At least when using ELF,
when the compiler takes the address of a protected function, it has to act as
though it is taking the address of an ordinary function, and rely on the dynamic
linker to do the right thing.  If the compiler takes the address of a protected
function without using the PLT, then as HJ says function symbols can not compare
equal, even though they should.  This is not something the linker can fix up. 
The dynamic linker, however, when setting up the PLT, should observe that the
symbol is protected, and call the local symbol even if the executable overrides it.

In other words, we should only treat protected function symbols as special when
we call them.  Otherwise they should be treated as ordinary symbols.

This only applies to ELF.  I don't know what should be done for other object
file formats, if there are any others which support protected symbols.
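
A minimal sketch of that rule as a predicate, assuming position-independent code for an ELF shared library. The enum and function names are hypothetical and are not GCC internals; the hidden-visibility case is included only for contrast.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a function symbol reference; not GCC's own types. */
enum visibility { VIS_DEFAULT, VIS_PROTECTED, VIS_HIDDEN };
enum ref_kind { REF_CALL, REF_ADDRESS };

/*
 * Returns true when the reference may be resolved inside the defining
 * module without help from the dynamic linker.
 */
bool function_ref_binds_locally(enum visibility vis, enum ref_kind kind)
{
    switch (vis) {
    case VIS_HIDDEN:
        /* Not visible outside the module, so always local. */
        return true;
    case VIS_PROTECTED:
        /* Special only when called; taking the address acts like default. */
        return kind == REF_CALL;
    case VIS_DEFAULT:
    default:
        /* May be preempted by another module, so never assumed local. */
        return false;
    }
}

/* Usage example: the protected case differs by kind of reference. */
void check_binding_rules(void)
{
    assert(function_ref_binds_locally(VIS_PROTECTED, REF_CALL));
    assert(!function_ref_binds_locally(VIS_PROTECTED, REF_ADDRESS));
}
```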
Comment 14 H.J. Lu 2005-01-21 06:47:55 UTC
A patch is posted at

http://gcc.gnu.org/ml/gcc-patches/2005-01/msg01394.html
Comment 15 H.J. Lu 2005-01-24 18:35:21 UTC
This is the updated patch:

http://gcc.gnu.org/ml/gcc-patches/2005-01/msg01551.html

This is the testcase patch:

http://gcc.gnu.org/ml/gcc-patches/2005-01/msg01550.html
Comment 16 Alan Modra 2005-02-02 13:15:33 UTC
Confirming that the bug is real.

I can't say I like HJ's solution though.  It seems to require that ld.so resolve
a protected symbol in a shared library to a symbol defined in the main app. 
That's weird.  In other cases you don't want ld.so to do that, for instance when
the main app defines a function with the same name as a protected library
function.  I think it might be difficult for ld.so to choose the right symbol,
especially for the general case of multiple levels of shared libraries.

Another problem is that making protected functions non-local prevents certain
optimizations, for example see alias.c:mark_constant_function.
Comment 17 H.J. Lu 2005-02-02 16:17:20 UTC
Please keep in mind that my proposal affects FUNCTION symbols only and my change
won't change function CALL, which will still be local. It only changes the
function pointer.

BTW, I believe ld.so in the current glibc is OK. It is kind of tricky. I think
I covered everything for FUNCTION symbols. If you believe ld.so is wrong in
some cases, please send me a testcase. I will fix it.
Comment 18 H.J. Lu 2005-02-02 23:05:31 UTC
I posted an updated patch

http://gcc.gnu.org/ml/gcc-patches/2005-02/msg00196.html

I hope it will work better.
Comment 19 Daniel Jacobowitz 2005-02-03 15:51:17 UTC
FWIW, the reason this leaves a bad taste in my mouth is that I strongly believe
symbol visibility should be consistent between ELF platforms.  There's at least
one ELF platform where resolving a function pointer to a PLT entry is an
absolute no-show (MIPS binding stubs).
Comment 20 H.J. Lu 2005-02-03 15:59:04 UTC
Each psABI defines how function addresses work. Not all psABIs treat
function addresses the same way, and a function address may mean different
things under different psABIs. You can't even compare function addresses between
the x86 psABI and the MIPS psABI. Where does the consistency come from?
Comment 21 Simon Strandman 2005-11-06 09:50:54 UTC
(In reply to comment #18)
> I posted an updated patch
> 
> http://gcc.gnu.org/ml/gcc-patches/2005-02/msg00196.html
> 
> I hope it will work better.

Sorry to bother but where is the updated patch? That link leads to something else.
Comment 22 Thiago Macieira 2007-01-11 18:42:11 UTC
Is there any update on this bug?

According to http://sourceware.org/ml/binutils/2005-01/msg00401.html, a protected function symbol cannot be used in a R_386_GOTOFF. I don't claim to understand the full implications of the issue, but it seems that the ld decision means gcc must not emit that relocation.
Comment 23 Thiago Macieira 2012-01-16 14:56:50 UTC
I've changed my opinion on this matter. I think GCC is generating the proper code (most efficient). It's ld that should accept this decision.
Comment 24 Andrew Pinski 2012-01-17 20:00:27 UTC
*** Bug 51880 has been marked as a duplicate of this bug. ***
Comment 25 Richard Biener 2012-01-18 09:21:14 UTC
LD bug: http://sourceware.org/bugzilla/show_bug.cgi?id=13600

The GCC side is a QOI thing and maybe a conformance thing.  ICC generates
for

__attribute__((visibility("protected")))
void * foo (void) { return (void *)foo; }

        .protected foo
        .globl foo
foo:
..B1.1:                         # Preds ..B1.0
..___tag_value_foo.1:                                           #1.60
        movq      foo@GOTPCREL(%rip), %rax                      #1.77

thus does not resolve the function address to the local symbol, which GCC
does and which confuses LD (thus the linker bug):

        .globl  foo
        .protected      foo
        .type   foo, @function
foo:
.LFB0:
        .cfi_startproc
        pushq   %rbp
        .cfi_def_cfa_offset 16
        .cfi_offset 6, -16
        movq    %rsp, %rbp
        .cfi_def_cfa_register 6
        leaq    foo(%rip), %rax

I think ICC this way avoids the function pointer comparison issues with
symbols with protected visibility (can someone double-check?  HJ's testcase
doesn't compile for me).
Comment 26 Thiago Macieira 2012-01-18 13:28:05 UTC
ld *can* link, it just chooses not to.

$ cat > foo.c
__attribute__((visibility("protected")))
void * foo (void) { return (void *)foo; }

$ gcc -fPIC -shared foo.c           
/usr/bin/ld: /tmp/cclrufLV.o: relocation R_X86_64_PC32 against protected symbol `foo' can not be used when making a shared object
/usr/bin/ld: final link failed: Bad value
collect2: ld returned 1 exit status

$ gcc -Wl,-Bsymbolic-functions -fPIC -shared foo.c && echo success
success
$ cat > empty.dynlist                                                     
{ "__this_symbol_isnt_present__"; };
$ gcc -Wl,--dynamic-list,empty.dynlist -fPIC -shared foo.c && echo success
success

I also cannot confirm that icc does anything different:
$ icc -fPIC -shared foo.c
ld: /tmp/iccf15gTK.o: relocation R_X86_64_PC32 against protected symbol `foo' can not be used when making a shared object
ld: final link failed: Bad value
$ icc -O3 -S -o /dev/stdout -fPIC -shared foo.c | grep -A4 foo:
foo:
..B1.1:                         # Preds ..B1.0
..___tag_value_foo.1:                                           #2.19
        lea       foo(%rip), %rax                               #2.36
        ret                                                     #2.36

What's more, if you actually do compile the following program into a shared library, it succeeds:
$ cat > foo.S
        .text
        .globl  foo
        .protected      foo
        .type   foo, @function
foo:
        movq      foo@GOTPCREL(%rip), %rax
        ret
$ gcc -shared foo.S && echo success
success

But the resulting shared object has the following (extracted from eu-readelf):
Relocation section [ 5] '.rela.dyn' for section [ 0] '' at offset 0x230 contains 1 entry:
  Offset              Type            Value               Addend Name
  0x0000000000200330  X86_64_GLOB_DAT 0x0000000000000248      +0 foo

    2: 0000000000000248      0 FUNC    GLOBAL PROTECTED      6 foo

Now we introduce a third component to this discussion: the dynamic linker. What will it do?

This has become a decision, not a bug: what should the compiler do when taking the address of a function that has protected visibility? Both solutions are technically correct and would load the same function address under the correct circumstances.

The compiler is also taking "protected" visibility to the letter (at least, according to its own definition of it):

    "protected"
          Protected visibility is like default visibility except that it
          indicates that references within the defining module will
          bind to the definition in that module.  That is, the declared
          entity cannot be overridden by another module.

Since the symbol was marked as "protected" in the symbol table, it's expected that the linker and dynamic linker will bind it locally. That being the case, the compiler can optimise for that fact. It can calculate what value would be placed in the GOT entry and load that instead. That's the LEA instruction.

The linker, however, mandates that the address of the symbol should not be loaded directly, but only through the GOT. This is necessary because the psABI requires that the function address resolve to the PLT entry found in the position-dependent executable. If the executable takes the address of this global (but protected) symbol, it will hardcode the address to its own address space, forcing other ELF modules to follow suit.

Finally, what does the dynamic linker do when an "entity (that) cannot be overridden by another module" is overridden by another module? The glibc 2.14 loader will resolve the GOT entry's relocation to the executable's PLT stub, even if the symbol in question has protected visibility. Other loaders might work differently.

As it stands, the psABI requires that the address of a protected function be loaded through the GOT, even though the compiler thinks it knows what the address will be.

However, I would really prefer that the compiler *not* change its behaviour for PIC code, but instead change its behaviour for ELF position-dependent executables. I am asking for a change in the psABI and requesting that the loading of function addresses for "default" visibility symbols (not protected!) be done via the GOT. In other words, I'm asking that we optimise for shared libraries, not for executables.

Versions:
GCC: 4.6.0
ld: 2.21.51.0.6-6.fc15 20110118
ICC: 12.1.0 20111011
Comment 27 Richard Biener 2012-01-18 15:17:19 UTC
(In reply to comment #26)

> The linker, however, mandates that the address to symbol should not be loaded
> directly, but only through the GOT. This is necessary because the psABI
> requires that the function address resolve to the PLT entry found in the
> position-dependent executable.

Why on earth does it do that?  If we have to go through the GOT, it might
as well contain the function's address and not that of the PLT entry?
Comment 28 Richard Biener 2012-01-19 13:36:28 UTC
Final conclusion:  We need to resolve to the executable's PLT consistently,
even from inside the shared object where the function binds locally.  This
is because of references to the function from the executable's .rodata section
which we can't relocate (and thus have to point to the executable's PLT entry).

Thus, this is a GCC target bug.

__attribute__((visibility("protected"))) void * foo () { return foo; }

needs to return the address of foo via a load from the GOT.  HJ's patch
isn't correct as this is really a target ABI choice (another ABI may
choose to resolve all references to the function's start address with
the cost of having to put the constants into a .rel.rodata section).
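
A minimal sketch of the kind of executable-side reference meant here, assuming a position-dependent executable. The table name `handlers` and the check function are hypothetical, and in a single file the two values trivially agree; the point is only to show which data would sit in .rodata and so be fixed at link time.

```c
#include <assert.h>

typedef void *(*foo_fn)(void);

/* Same shape as the function above, with an explicit cast added. */
__attribute__((visibility("protected")))
void *foo(void)
{
    return (void *)foo;
}

/*
 * Hypothetical executable-side data: a constant table holding foo's
 * address. In a position-dependent executable this would normally be
 * placed in .rodata, so the stored address cannot be patched at load time.
 */
static const foo_fn handlers[] = { foo };

/* The table entry and the value foo computes for itself must agree. */
void check_table_matches(void)
{
    assert(handlers[0] == foo);
    assert(handlers[0]() == (void *)handlers[0]);
}
```

With `foo` moved into a shared library, the table entry would hold the executable's PLT address for `foo`, which is why the library's own view has to resolve to the same place.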
Comment 29 H.J. Lu 2012-01-19 18:29:39 UTC
(In reply to comment #28)
> Final conclusion:  We need to resolve to the executables PLT consistently,
> even from inside the shared object where the function binds locally.  This
> is because of references to the function from the executables .rodata section
> which we can't relocate (and thus have to point to the executables PLT entry).
> 
> Thus, this is a GCC target bug.
> 
> __attribute__((visibility("protected"))) void * foo () { return foo; }
> 
> needs to return the address of foo via a load from the GOT.  HJs patch
> isn't correct as this is really a target ABI choice (another ABI may
> choose to resolve all references to the functions start address with

It only applies when we take an address of a protected function.
Branch to a protected function doesn't need to go through PLT.
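
A minimal sketch of the two kinds of reference being distinguished. The function names are hypothetical, and the comments describe the code generation this thread argues for, not what any particular GCC version emits.

```c
#include <assert.h>

typedef int (*answer_fn)(void);

/* Hypothetical protected function in a shared library. */
__attribute__((visibility("protected")))
int answer(void)
{
    return 42;
}

/* A call (branch): may bind locally, with no need to go through the PLT. */
int call_answer(void)
{
    return answer();
}

/* An address-of: the reference that would be loaded from the GOT. */
answer_fn address_of_answer(void)
{
    return answer;
}

/* Usage example: both paths must reach the same function. */
void check_call_and_address(void)
{
    assert(call_answer() == 42);
    assert(address_of_answer()() == 42);
}
```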
Comment 30 Thiago Macieira 2012-01-19 18:52:57 UTC
This does solve the problem.

It's just unfortunate that it does so by creating more work for the library even if no executable ever takes the address of this protected function.

It would have been preferable to somehow tell the compiler when compiling an executable that this function it's taking the address of is protected elsewhere, so it should use the GOT too.
Comment 31 Rich Felker 2012-04-29 04:39:03 UTC
I think part of the difficulty of this issue is that the behavior of protected is not well-specified. Is it intended to protect the definition from interposition? Or is it promising the compiler/toolchain that you won't override the definition (and acquiescing that the behavior will be undefined if you break this promise)?

If protected's intent is the former, then it's absolutely wrong to resolve the function's address to the main executable's PLT entry for a different function by the same name. To avoid this, the GOT entry for the function in the shared library must point to the PLT entry in the main program if and only if the main program's symbol got resolved to the library's version of the function; otherwise, it must point to the library's version. I don't see an easy way to arrange this without special help from the dynamic linker, and personally, I think it's a slippery slope to try to make promises that are this difficult to keep.

As such I'd prefer that protected's behavior be the latter: an optimization hint to the compiler in the form of a promise not to override the definition.

In any case, I'm experiencing this bug in the form of not being able to take the address of any external functions when using -fvisibility=protected, and it's making it impossible to use -fvisibility=protected. I get bogus linker errors about not being able to use a protected function for R_386_GOTOFF relocations. So I want to see this solved in one way or another, preferably in the way that results in maximal performance and minimal bloat while ensuring correct behavior as long as the functions are not overridden...
Comment 32 H.J. Lu 2012-10-21 21:34:50 UTC
Protected data symbols with copy relocation don't
work either.
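
A minimal single-file sketch of the data-symbol case. The names are hypothetical, and one translation unit cannot trigger a copy relocation; that needs the variable defined in a DSO and referenced from a non-PIC executable. The sketch only states the invariant that would break: both modules must see one object at one address.

```c
#include <assert.h>

/* Hypothetical protected data symbol defined in the shared library. */
__attribute__((visibility("protected")))
int shared_counter = 0;

/* Library side: updates the variable through the library's own view. */
void lib_increment(void)
{
    shared_counter++;
}

int *lib_address_of_counter(void)
{
    return &shared_counter;
}

/*
 * Executable side: in a non-PIC executable, a reference like this is
 * what would make the linker create a copy of the variable.
 */
int main_read_counter(void)
{
    return shared_counter;
}

int *main_address_of_counter(void)
{
    return &shared_counter;
}

/* Both sides must observe the same object and the same value. */
void check_one_counter(void)
{
    int before = main_read_counter();

    lib_increment();
    assert(main_read_counter() == before + 1);
    assert(lib_address_of_counter() == main_address_of_counter());
}
```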