Bug 68016 - ASan doesn't catch overflow in globals when COPY relocation is involved.
Status: RESOLVED WONTFIX
Alias: None
Product: gcc
Classification: Unclassified
Component: sanitizer
Version: 6.0
Importance: P3 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2015-10-19 16:03 UTC by Maxim Ostapenko
Modified: 2015-11-02 09:27 UTC

See Also:
Host: x86_64-pc-linux-gnu
Target: x86_64-pc-linux-gnu
Build: x86_64-pc-linux-gnu
Known to work:
Known to fail:
Last reconfirmed:


Description Maxim Ostapenko 2015-10-19 16:03:40 UTC
Consider:

max@max:~/workspace/downloads/gcc$ cat libfoo.c
int f[5] = {1};

max@max:~/workspace/downloads/gcc$ cat main.c
extern int f[5];
int main ()
{
  return f[5];
}

max@max:~/workspace/downloads/gcc$ ~/install/master-ref/bin/gcc -fsanitize=address libfoo.c -shared -fpic -fsanitize=address -o libfoo.so
max@max:~/workspace/downloads/gcc$ ~/install/master-ref/bin/gcc -fsanitize=address  main.c -c  -o main.o
max@max:~/workspace/downloads/gcc$ ~/install/master-ref/bin/gcc -fsanitize=address main.o ./libfoo.so -o main
max@max:~/workspace/downloads/gcc$ LD_LIBRARY_PATH=~/install/master-ref/lib64 ASAN_OPTIONS=report_globals=3  ./main
    #0 0x7f73cc9bfdde in __asan_register_globals /home/max/workspace/downloads/gcc/libsanitizer/asan/asan_globals.cc:228
    #1 0x7f73cc796800 in _GLOBAL__sub_I_00099_1_libfoo.c (libfoo.so+0x800)
    #2 0x7f73cd910139  (/lib64/ld-linux-x86-64.so.2+0x10139)
    #3 0x7f73cd910222  (/lib64/ld-linux-x86-64.so.2+0x10222)
    #4 0x7f73cd901309  (/lib64/ld-linux-x86-64.so.2+0x1309)

=== ID 738197505; 0x7f73cc996bc0 0x7f73cc996bc0
==16063==Added Global[0x7f73cc996bc0]: beg=0x7f73cc996b60 size=20/64 name=f module=libfoo.c dyn_init=0
==16063==  location (0x7f73cc996ba0): name=libfoo.c[0x7f73cc79680d], 1 5

max@max:~/workspace/downloads/gcc$ readelf -r main | grep COPY
00000070eac0  025400000005 R_X86_64_COPY     000000000070eac0 f + 0 

This happens because GCC registers globals through private aliases. LLVM catches this overflow, but it has another drawback: mixing sanitized and non-sanitized code may crash the application.

I don't know whether there is a good fix for both issues. Any thoughts? IMHO, false negatives are preferable to an application crash.
Comment 1 Jakub Jelinek 2015-10-20 08:48:52 UTC
Yeah, this is an intentional design decision; trying to register something for an object that might be living in a completely different library, and where the gap might not be supplied, is just wrong.
As the copy relocations are created by the linker, there is no way (except perhaps some ELF extensions) to instruct the linker to also allocate the gap around the copied object, so that it could be registered in the executable or PIE.
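To illustrate the point about the missing gap, here is a minimal sketch in plain C. It only simulates the effect of a COPY relocation and is not how the linker or the dynamic linker is implemented. The names `padded_f`, `exe_copy_of_f` and `apply_copy_reloc` are hypothetical; the sizes 20 and 64 are taken from the `size=20/64` line in the report above, and a 4-byte `int` is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Library side: the definition of `int f[5]` followed by the red zone that
   the sanitized library reserves behind it (20 bytes of data, 64 in total). */
struct padded_f {
    int  f[5];
    char redzone[64 - 5 * sizeof(int)];
};

/* Executable side: the static linker reserves exactly st_size bytes for the
   COPY relocation, so there is no room for a red zone. */
struct exe_copy_of_f {
    int f[5];
};

/* Hypothetical stand-in for the dynamic linker applying a COPY relocation:
   copy st_size bytes of initial data from the library into the executable. */
static void apply_copy_reloc(struct exe_copy_of_f *dst,
                             const struct padded_f *src, size_t st_size)
{
    assert(st_size <= sizeof dst->f);
    memcpy(dst->f, src->f, st_size);
}
```

After the copy, `main` reads `f[5]` relative to the 20-byte object in the executable, while the red zone that was registered still surrounds the library's own definition, which the executable no longer uses.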
Comment 2 Maxim Ostapenko 2015-10-21 13:15:42 UTC
OK, I guess this is a won't-fix then.
Comment 3 Maxim Ostapenko 2015-10-22 14:51:30 UTC
Won't fix; this is an intentional design decision.
Comment 4 Maxim Ostapenko 2015-10-26 11:17:08 UTC
Actually, LLVM is no better here (perhaps even worse). Consider the following test case (it is the same one Jakub provided in PR63888):

max@max:/tmp$ cat libfoo.c

long f = 4;
long foo (long *p) {
  return *p;
}

max@max:/tmp$ cat libbar.c

long h = 12;
long i = 13;

max@max:/tmp$ cat main.c

extern void abort (void);
extern long f;
extern long h;
extern long i;
long foo (long *);

int main () {
  if (foo (&f) != 4 || foo (&h) != 12 || foo (&i) != 13)
    abort ();
  return 0;
}


max@max:/tmp$ clang libfoo.c -shared -fpic -o libfoo.so -g
max@max:/tmp$ clang libbar.c -shared -fpic -o libbar.so -g
max@max:/tmp$ clang main.c -c -o main.o -g
max@max:/tmp$ clang  main.o ./libfoo.so ./libbar.so -o main -fsanitize=address
max@max:/tmp$ ./main
max@max:/tmp$ echo $?
0
max@max:/tmp$ clang libfoo.c -shared -fpic -o libfoo.so -g -fsanitize=address
max@max:/tmp$ ./main
./main: Symbol `f' has different size in shared object, consider re-linking
=================================================================
==19089==ERROR: AddressSanitizer: global-buffer-overflow on address 0x00000070fb10 at pc 0x7f0e0b65c931 bp 0x7ffc67828000 sp 0x7ffc67827ff8
READ of size 8 at 0x00000070fb10 thread T0
    #0 0x7f0e0b65c930 in foo /tmp/libfoo.c:2:29
    #1 0x4e166f in main /tmp/main.c:9:42
    #2 0x7f0e0a570ec4 in __libc_start_main /build/buildd/eglibc-2.19/csu/libc-start.c:287
    #3 0x418fd5 in _start (/tmp/main+0x418fd5)

0x00000070fb10 is located 0 bytes to the right of global variable 'f' defined in 'libfoo.c:1:6' (0x70fb08) of size 8
SUMMARY: AddressSanitizer: global-buffer-overflow /tmp/libfoo.c:2:29 in foo
Shadow bytes around the buggy address:
  0x0000800d9f10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0000800d9f20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0000800d9f30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0000800d9f40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0000800d9f50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x0000800d9f60: 00 00[f9]f9 f9 f9 f9 f9 f9 00 00 00 00 00 00 00
  0x0000800d9f70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0000800d9f80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0000800d9f90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0000800d9fa0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0000800d9fb0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:       fa
  Heap right redzone:      fb
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack partial redzone:   f4
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==19089==ABORTING 

This happens because, in the LLVM case, ASan changes the symbol's size ('f' in our case) and thereby breaks the library's ABI. Relinking the binary each time we replace a non-sanitized library with a sanitized one is undesirable for large package-oriented systems (e.g. distributions), so we need a general solution to the problem.
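As a reading aid for the shadow dump in the report above, the following sketch maps an application address to its shadow byte. It assumes the default ASan mapping on x86_64 Linux, `shadow = (addr >> 3) + 0x7fff8000`; the helper names are made up for illustration, and the addresses checked are the ones printed in the report.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed default ASan shadow mapping on x86_64 Linux. */
#define SHADOW_SCALE  3
#define SHADOW_OFFSET UINT64_C(0x7fff8000)

/* Address of the shadow byte that describes the 8-byte granule at addr. */
static uint64_t shadow_of(uint64_t addr)
{
    return (addr >> SHADOW_SCALE) + SHADOW_OFFSET;
}

/* Position of a shadow byte within its 16-byte row of the dump. */
static unsigned column_in_row(uint64_t shadow)
{
    return (unsigned)(shadow & 0xf);
}

/* Cross-check against the report: 'f' lives at 0x70fb08 and the faulting
   read is at 0x70fb10, i.e. the granule right after it. */
static void check_against_report(void)
{
    assert(shadow_of(UINT64_C(0x70fb08)) == UINT64_C(0x800d9f61));
    assert(shadow_of(UINT64_C(0x70fb10)) == UINT64_C(0x800d9f62));
    assert(column_in_row(shadow_of(UINT64_C(0x70fb10))) == 2);
}
```

The faulting address lands in column 2 of the row starting at 0x0000800d9f60, which is the bracketed `[f9]` byte: a global red zone placed directly after the 8 bytes that the executable reserved for 'f'.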
Comment 5 Reid Kleckner 2015-10-26 16:19:18 UTC
(In reply to Maxim Ostapenko from comment #4)
> This happens because, in the LLVM case, ASan changes the symbol's size
> ('f' in our case) and thereby breaks the library's ABI. Relinking the
> binary each time we replace a non-sanitized library with a sanitized one
> is undesirable for large package-oriented systems (e.g. distributions),
> so we need a general solution to the problem.

Can you elaborate as to why changing the size of a visible global in a shared lib is an ABI break with ELF? That's surprising to me, and my searches haven't helped me understand.
Comment 6 Jakub Jelinek 2015-10-26 16:57:07 UTC
(In reply to Reid Kleckner from comment #5)
> (In reply to Maxim Ostapenko from comment #4)
> > This happens because, in the LLVM case, ASan changes the symbol's size
> > ('f' in our case) and thereby breaks the library's ABI. Relinking the
> > binary each time we replace a non-sanitized library with a sanitized one
> > is undesirable for large package-oriented systems (e.g. distributions),
> > so we need a general solution to the problem.
> 
> Can you elaborate as to why changing the size of a visible global in a
> shared lib is an ABI break with ELF? That's surprising to me, and my
> searches haven't helped me understand.

Because symbol size is part of the ABI, and LLVM emits a different symbol size with -fsanitize=address than with -fno-sanitize=address.
E.g. COPY relocations use the st_size field, so you can have:
1) a shared library originally not ASAN instrumented and a binary (e.g. non-ASAN) linked against it, and then the shared library is recompiled with ASAN. The size of the symbol in the binary will be the one without padding, but LLVM incorrectly registers the variable using the global symbol rather than a local alias, and thus assumes there is padding which is not available (plus you can get a runtime warning on the st_size mismatch from the dynamic linker)
2) even without COPY relocations, the same variable defined in multiple shared libraries. If some of them are built with -fsanitize=address and the others are not, there is a mismatch between the variable sizes, and depending on which library comes earlier in the symbol search scope, either the version without or the version with padding is used at runtime. The sanitized libraries could very well register the non-padded one, making it a fatal error to access e.g. variables after it
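Case 1 above can be sketched as a small simulation. This is only an illustration under stated assumptions, not the dynamic linker's actual code: `sym_view` and `copy_reloc` are hypothetical names, the copy is assumed to be limited to the smaller of the two sizes, and the padded size used in the usage note (64) is illustrative.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, trimmed-down view of a dynamic symbol: only the name and
   st_size matter for this illustration. */
struct sym_view {
    const char *name;
    size_t      st_size;
};

/* Simulated COPY relocation. Copies no more than the smaller of the two
   sizes, so the space reserved in the executable is never overrun, and
   returns 1 when the two st_size values disagree (the situation in which
   the dynamic linker may warn about a different size). */
static int copy_reloc(void *exe_storage, const struct sym_view *exe_sym,
                      const void *lib_storage, const struct sym_view *lib_sym)
{
    size_t n = exe_sym->st_size < lib_sym->st_size ? exe_sym->st_size
                                                   : lib_sym->st_size;

    assert(strcmp(exe_sym->name, lib_sym->name) == 0);
    memcpy(exe_storage, lib_storage, n);
    return exe_sym->st_size != lib_sym->st_size;
}
```

With an executable linked against the unpadded library (`st_size` of 8 for a `long`) and a library later rebuilt with padding (say 64), the function copies 8 bytes and reports a mismatch. The executable's copy therefore has no padding behind it, yet a library that registers the global through its global symbol would treat the bytes after it as a red zone.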
Comment 7 Reid Kleckner 2015-10-26 22:22:20 UTC
(In reply to Jakub Jelinek from comment #6)
> Because symbol size is part of the ABI, and LLVM emits a different symbol
> size with -fsanitize=address than with -fno-sanitize=address.
> E.g. COPY relocations use the st_size field, so you can have:
> 1) a shared library originally not ASAN instrumented and a binary (e.g.
> non-ASAN) linked against it, and then the shared library is recompiled with
> ASAN. The size of the symbol in the binary will be the one without padding,
> but LLVM incorrectly registers the variable using the global symbol rather
> than a local alias, and thus assumes there is padding which is not available
> (plus you can get a runtime warning on the st_size mismatch from the dynamic
> linker)

I thought COPY relocations only occurred with -fPIE, but I must have been mistaken.

> 2) even without COPY relocations, the same variable defined in multiple
> shared libraries. If some of them are built with -fsanitize=address and the
> others are not, there is a mismatch between the variable sizes, and depending
> on which library comes earlier in the symbol search scope, either the version
> without or the version with padding is used at runtime. The sanitized
> libraries could very well register the non-padded one, making it a fatal
> error to access e.g. variables after it

LLVM ASan tries to instrument only global definitions with external linkage. The goal of this check is to ensure that we have found the one true definition of the global, and it isn't COMDAT, weak, a C string, or going to get replaced with something else at link time through some other means.

It seems like you are describing interposition, which isn't something LLVM supports very well. LLVM has no equivalent of -fsemantic-interposition, for example. We always operate under something like -fno-semantic-interposition. (I know, it's ironic, because ASan interposes libc.)

Anyway, I agree the COPY relocation issue is a real problem, but other than that I think our approach is at least internally consistent.
Comment 8 Maxim Ostapenko 2015-10-27 07:48:43 UTC
(In reply to Reid Kleckner from comment #7)
> (In reply to Jakub Jelinek from comment #6)
> > Because symbol size is part of the ABI, and LLVM emits a different symbol
> > size with -fsanitize=address than with -fno-sanitize=address.
> > E.g. COPY relocations use the st_size field, so you can have:
> > 1) a shared library originally not ASAN instrumented and a binary (e.g.
> > non-ASAN) linked against it, and then the shared library is recompiled with
> > ASAN. The size of the symbol in the binary will be the one without padding,
> > but LLVM incorrectly registers the variable using the global symbol rather
> > than a local alias, and thus assumes there is padding which is not available
> > (plus you can get a runtime warning on the st_size mismatch from the dynamic
> > linker)
> 
> I thought COPY relocations only occurred with -fPIE, but I must have been
> mistaken.
> 
> > 2) even without COPY relocations, the same variable defined in multiple
> > shared libraries. If some of them are built with -fsanitize=address and the
> > others are not, there is a mismatch between the variable sizes, and depending
> > on which library comes earlier in the symbol search scope, either the version
> > without or the version with padding is used at runtime. The sanitized
> > libraries could very well register the non-padded one, making it a fatal
> > error to access e.g. variables after it
> 
> LLVM ASan tries to instrument only global definitions with external linkage.
> The goal of this check is to ensure that we have found the one true
> definition of the global, and it isn't COMDAT, weak, a C string, or going to
> get replaced with something else at link time through some other means.
> 
> It seems like you are describing interposition, which isn't something LLVM
> supports very well. LLVM has no equivalent of -fsemantic-interposition, for
> example. We always operate under something like -fno-semantic-interposition.
> (I know, it's ironic, because ASan interposes libc.)
> 
> Anyway, I agree the COPY relocation issue is a real problem, but other than
> that I think our approach is at least internally consistent.

Jakub is right; here is an example where I believe COPY relocations are not involved:

max@max:/tmp$ cat libfoo.c

long h = 15;
long f = 4;
long foo (long *p) {
  return *p;
}

max@max:/tmp$ cat libbar.c 

extern void abort (void);
long foo (long *);
long h = 12;
long i = 13;
long f = 5;

int bar () {
  if (foo (&f) != 5 || foo (&h) != 12 || foo (&i) != 13)
    abort ();
  return 0;
}

max@max:/tmp$ cat main.c

int bar ();

int main () {
  return bar ();
}


max@max:/tmp$ clang libfoo.c -shared -fpic -o libfoo.so -g 
max@max:/tmp$ clang libbar.c -shared -fpic -o libbar.so -g
max@max:/tmp$ clang main.c -c -o main.o
max@max:/tmp$ clang  main.o  ./libbar.so ./libfoo.so -o main -fsanitize=address
max@max:/tmp$ ./main 
max@max:/tmp$ clang libfoo.c -shared -fpic -o libfoo.so  -g -fsanitize=address
max@max:/tmp$ ./main 
=================================================================
==27105==ERROR: AddressSanitizer: global-buffer-overflow on address 0x7f28c26a0050 at pc 0x7f28c229d9c1 bp 0x7ffd1716a950 sp 0x7ffd1716a948
READ of size 8 at 0x7f28c26a0050 thread T0
    #0 0x7f28c229d9c0 in foo /tmp/libfoo.c:4:10
    #1 0x7f28c249f7bf in bar /tmp/libbar.c:8:7
    #2 0x4e1585 in main (/tmp/main+0x4e1585)
    #3 0x7f28c13b3ec4 in __libc_start_main /build/buildd/eglibc-2.19/csu/libc-start.c:287
    #4 0x418f25 in _start (/tmp/main+0x418f25)

0x7f28c26a0050 is located 0 bytes inside of global variable 'f' defined in 'libfoo.c:2:6' (0x7f28c26a0050) of size 8
0x7f28c26a0050 is located 8 bytes to the right of global variable 'h' defined in 'libfoo.c:1:6' (0x7f28c26a0040) of size 8
SUMMARY: AddressSanitizer: global-buffer-overflow /tmp/libfoo.c:4:10 in foo
Shadow bytes around the buggy address:
  0x0fe5984cbfb0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0fe5984cbfc0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0fe5984cbfd0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0fe5984cbfe0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0fe5984cbff0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x0fe5984cc000: 00 00 00 00 00 00 00 00 00 f9[f9]f9 f9 f9 f9 f9
  0x0fe5984cc010: f9 f9 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0fe5984cc020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0fe5984cc030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0fe5984cc040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0fe5984cc050: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Heap right redzone:      fb
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack partial redzone:   f4
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==27105==ABORTING

max@max:/tmp$ readelf -r main | grep COPY

Here, the global symbols 'f' and 'h' resolve to libbar.so, which is not sanitized. However, when libfoo.so registers its "own" globals, it actually poisons libbar.so's 'f' and 'h', which are not properly padded.
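The resolution order described above can be modeled with a toy lookup. This is a hedged sketch, not the dynamic linker's algorithm: `definition`, `library`, `resolve` and `registers_unpadded` are hypothetical names, the search scope simply follows the link order `./libbar.so ./libfoo.so` from the transcript, and the values are the initializers from the two source files.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct definition { const char *name; long value; };

struct library {
    const char *soname;
    int sanitized;                  /* built with -fsanitize=address? */
    const struct definition *defs;
    size_t ndefs;
};

static const struct definition libbar_defs[] = { {"h", 12}, {"i", 13}, {"f", 5} };
static const struct definition libfoo_defs[] = { {"h", 15}, {"f", 4} };

/* Search scope in link order; libfoo.so is the rebuilt, sanitized library. */
static const struct library scope[] = {
    { "libbar.so", 0, libbar_defs, 3 },
    { "libfoo.so", 1, libfoo_defs, 2 },
};

/* Return the first library in the scope that defines name, or NULL. */
static const struct library *resolve(const char *name, long *value)
{
    for (size_t i = 0; i < sizeof scope / sizeof scope[0]; i++)
        for (size_t j = 0; j < scope[i].ndefs; j++)
            if (strcmp(scope[i].defs[j].name, name) == 0) {
                if (value)
                    *value = scope[i].defs[j].value;
                return &scope[i];
            }
    return NULL;
}

/* A sanitized library that registers a global through its global symbol ends
   up describing whichever definition won the lookup. Returns 1 when that
   definition lives in an unsanitized library and so has no red zone. */
static int registers_unpadded(const char *name)
{
    const struct library *winner = resolve(name, NULL);
    assert(winner != NULL);
    return !winner->sanitized;
}
```

In this model both 'f' and 'h' resolve to libbar.so, so `registers_unpadded` returns 1 for each of them, which corresponds to libfoo.so poisoning memory next to variables that were never padded.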
Comment 9 Jakub Jelinek 2015-10-27 07:56:44 UTC
(In reply to Maxim Ostapenko from comment #8)
> Jakub is right; here is an example where I believe COPY relocations are not
> involved:

Yeah, exactly. Semantic interposition is used heavily, and not just by libasan itself, so trying to pretend it does not exist is wrong (GCC carefully distinguishes what can and what can't be interposed based on visibility attributes etc.).

But instead of those h and f variables you can, e.g., just use static block-scope variables in C++ inline functions inlined in multiple shared libraries. Those also have COMDAT linkage, and one of the definitions will be used while the other library will refer to that definition instead of its own (whether using STB_GNU_UNIQUE or not). The same goes for template variables etc.
Comment 10 Yury Gribov 2015-11-02 09:27:47 UTC
> This happens because, in the LLVM case, ASan changes the symbol's size
> ('f' in our case) and thereby breaks the library's ABI.

I've filed an upstream bug about this: https://github.com/google/sanitizers/issues/619