Bug 63888 - [5 Regression] bootstrap failed when configured with -with-build-config=bootstrap-asan --disable-werror
Summary: [5 Regression] bootstrap failed when configured with -with-build-config=boots...
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: bootstrap (show other bugs)
Version: 5.0
: P3 normal
Target Milestone: 5.0
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2014-11-15 14:25 UTC by H.J. Lu
Modified: 2015-10-21 07:48 UTC (History)
3 users (show)

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2014-11-17 00:00:00


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description H.J. Lu 2014-11-15 14:25:37 UTC
On Linux/x86-64, r217602 gave

build/genmddeps /export/gnu/import/git/sources/gcc/gcc/common.md /export/gnu/import/git/sources/gcc/gcc/config/i386/i386.md > tmp-mddeps
build/genmodes -h > tmp-modes.h

=================================================================
==31507==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 40 byte(s) in 1 object(s) allocated from:
    #0 0x47b5da in __interceptor_malloc /export/gnu/import/git/sources/gcc/libsanitizer/asan/asan_malloc_linux.cc:38
    #1 0x4b42d7 in xmalloc /export/gnu/import/git/sources/gcc/libiberty/xmalloc.c:147

SUMMARY: AddressSanitizer: 40 byte(s) leaked in 1 allocation(s).
make[5]: *** [s-mddeps] Error 23

There are some global string variables initialized with

Breakpoint 5, xmalloc (size=size@entry=40)
    at /export/gnu/import/git/sources/gcc/libiberty/xmalloc.c:146
146	    size = 1;
(gdb) bt
#0  xmalloc (size=size@entry=40)
    at /export/gnu/import/git/sources/gcc/libiberty/xmalloc.c:146
#1  0x00000000004b43d1 in xstrndup (
    s=0x7fffffffe18a "/export/gnu/import/git/sources/gcc/gcc/common.md", 
    n=<optimized out>)
    at /export/gnu/import/git/sources/gcc/libiberty/xstrndup.c:56
#2  0x00000000004aff9e in handle_toplevel_file (
    handle_directive=handle_directive@entry=0x0)
    at /export/gnu/import/git/sources/gcc/gcc/read-md.c:1008
#3  0x00000000004b0672 in read_md_files (argc=<optimized out>, 
    argv=<optimized out>, parse_opt=parse_opt@entry=0x0, 
    handle_directive=handle_directive@entry=0x0)
    at /export/gnu/import/git/sources/gcc/gcc/read-md.c:1138
#4  0x00000000004059d1 in main (argc=<optimized out>, argv=<optimized out>)
    at /export/gnu/import/git/sources/gcc/gcc/genmddeps.c:50
(gdb) c

They aren't freed before exit and are detected as memory leaks.
How can they be suppressed?
Comment 1 Richard Biener 2014-11-17 09:33:34 UTC
This is just very stupid memory leak testing ...
Comment 2 hjl@gcc.gnu.org 2014-11-17 22:13:26 UTC
Author: hjl
Date: Mon Nov 17 22:12:55 2014
New Revision: 217678

URL: https://gcc.gnu.org/viewcvs?rev=217678&root=gcc&view=rev
Log:
Export "detect_leaks=0"

	PR bootstrap/63888
	* bootstrap-asan.mk (ASAN_OPTIONS): Export "detect_leaks=0".

Modified:
    trunk/config/ChangeLog
    trunk/config/bootstrap-asan.mk
Comment 3 H.J. Lu 2014-11-17 22:14:31 UTC
Now I got

I got

configure:3612: /export/build/gnu/gcc-asan/build-x86_64-linux/./gcc/xgcc
-B/export/build/gnu/gcc-asan/build-x86_64-linux/./gcc/
-B/usr/gcc-5.0.0/x86_64-unknown-linux-gnu/bin/
-B/usr/gcc-5.0.0/x86_64-unknown-linux-gnu/lib/ -isystem
/usr/gcc-5.0.0/x86_64-unknown-linux-gnu/include -isystem
/usr/gcc-5.0.0/x86_64-unknown-linux-gnu/sys-include    -c -g -O2
conftest.c >&5
=================================================================
==14370==ERROR: AddressSanitizer: odr-violation (0x000002b38aa0):
  [1] size=12 'CSWTCH.2819'
/export/gnu/import/git/sources/gcc/gcc/tree-vrp.c:4056:7
  [2] size=12 'CSWTCH.2820'
/export/gnu/import/git/sources/gcc/gcc/tree-vrp.c:4109:8
These globals were registered at these points:
  [1]:
    #0 0x68e9c6 in __asan_register_globals
/export/gnu/import/git/sources/gcc/libsanitizer/asan/asan_globals.cc:217
    #1 0x28dc89c in __libc_csu_init
(/export/build/gnu/gcc-asan/build-x86_64-linux/gcc/cc1+0x28dc89c)
    #2 0x309e821c34 in __libc_start_main (/lib64/libc.so.6+0x309e821c34)
    #3 0x683d3e
(/export/build/gnu/gcc-asan/build-x86_64-linux/gcc/cc1+0x683d3e)

  [2]:
    #0 0x68e9c6 in __asan_register_globals
/export/gnu/import/git/sources/gcc/libsanitizer/asan/asan_globals.cc:217
    #1 0x28dc89c in __libc_csu_init
(/export/build/gnu/gcc-asan/build-x86_64-linux/gcc/cc1+0x28dc89c)
    #2 0x309e821c34 in __libc_start_main (/lib64/libc.so.6+0x309e821c34)
    #3 0x683d3e
(/export/build/gnu/gcc-asan/build-x86_64-linux/gcc/cc1+0x683d3e)

==14370==HINT: if you don't care about these warnings you may set
ASAN_OPTIONS=detect_odr_violation=0
SUMMARY: AddressSanitizer: odr-violation: global 'CSWTCH.2819' at
/export/gnu/import/git/sources/gcc/gcc/tree-vrp.c:4056:7
==14370==ABORTING
Comment 4 Jakub Jelinek 2014-11-18 13:56:45 UTC
Short testcase for #c3:

int
foo (int x)
{
  int v = 0;
  switch (x)
    {
    case 11: v = 67; break;
    case 12: v = 68; break;
    case 13: v = 69; break;
    }
  return v;
}

int
bar (int x)
{
  int v = 0;
  switch (x)
    {
    case 18: v = 67; break;
    case 19: v = 68; break;
    case 20: v = 69; break;
    }
  return v;
}

int
main ()
{
  return foo (11) - 67 + bar (19) - 68;
}

reports that there is an ODR violation, though:
1) there is no such thing as ODR in C or Fortran, I wonder how can you apply by default something like that without letting the compiler to pass through the language of the vars
2) I don't see how that is anything close to an ODR violation; there are two vars, CSWTCH.1 and CSWTCH.3, which, sometimes because of compiler decision, sometimes unified by linker, have the same beginning address.  What does that have to do with ODR?  Each of the vars has different definition, but they are read-only and not address taken, so they can share the data.  GCC with -fmerge-all-constants also merges many read-only constants, even addressable ones, when they have the same content.
Comment 5 Yury Gribov 2014-11-19 14:53:26 UTC
Perhaps ODR check should check that names of global variables match before issuing warning?
Comment 6 Kostya Serebryany 2014-11-19 19:21:29 UTC
(In reply to Yury Gribov from comment #5)
> Perhaps ODR check should check that names of global variables match before
> issuing warning?

The variables are generated by the compiler, right? 
(These are switch tables)

Do we want to instrument them at all? 
I.e. is a buffer overflow on these variables possible? 

If we don't instrument them we won't report an ODR violation. 

>> GCC with -fmerge-all-constants also merges many read-only constants, even addressable ones, when they have the same content.

Then I guess we may have other false positives. 
We may want to disable -fmerge-all-constants under asan
Comment 7 Jakub Jelinek 2014-11-20 10:22:34 UTC
(In reply to Kostya Serebryany from comment #6)
> (In reply to Yury Gribov from comment #5)
> > Perhaps ODR check should check that names of global variables match before
> > issuing warning?
> 
> The variables are generated by the compiler, right? 
> (These are switch tables)
> 
> Do we want to instrument them at all? 

Generally, we do want to instrument even artificial variables, and on many of them buffer overflow is possible.

> I.e. is a buffer overflow on these variables possible? 
> 
> If we don't instrument them we won't report an ODR violation. 
> 
> >> GCC with -fmerge-all-constants also merges many read-only constants, even addressable ones, when they have the same content.
> 
> Then I guess we may have other false positives. 
> We may want to disable -fmerge-all-constants under asan

You haven't responded about the language thing, there is no such thing as ODR in C or Fortran, so you shouldn't report it.
There are 31 (or 63) bits left in __has_dynamic_init field, can't one bit be used for whether ODR should be reported or not?
Comment 8 Kostya Serebryany 2014-12-04 21:22:36 UTC
(sorry for delay, I missed the last comment)
> Generally, we do want to instrument even artificial variables, and on many
> of them buffer overflow is possible.

Yea, agree. 

> 
> > I.e. is a buffer overflow on these variables possible? 
> > 
> > If we don't instrument them we won't report an ODR violation. 
> > 
> > >> GCC with -fmerge-all-constants also merges many read-only constants, even addressable ones, when they have the same content.
> > 
> > Then I guess we may have other false positives. 
> > We may want to disable -fmerge-all-constants under asan
> 
> You haven't responded about the language thing, there is no such thing as
> ODR in C or Fortran, so you shouldn't report it.

In LLVM, I do not (and should not) know what source language is being compiled.
The differences between languages are represented in the linkage types 
of the globals. E.g. a regular global in C will not be instrumented at all 
unless -fno-common is given. I.e. the difference is not in the source language 
but in the linkage type of the globals. 

% clang -S -o - -emit-llvm glob.c 
@GLOBAL = common global i32 0, align 4
% clang -S -o - -emit-llvm glob.c -fno-common
@GLOBAL = global i32 0, align 4
% clang++ -x c++ -S -o - -emit-llvm glob.c 
@GLOBAL = global i32 0, align 4


Can you give an example in C where we do not want to check for ODRs, yet
the basic asan instrumentation is legal? 

> There are 31 (or 63) bits left in __has_dynamic_init field, can't one bit be
> used for whether ODR should be reported or not?
That will still mean an API change, something I like to avoid when possible.
Comment 9 Jakub Jelinek 2014-12-05 08:10:02 UTC
(In reply to Kostya Serebryany from comment #8)
> > You haven't responded about the language thing, there is no such thing as
> > ODR in C or Fortran, so you shouldn't report it.
> 
> In LLVM, I do not (and should not) know what source language is being
> compiled.

Sounds like LLVM limitation.

> The differences between languages are represented in the linkage types 
> of the globals. E.g. a regular global in C will not be instrumented at all 
> unless -fno-common is given. I.e. the difference is not in the source
> language 
> but in the linkage type of the globals. 

I believe your http://llvm.org/klaus/compiler-rt/blob/0926de35c9357aa1a5c47d3a618d6c72f9e8f085/test/asan/TestCases/Linux/odr-violation.cc
example is valid in C, and commonly used (sure, more commonly with functions than with variables, but even with variables).  Furthermore, the kind of ODR detection in libasan isn't really ODR detection, you are instead checking if the same global is registered multiple times.  GCC intentionally registers local aliases of the globals, so that the same global isn't registered multiple times if it is defined by multiple shared libraries or binary and some shared library - each TU registers the vars local to it, rather than trying to register globals in a completely different shared library.
If LLVM uses global symbols instead of local aliases, it is more expensive.
You can have aliases/weakrefs etc. to symbols, and those still aren't ODR violations.

An ODR violation is IMHO something different, it is the case where you have the same symbol name (but, you'd need to distinguish between globally visible symbols that should be ODR checked and local symbols and/or symbols from languages you don't want to check for it) registered multiple times for multiple addresses.  So you'd need to hash based on the symbol name if marked for ODR checking, and check if the same (non-comdat) global isn't registered several times.
Comment 10 Yury Gribov 2014-12-05 10:07:56 UTC
(In reply to Kostya Serebryany from comment #8)
> (sorry for delay, I missed the last comment)
> > Generally, we do want to instrument even artificial variables, and on many
> > of them buffer overflow is possible.
> 
> Yea, agree. 

FYI this is not only about artificials - gfortran emits tons of spurious ODR errors on user global vars in SPEC2006.
Comment 11 Yury Gribov 2014-12-05 10:49:13 UTC
(In reply to Jakub Jelinek from comment #9)
> An ODR violation is IMHO something different, it is the case where you have
> the same symbol name (but, you'd need to distinguish between globally
> visible symbols that should be ODR checked and local symbols and/or symbols
> from languages you don't want to check for it) registered multiple times for
> multiple addresses.  So you'd need to hash based on the symbol name if
> marked for ODR checking, and check if the same (non-comdat) global isn't
> registered several times.

There is already proposal to this in UBSan: http://llvm.org/bugs/show_bug.cgi?id=21498
Comment 12 Kostya Serebryany 2014-12-06 01:37:38 UTC
(In reply to Jakub Jelinek from comment #9)
> (In reply to Kostya Serebryany from comment #8)
> > > You haven't responded about the language thing, there is no such thing as
> > > ODR in C or Fortran, so you shouldn't report it.
> > 
> > In LLVM, I do not (and should not) know what source language is being
> > compiled.
> 
> Sounds like LLVM limitation.
> 
> > The differences between languages are represented in the linkage types 
> > of the globals. E.g. a regular global in C will not be instrumented at all 
> > unless -fno-common is given. I.e. the difference is not in the source
> > language 
> > but in the linkage type of the globals. 
> 
> I believe your
> http://llvm.org/klaus/compiler-rt/blob/
> 0926de35c9357aa1a5c47d3a618d6c72f9e8f085/test/asan/TestCases/Linux/odr-
> violation.cc
> example is valid in C

But for this example in C the globals will not get instrumented, unless  -fno-common is given. 


> , and commonly used (sure, more commonly with functions
> than with variables, but even with variables).  Furthermore, the kind of ODR
> detection in libasan isn't really ODR detection, you are instead checking if
> the same global is registered multiple times.  GCC intentionally registers
> local aliases of the globals, so that the same global isn't registered
> multiple times if it is defined by multiple shared libraries or binary and
> some shared library - each TU registers the vars local to it, rather than
> trying to register globals in a completely different shared library.

> If LLVM uses global symbols instead of local aliases, it is more expensive.
> You can have aliases/weakrefs etc. to symbols, and those still aren't ODR
> violations.

But these again should not be instrumented by asan at all. No? 

> 
> An ODR violation is IMHO something different, it is the case where you have
> the same symbol name (but, you'd need to distinguish between globally
> visible symbols that should be ODR checked and local symbols and/or symbols
> from languages you don't want to check for it) registered multiple times for
> multiple addresses.

Never seen a case like this ("registered multiple times for multiple addresses").
Can you give an example? 

>> So you'd need to hash based on the symbol name if
> marked for ODR checking, and check if the same (non-comdat) global isn't
> registered several times.
Comment 13 Yury Gribov 2014-12-08 09:02:46 UTC
(In reply to Kostya Serebryany from comment #12)
> But for this example in C the globals will not get instrumented, unless 
> -fno-common is given.

BTW I think everyone already pairs this with -fsanitize=address, otherwise sanitization of globals becomes so weak.

>> If LLVM uses global symbols instead of local aliases, it is more expensive.
>> You can have aliases/weakrefs etc. to symbols, and those still aren't ODR
>> violations.
> 
> But these again should not be instrumented by asan at all. No? 

Wouldn't that cause false negatives for internal references to local aliases?

> Never seen a case like this ("registered multiple times for multiple
> addresses").
> Can you give an example?

How about this: say some global variable xyz is defined both in executable and in shared library. Now
1) if xyz isn't exported from executable, it will not be available to shlib; so exe's internal references will bind to exe's definition but shlib's references will bind to shlib's definition
2) if xyz is exported from executable but shlib is linked with -Bsymbolic, shlib's internal references will bind to shlib's implementation
3) RTLD_DEEPBIND could also alter symbol resolution order in a shlib
Comment 14 Kostya Serebryany 2014-12-12 22:10:27 UTC
(In reply to Yury Gribov from comment #13)
> (In reply to Kostya Serebryany from comment #12)
> > But for this example in C the globals will not get instrumented, unless 
> > -fno-common is given.
> 
> BTW I think everyone already pairs this with -fsanitize=address, otherwise
> sanitization of globals becomes so weak.

Dunno. That's what we recommend.

> 
> >> If LLVM uses global symbols instead of local aliases, it is more expensive.
> >> You can have aliases/weakrefs etc. to symbols, and those still aren't ODR
> >> violations.
> > 
> > But these again should not be instrumented by asan at all. No? 
> 
> Wouldn't that cause false negatives for internal references to local aliases?

We should be careful when instrumenting something that can be redefined because the 
definition may be no-instrumented. 
Consider a case where 
 a.c has a weakref (or whatever else that can be redefined) and
 b.c had a regular definition. 
Then, a.c is instrumented and b.c is not. Bad.
  

> 
> > Never seen a case like this ("registered multiple times for multiple
> > addresses").
> > Can you give an example?
> 
> How about this: say some global variable xyz is defined both in executable
> and in shared library. Now
> 1) if xyz isn't exported from executable, it will not be available to shlib;
> so exe's internal references will bind to exe's definition but shlib's
> references will bind to shlib's definition
> 2) if xyz is exported from executable but shlib is linked with -Bsymbolic,
> shlib's internal references will bind to shlib's implementation
> 3) RTLD_DEEPBIND could also alter symbol resolution order in a shlib
Comment 15 Jakub Jelinek 2014-12-12 22:25:01 UTC
(In reply to Kostya Serebryany from comment #14)
> We should be careful when instrumenting something that can be redefined
> because the 
> definition may be no-instrumented. 

But that is a strong argument why what GCC does is desirable and not what LLVM does.
Say you have:
int f = 4;
int g = 5;
int foo (int *p) { return *p; }
in libfoo.c and
int f = 4;
int g = 5;
int foo (int *);
int main ()
{
  return foo (&f) - 4;
}
Suppose libfoo.c is compiled/linked with -fsanitize=address -fpic -shared.
Suppose main.c is compiled without -fsanitize=address.
GCC will use in registration of global symbols in libfoo.c the local alias to f and g, so registers something that in the end is not used, because f in the executable comes earlier in the search scope.  But registers something that is known to be properly padded for asan.
LLVM doesn't use a local alias, and so will happily register the f and g symbols in the binary, but as if they had the padding which they most probably don't.  So you can end up with false positives or false negatives that way.
But when you use local aliases, obviously the current way of ODR checking can't work and should use symbol names instead.
Comment 16 Kostya Serebryany 2014-12-12 22:34:51 UTC
(In reply to Jakub Jelinek from comment #15)
> (In reply to Kostya Serebryany from comment #14)
> > We should be careful when instrumenting something that can be redefined
> > because the 
> > definition may be no-instrumented. 
> 
> But that is a strong argument why what GCC does is desirable and not what
> LLVM does.
> Say you have:
> int f = 4;
> int g = 5;
> int foo (int *p) { return *p; }
> in libfoo.c and
> int f = 4;
> int g = 5;
> int foo (int *);
> int main ()
> {
>   return foo (&f) - 4;
> }
> Suppose libfoo.c is compiled/linked with -fsanitize=address -fpic -shared.
> Suppose main.c is compiled without -fsanitize=address.
> GCC will use in registration of global symbols in libfoo.c the local alias
> to f and g, so registers something that in the end is not used, because f in
> the executable comes earlier in the search scope.  But registers something
> that is known to be properly padded for asan.
> LLVM doesn't use a local alias, and so will happily register the f and g
> symbols in the binary, but as if they had the padding which they most
> probably don't.  So you can end up with false positives or false negatives
> that way.
> But when you use local aliases, obviously the current way of ODR checking
> can't work and should use symbol names instead.

Frankly, I realize that I don't understand the subtleties of this problem. :( 

First, if this is C++ we clearly have a bug (ODR violation) and we are done. 
Then, if this is C w/o any extra flags we will not instrument these globals. Done.
Finally, if this is C with -fno-common, I am not sure if the C standard covers this and how we should behave.
Comment 17 Jakub Jelinek 2014-12-12 22:50:30 UTC
(In reply to Kostya Serebryany from comment #16)
> Frankly, I realize that I don't understand the subtleties of this problem.
> :( 
> 
> First, if this is C++ we clearly have a bug (ODR violation) and we are done. 

So it is an ODR violation in C++, but you won't report it (remember, the binary is not instrumented), just it will misbehave (can mark valid memory of other vars in the binary as poisoned, e.g.).

> Then, if this is C w/o any extra flags we will not instrument these globals.

Not true, the vars are initialized, thus are not common.  Or the var in the binary could be common, and the var in the shared library not, etc.
And I've actually verified both clang and gcc instrument it.

Registering something assuming padding has been added (and aligned) when you don't have a control on it is just wrong, and the local alias is an very easy way to avoid it.
Comment 18 Kostya Serebryany 2014-12-12 23:00:40 UTC
(In reply to Jakub Jelinek from comment #17)
> (In reply to Kostya Serebryany from comment #16)
> > Frankly, I realize that I don't understand the subtleties of this problem.
> > :( 
> > 
> > First, if this is C++ we clearly have a bug (ODR violation) and we are done. 
> 
> So it is an ODR violation in C++, but you won't report it (remember, the
> binary is not instrumented), just it will misbehave (can mark valid memory
> of other vars in the binary as poisoned, e.g.).
> 
> > Then, if this is C w/o any extra flags we will not instrument these globals.
> 
> Not true, the vars are initialized, thus are not common.  Or the var in the
> binary could be common, and the var in the shared library not, etc.
> And I've actually verified both clang and gcc instrument it.
> 
> Registering something assuming padding has been added (and aligned) when you
> don't have a control on it is just wrong, and the local alias is an very
> easy way to avoid it.

I am disoriented. 
Can you please give a full repro (with command lines, etc) where 
we'll now produce a false positive (in clang or in gcc)?
Comment 19 Jakub Jelinek 2014-12-12 23:26:08 UTC
(In reply to Kostya Serebryany from comment #18)
> I am disoriented. 
> Can you please give a full repro (with command lines, etc) where 
> we'll now produce a false positive (in clang or in gcc)?

$ cat libfoo.c
long f = 4;
long g = 5;
long foo (long *p) { return *p; }
$ cat main.c
extern void abort (void);
char a[32] __attribute__((aligned (32))) = { 1 };
long f = 4;
long b1 = 2, b2 = 3, b3 = 4, b4 = 5, b5 = 6, b6 = 7, b7 = 8;
long g = 5;
long c1 = 2, c2 = 3, c3 = 4, c4 = 5, c5 = 6, c6 = 7, c7 = 8;
long foo (long *);

int main ()
{
  if (foo (&f) != 4 || foo (&b1) != 2 || foo (&b2) != 3 || foo (&b3) != 4
      || foo (&b4) != 5 || foo (&b5) != 6 || foo (&b6) != 7 || foo (&b7) != 8
      || foo (&g) != 5 || foo (&c1) != 2 || foo (&c2) != 3 || foo (&c3) != 4
      || foo (&c4) != 5 || foo (&c5) != 6 || foo (&c6) != 7 || foo (&c7) != 8)
    abort ();
  return 0;
}
$ gcc -fsanitize=address -g -shared -fpic -o libfoo.{so,c}
$ gcc -c -g -o main.{o,c}
$ gcc -fsanitize=address -o main{,.o} ./libfoo.so
$ ./main; echo $?
0
$ clang -fsanitize=address -g -shared -fpic -o libfoo.{so,c}
$ clang -c -g -o main.{o,c}
$ clang -fsanitize=address -o main{,.o} ./libfoo.so
$ ./main; echo $?
=================================================================
==5535==ERROR: AddressSanitizer: global-buffer-overflow on address 0x0000006dca28 at pc 0x7f691dc51b94 bp 0x7fffd23e0930 sp 0x7fffd23e0928
READ of size 8 at 0x0000006dca28 thread T0
    #0 0x7f691dc51b93 in foo /tmp/libfoo.c:3:22
    #1 0x4ba3c8 in main /tmp/main.c:11:24
    #2 0x327981ffdf in __libc_start_main (/lib64/libc.so.6+0x327981ffdf)
    #3 0x434366 in _start (/tmp/main+0x434366)

0x0000006dca28 is located 56 bytes to the left of global variable 'g' defined in 'libfoo.c:2:6' (0x6dca60) of size 8
0x0000006dca28 is located 0 bytes to the right of global variable 'f' defined in 'libfoo.c:1:6' (0x6dca20) of size 8
SUMMARY: AddressSanitizer: global-buffer-overflow /tmp/libfoo.c:3 foo
Shadow bytes around the buggy address:
  0x0000800d38f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0000800d3900: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0000800d3910: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0000800d3920: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0000800d3930: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x0000800d3940: 00 00 00 00 00[f9]f9 f9 f9 f9 f9 f9 00 f9 f9 f9
  0x0000800d3950: f9 f9 f9 f9 00 00 00 00 00 00 00 00 00 00 00 00
  0x0000800d3960: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0000800d3970: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0000800d3980: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0000800d3990: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Heap right redzone:      fb
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack partial redzone:   f4
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
==5535==ABORTING
1

Is this clear?  This is on access of the b1 variable defined in main.c, certainly not anything around f variable defined in libfoo.c.
Comment 20 Kostya Serebryany 2014-12-12 23:40:59 UTC
> Is this clear?  This is on access of the b1 variable defined in main.c,
> certainly not anything around f variable defined in libfoo.c.

Yea. Thanks. Pondering...
I am still not convinced that this code is good (even if it is C-standard-compliant).
Comment 21 Jakub Jelinek 2014-12-13 00:10:46 UTC
(In reply to Kostya Serebryany from comment #20)
> > Is this clear?  This is on access of the b1 variable defined in main.c,
> > certainly not anything around f variable defined in libfoo.c.
> 
> Yea. Thanks. Pondering...
> I am still not convinced that this code is good (even if it is
> C-standard-compliant).

Various programs do it, this is normal ELF symbol interposition.  Of course more often people interpose functions than variables (after all, all of libasan relies heavily on it, and also violates C++ ODR by doing that), but it happens even for variables.
And, more importantly, the diagnostics is completely confusing.

If it wasn't for the recently added ODR checking, I'd say GCC will keep what it is doing and in LLVM it is up to you what you want to do.  But the ODR checking relies on the implementation choice of LLVM, which makes it a problem for GCC too.

Consider another testcase, without ANY duplicate symbol definition in there, I think you can't find anything not kosher on it.

$ cat libfoo.c
long f = 4;
long g = 5;
long foo (long *p) { return *p; }
$ cat libfoo2.c
long h = 12, i = 13;
$ cat main2.c
extern void abort (void);
extern long f;
extern long h;
extern long i;
long foo (long *);

int
main ()
{
  if (foo (&f) != 4 || foo (&h) != 12 || foo (&i) != 13)
    abort ();
  return 0;
}
$ gcc -g -shared -fpic -o libfoo.{so,c}
$ gcc -g -shared -fpic -o libfoo2.{so,c}
$ gcc -c -g -o main2.{o,c}
$ gcc -fsanitize=address -o main2{,.o} ./libfoo.so ./libfoo2.so
$ ./main2; echo $?
0
$ gcc -fsanitize=address -g -shared -fpic -o libfoo.{so,c}
$ ./main2; echo $?
0
$ clang -g -shared -fpic -o libfoo.{so,c}
$ clang -g -shared -fpic -o libfoo2.{so,c}
$ clang -c -g -o main2.{o,c}
$ clang -fsanitize=address -o main2{,.o} ./libfoo.so ./libfoo2.so
$ ./main2; echo $?
0
$ clang -fsanitize=address -g -shared -fpic -o libfoo.{so,c}
$ ./main2; echo $?
./main2: Symbol `f' has different size in shared object, consider re-linking
=================================================================
==5791==ERROR: AddressSanitizer: global-buffer-overflow on address 0x0000006dca10 at pc 0x7f6601b17b94 bp 0x7fff63d1ebb0 sp 0x7fff63d1eba8
READ of size 8 at 0x0000006dca10 thread T0
    #0 0x7f6601b17b93 in foo /tmp/libfoo.c:3:22
    #1 0x4ba273 in main /tmp/main2.c:10:42
    #2 0x327981ffdf in __libc_start_main (/lib64/libc.so.6+0x327981ffdf)
    #3 0x4341f6 in _start (/tmp/main2+0x4341f6)

0x0000006dca10 is located 0 bytes to the right of global variable 'f' defined in 'libfoo.c:1:6' (0x6dca08) of size 8
SUMMARY: AddressSanitizer: global-buffer-overflow /tmp/libfoo.c:3 foo
Shadow bytes around the buggy address:
  0x0000800d38f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0000800d3900: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0000800d3910: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0000800d3920: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0000800d3930: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x0000800d3940: 00 00[f9]f9 f9 f9 f9 f9 f9 00 00 00 00 00 00 00
  0x0000800d3950: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0000800d3960: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0000800d3970: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0000800d3980: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0000800d3990: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Heap right redzone:      fb
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack partial redzone:   f4
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
==5791==ABORTING
1

That shows not just that not using the local aliases is bad, but it is also bad to use incorrect sizes of the variables.
In gcc produced sanitized libfoo.so the f and g symbols have st_size still 8, the padding is included after the symbol, but not included in the size of the symbol.
In llvm, the padding is included in st_size, so f and g symbols have st_size of 64, making it ABI incompatible (both ways, either a binary linked against -fsanitize=address compiled library later run against non-sanitized one results in larger size allocated in .dynbss and you get a dynamic linker warning, or binary linked against non-sanitized library later run against sanitized library - this allocates smaller size in .dynbss and then writes over larger size, you get a dynamic linker warning and can get symbols after it poisoned).
Comment 22 Kostya Serebryany 2014-12-13 01:24:52 UTC
Summoning more folks
Comment 23 Eric Christopher 2015-02-03 21:40:06 UTC
So, I think Jakub's solution is strictly better here as it allows intermixing of asan and non-asan code. It'll involve a bit of work in llvm's middle end to keep track of symbol type to make sure to emit padding as we hit the backend, but as long as we make sure to have full symbols for each variable it should be fine on platforms like OS X where we have linkers that split sections based on type of section and symbols in the section. It's an endless string of corner cases, but as long as we're careful it'll probably work.
Comment 24 Eric Christopher 2015-02-03 21:50:37 UTC
For the record btw, I don't believe there's a reason why the linker couldn't split up the data section by knowing the size of the variable so it's worth being very careful here - there be dragons and things could break. I don't have any better ideas at the moment though.
Comment 25 Andrew Pinski 2015-02-03 22:00:11 UTC
(In reply to Eric Christopher from comment #24)
> For the record btw, I don't believe there's a reason why the linker couldn't
> split up the data section by knowing the size of the variable so it's worth
> being very careful here - there be dragons and things could break.

We can't for most non-x86 targets where section anchors are being used.
Comment 26 Eric Christopher 2015-02-03 22:03:30 UTC
Sure. Not the general case for other targets though.
Comment 27 Jakub Jelinek 2015-02-03 22:22:37 UTC
And even without section anchors, there is no guarantee there aren't relative relocations in between the symbols, that just have been applied already at assembly time (say relocations against .L* symbols).
Another reason where this would break is that various programs accumulate symbols in some sections and assume the result forms an array.  If the linker attempted to break that appart, Linux kernel and various other projects would fail miserably.
So without something like -fdata-sections the linker really can't do such changes.
Comment 28 Eric Christopher 2015-02-03 22:30:14 UTC
Ah, but not every platform is ELF :)

ld64 has the flexibility to do this with Mach-O. As I said, I don't have any better ideas at the moment, but warning that it is possible to break.
Comment 29 Jakub Jelinek 2015-02-04 13:06:29 UTC
Perhaps you can emit another symbol covering the original size plus padding in that case on targets where linker can do something so problematic.
So you'd end up having e.g. symbol
varfoo addr=0x1234500 size=32
__asan_var.varfoo addr=0x1234500 size=64
(replace . with whatever char can be used in object format's identifiers and if at all possible is not allowed in C/C++ identifiers).
Or would Mach-O still attempt to tear it appart?  That would sound like a bug to me.
Comment 30 Eric Christopher 2015-02-04 18:09:00 UTC
That might be reasonable. As far as I know ld64 doesn't do that yet, but I know it's been thought about.
Comment 31 Jakub Jelinek 2015-02-20 10:44:41 UTC
Any progress on this?

If not, I'm considering doing:
--- libsanitizer/asan/asan_globals.cc.jj	2014-11-14 00:10:34.000000000 +0100
+++ libsanitizer/asan/asan_globals.cc	2015-02-20 11:43:33.179177767 +0100
@@ -148,7 +148,9 @@ static void RegisterGlobal(const Global
   CHECK(AddrIsInMem(g->beg));
   CHECK(AddrIsAlignedByGranularity(g->beg));
   CHECK(AddrIsAlignedByGranularity(g->size_with_redzone));
-  if (flags()->detect_odr_violation) {
+  // This "ODR violation" detection is fundamentally incompatible with
+  // how GCC registers globals.  Disable as useless until rewritten upstream.
+  if (0 && flags()->detect_odr_violation) {
     // Try detecting ODR (One Definition Rule) violation, i.e. the situation
     // where two globals with the same name are defined in different modules.
     if (__asan_region_is_poisoned(g->beg, g->size_with_redzone)) {
for now.
Comment 32 Yury Gribov 2015-02-20 11:04:46 UTC
Or (probably less intrusive) add detect_odr_violation=0 to ASAN_OPTIONS config/bootstrap-asan.mk.
Comment 33 Jakub Jelinek 2015-02-20 11:09:14 UTC
That is not sufficient.  The bug affects all programs, not just gcc when bootstrapping it.
Comment 34 Kostya Serebryany 2015-02-20 16:31:48 UTC
(In reply to Jakub Jelinek from comment #31)
> Any progress on this?
> 
> If not, I'm considering doing:
> --- libsanitizer/asan/asan_globals.cc.jj	2014-11-14 00:10:34.000000000 +0100
> +++ libsanitizer/asan/asan_globals.cc	2015-02-20 11:43:33.179177767 +0100
> @@ -148,7 +148,9 @@ static void RegisterGlobal(const Global
>    CHECK(AddrIsInMem(g->beg));
>    CHECK(AddrIsAlignedByGranularity(g->beg));
>    CHECK(AddrIsAlignedByGranularity(g->size_with_redzone));
> -  if (flags()->detect_odr_violation) {
> +  // This "ODR violation" detection is fundamentally incompatible with
> +  // how GCC registers globals.  Disable as useless until rewritten
> upstream.
> +  if (0 && flags()->detect_odr_violation) {
>      // Try detecting ODR (One Definition Rule) violation, i.e. the situation
>      // where two globals with the same name are defined in different
> modules.
>      if (__asan_region_is_poisoned(g->beg, g->size_with_redzone)) {
> for now.

Sorry, this got pushed back in the queue... 
Sure, feel free to commit something like this.
Slightly better place is at the definition of detect_odr_violation -- change the default to 0 and mark the change with some visible markers. 
Obviously we'll need to do a proper fix upstream before the next merge.

Frankly, I am not at all motivated to do any significant surgery in the llvm compiler instrumentation because for me everything works fine. 

We can extend the Global struct to have a boolean (bit?) that says whether we are checking for ODRs (as you suggested above). This will be a new ABI break. Not a big deal, of course.

I wonder if we can somehow else detect that the module is compiled with GCC and disable the checking in that case?
Comment 35 Yury Gribov 2015-02-20 17:23:50 UTC
(In reply to Kostya Serebryany from comment #34)
> Frankly, I am not at all motivated to do any significant surgery in the llvm
> compiler instrumentation because for me everything works fine. 

Hm, is this related to GCC at all? Comments mention symbol resolution rules (local aliases being one example), I think this is totally orthogonal to compiler.
Comment 36 Jakub Jelinek 2015-02-23 17:49:35 UTC
(In reply to Yury Gribov from comment #35)
> (In reply to Kostya Serebryany from comment #34)
> > Frankly, I am not at all motivated to do any significant surgery in the llvm
> > compiler instrumentation because for me everything works fine. 
> 
> Hm, is this related to GCC at all? Comments mention symbol resolution rules
> (local aliases being one example), I think this is totally orthogonal to
> compiler.

It is related to the implementation of ASAN for PIC code.
GCC registers globals through private aliases (so always registers globals in the current shared library, no matter whether the symbol in the end is interposed by another symbol or not), while LLVM registers globals through their exported symbols (so if a symbol is interposed, it is registered multiple times by multiple shared libraries).
Comment 37 Jakub Jelinek 2015-02-23 21:02:29 UTC
Author: jakub
Date: Mon Feb 23 21:01:57 2015
New Revision: 220919

URL: https://gcc.gnu.org/viewcvs?rev=220919&root=gcc&view=rev
Log:
	PR bootstrap/63888
	* asan/asan_globals.cc (RegisterGlobal): Disable detect_odr_violation
	support until it is rewritten upstream.

	* c-c++-common/asan/pr63888.c: New test.

Added:
    trunk/gcc/testsuite/c-c++-common/asan/pr63888.c
Modified:
    trunk/gcc/testsuite/ChangeLog
    trunk/libsanitizer/ChangeLog
    trunk/libsanitizer/asan/asan_globals.cc
Comment 38 Jakub Jelinek 2015-02-23 21:06:22 UTC
Should be fixed now.
Comment 39 Yury Gribov 2015-02-24 11:00:50 UTC
(In reply to Jakub Jelinek from comment #36)
> (In reply to Yury Gribov from comment #35)
> > (In reply to Kostya Serebryany from comment #34)
> > > Frankly, I am not at all motivated to do any significant surgery in the llvm
> > > compiler instrumentation because for me everything works fine. 
> > 
> > Hm, is this related to GCC at all? Comments mention symbol resolution rules
> > (local aliases being one example), I think this is totally orthogonal to
> > compiler.
> 
> It is related to the implementation of ASAN for PIC code.
> GCC registers globals through private aliases (so always registers globals
> in the current shared library, no matter whether the symbol in the end is
> interposed by another symbol or not), while LLVM registers globals through
> their exported symbols (so if a symbol is interposed, it is registered
> multiple times by multiple shared libraries).

What I was trying to say was that if library references it's local symbol (via local alias or -Bsymbolic), it'll (effectively) not be instrumented which will lead to false negatives. And this is not GCC-related.
Comment 40 Maxim Ostapenko 2015-10-19 17:20:49 UTC
Just a little bit more fuel to the flames:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68016
Comment 41 Maxim Ostapenko 2015-10-21 07:48:26 UTC
Author: chefmax
Date: Wed Oct 21 07:47:54 2015
New Revision: 229114

URL: https://gcc.gnu.org/viewcvs?rev=229114&root=gcc&view=rev
Log:
libsanitizer/

	PR bootstrap/63888
	Reapply:
	2015-02-20  Jakub Jelinek  <jakub@redhat.com>

	* asan/asan_globals.cc (RegisterGlobal): Disable detect_odr_violation
	support until it is rewritten upstream.

	* c-c++-common/asan/pr63888.c: New test.


Modified:
    trunk/libsanitizer/ChangeLog
    trunk/libsanitizer/asan/asan_globals.cc