Bug 61408 - r205695 breaks packaging step of Firefox 24 ESR on Ubuntu Lucid building with ASan
Status: RESOLVED WORKSFORME
Alias: None
Product: gcc
Classification: Unclassified
Component: sanitizer
Version: 4.9.0
Importance: P3 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
 
Reported: 2014-06-04 05:56 UTC by Georg Koppen
Modified: 2016-05-26 11:34 UTC

Description Georg Koppen 2014-06-04 05:56:15 UTC
Building Firefox 24 ESR on Lucid with ASan and GCC 4.9.0 crashes during the packaging step of the build:

Executing /home/ubuntu/build/tor-browser/obj-x86_64-unknown-linux-gnu/dist/bin/xpcshell -g /home/ubuntu/build/tor-browser/obj-x86_64-unknown-linux-gnu/dist/bin/ -a /home/ubuntu/build/tor-browser/obj-x86_64-unknown-linux-gnu/dist/bin/ -f /home/ubuntu/build/tor-browser/toolkit/mozapps/installer/precompile_cache.js -e precompile_startupcache("resource://gre/");
ASAN:SIGSEGV
=================================================================
==22869==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000000 (pc 0x000000000000 sp 0x2b0f084bf678 bp 0x2b0f084bf780 T2)

AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV ??:0 ??
Thread T2 created by T0 here:
    #0 0x2b0ec8ea572a in __interceptor_pthread_create ../../.././libsanitizer/asan/asan_interceptors.cc:183
    #1 0x2b0ef7b75269 in _PR_CreateThread /home/ubuntu/build/tor-browser/nsprpub/pr/src/pthreads/ptthread.c:444
    #2 0x2b0ef7b778ae in PR_CreateThread /home/ubuntu/build/tor-browser/nsprpub/pr/src/pthreads/ptthread.c:527
    #3 0x2b0ede4a9286 in nsThread::Init() /home/ubuntu/build/tor-browser/xpcom/threads/nsThread.cpp:332
    #4 0x2b0ee5d7d57c (/home/ubuntu/build/tor-browser/obj-x86_64-unknown-linux-gnu/dist/bin/libxul.so+0x1bdfe57c)

==22869==ABORTING

After a lot of bisecting, it turns out that r205695 is the first revision that breaks the packaging step.
Oddly, this is not a problem when building with and using GCC 4.9.0 on Ubuntu Precise: there, building and packaging Firefox 24 ESR works fine.
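
For triage, the header line of the report above can be picked apart mechanically. The following is a minimal sketch, not part of any ASan tooling: the names SEGV_RE and triage_segv are made up for illustration. It flags the case seen here, where both the faulting address and the pc are zero, which means the thread tried to execute at address 0. A call through a null function pointer is one common cause of that, but this is only an interpretation and is not established anywhere in this report.

```python
import re

# Header line copied verbatim from the report in the description.
HEADER = ("==22869==ERROR: AddressSanitizer: SEGV on unknown address "
          "0x000000000000 (pc 0x000000000000 sp 0x2b0f084bf678 "
          "bp 0x2b0f084bf780 T2)")

# Hypothetical helper pattern: matches only this SEGV header layout.
SEGV_RE = re.compile(
    r"==(?P<pid>\d+)==\s*ERROR: AddressSanitizer: SEGV on unknown address "
    r"(?P<addr>0x[0-9a-f]+) \(pc (?P<pc>0x[0-9a-f]+) sp (?P<sp>0x[0-9a-f]+) "
    r"bp (?P<bp>0x[0-9a-f]+) T(?P<thread>\d+)\)"
)


def triage_segv(line):
    """Return the parsed fields of an ASan SEGV header, or None if no match."""
    match = SEGV_RE.search(line)
    if match is None:
        return None
    info = {}
    for key, value in match.groupdict().items():
        # Addresses are hexadecimal; pid and thread id are decimal.
        info[key] = int(value, 16) if value.startswith("0x") else int(value)
    # pc == 0 together with a faulting address of 0 means the CPU tried to
    # fetch an instruction from address 0, so no stack frame can be symbolized.
    info["jumped_to_null"] = info["pc"] == 0 and info["addr"] == 0
    return info


if __name__ == "__main__":
    print(triage_segv(HEADER))
```

This also explains why ASan "can not provide additional info": with pc at zero there is no frame to unwind from, so only the thread creation stack is available.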
Comment 1 Georg Koppen 2014-06-04 05:57:52 UTC
I am happy to debug this further (and am, of course, interested in a fix/workaround). Let me know what you need.
Comment 2 Kostya Serebryany 2014-06-04 07:22:00 UTC
Does this happen with GCC trunk? 
LLVM trunk?
Comment 3 Georg Koppen 2014-06-04 10:33:05 UTC
(In reply to Kostya Serebryany from comment #2)
> Does this happen with GCC trunk?

Hard to say as it crashes differently:

Executing /home/gk/asan/mozilla-esr24/obj-x86_64-unknown-linux-gnu/dist/bin/xpcshell -g /home/gk/asan/mozilla-esr24/obj-x86_64-unknown-linux-gnu/dist/bin/ -a /home/gk/asan/mozilla-esr24/obj-x86_64-unknown-linux-gnu/dist/bin/ -f /home/gk/asan/mozilla-esr24/toolkit/mozapps/installer/precompile_cache.js -e precompile_startupcache("resource://gre/");
=================================================================
==22303==ERROR: AddressSanitizer: unknown-crash on address 0x2ad2d31bd3c0 at pc 0x2ad2d1803362 bp 0x7fff8f6149c0 sp 0x7fff8f6149b8
READ of size 16 at 0x2ad2d31bd3c0 thread T0
    #0 0x2ad2d1803361 in nsIDHashKey ../../dist/include/nsHashKeys.h:375
    #1 0x2ad2d1803361 in nsBaseHashtableET ../../dist/include/nsBaseHashtable.h:408
    #2 0x2ad2d1803361 in nsTHashtable<nsBaseHashtableET<nsIDHashKey, nsFactoryEntry*> >::s_InitEntry(PLDHashTable*, PLDHashEntryHdr*, void const*) ../../dist/include/nsTHashtable.h:472
    #3 0x2ad2d179ad39 in PL_DHashTableOperate /home/gk/asan/mozilla-esr24/obj-x86_64-unknown-linux-gnu/xpcom/build/pldhash.cpp:630
    #4 0x2ad2d1805d75 in nsTHashtable<nsBaseHashtableET<nsIDHashKey, nsFactoryEntry*> >::PutEntry(nsID const&, mozilla::fallible_t const&) ../../dist/include/nsTHashtable.h:184
    #5 0x2ad2d1805d75 in nsTHashtable<nsBaseHashtableET<nsIDHashKey, nsFactoryEntry*> >::PutEntry(nsID const&) ../../dist/include/nsTHashtable.h:170
    #6 0x2ad2d1805d75 in nsBaseHashtable<nsIDHashKey, nsFactoryEntry*, nsFactoryEntry*>::Put(nsID const&, nsFactoryEntry* const&, mozilla::fallible_t const&) ../../dist/include/nsBaseHashtable.h:147
    #7 0x2ad2d1805d75 in nsBaseHashtable<nsIDHashKey, nsFactoryEntry*, nsFactoryEntry*>::Put(nsID const&, nsFactoryEntry* const&) ../../dist/include/nsBaseHashtable.h:141
    #8 0x2ad2d1806065 in nsComponentManagerImpl::RegisterCIDEntryLocked(mozilla::Module::CIDEntry const*, nsComponentManagerImpl::KnownModule*) /home/gk/asan/mozilla-esr24/xpcom/components/nsComponentManager.cpp:502
    #9 0x2ad2d1809d35 in nsComponentManagerImpl::RegisterModule(mozilla::Module const*, mozilla::FileLocation*) /home/gk/asan/mozilla-esr24/xpcom/components/nsComponentManager.cpp:453
    #10 0x2ad2d180aba2 in nsComponentManagerImpl::Init() /home/gk/asan/mozilla-esr24/xpcom/components/nsComponentManager.cpp:389
    #11 0x2ad2d17a1fb0 in NS_InitXPCOM2 /home/gk/asan/mozilla-esr24/xpcom/build/nsXPComInit.cpp:467
    #12 0x406d4b in main /home/gk/asan/mozilla-esr24/js/xpconnect/shell/xpcshell.cpp:1566
    #13 0x2ad2d59b6c8c in __libc_start_main (/lib/libc.so.6+0x1ec8c)
    #14 0x407ea0 (/home/gk/asan/mozilla-esr24/obj-x86_64-unknown-linux-gnu/dist/bin/xpcshell+0x407ea0)

0x2ad2d31bd3c0 is located 0 bytes inside of global variable 'kComponentManagerCID' from '/home/gk/asan/mozilla-esr24/xpcom/build/nsXPComInit.cpp' (0x2ad2d31bd3c0) of size 16
SUMMARY: AddressSanitizer: unknown-crash ../../dist/include/nsHashKeys.h:375 nsIDHashKey
Shadow bytes around the buggy address:
  0x055ada62fa20: 00 00 f9 f9 f9 f9 f9 f9 00 00 f9 f9 f9 f9 f9 f9
  0x055ada62fa30: 00 00 f9 f9 f9 f9 f9 f9 00 00 f9 f9 f9 f9 f9 f9
  0x055ada62fa40: 00 00 f9 f9 f9 f9 f9 f9 00 00 f9 f9 f9 f9 f9 f9
  0x055ada62fa50: 00 00 f9 f9 f9 f9 f9 f9 00 00 f9 f9 f9 f9 f9 f9
  0x055ada62fa60: 00 00 f9 f9 f9 f9 f9 f9 00 00 f9 f9 f9 f9 f9 f9
=>0x055ada62fa70: 00 00 f9 f9 f9 f9 f9 f9[00]00 f9 f9 f9 f9 f9 f9
  0x055ada62fa80: 07 f9 f9 f9 f9 f9 f9 f9 00 00 00 00 04 f9 f9 f9
  0x055ada62fa90: f9 f9 f9 f9 00 02 f9 f9 f9 f9 f9 f9 00 00 00 00
  0x055ada62faa0: 05 f9 f9 f9 f9 f9 f9 f9 06 f9 f9 f9 f9 f9 f9 f9
  0x055ada62fab0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x055ada62fac0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Heap right redzone:      fb
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack partial redzone:   f4
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  ASan internal:           fe
==22303==ABORTING
 
> LLVM trunk?

Have not tried yet. Shall I?
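
The shadow dump in the report above can be cross-checked by hand. A minimal sketch, assuming the default ASan mapping for x86_64 Linux, shadow = (addr >> 3) + 0x7fff8000; the helper names are made up for illustration. With that assumption the faulting address 0x2ad2d31bd3c0 maps to shadow address 0x055ada62fa78, which is exactly the bracketed [00] byte in the marked row.

```python
# Assumed default ASan shadow mapping on x86_64 Linux.
SHADOW_SCALE = 3            # one shadow byte covers 2**3 = 8 application bytes
SHADOW_OFFSET = 0x7FFF8000

# Values copied from the report: the marked shadow row and the faulting address.
ROW_BASE = 0x055ADA62FA70
ROW = bytes.fromhex("0000f9f9f9f9f9f90000f9f9f9f9f9f9")
FAULT_ADDR = 0x2AD2D31BD3C0
ACCESS_SIZE = 16


def shadow_address(addr):
    """Map an application address to the address of its shadow byte."""
    return (addr >> SHADOW_SCALE) + SHADOW_OFFSET


def shadow_bytes_for_access(row_base, row_bytes, addr, size):
    """Return the shadow bytes of one dumped row that cover [addr, addr + size)."""
    first = shadow_address(addr) - row_base
    last = shadow_address(addr + size - 1) - row_base
    return row_bytes[first:last + 1]


covered = shadow_bytes_for_access(ROW_BASE, ROW, FAULT_ADDR, ACCESS_SIZE)

if __name__ == "__main__":
    print(hex(shadow_address(FAULT_ADDR)))  # 0x55ada62fa78
    print(covered.hex())                    # 0000
```

Both shadow bytes covering the 16-byte read are 00, i.e. fully addressable, with the f9 global redzone starting right after them. That is consistent with ASan reporting "unknown-crash" here rather than a global-buffer-overflow.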
Comment 4 Kostya Serebryany 2014-06-04 10:51:20 UTC
> > LLVM trunk?
> 
> Have not tried yet. Shall I?

ASan is being developed in LLVM trunk.
So if there is a bug in the run-time, it's better to investigate the freshest variant in LLVM trunk.
The fix will have to go there first anyway.

If the problem is in the GCC instrumentation, then of course there is no reason to test LLVM.
But you say that the problem started after the libsanitizer merge, so it is likely in the run-time.

It can be a problem in FF itself or in the way you run it, of course.
Comment 5 Georg Koppen 2014-06-04 13:39:14 UTC
(In reply to Kostya Serebryany from comment #4)
> > > LLVM trunk?
> > 
> > Have not tried yet. Shall I?
> 
> asan is being developed in LLVM trunk.
> So if there is a bug in run-time it's better to investigate the freshest
> variant in LLVM trunk
> The fix will have to go there first anyway.

Okay. Then I'll do that. Note that the last crash is happening on my Precise system as well now. Thus, I guess this is a different issue worth another bug. Anyway, I'll post back when I have convinced LLVM/Clang to compile Firefox.
Comment 6 Jakub Jelinek 2014-06-04 13:42:04 UTC
I'd say there is no point in doing that. Just build the compiler-rt library and link it in statically (-static-libasan) with gcc, in place of GCC's own libasan.
Comment 7 Georg Koppen 2014-06-05 13:10:23 UTC
(In reply to Jakub Jelinek from comment #6)
> I'd say there is no point in doing that.  Just build the compiler-rt library
> and link it in statically (-static-libasan) with gcc instead of the gcc one.

Hmm... how do I do that exactly? I have some libclang_rt.* in Release+Asserts/lib/clang/3.5.0/lib/linux but I don't seem to be able to get that going...

And I can't get any Firefox version between 24 and trunk to compile with 3.5.0 (for different reasons), so that route is blocked as well at the moment.
Comment 8 Georg Koppen 2014-06-09 15:01:10 UTC
While I am still unable to compile/test Firefox 24 with clang trunk, I managed to compile it with clang's r196090, which is the revision that r205695 merged. The problem occurs there as well, so we at least know that it is not a GCC-specific issue. I am not sure yet about the problem in comment 3, which is probably better tracked in a different bug. I'll open one as soon as my build machine is no longer occupied with bisecting the crash mentioned in the description.
Comment 9 Georg Koppen 2014-06-11 12:45:22 UTC
(In reply to Georg Koppen from comment #8)
> Not sure about the problem in comment 3 yet which is
> probably better tracked in a different bug. I'll open one as soon as my
> build machine is not occupied anymore with bisecting the things related to
> the crash mentioned in the description.

This is now bug 61475. Let's see what happens when I compile Firefox 24 on Lucid with LLVM/Clang trunk...
Comment 10 Georg Koppen 2014-06-11 19:05:59 UTC
Okay. LLVM/Clang trunk does not cope with the packaging step either. Sending the crash through asan_symbolize.py gives:

Executing /home/gk/asan/mozilla-esr24/obj-x86_64-unknown-linux-gnu/dist/bin/xpcshell -g /home/gk/asan/mozilla-esr24/obj-x86_64-unknown-linux-gnu/dist/bin/ -a /home/gk/asan/mozilla-esr24/obj-x86_64-unknown-linux-gnu/dist/bin/ -f /home/gk/asan/mozilla-esr24/toolkit/mozapps/installer/precompile_cache.js -e precompile_startupcache("resource://gre/");
ASAN:SIGSEGV
=================================================================
==4017==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000000 (pc 0x000000000000 sp 0x2b9955a94738 bp 0x2b9955a947f0 T2)

AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV ??:0 ??
Thread T2 created by T0 here:
BFD: Dwarf Error: found dwarf version '4', this reader only handles version 2 and 3 information.
    #0 0x48aa4f in pthread_create ??:0
BFD: Dwarf Error: found dwarf version '4', this reader only handles version 2 and 3 information.
    #1 0x2b994b0d4d4e in _PR_CreateThread /home/gk/asan/mozilla-esr24/nsprpub/pr/src/pthreads/ptthread.c:0
BFD: Dwarf Error: found dwarf version '0', this reader only handles version 2 and 3 information.
    #2 0x2b994b0d48b7 in PR_CreateThread ??:0

==4017==ABORTING

I am trying to work around that problem until somebody has a good idea on what is going on/how to debug that further.
Comment 11 Georg Koppen 2014-06-16 10:51:28 UTC
Working around it was tricky, so I started bisecting LLVM/clang/compiler-rt. The first bad revision there is r193602. It might be worth filing an upstream bug, I guess (I don't have a Google account).
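
For reference, the bisection itself is a plain binary search over revision numbers. A minimal sketch under stated assumptions: the function name and the bounds 193000/194000 are made up for illustration, and the is_bad callback stands in for "check out that revision, rebuild, and run the packaging step". Only r193602 comes from the actual bisection. The search is only valid if every revision at or after the first bad one is also bad.

```python
def first_bad_revision(good, bad, is_bad):
    """Return the first bad revision in (good, bad].

    good   -- a revision known to work
    bad    -- a later revision known to fail
    is_bad -- callback that tests one revision and returns True if it fails
    """
    if good >= bad:
        raise ValueError("good must be an earlier revision than bad")
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(mid):
            bad = mid    # failure already present at mid: look earlier
        else:
            good = mid   # mid still works: look later
    return bad


if __name__ == "__main__":
    probes = []

    def fake_is_bad(revision):
        # Stand-in for a real build plus packaging run (hypothetical).
        probes.append(revision)
        return revision >= 193602

    print(first_bad_revision(193000, 194000, fake_is_bad))
    print(len(probes), "revisions tested")
```

Each probe halves the remaining range, so a range of 1000 revisions needs about ten full builds instead of a linear walk.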
Comment 12 Kostya Serebryany 2014-06-16 11:24:28 UTC
I am not sure what your bisection of LLVM/clang/compiler-rt means
if you say that clang trunk works fine.
Comment 13 Georg Koppen 2014-06-16 11:29:31 UTC
(In reply to Kostya Serebryany from comment #12)
> I am not sure what does your bisection of LLVM/clang/compiler-rt mean
> if you say that clang trunk works fine.

There are two different issues in this bug: one got moved to bug 61475, and *there* clang trunk is working fine. BUT for the original issue mentioned in this bug, trunk is *not* working. See comment 8 above. I was referring to that comment when I talked about the bisection in my last comment. Anyway, I backed out the corresponding code parts in my GCC 4.9.0, and now everything works when I compile with GCC 4.9.0 on Ubuntu Lucid.
Comment 14 Georg Koppen 2016-05-26 11:34:20 UTC
Works for me, given that probably no one is still using Ubuntu 10.04 and Firefox ESR24. We certainly don't.