On both powerpc64 and powerpc64le, the c-c++-common/asan/misalign-1.c shows 6 failures, all in the output pattern test. The failures are due to to the stack trace missing stack frame #1 (it only includes frame #0). It looks like the backtrace on powerpc doesn't work correctly. ================================================================= ==87868==ERROR: AddressSanitizer: unknown-crash on address 0x3fffc231eb2f at pc 0x000010000ce8 bp 0x3fffc231e9f0 sp 0x3fffc231ea10 READ of size 4 at 0x3fffc231eb2f thread T0 #0 0x10000ce4 in foo /src/gcc-5.0-git/gcc/testsuite/c-c++-common/asan/misalign-1.c:11 Address 0x3fffc231eb2f is located in stack of thread T0 at offset 175 in frame #0 0x1000086c in main /src/gcc-5.0-git/gcc/testsuite/c-c++-common/asan/misalign-1.c:28 This frame has 3 object(s): [32, 36) 'v' [96, 100) 'w' [160, 176) 't' <== Memory access at offset 175 partially overflows this variable HINT: this may be a false positive if your program uses some custom stack unwind mechanism or swapcontext (longjmp and C++ exceptions *are* supported) SUMMARY: AddressSanitizer: unknown-crash /src/gcc-5.0-git/gcc/testsuite/c-c++-common/asan/misalign-1.c:11 foo Shadow bytes around the buggy address: 0x09fff8463d10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x09fff8463d20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x09fff8463d30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x09fff8463d40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x09fff8463d50: f1 f1 f1 f1 04 f4 f4 f4 f2 f2 f2 f2 04 f4 f4 f4 =>0x09fff8463d60: f2 f2 f2 f2 00[00]f4 f4 f3 f3 f3 f3 00 00 00 00 0x09fff8463d70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x09fff8463d80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x09fff8463d90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x09fff8463da0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x09fff8463db0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 Shadow byte legend (one shadow byte represents 8 application bytes): Addressable: 00 Partially addressable: 01 02 03 04 05 06 07 Heap left redzone: fa Heap right redzone: fb Freed heap region: fd Stack left redzone: f1 Stack mid redzone: f2 Stack right redzone: f3 Stack partial redzone: f4 Stack after return: f5 Stack use after scope: f8 Global redzone: f9 Global init order: f6 Poisoned by user: f7 Container overflow: fc Array cookie: ac Intra object redzone: bb ASan internal: fe ==87868==ABORTING
The same problem is causing failures in the following tests on these targets: c-c++-common/asan/misalign-2.c c-c++-common/asan/null-deref-1.c
Created attachment 35196 [details] Proof-of-concept patch. Two problems are contributing to the failures in these tests: 1) a missing -fasynchronous-unwind-tables option; the option is necessary on powerpc*-*-*-* to generate a full stack trace 2) a bug in the backtrace_qsort function introduced in r208403 that makes the algorithm unstable (see also http://gcc.gnu.org/ml/gcc-patches/2014-03/msg00342.html) The attached proof-of-concept patch adds the missing option mentioned in (1) and backs out the commit referenced in (2) as a proof of concept of fixing the problem. I'll try to come up with an approach that doesn't undo the performance improvement in a subsequent patch.
The bug in backtrace_qsort is actually worse than the regression introduced in r208403. There is a fundamental problem with relying on the addresses of the array elements to maintain stability. Either the algorithm needs to be replaced with a stable one like Merge Sort, or a new data member needs to be introduced into struct line to reflect their initial order. I suspect the latter alternative will be cheaper in terms of resources (i.e., less memory and faster sort time). This also means that the aspect of the bug isn't powerpc specific. It only happens to manifest in the testsuite runs on that target.
As for the -fasynchronous-unwind-tables option, this really should be handled by teaching libsanitizer to handle powerpc* fast unwinding. For the backtrace_qsort, can you cook up some short testcase that calls backtrace_qsort and sorts things incorrectly?
Created attachment 35289 [details] Test case demonstrating stability problem with backtrace_qsort. Attached is a demo program showing the stability bug in the backtrace_qsort function. The output shows the result of the current implementation (Unstable) and the expected result (Stable). A better test case wouldn't rely on the knowledge of the line_compare function and instead arrange to construct a DWARF line program with similar properties that would then cause the backtrace line problem. I suspect that would take quite a bit of effort to put together, especially if we wanted it to be reproducible across targets. I plan to work on the fast unwinding but I don't expect it to be ready in time for the 5.0 release. In the meantime, I'll post a patch to fix the test failures and maintain stability to be considered for 5.0.
(In reply to Martin Sebor from comment #5) > Created attachment 35289 [details] > Test case demonstrating stability problem with backtrace_qsort. > > Attached is a demo program showing the stability bug in the backtrace_qsort > function. The output shows the result of the current implementation > (Unstable) and the expected result (Stable). A better test case wouldn't > rely on the knowledge of the line_compare function and instead arrange to > construct a DWARF line program with similar properties that would then cause > the backtrace line problem. I suspect that would take quite a bit of effort > to put together, especially if we wanted it to be reproducible across > targets. > > I plan to work on the fast unwinding but I don't expect it to be ready in > time for the 5.0 release. In the meantime, I'll post a patch to fix the > test failures and maintain stability to be considered for 5.0. That hints at a bug in the line_compare function, what it does is just bogus. Adding the idx field to struct line sounds IMHO like the right thing, and on 64-bit arches won't even eat any extra memory because there have been 32 bits of padding.
I forgot to mention that there is yet another bug here that's complicating things. I was initially going to describe it here but since it's independent of this problem I decided to open a separate bug: pr65749. The complication is that a patch for this bug that produces the expected results on POWER (i.e., passes all sanitizer tests) breaks at least one test on x86_64 because of the incorrect computation of the PC value in libsanitizer/sanitizer_common/sanitizer_stacktrace_libcdep.cc.
Created attachment 35308 [details] Patch tested on powerp64*-*-* This patch lets the affected tests pass on powerp64*-*-* but due to bug 65749, causes regressions in the stack-overflow-1.c test where asan reports different line numbers in the stack trace than expected.
Created attachment 35360 [details] Proposed patch. The attached patch resolves all the Address Sanitizer test suite failures on powerpc64 except for those that are subject to pr65643. Tested on powerpc64*-*-*-* and x86_64.
Author: msebor Date: Fri Jun 12 00:01:50 2015 New Revision: 224402 URL: https://gcc.gnu.org/viewcvs?rev=224402&root=gcc&view=rev Log: 2015-06-11 Martin Sebor <msebor@redhat.com> PR sanitizer/65479 * dwarf.c (struct line): Add new field idx. (line_compare): Use it. (add_line): Set it. (read_line_info): Reset it. Modified: trunk/libbacktrace/ChangeLog trunk/libbacktrace/dwarf.c
Martin, can this be closed?
The sanitizer tests are still failing and this bug still hasn't been fixed. The patch in r224402 was only partial, committed on the assumption that the LLVM AddressSantitizer maintainers would approve my patch for it. Unfortunately, I never managed to convince them to approve the trivial patch there and after much back and forth I gave up (the review is here: http://reviews.llvm.org/D10065).
with gcc 4.9.3 an powerpc32 I sometimes get backtraces like this: ================================================================= ==2899==ERROR: AddressSanitizer: heap-use-after-free on address 0xb400091c at pc 0xb3800120 bp 0xbf86a270 sp 0xbf86a288 READ of size 4 at 0xb400091c thread T0 ==2899==AddressSanitizer: while reporting a bug found another one.Ignoring. #0 0xb380011c (+0x11c) #1 0xfb2aa10 in __asan_report_error (/usr/lib/gcc/powerpc-unknown-linux-gnu/4.9.3/libasan.so.1+0x65a10) #2 0xfb2bac8 in __asan_report_load4 (/usr/lib/gcc/powerpc-unknown-linux-gnu/4.9.3/libasan.so.1+0x66ac8) #3 0xf3c028c in Icn_unregisterIf (/opt/appl/cuappl04a-r27a-160209jt2//lib/libiciif.so.1+0x4c28c) 0xb400091c is located 2204 bytes inside of 2208-byte region [0xb4000080,0xb4000920) freed by thread T0 here: #0 0xf3b9ba0 in Icn_removeEntry (/opt/appl/cuappl04a-r27a-160209jt2//lib/libiciif.so.1+0x45ba0) previously allocated by thread T0 here: #0 0xf3c0870 in Icn_registerIf (/opt/appl/cuappl04a-r27a-160209jt2//lib/libiciif.so.1+0x4c870) SUMMARY: AddressSanitizer: heap-use-after-free ??:0 ?? Shadow bytes around the buggy address: 0x368000d0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd 0x368000e0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd 0x368000f0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd 0x36800100: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd 0x36800110: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd =>0x36800120: fd fd fd[fd]fa fa fa fa fa fa fa fa fa fa fa fa 0x36800130: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa 0x36800140: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa 0x36800150: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd 0x36800160: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd 0x36800170: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd Shadow byte legend (one shadow byte represents 8 application bytes): Addressable: 00 Partially addressable: 01 02 03 04 05 06 07 Heap left redzone: fa Heap right redzone: fb Freed heap region: fd Stack left redzone: f1 Stack mid redzone: f2 Stack right redzone: f3 Stack partial redzone: f4 Stack after return: f5 Stack use after scope: f8 Global redzone: f9 Global init order: f6 Poisoned by user: f7 Contiguous container OOB:fc ASan internal: fe ==2899==ABORTING This backtrace is incomplete, could this bug be the cause?
I wonder if this is just support that hasn't been updated in the GCC copy of libsanitizer. I recall fixing this bug (or one very similar to it) on the Clang side in 2015. There is some Power-specific logic in there for doing the stack unwinding. I will try to dig that information up and compare the two.
Yes, it seems likely that this is due to this patch being missing from GCC 5.3: http://reviews.llvm.org/D11552 The fix is present in trunk, so this should be fixed with a backport.
Unfortunately, that does not appear to be sufficient to solve any of the extant ASAN failures. I'm suspicious that fast unwinding is still being disabled somewhere. In any case, we need the backport; but we seem to need something else as well. Confirmed, in any case.
Unassigning myself since I'm not actively working on it anymore.
With other fixes that have been made the only thing remaining to get the 3 test cases working is the -fasynchronous-unwind-tables option. I tried this on powerpc64 LE and BE and it works fine. I also verified it doesn't break on x86. c-c++-common/asan/misalign-1.c c-c++-common/asan/misalign-2.c c-c++-common/asan/null-deref-1.c /* { dg-additional-options "-fasynchronous-unwind-tables" { target { powerpc*-*-linux* } } } */ (plus some tweaking of line numbers in expected outputs) I'll be submitting a patch for this.
Author: seurer Date: Wed Dec 21 19:09:10 2016 New Revision: 243863 URL: https://gcc.gnu.org/viewcvs?rev=243863&root=gcc&view=rev Log: [PATCH, v2, rs6000] pr65479 Add -fasynchronous-unwind-tables when the -fsanitize=address option is seen. All feedback from the earlier version has been taken into account now. This patch adds the -fasynchronous-unwind-tables option to compilations when the -fsanitize=address option is seen but not if any -fasynchronous-unwind-tables options were already specified. -fasynchronous-unwind-tables causes a full strack trace to be produced when the sanitizer detects an error. Without the full trace several of the asan test cases fail on powerpc. See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65479 for more information. Bootstrapped and tested on powerpc64le-unknown-linux-gnu, powerpc64be-unknown-linux-gnu, and x86_64-pc-linux-gnu with no regressions. Is this ok for trunk? [gcc] 2016-12-21 Bill Seurer <seurer@linux.vnet.ibm.com> PR sanitizer/65479 * config/rs6000/rs6000.c (rs6000_option_override_internal): Add -fasynchronous-unwind-tables option when -fsanitize=address is specified. Modified: trunk/gcc/ChangeLog trunk/gcc/config/rs6000/rs6000.c
Author: seurer Date: Mon Jan 9 20:45:59 2017 New Revision: 244241 URL: https://gcc.gnu.org/viewcvs?rev=244241&root=gcc&view=rev Log: 2017-01-09 Bill Seurer <seurer@linux.vnet.ibm.com> Backport from mainline 2016-12-21 Bill Seurer <seurer@linux.vnet.ibm.com> PR sanitizer/65479 * config/rs6000/rs6000.c (rs6000_option_override_internal): Add -fasynchronous-unwind-tables option when -fsanitize=address is specified. Modified: branches/gcc-6-branch/gcc/ChangeLog branches/gcc-6-branch/gcc/config/rs6000/rs6000.c
Author: seurer Date: Tue Jan 10 19:11:55 2017 New Revision: 244284 URL: https://gcc.gnu.org/viewcvs?rev=244284&root=gcc&view=rev Log: 2017-01-10 Bill Seurer <seurer@linux.vnet.ibm.com> Backport from mainline 2016-12-21 Bill Seurer <seurer@linux.vnet.ibm.com> PR sanitizer/65479 * config/rs6000/rs6000.c (rs6000_option_override_internal): Add -fasynchronous-unwind-tables option when -fsanitize=address is specified. Modified: branches/gcc-5-branch/gcc/ChangeLog branches/gcc-5-branch/gcc/config/rs6000/rs6000.c
This is now resolved in gcc 5, 6, and trunk
Fixed.
Closing as fixed.
Really closed now. :-(