Bug 61771 - Test failures in ASan testsuite on ARM Linux due to FP format mismatch between libasan and GCC.
Summary: Test failures in ASan testsuite on ARM Linux due to FP format mismatch betwee...
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: sanitizer (show other bugs)
Version: 5.0
: P3 normal
Target Milestone: 6.0
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2014-07-10 12:12 UTC by Maxim Ostapenko
Modified: 2017-07-07 08:48 UTC (History)
6 users (show)

See Also:
Host: x86_64-pc-linux-gnu
Target: arm-linux-gnueabi
Build:
Known to work:
Known to fail:
Last reconfirmed: 2014-07-24 00:00:00


Attachments
heap-overflow-1 output (799 bytes, text/plain)
2014-07-10 12:12 UTC, Maxim Ostapenko
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Maxim Ostapenko 2014-07-10 12:12:42 UTC
Created attachment 33101 [details]
heap-overflow-1 output

I see some regressions (e.g. heap-overflow-1) in Asan testsuite on ARM Linux. It seems that libsanitizer couldn't unwind stack from __interceptor_malloc() function by using StackTrace::FastUnwindStack. Same test works fine with Clang on ARM.

After some investigation, I noticed, that StackTrace::FastUnwindStack reads fp and lr from wrong stack slot. I suppose this related with different
code producing by GCC and Clang in __interceptor_malloc() prologue.

This is a small reprocase:

extern int func2(int i);
int func(int i) {
  return func2(i);
}

Clang builds such code in func prologue:

.save   {r11, lr}
push    {r11, lr}
.setfp  r11, sp
 mov r11, sp

GCC builds another one:

.save {fp, lr}
str fp, [sp, #-8]!
str lr, [sp, #4]
.setfp fp, sp, #4
add fp, sp, #4

As we can see, the LLVM code results in the frame pointer pointing to just after the pushed R11(fp) register on the stack, while the GCC code results in the frame pointer pointing to just after the pushed LR register on the stack (related discussion in LLVM is http://comments.gmane.org/gmane.comp.compilers.llvm.devel/69514).

This is not a bug in libsanitizer but rather an ABI mismatch between GCC and LLVM. Any ideas how to proceed with this?

-Maxim
Comment 1 Richard Earnshaw 2014-07-15 15:10:07 UTC
The ABI does not document a model for stack chains, so any use of a frame pointer is, by definition, purely private to that function.
Comment 2 Maxim Ostapenko 2014-07-15 16:12:11 UTC
So looks like fast unwinding in libsanitizer is not portable to GCC for ARM Linux target because of incompatible frame pointer value. But how is libsanitizer going to identify functions/object files compiled by GCC? Perhaps we should just disable fast unwind on ARM?

-Maxim
Comment 3 Evgeniy Stepanov 2014-07-15 16:21:10 UTC
Yes, FP on ARM is non-standard and differs in GCC and Clang implementations.
Disabling fast unwind is not really an option, as you are looking at 10x, maybe 100x slowdown (depending of the app, of course).
It should be possible to detect fp layout on the frame basis - there is a slot (don't know which one off the top of my head) that is FP in one compiler and return address in the other. Comparing its contents with the current stack limits (readily available in ASan) should do the trick.

Of course, it would be awesome if we could synchronize (and document somewhere) FP stack layout between GCC and Clang - after all, there is no strong reason to do it one way or the other.
Comment 4 Yury Gribov 2014-07-16 05:36:40 UTC
> It should be possible to detect fp layout on the frame basis -
> there is a slot (don't know which one off the top of my head)
> that is FP in one compiler and return address in the other.
> Comparing its contents with the current stack limits
> (readily available in ASan) should do the trick.

Unless program uses nested functions (and so LR may point to trampoline which is located on stack).

> Of course, it would be awesome if we could synchronize
> (and document somewhere) FP stack layout between GCC and Clang

Let me ping LLVM list for this.
Comment 5 Kostya Serebryany 2014-07-16 08:01:58 UTC
> > Of course, it would be awesome if we could synchronize
> > (and document somewhere) FP stack layout between GCC and Clang
Isn't this what ARM should be doing?

>> Perhaps we should just disable fast unwind on ARM?
You will slowdown asan to the point where Valgrind will become preferable
Comment 6 Yury Gribov 2014-07-16 08:21:24 UTC
(In reply to Kostya Serebryany from comment #5)
> >> Perhaps we should just disable fast unwind on ARM?
> You will slowdown asan to the point where Valgrind will become preferable

Unless you choose to default-disable malloc contexts on platforms that don't really support them. Anyway, Evgeny's proposal sounds like a good WAR. We'll send something upstream.
Comment 7 Ramana Radhakrishnan 2014-07-24 09:03:15 UTC
(In reply to Evgeniy Stepanov from comment #3)
> Yes, FP on ARM is non-standard and differs in GCC and Clang implementations.
> Disabling fast unwind is not really an option, as you are looking at 10x,
> maybe 100x slowdown (depending of the app, of course).
> It should be possible to detect fp layout on the frame basis - there is a
> slot (don't know which one off the top of my head) that is FP in one
> compiler and return address in the other. Comparing its contents with the
> current stack limits (readily available in ASan) should do the trick.
> 
> Of course, it would be awesome if we could synchronize (and document
> somewhere) FP stack layout between GCC and Clang - after all, there is no
> strong reason to do it one way or the other.

I think finding a fix in the run time will be better and probably more resilient across versions of GCC. In any case I think this is worthy of a work around in the sanitisers rather than in GCC itself. 

I don't know where the bugs for the sanitizer run time is tracked - so it maybe worth closing this with a link to the appropriate upstream bug report.
Comment 8 Yury Gribov 2014-07-24 10:07:18 UTC
(In reply to Ramana Radhakrishnan from comment #7)
> I think finding a fix in the run time will be better and probably more
> resilient across versions of GCC. In any case I think this is worthy of a
> work around in the sanitisers rather than in GCC itself. 

Ok, we are working on a libsanitizer workaround for this.

> I don't know where the bugs for the sanitizer run time is tracked - so it
> maybe worth closing this with a link to the appropriate upstream bug report.

AFAIK we keep bugs open until they get fixed with next merge from LLVM.(In reply to Ramana Radhakrishnan from comment #7)
Comment 9 Ramana Radhakrishnan 2014-07-24 12:55:56 UTC
(In reply to Yury Gribov from comment #8)
> (In reply to Ramana Radhakrishnan from comment #7)
> > I think finding a fix in the run time will be better and probably more
> > resilient across versions of GCC. In any case I think this is worthy of a
> > work around in the sanitisers rather than in GCC itself. 
> 
> Ok, we are working on a libsanitizer workaround for this.
> 
> > I don't know where the bugs for the sanitizer run time is tracked - so it
> > maybe worth closing this with a link to the appropriate upstream bug report.
> 
> AFAIK we keep bugs open until they get fixed with next merge from LLVM.(In
> reply to Ramana Radhakrishnan from comment #7)

That's ok - I didn't know how you handled issues for the sanitizers when you were working across the 2 projects.

Anyway - confirmed.
Comment 10 Yuri Gribov 2017-07-07 08:09:35 UTC
This should be fixed since r229111 (and 229115). Close?
Comment 11 Maxim Ostapenko 2017-07-07 08:35:08 UTC
Yes, fixed.
Comment 12 Ramana Radhakrishnan 2017-07-07 08:48:37 UTC
(In reply to Maxim Ostapenko from comment #11)
> Yes, fixed.

Fixed for GCC 6.0 I suspect.