82158 – _Noreturn functions that do return clobber caller's registers on ARM32 (but not other arches)

Bug 82158 - _Noreturn functions that do return clobber caller's registers on ARM32 (but not other arches)

Summary: _Noreturn functions that do return clobber caller's registers on ARM32 (but n...

Status:	RESOLVED WONTFIX

Alias:	None

Product:	gcc
Classification:	Unclassified
Component:	target (show other bugs)
Version:	7.1.0

Importance:	P3 normal
Target Milestone:	---
Assignee:	Not yet assigned to anyone

URL:	https://godbolt.org/g/GhW4b8
Keywords:	wrong-code

Depends on:
Blocks:

Reported:	2017-09-09 06:24 UTC by Peter Cordes
Modified:	2017-10-20 07:36 UTC (History)
CC List:	2 users (show)

See Also:
Host:
Target:	arm--*
Build:
Known to work:
Known to fail:
Last reconfirmed:

Attachments
Add an attachment (proposed patch, testcase, etc.)

Note You need to log in before you can comment on or make changes to this bug.

Description Peter Cordes 2017-09-09 06:24:26 UTC

ARM32 clobbers a call-preserved register in a function which does actually return.  (It was declared _Noreturn, but gcc chooses to make a function which returns instead of crashing).

Not sure if this is a feature or bug (came up in https://stackoverflow.com/a/45982153/224132).  Probably a bug, since saving registers in a noreturn function is potentially useful for backtraces (and gcc avoids tailcall with noreturn for that reason), even if exceptions are disabled so stack-unwinding doesn't need them.  And this is only happening on ARM32, not the others I looked at: ARM64, MIPS, MIPS64, PowerPC64, MSP430, x86 -m32, or x86 -m64.  (But ARM's multi-operand-at-once POP is probably handled specially, so that could explain the difference...)

Anyway, in a function declared noreturn which actually *does* return, ARM32 still does the optimization of clobbering "call-preserved" even though it also generates code to return instead of just falling off the end of the function.  This is C undefined behaviour so it's legal, but is this desirable?  gcc warns about it (even without -Wall), but it's so likely to cause hard-to-debug problems that I'd suggest erroring by default, or not doing the optimization.


// https://godbolt.org/g/GhW4b8 for gcc6.3, also tested locally with gcc7.1

void ext(void);

ATTRIBUTE
void foo(int *p, int y) {
    ext();
    *p = y;   // then use args that had to survive a call
}

On ARM32 with gcc7.1 -O3 -DATTRIBUTE=_Noreturn produces this:

7 : <source>:7:1: warning: 'noreturn' function does return

foo:
  @ Function supports interworking.
  @ Volatile: function does not return.               # This line not present without _Noreturn
  @ args = 0, pretend = 0, frame = 0
  @ frame_needed = 0, uses_anonymous_args = 0
  push {r4, lr}
  mov r5, r1
  mov r4, r0
  bl ext
  str r5, [r4]
  pop {r4, lr}
  bx lr


With ATTRIBUTE= (empty), we get push/pop {r4, r5, r6, lr} and things are otherwise the same.  See https://godbolt.org/g/hYnrrD for a diff.

Comment 1 Peter Cordes 2017-09-09 06:27:52 UTC

Related: bug 55747 describes why gcc keeps the `push {r4, lr}` in the _Noreturn function: backtraces.

Comment 2 jsm-csl@polyomino.org.uk 2017-09-11 16:53:39 UTC

Falling off a noreturn function sounds like it could be another case to 
insert __builtin_trap (), as we do in various cases of undefined behavior.

Comment 3 Peter Cordes 2017-09-14 23:17:14 UTC

(In reply to joseph@codesourcery.com from comment #2)
> Falling off a noreturn function sounds like it could be another case to 
> insert __builtin_trap (), as we do in various cases of undefined behavior.

gcc has had a `-mabort-on-noreturn` option for ARM32 for a long time, but it's not enabled by default.

I'm still not sure if clobbering r5 when the function really doesn't return is a feature (which gcc should be doing for other targets) or a bug.

It clearly breaks stack unwinding of call-preserved register values with x86-64 style .eh_frame metadata; does this matter for exceptions from callees of the noreturn function, or only for debugging?

Anyway, gcc should either produce a compile-time error, call __builtin_trap() or abort(), or properly restore all call-preserved registers when it returns.  There should be no set of options that lets it make bad code like this, even with a warning.

Comment 4 Ramana Radhakrishnan 2017-09-15 15:34:39 UTC

(In reply to Peter Cordes from comment #3)
> (In reply to joseph@codesourcery.com from comment #2)
> > Falling off a noreturn function sounds like it could be another case to 
> > insert __builtin_trap (), as we do in various cases of undefined behavior.
> 
> gcc has had a `-mabort-on-noreturn` option for ARM32 for a long time, but
> it's not enabled by default.
> 
> I'm still not sure if clobbering r5 when the function really doesn't return
> is a feature (which gcc should be doing for other targets) or a bug.

It's a "feature" - if the function really doesn't return, then there is no real requirement to save and restore all callee-saved registers. 

A deliberate choice when computing the save register mask - all that we are providing is an ability to backtrace out of such cases - anything more is just broken and sort of defeats the purpose of noreturn functions in the first place.

I can see that other backends have similar techniques - x86_64 appears to trash %esi in the experiments I did and it does look like %esi is a callee saved register.

> 
> It clearly breaks stack unwinding of call-preserved register values with
> x86-64 style .eh_frame metadata; does this matter for exceptions from
> callees of the noreturn function, or only for debugging?

Arm doesn't turn on exception tables by default very unlike x86_64 - if you want eh information from C then use -fexceptions on the command line and I see that it all works just fine. 

> 
> Anyway, gcc should either produce a compile-time error, call
> __builtin_trap() or abort(), or properly restore all call-preserved
> registers when it returns.  There should be no set of options that lets it
> make bad code like this, even with a warning.


It does produce a warning , you could use -Werror :) ? We could look to put out a __builtin_trap instead of a call to abort in the backend - As you say -mabort-on-noreturn has existed since time immemorial , maybe we have to replace that with a trap instruction and discuss whether to turn it on by default.

Comment 5 Peter Cordes 2017-09-16 09:40:19 UTC

(In reply to Ramana Radhakrishnan from comment #4)
> It's a "feature" - if the function really doesn't return, then there is no
> real requirement to save and restore all callee-saved registers. 
>
> A deliberate choice when computing the save register mask - all that we are
> providing is an ability to backtrace out of such cases

 That's what I thought; just be able to print backtraces.  Good point about -fno-exceptions.  I forgot I was building as C in the first place. :P


> I can see that other backends have similar techniques - x86_64 appears to
> trash %esi in the experiments I did and it does look like %esi is a callee
> saved register.

No, %esi is call-clobbered in x86-64 (Windows and System V).  It's call-preserved in x86-32.  (-m32)

I checked again with -fno-exceptions, and none of the other backends on godbolt do this.  I constructed a better test-case (https://godbolt.org/g/pAgsnA) that makes it easier to compare code-gen for a function that returns (without using _Noreturn) vs. a _Noreturn function that calls done().  So even if you aren't sure of the calling convention on a target, you can look for fewer store instructions, or lack of them before a reg-reg move.

In all cases except ARM32, gcc stores two regs before moving arg regs into them.

So there's also a missed-optimization for everything except ARM32, for functions that really don't return.  I suppose that's a separate bug that I should open.

Comment 6 Ramana Radhakrishnan 2017-09-20 09:31:33 UTC

(In reply to Peter Cordes from comment #5)
> (In reply to Ramana Radhakrishnan from comment #4)
> > It's a "feature" - if the function really doesn't return, then there is no
> > real requirement to save and restore all callee-saved registers. 
> >
> > A deliberate choice when computing the save register mask - all that we are
> > providing is an ability to backtrace out of such cases
> 
>  That's what I thought; just be able to print backtraces.  Good point about
> -fno-exceptions.  I forgot I was building as C in the first place. :P

Yes and that's why I'm closing this as a WONTFIX.

> 
> 
> > I can see that other backends have similar techniques - x86_64 appears to
> > trash %esi in the experiments I did and it does look like %esi is a callee
> > saved register.
> 
> No, %esi is call-clobbered in x86-64 (Windows and System V).  It's
> call-preserved in x86-32.  (-m32)
> 
> I checked again with -fno-exceptions, and none of the other backends on
> godbolt do this.  I constructed a better test-case
> (https://godbolt.org/g/pAgsnA) that makes it easier to compare code-gen for
> a function that returns (without using _Noreturn) vs. a _Noreturn function
> that calls done().  So even if you aren't sure of the calling convention on
> a target, you can look for fewer store instructions, or lack of them before
> a reg-reg move.
> 
> In all cases except ARM32, gcc stores two regs before moving arg regs into
> them.
> 
> So there's also a missed-optimization for everything except ARM32, for
> functions that really don't return.  I suppose that's a separate bug that I
> should open.

Probably but again this is backend specific. I'm going to close this off

Comment 7 Peter Cordes 2017-09-21 03:36:38 UTC

(In reply to Ramana Radhakrishnan from comment #6)
> (In reply to Peter Cordes from comment #5)
> >  That's what I thought; just be able to print backtraces.  Good point about
> > -fno-exceptions.  I forgot I was building as C in the first place. :P
> 
> Yes and that's why I'm closing this as a WONTFIX.

Hang on, there's still a bug here for _Noreturn functions that *do* return.

That discussion established that the current behaviour is correct for actual noreturn functions (thank you), but that's *not* what this bug report is about.  It's about this function:

void ext(void);
_Noreturn void foo(int *p, int y) {
    ext();
    *p = y;   // then use args that had to survive a call
    return;
}

gcc for ARM32 has a quality-of-implementation bug here: it only warns, and then makes code that silently clobbers the caller's R5.

It's UB so the standard allows this, but it's definitely not nice, and can cause problems that are hard to trace back to this function.

The valid options are:
* preserve the caller's registers as if _Noreturn hadn't been specified
* abort / trap at the end of the function instead of returning.  (Enable -mabort-on-noreturn by default?)
* compile-time error.

Comment 8 Peter Cordes 2017-09-21 03:39:51 UTC

reopening

Comment 9 Jakub Jelinek 2017-09-21 09:57:00 UTC

None of the above options is IMHO acceptable.
This is UB like any other.

What we could add is -fsanitize=noreturn that would add runtime instrumentation that noreturn functions don't return, but it certainly shouldn't be the default, it can be included in -fsanitize=undefined to catch for that UB like any other UB it catches.
Compile time error is undesirable, it isn't an error if a noreturn function could return, the error is if it does return at runtime, which you really can't prove.  Either the function isn't called in the program, or say the return is only conditional and that path doesn't ever happen in a valid program, or say there are calls in the noreturn function that just throw exceptions or loop forever or abort or don't really return some other way.

Comment 10 Peter Cordes 2017-09-21 11:55:41 UTC

(In reply to Jakub Jelinek from comment #9)
> None of the above options is IMHO acceptable.
> This is UB like any other.

I still think it's a quality-of-implementation bug that could be fixed without downsides for conforming programs by emitting an illegal instruction instead of bx lr.  (Thanks for pointing out some cases I hadn't considered, BTW.  That narrows down the possible good solutions.)

IMO corrupting registers while otherwise working normally is really dangerous.  That could appear to work in a unit test but fail sometimes with a different caller, maybe even causing security problems if the clobbered register was only used in an error-handling case.

If a fix would take too much work to implement, then I'm fine with leaving this as WONTFIX; there is a warning enabled by default for cases where gcc is sure that a noreturn function *does* return.  I disagree about INVALID, though.

I'm imagining a codebase where a stray _Noreturn attribute got attached to a function that was never intended to be _Noreturn.

A buggy _Noreturn function that has a combination of inputs that does return would always be problematic regardless of clobbering registers, and I guess that's why you're thinking about sanitize?

> What we could add is -fsanitize=noreturn that would add runtime
> instrumentation

I'm not proposing anything that heavy, just an illegal instruction instead of a return, or simply no instruction at all.  In a _Noreturn function (that has clobbered call-preserved registers), I think gcc should never emit instructions that return.

As you point out, even if gcc can't prove whether that path is/isn't taken, a correct program *won't* take it, so an illegal instruction is a good thing.  (For performance, an illegal instruction can stop incorrect speculation down a wrong path after a mispredicted branch, potentially saving useless prefetch of code/data and speculative page walks).

There is precedent for trapping with illegal instructions on UB:

int* address_of_local(int v) { return &v;  }
int foo() {
    int* p = address_of_local(4);
    return *p;
}
   // gcc4.x returns 4
   // x86-64 gcc5 and later.  https://godbolt.org/g/W5vUiA

foo():
        movl    0, %eax       # load from absolute address 0
        ud2                   # undefined 2-byte instruction: future-proof way to raise #UD

Falling through into whatever's next would also be a better option, since it saves code size instead of generating instructions that a correct program will never execute.  (gcc does that for foo() on ARM after trying a NULL-pointer load.  ARM64 gcc uses brk #1000 after trying a NULL-pointer load.)

If the block that would return isn't at the end of the function, then the fall-through would be into another block of the current function.  This is potentially hard to debug, so probably an illegal instruction is a good idea, maybe with an option to omit it.

Anything in a basic block that returns can be assumed not to be executed, so the example from the original report could be compiled to:

  push {r4, lr}
     #  mov r5, r1    # optionally optimize these out, because we know
     #  mov r4, r0    # the code that uses them can't be reached.
  bl ext
   # don't emit the store
   # or the pop / bx lr
   # Fall through into padding.
   # Ideally pad with illegal instructions instead of NOPs

Hmm, unless the store faults, and that's what made this function not return.  But we can definitely replace the pop/bx with an illegal instruction or a fall-through, because executing them is guaranteed to be the wrong thing.

Making functions that almost works but violate the ABI seems like the worst possible way to handle this corner case.  Crashing is better than silent corruption when there's no correct way to continue execution, right?


> Compile time error is undesirable

That's what I thought, too.  -Werror exists for users that want that.

I only mentioned it for completeness, but good point that it's not an option even when gcc can prove that a function returns, because it might never be called. 

> , or say the
> return is only conditional and that path doesn't ever happen in a valid
> program, or say there are calls in the noreturn function that just throw
> exceptions or loop forever or abort or don't really return some other way.

That's a good point.  We don't want to save extra registers that a correct program doesn't need saved, so ignoring the noreturn attribute and emitting code to save/restore is not a good solution for the general case.

Illegal instruction or fall through into other code is left as the only solution I think is really good.

Comment 11 Peter Cordes 2017-09-26 15:05:10 UTC

Flagging this WONTFIX instead of INVALID, since I'm still convinced gcc could do better.  What I'm proposing wouldn't be any worse than breaking the ABI on UB.  I guess unsafe code-gen for something that gcc warns about by default is not too bad, though.

I guess I should file a separate missed-optimization bug about not eliminating the code from a basic block that returns in a noreturn function.

See also https://stackoverflow.com/questions/45981545/why-does-noreturn-function-return/46407858#46407858 (description of the possible optimizations).

Comment 12 Jakub Jelinek 2017-10-20 07:36:19 UTC

Author: jakub
Date: Fri Oct 20 07:35:48 2017
New Revision: 253926

URL: https://gcc.gnu.org/viewcvs?rev=253926&root=gcc&view=rev
Log:
	PR target/82158
	* tree-cfg.c (pass_warn_function_return::execute): In noreturn
	functions when optimizing replace GIMPLE_RETURN stmts with
	calls to __builtin_unreachable ().

	* gcc.dg/tree-ssa/noreturn-1.c: New test.

Added:
    trunk/gcc/testsuite/gcc.dg/tree-ssa/noreturn-1.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/tree-cfg.c