Bug 91459

Summary: Tail-Call Optimization is not performed when return value is assumed.
Product: gcc Reporter: mike.k
Component: tree-optimizationAssignee: Not yet assigned to anyone <unassigned>
Status: NEW ---    
Severity: normal CC: dimhen
Priority: P3 Keywords: missed-optimization
Version: 9.2.0   
Target Milestone: ---   
Host: Target:
Build: Known to work:
Known to fail: Last reconfirmed: 2019-08-16 00:00:00

Description mike.k 2019-08-15 15:50:44 UTC
In situations where a function either returns a specific value or does not return at all, GCC fails to perform tail call optimizations. This appears to occur on all GCC versions with -O1, -O2, -O3, and -Os. It occurs with both the C and C++ front-ends.


/* This function is guaranteed to only return the value '1', else it does not return.
// This is meant to emulate a function such as 'exec'.
extern int function_returns_only_1_or_doesnt_return(int, int);

int foo1(int a, int b) {
    const int result = function_returns_only_1_or_doesnt_return(a, b);
    if (result == 1) {
        return result;
    else {

int foo2(int a, int b) {
    return function_returns_only_1_or_doesnt_return(a, b);

This results in the following output for -O3 on x86-64:

foo1(int, int):
  push rax
  call function_returns_only_1_or_doesnt_return(int, int)
  mov eax, 1
  pop rdx
foo3(int, int):
  jmp function_returns_only_1_or_doesnt_return(int, int)

While the behavior is correct, the tail-call optimization is far more optimal and preserves the same semantics.

The same behavior occurs with other architectures as well, so it does not appear to be a back-end issue.
Comment 1 mike.k 2019-08-15 15:53:03 UTC
'foo3' in the assembly output should be 'foo2'. I'd changed the function name in my test code and did not update the assembly. Apologies.
Comment 2 Richard Biener 2019-08-16 07:42:11 UTC
I think this has a duplicate somewhere.  The reason is we end up with

foo1 (int a, int b)
  <bb 2> [local count: 1073741824]:
  function_returns_only_1_or_doesnt_return (a_2(D), b_3(D));
  return 1;

and thus no obvious tail-call opportunity.  Here it is EVRP replacing
result_5 with 1 in

  <bb 2> :
  result_5 = function_returns_only_1_or_doesnt_return (a_2(D), b_3(D));
  if (result_5 == 1)
    goto <bb 3>; [INV]
    goto <bb 4>; [INV]

  <bb 3> :
  // predicted unlikely by early return (on trees) predictor.
  return result_5;

  <bb 4> :
  __builtin_unreachable ();

where it would be better to transform this idiom to unconditional
return result_5 early.
Comment 3 Andrew Pinski 2019-08-16 08:33:23 UTC
Part of that might be because of:
  // predicted unlikely by early return (on trees) predictor.

That seems not true, the other side is more unlikely ...