Bug 91459 - Tail-Call Optimization is not performed when return value is assumed.
Summary: Tail-Call Optimization is not performed when return value is assumed.
Status: NEW
Alias: None
Product: gcc
Classification: Unclassified
Component: tree-optimization (show other bugs)
Version: 9.2.0
: P3 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords: missed-optimization
Depends on:
Blocks:
 
Reported: 2019-08-15 15:50 UTC by mike.k
Modified: 2019-08-17 12:34 UTC (History)
1 user (show)

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2019-08-16 00:00:00


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description mike.k 2019-08-15 15:50:44 UTC
In situations where a function either returns a specific value or does not return at all, GCC fails to perform tail call optimizations. This appears to occur on all GCC versions with -O1, -O2, -O3, and -Os. It occurs with both the C and C++ front-ends.

Observe:

/* This function is guaranteed to only return the value '1', else it does not return.
// This is meant to emulate a function such as 'exec'.
*/
extern int function_returns_only_1_or_doesnt_return(int, int);

int foo1(int a, int b) {
    const int result = function_returns_only_1_or_doesnt_return(a, b);
    if (result == 1) {
        return result;
    }
    else {
        __builtin_unreachable();
    }
}

int foo2(int a, int b) {
    return function_returns_only_1_or_doesnt_return(a, b);
}


This results in the following output for -O3 on x86-64:

foo1(int, int):
  push rax
  call function_returns_only_1_or_doesnt_return(int, int)
  mov eax, 1
  pop rdx
  ret
foo3(int, int):
  jmp function_returns_only_1_or_doesnt_return(int, int)

While the behavior is correct, the tail-call optimization is far more optimal and preserves the same semantics.

The same behavior occurs with other architectures as well, so it does not appear to be a back-end issue.
Comment 1 mike.k 2019-08-15 15:53:03 UTC
'foo3' in the assembly output should be 'foo2'. I'd changed the function name in my test code and did not update the assembly. Apologies.
Comment 2 Richard Biener 2019-08-16 07:42:11 UTC
I think this has a duplicate somewhere.  The reason is we end up with

foo1 (int a, int b)
{
  <bb 2> [local count: 1073741824]:
  function_returns_only_1_or_doesnt_return (a_2(D), b_3(D));
  return 1;

and thus no obvious tail-call opportunity.  Here it is EVRP replacing
result_5 with 1 in

  <bb 2> :
  result_5 = function_returns_only_1_or_doesnt_return (a_2(D), b_3(D));
  if (result_5 == 1)
    goto <bb 3>; [INV]
  else
    goto <bb 4>; [INV]

  <bb 3> :
  // predicted unlikely by early return (on trees) predictor.
  return result_5;

  <bb 4> :
  __builtin_unreachable ();

where it would be better to transform this idiom to unconditional
return result_5 early.
Comment 3 Andrew Pinski 2019-08-16 08:33:23 UTC
Part of that might be because of:
  // predicted unlikely by early return (on trees) predictor.

That seems not true, the other side is more unlikely ...