This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [tree-tailcall] Check if function returns it's argument


On 25 November 2016 at 21:17, Jeff Law <law@redhat.com> wrote:
> On 11/25/2016 01:07 AM, Richard Biener wrote:
>
>>> For the tail-call, issue should we artificially create a lhs and use that
>>> as return value (perhaps by a separate pass before tailcall) ?
>>>
>>> __builtin_memcpy (a1, a2, a3);
>>> return a1;
>>>
>>> gets transformed to:
>>> _1 = __builtin_memcpy (a1, a2, a3)
>>> return _1;
>>>
>>> So tail-call optimization pass would see the IL in it's expected form.
>>
>>
>> As said, a RTL expert needs to chime in here.  Iff then tail-call
>> itself should do this rewrite.  But if this form is required to make
>> things work (I suppose you checked it _does_ actually work?) then
>> we'd need to make sure later passes do not undo it.  So it looks
>> fragile to me.  OTOH I seem to remember that the flags we set on
>> GIMPLE are merely a hint to RTL expansion and the tailcalling is
>> verified again there?
>
> So tail calling actually sits on the border between trees and RTL.
> Essentially it's an expand-time decision as we use information from trees as
> well as low level target information.
>
> I would not expect the former sequence to tail call.  The tail calling code
> does not know that the return value from memcpy will be a1.  Thus the tail
> calling code has to assume that it'll have to copy a1 into the return
> register after returning from memcpy, which obviously can't be done if we
> tail called memcpy.
>
> The second form is much more likely to turn into a tail call sequence
> because the return value from memcpy will be sitting in the proper register.
> This form out to work for most calling conventions that allow tail calls.
>
> We could (in theory) try and exploit the fact that memcpy returns its first
> argument as a return value, but that would only be helpful on a target where
> the first argument and return value use the same register. So I'd have a
> slight preference to rewriting per Prathamesh's suggestion above since it's
> more general.
Thanks for the suggestion. The attached patch creates artificial lhs,
and returns it if the function returns it's argument and that argument
is used as return-value.

eg:
f (void * a1, void * a2, long unsigned int a3)
{
  <bb 2> [0.0%]:
  # .MEM_5 = VDEF <.MEM_1(D)>
  __builtin_memcpy (a1_2(D), a2_3(D), a3_4(D));
  # VUSE <.MEM_5>
  return a1_2(D);

}

is transformed to:
f (void * a1, void * a2, long unsigned int a3)
{
  void * _6;

  <bb 2> [0.0%]:
  # .MEM_5 = VDEF <.MEM_1(D)>
  _6 = __builtin_memcpy (a1_2(D), a2_3(D), a3_4(D));
  # VUSE <.MEM_5>
  return _6;

}

While testing, I came across an issue with function f() defined
intail-padding1.C:
struct X
{
  ~X() {}
  int n;
  char d;
};

X f()
{
  X nrvo;
  __builtin_memset (&nrvo, 0, sizeof(X));
  return nrvo;
}

input to the pass:
X f() ()
{
  <bb 2> [0.0%]:
  # .MEM_3 = VDEF <.MEM_1(D)>
  __builtin_memset (nrvo_2(D), 0, 8);
  # VUSE <.MEM_3>
  return nrvo_2(D);

}

verify_gimple_return failed with:
tail-padding1.C:13:1: error: invalid conversion in return statement
 }
 ^
struct X

struct X &

# VUSE <.MEM_3>
return _4;

It seems the return type of function (struct X) differs with the type
of return value (struct X&).
Not sure how this is possible ?
To work around that, I guarded the transform on:
useless_type_conversion_p (TREE_TYPE (TREE_TYPE (cfun->decl)),
                                             TREE_TYPE (retval)))

in the patch. Does that look OK ?

Bootstrap+tested on x86_64-unknown-linux-gnu with --enable-languages=all,ada.
Cross-tested on arm*-*-*, aarch64*-*-*.

Thanks,
Prathamesh
>
>
> Jeff

Attachment: tailcall-5.diff
Description: Text document


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]