This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [tree-tailcall] Check if function returns it's argument
- From: Richard Biener <rguenther at suse dot de>
- To: Prathamesh Kulkarni <prathamesh dot kulkarni at linaro dot org>
- Cc: Jeff Law <law at redhat dot com>, gcc Patches <gcc-patches at gcc dot gnu dot org>
- Date: Thu, 1 Dec 2016 13:10:29 +0100 (CET)
- Subject: Re: [tree-tailcall] Check if function returns it's argument
- Authentication-results: sourceware.org; auth=none
- References: <CAAgBjMkcer2PZkJNBE10eAqWt2RXT7vOp+ERmp+FL1QQhmVZGA@mail.gmail.com> <alpine.LSU.2.11.1611240933390.5294@t29.fhfr.qr> <CAAgBjM=gRR+C8GwrT6=9nHQZ3vapJKuOPsrDMHjEWmEnFCa3Rw@mail.gmail.com> <alpine.LSU.2.11.1611241317310.5294@t29.fhfr.qr> <CAAgBjM=DqX=BJXYK7D8Cdns5T5_1CyaaLc2m+jJMEXnL7ZyRFQ@mail.gmail.com> <alpine.LSU.2.11.1611241336280.5294@t29.fhfr.qr> <CAAgBjMkUBTOnvn01o+zR5AntK20VpxwpWtTZYk1_6Ahyttirmg@mail.gmail.com> <alpine.LSU.2.11.1611250905260.5294@t29.fhfr.qr> <eecbcf9c-3214-e921-248b-e0ce9359e2bb@redhat.com> <CAAgBjM=LjzkmNMnthf2PmznrCMdJgXSqp6aqLghXLafx=k9XzA@mail.gmail.com>
On Thu, 1 Dec 2016, Prathamesh Kulkarni wrote:
> On 25 November 2016 at 21:17, Jeff Law <law@redhat.com> wrote:
> > On 11/25/2016 01:07 AM, Richard Biener wrote:
> >
> >>> For the tail-call, issue should we artificially create a lhs and use that
> >>> as return value (perhaps by a separate pass before tailcall) ?
> >>>
> >>> __builtin_memcpy (a1, a2, a3);
> >>> return a1;
> >>>
> >>> gets transformed to:
> >>> _1 = __builtin_memcpy (a1, a2, a3)
> >>> return _1;
> >>>
> >>> So tail-call optimization pass would see the IL in it's expected form.
> >>
> >>
> >> As said, a RTL expert needs to chime in here. Iff then tail-call
> >> itself should do this rewrite. But if this form is required to make
> >> things work (I suppose you checked it _does_ actually work?) then
> >> we'd need to make sure later passes do not undo it. So it looks
> >> fragile to me. OTOH I seem to remember that the flags we set on
> >> GIMPLE are merely a hint to RTL expansion and the tailcalling is
> >> verified again there?
> >
> > So tail calling actually sits on the border between trees and RTL.
> > Essentially it's an expand-time decision as we use information from trees as
> > well as low level target information.
> >
> > I would not expect the former sequence to tail call. The tail calling code
> > does not know that the return value from memcpy will be a1. Thus the tail
> > calling code has to assume that it'll have to copy a1 into the return
> > register after returning from memcpy, which obviously can't be done if we
> > tail called memcpy.
> >
> > The second form is much more likely to turn into a tail call sequence
> > because the return value from memcpy will be sitting in the proper register.
> > This form out to work for most calling conventions that allow tail calls.
> >
> > We could (in theory) try and exploit the fact that memcpy returns its first
> > argument as a return value, but that would only be helpful on a target where
> > the first argument and return value use the same register. So I'd have a
> > slight preference to rewriting per Prathamesh's suggestion above since it's
> > more general.
> Thanks for the suggestion. The attached patch creates artificial lhs,
> and returns it if the function returns it's argument and that argument
> is used as return-value.
>
> eg:
> f (void * a1, void * a2, long unsigned int a3)
> {
> <bb 2> [0.0%]:
> # .MEM_5 = VDEF <.MEM_1(D)>
> __builtin_memcpy (a1_2(D), a2_3(D), a3_4(D));
> # VUSE <.MEM_5>
> return a1_2(D);
>
> }
>
> is transformed to:
> f (void * a1, void * a2, long unsigned int a3)
> {
> void * _6;
>
> <bb 2> [0.0%]:
> # .MEM_5 = VDEF <.MEM_1(D)>
> _6 = __builtin_memcpy (a1_2(D), a2_3(D), a3_4(D));
> # VUSE <.MEM_5>
> return _6;
>
> }
>
> While testing, I came across an issue with function f() defined
> intail-padding1.C:
> struct X
> {
> ~X() {}
> int n;
> char d;
> };
>
> X f()
> {
> X nrvo;
> __builtin_memset (&nrvo, 0, sizeof(X));
> return nrvo;
> }
>
> input to the pass:
> X f() ()
> {
> <bb 2> [0.0%]:
> # .MEM_3 = VDEF <.MEM_1(D)>
> __builtin_memset (nrvo_2(D), 0, 8);
> # VUSE <.MEM_3>
> return nrvo_2(D);
>
> }
>
> verify_gimple_return failed with:
> tail-padding1.C:13:1: error: invalid conversion in return statement
> }
> ^
> struct X
>
> struct X &
>
> # VUSE <.MEM_3>
> return _4;
>
> It seems the return type of function (struct X) differs with the type
> of return value (struct X&).
> Not sure how this is possible ?
You need to honor DECL_BY_REFERENCE of DECL_RESULT.
> To work around that, I guarded the transform on:
> useless_type_conversion_p (TREE_TYPE (TREE_TYPE (cfun->decl)),
> TREE_TYPE (retval)))
>
> in the patch. Does that look OK ?
>
> Bootstrap+tested on x86_64-unknown-linux-gnu with --enable-languages=all,ada.
> Cross-tested on arm*-*-*, aarch64*-*-*.
>
> Thanks,
> Prathamesh
> >
> >
> > Jeff
>
--
Richard Biener <rguenther@suse.de>
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)