This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Index Nav: | [Date Index] [Subject Index] [Author Index] [Thread Index] | |
---|---|---|
Message Nav: | [Date Prev] [Date Next] | [Thread Prev] [Thread Next] |
Other format: | [Raw text] |
On 25 November 2016 at 21:17, Jeff Law <law@redhat.com> wrote: > On 11/25/2016 01:07 AM, Richard Biener wrote: > >>> For the tail-call, issue should we artificially create a lhs and use that >>> as return value (perhaps by a separate pass before tailcall) ? >>> >>> __builtin_memcpy (a1, a2, a3); >>> return a1; >>> >>> gets transformed to: >>> _1 = __builtin_memcpy (a1, a2, a3) >>> return _1; >>> >>> So tail-call optimization pass would see the IL in it's expected form. >> >> >> As said, a RTL expert needs to chime in here. Iff then tail-call >> itself should do this rewrite. But if this form is required to make >> things work (I suppose you checked it _does_ actually work?) then >> we'd need to make sure later passes do not undo it. So it looks >> fragile to me. OTOH I seem to remember that the flags we set on >> GIMPLE are merely a hint to RTL expansion and the tailcalling is >> verified again there? > > So tail calling actually sits on the border between trees and RTL. > Essentially it's an expand-time decision as we use information from trees as > well as low level target information. > > I would not expect the former sequence to tail call. The tail calling code > does not know that the return value from memcpy will be a1. Thus the tail > calling code has to assume that it'll have to copy a1 into the return > register after returning from memcpy, which obviously can't be done if we > tail called memcpy. > > The second form is much more likely to turn into a tail call sequence > because the return value from memcpy will be sitting in the proper register. > This form out to work for most calling conventions that allow tail calls. > > We could (in theory) try and exploit the fact that memcpy returns its first > argument as a return value, but that would only be helpful on a target where > the first argument and return value use the same register. So I'd have a > slight preference to rewriting per Prathamesh's suggestion above since it's > more general. Thanks for the suggestion. The attached patch creates artificial lhs, and returns it if the function returns it's argument and that argument is used as return-value. eg: f (void * a1, void * a2, long unsigned int a3) { <bb 2> [0.0%]: # .MEM_5 = VDEF <.MEM_1(D)> __builtin_memcpy (a1_2(D), a2_3(D), a3_4(D)); # VUSE <.MEM_5> return a1_2(D); } is transformed to: f (void * a1, void * a2, long unsigned int a3) { void * _6; <bb 2> [0.0%]: # .MEM_5 = VDEF <.MEM_1(D)> _6 = __builtin_memcpy (a1_2(D), a2_3(D), a3_4(D)); # VUSE <.MEM_5> return _6; } While testing, I came across an issue with function f() defined intail-padding1.C: struct X { ~X() {} int n; char d; }; X f() { X nrvo; __builtin_memset (&nrvo, 0, sizeof(X)); return nrvo; } input to the pass: X f() () { <bb 2> [0.0%]: # .MEM_3 = VDEF <.MEM_1(D)> __builtin_memset (nrvo_2(D), 0, 8); # VUSE <.MEM_3> return nrvo_2(D); } verify_gimple_return failed with: tail-padding1.C:13:1: error: invalid conversion in return statement } ^ struct X struct X & # VUSE <.MEM_3> return _4; It seems the return type of function (struct X) differs with the type of return value (struct X&). Not sure how this is possible ? To work around that, I guarded the transform on: useless_type_conversion_p (TREE_TYPE (TREE_TYPE (cfun->decl)), TREE_TYPE (retval))) in the patch. Does that look OK ? Bootstrap+tested on x86_64-unknown-linux-gnu with --enable-languages=all,ada. Cross-tested on arm*-*-*, aarch64*-*-*. Thanks, Prathamesh > > > Jeff
Attachment:
tailcall-5.diff
Description: Text document
Index Nav: | [Date Index] [Subject Index] [Author Index] [Thread Index] | |
---|---|---|
Message Nav: | [Date Prev] [Date Next] | [Thread Prev] [Thread Next] |