This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: [PATCH] Use TImode for piecewise move in 64-bit mode


On Thu, Aug 11, 2016 at 5:51 PM, H.J. Lu <hjl.tools@gmail.com> wrote:

>>>>>>>> Use TImode for piecewise move in 64-bit mode.  When vector register
>>>>>>>> is used for piecewise move, we don't increase stack_alignment_needed
>>>>>>>> since vector register spill isn't required for piecewise move.  Since
>>>>>>>> stack_realign_needed is set to true by checking stack_alignment_estimated
>>>>>>>> set by pseudo vector register usage, we also need to check
>>>>>>>> stack_realign_needed to eliminate frame pointer.
>>>>>>>
>>>>>>> Why only in 64-bit mode? We can use SSE moves also in 32-bit mode.
>>>>>>
>>>>>> I will extend it to 32-bit mode.
>>>>>
>>>>> It doesn't work in 32-bit mode due to
>>>>>
>>>>> #define MAX_FIXED_MODE_SIZE GET_MODE_BITSIZE (TARGET_64BIT ? TImode : DImode):
>>>>>
>>>>> /export/build/gnu/gcc/build-x86_64-linux/gcc/xgcc
>>>>> -B/export/build/gnu/gcc/build-x86_64-linux/gcc/ -O2
>>>>> -fno-asynchronous-unwind-tables -m32 -S -o x.s x.i
>>>>> x.i: In function ‘foo’:
>>>>> x.i:6:10: internal compiler error: in by_pieces_ninsns, at expr.c:799
>>>>>    return __builtin_mempcpy (dst, src, 32);
>>>>>           ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>>>
>>>> This happens since by_pieces_ninsns determines widest mode by calling
>>>> widest_*INT*_mode_for_size, while moves can also use vector-mode
>>>> moves. This is an infrastructure problem, and will bite you on 64bit
>>>> targets when MOVE_MAX_PIECES returns OImode or XImode size.
>>>
>>> I opened:
>>>
>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=74113
>>>
>>>> +#define MOVE_MAX_PIECES \
>>>> +  ((TARGET_64BIT \
>>>> +    && TARGET_SSE2 \
>>>> +    && TARGET_SSE_UNALIGNED_LOAD_OPTIMAL \
>>>> +    && TARGET_SSE_UNALIGNED_STORE_OPTIMAL) ? 16 : UNITS_PER_WORD)
>>>>
>>>> The above part is OK with an appropriate ??? comment, describing the
>>>> infrastructure limitation. Also, please use GET_MODE_SIZE (TImode)
>>>> instead of magic constant.
>>>>
>>>> Can you please submit the realignment patch as a separate follow-up
>>>> patch? Let's keep two issues separate.
>>>>
>>>> Uros.
>>>
>>> Here is the updated patch.  OK for trunk?
>>
>> OK, but please do not yet introduce:
>>
>> +/* No need to dynamically realign the stack here.  */
>> +/* { dg-final { scan-assembler-not "and\[^\n\r]*%\[re\]sp" } } */
>> +/* Nor use a frame pointer.  */
>> +/* { dg-final { scan-assembler-not "%\[re\]bp" } } */
>>
>> in the testcases. This should be part of a followup patch.
>
> This is what I checked in.

Playing a bit with a patched gcc, I found no stack realignment insns
in the assembly of the provided testcases. However, if
-mincoming-stack-boundary=3 is added, then no vector instructions are
generated (and also no realignment insns).
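
For reference, a minimal sketch of the kind of testcase discussed in this
thread. Only the function name foo and the __builtin_mempcpy call with a
length of 32 come from the ICE report quoted above; the signature and the
check_foo helper are assumptions added for illustration, not the testcase
that was checked in.

```c
#include <string.h>

/* Hypothetical reconstruction: the parameter types are assumed.  With a
   constant 32-byte length the copy is a candidate for piecewise move
   expansion.  */
void *
foo (void *dst, const void *src)
{
  return __builtin_mempcpy (dst, src, 32);
}

/* Hypothetical helper: returns 1 if foo copied all 32 bytes and returned
   a pointer just past the end of the destination, as mempcpy does.  */
int
check_foo (void)
{
  char src[32], dst[32];

  for (int i = 0; i < 32; i++)
    src[i] = (char) i;
  memset (dst, 0, sizeof dst);

  void *end = foo (dst, src);
  return end == (void *) (dst + 32) && memcmp (dst, src, 32) == 0;
}
```

Compiling this with -O2 -S, with and without the
-mincoming-stack-boundary=3 option mentioned above, and comparing the
generated assembly is one way to repeat the experiment.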

Uros.

