64bit aligned SSE va-args save prologues

H.J. Lu hjl.tools@gmail.com
Sun May 2 13:43:00 GMT 2010


On Mon, Apr 19, 2010 at 11:24 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
> On Sat, Apr 17, 2010 at 4:50 PM, Jan Hubicka <hubicka@ucw.cz> wrote:
>> Hi,
>> this patch avoids the need to align SSE prologues to 128 bits.  This saves some stack
>> when the varargs function is a leaf (or calls only local functions).  This is not
>> a terribly common scenario, but it should help in integer-only programs
>> (e.g. the Linux kernel).
>>
>> The catch is that we need to expand the register save area.  When the stack doesn't
>> need to be 128-bit aligned, we do not need to save the whole registers, since we
>> know we will never touch them.  However, this is known only after expansion, while
>> the register save code is produced during expansion, so I delay the actual
>> expansion of the jump table until after reload.
>>
>> This produces somewhat better code.
>>
>> I also noticed that previously this all worked kind of by accident, because va_list
>> itself pushed the alignment to 128 bits: local_alignment is a bit too aggressive about
>> bumping alignments up to help SSE instructions.  In the case of va_list this is nonsense,
>> so I fixed that too.
>>
>> Bootstrapped/regtested x86_64-linux, will commit it tomorrow.
>>
>>        * i386.md (UNSPEC_SSE_PROLOGUE_SAVE_LOW): New.
>>        (sse_prologue_save_insn expander): Use new pattern.
>>        (sse_prologue_save_insn1): New pattern and splitter.
>>        (sse_prologue_save_insn): Update to deal also with 64bit aligned
>>        blocks.
>>        * i386.c (setup_incoming_varargs_64): Do not compute jump destination here.
>>        (ix86_gimplify_va_arg): Update alignment needed.
>>        (ix86_local_alignment): Do not align all local arrays
>>        to 128bit.
>
> This caused:
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43799
>
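
For reference, the kind of function the quoted patch targets might look like
the minimal sketch below: a leaf varargs function that only ever reads integer
arguments.  The name sum_ints is hypothetical and not taken from the patch or
from the bug report.  Under the x86-64 System V ABI the varargs register save
area has room for six 8-byte general-purpose registers and eight 16-byte SSE
registers (176 bytes), and the SSE part is what the prologue discussed above
saves.

```c
#include <stdarg.h>

/* Hypothetical example: a leaf varargs function (it calls no other
   function) that reads only int arguments, so the saved SSE argument
   registers are never read back through va_arg.  */
int sum_ints(int count, ...)
{
    va_list ap;
    int total = 0;

    va_start(ap, count);
    for (int i = 0; i < count; i++)
        total += va_arg(ap, int);   /* fetched from the integer save area or the stack */
    va_end(ap);

    return total;
}
```

A call such as sum_ints(3, 1, 2, 3) passes every argument in integer
registers, so this is only an illustration of the "integer-only" case
mentioned in the quoted description, not a reproducer for the PR.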

Hi Jan,

Do you have time to take a look at your x86-64 prologue change?

Thanks.

-- 
H.J.


