This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH] Extend -falign-FOO=N to N[,M]: the second number is max padding
- From: Andrew Pinski <pinskia at gmail dot com>
- To: Denys Vlasenko <dvlasenk at redhat dot com>
- Cc: GCC Patches <gcc-patches at gcc dot gnu dot org>
- Date: Thu, 11 Aug 2016 17:34:23 -0700
- Subject: Re: [PATCH] Extend -falign-FOO=N to N[,M]: the second number is max padding
- Authentication-results: sourceware.org; auth=none
- References: <20160811204900.15846-1-dvlasenk@redhat.com> <CA+=Sn1nqiKCj7bL-sf2dee67u1Fwp0nA_Kf=w0yTittngDn3NA@mail.gmail.com> <3812ae19-6a60-5cdc-11ee-60075eb427fb@redhat.com>
On Thu, Aug 11, 2016 at 5:19 PM, Denys Vlasenko <dvlasenk@redhat.com> wrote:
>
>
> On 08/11/2016 10:59 PM, Andrew Pinski wrote:
>>
>> On Thu, Aug 11, 2016 at 1:49 PM, Denys Vlasenko <dvlasenk@redhat.com>
>> wrote:
>>>
>>> falign-functions=N is too simplistic.
>>>
>>> Ingo Molnar ran some tests and it looks on latest x86 CPUs, 64-byte
>>> alignment
>>> runs fastest (he tried many other possibilites).
>>>
>>> However, developers are less than thrilled by the idea of a slam-dunk
>>> 64-byte
>>> aligning everything. Too much waste:
>>> On 05/20/2015 02:47 AM, Linus Torvalds wrote:
>>> > At the same time, I have to admit that I abhor a 64-byte
>>> function
>>> > alignment, when we have a fair number of functions that are
>>> (much)
>>> > smaller than that.
>>> >
>>> > Is there some way to get gcc to take the size of the function
>>> into
>>> > account? Because aligning a 16-byte or 32-byte function on a
>>> 64-byte
>>> > alignment is just criminally nasty and wasteful.
>>>
>>> This change makes it possible to align function to 64-byte boundaries
>>> *IF*
>>> this does not introduce huge amount of padding.
>>>
>>> Patch drops forced alignment to 8 if requested alignment is higher than
>>> 8:
>>> before the patch, -falign-functions=9 was generating
>>>
>>> .p2align 4,,8
>>> .p2align 3
>>>
>>> which means: "align to 16 if the skip is 8 bytes or less; else align to
>>> 8".
>>> After this change, ".p2align 3" is not emitted.
>>>
>>> It is dropped because I ultimately want to do something
>>> like -falign-functions=64,8 - IOW, I want to align functions to 64 bytes,
>>> but only if that generates padding of less than 8 bytes - otherwise I
>>> want
>>> *no alignment at all*. The forced ".p2align 3" interferes with that
>>> intention.
>>>
>>> Testing:
>>> tested that with -falign-functions=N (tried 8, 15, 16, 17...) the
>>> alignment
>>> directives are the same before and after the patch.
>>> Tested that -falign-functions=N,N (two equal paramenters) works exactly
>>> like -falign-functions=N.
>>
>>
>> Two things I noticed you missed:
>> 1) ChangeLog entries (this is required as per the GNU coding style)
>
>
> Is this form okay? -
This is getting there but needs a little improvement.
>
> ...
> Testing:
> tested that with -falign-functions=N (tried 8, 15, 16, 17...) the alignment
> directives are the same before and after the patch.
> Tested that -falign-functions=N,N (two equal paramenters) works exactly
> like -falign-functions=N.
Oh also maybe adding testcases would be useful, just scanning the
outputted assembly should be good enough.
>
> 2016-08-11 Denys Vlasenko <dvlasenk@redhat.com>
>
> * common.opt: Change definitions so that -falign_FOO= accepts a string.
I think this should be:
* common.opt (-falign-functions): Accept a string instead of an integer.
(falign-loops): Likewise.
....
> * config/i386/freebsd.h: Remove "If N is large, do at least 8 byte
> alignment" code.
This should be something more like:
* config/i386/freebsd.h (ASM_OUTPUT_MAX_SKIP_ALIGN): Remove "If N is
large, do at least 8 byte alignment" code.
> * config/i386/gnu-user.h: Likewise.
* config/i386/gnu-user.h (ASM_OUTPUT_MAX_SKIP_ALIGN): Likewise.
> * config/i386/iamcu.h: Likewise.
> * config/i386/openbsdelf.h: Likewise.
> * config/i386/x86-64.h: Likewise.
> * flags.h (struct target_flag_state): Add x_align_functions_max_skip
> member.
> * toplev.c (parse_N_M): New function.
> (init_alignments): Set align_FOO_log, align_FOO, align_FOO_max_skip
> from specified --align_FOO=N{,M} option
> * varasm.c (assemble_start_function): Use align_functions_max_skip
> instead of align_functions - 1.
>
> Index: gcc/common.opt
> ===================================================================
> ....
>
>
>> 2) Documentation changes.
You did not add this or you doing this in a separate patch now?
The current documentation is done in doc/invoke.texi:
@item -falign-functions
@itemx -falign-functions=@var{n}
@opindex falign-functions
Align the start of functions to the next power-of-two greater than
@var{n}, skipping up to @var{n} bytes. For instance,
@option{-falign-functions=32} aligns functions to the next 32-byte
boundary, but @option{-falign-functions=24} aligns to the next
32-byte boundary only if this can be done by skipping 23 bytes or less.
@option{-fno-align-functions} and @option{-falign-functions=1} are
equivalent and mean that functions are not aligned.
Some assemblers only support this flag when @var{n} is a power of two;
in that case, it is rounded up.
If @var{n} is not specified or is zero, use a machine-dependent default.
---- CUT ----
You should add your documentation there (and the rest of the -falign-* options.
>>
>>>
>>> Index: gcc/common.opt
>>> ===================================================================
>
> ...
>
>>> #define ASM_OUTPUT_MAX_SKIP_ALIGN(FILE,LOG,MAX_SKIP) \
>>> do { \
>>> if ((LOG) != 0) { \
>>> - if ((MAX_SKIP) == 0) fprintf ((FILE), "\t.p2align %d\n", (LOG)); \
>>> - else { \
>>> + if ((MAX_SKIP) == 0 || (MAX_SKIP) >= (1<<(LOG))) \
>>> + fprintf ((FILE), "\t.p2align %d\n", (LOG)); \
>>> + else \
>>> fprintf ((FILE), "\t.p2align %d,,%d\n", (LOG), (MAX_SKIP)); \
>>> - /* Make sure that we have at least 8 byte alignment if > 8 byte \
>>> - alignment is preferred. */ \
>>> - if ((LOG) > 3 \
>>> - && (1 << (LOG)) > ((MAX_SKIP) + 1) \
>>> - && (MAX_SKIP) >= 7) \
>>> - fputs ("\t.p2align 3\n", (FILE)); \
>>> - }
>>> \
>>> } \
>>> } while (0)
>>> #undef ASM_OUTPUT_MAX_SKIP_PAD
>>
>>
>> These looks like a bug fix, it most likely should be sent separately;
>> maybe even first.
>
>
> ok
>
>> Also can you make sure the generic ASM_OUTPUT_MAX_SKIP_ALIGN is done
>> correctly?
>
>
> If by generic you mean "non-arch-specific", the situation is as follows:
> if arch does not define ASM_OUTPUT_MAX_SKIP_ALIGN(file, align, max_skip),
> then
> ASM_OUTPUT_ALIGN_WITH_NOP(file, align) or
> ASM_OUTPUT_ALIGN (file, align) macros are used.
> As you se from their arguments, those macros don't take "max_skip" argument.
> IOW: even in existing code, they ignore max_skip, and always pad to the
> specified alignment.
> Whether this is considered "correct" or not, is up to you. The patch
> changes nothing for those arches.
>
> I checked all arches which do have ASM_OUTPUT_MAX_SKIP_ALIGN.
>
> visium has the same "align at least to 8" logic like many x86 flavors,
> do you think it should be removed?
>
> rs6000, aarch64, arm use ".p2align %d,,%d" logic with no special cases,
> look good to me.
>
> rx uses ".balign %d,3,%d" and special-cases TARGET_AS100_SYNTAX,
> this arch is probably fine too.