This is the mail archive of the
mailing list for the GCC project.
Re: [PATCH] Restrict LOOP_ALIGN to loop headers only.
- From: Jan Hubicka <hubicka at ucw dot cz>
- To: Martin Liška <mliska at suse dot cz>
- Cc: gcc-patches at gcc dot gnu dot org
- Date: Tue, 9 Jul 2019 11:56:17 +0200
- Subject: Re: [PATCH] Restrict LOOP_ALIGN to loop headers only.
- References: <firstname.lastname@example.org>
> I'm suggesting to restrict LOOP_ALIGN to only loop headers. Those are the
> basic blocks for which it makes the biggest sense. I see quite some binary
> size reductions on SPEC2006 and SPEC2017. Speed numbers are also slightly
> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
> Ready to be installed?
The original idea behind the distinction between jump alignment and loop
alignment was that they serve two different purposes:
1) jump alignment is there to avoid jumping just to the end of a decode
window (if the window is aligned), which would leave the CPU stalled after
reaching the jump, and also to possibly reduce code cache pollution caused
by filling it with code that is not executed
2) loop alignment aims to fit loop in as few cache windows as possible
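
To make point 2) concrete, here is a minimal sketch in C that counts how many aligned windows a loop body touches before and after aligning its start. The helper names windows_spanned and align_up, the 16-byte window size, and the addresses in the note below are illustrative assumptions; none of this is GCC code or a statement about any particular CPU.

```c
/* Number of aligned windows of WINDOW bytes touched by a code region
   of SIZE bytes that starts at byte address START.  Hypothetical
   helper for illustration only.  */
unsigned long
windows_spanned (unsigned long start, unsigned long size,
                 unsigned long window)
{
  if (size == 0)
    return 0;
  unsigned long first = start / window;
  unsigned long last = (start + size - 1) / window;
  return last - first + 1;
}

/* Round ADDR up to the next multiple of ALIGN.  ALIGN is assumed to
   be a power of two.  */
unsigned long
align_up (unsigned long addr, unsigned long align)
{
  return (addr + align - 1) & ~(align - 1);
}
```

With an assumed 16-byte window, a 14-byte loop starting at 0x100c touches two windows, while the same loop moved up to 0x1010 touches only one. The cost is the 4 bytes of padding, which is the code size side of the trade-off.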
Now if you have a loop laid out in a way that the loop header is not the
first basic block, 2) IMO still applies. I.e.
if cond jump to loopback
So dropping loop alignment for those does not seem to make much sense
from a high-level view. We may want to have different alignment for loops
starting at the header and loops starting in the middle, but I still
preferred your patch which did bundles for loops.
Modern x86 chips are not very good testing targets for this. I guess
generic changes to alignment need to be tested on other chips too.
> 2019-07-09 Martin Liska <email@example.com>
> * final.c (compute_alignments): Apply the LOOP_ALIGN only
> to basic blocks that are loop headers.
> gcc/final.c | 1 +
> 1 file changed, 1 insertion(+)
> diff --git a/gcc/final.c b/gcc/final.c
> index fefc4874b24..ce2678da988 100644
> --- a/gcc/final.c
> +++ b/gcc/final.c
> @@ -739,6 +739,7 @@ compute_alignments (void)
> if (has_fallthru
> && !(single_succ_p (bb)
> && single_succ (bb) == EXIT_BLOCK_PTR_FOR_FN (cfun))
> + && bb->loop_father->header == bb
> && optimize_bb_for_speed_p (bb)
> && branch_count + fallthru_count > count_threshold
> && (branch_count