This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] Restrict LOOP_ALIGN to loop headers only.


On 7/11/19 11:42 AM, Richard Biener wrote:
> On Wed, Jul 10, 2019 at 5:52 PM Richard Biener
> <richard.guenther@gmail.com> wrote:
>>
>> On July 10, 2019 2:11:17 PM GMT+02:00, Michael Matz <matz@suse.de> wrote:
>>> Hi,
>>>
>>> On Tue, 9 Jul 2019, Richard Biener wrote:
>>>
>>>>> The basic block index is not a DFS index, so no, that's not a test
>>> for
>>>>> backedge.
>>>>
>>>> I think in CFG RTL mode the BB index designates the order of the BBs
>>> in
>>>> the object file? So this is a way to identify backwards jumps?
>>>
>>> Even if it means a backwards jump (and that's not always the case, the
>>> insns are emitted by following the NEXT_INSN links, without a CFG, and
>>> that all happens after machine-dependend reorg, and going out of cfg
>>> layout might link insn together even from high index BBs to low index
>>> BBs
>>> (e.g. because of fall-through)), that's still not a backedge in the
>>> general case.  If a heuristic is enough here it might be okay, though.
>>>
>>> OTOH, as here a CFG still exists, why not simply rely on a proper DFS
>>> marking backedges?
>>
>> Because proper backedges is not what we want here, see honzas example.
>>
>> So I'm second-guessing why we have different LOOP_ALIGN and when it makes sense to apply.
> 
> So docs have
> 
> @defmac JUMP_ALIGN (@var{label})
> The alignment (log base 2) to put in front of @var{label}, which is
> a common destination of jumps and has no fallthru incoming edge.
> 
> ...
> 
> @defmac LOOP_ALIGN (@var{label})
> The alignment (log base 2) to put in front of @var{label} that heads
> a frequently executed basic block (usually the header of a loop).
> 
> so I would expect the alignment pass to have
> 
>       if ( (branch_count > count_threshold
>               || (bb->count > bb->prev_bb->count.apply_scale (10, 1)
>                   && (bb->prev_bb->count
>                       <= ENTRY_BLOCK_PTR_FOR_FN (cfun)
>                            ->count.apply_scale (1, 2)))))
>         {
>           align_flags alignment = has_fallthru ? JUMP_ALIGN (label) :
> LOOP_ALIGN (label);
>           if (dump_file)
>             fprintf (dump_file, "  jump alignment added.\n");
>           max_alignment = align_flags::max (max_alignment, alignment);
>         }

Sorry for the delay.

> 
> instead of the two different conditions?  Or the JUMP_ALIGN case
> computing "common destination" instead of "frequently executed"
> somehow but I think it makes sense that those are actually the same
> here (frequently executed).  It oddly looks at prev_bb and is not
> guarded with optimize_bb_for_speed_p at all so this all is
> full of heuristics and changing anything here just based on x86
> measurements is surely going to cause issues for targets more
> sensitive to (mis-)alignment.

I like you patch, it's a rapid simplification of the code which
we have there.

I'm sending LNT results before and after the revision. It show
there are quite some significant changes. lbm is a known issue
where we are lucky with the patch.

Thoughts?
Martin

> 
> Richard.
> 
>> Richard.
>>
>>>
>>> Ciao,
>>> Michael.
>>

>From b5cad0c2bdc611aedf32a839ead1774fd82fdad4 Mon Sep 17 00:00:00 2001
From: Martin Liska <mliska@suse.cz>
Date: Tue, 6 Aug 2019 18:09:25 +0200
Subject: [PATCH] Simplify alignment.

---
 gcc/final.c | 32 ++++++++------------------------
 1 file changed, 8 insertions(+), 24 deletions(-)

diff --git a/gcc/final.c b/gcc/final.c
index fefc4874b24..b33dfc6646e 100644
--- a/gcc/final.c
+++ b/gcc/final.c
@@ -722,32 +722,16 @@ compute_alignments (void)
 	 than the predecessor and the predecessor is likely to not be executed
 	 when function is called.  */
 
-      if (!has_fallthru
-	  && (branch_count > count_threshold
-	      || (bb->count > bb->prev_bb->count.apply_scale (10, 1)
-		  && (bb->prev_bb->count
-		      <= ENTRY_BLOCK_PTR_FOR_FN (cfun)
-			   ->count.apply_scale (1, 2)))))
+      if (branch_count > count_threshold
+	  || (bb->count > bb->prev_bb->count.apply_scale (10, 1)
+	      && (bb->prev_bb->count
+		  <= ENTRY_BLOCK_PTR_FOR_FN (cfun)->count.apply_scale (1, 2))))
 	{
-	  align_flags alignment = JUMP_ALIGN (label);
+	  align_flags alignment
+	    = has_fallthru ? JUMP_ALIGN (label) : LOOP_ALIGN (label);
 	  if (dump_file)
-	    fprintf (dump_file, "  jump alignment added.\n");
-	  max_alignment = align_flags::max (max_alignment, alignment);
-	}
-      /* In case block is frequent and reached mostly by non-fallthru edge,
-	 align it.  It is most likely a first block of loop.  */
-      if (has_fallthru
-	  && !(single_succ_p (bb)
-	       && single_succ (bb) == EXIT_BLOCK_PTR_FOR_FN (cfun))
-	  && optimize_bb_for_speed_p (bb)
-	  && branch_count + fallthru_count > count_threshold
-	  && (branch_count
-	      > fallthru_count.apply_scale
-		    (PARAM_VALUE (PARAM_ALIGN_LOOP_ITERATIONS), 1)))
-	{
-	  align_flags alignment = LOOP_ALIGN (label);
-	  if (dump_file)
-	    fprintf (dump_file, "  internal loop alignment added.\n");
+	    fprintf (dump_file, "  %s alignment added.\n",
+		     has_fallthru ? "loop" : "internal loop");
 	  max_alignment = align_flags::max (max_alignment, alignment);
 	}
       LABEL_TO_ALIGNMENT (label) = max_alignment;
-- 
2.22.0

Attachment: lnt-loop-alignment.pdf.bz2
Description: application/bzip


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]