This is the mail archive of the
mailing list for the GCC project.
Re: Correcting transform_to_exit_first_loop + fix to PR46886
- From: Richard Guenther <richard dot guenther at gmail dot com>
- To: Razya Ladelsky <RAZYA at il dot ibm dot com>
- Cc: Zdenek Dvorak <rakdver at kam dot mff dot cuni dot cz>, gcc-patches at gcc dot gnu dot org
- Date: Fri, 20 Apr 2012 14:19:26 +0200
- Subject: Re: Correcting transform_to_exit_first_loop + fix to PR46886
- References: <OF9527B3CD.F966FDAD-ONC22579E6.002A784E-C22579E6.00429CF2@il.ibm.com>
On Fri, Apr 20, 2012 at 2:07 PM, Razya Ladelsky <RAZYA@il.ibm.com> wrote:
> This patch handles duplicating of the last iteration correctly.
> The current code always duplicates the complete "static" ?iteration from
> the entry to the latch,
> and then sets the number of iterations according to the pattern of the
> loop (according to whether the cond before the body, or the body before
> the cond).
> The correct way to go about is to not assume anything about the control
> flow of the loop, and ?duplicate the last iteration only from
> entry to the block consisting of the cond, that is the real last dynamic
> iteration that would be executed.
> The number of iterations then needs no special care for each loop pattern.
> This was actually Zdenek's original intent by duplicating the last
> This code allows us to remove the do_while cond, and gets PR46886 resolved
> (instead of the solution suggested in
> which handled the number of iterations according to the specific shape of
> the loop)
> Bootstrap and testsuite pass successfully.
> Testsuite with -ftree-parallelize-loops=4 shows only one regression which
> is uninit-17.c test for warnings, which seems
> unrelated to how the loop gets parallelized.
> I also ran spec-2006, and it showed no regressions.
That looks definitely better. A few comments:
gsi = gsi_after_labels (ex_bb);
+ exit = single_dom_exit (loop);
exit gets initialized in three places now - are all of them needed?
@@ -2187,10 +2155,9 @@ parallelize_loops (void)
|| loop_has_blocks_with_irreducible_flag (loop)
|| (loop_preheader_edge (loop)->src->flags & BB_IRREDUCIBLE_LOOP)
/* FIXME: the check for vector phi nodes could be removed. */
- || loop_has_vector_phi_nodes (loop)
+ || loop_has_vector_phi_nodes (loop))
/* FIXME: transform_to_exit_first_loop does not handle not
header-copied loops correctly - see PR46886. */
- || !do_while_loop_p (loop))
the comment should be removed, too.
+bool bb_in_region_p (basic_block bb, basic_block* bbs, unsigned n_region)
canonical formatting is
bb_in_region_p (basic_block bb, basic_block* bbs, unsigned n_region)
+ unsigned int n;
+ for (n = 0; n < n_region; n++)
+ if (bb == bbs[n])
+ return true;
+ return false;
needs a function comment. Seems to be only used in
+ target= loop;
+ for (aloop = orig_loop->inner; aloop; aloop = aloop->next)
+ if (bb_in_region_p (aloop->header, region, n_region))
+ cloop = duplicate_loop (aloop, target);
+ duplicate_subloops (aloop, cloop);
still it's not static - but it has the same name as an inline function
in sese.h with a different prototype. I suggest renaming it and making
Ok with these changes if you give Zdenek 24h to comment, too.
> 2012-04-20 ?Razya Ladelsky ?<firstname.lastname@example.org>
> ? ? ? ? ? ? ? ? PR tree-optimization/46886
> ? ? ? ? ? ? ? ? * tree-parloops.c (transform_to_exit_first_loop): Remove
> setting of number of iterations according to the loop pattern.
> ? ? ? ? ? ? ? ? Duplicate from entry to exit->src instead of loop->latch.
> ? ? ? ? ? ? ? ? (pallelize_loops): Remove the condition preventing
> do-while loops.
> ? ? ? ? ? ? ? ? * tree-cfg.c (bool bb_in_region_p): New.
> ? ? ? ? ? ? ? ? (gimple_duplicate_sese_tail): Adjust duplication of the
> the subloops.
> ? ? ? ? ? ? ? ? Adjust redirection of the duplicated iteration.
> O.K to commit?