This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [patch] Loop distribution for single nested loops
- From: Diego Novillo <dnovillo at google dot com>
- To: Sebastian Pop <sebpop at gmail dot com>
- Cc: "GCC Patches" <gcc-patches at gcc dot gnu dot org>
- Date: Wed, 3 Oct 2007 08:49:11 -0400
- Subject: Re: [patch] Loop distribution for single nested loops
- References: <cb9d34b20709121139u576b817djbd49dad9dd79eabd@mail.gmail.com>
On 12-Sep-07, at 14:39 , Sebastian Pop wrote:
> I ran into several problems while implementing this pass, and
> after having discussed with Richi here at the SuSE conf, we concluded
> that there should be a better solution for the following code in the
> tree-vectorizer.c, where we reset the pending stmts that otherwise
> cause memory corruption.
>
> redirect_edge_and_branch_force (e, new_loop->header);
> PENDING_STMT (e) = NULL;
>
> The code that builds the stmt list in the pending stmts is in
> ssa_redirect_edge. The code from this pending list is supposed to
> be inserted later by flush_pending_stmts, but in this particular case
> there are more phi nodes than arguments in the pending stmt list,
> that causes an ICE if flush is called.
I think Aldy was having trouble with this code when implementing
a replacement for the code that uses PENDING_STMT to store SSA
name mappings instead of statements. Aldy?
> The patch passes the bootstrap stage1 on i686-linux.
> Okay for trunk if we fix the remaining fails?
I like this pass. Would it make sense to enable this pass by
default? Or enable it only with the vectorizer/parallelizer? I
think this is something we could at least enable at -O3.
Before comitting the pass, I think it would be very useful to
enable it by default at -O2 and do a full bootstrap/test cycle,
that will help you shake the most obvious bugs.
Once that's done, I would enable it at -O3 by default.
> +/* Returns true if PARTITION contains a computation that can be
> + splitted out. */
s/splitted/split/
> + for (i = 0; VEC_iterate (rdgc, components, i, x); i++)
> + if (x->num == c)
> + return false;
How about keeping a pointer-set/bitmap/hashtable as index on the
side?
> +#if 0
> +/* Check that PRODUCER and CONSUMERS are in LOOP. */
No #if0 code, please.
> +
> + rdg = build_rdg (loop);
This is more of a comment about build_rdg() than this code. The
creation of the graph seems slow. When you build the rdg edges,
wouldn't it be quicker to have the vertices of the graph in a
pointer map instead of having to resort to the linear scan?
I'm referring to tree-data-ref.c:find_vertex_for_stmt.