This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [GRAPHITE, PATCH] Loop unroll and jam optimization
- From: Sven Verdoolaege <skimo at kotnet dot org>
- To: Mircea Namolaru <mircea dot namolaru at inria dot fr>
- Cc: gcc-patches at gcc dot gnu dot org, Tobias Grosser <tobias at grosser dot es>, Richard Biener <richard dot guenther at gmail dot com>, Albert Cohen <albert dot cohen at inria dot fr>
- Date: Sat, 8 Nov 2014 10:23:08 +0100
- Subject: Re: [GRAPHITE, PATCH] Loop unroll and jam optimization
- Authentication-results: sourceware.org; auth=none
- References: <20141104151755 dot GA13200 at physik dot fu-berlin dot de> <545B5004 dot 6090608 at inria dot fr> <545B5637 dot 1070109 at grosser dot es> <545B5A1C dot 8060307 at inria dot fr> <545B5B52 dot 2050403 at grosser dot es> <545B6A4E dot 7090308 at inria dot fr> <1413233506 dot 9233395 dot 1415278366408 dot JavaMail dot zimbra at inria dot fr> <545B6F73 dot 1010809 at inria dot fr> <158837903 dot 10030807 dot 1415403125838 dot JavaMail dot zimbra at inria dot fr>
On Sat, Nov 08, 2014 at 12:32:05AM +0100, Mircea Namolaru wrote:
> Hello,
>
> This is the patch for unroll and jam optimizations. It is based on the
> code written by Tobias from graphite-optimize-isl.c (the code was
> unreachable till now) extended and enabled it via a new option -floop-unroll-jam.
> The patch takes advantage of the new isl based code generator introduced recently
> in GCC (in fact of the possible options for building the AST).
>
> The code generated for this optimization in the case of non-constant loop bounds
> initially looks as below. This is not very useful because the standard GCC
> unrolling don't succeed to unroll the most inner loop.
>
> ISL AST generated by ISL:
> for (int c0 = 0; c0 < HEIGHT; c0 += 4)
> for (int c1 = 0; c1 < LENGTH - 3; c1 += 1)
> for (int c2 = c0; c2 <= min(HEIGHT - 1, c0 + 3); c2 += 1)
> S_4(c2, c1);
>
> Now, the "separating class" option (set for unroll and jam) produces this nice loop
> structure:
I'm not sure if Tobi or Albert have told you, but the separation_class option
is going to be phased out since its design is fundamentally flawed.
If you can't wait until isl-0.15, then I guess you have no choice but
to use this option, but you should realize that it will remain frozen
in its current broken state (until it is removed at some point).
> +
> +/* Set the unroll AST built option. */
> +
> +static __isl_give isl_union_map *
> +generate_isl_options_2 (scop_p scop, __isl_take isl_union_set *domain,
> + int dim, int cl)
Not a very descriptive function name.
> +{
> + isl_map *map;
> + isl_space *space, *space_sep;
> + isl_ctx *ctx;
> + isl_union_map *mapu;
> + int nsched = get_max_schedule_dimensions (scop);
> +
> + ctx = scop->ctx;
> + space_sep = isl_space_alloc(ctx, 0, 1, 1);
> + space_sep = isl_space_wrap(space_sep);
> + space_sep = isl_space_set_tuple_name(space_sep, isl_dim_set,
> + "separation_class");
> + space = isl_set_get_space (scop->context);
> + space_sep = isl_space_align_params(space_sep, isl_space_copy(space));
> + space = isl_space_map_from_domain_and_range(space, space_sep);
> + space = isl_space_add_dims (space,isl_dim_in, nsched);
Inconsistent spacing, sometimes with a space before "(" and sometimes
without.
I also noticed some tab/spacing inconsistencies later in the patch,
but I already removed that part in my reply.
> + /* Extract the original and auxiliar maps from pbb->transformed.
> + Set pbb->transformed to the original map. */
> + psmap = &smap;
> + psmap->n = 0;
> + res = isl_map_foreach_basic_map (pbb->transformed, separate_map, (void *)psmap);
> + gcc_assert (res == 0);
> +
> + isl_map_free(pbb->transformed);
> + pbb->transformed = isl_map_copy(psmap->map_arr[0]);
> +
I have no idea what this pbb->transformed is supposed to represent,
but you appear to be assuming that it has exactly two disjuncts and that
they appear in some order. Now, perhaps you have explicitly
checked that this map has two disjuncts, but then you should
probably bring the check closer since any operation on sets that
you perform could change the internal representation (even of
other sets). However, in no way can you assume that
isl_map_foreach_basic_map will iterate over these disjuncts
in any specific order.
> + bb_domain_s = isl_set_apply (isl_set_copy (bb_domain),
> + psmap->map_arr[1]);
> + bb_domain_r = isl_set_apply (bb_domain, psmap->map_arr[0]);
> +
> + bb_domain_s = isl_set_complement (bb_domain_s);
> + bb_domain_r = isl_set_subtract(bb_domain_r,bb_domain_s);
Why are you subtracting the complement of a set instead of just
intersecting with that set?
> + /* Create an identity map for everything except DimToVectorize and the
> + point loop. */
> + for (i = 0; i < ScheduleDimensions; i++)
> + {
> + if (i == DimToVectorize)
> + continue;
> +
> + c = isl_equality_alloc (isl_local_space_copy (LocalSpace));
> +
> + isl_constraint_set_coefficient_si (c, isl_dim_in, i, -1);
> + isl_constraint_set_coefficient_si (c, isl_dim_out, i, 1);
> +
> + TilingMap = isl_map_add_constraint (TilingMap, c);
You can use isl_map_equate instead.
> @@ -350,17 +425,31 @@
> SuffixSchedule);
> isl_band_list_free (Children);
Gaack! gcc is using band forests too...
I guess I'll have to keep them around for a while then....
skimo