This is the mail archive of the
mailing list for the GCC project.
Re: [PATCH 0/2] Loop distribution for memset zero
- From: Richard Guenther <richard dot guenther at gmail dot com>
- To: Sebastian Pop <sebpop at gmail dot com>
- Cc: gcc-patches at gcc dot gnu dot org, matz at suse dot de
- Date: Sat, 31 Jul 2010 12:01:41 +0200
- Subject: Re: [PATCH 0/2] Loop distribution for memset zero
- References: <email@example.com>
On Fri, Jul 30, 2010 at 10:40 PM, Sebastian Pop <firstname.lastname@example.org> wrote:
> Michael Matz proposed that it would be a good idea for some CPU2006
> benchmarks to add a separate heuristic for the loop distribution pass
> for the memset zero pattern, and to enable that at -O3 in order to
> exercise the loop distribution code.  The following two patches
> implement on top of the current loop distribution pass the heuristic,
> and enable it at -O3.
> The new pass starts by adding to the partitions working list the data
> references that are initialized to zero.  These partitions are then
> code generated in different loops, and the current loop distribution
> detects the memset zero pattern.
> Regstrapped on amd64-linux.
> SPEC2006 passed with -O3 (except the dealII compile fail that I
> haven't fixed in my sources yet...).
> Bootstrap failed with BOOT_CFLAGS="-g -O3", but then when I tried also
> without these two patches it also failed with the same miscompiled
> files, so bootstrap of trunk is broken at -O3, see
> Ok for trunk?
The new pass should be disabled when loop-distribution is enabled, no?
Thus, I think it would make more sense to fold it into the existing pass
which then runs in different modes depending on the flags used.
The flag should have a more general name, like -ftree-loop-distribute-patterns,
as we probably want to add memcpy or array sin/cos operations as well.
Now the code looks very specific at the moment, with
stores_zero_from_loop. I suppose we can't ask loop distribution
to separate stores as is but then only generate separate code for
the memset and ask it to keep the other pieces together?
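For illustration only, here is a minimal source-level sketch of the
transformation under discussion. It is not taken from the patch: the
function names fused_loop and distributed_loop are hypothetical, and the
restrict qualifiers are an assumption added so that the two forms are
trivially equivalent. The first function mixes a zero store with other
work; the second shows the zero store split out and expressed as a
memset while the remaining statement stays in its own loop.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical input loop: a zero store mixed with unrelated work. */
void fused_loop(int *restrict a, int *restrict b,
                const int *restrict c, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        a[i] = 0;          /* candidate for the memset zero pattern */
        b[i] = c[i] + 1;   /* the other piece, kept together */
    }
}

/* Hand-written equivalent of the distributed form: the zero store
   becomes its own partition (here a memset call) and the rest of
   the body remains a single loop. */
void distributed_loop(int *restrict a, int *restrict b,
                      const int *restrict c, size_t n)
{
    memset(a, 0, n * sizeof *a);
    for (size_t i = 0; i < n; i++)
        b[i] = c[i] + 1;
}
```

Both functions leave a and b in the same state for non-overlapping
arrays, which is the property the distribution has to preserve.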
>  Add pass_loop_distribute_memset_zero.
>  Enable flag_tree_loop_distribute_memset_zero at -O3.
>  gcc/common.opt               |    4 ++
>  gcc/doc/invoke.texi          |   23 ++++++++++++++-
>  gcc/opts.c                   |    1 +
>  gcc/passes.c                 |    1 +
>  gcc/tree-data-ref.c          |   26 +++++++++++++++++
>  gcc/tree-data-ref.h          |    1 +
>  gcc/tree-loop-distribution.c |   63 ++++++++++++++++++++++++++++++++++++++++++
>  gcc/tree-pass.h              |    1 +
>  8 files changed, 119 insertions(+), 1 deletions(-)