This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH] Simple optimization for MASK_STORE.
- From: Richard Biener <richard dot guenther at gmail dot com>
- To: Yuri Rumyantsev <ysrumyan at gmail dot com>
- Cc: gcc-patches <gcc-patches at gcc dot gnu dot org>, Igor Zamyatin <izamyatin at gmail dot com>
- Date: Fri, 8 May 2015 11:27:48 +0200
- Subject: Re: [PATCH] Simple optimization for MASK_STORE.
- References: <CAEoMCqRmV48Ytdew0azyTQWZcmfFmjX-JaLtYUz8Q3wejL2RnQ at mail dot gmail dot com>
On Wed, May 6, 2015 at 4:04 PM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote:
> Hi All,
>
> Here is a patch which gives us a significant speed-up on Haswell for
> tests containing masked stores. The main goal of the patch is to
> avoid a HW hazard for maskmove instructions by inserting an
> additional check for a zero mask and putting all masked store statements
> into a separate block on the false edge. All MASK_STORE statements having
> the same mask are put into one block. Any comments will be appreciated.
Hmm. I'm not very happy with this "optimization" happening at the
GIMPLE level - it feels more like a mdreorg thing...
The testcase you add doesn't end up with invalid addresses - so what's
the testcase you are inventing this for?
Looking into the implementation I don't see where you are validating
data dependences of any sort but you are moving stores (and possibly
loads when sinking definition stmts of stored values). The code-sinking
part should be handled by the existing pass. Your simple testcase
contains a single masked store, so why does simply conditionalizing
each masked store in mdreorg not work? It's a hazard (hopefully
fixed eventually), thus not really worth optimizing 100%.
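For illustration only, here is a minimal scalar sketch in C of what "conditionalizing each masked store" amounts to. It is not code from the patch and not how mdreorg would implement it; the names LANES, mask_is_zero, masked_store and guarded_masked_store are hypothetical, and the sketch assumes each mask lane is either all-ones or all-zero.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LANES 8 /* assumed: eight 32-bit lanes, as in a 256-bit vector */

/* Returns nonzero when no lane of the mask is active. */
int mask_is_zero(const int32_t mask[LANES])
{
    int32_t any = 0;
    for (size_t i = 0; i < LANES; i++)
        any |= mask[i];
    return any == 0;
}

/* Scalar model of a masked store: lane i is written only if mask[i] != 0. */
void masked_store(int32_t *dst, const int32_t mask[LANES],
                  const int32_t src[LANES])
{
    for (size_t i = 0; i < LANES; i++)
        if (mask[i] != 0)
            dst[i] = src[i];
}

/* Conditionalized form: the store is skipped entirely when the mask is
   all-zero.  Returns 1 if the store was executed, 0 if it was skipped. */
int guarded_masked_store(int32_t *dst, const int32_t mask[LANES],
                         const int32_t src[LANES])
{
    assert(dst != NULL);
    if (mask_is_zero(mask))
        return 0;
    masked_store(dst, mask, src);
    return 1;
}
```

The guard does not change which lanes are written; it only keeps the store from being issued at all when every lane is inactive, which is the case the zero-mask check in the patch description targets.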
The target hook name is awful.
You don't need an extra flag in struct loop - the vectorizer scans all
insns so it can perfectly well re-compute it.
What this all feels like is more like an un-if-conversion pass which might
be useful for aggressively if-converted vectorized code as well (thus
lots of vec_cond expressions for example).
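As a rough source-level picture of that idea (one reading of it, not an existing GCC pass), the sketch below contrasts an if-converted loop, where every element goes through a select, with a variant that tests a whole chunk first and skips it when no condition holds. VF, update_if_converted and update_un_if_converted are hypothetical names, and the chunk size of 8 is an assumption standing in for the vectorization factor.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VF 8 /* assumed vectorization factor */

/* If-converted form: every element evaluates the select. */
void update_if_converted(int32_t *a, const int32_t *b, const int32_t *c,
                         size_t n)
{
    assert(a && b && c);
    for (size_t i = 0; i < n; i++)
        a[i] = c[i] > 0 ? b[i] * 2 : a[i];
}

/* "Un-if-converted" form: check a whole chunk of VF elements first and
   skip the selects when no condition in the chunk is true.
   Returns the number of chunks that were skipped. */
size_t update_un_if_converted(int32_t *a, const int32_t *b, const int32_t *c,
                              size_t n)
{
    assert(a && b && c);
    size_t skipped = 0;
    size_t i = 0;
    for (; i + VF <= n; i += VF) {
        int any = 0;
        for (size_t j = 0; j < VF; j++)
            any |= c[i + j] > 0;
        if (!any) {
            skipped++;
            continue;
        }
        for (size_t j = 0; j < VF; j++)
            a[i + j] = c[i + j] > 0 ? b[i + j] * 2 : a[i + j];
    }
    /* Scalar tail for the remaining elements. */
    for (; i < n; i++)
        if (c[i] > 0)
            a[i] = b[i] * 2;
    return skipped;
}
```

Both functions produce the same contents in a; the second only pays for the selects in chunks where at least one condition is true.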
Richard.
> ChangeLog:
> 2015-05-06 Yuri Rumyantsev <ysrumyan@gmail.com>
>
> * cfgloop.h (has_mask_store): Add new field to struct loop.
> * config/i386/i386.c: Include files stringpool.h and tree-ssanames.h.
> (ix86_vectorize_zero_vector): New function.
> (TARGET_VECTORIZE_ZERO_VECTOR): New target macro.
> * doc/tm.texi.in: Add @hook TARGET_VECTORIZE_ZERO_VECTOR.
> * doc/tm.texi: Updated.
> * target.def (zero_vector): New DEFHOOK.
> * tree-if-conv.c (predicate_mem_writes): Set has_mask_store for loop.
> * tree-vect-stmts.c: Include tree-into-ssa.h.
> (optimize_mask_stores): New function.
> * tree-vectorizer.c (vectorize_loops): Zero has_mask_store field for
> non-vectorized loops and invoke optimize_mask_stores function.
>
> gcc/testsuite/ChangeLog:
> * gcc.target/i386/avx2-vect-mask-store-move1.c: New test.