This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH, PR52252] Vectorization for load/store groups of size 3.
- From: Richard Biener <rguenther at suse dot de>
- To: Evgeny Stupachenko <evstupac at gmail dot com>
- Cc: gcc-patches at gcc dot gnu dot org, Jakub Jelinek <jakub at redhat dot com>, ubizjak at gmail dot com
- Date: Tue, 11 Feb 2014 14:00:30 +0100 (CET)
- Subject: Re: [PATCH, PR52252] Vectorization for load/store groups of size 3.
- References: <CAOvf_xxEQ3tm+fwL5EfVSOUKDQnaBt+jTz4huK66T_8+TXzzfQ at mail dot gmail dot com>
On Tue, 11 Feb 2014, Evgeny Stupachenko wrote:
> Hi,
>
> The patch gives an expected 3 times gain for the test case in the PR52252
> (and even 6 times for AVX2).
> It passes make check and bootstrap on x86.
> spec2000/spec2006 got no regressions/gains on x86.
>
> Is this patch ok?
I've worked on generalizing the permutation support in the light
of the availability of the generic shuffle support in the IL
but hit some road-blocks in the way code-generation works for
group loads with permutations (I don't remember if I posted all patches).
This patch seems to touch a slightly different place, but it again
special-cases a specific permutation. Why's that? Why can't we
support groups of size 7 for example? So - can this be generalized
to support arbitrary non-power-of-two load/store groups?
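
To make the generalization question concrete, here is a minimal scalar sketch of what a load group of arbitrary size has to compute. It is not code from the patch: the helper name deinterleave and its parameters are hypothetical, and it only shows the index mapping (element j of stream k comes from src[j * group + k]) that a vectorized permutation sequence would need to reproduce for group sizes such as 3 or 7.

```c
#include <stddef.h>

/* Hypothetical helper, not part of the patch under review.
   Splits `n * group` interleaved elements of `src` into `group`
   contiguous streams of `n` elements each, stored back to back in `dst`.
   Works for any group size, power of two or not. */
void
deinterleave (const int *src, int *dst, size_t n, size_t group)
{
  for (size_t k = 0; k < group; k++)      /* which stream */
    for (size_t j = 0; j < n; j++)        /* element within the stream */
      dst[k * n + j] = src[j * group + k];
}
```

Nothing in the mapping depends on the group size being 3, which is the point of asking whether the permutation generation can be written for arbitrary sizes.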
Other than that the patch has to wait for stage1 to open again,
of course. And it misses a testcase.
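
As a rough idea of what such a testcase could look like, here is a sketch only. The kernel, the array names and the choice of dg directives are assumptions, not taken from the patch or from the PR, and the expected dump string would need checking against the actual vectorizer output.

```c
/* { dg-do compile } */
/* { dg-require-effective-target vect_int } */

#define N 256

int in[3 * N];
int out[3 * N];

/* Hypothetical kernel: a load group of size 3 feeding a store group
   of size 3, with the three elements of each triple rotated.  */
void
rotate_triples (void)
{
  for (int i = 0; i < N; i++)
    {
      int a = in[3 * i];
      int b = in[3 * i + 1];
      int c = in[3 * i + 2];
      out[3 * i] = b;
      out[3 * i + 1] = c;
      out[3 * i + 2] = a;
    }
}

/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
```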
Btw, do you have a copyright assignment on file with the FSF covering
work on GCC?
Thanks,
Richard.
> ChangeLog:
>
> 2014-02-11 Evgeny Stupachenko <evstupac@gmail.com>
>
> * target.h (vect_cost_for_stmt): Defining new cost vec_perm_shuffle.
> * tree-vect-data-refs.c (vect_grouped_store_supported): New
> check for stores group of length 3.
> (vect_permute_store_chain): New permutations for stores group of
> length 3.
> (vect_grouped_load_supported): New check for loads group of length 3.
> (vect_permute_load_chain): New permutations for loads group of
> length 3.
> * tree-vect-stmts.c (vect_model_store_cost): New cost vec_perm_shuffle
> for the new permutations.
> (vect_model_load_cost): Ditto.
> * config/aarch64/aarch64.c (builtin_vectorization_cost): Adding
> vec_perm_shuffle cost as equivalent of vec_perm cost.
> * config/arm/arm.c: Ditto.
> * config/rs6000/rs6000.c: Ditto.
> * config/spu/spu.c: Ditto.
> * config/i386/x86-tune.def (TARGET_SLOW_PHUFFB): Target for slow byte
> shuffle on some x86 architectures.
> * config/i386/i386.h (processor_costs): Defining pshuffb cost.
> * config/i386/i386.c (processor_costs): Adding pshuffb cost.
> (ix86_builtin_vectorization_cost): Adding cost for the new permutations.
> Fixing cost for other permutations.
> (expand_vec_perm_even_odd_1): Avoid byte shuffles when they are
> slow (TARGET_SLOW_PHUFFB).
> (ix86_add_stmt_cost): Adding cost when STMT is WIDEN_MULTIPLY.
> Adding new shuffle cost only when byte shuffle is expected.
> Fixing cost model for Silvermont.
>
> Thanks,
> Evgeny
>
--
Richard Biener <rguenther@suse.de>
SUSE / SUSE Labs
SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer