This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH] Improve BB vectorization dependence analysis
- From: Alan Lawrence <alan dot lawrence at arm dot com>
- To: Richard Biener <rguenther at suse dot de>, gcc-patches at gcc dot gnu dot org
- Date: Mon, 16 Nov 2015 18:35:34 +0000
- Subject: Re: [PATCH] Improve BB vectorization dependence analysis
- References: <alpine dot LSU dot 2 dot 11 dot 1511091351580 dot 10078 at zhemvz dot fhfr dot qr>
On 09/11/15 12:55, Richard Biener wrote:
Currently BB vectorization computes all dependences inside a BB
region and fails all vectorization if it cannot handle some of them.
This is obviously not needed - BB vectorization can restrict the
dependence tests to those that are needed to apply the load/store
motion effectively performed by the vectorization (sinking all
participating loads/stores to the place of the last one).
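To make the restricted test concrete, here is a minimal sketch of the idea in C. It is a toy model, not the code from the patch: `struct stmt`, `conflicts`, `in_group` and `can_sink_group` are hypothetical names, memory locations are reduced to plain integers, and the real `vect_slp_analyze_node_dependences` works on GCC data references rather than anything like this. The sketch checks only the statements that a group member would have to cross when it is sunk to the position of the last group member.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy statement kinds; OTHER stands for anything that touches no memory.  */
enum kind { OTHER, LOAD, STORE };

struct stmt
{
  enum kind kind;
  int loc;  /* Abstract memory location (assumption); ignored for OTHER.  */
};

/* Two accesses conflict if they touch the same location and at least
   one of them is a store.  */
static bool
conflicts (const struct stmt *a, const struct stmt *b)
{
  if (a->kind == OTHER || b->kind == OTHER)
    return false;
  if (a->kind == LOAD && b->kind == LOAD)
    return false;
  return a->loc == b->loc;
}

/* Return true if statement index IDX is a member of GROUP.  */
static bool
in_group (const size_t *group, size_t n, size_t idx)
{
  for (size_t i = 0; i < n; i++)
    if (group[i] == idx)
      return true;
  return false;
}

/* GROUP holds N statement indices into BB in increasing order.  Return
   true if every member can be sunk to the position of the last member,
   i.e. no statement it would cross conflicts with it.  Statements
   outside those ranges are never examined.  */
bool
can_sink_group (const struct stmt *bb, const size_t *group, size_t n)
{
  if (n == 0)
    return true;
  size_t last = group[n - 1];
  for (size_t g = 0; g + 1 < n; g++)
    for (size_t j = group[g] + 1; j < last; j++)
      {
        if (in_group (group, n, j))
          continue;
        if (conflicts (&bb[group[g]], &bb[j]))
          return false;
      }
  return true;
}
```

In this model a failing group only disqualifies itself; other groups in the same block can still be checked and accepted independently.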
Restructuring it that way also makes it easy to not give up completely
but only on the SLP instance we cannot vectorize (this gives
a slight bump in my SPEC CPU 2006 testing to 756 vectorized basic
block regions).
But first and foremost this patch is to reduce the dependence analysis
cost and somewhat mitigate the compile-time effects of the first patch.
For fixing PR56118 only a cost model issue remains.
Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.
Richard.
2015-11-09 Richard Biener <rguenther@suse.de>
PR tree-optimization/56118
* tree-vectorizer.h (vect_find_last_scalar_stmt_in_slp): Declare.
* tree-vect-slp.c (vect_find_last_scalar_stmt_in_slp): Export.
* tree-vect-data-refs.c (vect_slp_analyze_node_dependences): New
function.
(vect_slp_analyze_data_ref_dependences): Instead of computing
all dependences of the region DRs just analyze the code motions
SLP vectorization will perform. Remove SLP instances that
cannot have their store/load motions applied.
(vect_analyze_data_refs): Allow DRs without a vectype
in BB vectorization.
* gcc.dg/vect/no-tree-sra-bb-slp-pr50730.c: Adjust.
Since this, I've been seeing an ICE on gfortran.dg/vect/vect-9.f90 on both
aarch64-none-linux-gnu and arm-none-linux-gnueabihf:
spawn /home/alalaw01/build/gcc/testsuite/gfortran4/../../gfortran
-B/home/alalaw01/build/gcc/testsuite/gfortran4/../../
-B/home/alalaw01/build/aarch64-unknown-linux-gnu/./libgfortran/
/home/alalaw01/gcc/gcc/testsuite/gfortran.dg/vect/vect-9.f90
-fno-diagnostics-show-caret -fdiagnostics-color=never -O -O2 -ftree-vectorize
-fvect-cost-model=unlimited -fdump-tree-vect-details -Ofast -S -o vect-9.s
/home/alalaw01/gcc/gcc/testsuite/gfortran.dg/vect/vect-9.f90:5:0: Error:
definition in block 13 follows the use for SSA_NAME: _339 in statement:
vectp.156_387 = &*cc_36(D)[_339];
/home/alalaw01/gcc/gcc/testsuite/gfortran.dg/vect/vect-9.f90:5:0: internal
compiler error: verify_ssa failed
0xcfc61b verify_ssa(bool, bool)
../../gcc-fsf/gcc/tree-ssa.c:1039
0xa2fc0b execute_function_todo
../../gcc-fsf/gcc/passes.c:1952
0xa30393 do_per_function
../../gcc-fsf/gcc/passes.c:1632
0xa3058f execute_todo
../../gcc-fsf/gcc/passes.c:2000
Please submit a full bug report...
FAIL: gfortran.dg/vect/vect-9.f90 -O (internal compiler error)
FAIL: gfortran.dg/vect/vect-9.f90 -O (test for excess errors)
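For readers unfamiliar with the "definition in block N follows the use" message: it reports a name that is used earlier in a block than the statement defining it in that same block. The following is a minimal sketch of that invariant in C, not GCC's `verify_ssa`. The names `struct toy_stmt`, `NO_NAME`, `MAX_NAMES` and `first_use_before_def` are hypothetical, and each toy statement has at most one definition and one use, with names assumed to be integers below `MAX_NAMES`.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_NAMES 64   /* Assumption: names are integers in [0, MAX_NAMES).  */
#define NO_NAME (-1)

struct toy_stmt
{
  int def;  /* Name defined by this statement, or NO_NAME.  */
  int use;  /* Name used by this statement, or NO_NAME.  */
};

/* Return the index of the first statement in BB that uses a name whose
   definition appears later in the same block, or -1 if there is none.
   Names defined outside the block are not checked here.  */
int
first_use_before_def (const struct toy_stmt *bb, size_t n)
{
  bool defined_in_bb[MAX_NAMES] = { false };
  bool seen[MAX_NAMES] = { false };

  /* First pass: record every name defined anywhere in the block.  */
  for (size_t i = 0; i < n; i++)
    if (bb[i].def != NO_NAME)
      defined_in_bb[bb[i].def] = true;

  /* Second pass: walk in order; a use of a block-local name that has
     not been defined yet is a violation.  */
  for (size_t i = 0; i < n; i++)
    {
      int u = bb[i].use;
      if (u != NO_NAME && defined_in_bb[u] && !seen[u])
        return (int) i;
      if (bb[i].def != NO_NAME)
        seen[bb[i].def] = true;
    }
  return -1;
}
```

In the log above, the statement defining `vectp.156_387` uses `_339`, whose definition comes later in block 13.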
Still there (on aarch64) at r230329.
--Alan