When looking at why GCC is so slow with the himeno benchmark in the usual Phoronix testing, I noticed that we do not vectorize

  float *x;
  float parm;
  float
  test (int start, int end)
  {
    int i;
    for (i = start; i < end; ++i)
      {
        float tem = x[i];
        x[i] = parm * tem;
      }
  }

because there is a scalar non-varying load of parm that, when the loop rolls just a single time, is aliased by the store to x[i]. We are, though, vectorizing with a vectorization factor of at least two, which means that x cannot validly point to parm (and a vector store would exceed the scalar variable's size, something that alias analysis after vectorization would use to disambiguate the vector store and the load).

Thus we can treat loads of scalar decls as if they were done outside of the loop and vectorize it as

  D.1234_3 = tem;
  vec_tmp_4 = { D.1234_3, D.1234_3 };

that is, as if D.1234_3 were a vect_external_def, and mark the statement itself as not relevant.
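The reasoning above can be illustrated by hand. The following is only a sketch of what the transformation amounts to at the source level for a vectorization factor of two, not what the vectorizer actually emits. The function names scale_scalar and scale_vf2 are made up for this illustration, and the functions return void rather than float so that they are well-defined to call.

```c
float *x;
float parm;

/* Scalar reference: the loop from the report. parm is re-read on
   every iteration because the store to x[i] might alias it. */
void scale_scalar(int start, int end)
{
    int i;
    for (i = start; i < end; ++i) {
        float tem = x[i];
        x[i] = parm * tem;
    }
}

/* Hand-written sketch of the VF=2 form (hypothetical, for illustration).
   The vector body only runs when at least two iterations remain, so
   x[i] and x[i+1] are both valid and x cannot point to the scalar parm.
   That makes it safe to load parm once and splat it. */
void scale_vf2(int start, int end)
{
    int i = start;
    if (end - i >= 2) {
        float p = parm;          /* invariant load, done once */
        float vp[2] = { p, p };  /* stands in for { D.1234_3, D.1234_3 } */
        for (; i + 1 < end; i += 2) {
            float t0 = x[i];
            float t1 = x[i + 1];
            x[i] = vp[0] * t0;
            x[i + 1] = vp[1] * t1;
        }
    }
    /* Scalar epilogue: handles the leftover iteration, including the
       single-iteration case where x may legitimately point to parm. */
    for (; i < end; ++i) {
        float tem = x[i];
        x[i] = parm * tem;
    }
}
```

The single-iteration case never enters the two-wide body, so the aliasing scenario that blocks vectorization today is still handled by the scalar code.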
To fix:

      /* FIXME -- data dependence analysis does not work correctly for objects
         with invariant addresses in loop nests.  Let us fail here until the
         problem is fixed.  */
      if (dr_address_invariant_p (dr) && nest)
        {
          free_data_ref (dr);
          if (dump_file && (dump_flags & TDF_DETAILS))
            fprintf (dump_file, "\tFAILED as dr address is invariant\n");
          ret = false;
          break;
        }

For this to work and be a real improvement we need to pass down the minimum number of iterations of the loop (thus, the minimum vectorization factor in the case of the vectorizer). We can also fix it up locally in the vectorizer if compute_data_dependences_for_loop would not re-scan the loop for data references (but we'd do it in the vectorizer and remove the scalar loads). We can then also avoid computing data dependences until vect_analyze_data_ref_dependences and save some compile time for non-vectorized loops.
Mine.
Author: rguenth
Date: Thu Jun 30 13:27:43 2011
New Revision: 175704

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=175704
Log:
2011-06-30  Richard Guenther  <rguenther@suse.de>

	PR tree-optimization/46787
	* tree-data-ref.c (dr_address_invariant_p): Remove.
	(find_data_references_in_stmt): Invariant accesses are ok now.
	* tree-vect-stmts.c (vectorizable_load): Handle invariant loads.
	* tree-vect-data-refs.c (vect_analyze_data_ref_access): Allow
	invariant loads.

	* gcc.dg/vect/vect-121.c: New testcase.

Added:
    trunk/gcc/testsuite/gcc.dg/vect/vect-121.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/tree-data-ref.c
    trunk/gcc/tree-vect-data-refs.c
    trunk/gcc/tree-vect-stmts.c
Fixed.