This testcase does not vectorize at -O3 on x86_64/-mavx or AArch64:

void
loop (int *in, int *out)
{
  for (int i = 0; i < 256; i++)
    {
      out[i] = in[i << 1] + 7;
    }
}

-fdump-tree-vect-details reveals:

Creating dr for *_12
analyze_innermost: failed: evolution of base is not affine.
	base_address:
	offset from base address:
	constant offset from base address:
	step:
	aligned to:
	base_object: *_12

However, this testcase succeeds:

void
loop (int *in, int *out)
{
  for (int i = 0; i < 256; i++)
    {
      out[i] = in[i * 2] + 7;
    }
}

The relevant extract of -fdump-tree-vect-details shows:

Creating dr for *_12
analyze_innermost: success.
	base_address: in_11(D)
	offset from base address: 0
	constant offset from base address: 0
	step: 8
	aligned to: 256
	base_object: *in_11(D)
	Access function 0: {0B, +, 8}_1

The only difference is the multiplication:

$ diff splice{,2}.c.131t.ifcvt
27c27
<   _8 = i_19 * 2;
---
>   _8 = i_19 << 1;
$
Confirmed. That's because SCEV's interpret_rhs_expr doesn't handle LSHIFT_EXPR (though it does handle MULT_EXPR). More places would need to handle LSHIFT_EXPR as well, including in tree-chrec.c.
Author: alalaw01
Date: Thu Nov 5 18:39:38 2015
New Revision: 229825

URL: https://gcc.gnu.org/viewcvs?rev=229825&root=gcc&view=rev
Log:
[PATCH] tree-scalar-evolution.c: Handle LSHIFT by constant

gcc/:

	PR tree-optimization/65963
	* tree-scalar-evolution.c (interpret_rhs_expr): Try to handle
	LSHIFT_EXPRs as equivalent unsigned MULT_EXPRs.

gcc/testsuite/:

	* gcc.dg/pr68112.c: New.
	* gcc.dg/vect/vect-strided-shift-1.c: New.

Added:
    trunk/gcc/testsuite/gcc.dg/pr68112.c
    trunk/gcc/testsuite/gcc.dg/vect/vect-strided-shift-1.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/tree-scalar-evolution.c
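For readers following along, the identity that the commit log relies on (a left shift by a constant k behaving like an unsigned multiply by 2^k) can be checked at the source level. The following is only a sketch and is not GCC code: the helper names shift_as_unsigned_mul, shift_matches_mul, loop_shift, loop_mul and loops_agree are made up for illustration, and the choice of unsigned arithmetic is my reading of the log entry (unsigned multiplication wraps, so the rewrite adds no signed-overflow assumptions).

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper (not from GCC): computes x << k as an unsigned
   multiply by 2^k.  Assumes k is less than the width of unsigned.  */
unsigned
shift_as_unsigned_mul (unsigned x, unsigned k)
{
  return x * (1u << k);
}

/* Returns 1 if the shift and the unsigned multiply agree for a few
   sample values and every valid shift count, 0 otherwise.  */
int
shift_matches_mul (void)
{
  const unsigned samples[] = { 0u, 1u, 3u, 255u, UINT_MAX / 2u, UINT_MAX };
  const unsigned width = (unsigned) (sizeof (unsigned) * CHAR_BIT);

  for (unsigned s = 0; s < sizeof samples / sizeof samples[0]; s++)
    for (unsigned k = 0; k < width; k++)
      if ((samples[s] << k) != shift_as_unsigned_mul (samples[s], k))
        return 0;
  return 1;
}

/* The two loops from the report, with the trip count as a parameter.  */
void
loop_shift (const int *in, int *out, int n)
{
  for (int i = 0; i < n; i++)
    out[i] = in[i << 1] + 7;
}

void
loop_mul (const int *in, int *out, int n)
{
  for (int i = 0; i < n; i++)
    out[i] = in[i * 2] + 7;
}

/* Returns 1 if both loops produce the same output for 256 iterations.  */
int
loops_agree (void)
{
  enum { N = 256 };
  static int in[2 * N], a[N], b[N];

  for (int i = 0; i < 2 * N; i++)
    in[i] = 3 * i;

  loop_shift (in, a, N);
  loop_mul (in, b, N);

  for (int i = 0; i < N; i++)
    if (a[i] != b[i])
      return 0;
  return 1;
}
```

Both loops read the same strided elements, so once the shift is seen as a multiply the access function should match the {0B, +, 8}_1 form from the successful dump. This checks only the arithmetic identity; it says nothing about whether a given GCC build vectorizes the loop.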
The new test gcc.dg/vect/vect-strided-shift-1.c fails at execution on armeb-none-linux-gnueabihf:

FAIL: gcc.dg/vect/vect-strided-shift-1.c -flto -ffat-lto-objects execution test
gcc.dg/vect/vect-strided-shift-1.c execution test
I confirm the testcase fails execution on armeb-none-eabi (also at -O0), but it does so both with and without the patch to tree-scalar-evolution.c, which did not change codegen (at -O2 -ftree-vectorize; the loop was not vectorized). So this looks to be exposing a different, pre-existing, bug.
Can I class this as fixed?