Bug 65963 - Missed vectorization of loads strided with << when equivalent * succeeds
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: tree-optimization
Version: 5.0
Importance: P3 normal
Target Milestone: ---
Assignee: Alan Lawrence
URL:
Keywords: missed-optimization
Depends on:
Blocks:
 
Reported: 2015-05-01 11:55 UTC by Alan Lawrence
Modified: 2016-02-23 16:25 UTC

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2015-05-04 00:00:00


Description Alan Lawrence 2015-05-01 11:55:08 UTC
This testcase does not vectorize at -O3 on x86_64/-mavx or AArch64:
void
loop (int *in, int *out)
{
  for (int i = 0; i < 256; i++) {
     out[i] = in[i << 1] + 7;
  }
}

-fdump-tree-vect-details reveals:
Creating dr for *_12
analyze_innermost: failed: evolution of base is not affine.
        base_address: 
        offset from base address: 
        constant offset from base address: 
        step: 
        aligned to: 
        base_object: *_12

However, this testcase succeeds:
void
loop (int *in, int *out)
{
  for (int i = 0; i < 256; i++) {
     out[i] = in[i * 2] + 7;
  }
}

The relevant extract of -fdump-tree-vect-details shows:
Creating dr for *_12
analyze_innermost: success.
        base_address: in_11(D)
        offset from base address: 0
        constant offset from base address: 0
        step: 8
        aligned to: 256
        base_object: *in_11(D)
        Access function 0: {0B, +, 8}_1

The only difference between the two ifcvt dumps is the multiplication versus the shift:
$ diff splice{,2}.c.131t.ifcvt
27c27
<   _8 = i_19 * 2;
---
>   _8 = i_19 << 1;
$
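
For reference, the access function {0B, +, 8}_1 in the successful dump can be read as: byte offset 0 on entry to loop 1, advancing by 8 bytes per iteration. The following is a minimal sketch, not taken from GCC, that checks both spellings of the index produce exactly that offset sequence. It assumes a 4-byte element type (int32_t stands in for int on the targets mentioned above); chrec_eval and offsets_match are hypothetical names introduced only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: value of the affine recurrence {base, +, step}
   at iteration i, i.e. base + i * step.  */
static size_t
chrec_eval (size_t base, size_t step, size_t i)
{
  return base + i * step;
}

/* Returns 1 if in[i << 1] and in[i * 2] both touch the byte offsets
   described by {0, +, 8} for every iteration of the testcase loop,
   and 0 otherwise.  Assumes 4-byte elements (int32_t).  */
static int
offsets_match (void)
{
  for (int i = 0; i < 256; i++)
    {
      size_t off_shift = (size_t) (i << 1) * sizeof (int32_t);
      size_t off_mult = (size_t) (i * 2) * sizeof (int32_t);
      size_t expected = chrec_eval (0, 8, (size_t) i);
      if (off_shift != expected || off_mult != expected)
        return 0;
    }
  return 1;
}
```

Calling offsets_match () from any driver returns 1, which is the point of the report: the two loops access the same addresses, so the failure is in the analysis of the shift and not in the access pattern itself.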
Comment 1 Richard Biener 2015-05-04 11:18:45 UTC
Confirmed.  That's because interpret_rhs_expr in SCEV doesn't handle LSHIFT_EXPR
(it does handle MULT_EXPR).  More places would need to handle LSHIFT_EXPR
though, including in tree-chrec.c.
Comment 2 Alan Lawrence 2015-11-05 18:40:10 UTC
Author: alalaw01
Date: Thu Nov  5 18:39:38 2015
New Revision: 229825

URL: https://gcc.gnu.org/viewcvs?rev=229825&root=gcc&view=rev
Log:
[PATCH] tree-scalar-evolution.c: Handle LSHIFT by constant

gcc/:

	PR tree-optimization/65963
	* tree-scalar-evolution.c (interpret_rhs_expr): Try to handle
	LSHIFT_EXPRs as equivalent unsigned MULT_EXPRs.

gcc/testsuite/:

	* gcc.dg/pr68112.c: New.
	* gcc.dg/vect/vect-strided-shift-1.c: New.

Added:
    trunk/gcc/testsuite/gcc.dg/pr68112.c
    trunk/gcc/testsuite/gcc.dg/vect/vect-strided-shift-1.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/tree-scalar-evolution.c
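
The ChangeLog entry above describes treating a left shift by a constant as the equivalent multiplication in an unsigned type. A rough source-level illustration of that identity follows. It is only a sketch under stated assumptions, not the code in tree-scalar-evolution.c, and the names shift_form, mult_form and forms_agree are made up for this example. Doing the multiply in unsigned arithmetic presumably matters because unsigned overflow wraps, whereas a signed multiply that overflows would be undefined.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical illustration: x << n computed on the unsigned bits of x.
   Requires n to be less than the width of unsigned int.  */
static unsigned int
shift_form (int x, unsigned int n)
{
  return (unsigned int) x << n;
}

/* The same value written as a multiplication by 2^n in an unsigned
   type; unsigned arithmetic wraps, so no overflow is undefined.  */
static unsigned int
mult_form (int x, unsigned int n)
{
  return (unsigned int) x * (1u << n);
}

/* Returns 1 if the two forms agree for every valid shift count on a
   handful of sample values, including negative and extreme ones.  */
static int
forms_agree (void)
{
  const int samples[] = { 0, 1, 2, 255, -1, -7, INT_MAX, INT_MIN };
  for (unsigned int n = 0; n < sizeof (unsigned int) * CHAR_BIT; n++)
    for (unsigned int k = 0; k < sizeof samples / sizeof samples[0]; k++)
      if (shift_form (samples[k], n) != mult_form (samples[k], n))
        return 0;
  return 1;
}
```

With the shift expressed as a multiply, the index in the first testcase has the same shape as i * 2, which the description shows is already analysed successfully.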
Comment 3 Christophe Lyon 2015-11-06 09:17:08 UTC
The new test gcc.dg/vect/vect-strided-shift-1.c fails at execution on armeb-none-linux-gnueabihf:

FAIL:
  gcc.dg/vect/vect-strided-shift-1.c -flto -ffat-lto-objects execution test
  gcc.dg/vect/vect-strided-shift-1.c execution test
Comment 4 Alan Lawrence 2015-11-06 15:19:23 UTC
I confirm the testcase fails execution on armeb-none-eabi (also at -O0). However, it fails both with and without the patch to tree-scalar-evolution.c, and the patch did not change codegen (at -O2 -ftree-vectorize, the loop was not vectorized). So this looks to be exposing a different, pre-existing bug.
Comment 5 Alan Lawrence 2016-02-23 16:25:06 UTC
Can I class this as fixed?