Bug 56625 - After if-conversion vectorizer doesn't recognize similar loads
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: tree-optimization
Version: 4.8.0
Importance: P3 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on:
Blocks: vectorizer
Reported: 2013-03-15 12:17 UTC by Michael Zolotukhin
Modified: 2016-09-16 09:22 UTC

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2013-03-15 00:00:00


Attachments
Reproducer (113 bytes, text/plain)
2013-03-15 12:17 UTC, Michael Zolotukhin

Description Michael Zolotukhin 2013-03-15 12:17:39 UTC
Created attachment 29673
Reproducer

In the following example, there are two stores a[i] <- b[i] after if-conversion:
void foo (double a[], double b[])
{
  int i;
  for (i = 0; i < 100; i++)
    {
      if (a[i] == 0)
        a[i] = b[i]*4;
      else
        a[i] = b[i]*3;
    }
}
Because the vectorizer knows nothing about dependencies between a and b, it needs a runtime alias test for them. In this example, however, the vectorizer generates two runtime tests instead of one:
note: mark for run-time aliasing test between *_11 and *_8
note: mark for run-time aliasing test between *_15 and *_8

The test is attached. Command line to reproduce:
gcc if-conv-runtime-tests.c -O3 -c -ftree-vectorizer-verbose=15 -ftree-loop-if-convert-stores  -fdump-tree-vect
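
For illustration, the single runtime test expected here can be written by hand at the source level. This is only a sketch, not what GCC emits: `ranges_overlap` and `foo_versioned` are hypothetical names, and the loop is written to load b[i] once so that only the pair (a, b) needs checking.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: nonzero if the n-element double ranges
   starting at p and q share at least one byte.  */
static int ranges_overlap(const double *p, const double *q, size_t n)
{
  uintptr_t pa = (uintptr_t) p, qa = (uintptr_t) q;
  uintptr_t bytes = n * sizeof(double);
  return pa < qa + bytes && qa < pa + bytes;
}

/* Hand-written loop versioning with ONE alias check between a and b.  */
void foo_versioned(double a[], double b[])
{
  enum { N = 100 };
  int i;

  assert(a != NULL && b != NULL);

  if (!ranges_overlap(a, b, N))
    {
      /* No-alias version: a single load of b[i] feeds both arms,
         so this loop is the candidate for vectorization.  */
      for (i = 0; i < N; i++)
        {
          double bi = b[i];
          a[i] = (a[i] == 0) ? bi * 4 : bi * 3;
        }
    }
  else
    {
      /* Fallback: the original scalar loop, unchanged.  */
      for (i = 0; i < N; i++)
        {
          if (a[i] == 0)
            a[i] = b[i] * 4;
          else
            a[i] = b[i] * 3;
        }
    }
}
```

Both loads of b[i] in the original loop read the same address, so one check between the a range and the b range covers them both.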
Comment 1 Richard Biener 2013-03-15 12:27:08 UTC
For me cselim already sinks the store, making this a non-issue (and vectorizing
the loop).  On x86_64, that is.

So - which target?
Comment 2 Richard Biener 2013-03-15 12:29:50 UTC
Oh, it's about the loads!

  <bb 3>:
  # i_22 = PHI <i_19(4), 0(2)>
  # ivtmp_28 = PHI <ivtmp_24(4), 100(2)>
  _5 = (long unsigned int) i_22;
  _6 = _5 * 8;
  _8 = a_7(D) + _6;
  _9 = *_8;
  _11 = b_10(D) + _6;
  _12 = *_11;
  _13 = _12 * 4.0e+0;
  _15 = b_10(D) + _6;
  _16 = *_15;
  _17 = _16 * 3.0e+0;
  cstore_18 = _9 == 0.0 ? _13 : _17;
  *_8 = cstore_18;

and

t.c:4: note: versioning for alias required: can't determine dependence between *_11 and *_8
t.c:4: note: mark for run-time aliasing test between *_11 and *_8
t.c:4: note: versioning for alias required: can't determine dependence between *_15 and *_8
t.c:4: note: mark for run-time aliasing test between *_15 and *_8

Creating dr for *_11
analyze_innermost: success.
        base_address: b_10(D)
        offset from base address: 0
        constant offset from base address: 0
        step: 8
        aligned to: 128
        base_object: *b_10(D)
        Access function 0: {0B, +, 8}_1
Creating dr for *_15
analyze_innermost: success.
        base_address: b_10(D)
        offset from base address: 0
        constant offset from base address: 0
        step: 8
        aligned to: 128
        base_object: *b_10(D)
        Access function 0: {0B, +, 8}_1

somehow the equality test (which we have!) doesn't work here.
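
As a rough sketch of what such an equality test amounts to, the two data references above can be compared field by field on the values shown in the dump. The struct and function below are hypothetical stand-ins, not GCC's `struct innermost_loop_behavior` or its comparison code, and the base address is modelled as a string purely for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Toy model of the fields printed by analyze_innermost in the dump.  */
struct toy_behavior
{
  const char *base_address;   /* e.g. "b_10(D)" */
  long offset;                /* offset from base address */
  long init;                  /* constant offset from base address */
  long step;                  /* bytes advanced per iteration */
  unsigned aligned_to;
};

/* Two references access the same locations in every iteration
   when all of these fields match.  */
static bool toy_behavior_equal(const struct toy_behavior *x,
                               const struct toy_behavior *y)
{
  assert(x != NULL && y != NULL);
  return strcmp(x->base_address, y->base_address) == 0
         && x->offset == y->offset
         && x->init == y->init
         && x->step == y->step
         && x->aligned_to == y->aligned_to;
}
```

With the values from the dump, *_11 and *_15 both come out as {"b_10(D)", 0, 0, 8, 128} and compare equal, so a single alias test against *_8 would be enough.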
Comment 3 bin cheng 2016-04-20 15:42:17 UTC
Author: amker
Date: Wed Apr 20 15:41:45 2016
New Revision: 235289

URL: https://gcc.gnu.org/viewcvs?rev=235289&root=gcc&view=rev
Log:
	PR tree-optimization/56625
	PR tree-optimization/69489
	* tree-data-ref.h (DR_INNERMOST): New macro.
	* tree-if-conv.c (innermost_loop_behavior_hash): New class for
	hashing struct innermost_loop_behavior.
	(ref_DR_map): Remove.
	(innermost_DR_map): New map.
	(baseref_DR_map): Revise comment.
	(hash_memrefs_baserefs_and_store_DRs_read_written_info): Store DR
	to innermost_DR_map according to its innermost loop behavior.
	(ifcvt_memrefs_wont_trap): Get DR from innermost_DR_map according
	to its innermost loop behavior.
	(if_convertible_loop_p_1): Remove initialization for ref_DR_map.
	Add initialization for innermost_DR_map.  Record memory reference
	in DR_BASE_ADDRESS if the reference is compound one or it doesn't
	have innermost loop behavior.
	(if_convertible_loop_p): Remove release for ref_DR_map.  Release
	innermost_DR_map.

	gcc/testsuite/ChangeLog
	PR tree-optimization/56625
	PR tree-optimization/69489
	* gcc.dg/vect/pr56625.c: New test.
	* gcc.dg/tree-ssa/ifc-pr69489-1.c: New test.


Added:
    trunk/gcc/testsuite/gcc.dg/tree-ssa/ifc-pr69489-1.c
    trunk/gcc/testsuite/gcc.dg/vect/pr56625.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/tree-data-ref.h
    trunk/gcc/tree-if-conv.c
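
The idea in the log, looking up a data reference by its innermost loop behavior rather than by the reference itself, can be sketched with a small hash table. This is a toy model under stated assumptions: the real change uses GCC's own hash-map classes, and every name below (`toy_dr`, `toy_hash`, `get_or_insert_master`) is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-ins; not GCC's data structures.  */
struct toy_behavior { const char *base; long offset, init, step; };
struct toy_dr { const char *name; struct toy_behavior b; };

enum { TABLE_SIZE = 16 };
static const struct toy_dr *table[TABLE_SIZE];

/* Hash every field that takes part in the equality test.  */
static unsigned toy_hash(const struct toy_behavior *b)
{
  unsigned h = 5381;
  for (const char *p = b->base; *p; p++)
    h = h * 33 + (unsigned char) *p;
  h = h * 33 + (unsigned) b->offset;
  h = h * 33 + (unsigned) b->init;
  h = h * 33 + (unsigned) b->step;
  return h % TABLE_SIZE;
}

static int toy_equal(const struct toy_behavior *x,
                     const struct toy_behavior *y)
{
  return strcmp(x->base, y->base) == 0 && x->offset == y->offset
         && x->init == y->init && x->step == y->step;
}

/* Return the first DR recorded with the same behavior, inserting
   DR itself when no such entry exists (linear probing).  */
static const struct toy_dr *get_or_insert_master(const struct toy_dr *dr)
{
  unsigned i = toy_hash(&dr->b);
  for (unsigned probes = 0; probes < TABLE_SIZE; probes++)
    {
      if (table[i] == NULL)
        {
          table[i] = dr;
          return dr;
        }
      if (toy_equal(&table[i]->b, &dr->b))
        return table[i];
      i = (i + 1) % TABLE_SIZE;
    }
  assert(!"toy table full");
  return NULL;
}
```

In this model, registering *_11 and then *_15 returns the entry for *_11 both times, so the two loads of b are treated as one reference, while *_8 gets its own entry.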
Comment 4 bin cheng 2016-09-16 09:22:22 UTC
Fixed.