Bug 32375 - vectorized with alias check: can't determine dependence (array sections)
Status: NEW
Alias: None
Product: gcc
Classification: Unclassified
Component: tree-optimization
Version: 4.3.0
Importance: P3 enhancement
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords: alias, missed-optimization
Depends on: 32075
Blocks: vectorizer
Reported: 2007-06-17 15:02 UTC by Tim Prince
Modified: 2019-09-26 04:31 UTC
CC List: 2 users

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2007-06-21 23:31:05


Attachments
source code test case (239 bytes, text/plain)
2007-06-17 15:04 UTC, Tim Prince

Description Tim Prince 2007-06-17 15:02:32 UTC
gfortran -O2  -ftree-vectorize -ftree-vectorizer-verbose=2 -c -v s414a.f

The source and destination sections of aa(:,:) do not overlap, unless there is a subscript over-run.  Even that case could be taken care of by loop reversal.

This is a simplification of a case from:
http://www.netlib.org/benchmark/vectors
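
The non-overlap claim in the description can be illustrated with a small C analogue. This is only a sketch: the attached Fortran test case is not reproduced here, and the function names (`update_columns`, `sections_disjoint`), the extra array `bb`, and the zero-based column-major layout with leading dimension `ld` are all assumptions made for illustration. The inner loop reads column j-1 and writes column j, so the two sections can only collide if the row count exceeds the leading dimension, which is the subscript over-run mentioned above.

```c
#include <stddef.h>

/* Hypothetical analogue of the reported pattern (not the attached test
   case): a column-major array with leading dimension ld, where column j
   is computed from column j-1. */
void update_columns(float *aa, const float *bb, size_t n, size_t m, size_t ld)
{
    for (size_t j = 1; j < m; j++)
        for (size_t i = 0; i < n; i++)
            aa[j * ld + i] = aa[(j - 1) * ld + i] + bb[j * ld + i];
}

/* For a fixed j the inner loop reads [(j-1)*ld, (j-1)*ld + n) and writes
   [j*ld, j*ld + n).  These index ranges are disjoint exactly when the
   row count n does not exceed the leading dimension ld.
   Returns 1 if the sections are disjoint, 0 otherwise. */
int sections_disjoint(size_t n, size_t ld)
{
    return n <= ld;
}
```

With `n <= ld` no iteration of the inner loop can read an element that another iteration of the same inner loop writes, so the inner loop carries no dependence.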
Comment 1 Tim Prince 2007-06-17 15:04:55 UTC
Created attachment 13720
source code test case
Comment 2 Andrew Pinski 2007-06-18 03:13:55 UTC
I think some of this is related to PR 32075.
Comment 3 dorit 2007-07-01 12:46:31 UTC
Looks like a similar problem to PR32378:

(compute_affine_dependence
  (stmt_a =
D.1398_74 = (*aa_73(D))[D.1397_72])
  (stmt_b =
(*aa_73(D))[D.1393_68] = D.1408_88)
(subscript_dependence_tester
(analyze_overlapping_iterations
  (chrec_a = {D.1396_71 + 1, +, 1}_2)
  (chrec_b = {D.1392_67 + 1, +, 1}_2)
(analyze_siv_subscript
siv test failed: unimplemented.
)
  (overlap_iterations_a = not known
)
  (overlap_iterations_b = not known
)
)
(dependence classified: scev_not_known)
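
For readers unfamiliar with the SIV (single index variable) case in the dump above, here is a rough sketch of the textbook strong SIV test, assuming both accesses have the same constant step and constant bases. It is not GCC's implementation; the names `strong_siv` and `dep_result` are hypothetical. In the dump both chrecs have step 1 but their bases are SSA names (`D.1396_71`, `D.1392_67`), so a test of this form has no constants to work with, which is presumably why the result is "not known".

```c
#include <stdlib.h>

enum dep_result { DEP_NONE, DEP_DISTANCE, DEP_UNKNOWN };

/* Textbook strong SIV test (illustrative only, not GCC's code).
   Access A touches a0 + step*i and access B touches b0 + step*i for
   0 <= i < niter.  They hit the same element when
   a0 + step*i1 == b0 + step*i2, i.e. i1 - i2 == (b0 - a0) / step. */
enum dep_result strong_siv(long a0, long b0, long step, long niter,
                           long *distance)
{
    long diff = b0 - a0;

    if (step == 0)
        return DEP_UNKNOWN;      /* not an SIV subscript; stay conservative */
    if (diff % step != 0)
        return DEP_NONE;         /* the two accesses never coincide */

    long d = diff / step;
    if (labs(d) >= niter)
        return DEP_NONE;         /* coincidence lies outside the iteration space */

    *distance = d;               /* dependence at a fixed iteration distance */
    return DEP_DISTANCE;
}
```

When the bases differ by a symbolic amount, as here, neither the divisibility check nor the bound check can be evaluated at compile time.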

Comment 4 Sebastian Pop 2007-10-30 17:01:03 UTC
Subject: Re:  not vectorized: can't determine dependence (array sections)

On 1 Jul 2007 12:46:31 -0000, dorit at gcc dot gnu dot org
<gcc-bugzilla@gcc.gnu.org> wrote:
> (dependence classified: scev_not_known)
>

I still see the data dependence problem, but now the loop gets
vectorized on i686-linux with "-O3 -msse2" and also with "-O2
-ftree-vectorize -msse2"

> Looks like a similar problem to PR32378:

The same with this other PR.  The testcase gets vectorized even with
the failed dependence test.

Sebastian
Comment 5 Sebastian Pop 2007-10-30 17:36:55 UTC
Subject: Re:  not vectorized: can't determine dependence (array sections)

> The testcase gets vectorized even with the failed dependence test.
>

This is okay.  The test for non-dependence is added to the run-time
condition with which we version the code.  We call the function
vect_mark_for_runtime_alias_test instead of failing because of the
undetermined dependence test.

Sebastian
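
The versioning described above can be pictured with a hand-written C sketch. This is an illustration of the general idea only, not the code the vectorizer generates: `no_overlap` and `add_versioned` are hypothetical names, the 4-element grouping stands in for a vector of four floats, and only the `dst`/`src` pair is checked, whereas a real run-time alias test has to cover every pair of references whose dependence is undetermined.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns 1 when the n-element ranges starting at a and b do not overlap. */
int no_overlap(const float *a, const float *b, size_t n)
{
    uintptr_t pa = (uintptr_t)a, pb = (uintptr_t)b;
    size_t bytes = n * sizeof(float);
    return pa + bytes <= pb || pb + bytes <= pa;
}

/* Computes dst[i] = src[i] + inc[i] with the loop versioned on a
   run-time alias check between dst and src. */
void add_versioned(float *dst, const float *src, const float *inc, size_t n)
{
    size_t i = 0;

    if (no_overlap(dst, src, n)) {
        /* Fast version: each group of 4 elements is loaded and added
           before any of them is stored, as a 4-lane vector unit would do.
           This is only safe because dst and src are known to be disjoint. */
        for (; i + 4 <= n; i += 4) {
            float t[4];
            for (size_t k = 0; k < 4; k++)
                t[k] = src[i + k] + inc[i + k];
            for (size_t k = 0; k < 4; k++)
                dst[i + k] = t[k];
        }
    }

    /* Scalar version: handles the overlapping case and any remainder. */
    for (; i < n; i++)
        dst[i] = src[i] + inc[i];
}
```

The cost of this scheme is the extra check and the duplicated loop body, which a compile-time proof of independence would avoid.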
Comment 6 sebpop@gmail.com 2007-10-30 17:59:16 UTC
Subject: Re:  not vectorized: can't determine dependence (array sections)

I would like to keep the two bugs, PR32375 and PR32378, open, since we
should be able to vectorize them without having to version the code.
Comment 7 Richard Biener 2012-07-13 08:58:32 UTC
Link to vectorizer missed-optimization meta-bug.
Comment 8 Richard Biener 2012-07-16 13:03:10 UTC
The issue is that we have

<bb 5>:
  # j_2 = PHI <2(4), j_30(10)>
  pretmp.38_71 = (integer(kind=8)) j_2;
  pretmp.38_72 = stride.2_6 * pretmp.38_71;
  pretmp.38_73 = offset.3_8 + pretmp.38_72;
  pretmp.39_75 = j_2 + -1;
  pretmp.40_76 = (integer(kind=8)) pretmp.39_75;
  pretmp.40_77 = stride.2_6 * pretmp.40_76;
  pretmp.40_78 = offset.3_8 + pretmp.40_77;

<bb 6>:
  # i_37 = PHI <i_29(7), 1(5)>
...
  D.1940_22 = *aa_21(D)[D.1939_20];
  *aa_21(D)[D.1934_14] = D.1950_28;

and data-dependence analysis sees

(analyze_overlapping_iterations
  (chrec_a = {pretmp.40_78 + 1, +, 1}_2)
  (chrec_b = {pretmp.38_73 + 1, +, 1}_2)

and

(analyze_overlapping_iterations
  (chrec_a = {{(stride.2_6 + offset.3_8) + 1, +, stride.2_6}_1, +, 1}_2)
  (chrec_b = {{(stride.2_6 * 2 + offset.3_8) + 1, +, stride.2_6}_1, +, 1}_2)