gfortran -O2 -ftree-vectorize -ftree-vectorizer-verbose=2 -c -v s414a.f

The source and destination sections of aa(:,:) do not overlap, unless there is a subscript over-run. Even that case could be taken care of by loop reversal.

This is a simplification of a case from:
http://www.netlib.org/benchmark/vectors
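
The non-overlap claim for the aa(:,:) sections can be illustrated with a small C analogue. This is only a sketch: the body of s414a.f is in the attachment and is not reproduced here, so the function names, the bb operand, and the 0-based column-major layout with leading dimension ld are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical check: column j-1 occupies [(j-1)*ld, (j-1)*ld + n) and
   column j occupies [j*ld, j*ld + n). The two ranges are disjoint
   exactly when the section length n does not exceed ld, i.e. when
   there is no subscript over-run. */
int sections_disjoint(size_t ld, size_t n)
{
    return n <= ld;
}

/* Hypothetical stand-in for the Fortran loop nest. Element (i,j) of the
   column-major array lives at aa[j*ld + i]. Each column is computed
   from the previous column, so the inner loop reads one section and
   writes a different one. */
void add_columns(float *aa, const float *bb, size_t ld, size_t n,
                 size_t ncols)
{
    assert(sections_disjoint(ld, n)); /* assumed: no over-run */
    for (size_t j = 1; j < ncols; j++)
        for (size_t i = 0; i < n; i++)
            aa[j * ld + i] = aa[(j - 1) * ld + i] + bb[j * ld + i];
}
```

Under the n <= ld assumption the inner loop carries no dependence, which is why it ought to be vectorizable without any run-time check.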
Created attachment 13720: source code test case
I think some of this is related to PR 32075.
Looks like a similar problem to PR32378:

(compute_affine_dependence
  (stmt_a =
    D.1398_74 = (*aa_73(D))[D.1397_72])
  (stmt_b =
    (*aa_73(D))[D.1393_68] = D.1408_88)
  (subscript_dependence_tester
    (analyze_overlapping_iterations
      (chrec_a = {D.1396_71 + 1, +, 1}_2)
      (chrec_b = {D.1392_67 + 1, +, 1}_2)
      (analyze_siv_subscript
        siv test failed: unimplemented.
      )
      (overlap_iterations_a = not known
      )
      (overlap_iterations_b = not known
      )
    )
    (dependence classified: scev_not_known)
Subject: Re: not vectorized: can't determine dependence (array sections)

On 1 Jul 2007 12:46:31 -0000, dorit at gcc dot gnu dot org
<gcc-bugzilla@gcc.gnu.org> wrote:
> (dependence classified: scev_not_known)
>

I still see the data dependence problem, but now the loop gets vectorized on i686-linux with "-O3 -msse2" and also with "-O2 -ftree-vectorize -msse2".

> Looks like a similar problem to PR32378:

The same with this other PR. The testcase gets vectorized even with the failed dependence test.

Sebastian
Subject: Re: not vectorized: can't determine dependence (array sections)

> The testcase gets vectorized even with the failed dependence test.
>

This is okay. The test for non-dependence is added to the run-time condition with which we version the code. We call the function vect_mark_for_runtime_alias_test instead of failing because of the undetermined dependence test.

Sebastian
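
Written out by hand at the source level, loop versioning under a run-time alias test looks roughly like the following. This is a sketch of the general technique only, not the code GCC generates and not what vect_mark_for_runtime_alias_test itself does; the function names and the "+ 1.0f" loop body are invented for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns nonzero when [dst, dst+n) and [src, src+n) do not overlap.
   Assumes the address arithmetic does not wrap around. */
int ranges_disjoint(const float *dst, const float *src, size_t n)
{
    uintptr_t d = (uintptr_t)dst, s = (uintptr_t)src;
    uintptr_t bytes = (uintptr_t)(n * sizeof(float));
    return d + bytes <= s || s + bytes <= d;
}

/* Versioned loop: returns 1 when the no-alias version ran, 0 when the
   fallback version ran. Both versions compute the same result. */
int add_one_versioned(float *dst, const float *src, size_t n)
{
    if (ranges_disjoint(dst, src, n)) {
        /* Iterations are independent here, so this copy of the loop
           is the one a compiler could turn into vector code. */
        for (size_t i = 0; i < n; i++)
            dst[i] = src[i] + 1.0f;
        return 1;
    }
    /* Possible overlap: keep the original scalar iteration order. */
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i] + 1.0f;
    return 0;
}
```

The cost of this approach is the extra check and the duplicated loop body, which a compile-time proof of independence would avoid.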
Subject: Re: not vectorized: can't determine dependence (array sections)

I would like to keep the two bugs, PR32375 and PR32378, open, since the loops in both can be vectorized without having to version the code.
Link to vectorizer missed-optimization meta-bug.
The issue is that we have

<bb 5>:
  # j_2 = PHI <2(4), j_30(10)>
  pretmp.38_71 = (integer(kind=8)) j_2;
  pretmp.38_72 = stride.2_6 * pretmp.38_71;
  pretmp.38_73 = offset.3_8 + pretmp.38_72;
  pretmp.39_75 = j_2 + -1;
  pretmp.40_76 = (integer(kind=8)) pretmp.39_75;
  pretmp.40_77 = stride.2_6 * pretmp.40_76;
  pretmp.40_78 = offset.3_8 + pretmp.40_77;

<bb 6>:
  # i_37 = PHI <i_29(7), 1(5)>
  ...
  D.1940_22 = *aa_21(D)[D.1939_20];
  *aa_21(D)[D.1934_14] = D.1950_28;

and data-dependence analysis sees

(analyze_overlapping_iterations
  (chrec_a = {pretmp.40_78 + 1, +, 1}_2)
  (chrec_b = {pretmp.38_73 + 1, +, 1}_2)

and

(analyze_overlapping_iterations
  (chrec_a = {{(stride.2_6 + offset.3_8) + 1, +, stride.2_6}_1, +, 1}_2)
  (chrec_b = {{(stride.2_6 * 2 + offset.3_8) + 1, +, stride.2_6}_1, +, 1}_2)
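
In the first pair of chrecs both accesses advance by 1 in loop 2, and their initial values differ by pretmp.38_73 - pretmp.40_78 = stride.2_6, which is a symbolic value. With known constants the classic strong SIV test settles such a pair directly, as in the sketch below. This is a textbook formulation written for illustration, not GCC's analyze_siv_subscript; the function name and parameters are hypothetical.

```c
#include <stdlib.h>

/* Strong SIV test for two accesses {a0, +, step} and {b0, +, step}
   over iterations 0 .. niter-1 of the same loop.
   Returns 1 if some pair of iterations may touch the same element,
   0 if the accesses are proven independent. */
int strong_siv_may_overlap(long a0, long b0, long step, long niter)
{
    if (niter <= 0)
        return 0;                 /* loop never runs */
    if (step == 0)
        return a0 == b0;          /* both accesses are loop-invariant */

    long diff = b0 - a0;
    if (diff % step != 0)
        return 0;                 /* no integer iteration distance */

    /* a0 + step*x == b0 + step*y requires |x - y| == |diff / step|,
       which is only reachable if it is below the trip count. */
    return labs(diff / step) < niter;
}
```

For this bug report the call would correspond to step = 1 and diff = stride, so the accesses are independent whenever the inner trip count does not exceed the stride. Proving that at compile time needs a relation between the symbolic stride and the loop bound, which the dump does not provide.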