[PATCH PR78114]Refine gfortran.dg/vect/fast-math-mgrid-resid.f

Thu Nov 17 10:27:00 GMT 2016

On Thu, Nov 17, 2016 at 8:32 AM, Richard Biener
<richard.guenther@gmail.com> wrote:
> On Wed, Nov 16, 2016 at 6:20 PM, Bin Cheng <Bin.Cheng@arm.com> wrote:
>> Hi,
>> Currently test gfortran.dg/vect/fast-math-mgrid-resid.f checks all predictive commoning opportunities for all possible loops.  This makes it fragile because vectorizer may peel the loop differently, as well as may choose different vector factors.  For example, on x86-solaris, vectorizer doesn't peel for prologue loop; for -march=haswell, the case is long time failed because vector factor is 4, while iteration distance of predictive commoning opportunity is smaller than 4.  This patch refines it by only checking if predictive commoning variable is created when vector factor is 2; or vectorization variable is created when factor is 4.  This works since we have only one main loop, and only one vector factor can be used.
>> Test result checked for various x64 targets.  Is it OK?
>
> I think that as you write the test is somewhat fragile.  But rather
> than adjusting the scanning like you do
> I'd add --param vect-max-peeling-for-alignment=0 and -mprefer-avx128
In this way, is it better to add "--param
vect-max-peeling-for-alignment=0" for all targets?  Otherwise we still
need to differentiate test string to handle different targets.  But I
have another question here: what if a target can't handle unaligned
access and vectorizer have to peel for alignment for it?
Also do you think it's ok to check predictive commoning PHI node as below?
# vectp_u.122__lsm0.158_94 = PHI <vectp_u.122__lsm0.158_95(8), _96(6)>
In this way, we don't need to take possible prologue/epilogue loops
into consideration.

> as additional option on x86_64-*-* i?86-*-*.
>
> Your new pattern would fail with avx512 if vector (8) real would be used.
>
> What's the actual change that made the testcase fail btw?
There are two cases.
A) After vect_do_peeling change, vectorizer may only peel one
iteration for prologue loop (if vf == 2), below test string was added
for this reason:
! { dg-final { scan-tree-dump-times "Loop iterates only 1 time,
nothing to do" 1 "pcom" } }
This fails on x86_64 solaris because prologue loop is not peeled at all.
B) Depending on ilp, I think below test strings fail for long time with haswell:
! { dg-final { scan-tree-dump-times "Executing predictive commoning
without unrolling" 1 "pcom" { target lp64 } } }
! { dg-final { scan-tree-dump-times "Executing predictive commoning
without unrolling" 2 "pcom" { target ia32 } } }
Because vectorizer choose vf==4 in this case, and there is no
predictive commoning opportunities at all.
Also the newly added test string fails in this case too because the
prolog peeled iterates more than 1 times.

Thanks,
bin
>
> Richard.
>
>> Thanks,
>> bin
>>
>> gcc/testsuite/ChangeLog
>> 2016-11-16  Bin Cheng  <bin.cheng@arm.com>
>>
>>         PR testsuite/78114
>>         * gfortran.dg/vect/fast-math-mgrid-resid.f: Refine test by
>>         checking predictive commining variables in vectorized loop
>>         wrto vector factor.