This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH PR71734] Add missed check that reference defined inside loop.
- From: Yuri Rumyantsev <ysrumyan at gmail dot com>
- To: Richard Biener <richard dot guenther at gmail dot com>
- Cc: GCC Patches <gcc-patches at gcc dot gnu dot org>
- Date: Tue, 9 Aug 2016 15:05:35 +0300
- Subject: Re: [PATCH PR71734] Add missed check that reference defined inside loop.
- Authentication-results: sourceware.org; auth=none
- References: <CAEoMCqTfomhZ7PBOz+Quyx_1ZKsiOw9+OE=xSvZAphQezPE8vQ@mail.gmail.com> <CAEoMCqSroBFn_PPu+7N-2+c_v4SpPzfe03xUS=wK0ibwJNAhAg@mail.gmail.com> <CAFiYyc0VRd3rAR0S+-WWiA=0tfRp_DrgbjMwvCAo3tOfRTA-2g@mail.gmail.com> <CAEoMCqTVc2xFUz-GqnVMP66HQZSAyQG8KsbUkWsA7kDYb=yypw@mail.gmail.com> <CAFiYyc02xah6YyojHeTcK2+QY8iT7qRGawtYeJzd8C3gCqdPoQ@mail.gmail.com> <CAEoMCqSN8uMc7=rfvtRwve=CcD39v39fYWWs2ZFjc-t7tZ+e-A@mail.gmail.com> <CAFiYyc2LWtdLXADo799LJSfAX3EPxaVEoaOrSN-D2ahSN=+v4g@mail.gmail.com> <CAEoMCqSgtuGOQSHCwD7mmTxg7ZR1yt39oxeJrfK74AsgYTmpaw@mail.gmail.com> <CAFiYyc2uqQFYvJZDU8Yy0ZuOMYrr3hhhN2VyvSbCyEa-2JZopw@mail.gmail.com> <CAFiYyc3gS7W8vQRLiq+o2My3+e7oFAz+PsKhg2Xa+2PZSQT70w@mail.gmail.com> <CAEoMCqQYXtHgJtSmBnTMhKjRd9NjtCWR_rRiC+ndaDTTxzCJKA@mail.gmail.com> <CAFiYyc3UqfahbU6K-DEF6RpARKLmEYvMkabpWAS5qD+3oESk3w@mail.gmail.com> <CAEoMCqRyXetGHjNtDG-gS1AuV8a3TmMLVdhEMffSzso-JLHWhQ@mail.gmail.com> <578E4767.7040704@foss.arm.com> <CAEoMCqRpA6Dp3mSWDJYtOB0cQjeJHCa7WNruuUHEJ_Ng4QSAWw@mail.gmail.com> <CAEoMCqR3=nwgQMf6vE8A=OkWBbf_W3H-8gNSvkHDoch0rmWbYw@mail.gmail.com> <CAFiYyc0wG_f0YUyM0BotfLg+j-puvKCdMUgwBJsV6D15fBVWdw@mail.gmail.com> <CAEoMCqR1uNw=5_Ta7ooxbRz3HxqEsAV50kwQuoWjy_WTthaiCw@mail.gmail.com> <CAMe9rOoZX8RGHxz3fuU1ixbWSCx3cNNp34SgFDUhc8eAhc0jpg@mail.gmail.com> <CAEoMCqTJCPTz_hv1bHxme9S151ErPxjcuxp1s+ZeSL_TF3FQUw@mail.gmail.com> <CAEoMCqQQj6hbZQDAYKDmaGKUvJge9uzrzySyJB3-DUEPR3t3YQ@mail.gmail.com> <CAFiYyc2SJXBivoRz-BB8jyHvCWuNSHNSqtbn5mVEB60=2jcFbQ@mail.gmail.com> <CAEoMCqQo8wkH7M9uzrfnQ58EjaKSmmTetPVna2ug=J7Z0RHiOg@mail.gmail.com> <CAFiYyc1D9njK3YgZ7JepYX9momidA+PVogynkYBp12qX9-43eg@mail.gmail.com> <CAEoMCqRyCr804MQoRNNKYaajSjFE=guN8m1dvvV3jKUzpyNFMg@mail.gmail.com> <CAFiYyc3C7S18HejjE8BpGYCrPaujLT9K9AETZqejKjbbUW=xtQ@mail.gmail.com> <CAEoMCqRwWNVDwrmmLZhXHu_EjEbixMn15TWmTmGjHuwDCfNeUw@mail.gmail.com> <CAFiYyc3vGxm4m74b-RjcTy9eOD87nmVOYpKxkq+adfsETMQiZw@mail.gmail.com> <CAEoMCqS67FS8n-p6tyoCqDQh1E4VU5UaQ+dqJ3cm4zQYkJa23Q@mail.gmail.com> <CAFiYyc0cAcvPbt9KNWWUEt2NmrKGjj5+1LpLsHEZFVDmFvRuFw@mail.gmail.com>
Richard,
I checked that this move helps.
Does it mean that I've got approval to integrate it to trunk?
2016-08-09 14:33 GMT+03:00 Richard Biener <richard.guenther@gmail.com>:
> On Tue, Aug 9, 2016 at 1:26 PM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote:
>> Richard,
>>
>> The patch proposed by you does not work properly for
>> g++.dg/vect/pr70729-nest.cc test since the reference for this->S_n has
>> been cached as dependent for outer loop and loop is not vectorized:
>>
>> g++ -Ofast -fopenmp -mavx2 pr70729-nest.cc -c
>> -fdump-tree-vect-details
>> grep 'LOOP VECTORIZED' pr70729-nest.cc.149t.vect
>> <not found>
>>
>> You missed additional check I added before check on cached dependence.
>
> Ok, but it should get the correctness right?
>
> I suppose that if you move the cache checks inside the else clause it
> would work?
> I'd be ok with that change.
>
> Richard.
>
>> 2016-08-09 13:00 GMT+03:00 Richard Biener <richard.guenther@gmail.com>:
>>> On Tue, Aug 9, 2016 at 11:20 AM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote:
>>>> Yes it is impossible since all basic blocks are handled from outer
>>>> loops to innermost so we can not have the sequence with wrong
>>>> dependence, i.e. we cached that reference is independent (due to
>>>> safelen) but the same reference in outer loop must be evaluated as
>>>> dependent. So we must re-evaluate only dependent references in loops
>>>> having non-zero safelen attribute.
>>>
>>> Hmm. I don't like depending on this implementation detail. Does the
>>> attached patch work
>>> which simply avoids any positive/negative caching on safelen affected
>>> refs? It also makes
>>> the query cheaper by avoiding the dive into child loops.
>>>
>>> Richard.
>>>
>>>> 2016-08-09 11:44 GMT+03:00 Richard Biener <richard.guenther@gmail.com>:
>>>>> On Fri, Aug 5, 2016 at 4:57 PM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote:
>>>>>> Richard,
>>>>>>
>>>>>> I added additional check before caching dependencies since (1) all
>>>>>> statements in loop are handled in loop postorder order, i.e. form
>>>>>> outer to inner; (2) we can change dependency for reference in subloops
>>>>>> which have non-zero safelen attribute. So I propose to re-evaluate it
>>>>>> in such cases. I don't see why we need to avoid dependence caching for
>>>>>> all loop nests since pragma omp simd is used very rarely.
>>>>>
>>>>> You think it is impossible to construct a testcase which hits the
>>>>> correctness issue?
>>>>> "very rarely" is not a good argument to generate wrong code.
>>>>>
>>>>> Richard.
>>>>>
>>>>>> 2016-08-05 16:50 GMT+03:00 Richard Biener <richard.guenther@gmail.com>:
>>>>>>> On Fri, Aug 5, 2016 at 3:28 PM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote:
>>>>>>>> Richard,
>>>>>>>>
>>>>>>>> Here is updated patch which implements your proposal - I pass loop
>>>>>>>> instead of stmt to determine either REF is defined inside LOOP nest or
>>>>>>>> not. I checked that for pr70729-nest.cc the reference this->S_n for
>>>>>>>> statements which are out of innermost loop are not considered as
>>>>>>>> independent as you pointed out.
>>>>>>>>
>>>>>>>> Regression testing did not show any new failures and both failed tests
>>>>>>>> from libgomp.fortran suite now passed.
>>>>>>>>
>>>>>>>> Is it OK for trunk?
>>>>>>>
>>>>>>> I don't quite understand
>>>>>>>
>>>>>>> + /* Ignore dependence for loops having greater safelen. */
>>>>>>> + if (new_safelen == safelen
>>>>>>> + && bitmap_bit_p (&ref->dep_loop, LOOP_DEP_BIT (loop->num, stored_p)))
>>>>>>> return false;
>>>>>>>
>>>>>>> this seems to suggest (correctly I think) that we cannot rely on the caching
>>>>>>> for safelen, neither for optimal results (you seem to address that) but also
>>>>>>> not for correctness (we cache the no-dep result from a safelen run and
>>>>>>> then happily re-use that info for a ref that is not safe for safelen).
>>>>>>>
>>>>>>> It seems to me we need to avoid any caching if we made things independent
>>>>>>> because of safelen and simply not do the dep test afterwards. this means
>>>>>>> inlining ref_indep_loop_p_1 partly into _2 (not sure if there's a great way
>>>>>>> to do this w/o confusing the control flow).
>>>>>>>
>>>>>>> Richard.
>>>>>>>
>>>>>>>> ChangeLog:
>>>>>>>> 2016-08-05 Yuri Rumyantsev <ysrumyan@gmail.com>
>>>>>>>>
>>>>>>>> PR tree-optimization/71734
>>>>>>>> * tree-ssa-loop-im.c (ref_indep_loop_p): Add new argument REF_LOOP.
>>>>>>>> (outermost_indep_loop): Pass LOOP argumnet where REF was defined to
>>>>>>>> ref_indep_loop_p.
>>>>>>>> (ref_indep_loop_p_1): Fix commentary.
>>>>>>>> (ref_indep_loop_p_2): Add additional argument REF_LOOP, introduce new
>>>>>>>> variable NEW_SAFELEN which may have new value for SAFELEN, ignore
>>>>>>>> dependencde for loop having greater safelen value, pass REF_LOOP in
>>>>>>>> recursive call.
>>>>>>>> (can_sm_ref_p): Pass LOOP as additional argument to ref_indep_loop_p.
>>>>>>>>
>>>>>>>> 2016-08-03 16:44 GMT+03:00 Richard Biener <richard.guenther@gmail.com>:
>>>>>>>>> On Fri, Jul 29, 2016 at 4:00 PM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote:
>>>>>>>>>> Hi Richard.
>>>>>>>>>>
>>>>>>>>>> It turned out that the fix proposed by you does not work for liggomp
>>>>>>>>>> tests simd3 and simd4.
>>>>>>>>>> The reason is that we can't change safelen value for references not
>>>>>>>>>> defined inside loop. So I add missed check on it to patch.
>>>>>>>>>> Is it OK for trunk?
>>>>>>>>>
>>>>>>>>> Hmm, I don't like the walk of all subloops in ref_defined_in_loop_p as
>>>>>>>>> that operation can end up being quadratic in the loop depth/width.
>>>>>>>>>
>>>>>>>>> But I also wonder about correctness given that LIM "commons"
>>>>>>>>> references. So we can have
>>>>>>>>>
>>>>>>>>> for (;;)
>>>>>>>>> .. = ref; (1)
>>>>>>>>> for (;;) // safelen == 2 (2)
>>>>>>>>> ... = ref;
>>>>>>>>>
>>>>>>>>> and when looking at the ref at (1) which according to you should not
>>>>>>>>> have safelen applied your function will happily return that ref is defined
>>>>>>>>> in the inner loop.
>>>>>>>>>
>>>>>>>>> So it looks like to be able to apply safelen the caller of ref_indep_loop_p
>>>>>>>>> needs to pass down a ref plus a location (a stmt). In which case your
>>>>>>>>> function can simply use flow_loop_nested_p (loop, gimple_bb
>>>>>>>>> (stmt)->loop_father);
>>>>>>>>>
>>>>>>>>> Richard.
>>>>>>>>>
>>>>>>>>>> ChangeLog:
>>>>>>>>>> 2016-07-29 Yuri Rumyantsev <ysrumyan@gmail.com>
>>>>>>>>>>
>>>>>>>>>> PR tree-optimization/71734
>>>>>>>>>> * tree-ssa-loop-im.c (ref_defined_in_loop_p): New function.
>>>>>>>>>> (ref_indep_loop_p_2): Change SAFELEN value for REF defined inside LOOP.
>>>>>>>>>>
>>>>>>>>>> 2016-07-29 13:08 GMT+03:00 Yuri Rumyantsev <ysrumyan@gmail.com>:
>>>>>>>>>>> Sorry H.J.
>>>>>>>>>>>
>>>>>>>>>>> I checked both these tests manually but forgot to pass "-fopenmp" option.
>>>>>>>>>>> I will fix the issue asap.
>>>>>>>>>>>
>>>>>>>>>>> 2016-07-29 0:33 GMT+03:00 H.J. Lu <hjl.tools@gmail.com>:
>>>>>>>>>>>> On Thu, Jul 28, 2016 at 6:49 AM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote:
>>>>>>>>>>>>> Richard,
>>>>>>>>>>>>>
>>>>>>>>>>>>> I prepare a patch which is based on yours. New test is also included.
>>>>>>>>>>>>> Bootstrapping and regression testing did not show any new failures.
>>>>>>>>>>>>> Is it OK for trunk?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks.
>>>>>>>>>>>>> ChangeLog:
>>>>>>>>>>>>> 2016-07-28 Yuri Rumyantsev <ysrumyan@gmail.com>
>>>>>>>>>>>>>
>>>>>>>>>>>>> PR tree-optimization/71734
>>>>>>>>>>>>> * tree-ssa-loop-im.c (ref_indep_loop_p_1): Pass value of safelen
>>>>>>>>>>>>> attribute instead of REF_LOOP and use it.
>>>>>>>>>>>>> (ref_indep_loop_p_2): Use SAFELEN argument instead of REF_LOOP and
>>>>>>>>>>>>> set it for Loops having non-zero safelen attribute.
>>>>>>>>>>>>> (ref_indep_loop_p): Pass zero as initial value for safelen.
>>>>>>>>>>>>> gcc/testsuite/ChangeLog:
>>>>>>>>>>>>> * g++.dg/vect/pr70729-nest.cc: New test.
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Does this cause
>>>>>>>>>>>>
>>>>>>>>>>>> FAIL: libgomp.fortran/pr71734-1.f90 -O3 -fomit-frame-pointer
>>>>>>>>>>>> -funroll-loops -fpeel-loops -ftracer -finline-functions execution
>>>>>>>>>>>> test
>>>>>>>>>>>> FAIL: libgomp.fortran/pr71734-1.f90 -O3 -g execution test
>>>>>>>>>>>> FAIL: libgomp.fortran/pr71734-2.f90 -O3 -fomit-frame-pointer
>>>>>>>>>>>> -funroll-loops -fpeel-loops -ftracer -finline-functions execution
>>>>>>>>>>>> test
>>>>>>>>>>>> FAIL: libgomp.fortran/pr71734-2.f90 -O3 -g execution test
>>>>>>>>>>>>
>>>>>>>>>>>> on AVX machines and
>>>>>>>>>>>>
>>>>>>>>>>>> FAIL: libgomp.fortran/simd3.f90 -O3 -fomit-frame-pointer
>>>>>>>>>>>> -funroll-loops -fpeel-loops -ftracer -finline-functions execution
>>>>>>>>>>>> test
>>>>>>>>>>>> FAIL: libgomp.fortran/simd3.f90 -O3 -g execution test
>>>>>>>>>>>> FAIL: libgomp.fortran/simd4.f90 -O3 -fomit-frame-pointer
>>>>>>>>>>>> -funroll-loops -fpeel-loops -ftracer -finline-functions execution
>>>>>>>>>>>> test
>>>>>>>>>>>> FAIL: libgomp.fortran/simd4.f90 -O3 -g execution test
>>>>>>>>>>>>
>>>>>>>>>>>> on non-AVX machines?
>>>>>>>>>>>>
>>>>>>>>>>>> --
>>>>>>>>>>>> H.J.