This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH PR71734] Add missed check that reference defined inside loop.


On Tue, Aug 9, 2016 at 1:26 PM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote:
> Richard,
>
> The patch proposed by you does not work properly for
> g++.dg/vect/pr70729-nest.cc test since the reference for this->S_n has
> been cached as dependent for outer loop and loop is not vectorized:
>
>  g++ -Ofast -fopenmp -mavx2 pr70729-nest.cc -c
> -fdump-tree-vect-details
> grep 'LOOP VECTORIZED' pr70729-nest.cc.149t.vect
> <not found>
>
> You missed additional check I added before check on cached dependence.

Ok, but it should get the correctness right?

I suppose that if you move the cache checks inside the else clause it
would work?
I'd be ok with that change.

Richard.

> 2016-08-09 13:00 GMT+03:00 Richard Biener <richard.guenther@gmail.com>:
>> On Tue, Aug 9, 2016 at 11:20 AM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote:
>>> Yes it is impossible since all basic blocks are handled from outer
>>> loops to innermost so we can not have the sequence with wrong
>>> dependence, i.e. we cached that reference is independent (due to
>>> safelen) but the same reference in outer loop must be evaluated as
>>> dependent. So we must re-evaluate only dependent references in loops
>>> having non-zero safelen attribute.
>>
>> Hmm.  I don't like depending on this implementation detail.  Does the
>> attached patch work
>> which simply avoids any positive/negative caching on safelen affected
>> refs?  It also makes
>> the query cheaper by avoiding the dive into child loops.
>>
>> Richard.
>>
>>> 2016-08-09 11:44 GMT+03:00 Richard Biener <richard.guenther@gmail.com>:
>>>> On Fri, Aug 5, 2016 at 4:57 PM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote:
>>>>> Richard,
>>>>>
>>>>> I added additional check before caching dependencies since (1) all
>>>>> statements in loop are handled in loop postorder order, i.e. form
>>>>> outer to inner; (2) we can change dependency for reference in subloops
>>>>> which have non-zero safelen attribute. So I propose to re-evaluate it
>>>>> in such cases. I don't see why we need to avoid dependence caching for
>>>>> all loop nests since pragma omp simd is used very rarely.
>>>>
>>>> You think it is impossible to construct a testcase which hits the
>>>> correctness issue?
>>>> "very rarely" is not a good argument to generate wrong code.
>>>>
>>>> Richard.
>>>>
>>>>> 2016-08-05 16:50 GMT+03:00 Richard Biener <richard.guenther@gmail.com>:
>>>>>> On Fri, Aug 5, 2016 at 3:28 PM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote:
>>>>>>> Richard,
>>>>>>>
>>>>>>> Here is updated patch which implements your proposal - I pass loop
>>>>>>> instead of stmt to determine either REF is defined inside LOOP nest or
>>>>>>> not. I checked that for pr70729-nest.cc the reference this->S_n  for
>>>>>>> statements which are out of innermost loop are  not considered as
>>>>>>> independent as you pointed out.
>>>>>>>
>>>>>>> Regression testing did not show any new failures and both failed tests
>>>>>>> from libgomp.fortran suite now passed.
>>>>>>>
>>>>>>> Is it OK for trunk?
>>>>>>
>>>>>> I don't quite understand
>>>>>>
>>>>>> +  /* Ignore dependence for loops having greater safelen.  */
>>>>>> +  if (new_safelen == safelen
>>>>>> +      && bitmap_bit_p (&ref->dep_loop, LOOP_DEP_BIT (loop->num, stored_p)))
>>>>>>      return false;
>>>>>>
>>>>>> this seems to suggest (correctly I think) that we cannot rely on the caching
>>>>>> for safelen, neither for optimal results (you seem to address that) but also
>>>>>> not for correctness (we cache the no-dep result from a safelen run and
>>>>>> then happily re-use that info for a ref that is not safe for safelen).
>>>>>>
>>>>>> It seems to me we need to avoid any caching if we made things independent
>>>>>> because of safelen and simply not do the dep test afterwards.  this means
>>>>>> inlining ref_indep_loop_p_1 partly into _2 (not sure if there's a great way
>>>>>> to do this w/o confusing the control flow).
>>>>>>
>>>>>> Richard.
>>>>>>
>>>>>>> ChangeLog:
>>>>>>> 2016-08-05  Yuri Rumyantsev  <ysrumyan@gmail.com>
>>>>>>>
>>>>>>> PR tree-optimization/71734
>>>>>>> * tree-ssa-loop-im.c (ref_indep_loop_p): Add new argument REF_LOOP.
>>>>>>> (outermost_indep_loop): Pass LOOP argumnet where REF was defined to
>>>>>>> ref_indep_loop_p.
>>>>>>> (ref_indep_loop_p_1): Fix commentary.
>>>>>>> (ref_indep_loop_p_2): Add additional argument REF_LOOP, introduce new
>>>>>>> variable NEW_SAFELEN which may have new value for SAFELEN, ignore
>>>>>>> dependencde for loop having greater safelen value, pass REF_LOOP in
>>>>>>> recursive call.
>>>>>>> (can_sm_ref_p): Pass LOOP as additional argument to ref_indep_loop_p.
>>>>>>>
>>>>>>> 2016-08-03 16:44 GMT+03:00 Richard Biener <richard.guenther@gmail.com>:
>>>>>>>> On Fri, Jul 29, 2016 at 4:00 PM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote:
>>>>>>>>> Hi Richard.
>>>>>>>>>
>>>>>>>>> It turned out that the fix proposed by you does not work for liggomp
>>>>>>>>> tests simd3 and simd4.
>>>>>>>>> The reason is that we can't change safelen value for references not
>>>>>>>>> defined inside loop. So I add missed check on it to patch.
>>>>>>>>> Is it OK for trunk?
>>>>>>>>
>>>>>>>> Hmm, I don't like the walk of all subloops in ref_defined_in_loop_p as
>>>>>>>> that operation can end up being quadratic in the loop depth/width.
>>>>>>>>
>>>>>>>> But I also wonder about correctness given that LIM "commons"
>>>>>>>> references.  So we can have
>>>>>>>>
>>>>>>>>   for (;;)
>>>>>>>>     .. = ref;  (1)
>>>>>>>>     for (;;) // safelen == 2  (2)
>>>>>>>>       ... = ref;
>>>>>>>>
>>>>>>>> and when looking at the ref at (1) which according to you should not
>>>>>>>> have safelen applied your function will happily return that ref is defined
>>>>>>>> in the inner loop.
>>>>>>>>
>>>>>>>> So it looks like to be able to apply safelen the caller of ref_indep_loop_p
>>>>>>>> needs to pass down a ref plus a location (a stmt).  In which case your
>>>>>>>> function can simply use flow_loop_nested_p (loop, gimple_bb
>>>>>>>> (stmt)->loop_father);
>>>>>>>>
>>>>>>>> Richard.
>>>>>>>>
>>>>>>>>> ChangeLog:
>>>>>>>>> 2016-07-29  Yuri Rumyantsev  <ysrumyan@gmail.com>
>>>>>>>>>
>>>>>>>>> PR tree-optimization/71734
>>>>>>>>> * tree-ssa-loop-im.c (ref_defined_in_loop_p): New function.
>>>>>>>>> (ref_indep_loop_p_2): Change SAFELEN value for REF defined inside LOOP.
>>>>>>>>>
>>>>>>>>> 2016-07-29 13:08 GMT+03:00 Yuri Rumyantsev <ysrumyan@gmail.com>:
>>>>>>>>>> Sorry H.J.
>>>>>>>>>>
>>>>>>>>>> I checked both these tests manually but forgot to pass "-fopenmp" option.
>>>>>>>>>> I will fix the issue asap.
>>>>>>>>>>
>>>>>>>>>> 2016-07-29 0:33 GMT+03:00 H.J. Lu <hjl.tools@gmail.com>:
>>>>>>>>>>> On Thu, Jul 28, 2016 at 6:49 AM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote:
>>>>>>>>>>>> Richard,
>>>>>>>>>>>>
>>>>>>>>>>>> I prepare a patch which is based on yours. New test is also included.
>>>>>>>>>>>> Bootstrapping and regression testing did not show any new failures.
>>>>>>>>>>>> Is it OK for trunk?
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks.
>>>>>>>>>>>> ChangeLog:
>>>>>>>>>>>> 2016-07-28  Yuri Rumyantsev  <ysrumyan@gmail.com>
>>>>>>>>>>>>
>>>>>>>>>>>> PR tree-optimization/71734
>>>>>>>>>>>> * tree-ssa-loop-im.c (ref_indep_loop_p_1): Pass value of safelen
>>>>>>>>>>>> attribute instead of REF_LOOP and use it.
>>>>>>>>>>>> (ref_indep_loop_p_2): Use SAFELEN argument instead of REF_LOOP and
>>>>>>>>>>>> set it for Loops having non-zero safelen attribute.
>>>>>>>>>>>> (ref_indep_loop_p): Pass zero as initial value for safelen.
>>>>>>>>>>>> gcc/testsuite/ChangeLog:
>>>>>>>>>>>>         * g++.dg/vect/pr70729-nest.cc: New test.
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Does this cause
>>>>>>>>>>>
>>>>>>>>>>> FAIL: libgomp.fortran/pr71734-1.f90   -O3 -fomit-frame-pointer
>>>>>>>>>>> -funroll-loops -fpeel-loops -ftracer -finline-functions  execution
>>>>>>>>>>> test
>>>>>>>>>>> FAIL: libgomp.fortran/pr71734-1.f90   -O3 -g  execution test
>>>>>>>>>>> FAIL: libgomp.fortran/pr71734-2.f90   -O3 -fomit-frame-pointer
>>>>>>>>>>> -funroll-loops -fpeel-loops -ftracer -finline-functions  execution
>>>>>>>>>>> test
>>>>>>>>>>> FAIL: libgomp.fortran/pr71734-2.f90   -O3 -g  execution test
>>>>>>>>>>>
>>>>>>>>>>> on AVX machines and
>>>>>>>>>>>
>>>>>>>>>>> FAIL: libgomp.fortran/simd3.f90   -O3 -fomit-frame-pointer
>>>>>>>>>>> -funroll-loops -fpeel-loops -ftracer -finline-functions  execution
>>>>>>>>>>> test
>>>>>>>>>>> FAIL: libgomp.fortran/simd3.f90   -O3 -g  execution test
>>>>>>>>>>> FAIL: libgomp.fortran/simd4.f90   -O3 -fomit-frame-pointer
>>>>>>>>>>> -funroll-loops -fpeel-loops -ftracer -finline-functions  execution
>>>>>>>>>>> test
>>>>>>>>>>> FAIL: libgomp.fortran/simd4.f90   -O3 -g  execution test
>>>>>>>>>>>
>>>>>>>>>>> on non-AVX machines?
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> H.J.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]