This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH] Unswitching outer loops.
- From: Richard Biener <richard dot guenther at gmail dot com>
- To: Yuri Rumyantsev <ysrumyan at gmail dot com>
- Cc: gcc-patches <gcc-patches at gcc dot gnu dot org>, Igor Zamyatin <izamyatin at gmail dot com>
- Date: Mon, 3 Aug 2015 09:27:09 +0200
- Subject: Re: [PATCH] Unswitching outer loops.
- Authentication-results: sourceware.org; auth=none
- References: <CAEoMCqRi+=wqTGyJaYa+vQiYcxTz8SQGCk1KDjHTX9Ws2+sp6w at mail dot gmail dot com> <CAFiYyc3P5Pzy+dbJp1_j6wo8NKJFQGJJP_tntt5vX74xwXo0cw at mail dot gmail dot com> <CAEoMCqTX7Jw+uLy9MzrsygBKNjLKk2hfZKrCV7exD9kCD1SWAg at mail dot gmail dot com> <CAFiYyc30yxVbLVYCLX2bEszOP7J00+FZWL8DVvw+Eb7-T63epw at mail dot gmail dot com> <CAEoMCqTBc5ZJ7OeOGKzt_VZiRU_aJbF39QmYNxF4F8eYJ415PA at mail dot gmail dot com>
On Fri, Jul 31, 2015 at 1:17 PM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote:
> Hi Richard,
>
> I learned your updated patch for 23825 and it is more general in
> comparison with my.
> I'd like to propose you a compromise - let's consider my patch only
> for force-vectorize outer loop only to allow outer-loop
> vecctorization.
I don't see why we should special-case that if the approach in 23825
is sensible.
> Note that your approach will not hoist invariant
> guards if loops contains something else except for inner-loop, i.e. it
> won't be empty for taken branch.
Yes, it does not perform unswitching but guard hoisting. Note that this
is originally Zdenek Dvoraks patch.
> I also would like to answer on your last question - CFG cleanup is
> invoked to perform deletion of single-argument phi nodes from tail
> block through substitution - such phi's prevent outer-loop
> vectorization. But it is clear that such transformation can be done
> other pass.
Hmm, I wonder why the copy_prop pass after unswitching does not
get rid of them?
> What is your opinion?
My opinion is that if we want to enhance unswitching to catch this
(or similar) cases then we should make it a lot more general than
your pattern-matching approach. I see nothing that should prevent
us from considering unswitching non-innermost loops in general.
It should be only a cost consideration to not do non-innermost loop
unswitching (in addition to maybe a --param specifying the maximum
depth of a loop nest to unswitch).
So my first thought when seeing your patch still holds - the patch
looks very much too specific.
Richard.
> Yuri.
>
> 2015-07-28 13:50 GMT+03:00 Richard Biener <richard.guenther@gmail.com>:
>> On Thu, Jul 23, 2015 at 4:45 PM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote:
>>> Hi Richard,
>>>
>>> I checked that both test-cases from 23855 are sucessfully unswitched
>>> by proposed patch. I understand that it does not catch deeper loop
>>> nest as
>>> for (i=0; i<10; i++)
>>> for (j=0;j<n;j++)
>>> for (k=0;k<20;k++)
>>> ...
>>> but duplication of middle-loop does not look reasonable.
>>>
>>> Here is dump for your second test-case:
>>>
>>> void foo(int *ie, int *je, double *x)
>>> {
>>> int i, j;
>>> for (j=0; j<*je; ++j)
>>> for (i=0; i<*ie; ++i)
>>> x[i+j] = 0.0;
>>> }
>>> grep -i unswitch t6.c.119t.unswitch
>>> ;; Unswitching outer loop
>>
>> I was saying that why go with a limited approach when a patch (in
>> unknown state...)
>> is available that does it more generally? Also unswitching is quite
>> expensive compared
>> to "moving" the invariant condition.
>>
>> In your patch:
>>
>> + if (!nloop->force_vectorize)
>> + nloop->force_vectorize = true;
>> + if (loop->safelen != 0)
>> + nloop->safelen = loop->safelen;
>>
>> I see no guard on force_vectorize so = true looks bogus here. Please just use
>> copy_loop_info.
>>
>> + if (integer_nonzerop (cond_new))
>> + gimple_cond_set_condition_from_tree (cond_stmt, boolean_true_node);
>> + else if (integer_zerop (cond_new))
>> + gimple_cond_set_condition_from_tree (cond_stmt, boolean_false_node);
>>
>> gimple_cond_make_true/false (cond_stmt);
>>
>> btw, seems odd that we have to recompute which loop is the true / false variant
>> when we just fed a guard condition to loop_version. Can't we statically
>> determine whether loop or nloop has the in-loop condition true or false?
>>
>> + /* Clean-up cfg to remove useless one-argument phi in exit block of
>> + outer-loop. */
>> + cleanup_tree_cfg ();
>>
>> I know unswitching is already O(number-of-unswitched-loops * size-of-function)
>> because it updates SSA form after each individual unswitching (and it does that
>> because it invokes itself recursively on unswitched loops). But do you really
>> need to invoke CFG cleanup here?
>>
>> Richard.
>>
>>> Yuri.
>>>
>>> 2015-07-14 14:06 GMT+03:00 Richard Biener <richard.guenther@gmail.com>:
>>>> On Fri, Jul 10, 2015 at 12:02 PM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote:
>>>>> Hi All,
>>>>>
>>>>> Here is presented simple transformation which tries to hoist out of
>>>>> outer-loop a check on zero trip count for inner-loop. This is very
>>>>> restricted transformation since it accepts outer-loops with very
>>>>> simple cfg, as for example:
>>>>> acc = 0;
>>>>> for (i = 1; i <= m; i++) {
>>>>> for (j = 0; j < n; j++)
>>>>> if (l[j] == i) { v[j] = acc; acc++; };
>>>>> acc <<= 1;
>>>>> }
>>>>>
>>>>> Note that degenerative outer loop (without inner loop) will be
>>>>> completely deleted as dead code.
>>>>> The main goal of this transformation was to convert outer-loop to form
>>>>> accepted by outer-loop vectorization (such test-case is also included
>>>>> to patch).
>>>>>
>>>>> Bootstrap and regression testing did not show any new failures.
>>>>>
>>>>> Is it OK for trunk?
>>>>
>>>> I think this is
>>>>
>>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=23855
>>>>
>>>> as well. It has a patch adding a invariant loop guard hoisting
>>>> phase to loop-header copying. Yeah, it needs updating to
>>>> trunk again I suppose. It's always non-stage1 when I come
>>>> back to that patch.
>>>>
>>>> Your patch seems to be very specific and only handles outer
>>>> loops of innermost loops.
>>>>
>>>> Richard.
>>>>
>>>>> ChangeLog:
>>>>> 2015-07-10 Yuri Rumyantsev <ysrumyan@gmail.com>
>>>>>
>>>>> * tree-ssa-loop-unswitch.c: Include "tree-cfgcleanup.h" and
>>>>> "gimple-iterator.h", add prototype for tree_unswitch_outer_loop.
>>>>> (tree_ssa_unswitch_loops): Add invoke of tree_unswitch_outer_loop.
>>>>> (tree_unswitch_outer_loop): New function.
>>>>>
>>>>> gcc/testsuite/ChangeLog:
>>>>> * gcc.dg/tree-ssa/unswitch-outer-loop-1.c: New test.
>>>>> * gcc.dg/vect/vect-outer-simd-3.c: New test.