This is the mail archive of the
gcc@gcc.gnu.org
mailing list for the GCC project.
Re: Missed optimization in PRE?
- From: Richard Guenther <richard dot guenther at gmail dot com>
- To: "Bin.Cheng" <amker dot cheng at gmail dot com>
- Cc: gcc at gcc dot gnu dot org
- Date: Mon, 9 Apr 2012 13:02:35 +0200
- Subject: Re: Missed optimization in PRE?
- References: <CAHFci2_d_vcB_-iag7p6MeCOjKL60YrAreAbz6AWgY8+=3rCQg@mail.gmail.com> <CAFiYyc1EZM1pN5Zj19HugN_Ex=Nb3bf8sSquMiPkgCqOi-A_9Q@mail.gmail.com> <CAHFci2_4QJFN-zbN8vsf4xe+gavs5KJCV_SSworyvgj3W5GG4Q@mail.gmail.com> <CAFiYyc1SD6vdqMYjKVPPMOMLDXo6+5YO-39qQ86+F=02ZEuQNQ@mail.gmail.com> <CAHFci28gpFyRQvmJA-wv9EhRQjH46J137BD5DYBHw8-fermQBA@mail.gmail.com> <CAFiYyc0NrFsu=3iVUAFCVXaJuKj1c+mP3dFMeYOrVPPFd5mO+w@mail.gmail.com> <CAHFci2_egYpzF00z_FZ4swqc+1Z+fcrtZCjkrSKvww3FcaouVA@mail.gmail.com> <CAHFci2_P7L4=EGisnROrpq=8iuSTHh6KPFK6ZfgfHAdSei=bow@mail.gmail.com>
On Mon, Apr 9, 2012 at 8:00 AM, Bin.Cheng <amker.cheng@gmail.com> wrote:
> On Fri, Mar 30, 2012 at 5:43 PM, Bin.Cheng <amker.cheng@gmail.com> wrote:
>> On Fri, Mar 30, 2012 at 4:15 PM, Richard Guenther
>> <richard.guenther@gmail.com> wrote:
>>> On Thu, Mar 29, 2012 at 5:25 PM, Bin.Cheng <amker.cheng@gmail.com> wrote:
>>>> On Thu, Mar 29, 2012 at 6:14 PM, Richard Guenther
>>>> <richard.guenther@gmail.com> wrote:
>>>>> On Thu, Mar 29, 2012 at 12:10 PM, Bin.Cheng <amker.cheng@gmail.com> wrote:
>>>>>> On Thu, Mar 29, 2012 at 6:07 PM, Richard Guenther
>>>>>> <richard.guenther@gmail.com> wrote:
>>>>>>> On Thu, Mar 29, 2012 at 12:02 PM, Bin.Cheng <amker.cheng@gmail.com> wrote:
>>>>>>>> Hi,
>>>>>>>> Following is the tree dump of 094t.pre for a test program.
>>>>>>>> My question: the loads of D.5375_12/D.5375_14 are redundant on the
>>>>>>>> path <bb2, bb7, bb5, bb6>, so why are they not sunk into basic
>>>>>>>> block 3, where they are used?
>>>>>>>>
>>>>>>>> BTW, it seems no tree pass currently handles this case.
>>>>>>>
>>>>>>> tree-ssa-sink.c should do this.
>>>>>>>
>>>>>> It does not work for me; I will double-check and update soon.
>>>>>
>>>>> Well, "should" as in, it's the place to do it.  And certainly the pass can sink
>>>>> loads, so this must be a missed optimization.
>>>>>
>>>> Curiously, it is said explicitly that "We don't want to sink loads from memory."
>>>> in tree-ssa-sink.c function statement_sink_location, and the condition is
>>>>
>>>>   if (stmt_ends_bb_p (stmt)
>>>>       || gimple_has_side_effects (stmt)
>>>>       || gimple_has_volatile_ops (stmt)
>>>>       || (gimple_vuse (stmt) && !gimple_vdef (stmt))    <----- check load
>>>>       || (cfun->has_local_explicit_reg_vars
>>>>           && TYPE_MODE (TREE_TYPE (gimple_assign_lhs (stmt))) == BLKmode))
>>>>     return false;
>>>>
>>>> I haven't found any clue about this decision in ChangeLogs.
>>>
>>> Ah, that's probably because usually you want to hoist loads and sink stores,
>>> separating them (like a scheduler would do).  We'd want to restrict sinking
>>> of loads to sinking into regions that are not post-dominated (that is, where
>>> they end up being executed fewer times).
>
> Hi Richard,
> I am testing a patch to sink loads from memory into the proper basic block.
> Everything goes fine except auto-vectorization: sinking a load sometimes
> corrupts the canonical form of data references. I haven't touched auto-vec
> before and cannot tell whether it's good or bad to sink before auto-vec.
> For example, in slp-cond-1.c:
>
> <bb 3>:
>   # i_39 = PHI <i_32(11), 0(2)>
>   D.5150_5 = i_39 * 2;
>   D.5151_10 = D.5150_5 + 1;
>   D.5153_17 = a[D.5150_5];
>   D.5154_19 = b[D.5150_5];
>   if (D.5153_17 >= D.5154_19)
>     goto <bb 9>;
>   else
>     goto <bb 4>;
>
> <bb 9>:
>   d0_6 = d[D.5150_5];    <-----this is sunk from bb3
>   goto <bb 5>;
>
> <bb 4>:
>   e0_8 = e[D.5150_5];    <-----this is sunk from bb3
>
> <bb 5>:
>   # d0_2 = PHI <d0_6(9), e0_8(4)>
>   k[D.5150_5] = d0_2;
>   D.5159_26 = a[D.5151_10];
>   D.5160_29 = b[D.5151_10];
>   if (D.5159_26 >= D.5160_29)
>     goto <bb 10>;
>   else
>     goto <bb 6>;
>
> <bb 10>:
>   d1_11 = d[D.5151_10];    <-----this is sunk from bb3
>   goto <bb 7>;
>
> <bb 6>:
>   e1_14 = e[D.5151_10];    <-----this is sunk from bb3
>
> <bb 7>:
> .......
>
> I will look into auto-vect, but I am not sure how to handle this case.
>
> Any comments? Thanks very much.
Simple - the vectorizer expects empty latch blocks. So simply
never sink stuff into latch-blocks - I think the current code already
tries to avoid that for regular computations.
Richard.
> --
> Best Regards.
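
The two restrictions discussed in this thread (sink a load only into a block that does not post-dominate its original block, and never sink into a latch block) can be illustrated on a toy control-flow graph. The following is only a sketch under stated assumptions, not GCC code: `toy_cfg`, `compute_dom`, `may_sink_load` and `make_example_cfg` are hypothetical names, the graph is a small made-up loop rather than the one from slp-cond-1.c, and a latch is approximated as any block with an edge to a block that dominates it. The real pass would use GCC's own CFG, dominance and loop information.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_BLOCKS 32

/* Toy CFG (hypothetical, not GCC's data structure):
   succ[b] is a bitmask of the successors of block b.  */
struct toy_cfg {
  int n_blocks;
  int entry, exit;
  unsigned succ[MAX_BLOCKS];
};

static unsigned all_blocks (const struct toy_cfg *g)
{
  return g->n_blocks == 32 ? ~0u : ((1u << g->n_blocks) - 1u);
}

/* Iterative set-based (post-)dominator computation.
   out[b] becomes the set of blocks that dominate b, or that
   post-dominate b when POST is true.  */
static void compute_dom (const struct toy_cfg *g, bool post, unsigned out[])
{
  int root = post ? g->exit : g->entry;
  for (int b = 0; b < g->n_blocks; b++)
    out[b] = (b == root) ? (1u << b) : all_blocks (g);

  bool changed = true;
  while (changed) {
    changed = false;
    for (int b = 0; b < g->n_blocks; b++) {
      if (b == root)
        continue;
      unsigned meet = all_blocks (g);
      for (int o = 0; o < g->n_blocks; o++) {
        /* Dominators meet over predecessors of b,
           post-dominators over successors of b.  */
        bool edge = post ? ((g->succ[b] >> o) & 1u) : ((g->succ[o] >> b) & 1u);
        if (edge)
          meet &= out[o];
      }
      unsigned next = meet | (1u << b);
      if (next != out[b]) {
        out[b] = next;
        changed = true;
      }
    }
  }
}

/* Return true if a load in block FROM may be sunk into block TO
   under the toy rules described in the lead-in.  */
static bool may_sink_load (const struct toy_cfg *g, int from, int to)
{
  unsigned dom[MAX_BLOCKS], pdom[MAX_BLOCKS];
  assert (g->n_blocks > 0 && g->n_blocks <= MAX_BLOCKS);
  compute_dom (g, false, dom);
  compute_dom (g, true, pdom);

  if (!((dom[to] >> from) & 1u))
    return false;   /* FROM must dominate TO.  */
  if (g->succ[to] & dom[to])
    return false;   /* TO has a back edge: treat it as a latch.  */
  if ((pdom[from] >> to) & 1u)
    return false;   /* TO post-dominates FROM: nothing is saved.  */
  return true;
}

/* Made-up loop: 0 entry, 1 header, 2/3 the two arms of a
   condition, 4 join, 5 latch, 6 exit.  */
static struct toy_cfg make_example_cfg (void)
{
  struct toy_cfg g = { .n_blocks = 7, .entry = 0, .exit = 6 };
  g.succ[0] = 1u << 1;
  g.succ[1] = (1u << 2) | (1u << 3);
  g.succ[2] = 1u << 4;
  g.succ[3] = 1u << 4;
  g.succ[4] = (1u << 5) | (1u << 6);
  g.succ[5] = 1u << 1;   /* back edge to the header */
  return g;
}
```

In this toy graph, a load in the header (block 1) may be sunk into either arm of the condition (blocks 2 and 3), but not into the join block 4, which post-dominates the header, and nothing may be sunk from block 4 into the latch block 5.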