[PATCH GCC 6/9] Simplify control flow graph for vectorized loop

Bin.Cheng amker.cheng@gmail.com
Wed Sep 21 08:53:00 GMT 2016


On Wed, Sep 14, 2016 at 5:43 PM, Jeff Law <law@redhat.com> wrote:
> On 09/14/2016 07:21 AM, Richard Biener wrote:
>>
>> On Tue, Sep 6, 2016 at 8:52 PM, Bin Cheng <Bin.Cheng@arm.com> wrote:
>>>
>>> Hi,
>>> This is the main patch improving the control flow graph for vectorized
>>> loops.  It largely rewrites the loop peeling code in the vectorizer.  As
>>> described in the patch, for a typical loop to be vectorized like:
>>>
>>>        preheader:
>>>      LOOP:
>>>        header_bb:
>>>          loop_body
>>>          if (exit_loop_cond) goto exit_bb
>>>          else                goto header_bb
>>>        exit_bb:
>>>
>>> This patch peels the prolog and epilog from the loop and adds guards that
>>> skip PROLOG and EPILOG under various conditions.  As a result, the changed
>>> CFG would look like:
>>>
>>>        guard_bb_1:
>>>          if (prefer_scalar_loop) goto merge_bb_1
>>>          else                    goto guard_bb_2
>>>
>>>        guard_bb_2:
>>>          if (skip_prolog) goto merge_bb_2
>>>          else             goto prolog_preheader
>>>
>>>        prolog_preheader:
>>>      PROLOG:
>>>        prolog_header_bb:
>>>          prolog_body
>>>          if (exit_prolog_cond) goto prolog_exit_bb
>>>          else                  goto prolog_header_bb
>>>        prolog_exit_bb:
>>>
>>>        merge_bb_2:
>>>
>>>        vector_preheader:
>>>      VECTOR LOOP:
>>>        vector_header_bb:
>>>          vector_body
>>>          if (exit_vector_cond) goto vector_exit_bb
>>>          else                  goto vector_header_bb
>>>        vector_exit_bb:
>>>
>>>        guard_bb_3:
>>>          if (skip_epilog) goto merge_bb_3
>>>          else             goto epilog_preheader
>>>
>>>        merge_bb_1:
>>>
>>>        epilog_preheader:
>>>      EPILOG:
>>>        epilog_header_bb:
>>>          epilog_body
>>>          if (exit_epilog_cond) goto merge_bb_3
>>>          else                  goto epilog_header_bb
>>>
>>>        merge_bb_3:
>>>
>>>
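As a rough source-level illustration of the CFG above, the following C sketch spells out the three guards and the three loops with matching labels. It is not code from the patch: the function `sum_peeled`, the factor `VF`, the alignment-based prolog count, and the concrete guard conditions are all assumptions chosen for the example, and the inner `for` loop merely stands in for one vector operation.

```c
#include <stddef.h>
#include <stdint.h>

#define VF 4 /* hypothetical vectorization factor (elements per vector) */

/* Hypothetical example: sum n ints, laid out like the peeled CFG. */
int sum_peeled(const int *a, size_t n)
{
    size_t i = 0;
    int sum = 0;

    /* Assumed prolog purpose: peel scalar iterations until `a + i`
       is aligned to VF elements. */
    size_t misalign = ((uintptr_t)a / sizeof(int)) % VF;
    size_t prolog_niters = (VF - misalign) % VF;

    /* guard_bb_1: prefer_scalar_loop.  Not enough iterations for the
       prolog plus one full vector iteration, so use the epilog only. */
    if (n < prolog_niters + VF)
        goto merge_bb_1;

    /* guard_bb_2: skip_prolog. */
    if (prolog_niters == 0)
        goto merge_bb_2;

    /* PROLOG: scalar iterations before the vector loop. */
    do {
        sum += a[i];
        i++;
    } while (i < prolog_niters);

merge_bb_2:
    /* VECTOR LOOP: guard_bb_1 guarantees at least one full iteration. */
    do {
        for (size_t k = 0; k < VF; k++) /* stands in for one vector op */
            sum += a[i + k];
        i += VF;
    } while (i + VF <= n);

    /* guard_bb_3: skip_epilog when nothing is left over. */
    if (i == n)
        goto merge_bb_3;

merge_bb_1:
    /* EPILOG: remaining scalar iterations.  Written as a top-tested
       loop so the sketch is also safe for n == 0. */
    for (; i < n; i++)
        sum += a[i];

merge_bb_3:
    return sum;
}
```

In this sketch only guard_bb_1 and guard_bb_2 are evaluated before the vector loop is entered, which corresponds to the worst case of two branches mentioned in the description.
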
>>> Note this patch peels the prolog and epilog only when necessary, and
>>> likewise adds the different guard conditions/branches only when needed.
>>> Also, the first guard/branch could be further improved by merging it with
>>> loop versioning.
>>>
>>> Before this patch, up to 4 branch instructions had to be executed before
>>> the vectorized loop was reached in the worst case; with this patch the
>>> number is reduced to 2.  The patch also does better compile-time analysis
>>> to avoid unnecessary peeling/branching.
>>> From the implementation's point of view, the vectorizer needs to update
>>> induction variables and iteration bounds along with control flow changes.
>>> Unfortunately, it also becomes much harder to follow because the slpeel_*
>>> functions update SSA by themselves, rather than using the update_ssa
>>> interface.  This patch tries to factor out SSA/IV/Niter_bound changes from
>>> CFG changes.  This should make the implementation easier to read, and I
>>> think it may be a step toward replacing the slpeel_* functions with generic
>>> GIMPLE loop copy interfaces as Richard suggested.
>>
>>
>> I've skimmed over the patch and it looks reasonable to me.
>
> Thanks.  I was maybe 15% of the way through the main patch.  Nothing that
> gave me cause for concern, but I wasn't ready to ACK it myself yet.
Hi Jeff,
Any update on this one?  Also, might it conflict with the epilogue
vectorization patch set?

Thanks,
bin


