[PATCH, Loop optimizer]: Add logic to disable certain loop optimizations on pre-/post-loops

Richard Guenther richard.guenther@gmail.com
Sun Dec 19 00:29:00 GMT 2010


On Thu, Dec 16, 2010 at 6:22 PM, Fang, Changpeng <Changpeng.Fang@amd.com> wrote:
> My initial intention is not to unroll prologue and epilogue loops. An estimated trip count
> may not be that useful for the unrolling decision. To me, unrolling a loop that has at most
> 3 (or 7) iterations does not make sense. RTL unrolling does not use the estimated trip
> count to determine the unroll factor, and thus it may still unroll the loop 4 or 8 times if
> the loop is small (in # of insns). To make things simple, we just don't unroll such loops.
>
> However, a prologue or epilogue loop may still be a hot loop, depending on the outer
> loops. It may still be beneficial to perform other optimizations on such loops, if the code
> size is not expanded multiple times.
>
> For prefetching of prologue or epilogue loops, we have two choices: (1) prefetching but
> not unrolling, (2) not prefetching.  Which one do you prefer?

For small loop bodies it might make sense to completely peel the
prologue/epilogue loops (think of vectorizing doubles where those
loops roll at most once).  It would be nice to figure out if (or if not)
loop analysis (or later jump threading) is able to do that.
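
As a rough illustration of the "rolls at most once" case, here is a
hand-written sketch in C, assuming a vectorization factor of 2 (two
doubles per vector). The function names are made up for this example,
and the two-accumulator main loop only stands in for a real vector
body; it is not what the vectorizer emits.

```c
#include <stddef.h>

/* Main loop consumes two doubles per iteration (stand-in for a 2-lane
   vector body).  The epilogue is still written as a loop, although it
   can execute at most once because fewer than 2 elements remain.  */
double sum_epilogue_loop(const double *a, size_t n)
{
    double s0 = 0.0, s1 = 0.0;
    size_t i = 0;

    for (; i + 1 < n; i += 2) {
        s0 += a[i];
        s1 += a[i + 1];
    }
    for (; i < n; i++)      /* epilogue loop: 0 or 1 iterations */
        s0 += a[i];
    return s0 + s1;
}

/* Same computation with the epilogue completely peeled: the loop
   becomes a single guarded statement with no back edge.  */
double sum_epilogue_peeled(const double *a, size_t n)
{
    double s0 = 0.0, s1 = 0.0;
    size_t i = 0;

    for (; i + 1 < n; i += 2) {
        s0 += a[i];
        s1 += a[i + 1];
    }
    if (i < n)              /* peeled epilogue */
        s0 += a[i];
    return s0 + s1;
}
```

Both functions return the same value for any n; the second form is
what complete peeling of the epilogue would ideally produce.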

Richard.

> Thanks,
>
> Changpeng
>
>
>
> ________________________________________
> From: Zdenek Dvorak [rakdver@kam.mff.cuni.cz]
> Sent: Thursday, December 16, 2010 6:09 AM
> To: Richard Guenther
> Cc: Xinliang David Li; Fang, Changpeng; gcc-patches@gcc.gnu.org
> Subject: Re: [PATCH, Loop optimizer]: Add logic to disable certain loop optimizations on pre-/post-loops
>
> Hi,
>
>> Btw, any reason why we do not use static profiles for number of iteration
>> estimates?  We after all _do_ use the static profile to guide the
>> maybe_hot/cold_bb tests.
>
> for loops for which we cannot determine the # of iterations statically,
> basically the only important predictors are PRED_LOOP_BRANCH and
> PRED_LOOP_EXIT, which predict that the loop will iterate about 10 times.  So,
> by using static profile, we would just learn that every such loop is expected
> to iterate 10 times, which is kind of useless.
>
> Zdenek
>
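
For reference, the "about 10 times" figure is what a fixed back-edge
probability gives under a simple geometric model. The sketch below
assumes a probability of 0.9, chosen only to match that figure; it is
not the exact hit rate GCC assigns to PRED_LOOP_BRANCH or
PRED_LOOP_EXIT, and the function name is made up for this example.

```c
/* Expected number of loop iterations when the back edge is taken with
   a fixed probability p on every iteration (0 <= p < 1):
   1 + p + p^2 + ... = 1 / (1 - p).  */
double expected_iterations(double back_edge_prob)
{
    return 1.0 / (1.0 - back_edge_prob);
}
```

With back_edge_prob = 0.9 this yields roughly 10, and since the same
predictor value applies to every such loop, the estimate is the same
for all of them.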


