This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH] [RFC] loop index promotion pass
On Wed, May 20, 2009 at 4:23 PM, Nathan Froyd <froydnj@codesourcery.com> wrote:
> On Wed, May 20, 2009 at 03:51:11PM +0200, Richard Guenther wrote:
>> On Wed, May 20, 2009 at 3:39 PM, Nathan Froyd <froydnj@codesourcery.com> wrote:
>> > If INDEX is not modified elsewhere in the loop, INDEX and END are of the
>> > same type, and INCR is 1, then we know overflow does not occur.  This
>> > assumes INDEX is signed: I see after looking at
>> > analyze_loop_index_definition_pattern that it has comments about the
>> > safety of the transformation if INDEX is unsigned, but it never checks
>> > for TYPE_UNSIGNED.  Those checks might have gotten lost in the
>> > conversion to tuples; I will need to fix that.
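To illustrate why the signedness check matters, here is a minimal sketch
(not taken from the patch; the function names and the `cap` guard are
hypothetical). An unsigned index is allowed to wrap, so a loop whose exit
test depends on that wrap behaves differently once the index is widened:

```c
#include <stdint.h>

/* 8-bit unsigned index.  With end == 255 the increment wraps from 255
   back to 0, so "i <= end" never becomes false; the hypothetical cap
   argument only exists to keep this demonstration finite. */
unsigned count_narrow(uint8_t end, unsigned cap)
{
    unsigned trips = 0;
    for (uint8_t i = 0; i <= end && trips < cap; i++)
        trips++;
    return trips;
}

/* The same loop with the index widened by hand.  The index no longer
   wraps at 255, so the loop exits after end + 1 iterations. */
unsigned count_promoted(uint8_t end, unsigned cap)
{
    unsigned trips = 0;
    for (unsigned i = 0; i <= end && trips < cap; i++)
        trips++;
    return trips;
}
```

For small bounds the two agree, but for end == 255 the narrow loop runs
until the cap while the widened one stops after 256 iterations, so
promotion without a TYPE_UNSIGNED check would not be safe here.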
>>
>> Hmm.  So use simple_iv () to check if INDEX is of this kind.  Indeed
>> most of this would fit better in IVOPTs (simply add promoted
>> candidates and adjust their cost) like Steven said.
>
> I don't recall Steven's mail and I can't seem to find it in the
> archives.  Was this on IRC?
>
> I initially tried doing this pass using the loop optimization superpass,
> simple_iv and friends.  IIRC, there were two problems:
>
> - The loop indices weren't recognized as IVs because of the type
>   conversions done to them when being incremented and as part of the
>   loop exit testing.  (Possibly just the incrementing; I don't remember
>   that clearly and don't know the loop optimization bits well enough.)
>   Therefore, simple_iv &co was not an option.
It would be nice to have a smaller testcase that shows this.
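As a starting point, something of the following shape might do. This is a
hypothetical sketch, not code from the patch or from the benchmarks, and
whether it actually defeats simple_iv is an assumption that has not been
checked. The index is narrower than int, so the increment is computed in
int and converted back to short, and the index is widened again for the
exit test and for the address computation:

```c
/* Narrow index: "i++" is evaluated as (short) ((int) i + 1), and i is
   widened again for "i < n" and for the a[i] address computation.
   These are the kinds of type conversions described above. */
long sum_narrow(const int *a, short n)
{
    long s = 0;
    short i;
    for (i = 0; i < n; i++)
        s += a[i];
    return s;
}

/* What a promotion pass would ideally produce, written by hand: the
   index and the bound are widened once, outside the loop body.  This is
   valid here because i and n have the same signed type and the step is
   1, so the narrow index can never overflow. */
long sum_promoted(const int *a, short n)
{
    long s = 0;
    long end = n;
    for (long i = 0; i < end; i++)
        s += a[i];
    return s;
}
```

Comparing the IV analysis dumps for the two functions should show whether
the conversions in the first one are really what gets in the way.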
> - After you've promoted the variables, there are not enough scalar
>   optimizations running to make use of the newly promoted IVs.  I think
>   on the relevant benchmarks, there were a number of opportunities for
>   FRE to run, for instance.  Hence the current placement of the pass.
True. I am playing with the idea of moving PRE to after the loop optimizations
for this (and for other reasons) - I just didn't get around to benchmarking that.
> If the limitations with IV recognition can be fixed, then I'm happy to
> make the pass use them.  But the pass needs to be placed much earlier in
> the pipeline than where IVOPTs &co run.
It is indeed a pass ordering problem we have here. I am not totally against
a separate pass for this canonicalization, but it should try to build upon
existing infrastructure to make maintenance easier (for example, by using
SCEV and simple_iv).
Can you produce compile-time numbers to show how much overhead the
pass adds?
>> Note that IVOPTs even can create complex uses of a promoted IV,
>> like inserting truncations if they are necessary - you seem to
>> disqualify most complex uses.
>
> I would not be surprised if the pass is missing some opportunities.  If
> the pass can be made to use more robust machinery and automagically pick
> up those opportunities, that'd be great.
Of course.
Thanks,
Richard.