This is the mail archive of the
mailing list for the GCC project.
Re: Guard use of modulo in cshift (speedup protein)
On Tue, Apr 10, 2012 at 5:40 PM, Michael Matz <firstname.lastname@example.org> wrote:
> On Tue, 10 Apr 2012, Steven Bosscher wrote:
>> This is OK.
>> Do you think it would be worthwhile to do this transformation in the
>> middle end too, based on profile information for values?
> I'd think so, but it probably requires a new profiler that counts for how
> often 0 <= A <= B for every "A % B". Just profiling the range of values
> might be misleading (because A <= N and B <= M and N <= M doesn't imply
> that A <= B often holds).
> But it would possibly be an interesting experiment already to do such
> transformation generally (without profiling) and see what it gives on some
> benchmarks. Just to get a feel what's on the plate.
The question is, of course, why on earth is a modulo operation in the
loop setup so expensive that avoiding it improves the performance of
the overall routine so much ... did you expect the code-gen difference
of your patch?
>> IIRC value-prof
>> handles constant divmod but not ranges for modulo operations.
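
For readers following the thread, here is a minimal C sketch of the kind of
guarded modulo being discussed, together with the per-site counter that a
range-aware profiler might keep. It is an illustration only, not the cshift
patch and not GCC's value profiler: the names guarded_mod and mod_profile
are hypothetical, and b > 0 is assumed. The fast path uses a strict upper
bound (a < b), since a % b is 0 rather than a when a == b.

```c
#include <assert.h>

/* Hypothetical per-site counters; GCC's value profiler has no such record. */
struct mod_profile {
    unsigned long total;     /* times the "a % b" site was executed */
    unsigned long in_range;  /* times 0 <= a < b held, so a % b == a */
};

/* Guarded modulo: skip the division when the result is already known.
   Assumes b > 0.  prof may be NULL to disable counting. */
static int guarded_mod(int a, int b, struct mod_profile *prof)
{
    assert(b > 0);
    if (prof)
        prof->total++;
    if (a >= 0 && a < b) {
        if (prof)
            prof->in_range++;
        return a;            /* fast path: no division */
    }
    return a % b;            /* slow path: ordinary C remainder */
}
```

If in_range / total is close to 1 for a given site, the compare-and-branch
will almost always replace the division there; if it is low, the guard only
adds a branch on top of the division.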