[PATCH] Optimize certain end of loop conditions into min/max operation
Marc Glisse
marc.glisse@inria.fr
Fri Sep 18 07:41:00 GMT 2015
Just a couple extra points. We can end up with a mix of < and >, which
might prevent from matching:
_3 = b_1(D) > a_2(D);
_5 = a_2(D) < c_4(D);
_8 = _3 & _5;
Just like with &, we could also transform:
x < y | x < z ---> x < max(y, z)
(but maybe wait to make sure reviewers are ok with the first
transformation before generalizing)
On Fri, 18 Sep 2015, Marc Glisse wrote:
> On Thu, 17 Sep 2015, Michael Collison wrote:
>
>> Here is the the patch modified with test cases for MIN_EXPR and MAX_EXPR
>> expressions. I need some assistance; this test case will fail on targets
>> that don't have support for MIN/MAX such as 68k. Is there any way to remedy
>> this short of enumerating whether a target support MIN/MAX in
>> testsuite/lib/target_support?
>>
>> 2015-07-24 Michael Collison <michael.collison@linaro.org>
>> Andrew Pinski <andrew.pinski@caviumnetworks.com>
>>
>> * match.pd ((x < y) && (x < z) -> x < min (y,z),
>> (x > y) and (x > z) -> x > max (y,z))
>> * testsuite/gcc.dg/tree-ssa/minmax-loopend.c: New test.
>>
>> diff --git a/gcc/match.pd b/gcc/match.pd
>> index 5e8fd32..8691710 100644
>> --- a/gcc/match.pd
>> +++ b/gcc/match.pd
>> @@ -1793,3 +1793,17 @@ along with GCC; see the file COPYING3. If not see
>> (convert (bit_and (op (convert:utype @0) (convert:utype @1))
>> (convert:utype @4)))))))
>>
>> +
>> +/* Transform (@0 < @1 and @0 < @2) to use min */
>> +(for op (lt le)
>> +(simplify
>
> You seem to be missing all indentation.
>
>> +(bit_and:c (op @0 @1) (op @0 @2))
>
> :c seems useless here. On the other hand, it might make sense to use op:s
> since this is mostly useful if it removes the 2 original comparisons.
>
>> +(if (INTEGRAL_TYPE_P (TREE_TYPE (@0)))
>
> How did you chose this restriction? It seems safe enough, but the
> transformation could make sense in other cases as well. It can always be
> generalized later though.
>
>> +(op @0 (min @1 @2)))))
>> +
>> +/* Transform (@0 > @1 and @0 > @2) to use max */
>> +(for op (gt ge)
>
> Note that you could unify the patterns with something like:
> (for op (lt le gt ge)
> ext (min min max max)
> (simplify ...
>
>> +(simplify
>> +(bit_and:c (op @0 @1) (op @0 @2))
>> +(if (INTEGRAL_TYPE_P (TREE_TYPE (@0)))
>> +(op @0 (max @1 @2)))))
>> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-loopend.c
>> b/gcc/testsuite/gcc.dg/tree-ssa/minmax-loopend.c
>> new file mode 100644
>> index 0000000..cc0189a
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-loopend.c
>> @@ -0,0 +1,23 @@
>> +/* { dg-do compile } */
>> +/* { dg-options "-O2 -fdump-tree-optimized" } */
>> +
>> +#define N 1024
>> +
>> +int a[N], b[N], c[N];
>> +
>> +void add (unsigned int m, unsigned int n)
>> +{
>> + unsigned int i;
>> + for (i = 0; i < m && i < n; ++i)
>
> Maybe writing '&' instead of '&&' would make it depend less on the target.
> Also, both tests seem to be for GENERIC (i.e. I expect that you are already
> seeing the optimized version with -fdump-tree-original or
> -fdump-tree-gimple). Maybe something as simple as:
> int f(long a, long b, long c) {
> int cmp1 = a < b;
> int cmp2 = a < c;
> return cmp1 & cmp2;
> }
>
>> + a[i] = b[i] + c[i];
>> +}
>> +
>> +void add2 (unsigned int m, unsigned int n)
>> +{
>> + unsigned int i;
>> + for (i = N-1; i > m && i > n; --i)
>> + a[i] = b[i] + c[i];
>> +}
>> +
>> +/* { dg-final { scan-tree-dump "MIN_EXPR" 1 "optimized" } } */
>> +/* { dg-final { scan-tree-dump "MAX_EXPR" 1 "optimized" } } */
>
>
--
Marc Glisse
More information about the Gcc-patches
mailing list