[PATCH] Optimize certain end of loop conditions into min/max operation

Fri Sep 18 07:41:00 GMT 2015

Just a couple extra points. We can end up with a mix of < and >, which 
might prevent from matching:

   _3 = b_1(D) > a_2(D);
   _5 = a_2(D) < c_4(D);
   _8 = _3 & _5;

Just like with &, we could also transform:
x < y | x < z  --->  x < max(y, z)

(but maybe wait to make sure reviewers are ok with the first 
transformation before generalizing)

On Fri, 18 Sep 2015, Marc Glisse wrote:

> On Thu, 17 Sep 2015, Michael Collison wrote:
>
>> Here is the the patch modified with test cases for MIN_EXPR and MAX_EXPR 
>> expressions. I need some assistance; this test case will fail on targets 
>> that don't have support for MIN/MAX such as 68k. Is there any way to remedy 
>> this short of enumerating whether a target support MIN/MAX in 
>> testsuite/lib/target_support?
>> 
>> 2015-07-24  Michael Collison <michael.collison@linaro.org>
>>                    Andrew Pinski <andrew.pinski@caviumnetworks.com>
>>
>>                * match.pd ((x < y) && (x < z) -> x < min (y,z),
>>                (x > y) and (x > z) -> x > max (y,z))
>>                * testsuite/gcc.dg/tree-ssa/minmax-loopend.c: New test.
>> 
>> diff --git a/gcc/match.pd b/gcc/match.pd
>> index 5e8fd32..8691710 100644
>> --- a/gcc/match.pd
>> +++ b/gcc/match.pd
>> @@ -1793,3 +1793,17 @@ along with GCC; see the file COPYING3.  If not see
>>     (convert (bit_and (op (convert:utype @0) (convert:utype @1))
>>               (convert:utype @4)))))))
>> 
>> +
>> +/* Transform (@0 < @1 and @0 < @2) to use min */
>> +(for op (lt le)
>> +(simplify
>
> You seem to be missing all indentation.
>
>> +(bit_and:c (op @0 @1) (op @0 @2))
>
> :c seems useless here. On the other hand, it might make sense to use op:s 
> since this is mostly useful if it removes the 2 original comparisons.
>
>> +(if (INTEGRAL_TYPE_P (TREE_TYPE (@0)))
>
> How did you chose this restriction? It seems safe enough, but the 
> transformation could make sense in other cases as well. It can always be 
> generalized later though.
>
>> +(op @0 (min @1 @2)))))
>> +
>> +/* Transform (@0 > @1 and @0 > @2) to use max */
>> +(for op (gt ge)
>
> Note that you could unify the patterns with something like:
> (for op (lt le gt ge)
>     ext (min min max max)
> (simplify ...
>
>> +(simplify
>> +(bit_and:c (op @0 @1) (op @0 @2))
>> +(if (INTEGRAL_TYPE_P (TREE_TYPE (@0)))
>> +(op @0 (max @1 @2)))))
>> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-loopend.c 
>> b/gcc/testsuite/gcc.dg/tree-ssa/minmax-loopend.c
>> new file mode 100644
>> index 0000000..cc0189a
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-loopend.c
>> @@ -0,0 +1,23 @@
>> +/* { dg-do compile } */
>> +/* { dg-options "-O2 -fdump-tree-optimized" } */
>> +
>> +#define N 1024
>> +
>> +int a[N], b[N], c[N];
>> +
>> +void add (unsigned int m, unsigned int n)
>> +{
>> +  unsigned int i;
>> +  for (i = 0; i < m && i < n; ++i)
>
> Maybe writing '&' instead of '&&' would make it depend less on the target. 
> Also, both tests seem to be for GENERIC (i.e. I expect that you are already 
> seeing the optimized version with -fdump-tree-original or 
> -fdump-tree-gimple). Maybe something as simple as:
> int f(long a, long b, long c) {
>  int cmp1 = a < b;
>  int cmp2 = a < c;
>  return cmp1 & cmp2;
> }
>
>> +    a[i] = b[i] + c[i];
>> +}
>> +
>> +void add2 (unsigned int m, unsigned int n)
>> +{
>> +  unsigned int i;
>> +  for (i = N-1; i > m && i > n; --i)
>> +    a[i] = b[i] + c[i];
>> +}
>> +
>> +/* { dg-final { scan-tree-dump "MIN_EXPR" 1 "optimized" } } */
>> +/* { dg-final { scan-tree-dump "MAX_EXPR" 1 "optimized" } } */
>
>

-- 
Marc Glisse