This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] Factor out division by squares and remove division around comparisons (1/2)


On Tue, Aug 29, 2017 at 1:35 PM, Jackson Woodruff
<jackson.woodruff@foss.arm.com> wrote:
> Hi all,
>
> Apologies again to those CC'ed, who (again) received this twice.
>
> Joseph: Yes you are correct. I misread the original thread, now fixed.
>
> Richard: I've moved the optimizations out of fold-const.c. One has been
> replicated in match.pd, and the other (x / C +- y / C -> (x +- y) / C) I've
> deleted as it only introduced a new optimization when running with the flags
> '-O0 -funsafe-math-optimizations'.

Hmm, how did you verify that, that it only adds sth with -O0
-funsafe-math-optimizations?
Is that because in GIMPLE the reassoc pass should do this transform?

You added

+/* Simplify x / (- y) to -x / y.  */
+(simplify
+ (rdiv @0 (negate @1))
+ (rdiv (negate @0) @1))

for

      /* (-A) / (-B) -> A / B  */
      if (TREE_CODE (arg0) == NEGATE_EXPR && negate_expr_p (arg1))
        return fold_build2_loc (loc, RDIV_EXPR, type,
                            TREE_OPERAND (arg0, 0),
                            negate_expr (arg1));
      if (TREE_CODE (arg1) == NEGATE_EXPR && negate_expr_p (arg0))
        return fold_build2_loc (loc, RDIV_EXPR, type,
                            negate_expr (arg0),
                            TREE_OPERAND (arg1, 0));

presumably?  This isn't equivalent.  It's more like

(simplify
  (rdiv (negate_expr_p @0) (negate @1))
  (rdiv (negate @0) @1))
(simplify
  (rdiv (negate @0) (negate_expr_p @1))
  (rdiv @0 (negate @1)))

and you should remove the corresponding fold-const.c code.

Please do these changes independently to aid bisecting in case of issues.

 (if (flag_reciprocal_math)
- /* Convert (A/B)/C to A/(B*C)  */
+ /* Convert (A/B)/C to A/(B*C) where neither B nor C are constant.  */
  (simplify
   (rdiv (rdiv:s @0 @1) @2)
-   (rdiv @0 (mult @1 @2)))
+   (if (TREE_CODE (@1) != REAL_CST && TREE_CODE (@1) != REAL_CST)
+    (rdiv @0 (mult @1 @2))))

why?  I guess to avoid ping-poning with

+ /* Simplify x / (C * y) to (x / C) / y where C is a constant.  */
+ (simplify
+  (rdiv @0
+   (mult @1 REAL_CST@2))
+  (rdiv (rdiv @0 @2) @1))

?  If so why not just disallow for @1 == REAL_CST?

> On O1 and up, the pattern that replaces 'x / C' with 'x * (1 / C)'
> is enabled and then the pattern is covered by that and
> (x * C +- y * C -> C * (x +- y)) (which is already present in match.pd)
>
> I have also updated the testcase for those optimizations to use 'O1' to
> avoid that case.
>
>
> On 08/24/2017 10:06 PM, Jeff Law wrote:
>>
>> On 08/17/2017 03:55 AM, Wilco Dijkstra wrote:
>>>
>>> Richard Biener wrote:
>>>>
>>>> On Tue, Aug 15, 2017 at 4:11 PM, Wilco Dijkstra <Wilco.Dijkstra@arm.com>
>>>> wrote:
>>>>>
>>>>> Richard Biener wrote:
>>>>>>>
>>>>>>> We also change the association of
>>>>>>>
>>>>>>>        x / (y * C) -> (x / C) / y
>>>>>>>
>>>>>>> If C is a constant.
>>>>>>
>>>>>>
>>>>>> Why's that profitable?
>>>>>
>>>>>
>>>>> It enables (x * C1) / (y * C2) -> (x * C1/C2) / y for example.
>>>>> Also 1/y is now available to the reciprocal optimization, see
>>>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71026 for details.
>>>>
>>>>
>>>> Sure, but on its own it's going to be slower.  So this isn't the
>>>> correct way to enable those followup transforms.
>>>
>>>
>>> How can it be any slower? It's one division and one multiply in both
>>> cases.
>>
>> x / (y * C) -> (x / C) / y
>>
>> Goes from one division and one multiplication to two divisions.  I'm
>> guessing that's what Richi is (reasonably) concerned about.
>>
>> So it may be the case that we need more complex pattern matching here
>> for when to perform the first transformation to ultimately enable the
>> second.
>>
>
> The only case where we don't remove the division but still execute this
> pattern is when run with'-O0 -freciprocal-math'.
>
> As long as we have 'O1' or greater (and -freciprocal-math), that is enough
> to enable the removal of the reciprocal.

I don't see this.  I presume you mean this happens in the recip pass?
But we don't optimize this when optimizing for size (but the above pattern
still applies) or when targetm.min_divisions_for_recip_mul is too large.

So I still think this is the wrong place to do this and instead the recip
pass should be extended.



>
> Jackson.
>
>>
>> Jeff
>>
>


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]