Fix for 56175

Yuri Rumyantsev ysrumyan@gmail.com
Thu Feb 21 15:16:00 GMT 2013


Richard,

Sorry for my previous message - I did not undersatnd it properly.

Anyway I proposed another fix that avoid  (type) x & c --> (type) (x &
(type-x) c) transformation if x has short type:

+++ gcc/tree-ssa-forwprop.c     (working copy)
@@ -1789,8 +1789,11 @@
   defcodefor_name (arg1, &def1_code, &def1_arg1, &def1_arg2);
   defcodefor_name (arg2, &def2_code, &def2_arg1, &def2_arg2);

-  /* Try to fold (type) X op CST -> (type) (X op ((type-x) CST)).  */
+  /* Try to fold (type) X op CST -> (type) (X op ((type-x) CST)).
+     Do that only if X has not short type.  */
   if (TREE_CODE (arg2) == INTEGER_CST
+      && (TYPE_PRECISION (TREE_TYPE (arg1))
+          >= TYPE_PRECISION (integer_type_node))
       && CONVERT_EXPR_CODE_P (def1_code)
       && INTEGRAL_TYPE_P (TREE_TYPE (def1_arg1))
       && int_fits_type_p (arg2, TREE_TYPE (def1_arg1)))

Does this fix look suitable?

Yuri.

2013/2/21 Richard Biener <richard.guenther@gmail.com>:
> On Thu, Feb 21, 2013 at 3:26 PM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote:
>> Richard,
>>
>> I double checked that with and without my fix compiler produces the
>> same output with -fdump-tree-optimized.
>
> For what testcase?
>
> Richard.
>
>> What patch did you apply? I think that you should apply the second one
>> - 56175.diff.
>>
>> Yuri.
>>
>> 2013/2/21 Richard Biener <richard.guenther@gmail.com>:
>>> On Thu, Feb 21, 2013 at 2:41 PM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote:
>>>> Richard,
>>>>
>>>> This regression was introduced by Kai
>>>>
>>>> http://gcc.gnu.org/ml/gcc-patches/2011-06/msg01988.html
>>>>
>>>> 2011-06-27  Kai Tietz  <ktietz@redhat.com>
>>>>
>>>>         * tree-ssa-forwprop.c (simplify_bitwise_binary): Improve
>>>>         type sinking.
>>>>         * tree-ssa-math-opts.c (execute_optimize_bswap): Separate
>>>>         search for di/si mode patterns for finding widest match.
>>>>
>>>> Is it sufficient for you to accept my patch?
>>>
>>> No, I still don't think the fix is sound.  A proper fix would revert the
>>> above change (the simplify_bitwise_binary one), watch for testsuite
>>> fallout (I can immediately see gcc.dg/tree-ssa/bitwise-sink.c failing)
>>> and fix those failures in a better way.
>>>
>>> Richard.
>>>
>>>> Best regards.
>>>> yuri.
>>>>
>>>> 2013/2/21 Richard Biener <richard.guenther@gmail.com>:
>>>>> On Thu, Feb 21, 2013 at 1:39 PM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote:
>>>>>> Richard,
>>>>>>
>>>>>> As we know Kai is working on this problem for 4.9 and I assume that
>>>>>> type sinking will be deleted from forwprop pass. Could we stay on this
>>>>>> fix but more common fix will be done.
>>>>>
>>>>> Well, unless you show it is a regression the patch is not applicable for 4.8
>>>>> anyway.  Not sure if the code will be deleted from forwprop pass in 4.9 either,
>>>>> it is after all a canonicalization - fold seems to perform the opposite one
>>>>> though:
>>>>>
>>>>>       /* Convert (T)(x & c) into (T)x & (T)c, if c is an integer
>>>>>          constants (if x has signed type, the sign bit cannot be set
>>>>>          in c).  This folds extension into the BIT_AND_EXPR.
>>>>>
>>>>> note that what forwprop does (T)x & c -> (T)(x & c') I'd call type hoisting,
>>>>> not sinking.  Generally frontends and fold try to narrow operands when
>>>>> possible (even though some targets later widen them again because of
>>>>> instruction set constraints).
>>>>>
>>>>> Most of this hoisting code was done to make lowering logical && and ||
>>>>> I believe.  Looking at the testcases added tells us that while matching
>>>>> the two patterns as done now helps them but only because that pattern
>>>>> feeds single-operand instructions that then simplify.  So doing the transform
>>>>> starting from that single-operand instructions instead looks like a better
>>>>> fix (op !=/== 0/1 and (T) op) and also would not disagree with the
>>>>> canonicalization done by fold.
>>>>>
>>>>> Richard.
>>>>>
>>>>>> I also can propose to introduce new hook for it but need to know your
>>>>>> opinion since we don't went to waste our time on preparing dead
>>>>>> patches. Note that x86 supports all short types in HW and such type
>>>>>> sinkning is usually useless if short types are involved.
>>>>>>
>>>>>> Best regards.
>>>>>> Yuri.
>>>>>>
>>>>>>
>>>>>> 2013/2/21 Richard Biener <richard.guenther@gmail.com>:
>>>>>>> On Wed, Feb 20, 2013 at 4:41 PM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote:
>>>>>>>> Richard,
>>>>>>>>
>>>>>>>> First of all, your proposal to move type sinking to the end of
>>>>>>>> function does not work since we handle each statement in function and
>>>>>>>> we want that 1st type folding of X & C will not happen.
>>>>>>>> Note that we have the following sequence of gimple before forwprop1:
>>>>>>>>
>>>>>>>>    x.0_10 = (signed char) x_8;
>>>>>>>>   _11 = x.0_10 & 1;
>>>>>>>>   _12 = (signed char) y_9;
>>>>>>>>   _13 = _12 & 1;
>>>>>>>>   _14 = _11 ^ _13;
>>>>>>>
>>>>>>> Ah, indeed.  Reminds me of some of my dead patches that separated
>>>>>>> forwprop into a forward and backward stage.  Of course then you have
>>>>>>> the ordering issue of whether to first forward or backward.
>>>>>>>
>>>>>>> Which means that I bet you can construct a testcase that with
>>>>>>> your change is no longer optimized (just make pushing the conversion
>>>>>>> make the types _match_).  Which is always the case
>>>>>>> with this kind of local pattern-matching transforms.
>>>>>>>
>>>>>>> Currently forwprop processes leafs of expression trees first (well, inside
>>>>>>> a basic-block), similar to how fold () is supposed to be operated, based
>>>>>>> on the idea that simplified / canonicalized leafs helps keeping pattern
>>>>>>> recognition simple and cost considerations more accurate.
>>>>>>>
>>>>>>> When one order works better than another you always have to consider
>>>>>>> that the user could already have written the code in a way that results
>>>>>>> in the input that isn't well handled.
>>>>>>>
>>>>>>> Not that this helps very much for the situation ;)
>>>>>>>
>>>>>>> But I don't like the use of first_pass_instance ... and the fix isn't
>>>>>>> an improvement but just a hack for the benchmark.
>>>>>>>
>>>>>>> Richard.
>>>>>>>
>>>>>>>> I also added comment to my fix and create new test for it. I also
>>>>>>>> checked that this test is passed with patched compiler  only. So
>>>>>>>> Change Log was also modified:
>>>>>>>>
>>>>>>>> ChangeLog
>>>>>>>>
>>>>>>>> 2013-02-20  Yuri Rumyantsev  <ysrumyan@gmail.com>
>>>>>>>>
>>>>>>>>         PR tree-optimization/56175
>>>>>>>>         * tree-ssa-forwprop.c (simplify_bitwise_binary): Avoid type sinking
>>>>>>>>         at 1st forwprop pass to recognize (A & C) ^ (B & C) -> (A ^ B) & C
>>>>>>>>         for short integer types.
>>>>>>>>         * gcc.dg/pr56175.c: New test.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> 2013/2/20 Richard Biener <richard.guenther@gmail.com>:
>>>>>>>>> On Wed, Feb 20, 2013 at 1:00 PM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote:
>>>>>>>>>> Hi All,
>>>>>>>>>>
>>>>>>>>>> This patch is aimed to recognize (A & C) ^ (B & C) -> (A ^ B) & C
>>>>>>>>>> pattern in simpify_bitwise_binary for short integer types.
>>>>>>>>>> The fix is very simple - we simply turn off short type sinking at the
>>>>>>>>>> first pass of forward propagation allows to get
>>>>>>>>>> +10% speedup for important benchmark Coremark 1.0 at x86 Atom and
>>>>>>>>>> +5-7% for other x86 platforms too.
>>>>>>>>>> Bootstrapping and regression testing were successful on x86-64.
>>>>>>>>>>
>>>>>>>>>> Is it Ok for trunk?
>>>>>>>>>
>>>>>>>>> It definitely needs a comment before the checks.
>>>>>>>>>
>>>>>>>>> Also I think it simply shows that the code is placed at the wrong spot.
>>>>>>>>> Simply moving it down in simplify_bitwise_binary to be done the very last
>>>>>>>>> should get both of the effects done.
>>>>>>>>>
>>>>>>>>> Can you rework the patch according to that?
>>>>>>>>>
>>>>>>>>> You also miss a testcase, we should make sure to not regress again here.
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Richard.
>>>>>>>>>
>>>>>>>>>> ChangeLog.
>>>>>>>>>>
>>>>>>>>>> 2013-02-20  Yuri Rumyantsev  <ysrumyan@gmail.com>
>>>>>>>>>>
>>>>>>>>>>         PR tree-optimization/56175
>>>>>>>>>>         * tree-ssa-forwprop.c (simplify_bitwise_binary) : Avoid type sinking
>>>>>>>>>>         at 1st forwprop pass to recognize (A & C) ^ (B & C) -> (A ^ B) & C
>>>>>>>>>>         for short integer types.



More information about the Gcc-patches mailing list