This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: Fix for 56175


On Thu, Feb 21, 2013 at 4:16 PM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote:
> Richard,
>
> Sorry for my previous message - I did not undersatnd it properly.
>
> Anyway I proposed another fix that avoid  (type) x & c --> (type) (x &
> (type-x) c) transformation if x has short type:
>
> +++ gcc/tree-ssa-forwprop.c     (working copy)
> @@ -1789,8 +1789,11 @@
>    defcodefor_name (arg1, &def1_code, &def1_arg1, &def1_arg2);
>    defcodefor_name (arg2, &def2_code, &def2_arg1, &def2_arg2);
>
> -  /* Try to fold (type) X op CST -> (type) (X op ((type-x) CST)).  */
> +  /* Try to fold (type) X op CST -> (type) (X op ((type-x) CST)).
> +     Do that only if X has not short type.  */
>    if (TREE_CODE (arg2) == INTEGER_CST
> +      && (TYPE_PRECISION (TREE_TYPE (arg1))
> +          >= TYPE_PRECISION (integer_type_node))
>        && CONVERT_EXPR_CODE_P (def1_code)
>        && INTEGRAL_TYPE_P (TREE_TYPE (def1_arg1))
>        && int_fits_type_p (arg2, TREE_TYPE (def1_arg1)))
>
> Does this fix look suitable?

I think the fix is to disable the transform if it widens the operation.
Thus, sth like

    if (TREE_CODE (arg2) == INTEGER_CST
       && CONVERT_EXPR_CODE_P (def1_code)
       && INTEGRAL_TYPE_P (TREE_TYPE (def1_arg1))
+     && (TYPE_PRECISION (TREE_TYPE (arg1))
+            >= TYPE_PRECISION (TREE_TYPE (def1_arg1)))
       && int_fits_type_p (arg2, TREE_TYPE (def1_arg1)))

see the restriction we place on the transform for the (T1) x & (T2) y case:

  /* For bitwise binary operations apply operand conversions to the
     binary operation result instead of to the operands.  This allows
     to combine successive conversions and bitwise binary operations.  */
  if (CONVERT_EXPR_CODE_P (def1_code)
      && CONVERT_EXPR_CODE_P (def2_code)
      && types_compatible_p (TREE_TYPE (def1_arg1), TREE_TYPE (def2_arg1))
      /* Make sure that the conversion widens the operands, or has same
         precision,  or that it changes the operation to a bitfield
         precision.  */
      && ((TYPE_PRECISION (TREE_TYPE (def1_arg1))
           <= TYPE_PRECISION (TREE_TYPE (arg1)))
          || (GET_MODE_CLASS (TYPE_MODE (TREE_TYPE (arg1)))
              != MODE_INT)
          || (TYPE_PRECISION (TREE_TYPE (arg1))
              != GET_MODE_PRECISION (TYPE_MODE (TREE_TYPE (arg1))))))

ideally you'd split out that condition into a predicate function taking the
two type and use it in both places.

Richard.

> Yuri.
>
> 2013/2/21 Richard Biener <richard.guenther@gmail.com>:
>> On Thu, Feb 21, 2013 at 3:26 PM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote:
>>> Richard,
>>>
>>> I double checked that with and without my fix compiler produces the
>>> same output with -fdump-tree-optimized.
>>
>> For what testcase?
>>
>> Richard.
>>
>>> What patch did you apply? I think that you should apply the second one
>>> - 56175.diff.
>>>
>>> Yuri.
>>>
>>> 2013/2/21 Richard Biener <richard.guenther@gmail.com>:
>>>> On Thu, Feb 21, 2013 at 2:41 PM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote:
>>>>> Richard,
>>>>>
>>>>> This regression was introduced by Kai
>>>>>
>>>>> http://gcc.gnu.org/ml/gcc-patches/2011-06/msg01988.html
>>>>>
>>>>> 2011-06-27  Kai Tietz  <ktietz@redhat.com>
>>>>>
>>>>>         * tree-ssa-forwprop.c (simplify_bitwise_binary): Improve
>>>>>         type sinking.
>>>>>         * tree-ssa-math-opts.c (execute_optimize_bswap): Separate
>>>>>         search for di/si mode patterns for finding widest match.
>>>>>
>>>>> Is it sufficient for you to accept my patch?
>>>>
>>>> No, I still don't think the fix is sound.  A proper fix would revert the
>>>> above change (the simplify_bitwise_binary one), watch for testsuite
>>>> fallout (I can immediately see gcc.dg/tree-ssa/bitwise-sink.c failing)
>>>> and fix those failures in a better way.
>>>>
>>>> Richard.
>>>>
>>>>> Best regards.
>>>>> yuri.
>>>>>
>>>>> 2013/2/21 Richard Biener <richard.guenther@gmail.com>:
>>>>>> On Thu, Feb 21, 2013 at 1:39 PM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote:
>>>>>>> Richard,
>>>>>>>
>>>>>>> As we know Kai is working on this problem for 4.9 and I assume that
>>>>>>> type sinking will be deleted from forwprop pass. Could we stay on this
>>>>>>> fix but more common fix will be done.
>>>>>>
>>>>>> Well, unless you show it is a regression the patch is not applicable for 4.8
>>>>>> anyway.  Not sure if the code will be deleted from forwprop pass in 4.9 either,
>>>>>> it is after all a canonicalization - fold seems to perform the opposite one
>>>>>> though:
>>>>>>
>>>>>>       /* Convert (T)(x & c) into (T)x & (T)c, if c is an integer
>>>>>>          constants (if x has signed type, the sign bit cannot be set
>>>>>>          in c).  This folds extension into the BIT_AND_EXPR.
>>>>>>
>>>>>> note that what forwprop does (T)x & c -> (T)(x & c') I'd call type hoisting,
>>>>>> not sinking.  Generally frontends and fold try to narrow operands when
>>>>>> possible (even though some targets later widen them again because of
>>>>>> instruction set constraints).
>>>>>>
>>>>>> Most of this hoisting code was done to make lowering logical && and ||
>>>>>> I believe.  Looking at the testcases added tells us that while matching
>>>>>> the two patterns as done now helps them but only because that pattern
>>>>>> feeds single-operand instructions that then simplify.  So doing the transform
>>>>>> starting from that single-operand instructions instead looks like a better
>>>>>> fix (op !=/== 0/1 and (T) op) and also would not disagree with the
>>>>>> canonicalization done by fold.
>>>>>>
>>>>>> Richard.
>>>>>>
>>>>>>> I also can propose to introduce new hook for it but need to know your
>>>>>>> opinion since we don't went to waste our time on preparing dead
>>>>>>> patches. Note that x86 supports all short types in HW and such type
>>>>>>> sinkning is usually useless if short types are involved.
>>>>>>>
>>>>>>> Best regards.
>>>>>>> Yuri.
>>>>>>>
>>>>>>>
>>>>>>> 2013/2/21 Richard Biener <richard.guenther@gmail.com>:
>>>>>>>> On Wed, Feb 20, 2013 at 4:41 PM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote:
>>>>>>>>> Richard,
>>>>>>>>>
>>>>>>>>> First of all, your proposal to move type sinking to the end of
>>>>>>>>> function does not work since we handle each statement in function and
>>>>>>>>> we want that 1st type folding of X & C will not happen.
>>>>>>>>> Note that we have the following sequence of gimple before forwprop1:
>>>>>>>>>
>>>>>>>>>    x.0_10 = (signed char) x_8;
>>>>>>>>>   _11 = x.0_10 & 1;
>>>>>>>>>   _12 = (signed char) y_9;
>>>>>>>>>   _13 = _12 & 1;
>>>>>>>>>   _14 = _11 ^ _13;
>>>>>>>>
>>>>>>>> Ah, indeed.  Reminds me of some of my dead patches that separated
>>>>>>>> forwprop into a forward and backward stage.  Of course then you have
>>>>>>>> the ordering issue of whether to first forward or backward.
>>>>>>>>
>>>>>>>> Which means that I bet you can construct a testcase that with
>>>>>>>> your change is no longer optimized (just make pushing the conversion
>>>>>>>> make the types _match_).  Which is always the case
>>>>>>>> with this kind of local pattern-matching transforms.
>>>>>>>>
>>>>>>>> Currently forwprop processes leafs of expression trees first (well, inside
>>>>>>>> a basic-block), similar to how fold () is supposed to be operated, based
>>>>>>>> on the idea that simplified / canonicalized leafs helps keeping pattern
>>>>>>>> recognition simple and cost considerations more accurate.
>>>>>>>>
>>>>>>>> When one order works better than another you always have to consider
>>>>>>>> that the user could already have written the code in a way that results
>>>>>>>> in the input that isn't well handled.
>>>>>>>>
>>>>>>>> Not that this helps very much for the situation ;)
>>>>>>>>
>>>>>>>> But I don't like the use of first_pass_instance ... and the fix isn't
>>>>>>>> an improvement but just a hack for the benchmark.
>>>>>>>>
>>>>>>>> Richard.
>>>>>>>>
>>>>>>>>> I also added comment to my fix and create new test for it. I also
>>>>>>>>> checked that this test is passed with patched compiler  only. So
>>>>>>>>> Change Log was also modified:
>>>>>>>>>
>>>>>>>>> ChangeLog
>>>>>>>>>
>>>>>>>>> 2013-02-20  Yuri Rumyantsev  <ysrumyan@gmail.com>
>>>>>>>>>
>>>>>>>>>         PR tree-optimization/56175
>>>>>>>>>         * tree-ssa-forwprop.c (simplify_bitwise_binary): Avoid type sinking
>>>>>>>>>         at 1st forwprop pass to recognize (A & C) ^ (B & C) -> (A ^ B) & C
>>>>>>>>>         for short integer types.
>>>>>>>>>         * gcc.dg/pr56175.c: New test.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> 2013/2/20 Richard Biener <richard.guenther@gmail.com>:
>>>>>>>>>> On Wed, Feb 20, 2013 at 1:00 PM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote:
>>>>>>>>>>> Hi All,
>>>>>>>>>>>
>>>>>>>>>>> This patch is aimed to recognize (A & C) ^ (B & C) -> (A ^ B) & C
>>>>>>>>>>> pattern in simpify_bitwise_binary for short integer types.
>>>>>>>>>>> The fix is very simple - we simply turn off short type sinking at the
>>>>>>>>>>> first pass of forward propagation allows to get
>>>>>>>>>>> +10% speedup for important benchmark Coremark 1.0 at x86 Atom and
>>>>>>>>>>> +5-7% for other x86 platforms too.
>>>>>>>>>>> Bootstrapping and regression testing were successful on x86-64.
>>>>>>>>>>>
>>>>>>>>>>> Is it Ok for trunk?
>>>>>>>>>>
>>>>>>>>>> It definitely needs a comment before the checks.
>>>>>>>>>>
>>>>>>>>>> Also I think it simply shows that the code is placed at the wrong spot.
>>>>>>>>>> Simply moving it down in simplify_bitwise_binary to be done the very last
>>>>>>>>>> should get both of the effects done.
>>>>>>>>>>
>>>>>>>>>> Can you rework the patch according to that?
>>>>>>>>>>
>>>>>>>>>> You also miss a testcase, we should make sure to not regress again here.
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Richard.
>>>>>>>>>>
>>>>>>>>>>> ChangeLog.
>>>>>>>>>>>
>>>>>>>>>>> 2013-02-20  Yuri Rumyantsev  <ysrumyan@gmail.com>
>>>>>>>>>>>
>>>>>>>>>>>         PR tree-optimization/56175
>>>>>>>>>>>         * tree-ssa-forwprop.c (simplify_bitwise_binary) : Avoid type sinking
>>>>>>>>>>>         at 1st forwprop pass to recognize (A & C) ^ (B & C) -> (A ^ B) & C
>>>>>>>>>>>         for short integer types.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]