This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: Fix for 56175


On Thu, Feb 21, 2013 at 3:26 PM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote:
> Richard,
>
> I double checked that with and without my fix compiler produces the
> same output with -fdump-tree-optimized.

For what testcase?

Richard.

> What patch did you apply? I think that you should apply the second one
> - 56175.diff.
>
> Yuri.
>
> 2013/2/21 Richard Biener <richard.guenther@gmail.com>:
>> On Thu, Feb 21, 2013 at 2:41 PM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote:
>>> Richard,
>>>
>>> This regression was introduced by Kai
>>>
>>> http://gcc.gnu.org/ml/gcc-patches/2011-06/msg01988.html
>>>
>>> 2011-06-27  Kai Tietz  <ktietz@redhat.com>
>>>
>>>         * tree-ssa-forwprop.c (simplify_bitwise_binary): Improve
>>>         type sinking.
>>>         * tree-ssa-math-opts.c (execute_optimize_bswap): Separate
>>>         search for di/si mode patterns for finding widest match.
>>>
>>> Is it sufficient for you to accept my patch?
>>
>> No, I still don't think the fix is sound.  A proper fix would revert the
>> above change (the simplify_bitwise_binary one), watch for testsuite
>> fallout (I can immediately see gcc.dg/tree-ssa/bitwise-sink.c failing)
>> and fix those failures in a better way.
>>
>> Richard.
>>
>>> Best regards.
>>> yuri.
>>>
>>> 2013/2/21 Richard Biener <richard.guenther@gmail.com>:
>>>> On Thu, Feb 21, 2013 at 1:39 PM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote:
>>>>> Richard,
>>>>>
>>>>> As we know Kai is working on this problem for 4.9 and I assume that
>>>>> type sinking will be deleted from forwprop pass. Could we stay on this
>>>>> fix but more common fix will be done.
>>>>
>>>> Well, unless you show it is a regression the patch is not applicable for 4.8
>>>> anyway.  Not sure if the code will be deleted from forwprop pass in 4.9 either,
>>>> it is after all a canonicalization - fold seems to perform the opposite one
>>>> though:
>>>>
>>>>       /* Convert (T)(x & c) into (T)x & (T)c, if c is an integer
>>>>          constants (if x has signed type, the sign bit cannot be set
>>>>          in c).  This folds extension into the BIT_AND_EXPR.
>>>>
>>>> note that what forwprop does (T)x & c -> (T)(x & c') I'd call type hoisting,
>>>> not sinking.  Generally frontends and fold try to narrow operands when
>>>> possible (even though some targets later widen them again because of
>>>> instruction set constraints).
>>>>
>>>> Most of this hoisting code was done to make lowering logical && and ||
>>>> I believe.  Looking at the testcases added tells us that while matching
>>>> the two patterns as done now helps them but only because that pattern
>>>> feeds single-operand instructions that then simplify.  So doing the transform
>>>> starting from that single-operand instructions instead looks like a better
>>>> fix (op !=/== 0/1 and (T) op) and also would not disagree with the
>>>> canonicalization done by fold.
>>>>
>>>> Richard.
>>>>
>>>>> I also can propose to introduce new hook for it but need to know your
>>>>> opinion since we don't went to waste our time on preparing dead
>>>>> patches. Note that x86 supports all short types in HW and such type
>>>>> sinkning is usually useless if short types are involved.
>>>>>
>>>>> Best regards.
>>>>> Yuri.
>>>>>
>>>>>
>>>>> 2013/2/21 Richard Biener <richard.guenther@gmail.com>:
>>>>>> On Wed, Feb 20, 2013 at 4:41 PM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote:
>>>>>>> Richard,
>>>>>>>
>>>>>>> First of all, your proposal to move type sinking to the end of
>>>>>>> function does not work since we handle each statement in function and
>>>>>>> we want that 1st type folding of X & C will not happen.
>>>>>>> Note that we have the following sequence of gimple before forwprop1:
>>>>>>>
>>>>>>>    x.0_10 = (signed char) x_8;
>>>>>>>   _11 = x.0_10 & 1;
>>>>>>>   _12 = (signed char) y_9;
>>>>>>>   _13 = _12 & 1;
>>>>>>>   _14 = _11 ^ _13;
>>>>>>
>>>>>> Ah, indeed.  Reminds me of some of my dead patches that separated
>>>>>> forwprop into a forward and backward stage.  Of course then you have
>>>>>> the ordering issue of whether to first forward or backward.
>>>>>>
>>>>>> Which means that I bet you can construct a testcase that with
>>>>>> your change is no longer optimized (just make pushing the conversion
>>>>>> make the types _match_).  Which is always the case
>>>>>> with this kind of local pattern-matching transforms.
>>>>>>
>>>>>> Currently forwprop processes leafs of expression trees first (well, inside
>>>>>> a basic-block), similar to how fold () is supposed to be operated, based
>>>>>> on the idea that simplified / canonicalized leafs helps keeping pattern
>>>>>> recognition simple and cost considerations more accurate.
>>>>>>
>>>>>> When one order works better than another you always have to consider
>>>>>> that the user could already have written the code in a way that results
>>>>>> in the input that isn't well handled.
>>>>>>
>>>>>> Not that this helps very much for the situation ;)
>>>>>>
>>>>>> But I don't like the use of first_pass_instance ... and the fix isn't
>>>>>> an improvement but just a hack for the benchmark.
>>>>>>
>>>>>> Richard.
>>>>>>
>>>>>>> I also added comment to my fix and create new test for it. I also
>>>>>>> checked that this test is passed with patched compiler  only. So
>>>>>>> Change Log was also modified:
>>>>>>>
>>>>>>> ChangeLog
>>>>>>>
>>>>>>> 2013-02-20  Yuri Rumyantsev  <ysrumyan@gmail.com>
>>>>>>>
>>>>>>>         PR tree-optimization/56175
>>>>>>>         * tree-ssa-forwprop.c (simplify_bitwise_binary): Avoid type sinking
>>>>>>>         at 1st forwprop pass to recognize (A & C) ^ (B & C) -> (A ^ B) & C
>>>>>>>         for short integer types.
>>>>>>>         * gcc.dg/pr56175.c: New test.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> 2013/2/20 Richard Biener <richard.guenther@gmail.com>:
>>>>>>>> On Wed, Feb 20, 2013 at 1:00 PM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote:
>>>>>>>>> Hi All,
>>>>>>>>>
>>>>>>>>> This patch is aimed to recognize (A & C) ^ (B & C) -> (A ^ B) & C
>>>>>>>>> pattern in simpify_bitwise_binary for short integer types.
>>>>>>>>> The fix is very simple - we simply turn off short type sinking at the
>>>>>>>>> first pass of forward propagation allows to get
>>>>>>>>> +10% speedup for important benchmark Coremark 1.0 at x86 Atom and
>>>>>>>>> +5-7% for other x86 platforms too.
>>>>>>>>> Bootstrapping and regression testing were successful on x86-64.
>>>>>>>>>
>>>>>>>>> Is it Ok for trunk?
>>>>>>>>
>>>>>>>> It definitely needs a comment before the checks.
>>>>>>>>
>>>>>>>> Also I think it simply shows that the code is placed at the wrong spot.
>>>>>>>> Simply moving it down in simplify_bitwise_binary to be done the very last
>>>>>>>> should get both of the effects done.
>>>>>>>>
>>>>>>>> Can you rework the patch according to that?
>>>>>>>>
>>>>>>>> You also miss a testcase, we should make sure to not regress again here.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Richard.
>>>>>>>>
>>>>>>>>> ChangeLog.
>>>>>>>>>
>>>>>>>>> 2013-02-20  Yuri Rumyantsev  <ysrumyan@gmail.com>
>>>>>>>>>
>>>>>>>>>         PR tree-optimization/56175
>>>>>>>>>         * tree-ssa-forwprop.c (simplify_bitwise_binary) : Avoid type sinking
>>>>>>>>>         at 1st forwprop pass to recognize (A & C) ^ (B & C) -> (A ^ B) & C
>>>>>>>>>         for short integer types.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]