This is the mail archive of the
mailing list for the GCC project.
Re: Fix for 56175
On Thu, Feb 21, 2013 at 1:39 PM, Yuri Rumyantsev <email@example.com> wrote:
> As we know Kai is working on this problem for 4.9 and I assume that
> type sinking will be deleted from forwprop pass. Could we stay on this
> fix but more common fix will be done.
Well, unless you show it is a regression the patch is not applicable for 4.8
anyway. Not sure if the code will be deleted from forwprop pass in 4.9 either,
it is after all a canonicalization - fold seems to perform the opposite one
/* Convert (T)(x & c) into (T)x & (T)c, if c is an integer
constants (if x has signed type, the sign bit cannot be set
in c). This folds extension into the BIT_AND_EXPR.
note that what forwprop does (T)x & c -> (T)(x & c') I'd call type hoisting,
not sinking. Generally frontends and fold try to narrow operands when
possible (even though some targets later widen them again because of
instruction set constraints).
Most of this hoisting code was done to make lowering logical && and ||
I believe. Looking at the testcases added tells us that while matching
the two patterns as done now helps them but only because that pattern
feeds single-operand instructions that then simplify. So doing the transform
starting from that single-operand instructions instead looks like a better
fix (op !=/== 0/1 and (T) op) and also would not disagree with the
canonicalization done by fold.
> I also can propose to introduce new hook for it but need to know your
> opinion since we don't went to waste our time on preparing dead
> patches. Note that x86 supports all short types in HW and such type
> sinkning is usually useless if short types are involved.
> Best regards.
> 2013/2/21 Richard Biener <firstname.lastname@example.org>:
>> On Wed, Feb 20, 2013 at 4:41 PM, Yuri Rumyantsev <email@example.com> wrote:
>>> First of all, your proposal to move type sinking to the end of
>>> function does not work since we handle each statement in function and
>>> we want that 1st type folding of X & C will not happen.
>>> Note that we have the following sequence of gimple before forwprop1:
>>> x.0_10 = (signed char) x_8;
>>> _11 = x.0_10 & 1;
>>> _12 = (signed char) y_9;
>>> _13 = _12 & 1;
>>> _14 = _11 ^ _13;
>> Ah, indeed. Reminds me of some of my dead patches that separated
>> forwprop into a forward and backward stage. Of course then you have
>> the ordering issue of whether to first forward or backward.
>> Which means that I bet you can construct a testcase that with
>> your change is no longer optimized (just make pushing the conversion
>> make the types _match_). Which is always the case
>> with this kind of local pattern-matching transforms.
>> Currently forwprop processes leafs of expression trees first (well, inside
>> a basic-block), similar to how fold () is supposed to be operated, based
>> on the idea that simplified / canonicalized leafs helps keeping pattern
>> recognition simple and cost considerations more accurate.
>> When one order works better than another you always have to consider
>> that the user could already have written the code in a way that results
>> in the input that isn't well handled.
>> Not that this helps very much for the situation ;)
>> But I don't like the use of first_pass_instance ... and the fix isn't
>> an improvement but just a hack for the benchmark.
>>> I also added comment to my fix and create new test for it. I also
>>> checked that this test is passed with patched compiler only. So
>>> Change Log was also modified:
>>> 2013-02-20 Yuri Rumyantsev <firstname.lastname@example.org>
>>> PR tree-optimization/56175
>>> * tree-ssa-forwprop.c (simplify_bitwise_binary): Avoid type sinking
>>> at 1st forwprop pass to recognize (A & C) ^ (B & C) -> (A ^ B) & C
>>> for short integer types.
>>> * gcc.dg/pr56175.c: New test.
>>> 2013/2/20 Richard Biener <email@example.com>:
>>>> On Wed, Feb 20, 2013 at 1:00 PM, Yuri Rumyantsev <firstname.lastname@example.org> wrote:
>>>>> Hi All,
>>>>> This patch is aimed to recognize (A & C) ^ (B & C) -> (A ^ B) & C
>>>>> pattern in simpify_bitwise_binary for short integer types.
>>>>> The fix is very simple - we simply turn off short type sinking at the
>>>>> first pass of forward propagation allows to get
>>>>> +10% speedup for important benchmark Coremark 1.0 at x86 Atom and
>>>>> +5-7% for other x86 platforms too.
>>>>> Bootstrapping and regression testing were successful on x86-64.
>>>>> Is it Ok for trunk?
>>>> It definitely needs a comment before the checks.
>>>> Also I think it simply shows that the code is placed at the wrong spot.
>>>> Simply moving it down in simplify_bitwise_binary to be done the very last
>>>> should get both of the effects done.
>>>> Can you rework the patch according to that?
>>>> You also miss a testcase, we should make sure to not regress again here.
>>>>> 2013-02-20 Yuri Rumyantsev <email@example.com>
>>>>> PR tree-optimization/56175
>>>>> * tree-ssa-forwprop.c (simplify_bitwise_binary) : Avoid type sinking
>>>>> at 1st forwprop pass to recognize (A & C) ^ (B & C) -> (A ^ B) & C
>>>>> for short integer types.