This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.

Re: [PR72835] Incorrect arithmetic optimization involving bitfield arguments

From: Richard Biener <richard dot guenther at gmail dot com>
To: kugan <kugan dot vivekanandarajah at linaro dot org>
Cc: Jakub Jelinek <jakub at redhat dot com>, "gcc-patches at gcc dot gnu dot org" <gcc-patches at gcc dot gnu dot org>
Date: Mon, 19 Sep 2016 15:40:47 +0200
Subject: Re: [PR72835] Incorrect arithmetic optimization involving bitfield arguments
Authentication-results: sourceware.org; auth=none
References: <0a1eaaf8-3ede-cd56-ffb5-40b25f94e46e@linaro.org> <98613cff-7c48-1a56-0014-6d87c35a8f26@linaro.org> <20160809214617.GB14857@tucnak.redhat.com> <7210cceb-be3b-44b1-13b7-4152e89d2a4f@linaro.org> <20160809215527.GC14857@tucnak.redhat.com> <0c53b0f3-4af6-387c-9350-95b1ae85850d@linaro.org> <20160810085703.GH14857@tucnak.redhat.com> <CAFiYyc0bLsCOTU-OZ4OKKNyrsmpNx63E+jSCGCYiMpY7=-z9nQ@mail.gmail.com> <e331b985-4951-1111-6f99-5af718064c78@linaro.org> <CAFiYyc1=XxNsGGOYs+r5k+8Xpiz8Kefg4==_epPMdK6fJLnshA@mail.gmail.com> <CAELXzTNVFBjDxa5_cyYh1rL+dHjnrrrgKE=W3L2ZPeUXm3NiTg@mail.gmail.com> <CAFiYyc1SbBr59T8Md1pbgyi_30Z_2W4MLdF7NWz2vdPZT45jnA@mail.gmail.com> <0f3b4359-f5ff-d14c-1b15-2ae647e6fd3a@linaro.org>

On Sun, Sep 18, 2016 at 10:21 PM, kugan
<kugan.vivekanandarajah@linaro.org> wrote:
> Hi Richard,
>
>
> On 14/09/16 21:31, Richard Biener wrote:
>>
>> On Fri, Sep 2, 2016 at 10:09 AM, Kugan Vivekanandarajah
>> <kugan.vivekanandarajah@linaro.org> wrote:
>>>
>>> Hi Richard,
>>>
>>> On 25 August 2016 at 22:24, Richard Biener <richard.guenther@gmail.com>
>>> wrote:
>>>>
>>>> On Thu, Aug 11, 2016 at 1:09 AM, kugan
>>>> <kugan.vivekanandarajah@linaro.org> wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>>
>>>>> On 10/08/16 20:28, Richard Biener wrote:
>>>>>>
>>>>>>
>>>>>> On Wed, Aug 10, 2016 at 10:57 AM, Jakub Jelinek <jakub@redhat.com>
>>>>>> wrote:
>>>>>>>
>>>>>>>
>>>>>>> On Wed, Aug 10, 2016 at 08:51:32AM +1000, kugan wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>> I see it now. The problem is we are just looking at (-1) being in
>>>>>>>> the
>>>>>>>> ops
>>>>>>>> list for passing changed to rewrite_expr_tree in the case of
>>>>>>>> multiplication
>>>>>>>> by negate.  If we have combined (-1), as in the testcase, we will
>>>>>>>> not
>>>>>>>> have
>>>>>>>> the (-1) and will pass changed=false to rewrite_expr_tree.
>>>>>>>>
>>>>>>>> We should set changed based on what happens in
>>>>>>>> try_special_add_to_ops.
>>>>>>>> Attached patch does this. Bootstrap and regression testing are
>>>>>>>> ongoing.
>>>>>>>> Is
>>>>>>>> this OK for trunk if there is no regression?
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> I think the bug is elsewhere.  In particular in
>>>>>>> undistribute_ops_list/zero_one_operation/decrement_power.
>>>>>>> All those look problematic in this regard, they change RHS of
>>>>>>> statements
>>>>>>> to something that holds a different value, while keeping the LHS.
>>>>>>> So, generally you should instead just add a new stmt next to the old
>>>>>>> one,
>>>>>>> and adjust data structures (replace the old SSA_NAME in some ->op
>>>>>>> with
>>>>>>> the new one).  decrement_power might be a problem here, dunno if all
>>>>>>> the
>>>>>>> builtins are const in all cases that DSE would kill the old one,
>>>>>>> Richard, any preferences for that?  reset flow sensitive info + reset
>>>>>>> debug
>>>>>>> stmt uses, or something different?  Though, replacing the LHS with a
>>>>>>> new
>>>>>>> anonymous SSA_NAME might be needed too, in case it is before SSA_NAME
>>>>>>> of
>>>>>>> a
>>>>>>> user var that doesn't yet have any debug stmts.
>>>>>>
>>>>>>
>>>>>>
>>>>>> I'd say replacing the LHS is the way to go, with calling the
>>>>>> appropriate
>>>>>> helper
>>>>>> on the old stmt to generate a debug stmt for it / its uses (would need
>>>>>> to look it
>>>>>> up here).
>>>>>>
>>>>>
>>>>> Here is an attempt to fix it. The problem arises when in
>>>>> undistribute_ops_list, we linearize_expr_tree such that NEGATE_EXPR is
>>>>> added as
>>>>> (-1) MULT_EXPR (OP). The real problem starts when we handle this in
>>>>> zero_one_operation. Unlike what was done earlier, we now change the
>>>>> stmt
>>>>> (with propagate_op_to_single_use or directly) such that the value
>>>>> computed by stmt is no longer what it used to be. Because of this, what
>>>>> is
>>>>> computed in undistribute_ops_list and rewrite_expr_tree are also
>>>>> changed.
>>>>>
>>>>> undistribute_ops_list already expects this but rewrite_expr_tree will
>>>>> not if
>>>>> we don't pass the changed as an argument.
>>>>>
>>>>> The way I am fixing this now is, in linearize_expr_tree, I set
>>>>> ops_changed
>>>>> to true if we change NEGATE_EXPR to (-1) MULT_EXPR (OP). Then when we
>>>>> call
>>>>> zero_one_operation with ops_changed = true, I replace all the LHS in
>>>>> zero_one_operation with the new SSA and replace all the uses. I also
>>>>> call
>>>>> the rewrite_expr_tree with changed = false in this case.
>>>>>
>>>>> Does this make sense? Bootstrapped and regression tested for
>>>>> x86_64-linux-gnu without any new regressions.
>>>>
>>>>
>>>> I don't think this solves the issue.  zero_one_operation associates the
>>>> chain starting at the first *def and it will change the intermediate
>>>> values
>>>> of _all_ of the stmts visited until the operation to be removed is
>>>> found.
>>>> Note that this is independent of whether try_special_add_to_ops did
>>>> anything.
>>>>
>>>> Even for the regular undistribution cases we get this wrong.
>>>>
>>>> So we need to back-track in zero_one_operation, replacing each LHS
>>>> and in the end the op in the opvector of the main chain.  That's
>>>> basically
>>>> the same as if we'd do a regular re-assoc operation on the sub-chains.
>>>> Take their subops, simulate zero_one_operation by
>>>> appending the cancelling operation and optimizing the oplist, and then
>>>> materializing the associated ops via rewrite_expr_tree.
>>>>
>>> Here is a draft patch which records the stmt chain when in
>>> zero_one_operation and then fixes it when OP is removed. When we
>>> update *def, that will update the ops vector. Does this look sane?
>>
>>
>> Yes.  A few comments below
>>
>> +  /* PR72835 - Record the stmt chain that has to be updated such that
>> +     we dont use the same LHS when the values computed are different.  */
>> +  auto_vec<gimple *> stmts_to_fix;
>>
>> use auto_vec<gimple *, 64> here so we get only stack allocation most
>> of the time
>
> Done.
>
>>           if (stmt_is_power_of_op (stmt, op))
>>             {
>> +             make_new_ssa_for_all_defs (def, op, stmts_to_fix);
>>               if (decrement_power (stmt) == 1)
>>                 propagate_op_to_single_use (op, stmt, def);
>>
>> for the cases you end up with propagate_op_to_single_use its argument
>> stmt is handled superfluously in the new SSA making, I suggest to pop it
>> from the stmts_to_fix vector in that case.  I suggest to break; instead
>> of return in all cases and do the make_new_ssa_for_all_defs call at
>> the function end instead.
>>
> Done.
>
>> @@ -1253,14 +1305,18 @@ zero_one_operation (tree *def, enum tree_code
>> opcode, tree op)
>>               if (gimple_assign_rhs1 (stmt2) == op)
>>                 {
>>                   tree cst = build_minus_one_cst (TREE_TYPE (op));
>> +                 stmts_to_fix.safe_push (stmt2);
>> +                 make_new_ssa_for_all_defs (def, op, stmts_to_fix);
>>                   propagate_op_to_single_use (cst, stmt2, def);
>>                   return;
>>
>> this safe_push should be unnecessary for the above reason (others are
>> conditionally unnecessary).
>>
> Done.
>
> Bootstrapped and regression tested on x86_64-linux-gnu with no new
> regression. Is this OK?

+static void
+make_new_ssa_for_all_defs (tree *def, tree op,
+               auto_vec<gimple *, 64> &stmts_to_fix)

I think you need to use vec<gimple *> &stmts_to_fix here AFAIK.

Ok with that change.
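
To illustrate the parameter-type point: in GCC's vec.h, auto_vec<T, N> derives from the heap vec<T>, so a helper that takes vec<gimple *> & accepts an auto_vec with any inline capacity, whereas a parameter typed auto_vec<gimple *, 64> & ties every caller to N == 64. The following is only a minimal standalone sketch of that relationship. vec_sketch, auto_vec_sketch, stmt_sketch, drain_stmts and demo are hypothetical stand-ins, not GCC's real classes or functions.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for GCC's vec<T>: the capacity-agnostic interface.
template <typename T>
class vec_sketch {
public:
  void safe_push(const T &x) { items_.push_back(x); }
  T pop() {
    T x = items_.back();
    items_.pop_back();
    return x;
  }
  bool is_empty() const { return items_.empty(); }

protected:
  std::vector<T> items_;
};

// Hypothetical stand-in for auto_vec<T, N>. Here N is only a reserve hint;
// the real class uses it for inline (stack) storage.
template <typename T, std::size_t N = 0>
class auto_vec_sketch : public vec_sketch<T> {
public:
  auto_vec_sketch() { this->items_.reserve(N); }
};

// Hypothetical stand-in for a gimple statement.
struct stmt_sketch {
  int id;
};

// Taking the base-class reference lets callers pick any N.
static std::size_t drain_stmts(vec_sketch<stmt_sketch *> &stmts_to_fix) {
  std::size_t visited = 0;
  while (!stmts_to_fix.is_empty()) {
    stmts_to_fix.pop();
    ++visited;
  }
  return visited;
}

// Calls the helper with two different inline capacities.
std::size_t demo() {
  stmt_sketch a{1}, b{2};
  auto_vec_sketch<stmt_sketch *, 64> big;  // like auto_vec<gimple *, 64>
  auto_vec_sketch<stmt_sketch *> small;    // like auto_vec<gimple *>
  big.safe_push(&a);
  big.safe_push(&b);
  small.safe_push(&a);
  return drain_stmts(big) + drain_stmts(small);
}
```

Both calls in demo compile against the same helper signature, which is the reason to prefer the base-class reference for make_new_ssa_for_all_defs.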

Richard.

> Thanks,
> Kugan
>
>
>> I thought about simplifying the whole thing by instead of clearing an
>> op from the chain pre-pend
>> one that does the job by means of visiting the chain from reassoc
>> itself but that doesn't work out
>> for RDIV_EXPR nor does it play well with undistribute handling
>> multiple opportunities on the same
>> chain.
>>
>> Thanks,
>> Richard.
>>
>>
>>>
>>> Bootstrapped and regression tested on x86_64-linux-gnu with no new
>>> regressions.
>>>
>>> Thanks,
>>> Kugan

Follow-Ups:
- Re: [PR72835] Incorrect arithmetic optimization involving bitfield arguments
  - From: kugan

References:
- Re: [PR72835] Incorrect arithmetic optimization involving bitfield arguments
  - From: Kugan Vivekanandarajah
- Re: [PR72835] Incorrect arithmetic optimization involving bitfield arguments
  - From: Richard Biener
- Re: [PR72835] Incorrect arithmetic optimization involving bitfield arguments
  - From: kugan
