This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: Ping^2: [PATCH]Remove duplicate check on BRANCH_COST in fold-const.c
- From: Richard Guenther <richard dot guenther at gmail dot com>
- To: Bin Cheng <bin dot cheng at arm dot com>
- Cc: "Bin.Cheng" <amker dot cheng at gmail dot com>, gcc-patches at gcc dot gnu dot org, Richard Earnshaw <Richard dot Earnshaw at arm dot com>
- Date: Tue, 18 Sep 2012 11:50:10 +0200
- Subject: Re: Ping^2: [PATCH]Remove duplicate check on BRANCH_COST in fold-const.c
- References: <50111a4e.c1d3440a.06c3.ffffd750SMTPIN_ADDED@mx.google.com> <CA+=Sn1mJBS4as8zqqo+uKSFVYH-P6vYK2e+BiOJ1NTm2v3TFiw@mail.gmail.com> <501143BD.6070508@arm.com> <5045d11d.85d2d80a.7762.ffffe452SMTPIN_ADDED@mx.google.com> <CAFiYyc33zkuabVRq1wJ4DjpJkY+bXGzxA5dq43iyJyDqzXSzrw@mail.gmail.com> <5045DD8E.50803@arm.com> <CAHFci29uvxehoLBeX_ksucMmLMXP30KJn_fPtrOutSjr=exetg@mail.gmail.com> <CAHFci28jGG_3QTggGzj-1+C+uacQeCsjzP2VcJ_4kn2zO4eB4A@mail.gmail.com> <505840b1.e785d80a.2337.ffffbe5bSMTPIN_ADDED@mx.google.com>
On Tue, Sep 18, 2012 at 11:32 AM, Bin Cheng <bin.cheng@arm.com> wrote:
> Ping.
I already approved your original patch upthread.
Richard.
>> -----Original Message-----
>> From: gcc-patches-owner@gcc.gnu.org [mailto:gcc-patches-owner@gcc.gnu.org]
> On
>> Behalf Of Bin.Cheng
>> Sent: Tuesday, September 04, 2012 11:20 PM
>> To: Richard Guenther; gcc-patches@gcc.gnu.org
>> Cc: Richard Earnshaw
>> Subject: Re: Ping^2: [PATCH]Remove duplicate check on BRANCH_COST in fold-
>> const.c
>>
>> Sorry, I mis-sent this offline.
>>
>> On Tue, Sep 4, 2012 at 11:00 PM, Bin.Cheng <amker.cheng@gmail.com> wrote:
>> >>>
>> >>> It's not ok (I btw fail to see the patch in this thread). The
>> >>> current way LOGICAL_OP_NON_SHORT_CIRCUIT is implemented/used should
>> >>> instead be changed to always match the pattern
>> >>>
>> >>> LOGICAL_OP_NON_SHORT_CIRCUIT
>> >>> && (BRANCH_COST (optimize_function_for_speed_p (cfun),
>> >>> false) >= 2)
>> >>>
>> >>> and the default value of LOGICAL_OP_NON_SHORT_CIRCUIT should be 1,
>> >>> defined in defaults.h (and the docs updated).
>> >>>
>> >>
>> >> That's not going to work for modern ARM cores. We want to set
>> >> BRANCH_COST to 1 but still have it generate the non-short-circuit
>> >> code (because conditional compares are really cheap.
>> >>
>> >
>> > Hi Richard,
>> > For now, LOGICAL_OP_NON_SHORT_CIRCUIT macro is defined as below, which
>> > is duplicate of the BRANCH_COST condition.
>> >
>> > #ifndef LOGICAL_OP_NON_SHORT_CIRCUIT
>> > #define LOGICAL_OP_NON_SHORT_CIRCUIT \
>> > (BRANCH_COST (optimize_function_for_speed_p (cfun), \
>> > false) >= 2)
>> > #endif
>> >
>> > Recently we measured performance on some ARM processors and found it
>> > would be better to have non-short-circuit optimization while setting
>> > BRANCH_COST to 1, which is impossible with present codes. So here
>> > comes this patch as below:
>> >
>> > Index: gcc/fold-const.c
>> > ===================================================================
>> > --- gcc/fold-const.c (revision 189835)
>> > +++ gcc/fold-const.c (working copy)
>> > @@ -8443,9 +8443,7 @@
>> > if ((tem = fold_truth_andor_1 (loc, code, type, arg0, arg1)) != 0)
>> > return tem;
>> >
>> > - if ((BRANCH_COST (optimize_function_for_speed_p (cfun),
>> > - false) >= 2)
>> > - && LOGICAL_OP_NON_SHORT_CIRCUIT
>> > + if (LOGICAL_OP_NON_SHORT_CIRCUIT
>> > && (code == TRUTH_AND_EXPR
>> > || code == TRUTH_ANDIF_EXPR
>> > || code == TRUTH_OR_EXPR
>> >
>> > The purpose is to remove the duplicate check on BRANCH_COST.
>> >
>> > As Andrew pointed out that the patch may change behavior if some
>> > back-ends define the macro independent of BRANCH_COST. After looking
>> > into the code, there are two uses of the macro in fold-const.c, each
>> > controls one kind code transformation. The first use is:
>> >
>> > else if (LOGICAL_OP_NON_SHORT_CIRCUIT
>> > && lhs != 0 && rhs != 0
>> > && (code == TRUTH_ANDIF_EXPR
>> > || code == TRUTH_ORIF_EXPR)
>> > && operand_equal_p (lhs, rhs, 0))
>> >
>> > The second one is:
>> >
>> > if ((BRANCH_COST (optimize_function_for_speed_p (cfun),
>> > false) >= 2)
>> > && LOGICAL_OP_NON_SHORT_CIRCUIT
>> > && (code == TRUTH_AND_EXPR
>> > || code == TRUTH_ANDIF_EXPR
>> > || code == TRUTH_OR_EXPR
>> > || code == TRUTH_ORIF_EXPR))
>> >
>> > I am not sure why the 2nd condition is designed in current way and
>> > haven't found any useful changelog on it.
>> > But considering back end can factor BRANCH_COST in
>> > LOGICAL_OP_NON_SHORT_CIRCUIT or not, we can conclude that the behavior
>> > will only be changed if some back-end want to control the two
>> > transformations differently. So the problem becomes whether the 2nd
>> > condition should be changed. Either way there is scenario cannot be
>> > covered.
>> >
>>
>> And for now,
>> FTR, only two targets redefine L_O_N_S_C: mips and rs6000. Both set it to
>> zero so won't be affected by this change.
>>
>
> Hi Richard,
> I have tried to explain the change, but I am not sure whether it is agreed
> or...
>
> Thanks very much.
>
>
>