This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: extend fwprop optimization


On Sun, Mar 17, 2013 at 12:15 AM, Wei Mi <wmi@google.com> wrote:
> Hi,
>
> On Sat, Mar 16, 2013 at 3:48 PM, Steven Bosscher <stevenb.gcc@gmail.com> wrote:
>> On Tue, Mar 12, 2013 at 8:18 AM, Wei Mi wrote:
>>> For the motivational case, I need insn splitting to get the cost
>>> right. insn splitting is not very intrusive. All I need is to call
>>> split_insns func.
>>
>> It may not look very intrusive, but there's a lot happening in the
>> back ground. You're creating a lot of new RTL, and then just throw it
>> away again. You fake the compiler into thinking you're much deeper in
>> the pipeline than you really are. You're assuming there are no
>> side-effects other than that some insn gets split, but there are back
>> ends where splitters may have side-effects.
>
> Ok, then I will remove the split_insns call.
>
>>
>> Even though I've asked twice now, you still have not explained this
>> motivational case, except to say that there is one. *What* are you
>> trying to do, *what* is not happening without the splits, and *what*
>> happens if you split. Only if you explain that in a lot more detail
>> than "I have a motivational case" then we can look into what is a
>> proper solution.
>
> :-). Sorry, I didn't say it clearly. The motivational case is the one
> mentioned in the following posts (split_insns changes a << (b & 63) to
> a << b).
> http://gcc.gnu.org/ml/gcc/2013-01/msg00181.html
> http://gcc.gnu.org/ml/gcc-patches/2013-02/msg01144.html



>
> If I remove the split_insns call and related cost estimation
> adjustment, the fwprop 18-->22 and 18-->23 will punt, because fwprop
> here looks like a reverse process of cse, the total cost after fwprop
> change is increased.
>
> Def insn 18:
>         Use insn 23
>         Use insn 22
>
> If we include the split_insns cost estimation adjustment.
>   extra benefit by removing def insn 18 = 5
>   change[0]: benefit = 0, verified - ok  // The cost of insn 22 will
> not change after fwprop + insn splitting.
>   change[1]: benefit = 0, verified - ok  // The insn 23 is the same with insn 22
> Total benefit is 5, fwprop will go on.
>
> If we remove the split_insns cost estimation adjustment.
>   extra benefit by removing def insn 18 = 5
>   change[0]: benefit = -4, verified - ok   // The costs of insn 22 and
> insn 23 will increase after fwprop.
>   change[1]: benefit = -4, verified - ok   // The insn 23 is the same
> with insn 22
> Total benefit is -3, fwprop will punt.
>
> How about adding the (a << (b&63) ==> a << b) transformation in
> simplify_binary_operation_1, becuase (a << (b&63) ==> a << b) is a
> kind of architecture specific expr simplification? Then fwprop could
> do the propagation as I expect.
>
>>
>> The problem with some of the splitters is that they exist to break up
>> RTL from 'expand' to initially keep some pattern together to allow the
>> code transformation passes to handle the pattern as one instruction.
>> This made sense when RTL was the only intermediate representation and
>> splitting too early would inhibit some optimizations. But I would
>> expect most (if not all) such cases to be less relevant because of the
>> GIMPLE middle-end work. The only splitters you can trigger are the
>> pre-reload splitters (all the reload_completed conditions obviously
>> can't trigger if you're splitting from fwprop). Perhaps those
>> splitters can/should run earlier, or be made obsolete by expanding
>> directly to the post-splitting insns.
>>
>> Unfortunately, it's not possible to tell for your case, because you
>> haven't explained it yet...
>>
>>
>>> So how about keep split_insns and remove peephole in the cost estimation func?
>>
>> I'd strongly oppose this. I do not believe this is necessary, and I
>> think it's conceptually wrong.
>>
>>
>>>> What happens if you propagate into an insn that uses the same register
>>>> twice? Will the DU chains still be valid (I don't think that's
>>>> guaranteed)?
>>>
>>> I think the DU chains still be valid. If propagate into the insn that
>>> uses the same register twice, the two uses will be replaced when the
>>> first use is seen (propagate_rtx_1 will propagate all the occurrances
>>> of the same reg in the use insn).  When the second use is seen, the
>>> df_use and use insn in its insn_info are still available.
>>> forward_propagate_into will early return after check reg_mentioned_p
>>> (DF_REF_REG (use), parent) and find out no reg is used  any more.
>>
>> With reg_mentioned_p you cannot verify that the DF_REF_LOC of USE is
>> still valid.
>
> I think DF_REF_LOC of USE may be invalid if dangling rtx will be
> recycled by garbage collection very soon (I don't know when GC will
> happen). Although DF_REF_LOC of USE maybe invalid, the early return in
> forward_propagate_into ensure it will not cause any correctness
> problem.
>
>>
>> In any case, returning to the RD problem for DU/UD chains is probably
>> a good idea, now that RD is not such a hog anymore. In effect fwprop.c
>> would return to what it looked like before the patch of r149010.
>
> I remove MD problem and use DU/UD instead.
>
>>
>> As a way forward on all of this, I'd suggest the following steps, each
>> with a separate patch:
>
> Thanks for the suggestion!
>
>> 1. replace the MD problem with RD again, and build full DU/UD chains.
>
> I include patch.1 attached.
>
>> 2. post all the recog changes separately, with minimum impact on the
>> parts of the compiler you don't really change. (For apply_change_group
>> you could even choose to overload it, or use a NUM argument with a
>> default value -- not sure if default argument values are OK for GCC
>> tho'.)
>
> patch.2 attached.
>
>> 3. implement propagation into multiple USEs, but without the splitting
>> and peepholing.
>
> patch.3 attached.
>
>> 4. see about fixing the back end to either split earlier or expand to
>> the desired patterns directly.
>
> I havn't included this part. If you agree with the proposal to add the
> transformation (a << (b&63) ==> a << b) in
> simplify_binary_operation_1, I will send out another patch about it

Sounds like you need to look into using SHIFT_COUNT_TRUNCATED and
TARGET_SHIFT_TRUNCATION_MASK which you could use in
simplify_binary_operation_1 without it being target specific as it
uses the target hooks to find out that :).

Thanks,
Andrew

.
>
> Thanks,
> Wei.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]