This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: [GCC RFC]A new and simple pass merging paired load store instructions


On Thu, May 15, 2014 at 6:31 PM, Oleg Endo <oleg.endo@t-online.de> wrote:
> Hi,
>
> On 15 May 2014, at 09:26, "bin.cheng" <bin.cheng@arm.com> wrote:
>
>> Hi,
>> Targets like ARM and AArch64 support double-word load/store instructions,
>> and these instructions are generally faster than the corresponding two
>> single loads/stores.  GCC currently uses peephole2 to merge paired
>> loads/stores into a single instruction, which has a disadvantage: it can
>> only handle simple cases where the two instructions appear consecutively in
>> the instruction stream, and it is too weak to handle cases in which other,
>> unrelated instructions sit between the two loads/stores.
>>
>> This patch adds a new GCC pass that looks through each basic block and
>> merges paired loads/stores even if they are not adjacent to each other.  The
>> algorithm is pretty simple:
>> 1) In an initialization pass over the instruction stream, it collects
>> relevant memory access information for each instruction.
>> 2) It iterates over each basic block and tries to find a possible partner
>> instruction for each memory access instruction.  While doing so, it checks
>> dependencies between the two candidate instructions and also records the
>> information indicating how to pair them.  To avoid quadratic behavior of
>> the algorithm, it introduces a new parameter,
>> max-merge-paired-loadstore-distance, and sets the default value to 4, which
>> is large enough to catch the major part of the opportunities on
>> ARM/cortex-a15.
>> 3) For each candidate pair, it calls the back-end's hook to do a
>> target-dependent check and merges the two instructions if possible.
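
The windowed search in step 2 could be sketched roughly as follows.  This is a toy model, not the code in the patch: the Insn struct, touches, can_pair and find_pairs are all hypothetical names, real GCC passes work on RTL insns, and the dependency check here is deliberately more conservative than a real one would be (any memory access between the two candidates blocks the pair).

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Toy instruction model (hypothetical; GCC itself works on RTL insns).
enum class Kind { Load, Store, Other };

struct Insn {
  Kind kind;
  int base;    // Load/Store: base address register; Other: -1
  int offset;  // Load/Store: byte offset from the base register
  int data;    // Load: register written; Store: register read; Other: register written
  int use;     // Other: register read (-1 if none); -1 for Load/Store
};

// True if `insn` reads or writes `reg` in any role.
static bool touches(const Insn &insn, int reg) {
  return insn.base == reg || insn.data == reg || insn.use == reg;
}

// Can bb[j] be moved up next to bb[i] and form a pair with it?
static bool can_pair(const std::vector<Insn> &bb, std::size_t i, std::size_t j,
                     int word_size) {
  const Insn &a = bb[i];
  const Insn &b = bb[j];
  if (b.kind != a.kind || b.base != a.base) return false;
  const int delta = b.offset - a.offset;
  if (delta != word_size && delta != -word_size) return false;
  // For loads, require base and both destinations to be distinct registers.
  if (a.kind == Kind::Load &&
      (a.data == a.base || b.data == a.base || a.data == b.data))
    return false;
  for (std::size_t k = i + 1; k < j; ++k) {
    const Insn &mid = bb[k];
    if (mid.kind != Kind::Other) return false;  // may alias: give up
    if (touches(mid, a.base) || touches(mid, b.data)) return false;
  }
  return true;
}

// Returns index pairs (first, second) of accesses that look mergeable.
// Only the next `max_distance` instructions are inspected for each access,
// which keeps the search linear in the size of the basic block.
std::vector<std::pair<std::size_t, std::size_t>>
find_pairs(const std::vector<Insn> &bb, std::size_t max_distance,
           int word_size = 4) {
  std::vector<std::pair<std::size_t, std::size_t>> pairs;
  std::vector<bool> taken(bb.size(), false);
  for (std::size_t i = 0; i < bb.size(); ++i) {
    if (taken[i] || bb[i].kind == Kind::Other) continue;
    assert(bb[i].base >= 0 && bb[i].data >= 0);
    for (std::size_t j = i + 1; j < bb.size() && j - i <= max_distance; ++j) {
      if (!taken[j] && can_pair(bb, i, j, word_size)) {
        pairs.emplace_back(i, j);
        taken[i] = true;
        taken[j] = true;
        break;
      }
    }
  }
  return pairs;
}
```

In this sketch the target-dependent check of step 3 is left out entirely; each returned pair would still have to be handed to the back-end hook before anything is merged.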
>>
>> Even with the parameter set to 4, on miscellaneous benchmarks this pass
>> merges numerous opportunities beyond those already merged by peephole2
>> (roughly the same number of opportunities as the peepholed ones).  A GCC
>> bootstrap confirms this finding.
>
> This is interesting.  E.g. on SH there are insns to load/store SFmode pairs.  However, these insns require a mode switch and have some constraints on register usage.  So in the SH case the load/store pairing would need to be done before reg alloc and before mode switching.
>
>>
>> Yet there is an open issue about when we should run this new pass.  Though
>> register renaming is disabled by default now, I put this pass after it,
>> because renaming can resolve some false dependencies and thus benefit this
>> pass.  Another finding is that it can capture a lot more opportunities if it
>> runs after sched2, but I am not sure whether it will mess up the scheduling
>> results that way.
>
> How about the following.
> Instead of adding new hooks and inserting the pass to the general pass list, make the new
> pass class take the necessary callback functions directly.  Then targets can just instantiate
> the pass, passing their impl of the callbacks, and insert the pass object into the pass list at
> a place that fits best for the target.
Oh, I didn't know we could do this in GCC.  But yes, a target may want to
run it at whatever place fits best for that target.
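
For illustration only, the shape of that suggestion might look something like the sketch below.  It does not use GCC's pass manager at all: Access, PairMergePass and insert_after are hypothetical names, the pass list is modelled as a plain vector of strings, and the pass names in the usage note are just placeholder strings.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a memory access: base register and byte offset.
struct Access {
  int base;
  int offset;
};

// A generic pass object that gets its target-specific check as a callback
// at construction time instead of through a global target hook.
class PairMergePass {
public:
  using CanMerge = std::function<bool(const Access &, const Access &)>;

  PairMergePass(std::string name, CanMerge can_merge)
      : name_(std::move(name)), can_merge_(std::move(can_merge)) {
    assert(can_merge_);  // a target must supply its callback
  }

  const std::string &name() const { return name_; }

  // Counts how many adjacent accesses the target callback accepts as a pair.
  int run(const std::vector<Access> &accesses) const {
    int merged = 0;
    for (std::size_t i = 0; i + 1 < accesses.size(); ++i) {
      if (can_merge_(accesses[i], accesses[i + 1])) {
        ++merged;
        ++i;  // both accesses are consumed by the pair
      }
    }
    return merged;
  }

private:
  std::string name_;
  CanMerge can_merge_;
};

// Inserts `name` right after `reference` in a toy pass list.
// Returns false if `reference` is not in the list.
bool insert_after(std::vector<std::string> &pipeline,
                  const std::string &reference, const std::string &name) {
  auto it = std::find(pipeline.begin(), pipeline.end(), reference);
  if (it == pipeline.end()) return false;
  pipeline.insert(it + 1, name);
  return true;
}
```

A target would then construct one PairMergePass with its own callback (say, a lambda checking for the same base and offsets 4 bytes apart) and call insert_after with whichever reference pass suits it, so each target can place the pass differently.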

Thanks,
bin
>
>
>>
>> So, any comments about this?
>>
>> Thanks,
>> bin
>>
>>
>> 2014-05-15  Bin Cheng  <bin.cheng@arm.com>
>>    * common.opt (flag_merge_paired_loadstore): New option.
>>    * merge-paired-loadstore.c: New file.
>>    * Makefile.in: Support new file.
>>    * config/arm/arm.c (TARGET_MERGE_PAIRED_LOADSTORE): New macro.
>>    (load_latency_expanded_p, arm_merge_paired_loadstore): New function.
>>    * params.def (PARAM_MAX_MERGE_PAIRED_LOADSTORE_DISTANCE): New param.
>>    * doc/invoke.texi (-fmerge-paired-loadstore): New.
>>    (max-merge-paired-loadstore-distance): New.
>>    * doc/tm.texi.in (TARGET_MERGE_PAIRED_LOADSTORE): New.
>>    * doc/tm.texi: Regenerated.
>>    * target.def (merge_paired_loadstore): New.
>>    * tree-pass.h (make_pass_merge_paired_loadstore): New decl.
>>    * passes.def (pass_merge_paired_loadstore): New pass.
>>    * timevar.def (TV_MERGE_PAIRED_LOADSTORE): New time var.
>>
>> gcc/testsuite/ChangeLog
>> 2014-05-15  Bin Cheng  <bin.cheng@arm.com>
>>
>>    * gcc.target/arm/merge-paired-loadstore.c: New test.
>>
>> <merge-paired-loadstore-20140515.txt>



-- 
Best Regards.

