This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: [PATCH AARCH64]load store pair optimization using sched_fusion pass.


Ping.  Could anybody have a look?

Thanks,
bin

On Tue, Nov 18, 2014 at 4:34 PM, Bin Cheng <bin.cheng@arm.com> wrote:
> Hi,
> This patch implements the ldp/stp optimization for aarch64.  It
> consists of two parts.  The first is the peephole part, which
> includes the ldp/stp patterns (both peephole patterns and insn match
> patterns) and auxiliary functions (for checking validity and for merging).
> The second part implements the aarch64 backend hook for the sched-fusion
> pass, which calculates appropriate priorities for different kinds of
> load/store instructions.  With these priorities, the sched-fusion pass can
> schedule as many load/store instructions together as possible, so that the
> following peephole2 pass can merge them.
>
> I collected data for miscellaneous benchmarks.  Some cases improve;
> most of the rest do not regress; only a couple regress slightly, by 2-3%.
> After looking into the regressions I can confirm that the code
> transformation is generally good, with many loads/stores paired.  These
> regressions are most probably false alarms caused by other issues.
>
> The conclusion is that this patch can pair many consecutive load/store
> instructions into ldp/stp.  This is supported by the code size
> improvement in benchmarks.  E.g., overall it reduces the text size of
> spec2k6 binaries (O3 level, not statically linked in my build) by 1.68%.
>
> Bootstrapped and tested on aarch64.  Is it OK?
>
> 2014-11-18  Bin Cheng  <bin.cheng@arm.com>
>
>         * config/aarch64/aarch64.md (load_pair<mode>): Split to
>         load_pairsi, load_pairdi, load_pairsf and load_pairdf.
>         (load_pairsi, load_pairdi, load_pairsf, load_pairdf): Split
>         from load_pair<mode>.  New alternative to support int/fp
>         registers in fp/int mode patterns.
>         (store_pair<mode>): Split to store_pairsi, store_pairdi,
>         store_pairsf and store_pairdf.
>         (store_pairsi, store_pairdi, store_pairsf, store_pairdf): Split
>         from store_pair<mode>.  New alternative to support int/fp
>         registers in fp/int mode patterns.
>         (*load_pair_extendsidi2_aarch64): New pattern.
>         (*load_pair_zero_extendsidi2_aarch64): New pattern.
>         (aarch64-ldpstp.md): Include.
>         * config/aarch64/aarch64-ldpstp.md: New file.
>         * config/aarch64/aarch64-protos.h (aarch64_gen_adjusted_ldpstp):
>         New.
>         (extract_base_offset_in_addr): New.
>         (aarch64_operands_ok_for_ldpstp): New.
>         (aarch64_operands_adjust_ok_for_ldpstp): New.
>         * config/aarch64/aarch64.c (enum sched_fusion_type): New enum.
>         (TARGET_SCHED_FUSION_PRIORITY): New hook.
>         (fusion_load_store): New function.
>         (extract_base_offset_in_addr): New function.
>         (aarch64_gen_adjusted_ldpstp): New function.
>         (aarch64_sched_fusion_priority): New function.
>         (aarch64_operands_ok_for_ldpstp): New function.
>         (aarch64_operands_adjust_ok_for_ldpstp): New function.
>
> 2014-11-18  Bin Cheng  <bin.cheng@arm.com>
>
>         * gcc.target/aarch64/ldp-stp-1.c: New test.
>         * gcc.target/aarch64/ldp-stp-2.c: New test.
>         * gcc.target/aarch64/ldp-stp-3.c: New test.
>         * gcc.target/aarch64/ldp-stp-4.c: New test.
>         * gcc.target/aarch64/ldp-stp-5.c: New test.
>         * gcc.target/aarch64/lr_free_1.c: Disable scheduling fusion
>         and peephole2 pass.
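
For readers unfamiliar with the transformation described above, here is a
minimal sketch of the kind of source that produces pairable accesses.  It is
not taken from the patch or from the new ldp-stp-*.c tests; the function names
sum_pair and store_pair are hypothetical and chosen only for illustration.
Each function touches two adjacent 64-bit slots off the same base pointer,
which is the shape an ldp/stp instruction can cover.

```c
#include <stdint.h>

/* Hypothetical illustration, not part of the patch.
   Two loads from adjacent addresses off the same base register:
   a candidate for a single ldp on aarch64.  */
int64_t
sum_pair (const int64_t *p)
{
  return p[0] + p[1];
}

/* Two stores to adjacent addresses off the same base register:
   a candidate for a single stp on aarch64.  */
void
store_pair (int64_t *p, int64_t a, int64_t b)
{
  p[0] = a;
  p[1] = b;
}
```

Whether a given compiler actually emits ldp/stp here depends on the target,
the optimization level and the passes enabled, so the generated assembly
should be inspected rather than assumed.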

