This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



[RFC][PATCH 0/5] Loop unrolling and memory load streams


While loop unrolling helps to keep the pipeline busy on modern
processors, it can also increase the number of memory streams, which
can cause collisions in the hardware prefetcher and hurt performance.
This patch series tries to detect that case and limit loop unrolling.

Patch 1: Add separate params for the RTL unroller.

Patch 2: Add the number of available hardware prefetchers to
cpu_prefetch_tune so that it can be used in loop unrolling decisions.

Patch 3: Prevent the tree unroller from completely unrolling inner loops
if that results in excessive strided loads in the outer loop.

Patch 4: Change iv_analyze_result to take const_rtx. This is only needed
to make the next patch compile; there are no functional changes.

Patch 5: Add aarch64_loop_unroll_adjust to limit partial unrolling in RTL
based on the strided loads in the loop.
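
To illustrate the kind of loop nest Patch 3 is concerned with, here is a
minimal sketch in C. It is not taken from the patches: the function names,
ROWS and the array layout are made up for illustration. It only shows how
completely unrolling a small inner loop turns one load into several loads
from different base addresses in the outer loop, each of which the hardware
prefetcher may have to track as its own stream.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical inner trip count, small enough to be unrolled completely.  */
#define ROWS 4

/* Rolled form: the loop nest contains a single load instruction.  */
long
sum_rolled (const int *a, size_t n)
{
  long sum = 0;
  for (size_t i = 0; i < n; i++)
    for (size_t j = 0; j < ROWS; j++)
      sum += a[j * n + i];
  return sum;
}

/* Inner loop unrolled completely by hand: the outer loop body now holds
   ROWS separate loads, each advancing from a different base address.  */
long
sum_unrolled (const int *a, size_t n)
{
  long sum = 0;
  for (size_t i = 0; i < n; i++)
    {
      sum += a[0 * n + i];
      sum += a[1 * n + i];
      sum += a[2 * n + i];
      sum += a[3 * n + i];
    }
  return sum;
}

/* Both forms compute the same value; only the number of loads in the
   outer loop body differs.  */
void
check_equivalence (void)
{
  int a[ROWS * 8];
  for (size_t k = 0; k < ROWS * 8; k++)
    a[k] = (int) k;
  assert (sum_rolled (a, 8) == sum_unrolled (a, 8));
}
```

With a larger ROWS the unrolled body would contain correspondingly more
loads; whether that counts as excessive presumably depends on how many
streams the target's prefetcher can follow, which is the information
Patch 2 adds to cpu_prefetch_tune.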

Bootstrapped and tested on aarch64-linux-gnu (with
-funroll-all-loops). Testing on x86_64-linux-gnu is ongoing.

Thanks,
Kugan

