This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH/AARCH64] Enable software prefetching (-fprefetch-loop-arrays) for ThunderX 88xxx



On 26/01/17 20:56, Andrew Pinski wrote:
Hi,
   This patch enables -fprefetch-loop-arrays for -mcpu=thunderxt88 and
-mcpu=thunderxt88p1.  I filled out the tuning structures for both
thunderx and thunderx2t99.  No other core current enables software
prefetching so I set them to 0 which does not change the default
parameters.

OK?  Bootstrapped and tested on both ThunderX2 CN99xx and ThunderX
CN88xx with no regressions.  I got a 2x improvement for 462.libquantum
on CN88xx, overall a 10% improvement on SPEC INT on CN88xx at -Ofast.
CN99xx's SPEC did not change.

Thanks,
Andrew Pinski

ChangeLog:
* config/aarch64/aarch64-protos.h (struct tune_params): Add
prefetch_latency, simultaneous_prefetches, l1_cache_size, and
l2_cache_size fields.
(enum aarch64_autoprefetch_model): Add AUTOPREFETCHER_SW.
* config/aarch64/aarch64.c (generic_tunings): Update to include
prefetch_latency, simultaneous_prefetches, l1_cache_size, and
l2_cache_size fields to 0.
(cortexa35_tunings): Likewise.
(cortexa53_tunings): Likewise.
(cortexa57_tunings): Likewise.
(cortexa72_tunings): Likewise.
(cortexa73_tunings): Likewise.
(exynosm1_tunings): Likewise.
(thunderx_tunings): Fill out some of the new fields.
(thunderxt88_tunings): New variable.
(xgene1_tunings): Update to include prefetch_latency,
simultaneous_prefetches, l1_cache_size, and l2_cache_size fields to 0.
(qdf24xx_tunings): Likewise.
(thunderx2t99_tunings): Fill out some of the new fields.
(aarch64_override_options_internal): Consider AUTOPREFETCHER_SW like
AUTOPREFETCHER_OFF.
Set param values if the fields are non-zero.  Turn on
prefetch-loop-arrays if AUTOPREFETCHER_SW and optimize level is at
least 3 or profile feed usage is enabled.
* config/aarch64/aarch64-cores.def (thunderxt88p1): Use thunderxt88 tuning.
(thunderxt88): Likewise.

--- config/aarch64/aarch64-protos.h	(revision 244917)
+++ config/aarch64/aarch64-protos.h	(working copy)
@@ -220,10 +220,19 @@ struct tune_params
   unsigned int max_case_values;
   /* Value for PARAM_L1_CACHE_LINE_SIZE; or 0 to use the default.  */
   unsigned int cache_line_size;
+  /* Value for PARAM_PREFETCH_LATENCY; or 0 to use the default.  */
+  unsigned int prefetch_latency;
+  /* Value for PARAM_SIMULTANEOUS_PREFETCHES; or 0 to use the default.  */
+  unsigned int simultaneous_prefetches;
+  /* Value for PARAM_L1_CACHE_SIZE; or 0 to use the default.  */
+  unsigned int l1_cache_size;
+  /* Value for PARAM_L2_CACHE_SIZE; or 0 to use the default.  */
+  unsigned int l2_cache_size;

Not a blocker to the patch but I wonder whether it would be a good idea to group these prefetch-related parameters
(plus autoprefetcher_model) into a new nested struct here (prefetch_tunings or something) since there's a decent
number of them and they're all related.

Thanks,
Kyrill




Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]