This is the mail archive of the
gcc@gcc.gnu.org
mailing list for the GCC project.
[patch] Improve loop array prefetch for IA-64
- From: Canqun Yang <canqun at yahoo dot com dot cn>
- To: gcc at gcc dot gnu dot org, gcc-patches at gcc dot gnu dot org
- Date: Fri, 2 Jun 2006 17:13:54 +0800 (CST)
- Subject: [patch] Improve loop array prefetch for IA-64
Hi, all
This patch improves performance by 4% on SPECfp2000 and by 13% on the NAS benchmark suite
on an Itanium-2 system. Further gains should be possible by tuning the parameters further
and by improving the prefetch algorithm at the tree level.
Details of NAS benchmarks are listed below.
GCC options: -O3 -fprefetch-loop-arrays
Target: Itanium-2 1.6GHz; L2 Cache 256K, L3 Cache 6M
Execution times in seconds

           -this patch   +this patch
bt.W          14.43         14.17
cg.A          13.76          6.86
ep.W           7.83          7.79
ft.A          18.73         20.15
is.B          11.85         10.94
lu.W          20.55         20.27
mg.A          15.09         11.86
sp.W          37.11         35.49
geomean       15.84         13.94
speedup                     13.68%

2006-06-02  Canqun Yang  <canqun@nudt.edu.cn>

	* config/ia64/ia64.h (SIMULTANEOUS_PREFETCHES): Define to 18.
	(PREFETCH_BLOCK): Define to 128.
	(PREFETCH_LATENCY): Define to 400.

Index: ia64.h
===================================================================
--- ia64.h (revision 114307)
+++ ia64.h (working copy)
@@ -1985,13 +1985,18 @@
??? This number is bogus and needs to be replaced before the value is
actually used in optimizations. */
-#define SIMULTANEOUS_PREFETCHES 6
+#define SIMULTANEOUS_PREFETCHES 18
/* If this architecture supports prefetch, define this to be the size of
the cache line that is prefetched. */
-#define PREFETCH_BLOCK 32
+#define PREFETCH_BLOCK 128
+/* A number that should roughly correspond to the number of instructions
+   executed before the prefetch is completed.  */
+
+#define PREFETCH_LATENCY 400
+
#define HANDLE_SYSV_PRAGMA 1
/* A C expression for the maximum number of instructions to execute via
Canqun Yang