PATCH: Turn on x86_rep_movl_optimal for m_GENERIC64
H. J. Lu
hjl@lucon.org
Fri Nov 17 16:10:00 GMT 2006
On Fri, Nov 17, 2006 at 07:19:51AM -0800, H. J. Lu wrote:
>
> I found the following on Core 2 Duo:
>
> 1. The optimized memory functions don't help SPEC CPU 2K FP much.
> 2. The optimized memory functions help SPEC CPU 2K INT:
> -O2 + optimized memory vs -O2
> 164.gzip 1.44928%
> 175.vpr -0.522952%
> 176.gcc 23.6236%
> 181.mcf -1.30276%
> 186.crafty -0.576258%
> 197.parser 0.64%
> 252.eon 0.480769%
> 253.perlbmk -1.60281%
> 254.gap -0.406504%
> 255.vortex 10.6028%
> 256.bzip2 0.0993542%
> 300.twolf -0.0783392%
> Est. SPECint_base2000 2.457%
>
> 3. rep_movl_optimal + optimized memory functions don't help SPEC CPU
> 2K FP much.
> 4. rep_movl_optimal + optimized memory functions is a mixed bag on
> SPEC CPU 2K INT:
>
> -O2 + optimized memory + rep_movl_optimal vs -O2 + optimized memory
> 164.gzip 0%
> 175.vpr -0.233645%
> 176.gcc -2.18623%
> 181.mcf 1.10876%
> 186.crafty 0.772798%
> 197.parser -0.317965%
> 252.eon -0.438596%
> 253.perlbmk 1.58919%
> 254.gap 0.272109%
> 255.vortex 0.231929%
> 256.bzip2 0%
> 300.twolf 0.117601%
> Est. SPECint_base2000 0.0959233%
>
> Given that rep_movl_optimal improves 176.gcc significantly with the
> old memory functions and has no significant negative impact
> with optimized memory functions, I think we should turn it on for
> m_GENERIC64. We can always fine-tune rep_movl_optimal later.
>
>
Here is the patch.
H.J.
----
2006-11-17 H.J. Lu <hongjiu.lu@intel.com>
* config/i386/i386.c (x86_rep_movl_optimal): Turn it on for
m_GENERIC64.
--- gcc/config/i386/i386.c.movl 2006-11-17 07:44:54.000000000 -0800
+++ gcc/config/i386/i386.c 2006-11-17 07:46:39.000000000 -0800
@@ -870,7 +870,7 @@ const int x86_sse_split_regs = m_ATHLON_
const int x86_sse_typeless_stores = m_ATHLON_K8;
const int x86_sse_load0_by_pxor = m_PPRO | m_PENT4 | m_NOCONA;
const int x86_use_ffreep = m_ATHLON_K8;
-const int x86_rep_movl_optimal = m_386 | m_PENT | m_PPRO | m_K6_GEODE;
+const int x86_rep_movl_optimal = m_386 | m_PENT | m_PPRO | m_K6_GEODE | m_GENERIC64;
const int x86_use_incdec = ~(m_PENT4 | m_NOCONA | m_GENERIC);
/* ??? Allowing interunit moves makes it all too easy for the compiler to put
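
For readers unfamiliar with these tuning masks, the following is a minimal
sketch of how a flag such as x86_rep_movl_optimal is typically consulted: each
processor gets one bit, and a feature is enabled when the bit for the current
tuning target is set in the feature's mask. The enum values, their order, the
REP_MOVL_OPTIMAL_OLD/NEW macros, and the tune_enabled helper are all
hypothetical and only illustrate the idea; they are not copied from i386.c or
i386.h.

```c
#include <stdio.h>

/* Hypothetical processor indices. The real enum in GCC has more
   entries and a different order; only the bit-per-processor idea
   matters here.  */
enum processor_type
{
  PROCESSOR_I386,
  PROCESSOR_PENTIUM,
  PROCESSOR_PENTIUMPRO,
  PROCESSOR_K6,
  PROCESSOR_GEODE,
  PROCESSOR_NOCONA,
  PROCESSOR_GENERIC64
};

/* One bit per processor, named after the masks used in the patch.  */
#define m_386       (1 << PROCESSOR_I386)
#define m_PENT      (1 << PROCESSOR_PENTIUM)
#define m_PPRO      (1 << PROCESSOR_PENTIUMPRO)
#define m_K6        (1 << PROCESSOR_K6)
#define m_GEODE     (1 << PROCESSOR_GEODE)
#define m_K6_GEODE  (m_K6 | m_GEODE)   /* assumed to combine both bits */
#define m_GENERIC64 (1 << PROCESSOR_GENERIC64)

/* The mask before and after the patch (illustrative names).  */
#define REP_MOVL_OPTIMAL_OLD (m_386 | m_PENT | m_PPRO | m_K6_GEODE)
#define REP_MOVL_OPTIMAL_NEW (REP_MOVL_OPTIMAL_OLD | m_GENERIC64)

/* Return nonzero if the feature described by MASK is enabled when
   tuning for processor TUNE.  */
static int
tune_enabled (int mask, enum processor_type tune)
{
  return (mask & (1 << tune)) != 0;
}

/* Print whether the feature is on for TUNE under both masks.  */
static void
report (const char *name, enum processor_type tune)
{
  printf ("%s: old=%d new=%d\n", name,
          tune_enabled (REP_MOVL_OPTIMAL_OLD, tune),
          tune_enabled (REP_MOVL_OPTIMAL_NEW, tune));
}
```

Calling report ("generic64", PROCESSOR_GENERIC64) from a main function shows
the only behavioral difference in this sketch: the flag is off under the old
mask and on under the new one, while every other processor is unaffected.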