This is the mail archive of the
mailing list for the GCC project.
Re: [patch] tuning gcc for Intel Core2
- From: Vladimir Makarov <vmakarov at redhat dot com>
- To: Andi Kleen <ak at suse dot de>
- Cc: gcc-patches at gcc dot gnu dot org
- Date: Mon, 13 Nov 2006 13:23:11 -0500
- Subject: Re: [patch] tuning gcc for Intel Core2
- References: <4558A4DD.firstname.lastname@example.org> <email@example.com>
Andi Kleen wrote:
Vladimir Makarov <firstname.lastname@example.org> writes:
I tried these parameters and got better results (although I don't
remember the exact numbers). Actually, I tried all of the parameters: I started
the work before Intel's guide was public, so I had to try every one of them.
+const int x86_accumulate_outgoing_args = m_ATHLON_K8 | m_CORE2 | m_PENT4 | m_NOCONA | m_PPRO | m_GENERIC;
+const int x86_prologue_using_move = m_ATHLON_K8 | m_PPRO | m_CORE2 | m_GENERIC;
+const int x86_epilogue_using_move = m_ATHLON_K8 | m_PPRO | m_CORE2 | m_GENERIC;
Are you sure this is correct? Using moves in epilogue/prologue
generates much bigger code and AFAIK Core2 has special hardware
to avoid any dependencies on the stack pointer, so shorter push/pop
should be as fast here and use less icache.
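
To put rough numbers on the code-size concern above, here is a minimal sketch in C that counts prologue bytes for the two styles on 32-bit x86. The instruction sizes are the standard encodings; the helper names and the assumption that every save slot uses an 8-bit displacement are mine, not part of the patch. A real frame may already need the stack adjustment for locals, in which case the difference is smaller than shown.

```c
/* Byte sizes of common 32-bit x86 encodings (no prefixes, 8-bit
   displacement or immediate). */
enum {
    PUSH_REG_BYTES      = 1, /* push %ebx         -> 53          */
    MOV_TO_STACK_BYTES  = 4, /* mov %ebx,4(%esp)  -> 89 5c 24 04 */
    ADJUST_ESP_BYTES    = 3  /* sub $N,%esp       -> 83 ec NN    */
};

/* Hypothetical helper: bytes needed to save nregs registers with push. */
int prologue_bytes_push(int nregs)
{
    return nregs * PUSH_REG_BYTES;
}

/* Hypothetical helper: bytes needed to save nregs registers with mov.
   Assumes one explicit stack adjustment plus one store per register,
   each store using an 8-bit displacement. */
int prologue_bytes_mov(int nregs)
{
    if (nregs == 0)
        return 0;
    return ADJUST_ESP_BYTES + nregs * MOV_TO_STACK_BYTES;
}
```

Under these assumptions, saving three registers costs 3 bytes with push and 15 bytes with mov, and the epilogue shows a similar gap. Whether the extra bytes matter in practice is exactly what has to be measured.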
Even if Core2 has special hardware to reduce the problem of dependencies
on the stack pointer, that does not mean that using push/pop will be better.
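
For readers unfamiliar with the quoted patch lines: each constant appears to be a bitmask with one bit per tuning target, and a feature is enabled when the bit of the selected processor is set. The sketch below illustrates that reading. The enum, its ordering, the single-bit definition of each m_* mask, and the helper tune_flag_enabled are hypothetical simplifications for illustration, not GCC's actual definitions.

```c
/* Hypothetical list of tuning targets; the real list in GCC is longer
   and ordered differently. */
enum processor_id {
    PROC_I386,
    PROC_PPRO,
    PROC_PENT4,
    PROC_NOCONA,
    PROC_ATHLON_K8,
    PROC_CORE2,
    PROC_GENERIC
};

/* One bit per target (simplified: a single bit each). */
#define m_PPRO      (1 << PROC_PPRO)
#define m_PENT4     (1 << PROC_PENT4)
#define m_NOCONA    (1 << PROC_NOCONA)
#define m_ATHLON_K8 (1 << PROC_ATHLON_K8)
#define m_CORE2     (1 << PROC_CORE2)
#define m_GENERIC   (1 << PROC_GENERIC)

/* Masks as they appear in the quoted patch lines. */
const int x86_accumulate_outgoing_args =
    m_ATHLON_K8 | m_CORE2 | m_PENT4 | m_NOCONA | m_PPRO | m_GENERIC;
const int x86_prologue_using_move =
    m_ATHLON_K8 | m_PPRO | m_CORE2 | m_GENERIC;
const int x86_epilogue_using_move =
    m_ATHLON_K8 | m_PPRO | m_CORE2 | m_GENERIC;

/* Hypothetical helper: nonzero if the feature mask includes the
   processor being tuned for. */
int tune_flag_enabled(int feature_mask, enum processor_id tune)
{
    return (feature_mask & (1 << tune)) != 0;
}
```

With these definitions, tune_flag_enabled(x86_prologue_using_move, PROC_CORE2) is nonzero, while the same query for PROC_PENT4 is zero, so adding m_CORE2 to a mask is what switches the behaviour on for that target.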