This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Re: x86_64 varargs setup jump table
On 07/22/2010 11:02 AM, Sebastian Pop wrote:
> Here are the results on AMD Phenom(tm) 9950 Quad-Core.
>
> Old: Gcc 4.6.0 revision 162355
> New: Gcc 4.6.0 revision 162355 + this patch.
> Flags: -O3 -funroll-loops -fpeel-loops -ffast-math -march=native
>
> The number is the run time percentage: (old - new) / old * 100
> (positive is better)
>
> [ no positive results ]
Hmm. At least HJ had some positive results. I'm surprised
that there are none on the AMD box.
Does movaps have reformatting stalls that perhaps movdqa does
with that particular micro-architecture? Or are stores exempt
from reformatting stalls now?
Otherwise the only thing I can think of is that the computed jump
was in practice very predictable (i.e. lots of calls containing
the same sequence of types), and that performing a few fewer
stores makes the difference.
r~