This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



more on djbfft (performance regression)


From D.J. Bernstein:

> This is stunning: gcc doesn't guarantee 8-byte alignment for double
> variables on the x86.

> This accounts for a small part of the -O6 slowdown that I mentioned. It
> also accounts for some major benchmark weirdnesses that the FFTW people
> saw a few months ago.

> I reorganized djbfft 0.60 for further speedups and to be a bit nicer
> to gcc -O6. Some timings, all with proper alignment:

>   591 594 egcs -O1 -fo-f-p
>   606 606 gcc -O1 -fo-f-p
>   885 641 gcc -O6 -fo-f-p
>   1072 741 egcs -O6 -fo-f-p -mpentiumpro
>   1272 752 egcs -O6 -fo-f-p -mpentium
>   1276 760 egcs -O6 -fo-f-p

Note that "-O6" is slower than "-O1" throughout. egcs is slightly
faster than gcc at -O1, but at -O6 it is much slower than gcc, with
or without -mpentium or -mpentiumpro.
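
For anyone who wants to check whether their own double buffers sit on
8-byte boundaries, here is a minimal sketch in C. The helper names
(is_aligned8, align_up8, aligned_doubles) are hypothetical and are not
taken from djbfft. The sketch assumes a platform where converting a
pointer to uintptr_t and back preserves the address, which holds on
x86, and it is meant for raw storage such as a malloc'ed block.

```c
#include <stddef.h>
#include <stdint.h>

/* Return 1 if the address p is a multiple of 8, otherwise 0. */
static int is_aligned8(const void *p)
{
    return ((uintptr_t)p & 7u) == 0;
}

/* Round p up to the next 8-byte boundary (p itself if already aligned).
   Assumption: pointer <-> uintptr_t conversion preserves the address. */
static void *align_up8(void *p)
{
    uintptr_t a = (uintptr_t)p;
    return (void *)((a + 7u) & ~(uintptr_t)7u);
}

/* Carve an 8-byte-aligned array of n doubles out of `size` bytes of raw
   storage (for example from malloc). Returns NULL if the storage is too
   small once the leading padding has been skipped. */
static double *aligned_doubles(void *raw, size_t size, size_t n)
{
    unsigned char *start = raw;
    unsigned char *p = align_up8(raw);
    size_t skipped = (size_t)(p - start);

    if (skipped > size || n > (size - skipped) / sizeof(double))
        return NULL;
    return (double *)p;
}
```

To use it, allocate n * sizeof(double) + 7 bytes, pass the block to
aligned_doubles, and keep the original pointer around for free().
Calling is_aligned8 on the address of a local or static double shows
what alignment the compiler actually chose for it.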

Again, the code is at
ftp://koobera.math.uic.edu/pub/software/djbfft-0.60.tar.gz


