This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



inconsistent gcc performance on this code



Hopefully someone can point out what I'm missing. I'm using
stock gcc 3.4.1, compiled from source on an Opteron system,
and am experimenting with automatically generated C source
that performs very large convolutions.

www.boo.net/~jasonp/fgt.tar.gz contains two source files,
fgt5a.c and fgt5b.c; each contains automatically generated
code that computes a 32-point fast Galois transform (think
of it as a 32-point FFT where the elements are 64-bit
integers reduced modulo a 61-bit prime). Both files
generate the same answers, and perform the same arithmetic
in the same order. Each file has one function with a
single basic block that does a massive amount of arithmetic
on a very large set of automatic variables. Both functions
also use inline assembly to access the 64-bit x 64-bit ->
128-bit multiply on the Opteron.
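
For readers without the tarball, here is a minimal sketch of what
such a multiply-and-reduce step might look like. It is not taken
from fgt5a.c or fgt5b.c. The names mul64x64, mulmod61 and P61 are
hypothetical, and the modulus 2^61 - 1 is an assumption made only
for illustration; the post says "a 61-bit prime" without naming it.

```c
#include <assert.h>
#include <stdint.h>

/* ASSUMPTION: the 61-bit prime is taken to be 2^61 - 1 for this sketch. */
#define P61 ((UINT64_C(1) << 61) - 1)

/* Hypothetical helper: full 64x64 -> 128-bit unsigned product. */
static inline void mul64x64(uint64_t a, uint64_t b,
                            uint64_t *hi, uint64_t *lo)
{
#if defined(__GNUC__) && defined(__x86_64__)
    uint64_t h, l;
    /* MUL multiplies RAX by the operand and leaves the result in RDX:RAX. */
    __asm__("mulq %3" : "=a"(l), "=d"(h) : "a"(a), "rm"(b) : "cc");
    *hi = h;
    *lo = l;
#elif defined(__SIZEOF_INT128__)
    /* Portable fallback for compilers that provide a 128-bit integer type. */
    unsigned __int128 p = (unsigned __int128)a * b;
    *hi = (uint64_t)(p >> 64);
    *lo = (uint64_t)p;
#else
#error "this sketch needs x86-64 inline asm or unsigned __int128"
#endif
}

/* Hypothetical helper: (a * b) mod (2^61 - 1), for a, b < 2^61 - 1. */
static inline uint64_t mulmod61(uint64_t a, uint64_t b)
{
    uint64_t hi, lo, r;

    assert(a < P61 && b < P61);
    mul64x64(a, b, &hi, &lo);

    /* Since 2^61 == 1 (mod P61), add the bits above bit 60 to the low 61 bits. */
    r = ((hi << 3) | (lo >> 61)) + (lo & P61);
    r = (r & P61) + (r >> 61);          /* fold the possible carry once more */
    return r >= P61 ? r - P61 : r;      /* final conditional subtraction */
}
```

With a modulus of this shape the reduction needs only shifts, masks
and adds, so the multiply is the one place where inline assembly
would be required on a compiler without a 128-bit integer type.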

Compiling fgt5b with '-O3 -fomit-frame-pointer' generates
code that runs ~20% faster and is ~25% smaller than fgt5a.
The only difference between the two files is that 5a
writes each result to a different variable, while 5b
sometimes reuses the same set of 8 variables for common
(temporary) operations.
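
To make the difference concrete, here is a toy sketch of the two
naming styles. It is not the generated code: the 4-point
add/subtract butterfly and the names butterfly_unique and
butterfly_reused are hypothetical, and the modular reduction is
left out to keep it short. Both functions do the same arithmetic in
the same order and differ only in where the results are stored.

```c
#include <stdint.h>

/* Style described for fgt5a: every result gets its own variable. */
void butterfly_unique(const uint64_t in[4], uint64_t out[4])
{
    uint64_t t0 = in[0] + in[2];
    uint64_t t1 = in[0] - in[2];
    uint64_t t2 = in[1] + in[3];
    uint64_t t3 = in[1] - in[3];
    uint64_t t4 = t0 + t2;
    uint64_t t5 = t0 - t2;
    uint64_t t6 = t1 + t3;
    uint64_t t7 = t1 - t3;

    out[0] = t4;
    out[1] = t6;
    out[2] = t5;
    out[3] = t7;
}

/* Style described for fgt5b: a small pool of temporaries is reused. */
void butterfly_reused(const uint64_t in[4], uint64_t out[4])
{
    uint64_t a, b, c, d, t;

    a = in[0] + in[2];
    b = in[0] - in[2];
    c = in[1] + in[3];
    d = in[1] - in[3];
    t = a + c;      /* same value as t4 above */
    a = a - c;      /* a is overwritten; same value as t5 */
    c = b + d;      /* c is overwritten; same value as t6 */
    b = b - d;      /* b is overwritten; same value as t7 */

    out[0] = t;
    out[1] = c;
    out[2] = a;
    out[3] = b;
}
```

In a function this small any compiler should produce equivalent
code for both; the effect described above shows up only when there
are far more live values than registers.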

I'm trying to understand why there's a difference here,
essentially as a result of picking different variable names.
The number of variables is not the root cause; I've produced
other versions of this code that attempt to minimize the
number of declared variables, and that code is also slow.
-fnew-ra does a uniformly worse job.

Are there any heuristics that I can use to nudge gcc's
register allocator into doing a better job on code like this?
I would have thought that the compiler could figure out for
itself how best to conserve registers. The FFTW library used
to have the same problem; disabling the second scheduling pass
made FFTW 30% faster and half the size.

Any help appreciated.

jasonp

