This is the mail archive of the
gcc@gcc.gnu.org
mailing list for the GCC project.
Re: inlining inefficiencies
Dan Nicolaescu <dann@godzilla.ICS.UCI.EDU> writes:
| Gabriel Dos Reis <gdr@codesourcery.com> writes:
|
| > Dan Nicolaescu <dann@godzilla.ICS.UCI.EDU> writes:
| >
| > | Gabriel Dos Reis <gdr@codesourcery.com> writes:
| > |
| > | > Dan Nicolaescu <dann@godzilla.ICS.UCI.EDU> writes:
| > | >
| > | > | There are some problems with inlining as shown by the code below
| > | > | (derived from oopack)
| > | > |
| > | > | class Complex_d {
| > | > | public:
| > | > |   double re, im;
| > | > |   Complex_d (double r, double i) : re(r), im(i) {}
| > | > |   Complex_d () {}
| > | > | };
| > | >
| > | > Incidentally, I would like to mention that the compiler seems to have
| > | > some unexplained difficulty optimizing similar constructs with
| > | > double __complex__ -- this has been mentioned in the past, and I
| > | > believe the situation hasn't improved :-(
| > |
| > | If the difficulties you mention are related to aliasing, and you have
| > | some testcases, please send them to me.
| >
| > I did some preliminary analysis here:
| >
| > http://gcc.gnu.org/ml/libstdc++/2001-11/msg00038.html
| >
| > I suspect some aliasing issues, but I can't tell for sure.
|
| The code generated by 3.2 is a little better.
| The problem is the same: when functions are inlined, the argument
| passing is inlined too, and on SPARC v8 float arguments are passed in
| integer registers, so a lot of code is generated to move
| arguments between the float and integer registers (through memory).
Thanks for the detective work and for shedding some light (for me at
least) on this issue.
| Your example looks much better when compiled with -mcpu=v9 -m64
|
| __complex__ double add(__complex__ double z1, __complex__ double z2)
| {
|   return z1 + z2;
| }
|
| _Z3addCdS_:
|         !#PROLOGUE# 0
|         !#PROLOGUE# 1
|         fmovd   %f0, %f8
|         faddd   %f8, %f4, %f12
|         faddd   %f2, %f6, %f4
|         fmovd   %f4, %f2
|         retl
|         fmovd   %f12, %f0
Indeed, it looks much better. Do you know why GCC can't be convinced
to emit something like:
        faddd   %f0, %f4, %f0
        retl
        faddd   %f2, %f6, %f2
That is, why does GCC want to do all those moves?
| struct complex { double re, im; };
|
| complex add(complex z1, complex z2)
| {
|   complex w;
|   w.re = z1.re + z2.re;
|   w.im = z1.im + z2.im;
|   return w;
| }
|
| _Z3add7complexS_:
|         !#PROLOGUE# 0
|         add     %sp, -224, %sp
|         !#PROLOGUE# 1
|         faddd   %f0, %f4, %f8
|         faddd   %f2, %f6, %f4
|         std     %f8, [%sp+192]
|         std     %f4, [%sp+200]
|         ldx     [%sp+192], %g4
|         ldx     [%sp+200], %g1
|         stx     %g4, [%sp+176]
|         stx     %g1, [%sp+184]
|         ldd     [%sp+176], %f0
|         ldd     [%sp+184], %f2
|         nop
|         retl
|         sub     %sp, -224, %sp
This one is really ridiculous^Wfunny :-)
Any SPARC back-end experts out there?
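
For anyone who wants to reproduce the comparison, here is a minimal
sketch of a single test case holding both variants. The names
gnu_complex, cplx, add_builtin, add_struct and variants_agree are made
up for this illustration (the struct is renamed from `complex' only to
avoid clashing with library names), and the optimization level is an
assumption on my part; only -mcpu=v9 -m64 comes from the message quoted
above.

```cpp
#include <cassert>  // only needed by callers that assert on variants_agree()

// GNU extension type, as used in the quoted __complex__ example.
typedef __complex__ double gnu_complex;

// Plain struct variant (hypothetical name; the quoted code calls it `complex').
struct cplx { double re, im; };

// Built-in complex addition.
gnu_complex add_builtin(gnu_complex z1, gnu_complex z2)
{
  return z1 + z2;
}

// Member-wise addition on the plain struct.
cplx add_struct(cplx z1, cplx z2)
{
  cplx w;
  w.re = z1.re + z2.re;
  w.im = z1.im + z2.im;
  return w;
}

// Returns true when both variants produce the same sum for one sample input.
bool variants_agree()
{
  gnu_complex a, b;
  __real__ a = 1.0; __imag__ a = 2.0;
  __real__ b = 3.0; __imag__ b = 4.0;
  gnu_complex s = add_builtin(a, b);

  cplx c = { 1.0, 2.0 };
  cplx d = { 3.0, 4.0 };
  cplx t = add_struct(c, d);

  return __real__ s == t.re && __imag__ s == t.im;
}
```

Compiling this with something like `g++ -O2 -S -mcpu=v9 -m64' on a
SPARC target should let the two function bodies be compared side by
side in the generated assembly; the exact output will depend on the
GCC version.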
-- Gaby