This is the mail archive of the libstdc++@gcc.gnu.org mailing list for the libstdc++ project.



Pessimization in compiler support for builtin __complex__


Benjamin Kosnik <bkoz@redhat.com> writes:

[...]

| Does this impact the performance noted here:
| 
| http://gcc.gnu.org/ml/libstdc++/2001-09/msg00091.html
| http://gcc.gnu.org/ml/libstdc++/2001-09/msg00110.html

No, the performance regression noted there is an orthogonal story,
which I'm trying to understand.  The analysis below is conducted with
GCC-3.1 20011101 on a sparc sun4u SUNW,Ultra-2.


gcc -O2 translates the following

   __complex__ double add(__complex__ double z1, __complex__ double z2)
   {
     return z1 + z2;
   }

into

   add:
	   !#PROLOGUE# 0
	   save    %sp, -120, %sp
	   !#PROLOGUE# 1
	   std     %i0, [%fp-24]
	   ldd     [%fp-24], %f2
	   st      %i4, [%fp+84]
	   fmovs   %f2, %f4
	   std     %i2, [%fp-24]
	   st      %i5, [%fp+88]
	   fmovs   %f3, %f5
	   ldd     [%fp-24], %f2
	   fmovs   %f2, %f6
	   fmovs   %f3, %f7
	   ld      [%fp+84], %f2
	   ld      [%fp+88], %f3
	   faddd   %f4, %f2, %f4
	   ld      [%fp+92], %f2
	   std     %f4, [%fp-24]
	   ld      [%fp+96], %f3
	   faddd   %f6, %f2, %f6
	   ldd     [%fp-24], %o0
	   mov     %o0, %i0
	   mov     %o1, %i1
	   std     %f6, [%fp-24]
	   ldd     [%fp-24], %o0
	   mov     %o0, %i2
	   mov     %o1, %i3
	   ret
	   restore

which I find quite surprising, compared to the following

    struct complex { double re, im; };

    complex add(complex z1, complex z2)
    {
      complex w;
      w.re = z1.re + z2.re;
      w.im = z1.im + z2.im;
      return w;
    }

translated by g++ -O2 into

   add(complex, complex):
   .LLFB2:
	   !#PROLOGUE# 0
	   !#PROLOGUE# 1
	   mov     %o0, %o3
	   ldd     [%o1], %f2
	   ldd     [%o3], %f4
	   faddd   %f4, %f2, %f4
	   ld      [%sp+64], %o2
	   mov     %o2, %o0
	   std     %f4, [%o2]
	   ldd     [%o3+8], %f2
	   ldd     [%o1+8], %f4
	   faddd   %f2, %f4, %f2
	   jmp     %o7+12
	   std     %f2, [%o2+8]


Multiplication of complex numbers is also affected in the same way.
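
For anyone wanting to reproduce the multiplication case, the two
variants might look roughly like the sketch below. The names cplx,
mul_struct, mul_builtin and self_check are hypothetical and are not
taken from the original test; cplx merely stands in for the
"struct complex" above, and the builtin variant assumes a GNU-compatible
compiler.

```cpp
#include <cassert>

// Hypothetical stand-in for the post's `struct complex`.
struct cplx { double re, im; };

// Component-wise product: (a + bi)(c + di) = (ac - bd) + (ad + bc)i.
cplx mul_struct(cplx z1, cplx z2)
{
  cplx w;
  w.re = z1.re * z2.re - z1.im * z2.im;
  w.im = z1.re * z2.im + z1.im * z2.re;
  return w;
}

#ifdef __GNUC__
// Same product through the GNU builtin complex type (compiler extension).
__complex__ double mul_builtin(__complex__ double z1, __complex__ double z2)
{
  return z1 * z2;
}
#endif

// Small sanity check: (1 + 2i)(3 + 4i) = -5 + 10i.
void self_check()
{
  cplx a = {1.0, 2.0};
  cplx b = {3.0, 4.0};
  cplx p = mul_struct(a, b);
  assert(p.re == -5.0 && p.im == 10.0);
}
```

Compiling this with g++ -O2 -S and comparing the assembly emitted for
mul_struct and mul_builtin should show whether the same extra
stores and reloads appear on a given target.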

That is something I would like a middle-end/back-end expert to
explain to me.  I know g++ implements the named return value
optimization, but I don't think that suffices to explain the
difference.

I'm also interested in why GCC doesn't generate

           ld    [%sp+64], %o2

instead of

	   ld      [%sp+64], %o2
	   mov     %o2, %o0

-- Gaby
CodeSourcery, LLC                       http://www.codesourcery.com


