20610 – Real by complex multiplications perform unnecessary operations

Bug 20610 - Real by complex multiplications perform unnecessary operations

Summary: Real by complex multiplications perform unnecessary operations

Status:	RESOLVED FIXED

Alias:	None

Product:	gcc
Classification:	Unclassified
Component:	tree-optimization (show other bugs)
Version:	4.0.0

Importance:	P2 normal
Target Milestone:	4.1.0
Assignee:	Richard Henderson

URL:
Keywords:

Depends on:
Blocks:	21994
	Show dependency tree / graph

Reported:	2005-03-23 20:59 UTC by Fredrik Huss
Modified:	2005-06-23 07:52 UTC (History)
CC List:	6 users (show)

See Also:
Host:	i686-pc-linux-gnu
Target:	i686-pc-linux-gnu
Build:	i686-pc-linux-gnu
Known to work:
Known to fail:
Last reconfirmed:	2005-06-02 21:48:21

Attachments
Add an attachment (proposed patch, testcase, etc.)

Note You need to log in before you can comment on or make changes to this bug.

Description Fredrik Huss 2005-03-23 20:59:57 UTC

When doing real by complex multiplications in C++ using std::complex<double>, 
the multiplication is transformed to a complex by complex multiplication, even 
though the imaginary part is zero for one of the arguments.  
  
This is similar to TR 19953, however there is still a problem even though it is 
supposed to be fixed.  
  
Here is a small test program:  
  
#include <complex>  
  
std::complex<double> a, b;  
double c;  
  
void f()  
{  
 a = b * c;  
}  
  
When compiled with "g++ -O3 -march=pentium4 -mfpmath=sse -S -c test.C", the 
following is produced (from test.s):  
  
_Z1fv:  
.LFB1857:  
 pushl %ebp  
.LCFI0:  
 movl %esp, %ebp  
.LCFI1:  
 movsd b+8, %xmm3  
 movsd b, %xmm2  
 movsd c, %xmm4  
 pxor %xmm5, %xmm5  
 movapd %xmm2, %xmm0  
 mulsd %xmm5, %xmm0  
 movapd %xmm4, %xmm1  
 mulsd %xmm3, %xmm1  
 addsd %xmm1, %xmm0  
 movsd %xmm0, a+8  
 mulsd %xmm4, %xmm2  
 mulsd %xmm5, %xmm3  
 subsd %xmm3, %xmm2  
 movsd %xmm2, a  
 popl %ebp  
 ret  
  
I.e, the real value c is still converted to a complex value and a full 
multiplication is done.  
  
I'm using the following version of gcc:  
  
Using built-in specs.  
Target: i686-pc-linux-gnu  
Configured with: ../gcc-4.0-20050312/configure --prefix=/home/fredrik/gcc  
--enable-languages=c,c++  
Thread model: posix  
gcc version 4.0.0 20050312 (prerelease)

Comment 1 Andrew Pinski 2005-03-23 21:11:39 UTC

Because C99 is different from C++.

Comment 2 Paolo Carlini 2005-03-24 00:19:37 UTC

But, isn't 19953 about -ffast-math? Or you really want the same code *without*
that switch?!? We are talking about completely different issues, IMHO.

Comment 3 Paolo Carlini 2005-03-24 00:26:03 UTC

For concreteness, this is what I get (with 4.0.0 20050321) if I add
-ffast-math to your switches, seems not so bad, first blush:

_Z1fv:
.LFB1939:
        pushl   %ebp
.LCFI0:
        movl    %esp, %ebp
.LCFI1:
        movsd   c, %xmm0
        movsd   b, %xmm1
        mulsd   %xmm0, %xmm1
        mulsd   b+8, %xmm0
        movsd   %xmm0, a+8
        movsd   %xmm1, a
        popl    %ebp
        ret

Comment 4 Fredrik Huss 2005-03-24 09:21:05 UTC

Thanks for looking into this!

Yes, I was meaning when -ffast-math is NOT used, so maybe this is completely 
unrelated. But I was thinking that even without -ffast-math, this should not 
require a full complex multiplication.

I looked in the C++ library, and the multiplication is implemented as

  template<typename _Tp>
    inline complex<_Tp>
    operator*(const complex<_Tp>& __x, const _Tp& __y)
    {
      complex<_Tp> __r = __x;
      __r *= __y;
      return __r;
    }

where operator*=() is implemented as

  inline complex<double>&
  complex<double>::operator*=(double __d)
  {
    _M_value *= __d;
    return *this;
  }

The typedef _M_value is defined as __complex__ double. So it seems that the 
multiplication is converted to a full complex multiplication within the 
compiler.

For comparison, here is the generic version of operator*()

  // 26.2.5/5
  template<typename _Tp>
    complex<_Tp>&
    complex<_Tp>::operator*=(const _Tp& __t)
    {
      _M_real *= __t;
      _M_imag *= __t;
      return *this;
    }

Here, the multiplication is performed separately for the real and imaginary 
parts.

It would be nice if the multiplication worked like this also for 
complex<double>, even without -ffast-math. Or, is there something in the 
standard which would disallow this?

Comment 5 Paolo Carlini 2005-03-24 09:28:04 UTC

> It would be nice if the multiplication worked like this also for 
> complex<double>, even without -ffast-math. Or, is there something in the 
> standard which would disallow this?

I don't think there is. But, AFAIK, this must be fixed in the compiler, not
in the library: i.e., if you try changing operator*(const complex<_Tp>&,
const _Tp&) to perform explicitly two separate real multiplications, nothing
really changes because the right member is converted anyway to a complex.
Therefore, the middle-end categorization is correct.

Comment 6 Paolo Carlini 2005-03-24 09:33:56 UTC

Side note: here we are talking about the specializations for real/double/long
double, therefore no _M_real but __real__ _M_value and so on. In this case we
rely on the compiler to expand the operations using to builtin support for
complex types.

Comment 7 Paolo Carlini 2005-03-24 09:55:00 UTC

Argh! No, I was totally wrong: I tried fixing this problem some time ago and
did something wrong. We can fix it in the library. Sorry.

Comment 8 Paolo Carlini 2005-03-24 11:08:03 UTC

Richard, can you possibly have a look? Why fold_complex_mult_parts doesn't
optimize well complex * real also when no-fast-math?

Comment 9 Paolo Carlini 2005-03-24 11:39:30 UTC

The middle-end problem seems that, when no-fast-math, c99-conforming method, we 
are not implementing systematically the special cases in G.5.1/2 (and /3 for the
division).

Comment 10 Paolo Carlini 2005-03-24 15:52:04 UTC

I'm recategorizing back to middle-end: really this should be properly fixed
in the middle-end.

Comment 11 GCC Commits 2005-06-09 07:44:25 UTC

Subject: Bug 20610

CVSROOT:	/cvs/gcc
Module name:	gcc
Changes by:	rth@gcc.gnu.org	2005-06-09 07:43:47

Modified files:
	gcc            : ChangeLog gimplify.c integrate.c tree-complex.c 
	                 tree-gimple.c tree-optimize.c tree-pass.h 
	                 tree.h 

Log message:
	PR tree-opt/20610
	* tree.h (DECL_COMPLEX_GIMPLE_REG_P): New.
	(struct tree_decl): Add gimple_reg_flag.
	* integrate.c (copy_decl_for_inlining): Copy it.
	* gimplify.c (internal_get_tmp_var): Set it.
	(gimplify_bind_expr): Likewise.
	(gimplify_function_tree): Likewise.
	(gimplify_modify_expr_complex_part): New.
	(gimplify_modify_expr): Use it.
	* tree-gimple.c (is_gimple_reg_type): Allow complex.
	(is_gimple_reg): Allow complex with DECL_COMPLEX_GIMPLE_REG_P set.
	
	* tree-complex.c (complex_lattice_t): New.
	(complex_lattice_values, complex_variable_components): New.
	(some_nonzerop, find_lattice_value, is_complex_reg,
	init_parameter_lattice_values, init_dont_simulate_again,
	complex_visit_stmt, complex_visit_phi, create_components,
	update_complex_components, update_parameter_components,
	update_phi_components, update_all_vops, expand_complex_move): New.
	(extract_component): Handle INDIRECT_REF, COMPONENT_REF, ARRAY_REF,
	SSA_NAME.
	(update_complex_assignment): Use update_complex_components;
	handle updates of return_expr properly.
	(expand_complex_addition): Use complex lattice values.
	(expand_complex_multiplication): Likewise.
	(expand_complex_division): Likewise.
	(expand_complex_libcall): Use update_complex_components.
	(expand_complex_comparison): Use update_stmt.
	(expand_complex_operations_1): Use expand_complex_move, retrieve
	lattice values.
	(tree_lower_complex): Compute lattice values.
	(tree_lower_complex_O0): Duplicate from tree_lower_complex.
	(pass_lower_complex_O0): Rename from pass_lower_complex.
	(pass_lower_complex, gate_no_optimization): New.
	* tree-optimize.c (init_tree_optimization_passes): Update for
	complex pass changes.
	* tree-pass.h (pass_lower_complex_O0): Declare.

Patches:
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/ChangeLog.diff?cvsroot=gcc&r1=2.9100&r2=2.9101
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/gimplify.c.diff?cvsroot=gcc&r1=2.135&r2=2.136
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/integrate.c.diff?cvsroot=gcc&r1=1.279&r2=1.280
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/tree-complex.c.diff?cvsroot=gcc&r1=2.26&r2=2.27
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/tree-gimple.c.diff?cvsroot=gcc&r1=2.38&r2=2.39
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/tree-optimize.c.diff?cvsroot=gcc&r1=2.105&r2=2.106
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/tree-pass.h.diff?cvsroot=gcc&r1=2.40&r2=2.41
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/tree.h.diff?cvsroot=gcc&r1=1.735&r2=1.736

Comment 12 Richard Henderson 2005-06-09 08:14:34 UTC

Fixed.

Comment 13 Andreas Schwab 2005-06-10 14:17:35 UTC

Breaks Ada on ia64.

Comment 14 Andrew Pinski 2005-06-10 14:20:30 UTC

Yes it might break Ada, but since this is still fixed, I am going to keep this as fixed unless it is reverted.

Comment 15 fengwang 2005-06-23 07:52:44 UTC

And in extract_component() in tree-complex.c file, I need handle 
VIEW_CONVER_EXPR.
On gfortran front end, I need treat other type data as complex. So I build 
VIEW_CONVER_EXPR tree node and failed at extract_component(). Can you fix 
this, Richard?