When doing real by complex multiplications in C++ using std::complex<double>, the multiplication is transformed to a complex by complex multiplication, even though the imaginary part is zero for one of the arguments. This is similar to TR 19953, however there is still a problem even though it is supposed to be fixed. Here is a small test program: #include <complex> std::complex<double> a, b; double c; void f() { a = b * c; } When compiled with "g++ -O3 -march=pentium4 -mfpmath=sse -S -c test.C", the following is produced (from test.s): _Z1fv: .LFB1857: pushl %ebp .LCFI0: movl %esp, %ebp .LCFI1: movsd b+8, %xmm3 movsd b, %xmm2 movsd c, %xmm4 pxor %xmm5, %xmm5 movapd %xmm2, %xmm0 mulsd %xmm5, %xmm0 movapd %xmm4, %xmm1 mulsd %xmm3, %xmm1 addsd %xmm1, %xmm0 movsd %xmm0, a+8 mulsd %xmm4, %xmm2 mulsd %xmm5, %xmm3 subsd %xmm3, %xmm2 movsd %xmm2, a popl %ebp ret I.e, the real value c is still converted to a complex value and a full multiplication is done. I'm using the following version of gcc: Using built-in specs. Target: i686-pc-linux-gnu Configured with: ../gcc-4.0-20050312/configure --prefix=/home/fredrik/gcc --enable-languages=c,c++ Thread model: posix gcc version 4.0.0 20050312 (prerelease)
Because C99 is different from C++.
But, isn't 19953 about -ffast-math? Or you really want the same code *without* that switch?!? We are talking about completely different issues, IMHO.
For concreteness, this is what I get (with 4.0.0 20050321) if I add -ffast-math to your switches, seems not so bad, first blush: _Z1fv: .LFB1939: pushl %ebp .LCFI0: movl %esp, %ebp .LCFI1: movsd c, %xmm0 movsd b, %xmm1 mulsd %xmm0, %xmm1 mulsd b+8, %xmm0 movsd %xmm0, a+8 movsd %xmm1, a popl %ebp ret
Thanks for looking into this! Yes, I was meaning when -ffast-math is NOT used, so maybe this is completely unrelated. But I was thinking that even without -ffast-math, this should not require a full complex multiplication. I looked in the C++ library, and the multiplication is implemented as template<typename _Tp> inline complex<_Tp> operator*(const complex<_Tp>& __x, const _Tp& __y) { complex<_Tp> __r = __x; __r *= __y; return __r; } where operator*=() is implemented as inline complex<double>& complex<double>::operator*=(double __d) { _M_value *= __d; return *this; } The typedef _M_value is defined as __complex__ double. So it seems that the multiplication is converted to a full complex multiplication within the compiler. For comparison, here is the generic version of operator*() // 26.2.5/5 template<typename _Tp> complex<_Tp>& complex<_Tp>::operator*=(const _Tp& __t) { _M_real *= __t; _M_imag *= __t; return *this; } Here, the multiplication is performed separately for the real and imaginary parts. It would be nice if the multiplication worked like this also for complex<double>, even without -ffast-math. Or, is there something in the standard which would disallow this?
> It would be nice if the multiplication worked like this also for > complex<double>, even without -ffast-math. Or, is there something in the > standard which would disallow this? I don't think there is. But, AFAIK, this must be fixed in the compiler, not in the library: i.e., if you try changing operator*(const complex<_Tp>&, const _Tp&) to perform explicitly two separate real multiplications, nothing really changes because the right member is converted anyway to a complex. Therefore, the middle-end categorization is correct.
Side note: here we are talking about the specializations for real/double/long double, therefore no _M_real but __real__ _M_value and so on. In this case we rely on the compiler to expand the operations using to builtin support for complex types.
Argh! No, I was totally wrong: I tried fixing this problem some time ago and did something wrong. We can fix it in the library. Sorry.
Richard, can you possibly have a look? Why fold_complex_mult_parts doesn't optimize well complex * real also when no-fast-math?
The middle-end problem seems that, when no-fast-math, c99-conforming method, we are not implementing systematically the special cases in G.5.1/2 (and /3 for the division).
I'm recategorizing back to middle-end: really this should be properly fixed in the middle-end.
Subject: Bug 20610 CVSROOT: /cvs/gcc Module name: gcc Changes by: rth@gcc.gnu.org 2005-06-09 07:43:47 Modified files: gcc : ChangeLog gimplify.c integrate.c tree-complex.c tree-gimple.c tree-optimize.c tree-pass.h tree.h Log message: PR tree-opt/20610 * tree.h (DECL_COMPLEX_GIMPLE_REG_P): New. (struct tree_decl): Add gimple_reg_flag. * integrate.c (copy_decl_for_inlining): Copy it. * gimplify.c (internal_get_tmp_var): Set it. (gimplify_bind_expr): Likewise. (gimplify_function_tree): Likewise. (gimplify_modify_expr_complex_part): New. (gimplify_modify_expr): Use it. * tree-gimple.c (is_gimple_reg_type): Allow complex. (is_gimple_reg): Allow complex with DECL_COMPLEX_GIMPLE_REG_P set. * tree-complex.c (complex_lattice_t): New. (complex_lattice_values, complex_variable_components): New. (some_nonzerop, find_lattice_value, is_complex_reg, init_parameter_lattice_values, init_dont_simulate_again, complex_visit_stmt, complex_visit_phi, create_components, update_complex_components, update_parameter_components, update_phi_components, update_all_vops, expand_complex_move): New. (extract_component): Handle INDIRECT_REF, COMPONENT_REF, ARRAY_REF, SSA_NAME. (update_complex_assignment): Use update_complex_components; handle updates of return_expr properly. (expand_complex_addition): Use complex lattice values. (expand_complex_multiplication): Likewise. (expand_complex_division): Likewise. (expand_complex_libcall): Use update_complex_components. (expand_complex_comparison): Use update_stmt. (expand_complex_operations_1): Use expand_complex_move, retrieve lattice values. (tree_lower_complex): Compute lattice values. (tree_lower_complex_O0): Duplicate from tree_lower_complex. (pass_lower_complex_O0): Rename from pass_lower_complex. (pass_lower_complex, gate_no_optimization): New. * tree-optimize.c (init_tree_optimization_passes): Update for complex pass changes. * tree-pass.h (pass_lower_complex_O0): Declare. Patches: http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/ChangeLog.diff?cvsroot=gcc&r1=2.9100&r2=2.9101 http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/gimplify.c.diff?cvsroot=gcc&r1=2.135&r2=2.136 http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/integrate.c.diff?cvsroot=gcc&r1=1.279&r2=1.280 http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/tree-complex.c.diff?cvsroot=gcc&r1=2.26&r2=2.27 http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/tree-gimple.c.diff?cvsroot=gcc&r1=2.38&r2=2.39 http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/tree-optimize.c.diff?cvsroot=gcc&r1=2.105&r2=2.106 http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/tree-pass.h.diff?cvsroot=gcc&r1=2.40&r2=2.41 http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/tree.h.diff?cvsroot=gcc&r1=1.735&r2=1.736
Fixed.
Breaks Ada on ia64.
Yes it might break Ada, but since this is still fixed, I am going to keep this as fixed unless it is reverted.
And in extract_component() in tree-complex.c file, I need handle VIEW_CONVER_EXPR. On gfortran front end, I need treat other type data as complex. So I build VIEW_CONVER_EXPR tree node and failed at extract_component(). Can you fix this, Richard?