Bug 38969 - [4.3 Regression] -foptimize-sibling-calls generates wrong code on alpha
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: middle-end
Version: 4.3.4
Importance: P3 normal
Target Milestone: 4.3.4
Assignee: Uroš Bizjak
URL: http://gcc.gnu.org/ml/gcc-patches/200...
Keywords:
Depends on:
Blocks:
 
Reported: 2009-01-25 16:59 UTC by Aurelien Jarno
Modified: 2009-01-29 10:10 UTC

See Also:
Host: alphaev68-unknown-linux-gnu
Target: alphaev68-unknown-linux-gnu
Build: alphaev68-unknown-linux-gnu
Known to work:
Known to fail:
Last reconfirmed: 2009-01-25 19:55:31


Description Aurelien Jarno 2009-01-25 16:59:33 UTC
gcc 4.3 on alpha generates wrong code when -foptimize-sibling-calls is
used (which is enabled at -O2). gcc 4.2 was not affected, and the
problem is still reproducible with gcc trunk as of 20090106. This is
the reason why most of the complex tests of the glibc testsuite are
failing with gcc 4.3.

Here is a reduced testcase:

$ cat main.c
#include <math.h>
#include <complex.h>

__complex__ float my_print_complex (__complex__ float x);

int main()
{
  __complex__ float a;
  __real__ a = 9;
  __imag__ a = 42;

  my_print_complex(a);

  return 0;
}
$ cat print.c
#include <complex.h>
#include <math.h>
#include <stdio.h>

__complex__ float internal_print_complex (__complex__ float x)
{

  printf("%f+%fi\n", __real__ x, __imag__ x);

  if (__real__ x < 0)
  {
     __real__ x = -__real__ x;
  }

  return x;
}

__complex__ float my_print_complex (__complex__ float x)
{
  return internal_print_complex (x);
}
$ gcc-4.3 -Wall -c main.c -o main.o
$ gcc-4.3 -O1 -foptimize-sibling-calls -Wall -c print.c -o print.o
$ gcc-4.3 -o test main.o print.o
$ ./test
0.000000+0.000000i

When print.o is not compiled with -foptimize-sibling-calls, the result
is:
$ ./test
9.000000+42.000000i
$
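
For reference, the two-file testcase above can probably be folded into a single self-checking file. The following is only a sketch under assumptions: the function names (return_complex, forward_complex, complex_roundtrip_ok) are made up for illustration, and the noinline attribute is added so that the forwarded call stays a real call that the sibling-call optimization can act on. It is not necessarily what a committed regression test looks like.

```c
/* noinline keeps both calls as real calls, so the call inside
   forward_complex stays a candidate for sibling-call optimization.  */
__attribute__ ((noinline)) __complex__ float
return_complex (__complex__ float x)
{
  return x;
}

/* Same shape as my_print_complex above: forward the complex
   argument unchanged to another function and return its result.  */
__attribute__ ((noinline)) __complex__ float
forward_complex (__complex__ float x)
{
  return return_complex (x);
}

/* Returns 1 if both parts survive the forwarded call, 0 otherwise.  */
int
complex_roundtrip_ok (float re, float im)
{
  __complex__ float a, b;

  __real__ a = re;
  __imag__ a = im;

  b = forward_complex (a);

  return __real__ b == re && __imag__ b == im;
}
```

Calling complex_roundtrip_ok (9.0f, 42.0f) from main and aborting when it returns 0 would turn this into an execute-style test. Whether it triggers the failure still depends on the target and on building with -O1 -foptimize-sibling-calls (or -O2).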
Comment 1 Uroš Bizjak 2009-01-25 17:39:18 UTC
Can you attach a working asm dump?
Comment 2 Uroš Bizjak 2009-01-25 19:55:31 UTC
gcc should initialize the pseudos from the "x" variable in my_print_complex:

;; Function my_print_complex (my_print_complex)


;; Generating RTL for gimple basic block 2

;; D.2813 = internal_print_complex (x); [tail call]

(insn 17 6 18 pr38969.c:20 (set (reg:SF 48 $f16)
        (reg:SF 75 [ D.2870 ])) -1 (nil))

(insn 18 17 19 pr38969.c:20 (set (reg:SF 49 $f17)
        (reg:SF 76 [ D.2870+4 ])) -1 (nil))

(call_insn 19 18 20 pr38969.c:20 (parallel [
            (set (parallel [
                        (expr_list:REG_DEP_TRUE (reg:SF 32 $f0)
                            (const_int 0 [0x0]))
                        (expr_list:REG_DEP_TRUE (reg:SF 33 $f1)
                            (const_int 4 [0x4]))
                    ])
                (call (mem:DI (symbol_ref:DI ("internal_print_complex") [flags 0x3] <function_decl 0x7f203091dc00 internal_print_complex>) [0 S8 A64])
                    (const_int 0 [0x0])))
            (use (reg:DI 29 $29))
            (clobber (reg:DI 26 $26))
        ]) -1 (expr_list:REG_EH_REGION (const_int 0 [0x0])
        (nil))
    (expr_list:REG_DEP_TRUE (use (reg:SF 49 $f17))
        (expr_list:REG_DEP_TRUE (use (reg:SF 48 $f16))
            (nil))))


The missing part is:

(insn 7 6 8 pr38969.c:20 (set (reg:SF 75 [ D.2870 ])
        (reg/v:SF 73 [ x ])) -1 (nil))

(insn 8 7 9 pr38969.c:20 (set (reg:SF 76 [ D.2870+4 ])
        (reg/v:SF 74 [ x+4 ])) -1 (nil))


Since this part is missing, some later pass initializes uninitialized registers with zeros.

Looking into it.
Comment 3 Uroš Bizjak 2009-01-26 17:20:37 UTC
The following patch fixes this problem:

--cut here--
Index: calls.c
===================================================================
--- calls.c     (revision 143671)
+++ calls.c     (working copy)
@@ -992,7 +992,6 @@ initialize_argument_information (int num
            && targetm.calls.split_complex_arg (argtype))
          {
            tree subtype = TREE_TYPE (argtype);
-           arg = save_expr (arg);
            args[j].tree_value = build1 (REALPART_EXPR, subtype, arg);
            j += inc;
            args[j].tree_value = build1 (IMAGPART_EXPR, subtype, arg);
--cut here--

This testcase triggered sibcall_failure in the loop inside the expand_call function. We expanded the complex argument during sibcall sequence expansion and, due to sibcall_failure, threw the produced sequence away.

Since the complex argument was wrapped in SAVE_EXPR, we were not able to correctly expand the function argument during the second pass. The argument had already been expanded to a temporary, but the initialization of that temporary was discarded...

This worked in gcc-4.2 (it produces the same initial RTL sequence as unpatched gcc-4.3/gcc-4.4) since no later pass initializes uninitialized registers to zero. This functionality was introduced by the dataflow merge, and this bug was _exposed_ by the merge.
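
To make the mechanism easier to follow, here is a toy model in C of the two-pass expansion described above. It is not GCC code: every name in it (save_expr_model, insn_seq_model, expand_arg, final_init_insns) is hypothetical, and it only counts how many "initialize the temporary" insns end up in the sequence that is finally kept.

```c
#include <stdbool.h>

/* Toy stand-in for a SAVE_EXPR: it remembers that it has already
   been expanded to a temporary.  */
struct save_expr_model
{
  bool expanded;
};

/* Toy stand-in for an insn sequence: counts the insns that
   initialize the temporary.  */
struct insn_seq_model
{
  int init_insns;
};

/* Expand the argument into SEQ.  Only the first expansion emits the
   initialization; later expansions just reuse the temporary.  */
static void
expand_arg (struct save_expr_model *arg, struct insn_seq_model *seq)
{
  if (!arg->expanded)
    {
      seq->init_insns++;
      arg->expanded = true;
    }
}

/* Model of the two passes.  Returns the number of initialization
   insns present in the sequence that is actually kept.  */
int
final_init_insns (bool wrap_in_save_expr, bool sibcall_fails)
{
  struct save_expr_model arg = { false };
  struct insn_seq_model sibcall_seq = { 0 };
  struct insn_seq_model normal_seq = { 0 };

  /* Pass 1: try to build the sibcall sequence.  */
  expand_arg (&arg, &sibcall_seq);
  if (!sibcall_fails)
    return sibcall_seq.init_insns;

  /* Sibcall failed: sibcall_seq is thrown away.  A plain (unwrapped)
     expression carries no "already expanded" state, so it is
     expanded from scratch in the second pass.  */
  if (!wrap_in_save_expr)
    arg.expanded = false;

  /* Pass 2: build the normal call sequence.  */
  expand_arg (&arg, &normal_seq);
  return normal_seq.init_insns;
}
```

In this model, final_init_insns (true, true) yields 0, which corresponds to the missing insns 7 and 8, while final_init_insns (false, true) yields 1, which corresponds to the behaviour with the patch applied.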

Regression test is in progress...
Comment 4 Uroš Bizjak 2009-01-26 17:22:36 UTC
This is a generic RTL optimization problem.
Comment 5 uros 2009-01-27 10:19:06 UTC
Subject: Bug 38969

Author: uros
Date: Tue Jan 27 10:18:54 2009
New Revision: 143699

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=143699
Log:
	PR middle-end/38969
	* calls.c (initialize_argument_information): Do not wrap complex
	arguments in SAVE_EXPR.

testsuite/ChangeLog:

	PR middle-end/38969
	* gcc.c-torture/execute/pr38969.c: New test.


Added:
    trunk/gcc/testsuite/gcc.c-torture/execute/pr38969.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/calls.c
    trunk/gcc/testsuite/ChangeLog

Comment 6 Uroš Bizjak 2009-01-27 10:27:06 UTC
Fixed in the trunk.
Comment 7 Aurelien Jarno 2009-01-27 20:00:32 UTC
Thanks a lot. I confirm it also fixes the original problem, that is, the failures in the glibc testsuite (test-float and test-ifloat failing on most of the complex number functions).
Comment 8 uros 2009-01-29 10:05:32 UTC
Subject: Bug 38969

Author: uros
Date: Thu Jan 29 10:05:17 2009
New Revision: 143752

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=143752
Log:
	Backport from mainline:
	2009-01-28  Uros Bizjak  <ubizjak@gmail.com>

	PR target/38988
	* gcc.target/i386/pr38988.c: New test.

	2009-01-27  Uros Bizjak  <ubizjak@gmail.com>

	PR middle-end/38969
	* gcc.c-torture/execute/pr38969.c: New test.

testsuite/ChangeLog:

	Backport from mainline:
	2009-01-28  Uros Bizjak  <ubizjak@gmail.com>

	PR target/38988
	* config/i386/i386.md (set_rip_rex64): Wrap operand 1 in label_ref.
	(set_got_offset_rex64): Ditto.

	2009-01-27  Uros Bizjak  <ubizjak@gmail.com>

	PR middle-end/38969
	* calls.c (initialize_argument_information): Do not wrap complex
	arguments in SAVE_EXPR.


Added:
    branches/gcc-4_3-branch/gcc/testsuite/gcc.c-torture/execute/pr38969.c
      - copied unchanged from r143699, trunk/gcc/testsuite/gcc.c-torture/execute/pr38969.c
    branches/gcc-4_3-branch/gcc/testsuite/gcc.target/i386/pr38988.c
      - copied unchanged from r143720, trunk/gcc/testsuite/gcc.target/i386/pr38988.c
Modified:
    branches/gcc-4_3-branch/gcc/ChangeLog
    branches/gcc-4_3-branch/gcc/calls.c
    branches/gcc-4_3-branch/gcc/config/i386/i386.md
    branches/gcc-4_3-branch/gcc/testsuite/ChangeLog

Comment 9 Uroš Bizjak 2009-01-29 10:10:42 UTC
Fixed.