Bug 36701 - [4.4 Regression] unaligned access in gcc.c-torture/execute/complex-7.c
Summary: [4.4 Regression] unaligned access in gcc.c-torture/execute/complex-7.c
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: middle-end (show other bugs)
Version: 4.4.0
: P1 normal
Target Milestone: 4.4.0
Assignee: Not yet assigned to anyone
URL: http://gcc.gnu.org/ml/gcc-patches/200...
Keywords: wrong-code
Depends on:
Blocks:
 
Reported: 2008-07-02 17:07 UTC by H.J. Lu
Modified: 2008-08-13 18:01 UTC (History)
5 users (show)

See Also:
Host:
Target: ia64-unknown-linux-gnu
Build:
Known to work:
Known to fail:
Last reconfirmed:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description H.J. Lu 2008-07-02 17:07:14 UTC
Revision 137328 generates unaligned access in

gcc.c-torture/execute/complex-7.c

and revision 137304 is OK. Steve, will your patch

http://gcc.gnu.org/ml/gcc-patches/2008-06/msg00884.html

cause this? Thanks.
Comment 1 Steve Ellcey 2008-07-02 17:17:22 UTC
I don't see any reason why my patch would cause this problem.  Looking through my old test logs I see this test failing before my patch as well as after it.  The failures go back to version 126651 which is the oldest results file I have sitting around.
Comment 2 H.J. Lu 2008-07-04 00:42:41 UTC
Revision 137326:

http://gcc.gnu.org/ml/gcc-cvs/2008-07/msg00030.html

may be the cause. It may be related to

http://gcc.gnu.org/ml/gcc-patches/2008-07/msg00219.html
Comment 3 Daniel Jacobowitz 2008-07-04 03:14:35 UTC
Does the new function do anything on this test case?  What are the various incoming modes (nominal, promoted) and the actual RTX?
Comment 4 H.J. Lu 2008-07-07 17:50:29 UTC
Revision 137575 is the cause and revision 137575 doesn't help.
Comment 5 H.J. Lu 2008-07-07 19:37:30 UTC
A small testcase:

[hjl@gnu-14 good]$ cat ../complex-7.c 
volatile _Complex float f1 = 1.1f + 2.2if;
volatile _Complex float f2 = 3.3f + 4.4if;
volatile _Complex float f3 = 5.5f + 6.6if;
volatile _Complex float f4 = 7.7f + 8.8if;
volatile _Complex float f5 = 9.9f + 10.1if;

extern void abort (void);
extern void exit (int);

__attribute__((noinline)) void
check_float (int a, _Complex float a1, _Complex float a2,
	     _Complex float a3, _Complex float a4, _Complex float a5)
{
  if (a1 != f1 || a2 != f2 || a3 != f3 || a4 != f4 || a5 != f5)
    abort ();
}

int
main (void)
{
  check_float (0, f1, f2, f3, f4, f5);
  exit (0);
}
[hjl@gnu-14 good]$ 

diff -upr bad/complex-7.c.133r.expand good/complex-7.c.133r.expand
--- bad/complex-7.c.133r.expand	2008-07-07 12:33:14.000000000 -0700
+++ good/complex-7.c.133r.expand	2008-07-07 12:33:40.000000000 -0700
@@ -5,115 +5,115 @@
 ;; Generating RTL for tree basic block 2
 
 ;; D.1272 = REALPART_EXPR <a1>
-(insn 62 61 63 ../complex-7.c:14 (set (reg:DI 419)
+(insn 42 41 43 ../complex-7.c:14 (set (reg:DI 401)
         (reg/f:DI 335 virtual-stack-vars)) -1 (nil))
 
-(insn 63 62 64 ../complex-7.c:14 (set (reg/f:DI 420)
+(insn 43 42 44 ../complex-7.c:14 (set (reg/f:DI 402)
         (plus:DI (reg/f:DI 335 virtual-stack-vars)
-            (const_int 4 [0x4]))) -1 (nil))
+            (const_int 8 [0x8]))) -1 (nil))
 
-(insn 64 63 0 ../complex-7.c:14 (set (reg:SF 373 [ D.1272 ])
-        (mem/s/c:SF (reg/f:DI 420) [0 a1+0 S4 A32])) -1 (nil))
+(insn 44 43 0 ../complex-7.c:14 (set (reg:SF 373 [ D.1272 ])
+        (mem/s/c:SF (reg/f:DI 402) [0 a1+0 S4 A64])) -1 (nil))

The alignment of the first argument passed on stack is wrong.
Comment 6 H.J. Lu 2008-07-07 22:25:07 UTC
assign_parm_remove_parallels:

  if (GET_CODE (entry_parm) == PARALLEL && GET_MODE (entry_parm) != BLKmode)
    {
      rtx parmreg = gen_reg_rtx (GET_MODE (entry_parm));
      emit_group_store (parmreg, entry_parm, NULL_TREE,
                        GET_MODE_SIZE (GET_MODE (entry_parm)));
      entry_parm = parmreg;
    }      

is incorrect for ia64 HFA. You can't do

rtx parmreg = gen_reg_rtx (GET_MODE (entry_parm));

when entry_parm is HFA passed in GR:

(parallel:SC [
        (expr_list:REG_DEP_TRUE (reg:DI 117 in5 [ a5 ])
            (const_int 0 [0x0]))
    ])



Comment 7 Daniel Jacobowitz 2008-07-07 22:31:38 UTC
Subject: Re:  [4.4 Regression] unaligned access in
	gcc.c-torture/execute/complex-7.c

On Mon, Jul 07, 2008 at 10:25:08PM -0000, hjl dot tools at gmail dot com wrote:
> is incorrect for ia64 HFA. You can't do
> 
> rtx parmreg = gen_reg_rtx (GET_MODE (entry_parm));
> 
> when entry_parm is HFA passed in GR:

What does HFA mean?  Why can't you copy it into a reg:SC?  This shouldn't
change the incoming location of the argument; it's generating code at
the start of the function to retrieve the argument.

Comment 8 H.J. Lu 2008-07-08 01:19:40 UTC
HFA stands for homogeneous floating point aggregates as specified in
IA-64 Software Conventions and Runtime Architecture Guide. To pass
complex float, if we run out of available FP argument registers,
we will pass it in GP register.  For a simple testcase:

---
extern _Complex float f5;

int
check_float (int a, _Complex float a1, _Complex float a2,
	     _Complex float a3, _Complex float a4, _Complex float a5)
{
  return (a5 != f5);
}
---

(gdb) call debug_rtx (entry_parm)
(parallel:SC [
        (expr_list:REG_DEP_TRUE (reg:DI 117 in5 [ a5 ])
            (const_int 0 [0x0]))
    ])

Here DI is 64bit with 8byte alignment.

rtx parmreg = gen_reg_rtx (GET_MODE (entry_parm));

turns it into

(concat:SC (reg:SF 380)
    (reg:SF 381))

It has 4 byte alignment.
Comment 9 H.J. Lu 2008-07-08 19:55:15 UTC
When emit_group_store allocates stack temp to copy from register
to stack, it should use the mode with the largest alignment. Does
this patch make sense?

--- expr.c.align	2008-07-02 11:56:29.000000000 -0700
+++ expr.c	2008-07-08 12:48:40.000000000 -0700
@@ -2074,10 +2074,28 @@ emit_group_store (rtx orig_dst, rtx src,
 	  else
 	    {
 	      gcc_assert (bytepos == 0 && XVECLEN (src, 0));
-	      dest = assign_stack_temp (GET_MODE (dest),
-				        GET_MODE_SIZE (GET_MODE (dest)), 0);
-	      emit_move_insn (adjust_address (dest, GET_MODE (tmps[i]), bytepos),
-			      tmps[i]);
+	      enum machine_mode dest_mode = GET_MODE (dest);
+	      enum machine_mode tmp_mode = GET_MODE (tmps[i]);
+
+	      if (GET_MODE_ALIGNMENT (dest_mode)
+		  >= GET_MODE_ALIGNMENT (tmp_mode))
+		{
+		  dest = assign_stack_temp (dest_mode,
+					    GET_MODE_SIZE (dest_mode),
+					    0);
+		  emit_move_insn (adjust_address (dest,
+						  tmp_mode,
+						  bytepos),
+				  tmps[i]);
+		}
+	      else
+		{
+		  dest = assign_stack_temp (tmp_mode,
+					    GET_MODE_SIZE (dest_mode),
+					    0);
+		  emit_move_insn (dest, tmps[i]);
+		  dest = adjust_address (dest, dest_mode, bytepos);
+		}
 	      dst = dest;
 	      break;
 	    }
Comment 10 Daniel Jacobowitz 2008-07-08 20:26:19 UTC
Subject: Re:  [4.4 Regression] unaligned access in
	gcc.c-torture/execute/complex-7.c

It sounds plausible but I suggest you ask on gcc-patches, I can't
review that.

Comment 11 H.J. Lu 2008-07-08 22:04:56 UTC
A patch is posted at

http://gcc.gnu.org/ml/gcc-patches/2008-07/msg00671.html
Comment 12 H.J. Lu 2008-07-27 19:11:09 UTC
Jim, can you take a look at this patch?
Comment 13 hjl@gcc.gnu.org 2008-08-13 16:22:02 UTC
Subject: Bug 36701

Author: hjl
Date: Wed Aug 13 16:20:42 2008
New Revision: 139062

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=139062
Log:
2008-08-13  H.J. Lu  <hongjiu.lu@intel.com>

	PR middle-end/36701
	* expr.c (emit_group_store): Allocate stack temp with the
	largest alignment when copying from register to stack.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/expr.c

Comment 14 H.J. Lu 2008-08-13 18:01:42 UTC
Fixed.