Revision 137328 generates unaligned access in gcc.c-torture/execute/complex-7.c and revision 137304 is OK. Steve, will your patch http://gcc.gnu.org/ml/gcc-patches/2008-06/msg00884.html cause this? Thanks.
I don't see any reason why my patch would cause this problem. Looking through my old test logs I see this test failing before my patch as well as after it. The failures go back to version 126651 which is the oldest results file I have sitting around.
Revision 137326: http://gcc.gnu.org/ml/gcc-cvs/2008-07/msg00030.html may be the cause. It may be related to http://gcc.gnu.org/ml/gcc-patches/2008-07/msg00219.html
Does the new function do anything on this test case? What are the various incoming modes (nominal, promoted) and the actual RTX?
Revision 137575 is the cause and revision 137575 doesn't help.
A small testcase: [hjl@gnu-14 good]$ cat ../complex-7.c volatile _Complex float f1 = 1.1f + 2.2if; volatile _Complex float f2 = 3.3f + 4.4if; volatile _Complex float f3 = 5.5f + 6.6if; volatile _Complex float f4 = 7.7f + 8.8if; volatile _Complex float f5 = 9.9f + 10.1if; extern void abort (void); extern void exit (int); __attribute__((noinline)) void check_float (int a, _Complex float a1, _Complex float a2, _Complex float a3, _Complex float a4, _Complex float a5) { if (a1 != f1 || a2 != f2 || a3 != f3 || a4 != f4 || a5 != f5) abort (); } int main (void) { check_float (0, f1, f2, f3, f4, f5); exit (0); } [hjl@gnu-14 good]$ diff -upr bad/complex-7.c.133r.expand good/complex-7.c.133r.expand --- bad/complex-7.c.133r.expand 2008-07-07 12:33:14.000000000 -0700 +++ good/complex-7.c.133r.expand 2008-07-07 12:33:40.000000000 -0700 @@ -5,115 +5,115 @@ ;; Generating RTL for tree basic block 2 ;; D.1272 = REALPART_EXPR <a1> -(insn 62 61 63 ../complex-7.c:14 (set (reg:DI 419) +(insn 42 41 43 ../complex-7.c:14 (set (reg:DI 401) (reg/f:DI 335 virtual-stack-vars)) -1 (nil)) -(insn 63 62 64 ../complex-7.c:14 (set (reg/f:DI 420) +(insn 43 42 44 ../complex-7.c:14 (set (reg/f:DI 402) (plus:DI (reg/f:DI 335 virtual-stack-vars) - (const_int 4 [0x4]))) -1 (nil)) + (const_int 8 [0x8]))) -1 (nil)) -(insn 64 63 0 ../complex-7.c:14 (set (reg:SF 373 [ D.1272 ]) - (mem/s/c:SF (reg/f:DI 420) [0 a1+0 S4 A32])) -1 (nil)) +(insn 44 43 0 ../complex-7.c:14 (set (reg:SF 373 [ D.1272 ]) + (mem/s/c:SF (reg/f:DI 402) [0 a1+0 S4 A64])) -1 (nil)) The alignment of the first argument passed on stack is wrong.
assign_parm_remove_parallels: if (GET_CODE (entry_parm) == PARALLEL && GET_MODE (entry_parm) != BLKmode) { rtx parmreg = gen_reg_rtx (GET_MODE (entry_parm)); emit_group_store (parmreg, entry_parm, NULL_TREE, GET_MODE_SIZE (GET_MODE (entry_parm))); entry_parm = parmreg; } is incorrect for ia64 HFA. You can't do rtx parmreg = gen_reg_rtx (GET_MODE (entry_parm)); when entry_parm is HFA passed in GR: (parallel:SC [ (expr_list:REG_DEP_TRUE (reg:DI 117 in5 [ a5 ]) (const_int 0 [0x0])) ])
Subject: Re: [4.4 Regression] unaligned access in gcc.c-torture/execute/complex-7.c On Mon, Jul 07, 2008 at 10:25:08PM -0000, hjl dot tools at gmail dot com wrote: > is incorrect for ia64 HFA. You can't do > > rtx parmreg = gen_reg_rtx (GET_MODE (entry_parm)); > > when entry_parm is HFA passed in GR: What does HFA mean? Why can't you copy it into a reg:SC? This shouldn't change the incoming location of the argument; it's generating code at the start of the function to retrieve the argument.
HFA stands for homogeneous floating point aggregates as specified in IA-64 Software Conventions and Runtime Architecture Guide. To pass complex float, if we run out of available FP argument registers, we will pass it in GP register. For a simple testcase: --- extern _Complex float f5; int check_float (int a, _Complex float a1, _Complex float a2, _Complex float a3, _Complex float a4, _Complex float a5) { return (a5 != f5); } --- (gdb) call debug_rtx (entry_parm) (parallel:SC [ (expr_list:REG_DEP_TRUE (reg:DI 117 in5 [ a5 ]) (const_int 0 [0x0])) ]) Here DI is 64bit with 8byte alignment. rtx parmreg = gen_reg_rtx (GET_MODE (entry_parm)); turns it into (concat:SC (reg:SF 380) (reg:SF 381)) It has 4 byte alignment.
When emit_group_store allocates stack temp to copy from register to stack, it should use the mode with the largest alignment. Does this patch make sense? --- expr.c.align 2008-07-02 11:56:29.000000000 -0700 +++ expr.c 2008-07-08 12:48:40.000000000 -0700 @@ -2074,10 +2074,28 @@ emit_group_store (rtx orig_dst, rtx src, else { gcc_assert (bytepos == 0 && XVECLEN (src, 0)); - dest = assign_stack_temp (GET_MODE (dest), - GET_MODE_SIZE (GET_MODE (dest)), 0); - emit_move_insn (adjust_address (dest, GET_MODE (tmps[i]), bytepos), - tmps[i]); + enum machine_mode dest_mode = GET_MODE (dest); + enum machine_mode tmp_mode = GET_MODE (tmps[i]); + + if (GET_MODE_ALIGNMENT (dest_mode) + >= GET_MODE_ALIGNMENT (tmp_mode)) + { + dest = assign_stack_temp (dest_mode, + GET_MODE_SIZE (dest_mode), + 0); + emit_move_insn (adjust_address (dest, + tmp_mode, + bytepos), + tmps[i]); + } + else + { + dest = assign_stack_temp (tmp_mode, + GET_MODE_SIZE (dest_mode), + 0); + emit_move_insn (dest, tmps[i]); + dest = adjust_address (dest, dest_mode, bytepos); + } dst = dest; break; }
Subject: Re: [4.4 Regression] unaligned access in gcc.c-torture/execute/complex-7.c It sounds plausible but I suggest you ask on gcc-patches, I can't review that.
A patch is posted at http://gcc.gnu.org/ml/gcc-patches/2008-07/msg00671.html
Jim, can you take a look at this patch?
Subject: Bug 36701 Author: hjl Date: Wed Aug 13 16:20:42 2008 New Revision: 139062 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=139062 Log: 2008-08-13 H.J. Lu <hongjiu.lu@intel.com> PR middle-end/36701 * expr.c (emit_group_store): Allocate stack temp with the largest alignment when copying from register to stack. Modified: trunk/gcc/ChangeLog trunk/gcc/expr.c
Fixed.