This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: PATCH:[darwin] fix load of a misaligned double word
- From: Alan Modra <amodra at bigpond dot net dot au>
- To: gcc-patches at gcc dot gnu dot org, Bradley Lucier <lucier at math dot purdue dot edu>
- Cc: Hartmut Penner <HPENNER at de dot ibm dot com>,Geoff Keating <geoffk at geoffk dot org>, dje at watson dot ibm dot com,fjahanian at apple dot com, pinskia at physics dot uc dot edu,Ulrich Weigand <Ulrich dot Weigand at de dot ibm dot com>
- Date: Sat, 21 Feb 2004 23:03:30 +1030
- Subject: Re: PATCH:[darwin] fix load of a misaligned double word
- References: <OFC33F797D.6A27BF16-ONC1256E3E.0055FF5D-C1256E3F.002BE267@de.ibm.com> <0CD2715B-62E7-11D8-ABEF-003065BA681E@math.purdue.edu>
On Thu, Feb 19, 2004 at 09:22:29AM -0500, Bradley Lucier wrote:
> The bug I hit still shows up on the 3.4 branch under certain
> optimization settings:
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13674
>
> It would be nice to get it fixed there before the release.
You may have already seen this patch, but it appears the mailing list
archives have dropped the two emails I sent. The following is a
consolidation of the two emails with a little more explanation.
-----
I've just been fixing a case that sounds very similar to this on
the hammer branch. The attached testcase (which doesn't trigger a
problem for me on gcc-3.4 and mainline) results in:
$ powerpc64-linux-gcc -O2 /src/tmp/Channel.ii
/src/tmp/Channel.ii: In function `void g(void*, void*)':
/src/tmp/Channel.ii:35: error: insn does not satisfy its constraints:
(insn:HI 35 833 43 0 0x400afc00 (set (mem/s:DI (plus:DI (reg:DI 10 r10)
(const_int 113 [0x71])) [16 <variable>._vptr.c1+0 S8 A64])
(reg:DI 8 r8 [123])) 324 {*movdi_internal64} (insn_list 34 (insn_list 27 (nil)))
(nil))
/src/tmp/Channel.ii:35: internal compiler error: in
reload_cse_simplify_operands, at reload1.c:8356
Channel.ii.27.lreg
(insn:HI 35 22 43 0 0x400afc00 (set (mem/s:DI (reg/v/f:DI 122 [ this ]) [16 <variable>._vptr.c1+0 S8 A64])
(reg:DI 123)) 324 {*movdi_internal64} (insn_list 34 (insn_list 27 (nil)))
(expr_list:REG_DEAD (reg:DI 123)
(nil)))
Channel.ii.28.greg
(insn 833 22 35 0 (nil) (set (reg:DI 10 r10)
(plus:DI (reg/f:DI 1 r1)
(const_int 0 [0x0]))) 233 {*adddi3_internal1} (nil)
(nil))
(insn:HI 35 833 43 0 0x400afc00 (set (mem/s:DI (plus:DI (reg:DI 10 r10)
(const_int 113 [0x71])) [16 <variable>._vptr.c1+0 S8 A64])
(reg:DI 8 r8 [123])) 324 {*movdi_internal64} (insn_list 34 (insn_list 27 (nil)))
(nil))
Notice how we have a useless (plus (reg) (0)) in the greg dump. This is
coming from the following part of rs6000_legitimize_reload_address:
/* Check for 32-bit overflow. */
if (high + low != val)
{
*win = 0;
return x;
}
/* Reload the high part into a base reg; leave the low part
in the mem directly. */
x = gen_rtx_PLUS (GET_MODE (x),
gen_rtx_PLUS (GET_MODE (x), XEXP (x, 0),
GEN_INT (high)),
GEN_INT (low));
push_reload (XEXP (x, 0), NULL_RTX, &XEXP (x, 0), NULL,
BASE_REG_CLASS, GET_MODE (x), VOIDmode, 0, 0,
opnum, (enum reload_type)type);
*win = 1;
return x;
The code here is supposed to be handling the case where we have an
offset that is not in the 16-bit signed range but within 32-bit signed.
In such cases it is more efficient to calculate the high part of the
offset in a register and leave the low part in the mem. The high part
might be useful for a number of different offsets. However, we are also
triggering this code for offsets that are invalid because they aren't a
multiple of four. 64 bit gpr loads and stores on ppc64 only allow
multiples of four for the offset.
Now, clearly it's wrong for legitimize_reload_address to produce useless
rtl. We ought to be testing for high == 0 just for this reason alone.
Worse, returning "win" stops reload from reloading the whole address
into a register. If we let reload do its job, we'll not have these
invalid non-multiple-of-four offsets in MEMs.
Of course, the offset can both exceed 16-bit signed range and also not
be a multiple of four. When that happens, we need to test for modes
that might be moved to/from 64 bit gprs. Testing the mode isn't ideal,
because we might be moving a DImode in a floating point reg, where the
insn forms don't have the multiple-of-four restriction. It is possible
to test for exactly the right condition, ie. that we're using a gpr,
but then we need to look at the whole insn. I did pursue this solution
which is relatively easy to achieve in LEGITIMIZE_RELOAD_ADDRESS.
However I abandoned the idea because GO_IF_LEGITIMATE_ADDRESS should be
testing the same condition for valid offsets (LEGITIMIZE_RELOAD_ADDRESS
is called when GO_IF_LEGITIMATE_ADDRESS doesn't "win"). It's a lot
more work to pass the insn down to GO_IF_LEGITIMATE_ADDRESS...
* config/rs6000/rs6000.c (rs6000_legitimize_reload_address): Don't
attempt to handle non multiple of 4 DImode or TImode.
OK mainline and 3.4? Bootstrapped etc. powerpc-linux and
powerpc64-linux.
Index: gcc/config/rs6000/rs6000.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/rs6000.c,v
retrieving revision 1.576.2.8
diff -u -p -r1.576.2.8 rs6000.c
--- gcc/config/rs6000/rs6000.c 12 Feb 2004 10:19:28 -0000 1.576.2.8
+++ gcc/config/rs6000/rs6000.c 21 Feb 2004 12:21:24 -0000
@@ -3079,8 +3068,16 @@ rs6000_legitimize_reload_address (rtx x,
HOST_WIDE_INT high
= (((val - low) & 0xffffffff) ^ 0x80000000) - 0x80000000;
- /* Check for 32-bit overflow. */
- if (high + low != val)
+ /* We get here for two reasons: a) The offset is too large,
+ or b) the offset is invalid, for example, not a multiple of
+ 4 on a DImode access. Leave case b, and when we get 32-bit
+ overflow, to the generic parts of reload to handle. */
+
+ if (high + low != val
+ || high == 0
+ || ((low & 3) != 0
+ && (mode == DImode || mode == TImode)
+ && TARGET_POWERPC64))
{
*win = 0;
return x;
--
Alan Modra
IBM OzLabs - Linux Technology Centre