[PATCH, i386]: Generate addr32 prefixed addresses
H.J. Lu
hjl.tools@gmail.com
Mon Aug 8 17:14:00 GMT 2011
On Mon, Aug 8, 2011 at 8:16 AM, Uros Bizjak <ubizjak@gmail.com> wrote:
> Hello!
>
> The attached patch implements addr32-prefixed addresses for x86_64
> targets, where memory locations are accessed with 32-bit base and index
> registers in the form (zero_extend:DI (... SImode registers ...)).
> The optimization rarely (if at all) triggers on x86_64, but is very
> important on x32 (see [1]), where many LEAs get moved into addresses
> of the operators.
>
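To make the address form above concrete, here is a minimal C sketch of how an addr32-prefixed effective address differs from the default 64-bit one. The function names (ea_addr32, ea_addr64, check_ea) are hypothetical and only model the arithmetic; they are not part of the patch or of GCC.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of an addr32-prefixed effective address in 64-bit
   mode: the sum is formed in 32 bits (wrapping modulo 2^32) and then
   zero-extended, i.e. (zero_extend:DI (plus:SI ...)).  */
uint64_t
ea_addr32 (uint32_t base, uint32_t index, uint32_t scale, int32_t disp)
{
  uint32_t sum = base + index * scale + (uint32_t) disp;
  return (uint64_t) sum;
}

/* The same operands with the default 64-bit address size: the sum does
   not wrap at 2^32.  */
uint64_t
ea_addr64 (uint64_t base, uint64_t index, uint64_t scale, int32_t disp)
{
  return base + index * scale + (uint64_t) (int64_t) disp;
}

/* Self-check: both forms agree until the 32-bit sum overflows.  */
void
check_ea (void)
{
  assert (ea_addr32 (0x1000u, 3, 4, 8) == ea_addr64 (0x1000u, 3, 4, 8));
  assert (ea_addr32 (0xfffffff0u, 8, 4, 0) == 0x10u);
  assert (ea_addr64 (0xfffffff0u, 8, 4, 0) == 0x100000010ull);
}
```

The second and third assertions show why the two forms are not interchangeable: with 32-bit pointers the wrapped result is the intended address, so the zero-extension has to be part of the address itself.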
> Of some interest is the inability of reload to fix up its own generated
> moves for the offsettable memory operand constraint "o", as happens with
> TImode moves. See [2] for further analysis and [3] for the workaround.
>
> 2011-08-08 Uros Bizjak <ubizjak@gmail.com>
>
> PR target/49781
> * config/i386/i386.c (ix86_decompose_address): Allow zero-extended
> SImode addresses.
> (ix86_print_operand_address): Handle zero-extended addresses.
> (memory_address_length): Add length of addr32 prefix for
> zero-extended addresses.
> (ix86_secondary_reload): Handle moves to/from double-word general
> registers from/to zero-extended addresses.
> * config/i386/predicates.md (lea_address_operand): Reject
> zero-extended operands.
>
> The patch was bootstrapped and regression tested on x86_64-pc-linux-gnu
> {,-m32}. Additionally, H.J. tested the patch on the x32 target with GCC
> bootstrap/regression tests, a build of glibc (+regression tests) and
> SPEC2000/2006.
>
> The patch was committed to mainline SVN.
>
> BTW: There is a strange optimization in the combine pass, where a
> zero-extended address is converted on the fly to:
>
> Trying 9 -> 10:
> Failed to match this instruction:
> (... (and:DI (subreg:DI (plus:SI (ashift:SI (reg/v:SI 63 [ i ])
> (const_int 2 [0x2]))
> (subreg:SI (reg/v/f:DI 62 [ a ]) 0)) 0)
> (const_int 4294967295 [0xffffffff]))
> ...)
>
> While it is easy to add a pattern recognizer for this RTX to
> ix86_decompose_address/ix86_legitimate_address_p, I would like to
> understand the purpose of the conversion better and eventually fix it
> in the combine pass.
>
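As a rough illustration of why the combine output above denotes the same value as a zero-extended SImode address, the following C sketch models both forms. All names here (addr_si, zext_form, and_mask_form, check_forms) are hypothetical, and the "wide" argument only stands in for a paradoxical subreg:DI whose upper 32 bits are unspecified.

```c
#include <assert.h>
#include <stdint.h>

/* (plus:SI (ashift:SI i 2) (subreg:SI a 0)): 32-bit address arithmetic.  */
uint32_t
addr_si (uint32_t i, uint64_t a)
{
  return (i << 2) + (uint32_t) a;
}

/* (zero_extend:DI (x:SI)).  */
uint64_t
zext_form (uint32_t x)
{
  return (uint64_t) x;
}

/* (and:DI (subreg:DI (x:SI) 0) (const_int 0xffffffff)).  WIDE carries x
   in its low 32 bits; the upper bits may hold anything.  */
uint64_t
and_mask_form (uint64_t wide)
{
  return wide & 0xffffffffull;
}

/* Self-check: masking discards the unspecified upper bits, so both
   forms yield the same 64-bit address.  */
void
check_forms (void)
{
  uint32_t x = addr_si (5, 0x123456789abcdef0ull);
  uint64_t junk = 0xdeadbeef00000000ull;

  assert (and_mask_form (junk | x) == zext_form (x));
}
```

This only shows that the two expressions agree in value; it does not explain why combine prefers the AND form.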
> [1] http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49781
> [2] http://gcc.gnu.org/ml/gcc/2011-08/msg00129.html
> [3] http://gcc.gnu.org/ml/gcc/2011-08/msg00157.html
>
> Uros.
>
I checked in this testcase.
Thanks.
--
H.J.
---
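For readers unfamiliar with the scan-assembler-not pattern used in the testcase, here is a small C sketch of what it rejects, written against POSIX regcomp/regexec rather than the Tcl regexps DejaGnu actually uses. The helper name matches_lea_r is hypothetical, and the Tcl alternation (%|) is written as %? here on the assumption that the two are equivalent.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Return 1 if LINE contains an lea whose first parenthesized register
   name starts with 'r', 0 if it does not, and -1 if the pattern fails
   to compile.  */
int
matches_lea_r (const char *line)
{
  regex_t re;
  int rc;

  if (regcomp (&re, "lea[lq]?[ \t]\\(%?r[a-z0-9]*",
               REG_EXTENDED | REG_NOSUB) != 0)
    return -1;
  rc = regexec (&re, line, 0, NULL, 0);
  regfree (&re);
  return rc == 0;
}

/* Self-check on a few hand-written assembly lines.  */
void
check_pattern (void)
{
  assert (matches_lea_r ("\tleal\t(%rdx,%rax,4), %eax") == 1);
  assert (matches_lea_r ("\tleal\t(%edx,%eax,4), %eax") == 0);
  assert (matches_lea_r ("\tmovl\t(%rdx,%rax,4), %eax") == 0);
}
```

The test therefore fails if the compiler output still contains an lea with an r-prefixed base register directly after the opening parenthesis.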
Index: gcc.target/i386/pr49781-1.c
===================================================================
--- gcc.target/i386/pr49781-1.c (revision 0)
+++ gcc.target/i386/pr49781-1.c (revision 0)
@@ -0,0 +1,26 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fpic" } */
+/* { dg-require-effective-target fpic } */
+
+static int heap[2*(256 +1+29)+1];
+static int heap_len;
+static int heap_max;
+void
+foo (int elems)
+{
+ int n, m;
+ int max_code = -1;
+ int node = elems;
+ heap_len = 0, heap_max = (2*(256 +1+29)+1);
+ for (n = 0; n < elems; n++)
+ heap[++heap_len] = max_code = n;
+ do {
+ n = heap[1];
+ heap[1] = heap[heap_len--];
+ m = heap[1];
+ heap[--heap_max] = n;
+ heap[--heap_max] = m;
+ } while (heap_len >= 2);
+}
+
+/* { dg-final { scan-assembler-not "lea\[lq\]?\[ \t\]\\((%|)r\[a-z0-9\]*" } } */
Index: ChangeLog
===================================================================
--- ChangeLog (revision 177568)
+++ ChangeLog (working copy)
@@ -1,3 +1,8 @@
+2011-08-08 H.J. Lu <hongjiu.lu@intel.com>
+
+ PR target/49781
+ * gcc.target/i386/pr49781-1.c: New.
+
2011-08-08 Jason Merrill <jason@redhat.com>
* g++.dg/cpp0x/range-for20.C: Adjust to test 50020 as well.