[PATCH, i386]: Generate addr32 prefixed addresses

H.J. Lu hjl.tools@gmail.com
Mon Aug 8 17:14:00 GMT 2011


On Mon, Aug 8, 2011 at 8:16 AM, Uros Bizjak <ubizjak@gmail.com> wrote:
> Hello!
>
> The attached patch implements addr32 prefixed addresses for x86_64
> targets, where memory locations are accessed with 32-bit base and index
> registers in the form (zero_extend:DI (... SImode registers ...)).
> The optimization rarely (if at all) triggers on x86_64, but is very
> important on x32 (see [1]), where many LEAs get moved into addresses
> of the operators.
>
> Of some interest is the inability of reload to fix up its own generated
> moves for the offsettable memory operand constraint "o", as happens with
> TImode moves. See [2] for further analysis and [3] for the workaround.
>
> 2011-08-08  Uros Bizjak  <ubizjak@gmail.com>
>
>        PR target/49781
>        * config/i386/i386.c (ix86_decompose_address): Allow zero-extended
>        SImode addresses.
>        (ix86_print_operand_address): Handle zero-extended addresses.
>        (memory_address_length): Add length of addr32 prefix for
>        zero-extended addresses.
>        (ix86_secondary_reload): Handle moves to/from double-word general
>        registers from/to zero-extended addresses.
>        * config/i386/predicates.md (lea_address_operand): Reject
>        zero-extended operands.
>
> Patch was bootstrapped and regression tested on x86_64-pc-linux-gnu
> {,-m32}. Additionally, H.J. tested the patch on x32 target with GCC
> bootstrap/regression tests, build of glibc (+regression tests) and
> SPEC2000/2006.
>
> Patch was committed to mainline SVN.
>
> BTW: There is a strange optimization in the combine pass, where a
> zero-extended address is converted on the fly to:
>
> Trying 9 -> 10:
> Failed to match this instruction:
> (... (and:DI (subreg:DI (plus:SI (ashift:SI (reg/v:SI 63 [ i ])
>                    (const_int 2 [0x2]))
>                (subreg:SI (reg/v/f:DI 62 [ a ]) 0)) 0)
>        (const_int 4294967295 [0xffffffff]))
> ...)
>
> While it is easy to add a pattern recognizer for this RTX to
> ix86_decompose_address/ix86_legitimate_address_p, I would like to
> understand the purpose of the conversion better and eventually fix it
> in combine pass.
>
> [1] http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49781
> [2] http://gcc.gnu.org/ml/gcc/2011-08/msg00129.html
> [3] http://gcc.gnu.org/ml/gcc/2011-08/msg00157.html
>
> Uros.
>
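As an illustration of the combine question quoted above, here is a minimal
C sketch (not part of the patch) of why the (and:DI (subreg:DI (plus:SI ...) 0)
0xffffffff) form denotes the same address as (zero_extend:DI (plus:SI ...)).
The helper names ea_zero_extend and ea_and_mask, and the "junk" parameter
that models the unspecified upper half of a paradoxical subreg, are
assumptions made for this example only; they do not correspond to anything
in GCC.

```c
#include <stdint.h>

/* Hypothetical helper: the address an addr32-prefixed access would use,
   i.e. base + (index << shift) evaluated in 32 bits and then
   zero-extended: (zero_extend:DI (plus:SI (ashift:SI index shift) base)).  */
static uint64_t
ea_zero_extend (uint32_t base, uint32_t index, unsigned shift)
{
  uint32_t sum = base + (index << shift);	/* wraps modulo 2^32 */
  return (uint64_t) sum;			/* zero_extend:DI */
}

/* Hypothetical helper: the form combine tried to match.  The SImode sum
   is viewed as a DImode value whose upper 32 bits are arbitrary ("junk"),
   and the AND with 0xffffffff clears them again.  */
static uint64_t
ea_and_mask (uint64_t base, uint32_t index, unsigned shift, uint32_t junk)
{
  uint32_t low_base = (uint32_t) base;		/* (subreg:SI (reg:DI a) 0) */
  uint32_t sum = (index << shift) + low_base;	/* (plus:SI (ashift:SI ...) ...) */
  uint64_t wide = ((uint64_t) junk << 32) | sum;	/* (subreg:DI ... 0) */
  return wide & UINT64_C (0xffffffff);		/* (and:DI ... 0xffffffff) */
}
```

Under these assumptions both helpers return the same value for any inputs,
including the case where the 32-bit sum wraps around, which is why a
recognizer for the AND form in ix86_decompose_address would describe the
same memory location.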

I checked in this testcase.

Thanks.

-- 
H.J.
---
Index: gcc.target/i386/pr49781-1.c
===================================================================
--- gcc.target/i386/pr49781-1.c	(revision 0)
+++ gcc.target/i386/pr49781-1.c	(revision 0)
@@ -0,0 +1,26 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fpic" } */
+/* { dg-require-effective-target fpic } */
+
+static int heap[2*(256 +1+29)+1];
+static int heap_len;
+static int heap_max;
+void
+foo (int elems)
+{
+  int n, m;
+  int max_code = -1;
+  int node = elems;
+  heap_len = 0, heap_max = (2*(256 +1+29)+1);
+  for (n = 0; n < elems; n++)
+    heap[++heap_len] = max_code = n;
+  do {
+    n = heap[1];
+    heap[1] = heap[heap_len--];
+    m = heap[1];
+    heap[--heap_max] = n;
+    heap[--heap_max] = m;
+  } while (heap_len >= 2);
+}
+
+/* { dg-final { scan-assembler-not "lea\[lq\]?\[ \t\]\\((%|)r\[a-z0-9\]*" } } */
Index: ChangeLog
===================================================================
--- ChangeLog	(revision 177568)
+++ ChangeLog	(working copy)
@@ -1,3 +1,8 @@
+2011-08-08  H.J. Lu  <hongjiu.lu@intel.com>
+
+	PR target/49781
+	* gcc.target/i386/pr49781-1.c: New.
+
 2011-08-08  Jason Merrill  <jason@redhat.com>

 	* g++.dg/cpp0x/range-for20.C: Adjust to test 50020 as well.


