Bug 52352 - [x32] - Wrong code to access addresses 0x80000000 to 0xFFFFFFFF using registers
Summary: [x32] - Wrong code to access addresses 0x80000000 to 0xFFFFFFFF using registers
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: target
Version: 4.6.3
Importance: P3 normal
Target Milestone: 4.7.0
Assignee: Not yet assigned to anyone
URL:
Keywords: wrong-code
Depends on:
Blocks:
 
Reported: 2012-02-23 11:36 UTC by Steffen Schmidt
Modified: 2012-02-27 23:05 UTC
CC List: 2 users

See Also:
Host:
Target: x86_64-*-linux-gnu
Build:
Known to work:
Known to fail: 4.6.3
Last reconfirmed: 2012-02-24 00:00:00


Attachments
C code resulting in wrong instructions when compiled with -mx32 and -O2 or higher (177 bytes, application/octet-stream)
2012-02-23 11:36 UTC, Steffen Schmidt
Generated -mx32 -O3 assembler (309 bytes, text/plain)
2012-02-23 11:38 UTC, Steffen Schmidt
Generated -mx32 -O1 assembler (284 bytes, text/plain)
2012-02-23 11:39 UTC, Steffen Schmidt
Generated -m64 -O3 assembler (301 bytes, text/plain)
2012-02-23 11:40 UTC, Steffen Schmidt
Generated -m64 -O1 assembler (278 bytes, text/plain)
2012-02-23 11:40 UTC, Steffen Schmidt

Description Steffen Schmidt 2012-02-23 11:36:58 UTC
Created attachment 26727 [details]
C code resulting in wrong instructions when compiled with -mx32 and -O2 or higher

The example C code defines a structure in memory at address 0xFEE00000.
Compiling this example with -mx32 and -O2 or higher (the problem seems to be related to -fgcse) results in faulty assembler instructions.
 
To access several members of the structure, the generated code loads the signed address of one member (-18874360 = 0xFEE00008) into register %rax. This actually yields a negative 64-bit address of 0xFFFFFFFFFEE00008, which is incorrect when the full %rax register is later used for memory access.

// -mx32 -O3
movq	$-18874360, %rax
movl	(%rax), %edx
xorb	%al, %al
movl	%edx, 4(%eax)

When optimizing with -O1, register %eax is used instead of %rax, which results in correct behaviour:

// -mx32 -O1
movl	$-18874368, %eax
movl	8(%eax), %edx
movl	%edx, 4(%eax)


The x64 compiler produces correct code with both -O1 and -O3, loading the address into %eax rather than %rax:

// -m64 -O1
movl	$4276092928, %eax
movl	8(%rax), %edx
movl	%edx, 4(%rax)

// -m64 -O3
movl	$4276092936, %eax
movl	(%rax), %edx
xorb	%al, %al
movl	%edx, 4(%rax)
Comment 1 Steffen Schmidt 2012-02-23 11:38:50 UTC
Created attachment 26728 [details]
Generated -mx32 -O3 assembler
Comment 2 Steffen Schmidt 2012-02-23 11:39:26 UTC
Created attachment 26729 [details]
Generated -mx32 -O1 assembler
Comment 3 Steffen Schmidt 2012-02-23 11:40:01 UTC
Created attachment 26730 [details]
Generated -m64 -O3 assembler
Comment 4 Steffen Schmidt 2012-02-23 11:40:41 UTC
Created attachment 26731 [details]
Generated -m64 -O1 assembler
Comment 5 Uroš Bizjak 2012-02-23 16:43:16 UTC
This works OK with gcc-4.7:

// -mx32 -O3
movabsl 4276092936, %eax
movabsl %eax, 4276092932
ret

H.J. probably needs to backport a patch or two from mainline.
Comment 6 Uroš Bizjak 2012-02-23 16:47:00 UTC
(In reply to comment #5)

> H.J. probably needs to backport a patch or two from mainline.

BTW: Please report problems with non-FSF branches directly to their respective authors. There is no -mx32 switch in the official 4.6.x releases, so please try to reproduce x32-specific bugs with gcc-4.7.x before reporting them in bugzilla.
Comment 7 Andrew Pinski 2012-02-23 17:27:29 UTC
Since 4.6.x did not have -mx32, I am just going to close this as a duplicate of the one reported against 4.7. The branch you are using is not an officially supported branch, and you should report bugs to HJL directly.

*** This bug has been marked as a duplicate of bug 52146 ***
Comment 8 H.J. Lu 2012-02-24 04:46:58 UTC
Working on a fix.
Comment 9 H.J. Lu 2012-02-24 04:53:04 UTC
It is fixed on hjl/x32/addr32, hjl/x32/gcc-4_6-branch and
hjl/x32/gcc-4_6-branch+mx32 branches.

The problem is

;; Stores and loads of ax to arbitrary constant address.
;; We fake an second form of instruction to force reload to load address
;; into register when rax is not available
(define_insn "*movabs<mode>_1"
  [(set (mem:SWI1248x (match_operand:DI 0 "x86_64_movabs_operand" "i,r"))
        (match_operand:SWI1248x 1 "nonmemory_operand" "a,er"))]
  "TARGET_64BIT && ix86_check_movabs (insn, 0)"
  "@
   movabs{<imodesuffix>}\t{%1, %P0|%P0, %1}
   mov{<imodesuffix>}\t{%1, %a0|%a0, %1}"

DImode is incorrect for x32, especially for the register operand.
That is where

movq    $-18874360, %rax
movl    (%rax), %edx

comes from. It should be in Pmode. However, the immediate operand
must be in DImode to satisfy x86_64_movabs_operand. Changing this condition
is very intrusive. On the other hand, the second form is redundant,
and I opened PR 52364 for it. My fix simply removes the second form:

http://gcc.gnu.org/git/?p=gcc.git;a=commit;h=eb7964e69f1d7d478ae99ae6eff080f15af2b074
Comment 10 Uroš Bizjak 2012-02-24 18:32:56 UTC
(In reply to comment #9)
> It is fixed on hjl/x32/addr32, hjl/x32/gcc-4_6-branch and
> hjl/x32/gcc-4_6-branch+mx32 branches.
> 
> The problem is
> 
> ;; Stores and loads of ax to arbitrary constant address.
> ;; We fake an second form of instruction to force reload to load address
> ;; into register when rax is not available
> (define_insn "*movabs<mode>_1"
>   [(set (mem:SWI1248x (match_operand:DI 0 "x86_64_movabs_operand" "i,r"))
>         (match_operand:SWI1248x 1 "nonmemory_operand" "a,er"))]
>   "TARGET_64BIT && ix86_check_movabs (insn, 0)"
>   "@
>    movabs{<imodesuffix>}\t{%1, %P0|%P0, %1}
>    mov{<imodesuffix>}\t{%1, %a0|%a0, %1}"
> 
> DImode is incorrect for x32, especially for register operand.
> That is where
> 
> movq    $-18874360, %rax
> movl    (%rax), %edx
> 
> comes from. It should be in Pmode. However, the immediate operand
> must be in DImode for x86_64_movabs_operand.

But it is the _address_ that we are talking about here; see the "MEM" RTX. So the following (untested) patch may help - the access is in PTR mode, and the "a" modifier should handle this without problems.

--cut here--
Index: config/i386/i386.md
===================================================================
--- config/i386/i386.md (revision 184560)
+++ config/i386/i386.md (working copy)
@@ -2360,7 +2360,7 @@
 ;; We fake an second form of instruction to force reload to load address
 ;; into register when rax is not available
 (define_insn "*movabs<mode>_1"
-  [(set (mem:SWI1248x (match_operand:DI 0 "x86_64_movabs_operand" "i,r"))
+  [(set (mem:SWI1248x (match_operand:PTR 0 "x86_64_movabs_operand" "i,r"))
        (match_operand:SWI1248x 1 "nonmemory_operand" "a,er"))]
   "TARGET_64BIT && ix86_check_movabs (insn, 0)"
   "@
@@ -2375,7 +2375,7 @@
 
 (define_insn "*movabs<mode>_2"
   [(set (match_operand:SWI1248x 0 "register_operand" "=a,r")
-        (mem:SWI1248x (match_operand:DI 1 "x86_64_movabs_operand" "i,r")))]
+        (mem:SWI1248x (match_operand:PTR 1 "x86_64_movabs_operand" "i,r")))]
   "TARGET_64BIT && ix86_check_movabs (insn, 1)"
   "@
    movabs{<imodesuffix>}\t{%P1, %0|%0, %P1}
--cut here--
Comment 11 H.J. Lu 2012-02-24 19:14:19 UTC
(In reply to comment #10)
>
> 
> But this is the _address_ that we are talking, see the "MEM" RTX. So, following
> (untested) patch can help - access is in PTR mode, and "a" modifier should
> handle this without problems.
> 
> --cut here--
> Index: config/i386/i386.md
> ===================================================================
> --- config/i386/i386.md (revision 184560)
> +++ config/i386/i386.md (working copy)
> @@ -2360,7 +2360,7 @@
>  ;; We fake an second form of instruction to force reload to load address
>  ;; into register when rax is not available
>  (define_insn "*movabs<mode>_1"
> -  [(set (mem:SWI1248x (match_operand:DI 0 "x86_64_movabs_operand" "i,r"))
> +  [(set (mem:SWI1248x (match_operand:PTR 0 "x86_64_movabs_operand" "i,r"))
>         (match_operand:SWI1248x 1 "nonmemory_operand" "a,er"))]
>    "TARGET_64BIT && ix86_check_movabs (insn, 0)"
>    "@
> @@ -2375,7 +2375,7 @@
> 
>  (define_insn "*movabs<mode>_2"
>    [(set (match_operand:SWI1248x 0 "register_operand" "=a,r")
> -        (mem:SWI1248x (match_operand:DI 1 "x86_64_movabs_operand" "i,r")))]
> +        (mem:SWI1248x (match_operand:PTR 1 "x86_64_movabs_operand" "i,r")))]
>    "TARGET_64BIT && ix86_check_movabs (insn, 1)"
>    "@
>     movabs{<imodesuffix>}\t{%P1, %0|%0, %P1}
> --cut here--

I checked a similar fix into hjl/x32/addr32, hjl/x32/gcc-4_6-branch and
hjl/x32/gcc-4_6-branch+mx32 branches. I also added "I" code to print
constant address as positive 32bit integer for x32:

http://gcc.gnu.org/git/?p=gcc.git;a=commitdiff;h=c6d9aee05cb3bfbe3c2a1b63f3f842e8d3fcb8e0

I used :P instead of :PTR; it will be removed when I submit patches
to use SImode for Pmode with x32.
Comment 12 Uroš Bizjak 2012-02-24 19:29:20 UTC
(In reply to comment #11)

> I checked a similar fix into hjl/x32/addr32, hjl/x32/gcc-4_6-branch and
> hjl/x32/gcc-4_6-branch+mx32 branches. I also added "I" code to print
> constant address as positive 32bit integer for x32:

I think we can simply disable these two patterns on x32.

This is with disabled patterns:

0000000000000000 <x32_O3_main>:
   0:   b8 00 00 e0 fe          mov    $0xfee00000,%eax
   5:   8b 00                   mov    (%rax),%eax
   7:   a8 01                   test   $0x1,%al
   9:   74 01                   je     c <x32_O3_main+0xc>
   b:   90                      nop
   c:   b8 00 00 e0 fe          mov    $0xfee00000,%eax
  11:   8b 50 08                mov    0x8(%rax),%edx
  14:   89 50 04                mov    %edx,0x4(%rax)
  17:   c3                      retq   

and with enabled patterns:

0000000000000000 <x32_O3_main>:
   0:   a1 00 00 e0 fe 00 00    movabs 0xfee00000,%eax
   7:   00 00 
   9:   a8 01                   test   $0x1,%al
   b:   74 01                   je     e <x32_O3_main+0xe>
   d:   90                      nop
   e:   a1 08 00 e0 fe 00 00    movabs 0xfee00008,%eax
  15:   00 00 
  17:   a3 04 00 e0 fe 00 00    movabs %eax,0xfee00004
  1e:   00 00 
  20:   c3                      retq   

There is simply no need for movabs from/to memory, since there are no 64-bit addresses. And the code size is horrible (and I bet that the former code runs faster).
Comment 13 H.J. Lu 2012-02-24 20:03:59 UTC
(In reply to comment #12)
> (In reply to comment #11)
> 
> > I checked a similar fix into hjl/x32/addr32, hjl/x32/gcc-4_6-branch and
> > hjl/x32/gcc-4_6-branch+mx32 branches. I also added "I" code to print
> > constant address as positive 32bit integer for x32:
> 
> I think we can simply disable these two patterns on x32.
> 
> This is with disabled patterns:
> 
> 0000000000000000 <x32_O3_main>:
>    0:   b8 00 00 e0 fe          mov    $0xfee00000,%eax
>    5:   8b 00                   mov    (%rax),%eax
>    7:   a8 01                   test   $0x1,%al
>    9:   74 01                   je     c <x32_O3_main+0xc>
>    b:   90                      nop
>    c:   b8 00 00 e0 fe          mov    $0xfee00000,%eax
>   11:   8b 50 08                mov    0x8(%rax),%edx
>   14:   89 50 04                mov    %edx,0x4(%rax)
>   17:   c3                      retq   
> 
> and with enabled patterns:
> 
> 0000000000000000 <x32_O3_main>:
>    0:   a1 00 00 e0 fe 00 00    movabs 0xfee00000,%eax
>    7:   00 00 
>    9:   a8 01                   test   $0x1,%al
>    b:   74 01                   je     e <x32_O3_main+0xe>
>    d:   90                      nop
>    e:   a1 08 00 e0 fe 00 00    movabs 0xfee00008,%eax
>   15:   00 00 
>   17:   a3 04 00 e0 fe 00 00    movabs %eax,0xfee00004
>   1e:   00 00 
>   20:   c3                      retq   
> 
> There is simply no need for movabs from/to mem, since there is no 64bit
> addresses. And code size is horrible (and I bet that the former code runs
> faster).

I think it is a good idea.
Comment 14 hjl@gcc.gnu.org 2012-02-27 16:48:36 UTC
Author: hjl
Date: Mon Feb 27 16:48:26 2012
New Revision: 184604

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=184604
Log:
Enable *movabs<mode>_[12] only for TARGET_LP64

2012-02-27  H.J. Lu  <hongjiu.lu@intel.com>

	PR target/52352
	* config/i386/i386.md (*movabs<mode>_1): Enable only for
	TARGET_LP64.
	(*movabs<mode>_2): Likewise.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/i386/i386.md
Comment 15 Uroš Bizjak 2012-02-27 23:05:12 UTC
Fixed for 4.7.