Bug 64037

Summary: [4.8/4.9/5 Regression] Miscompilation with -Os and enum class : char parameter
Product: gcc Reporter: Julian Stecklina <js>
Component: rtl-optimizationAssignee: Not yet assigned to anyone <unassigned>
Status: RESOLVED FIXED    
Severity: normal CC: areg.melikadamyan, ebotcazou, hjl.tools, ubizjak
Priority: P2 Keywords: ABI, wrong-code
Version: 4.8.4   
Target Milestone: 4.8.5   
Host: Target: x86_64-*-*, i?86-*-*
Build: Known to work:
Known to fail: 4.8.2, 4.9.2 Last reconfirmed: 2014-11-24 00:00:00
Attachments: Minimal test case
gcc --verbose --version for 4.9.1
gcc --verbose --version for 4.9.2
Miscompiled binary (built with 4.9.2)

Description Julian Stecklina 2014-11-23 23:07:04 UTC
Created attachment 34082 [details]
Minimal test case

On Linux with the x86-64 target, code that contains a byte-sized enum class as a function parameter is miscompiled when -Os and -flto is enabled. A minimal example is attached. I also attached g++ --verbose --version output for both compilers where this triggers (4.9.1 and 4.9.2).

To reproduce the problem compile the attached source with:
g++ -std=gnu++11 -Os -flto test.cc -o test

Run ./test and you get this output: deadbe02
Expected output: 00000002

The problem boils down to the fact that the third parameter to foo (see below) is passed in DL (the rest of RDX is not cleared). In the function, no code is emitted to clear the rest of RDX, but it is obviously assumed to be zero.

enum class X : unsigned char {
  V = 2,
};

// noinline and noclone are only there to reduce the example. This
// also triggers without these annotations in larger projects.
void foo(unsigned &out, unsigned a, X b) __attribute__((noinline,noclone));

void foo(unsigned &out, unsigned a, X b)
{
  out = static_cast<unsigned>(b);
}

int main()
{
  // Load obvious value into EDX
  unsigned deadbeef = 0xDEADBEEF;
  asm volatile ("" : "+d" (deadbeef));

  unsigned out;
  foo(out, 2, X::V);
...

0000000000400500 <main>:
  400500:	48 83 ec 18          	sub    rsp,0x18
  400504:	ba ef be ad de       	mov    edx,0xdeadbeef  <- EDX is loaded with some garbage
  400509:	48 8d 7c 24 0c       	lea    rdi,[rsp+0xc]
  40050e:	b2 02                	mov    dl,0x2          <- Parameter is put in DL, without clearing RDX first
  400510:	be 02 00 00 00       	mov    esi,0x2
  400515:	e8 0c 01 00 00       	call   400626 <foo(unsigned int&, unsigned int, X)>


0000000000400626 <foo(unsigned int&, unsigned int, X)>:
  400626:	89 17                	mov    DWORD PTR [rdi],edx  <- EDX still contains remnants of 0xDEADBEEF here!
  400628:	c3                   	ret
Comment 1 Julian Stecklina 2014-11-23 23:08:10 UTC
Created attachment 34083 [details]
gcc --verbose --version for 4.9.1
Comment 2 Julian Stecklina 2014-11-23 23:08:35 UTC
Created attachment 34084 [details]
gcc --verbose --version for 4.9.2
Comment 3 Julian Stecklina 2014-11-23 23:09:38 UTC
Created attachment 34085 [details]
Miscompiled binary (built with 4.9.2)
Comment 4 Andrew Pinski 2014-11-23 23:15:41 UTC
This sounds like the target is expanding the two sides of the function call foo inconsistent with each other.
Comment 5 Richard Biener 2014-11-24 11:13:24 UTC
Same issue without -flto but just -Os -fwhole-program.  Can also reproduced
with just -Os when making 'foo' static.

Confirmed.
Comment 6 H.J. Lu 2014-11-24 17:48:24 UTC
It also happens with -m32.
Comment 7 Andrew Pinski 2014-11-24 17:49:42 UTC
(In reply to H.J. Lu from comment #6)
> It also happens with -m32.
Comment 8 H.J. Lu 2014-11-24 19:25:20 UTC
On x86, we don't require promoting QI/HI argument to SI.
Depending on optimization choice, we may generate SI move
for QI argument:

	movl	$2, %ecx	# 9	*movqi_internal/2	[length = 5]

For -Os, we generate:

	movb	$2, %cl	# 9	*movqi_internal/2	[length = 2]

since movb is shorter.  It is a pure luck that it only fails with -Os.
The combine pass turns

(insn 2 7 4 2 (set (reg/v/f:SI 88 [ out ])
        (reg:SI 0 ax [ out ])) ../pr64037.ii:8 90 {*movsi_internal}
     (expr_list:REG_DEAD (reg:SI 0 ax [ out ])
        (nil)))
(insn 4 2 6 2 (set (reg:SI 91 [ b ])
        (reg:SI 2 cx [ b ])) ../pr64037.ii:8 90 {*movsi_internal}
     (expr_list:REG_DEAD (reg:SI 2 cx [ b ])
        (nil)))
(note 6 4 9 2 NOTE_INSN_FUNCTION_BEG)
(insn 9 6 10 2 (set (reg:SI 92 [ b ])
        (zero_extend:SI (subreg:QI (reg:SI 91 [ b ]) 0))) ../pr64037.ii:9 138 {*
zero_extendqisi2}
     (expr_list:REG_DEAD (reg:SI 91 [ b ])
        (nil)))
(insn 10 9 0 2 (set (mem:SI (reg/v/f:SI 88 [ out ]) [1 *out_4(D)+0 S4 A32])
        (reg:SI 92 [ b ])) ../pr64037.ii:9 90 {*movsi_internal}
     (expr_list:REG_DEAD (reg:SI 92 [ b ])
        (expr_list:REG_DEAD (reg/v/f:SI 88 [ out ])
            (nil))))

into

(insn 2 7 4 2 (set (reg/v/f:SI 88 [ out ])
        (reg:SI 0 ax [ out ])) pr64037.ii:8 90 {*movsi_internal}
     (expr_list:REG_DEAD (reg:SI 0 ax [ out ])
        (nil)))
(insn 4 2 6 2 (set (reg:SI 91 [ b ])
        (reg:SI 2 cx [ b ])) pr64037.ii:8 90 {*movsi_internal}
     (expr_list:REG_DEAD (reg:SI 2 cx [ b ])
        (nil)))
(note 6 4 9 2 NOTE_INSN_FUNCTION_BEG)
(note 9 6 10 2 NOTE_INSN_DELETED)
(insn 10 9 0 2 (set (mem:SI (reg/v/f:SI 88 [ out ]) [1 *out_4(D)+0 S4 A32])
        (reg:SI 91 [ b ])) pr64037.ii:9 90 {*movsi_internal}
     (expr_list:REG_DEAD (reg:SI 91 [ b ])
        (expr_list:REG_DEAD (reg/v/f:SI 88 [ out ])
            (nil))))

It is certainly wrong.
Comment 9 H.J. Lu 2014-11-24 19:52:07 UTC
This may fail on all targets where WORD_REGISTER_OPERATIONS isn't defined.
Comment 10 H.J. Lu 2014-11-24 21:33:23 UTC
To pass

enum class X : unsigned char {
  V = 2,
};

assign_parm_setup_reg calls ix86_promote_function_mode

(gdb) f 0
#0  ix86_promote_function_mode (type=0x7ffff19f85e8, mode=QImode, 
    punsignedp=0x7fffffffd3c4, fntype=0x7ffff19f8738, for_return=2)
    at /export/gnu/import/git/gcc/gcc/config/i386/i386.c:8310
8310	  if (type != NULL_TREE && POINTER_TYPE_P (type))
(gdb) call debug_tree (type)
 <enumeral_type 0x7ffff19f85e8 X
    type <integer_type 0x7ffff18a93f0 unsigned char public unsigned string-flag QI
        size <integer_cst 0x7ffff18a5fa8 constant 8>
        unit size <integer_cst 0x7ffff18a5fc0 constant 1>
        align 8 symtab 0 alias set -1 canonical type 0x7ffff18a93f0 precision 8 min <integer_cst 0x7ffff18a5fd8 0> max <integer_cst 0x7ffff18a5f78 255>>
    static unsigned type_5 QI size <integer_cst 0x7ffff18a5fa8 8> unit size <integer_cst 0x7ffff18a5fc0 1>
    align 8 symtab 0 alias set -1 canonical type 0x7ffff19f85e8 precision 8 min <integer_cst 0x7ffff18a5fd8 0> max <integer_cst 0x7ffff18a5f78 255>
    values <tree_list 0x7ffff19fb028
        purpose <identifier_node 0x7ffff19f6738 V
            bindings <(nil)>
            local bindings <(nil)>>
        value <const_decl 0x7ffff18c21c0 V type <enumeral_type 0x7ffff19f85e8 X>
            readonly constant used VOID file pr64037.ii line 2 col 3
            align 1 context <enumeral_type 0x7ffff19f85e8 X> initial <integer_cst 0x7ffff19d8d08 2>>> context <translation_unit_decl 0x7ffff7ff91e0 D.1>
    chain <type_decl 0x7ffff19f5c78 X>>
(gdb) 

However, setup_incoming_promotions calls ix86_promote_function_mode with

Breakpoint 18, ix86_promote_function_mode (type=0x7ffff18a9690, mode=SImode, 
    punsignedp=0x7fffffffd774, fntype=0x7ffff19f8738, for_return=0)
    at /export/gnu/import/git/gcc/gcc/config/i386/i386.c:8310
8310	  if (type != NULL_TREE && POINTER_TYPE_P (type))
(gdb) call debug_tree (type)
 <integer_type 0x7ffff18a9690 int public SI
    size <integer_cst 0x7ffff18a5e70 type <integer_type 0x7ffff18a9150 bitsizetype> constant 32>
    unit size <integer_cst 0x7ffff18a5e88 type <integer_type 0x7ffff18a90a8 sizetype> constant 4>
    align 32 symtab 0 alias set 1 canonical type 0x7ffff18a9690 precision 32 min <integer_cst 0x7ffff18c60c0 -2147483648> max <integer_cst 0x7ffff18c60d8 2147483647>
    pointer_to_this <pointer_type 0x7ffff18cb930>>
(gdb) 

It uses a different type and mode.

If I make this change to combine:

diff --git a/gcc/combine.c b/gcc/combine.c
index 1808f97..8bdfe2b 100644
--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -1562,7 +1562,7 @@ setup_incoming_promotions (rtx_insn *first)
 
       /* The mode and signedness of the argument as it is actually passed,
          after any TARGET_PROMOTE_FUNCTION_ARGS-driven ABI promotions.  */
-      mode3 = promote_function_mode (DECL_ARG_TYPE (arg), mode2, &uns3,
+      mode3 = promote_function_mode (DECL_ARG_TYPE (arg), mode1, &uns1,
                 TREE_TYPE (cfun->decl), 0);

to make setup_incoming_promotions consistent with assign_parm_setup_reg,
the testcase works.
Comment 11 H.J. Lu 2014-11-24 21:39:09 UTC
i386.c has

#define TARGET_PROMOTE_PROTOTYPES hook_bool_const_tree_true

 -- Target Hook: bool TARGET_PROMOTE_PROTOTYPES (const_tree FNTYPE)
     This target hook returns 'true' if an argument declared in a
     prototype as an integral type smaller than 'int' should actually be
     passed as an 'int'.  In addition to avoiding errors in certain
     cases of mismatch, it also makes for better code on certain
     machines.  The default is to not promote prototypes.

But it uses movq to pass a QI parameter.
Comment 12 H.J. Lu 2014-11-24 21:40:43 UTC
Here is the related discussion:

https://groups.google.com/forum/#!topic/ia32-abi/9H4BBrIdkmk
Comment 13 H.J. Lu 2014-11-25 18:23:26 UTC
The bug was introduced by

https://gcc.gnu.org/ml/gcc-cvs/2007-09/msg00613.html

commit 5d93234932c3d8617ce92b77b7013ef6bede9508
Author: shinwell <shinwell@138bc75d-0d04-0410-961f-82ee72b054a4>
Date:   Thu Sep 20 11:01:18 2007 +0000

      gcc/
      * combine.c: Include cgraph.h.
      (setup_incoming_promotions): Rework to allow more aggressive
      elimination of sign extensions when all call sites of the
      current function are known to lie within the current unit.


    git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@128618
138bc75d-0d04-0410-961f-82ee72b054a4

Before this commit, combine.c has

          enum machine_mode mode = TYPE_MODE (TREE_TYPE (arg));
          int uns = TYPE_UNSIGNED (TREE_TYPE (arg));

          mode = promote_mode (TREE_TYPE (arg), mode, &uns, 1);
          if (mode == GET_MODE (reg) && mode != DECL_MODE (arg))
            {
              rtx x;
              x = gen_rtx_CLOBBER (DECL_MODE (arg), const0_rtx);
              x = gen_rtx_fmt_e ((uns ? ZERO_EXTEND : SIGN_EXTEND), mode, x);
              record_value_for_reg (reg, first, x);
            }

It matches function.c:

  /* This is not really promoting for a call.  However we need to be
     consistent with assign_parm_find_data_types and expand_expr_real_1.  */
  promoted_nominal_mode
    = promote_mode (data->nominal_type, data->nominal_mode, &unsignedp, 1);

128618 changed

mode = promote_mode (TREE_TYPE (arg), mode, &uns, 1);

to

mode3 = promote_mode (DECL_ARG_TYPE (arg), mode2, &uns3, 1);

It breaks none WORD_REGISTER_OPERATIONS targets.
Comment 14 hjl@gcc.gnu.org 2014-11-28 15:28:27 UTC
Author: hjl
Date: Fri Nov 28 15:27:55 2014
New Revision: 218161

URL: https://gcc.gnu.org/viewcvs?rev=218161&root=gcc&view=rev
Log:
Pass unpromoted argument to promote_function_mode

This patch updates setup_incoming_promotions in combine.c to match what
is actually passed in assign_parm_setup_reg in function.c.

gcc/

	PR rtl-optimization/64037
	* combine.c (setup_incoming_promotions): Pass the argument
	before any promotions happen to promote_function_mode.

gcc/testsuite/

	PR rtl-optimization/64037
	* g++.dg/pr64037.C: New test.

Added:
    trunk/gcc/testsuite/g++.dg/pr64037.C
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/combine.c
    trunk/gcc/testsuite/ChangeLog
Comment 15 H.J. Lu 2014-11-28 15:32:23 UTC
Fixed on trunk so far.
Comment 16 hjl@gcc.gnu.org 2014-12-05 11:48:23 UTC
Author: hjl
Date: Fri Dec  5 11:47:51 2014
New Revision: 218418

URL: https://gcc.gnu.org/viewcvs?rev=218418&root=gcc&view=rev
Log:
Pass unpromoted argument to promote_function_mode

This patch updates setup_incoming_promotions in combine.c to match what
is actually passed in assign_parm_setup_reg in function.c.

gcc/

	Backport from mainline
	PR rtl-optimization/64037
	* combine.c (setup_incoming_promotions): Pass the argument
	before any promotions happen to promote_function_mode.

gcc/testsuite/

	Backport from mainline
	PR rtl-optimization/64037
	* g++.dg/pr64037.C: New test.

Added:
    branches/gcc-4_9-branch/gcc/testsuite/g++.dg/pr64037.C
Modified:
    branches/gcc-4_9-branch/gcc/ChangeLog
    branches/gcc-4_9-branch/gcc/combine.c
    branches/gcc-4_9-branch/gcc/testsuite/ChangeLog
Comment 17 hjl@gcc.gnu.org 2014-12-05 12:03:05 UTC
Author: hjl
Date: Fri Dec  5 12:02:33 2014
New Revision: 218420

URL: https://gcc.gnu.org/viewcvs?rev=218420&root=gcc&view=rev
Log:
Pass unpromoted argument to promote_function_mode

This patch updates setup_incoming_promotions in combine.c to match what
is actually passed in assign_parm_setup_reg in function.c.

gcc/

	Backport from mainline
	PR rtl-optimization/64037
	* combine.c (setup_incoming_promotions): Pass the argument
	before any promotions happen to promote_function_mode.

gcc/testsuite/

	Backport from mainline
	PR rtl-optimization/64037
	* g++.dg/pr64037.C: New test.

Added:
    branches/gcc-4_8-branch/gcc/testsuite/g++.dg/pr64037.C
Modified:
    branches/gcc-4_8-branch/gcc/ChangeLog
    branches/gcc-4_8-branch/gcc/combine.c
    branches/gcc-4_8-branch/gcc/testsuite/ChangeLog
Comment 18 H.J. Lu 2014-12-05 12:07:05 UTC
Fixed for 4.8.4 and 4.9.3.
Comment 19 Uroš Bizjak 2014-12-08 20:28:53 UTC
This patch caused PR64213.
Comment 20 uros 2014-12-09 14:35:04 UTC
Author: uros
Date: Tue Dec  9 14:34:32 2014
New Revision: 218516

URL: https://gcc.gnu.org/viewcvs?rev=218516&root=gcc&view=rev
Log:
	PR bootstrap/64213
	Revert:
	2014-11-28  H.J. Lu  <hongjiu.lu@intel.com>

	PR rtl-optimization/64037
	* combine.c (setup_incoming_promotions): Pass the argument
	before any promotions happen to promote_function_mode.

testsuite/ChangeLog:

	PR bootstrap/64213
	Revert:
	2014-11-28  H.J. Lu  <hongjiu.lu@intel.com>

	PR rtl-optimization/64037
	* g++.dg/pr64037.C: New test.


Removed:
    trunk/gcc/testsuite/g++.dg/pr64037.C
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/combine.c
    trunk/gcc/testsuite/ChangeLog
Comment 21 uros 2014-12-09 14:41:12 UTC
Author: uros
Date: Tue Dec  9 14:40:40 2014
New Revision: 218517

URL: https://gcc.gnu.org/viewcvs?rev=218517&root=gcc&view=rev
Log:
	PR bootstrap/64213
	Revert:
	2014-11-28  H.J. Lu  <hongjiu.lu@intel.com>

	PR rtl-optimization/64037
	* combine.c (setup_incoming_promotions): Pass the argument
	before any promotions happen to promote_function_mode.

testsuite/ChangeLog:

	PR bootstrap/64213
	Revert:
	2014-11-28  H.J. Lu  <hongjiu.lu@intel.com>

	PR rtl-optimization/64037
	* g++.dg/pr64037.C: New test.


Removed:
    branches/gcc-4_9-branch/gcc/testsuite/g++.dg/pr64037.C
Modified:
    branches/gcc-4_9-branch/gcc/ChangeLog
    branches/gcc-4_9-branch/gcc/combine.c
    branches/gcc-4_9-branch/gcc/testsuite/ChangeLog
Comment 22 uros 2014-12-09 14:44:38 UTC
Author: uros
Date: Tue Dec  9 14:44:06 2014
New Revision: 218518

URL: https://gcc.gnu.org/viewcvs?rev=218518&root=gcc&view=rev
Log:
	PR bootstrap/64213
	Revert:
	2014-11-28  H.J. Lu  <hongjiu.lu@intel.com>

	PR rtl-optimization/64037
	* combine.c (setup_incoming_promotions): Pass the argument
	before any promotions happen to promote_function_mode.

testsuite/ChangeLog:

	PR bootstrap/64213
	Revert:
	2014-11-28  H.J. Lu  <hongjiu.lu@intel.com>

	PR rtl-optimization/64037
	* g++.dg/pr64037.C: New test.


Removed:
    branches/gcc-4_8-branch/gcc/testsuite/g++.dg/pr64037.C
Modified:
    branches/gcc-4_8-branch/gcc/ChangeLog
    branches/gcc-4_8-branch/gcc/combine.c
    branches/gcc-4_8-branch/gcc/testsuite/ChangeLog
Comment 23 Uroš Bizjak 2014-12-09 14:48:25 UTC
Reopening due to revert on all branches.
Comment 24 hjl@gcc.gnu.org 2014-12-14 16:04:43 UTC
Author: hjl
Date: Sun Dec 14 16:04:11 2014
New Revision: 218720

URL: https://gcc.gnu.org/viewcvs?rev=218720&root=gcc&view=rev
Log:
Pass unpromoted argument to promote_function_mode

This patch updates setup_incoming_promotions in combine.c to match what
is actually passed in assign_parm_setup_reg in function.c.

gcc/

	PR rtl-optimization/64037
	* combine.c (setup_incoming_promotions): Pass the argument
	before any promotions happen to promote_function_mode.

gcc/testsuite/

	PR rtl-optimization/64037
	* g++.dg/pr64037.C: New test.

Added:
    trunk/gcc/testsuite/g++.dg/pr64037.C
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/combine.c
    trunk/gcc/testsuite/ChangeLog
Comment 25 hjl@gcc.gnu.org 2014-12-14 16:07:35 UTC
Author: hjl
Date: Sun Dec 14 16:07:03 2014
New Revision: 218721

URL: https://gcc.gnu.org/viewcvs?rev=218721&root=gcc&view=rev
Log:
Pass unpromoted argument to promote_function_mode

This patch updates setup_incoming_promotions in combine.c to match what
is actually passed in assign_parm_setup_reg in function.c.

Backported from mainline:

gcc/

	PR rtl-optimization/64037
	* combine.c (setup_incoming_promotions): Pass the argument
	before any promotions happen to promote_function_mode.

gcc/testsuite/

	PR rtl-optimization/64037
	* g++.dg/pr64037.C: New test.

Added:
    branches/gcc-4_9-branch/gcc/testsuite/g++.dg/pr64037.C
Modified:
    branches/gcc-4_9-branch/gcc/ChangeLog
    branches/gcc-4_9-branch/gcc/combine.c
    branches/gcc-4_9-branch/gcc/testsuite/ChangeLog
Comment 26 H.J. Lu 2014-12-15 16:01:43 UTC
*** Bug 58192 has been marked as a duplicate of this bug. ***
Comment 27 Jakub Jelinek 2014-12-19 13:33:13 UTC
GCC 4.8.4 has been released.
Comment 28 hjl@gcc.gnu.org 2014-12-19 19:20:05 UTC
Author: hjl
Date: Fri Dec 19 19:19:33 2014
New Revision: 218967

URL: https://gcc.gnu.org/viewcvs?rev=218967&root=gcc&view=rev
Log:
Pass unpromoted argument to promote_function_mode

This patch updates setup_incoming_promotions in combine.c to match what
is actually passed in assign_parm_setup_reg in function.c.

Backported from mainline:

gcc/

	PR rtl-optimization/64037
	* combine.c (setup_incoming_promotions): Pass the argument
	before any promotions happen to promote_function_mode.

gcc/testsuite/

	PR rtl-optimization/64037
	* g++.dg/pr64037.C: New test.

Added:
    branches/gcc-4_8-branch/gcc/testsuite/g++.dg/pr64037.C
Modified:
    branches/gcc-4_8-branch/gcc/ChangeLog
    branches/gcc-4_8-branch/gcc/combine.c
    branches/gcc-4_8-branch/gcc/testsuite/ChangeLog
Comment 29 H.J. Lu 2014-12-19 19:49:14 UTC
Fixed for 4.8.5