Bug 42588 - unnecessary move through x87 stack/local frame for union
Summary: unnecessary move through x87 stack/local frame for union
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: middle-end
Version: unknown
Importance: P3 enhancement
Target Milestone: 4.9.0
Assignee: Not yet assigned to anyone
URL:
Keywords: missed-optimization
Depends on: 33989
Blocks:
Reported: 2010-01-03 06:22 UTC by Andi Kleen
Modified: 2021-07-26 07:03 UTC

See Also:
Host: x86_64-linux
Target: x86_64-linux -m32
Build:
Known to work:
Known to fail:
Last reconfirmed: 2010-01-03 06:46:29


Description Andi Kleen 2010-01-03 06:22:52 UTC
from http://embed.cs.utah.edu/embarrassing/dec_09/harvest/gcc-head_llvm-gcc-head/

union __anonunion___u_19
{
  double __d;
  int __i[2];
};
extern __attribute__ ((__nothrow__))
     int __signbit (double __x) __attribute__ ((__const__));
     extern __attribute__ ((__nothrow__))
     int __signbit (double __x) __attribute__ ((__const__));
     extern int __signbit (double __x)
{
  union __anonunion___u_19 __u;

  {
    __u.__d = __x;
    return (__u.__i[1] < 0);
  }
}

/* Checksum = AEFB9790 */

generates with -O2 -m32 -fomit-frame-pointer

        subl    $12, %esp
        fldl    16(%esp)
        fstpl   (%esp)
        movl    4(%esp), %eax
        addl    $12, %esp
        shrl    $31, %eax
        ret

The move through the x87 stack and the local frame is unnecessary;
the shift could be done directly on the argument's stack slot.

in comparison llvm generates the much neater:

   0:	0f b7 44 24 0c       	movzwl 0xc(%esp),%eax
   5:	c1 e8 0f             	shr    $0xf,%eax
   8:	c3                   	ret
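
For reference, a minimal C sketch of what the shorter sequence amounts to, assuming a little-endian target with 64-bit IEEE 754 doubles (as on x86): the sign can be read from the high 32 bits of the argument's bytes, with no floating-point load or store. The names signbit_union and signbit_highword are hypothetical and not part of the test case.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical name: the same union type-pun as the reported test case. */
static int signbit_union(double x)
{
    union { double d; int32_t i[2]; } u;

    u.d = x;
    return u.i[1] < 0;          /* i[1] is the high word on little-endian */
}

/* Hypothetical name: reads only the high 32 bits of the double in memory.
   Assumes little-endian byte order, so the high word starts at offset 4. */
static int signbit_highword(const double *x)
{
    uint32_t hi;

    memcpy(&hi, (const unsigned char *)x + 4, sizeof hi);
    return (int)(hi >> 31);     /* sign bit is the top bit of the high word */
}
```

Under those assumptions both functions should agree for every input, including -0.0.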
Comment 1 Andrew Pinski 2010-01-03 06:46:29 UTC
;; __u.__d = __x_1(D);

(insn 6 5 0 t.c:15 (set (subreg:DF (reg/v:DI 60 [ __u ]) 0)
        (reg/v:DF 62 [ __x ])) -1 (nil))

That causes a reload to happen:
Reload 0: reload_out (DF) = (subreg:DF (reg/v:DI 60 [ __u ]) 0)
        FLOAT_REGS, RELOAD_FOR_OUTPUT (opnum = 0)
        reload_out_reg: (subreg:DF (reg/v:DI 60 [ __u ]) 0)
        reload_reg_rtx: (reg:DF 8 st)

I think I may have filed another bug about this issue before ...
Comment 2 Andrew Pinski 2010-01-03 06:49:19 UTC
Here is an example which shows the issue on PPC also:
union __anonunion___u_19
{
  double __d;
  int __i[2];
};
     extern int __signbit (double *__x)
{
  union __anonunion___u_19 __u;
  {
    __u.__d = *__x;
    return (__u.__i[1] < 0);
  }
}
Comment 3 Andrew Pinski 2010-01-03 06:51:04 UTC
This is related to PR 33989.
Comment 4 Andrew Pinski 2021-07-26 07:03:40 UTC
Fixed in 4.9.0+

Before:
(insn 6 3 7 2 (set (subreg:DF (reg/v:DI 62 [ __u ]) 0)
        (reg/v:DF 64 [ __x ])) /app/example.c:15 -1
     (nil))
After:
(insn 6 3 7 2 (set (reg/v:DI 86 [ __u ])
        (subreg:DI (reg/v:DF 88 [ __x ]) 0)) /app/example.c:15 -1
     (nil))

Which means it was fixed by r0-126192.