[Bug middle-end/53623] [4.7/4.8 Regression] sign extension is effectively split into two x86-64 instructions

rguenth at gcc dot gnu.org gcc-bugzilla@gcc.gnu.org
Mon Jun 11 09:47:00 GMT 2012


Richard Guenther <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
         Depends on|                            |50176
   Target Milestone|---                         |4.7.2

--- Comment #4 from Richard Guenther <rguenth at gcc dot gnu.org> 2012-06-11 09:47:15 UTC ---
Forwprop does

--- t.c.024t.ccp1       2012-06-11 11:32:13.791164397 +0200
+++ t.c.025t.forwprop1  2012-06-11 11:32:13.792164397 +0200
@@ -11,7 +11,7 @@
 <bb 2>:
   D.1751_2 = code[rdx_1(D)];
   rdx_3 = (int64_t) D.1751_2;
-  inst_4 = (uint8_t) rdx_3;
+  inst_4 = (uint8_t) D.1751_2;
   rdx_5 = rdx_3 >> 8;
   D.1752_6 = (int) inst_4;
   D.1753_7 = dispatch[D.1752_6];

making D.1751_2 no longer single-use and thus no longer triggering combine.

Indeed looks related to 50176.

But while we certainly can teach forwprop to consider only single-use
chains (so that it can never cause this issue), that isn't a good solution.
In fact, to optimize this properly we need to know whether cheap
subreg-like accesses are possible (combining (int) (uint8_t) (int64_t)
code[rdx_1] into simply extending the lower part of (int64_t) code[rdx_1]
without explicit truncation).  This seems a better fit for an RTL
optimization pass than for a tree pass, if you consider that the forwprop
"optimization" could just as well have been done in the source, like

#include <stdint.h>

typedef int (*inst_t)(int64_t rdi, int64_t rsi, int64_t rdx);

int16_t code[256];
inst_t dispatch[256];

void an_inst(int64_t rdi, int64_t rsi, int64_t rdx) {
    uint8_t inst;
    inst = (uint8_t) code[rdx];
    rdx = code[rdx];
    rdx >>= 8;
    dispatch[inst](rdi, rsi, rdx);
}

int main(void) {
    return 0;
}

which you could easily get from some level of abstraction.
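
To make the point concrete, here is a minimal sketch (not taken from the
bug's testcase) of the two shapes discussed above.  The struct and function
names are hypothetical, chosen only for illustration.  It assumes nothing
beyond standard C integer conversions: both shapes compute the same values,
so what forwprop changes is only the number of uses of the loaded value,
not the result.

```c
#include <stdint.h>

/* Hypothetical helper type, used only to return both results at once. */
struct decoded {
    uint8_t inst;  /* low byte, used as the dispatch index */
    int64_t rdx;   /* remaining bits after the shift */
};

/* Shape before forwprop: the loaded value is widened once and both
   results derive from the widened value, so the load has a single use. */
static struct decoded decode_single_use(int16_t word) {
    struct decoded d;
    int64_t wide = (int64_t) word;
    d.inst = (uint8_t) wide;
    d.rdx = wide >> 8;
    return d;
}

/* Shape after forwprop: the truncation reads the narrow value directly,
   so the loaded value now has two uses. */
static struct decoded decode_two_uses(int16_t word) {
    struct decoded d;
    d.inst = (uint8_t) word;
    d.rdx = (int64_t) word >> 8;
    return d;
}

/* Returns 1 if both shapes agree for every int16_t input, 0 otherwise. */
static int forms_agree(void) {
    for (int32_t v = INT16_MIN; v <= INT16_MAX; v++) {
        struct decoded a = decode_single_use((int16_t) v);
        struct decoded b = decode_two_uses((int16_t) v);
        if (a.inst != b.inst || a.rdx != b.rdx)
            return 0;
    }
    return 1;
}
```

Since the two shapes are interchangeable at the source level, whichever one
the user writes should ideally end up as the same code, which is why
restricting forwprop alone would not address the general case.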
