[Bug target/53987] [SH] Unnecessary zero-extension before cmp/eq

olegendo at gcc dot gnu.org gcc-bugzilla@gcc.gnu.org
Mon Dec 16 19:07:00 GMT 2013


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53987

--- Comment #3 from Oleg Endo <olegendo at gcc dot gnu.org> ---
(In reply to Oleg Endo from comment #2)
> As of rev 204180 (4.9) this problem still exists.
> As far as I understand, the actual root of the problem is that the 'unsigned
> char' mem loads into regs are neither sign nor zero extended.

I've tried doing the following to enforce sign extension of memory loads <
SImode:

Index: gcc/config/sh/sh.md
===================================================================
--- gcc/config/sh/sh.md    (revision 205971)
+++ gcc/config/sh/sh.md    (working copy)
@@ -5958,7 +5958,18 @@

 (define_expand "zero_extend<mode>si2"
   [(set (match_operand:SI 0 "arith_reg_dest")
-    (zero_extend:SI (match_operand:QIHI 1 "zero_extend_operand")))])
+    (zero_extend:SI (match_operand:QIHI 1 "general_extend_operand")))]
+  ""
+{
+  if (!zero_extend_operand (operands[1], <MODE>mode))
+    {
+      rtx tmp = gen_reg_rtx (SImode);
+      emit_insn (gen_extend<mode>si2 (tmp, operands[1]));
+      emit_insn (gen_zero_extend<mode>si2 (operands[0],
+                       gen_lowpart (<MODE>mode, tmp)));
+      DONE;
+    }
+})

 (define_insn_and_split "*zero_extend<mode>si2_compact"
   [(set (match_operand:SI 0 "arith_reg_dest" "=r")

However, this doesn't fix the problem.
According to CSiBE (-m4 -ml -O2 -mpretend-cmove) there are a few cases where
register allocation is a bit better, but there are also some code size
increases (e.g. due to interference with the tst #imm,r0 patterns).  Overall,
there's a net code size decrease of 228 bytes on the whole set.

Nevertheless, having the explicit sign_extend mem loads could be useful.  For
example, if it is known that a mem load sign extends, the cmpeq insn could be
hoisted above the extension insns before register allocation.
On SH2A it's probably better to not allow zero extending mem loads in the
expander and defer the movu.{b|w} insn selection until the combine pass.
Otherwise the original test case will always use zero extending mem loads, even
though sign extending ones would suffice (the sign extending loads are 16 bit
insns, the movu.{b|w} loads are 32 bit insns).
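
To illustrate why sign extending loads would suffice for an equality
comparison, here is a minimal sketch in plain C.  It is not the test case from
this PR, and the function names (zext8, sext8, count_eq_mismatches,
check_eq_extension_independent) are made up for the example.  It walks over all
pairs of 8 bit values and counts the pairs for which comparing the sign
extended values for equality gives a different answer than comparing the zero
extended values:

```c
#include <assert.h>
#include <stdint.h>

/* Zero extend an 8 bit value to 32 bits (models a zero extending load).  */
static uint32_t zext8 (uint8_t v)
{
  return (uint32_t) v;
}

/* Sign extend an 8 bit value to 32 bits (models a sign extending load).
   The xor/subtract form avoids relying on implementation-defined
   narrowing conversions.  */
static uint32_t sext8 (uint8_t v)
{
  return (uint32_t) (int32_t) ((v ^ 0x80) - 0x80);
}

/* Count the (a, b) pairs for which the equality test on sign extended
   values disagrees with the equality test on zero extended values.  */
unsigned int count_eq_mismatches (void)
{
  unsigned int mismatches = 0;

  for (unsigned int a = 0; a < 256; a++)
    for (unsigned int b = 0; b < 256; b++)
      {
        int eq_zext = zext8 ((uint8_t) a) == zext8 ((uint8_t) b);
        int eq_sext = sext8 ((uint8_t) a) == sext8 ((uint8_t) b);

        if (eq_zext != eq_sext)
          mismatches++;
      }

  return mismatches;
}

/* Abort if any pair of 8 bit values disagrees.  */
void check_eq_extension_independent (void)
{
  assert (count_eq_mismatches () == 0);
}
```

Both extensions map distinct 8 bit values to distinct 32 bit values, so the
equality result is the same as long as both operands are extended the same way.
This only covers the case where both sides come from such loads; a comparison
against an arbitrary SImode value is a different matter.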


