This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.


Re: PR63633: May middle-end come up with hard regs for insn expanders?


On 04/20/2015 at 10:11 PM, Vladimir Makarov wrote:
On 17/04/15 05:58 AM, Georg-Johann Lay wrote:
I took the liberty of CCing Vladimir; maybe he can propose how the backend
can describe an efficient, constraint-based solution.  The problem is about
expanders producing insns with non-fixed hard regs as in/out operands or
clobbers.  This includes move insns from non-generic address spaces which
require dedicated hard regs.  The issue is about correctness and efficiency
of the generated code.

I might be wrong, but I think you get bloated code because you use
scratches.  I have already said several times that using a scratch is always
a bad idea.  It was a bad idea for the old RA and is still a bad idea for
IRA.  The use of scratches should be prohibited; we should probably document
that somewhere.  It is better to use a regular pseudo instead.

Why is it a bad idea?  Because IRA (or the old global RA) does not take them
into account *at all*.  This means that IRA might think there are enough
registers for the pseudos when in reality there are not, because of scratches
in the live ranges of those pseudos.
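
The effect described above can be pictured with a toy model. This is only a sketch under simplifying assumptions, not GCC code: it looks at a single program point, and the function names are made up for illustration. The allocator's view counts only live pseudos, while the real demand also includes the scratches.

```c
#include <assert.h>

/* Toy model, not GCC code: number of registers still missing at one
   program point when `live` values compete for `hard_regs` registers.  */
static int shortfall (int live, int hard_regs)
{
    assert (live >= 0 && hard_regs >= 0);
    return live > hard_regs ? live - hard_regs : 0;
}

/* What an allocator that ignores scratches believes (hypothetical name).  */
static int shortfall_seen_by_allocator (int live_pseudos, int scratches,
                                        int hard_regs)
{
    (void) scratches;   /* not taken into account at all */
    return shortfall (live_pseudos, hard_regs);
}

/* What is actually needed once the scratches claim registers, too.  */
static int shortfall_in_reality (int live_pseudos, int scratches,
                                 int hard_regs)
{
    return shortfall (live_pseudos + scratches, hard_regs);
}
```

With 4 hard registers, 4 live pseudos and 1 scratch, the first function reports no shortfall while the second reports that one register is missing.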

Ok, thanks for that information!

The avr backend actually uses clobbers only if one is available in peep2.  But there are insns that always need specific registers.

If that is not the case, I should investigate why you get bloated code; a
small test case would help here.

Thanks,  I hope my comments will be useful.

Attached is a C test program which produces fine results with

$ avr-gcc -S -O2 -mmcu=atmega8

Also attached is a corresponding patch against the trunk avr backend that illustrates the transition from clobbers to hard regs by constraint.

I don't actually remember when I first tried this; sometime around when 4.8 was in stage 1 or so.

If my recollection is right, the problem was not that small test programs with mulsi3 produced large code, but that "ordinary" code could get much worse.  I had the impression this was due to the bunch of new, rarely used and rarely useful register classes, and that IRA's cost computation got confused, or at least much less accurate than with the usual register classes (only 10 classes of GENERAL_REG).

The attached patch adds 27 new register classes, and transforming all insns might need even more classes: 8-bit, 16-bit and 24-bit multiplications including sign/zero extension of operands, fixed-point functions from 8 to 32 bits, divmod, built-in implementations, support functions for address spaces, ...
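
To make the count of 27 easier to check, here is a small sketch in plain C (not part of the patch; the helper names are made up for illustration). It rebuilds the first word of the new REG_CLASS_CONTENTS entries, e.g. `15u << 22` for REGS_R22_R25, and counts the classes: 12 single registers r16...r27 plus five 2-, 3- and 4-byte spans each, starting at r16, r18, ..., r24.

```c
#include <assert.h>
#include <stdint.h>

/* Bit mask of `nbytes` consecutive 8-bit registers starting at register
   `first`, in the style of the first word of a REG_CLASS_CONTENTS entry.  */
static uint32_t reg_span_mask (unsigned first, unsigned nbytes)
{
    assert (nbytes >= 1 && nbytes <= 4 && first + nbytes <= 32);
    return ((1u << nbytes) - 1u) << first;
}

/* Count the classes the patch adds: single registers r16...r27, plus
   2-, 3- and 4-byte spans starting at each even register r16...r24.  */
static unsigned count_new_classes (void)
{
    unsigned n = 0;

    for (unsigned r = 16; r <= 27; r++)
        n++;

    for (unsigned nbytes = 2; nbytes <= 4; nbytes++)
        for (unsigned r = 16; r <= 24; r += 2)
            n++;

    return n;
}
```

For instance, `reg_span_mask (22, 4)` gives the same value as the `{ 15u << 22, 0x0}` entry in avr.h.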

The insns that use these classes all have the following properties in common:

- Only 1 constraint alternative

- Register allocation is uniquely determined, i.e. the register allocator has no choice of which register to pick for which operand (except for commutative constraints with '%', which give exactly 2 solutions).
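
The two properties above can be checked with a toy count. This is a sketch only, not how IRA works, and the function names are made up for illustration. It counts where an n-byte operand fits into a register class mask, ignoring any further target restrictions on where multi-byte values may start, and doubles the result for a commutative pair whose two classes differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Number of ways to place an operand of `nbytes` consecutive registers
   inside the register set `mask` (bit r set: register r is allowed).
   Toy model: ignores any further target restrictions.  */
static unsigned count_placements (uint32_t mask, unsigned nbytes)
{
    assert (nbytes >= 1 && nbytes <= 4);
    unsigned count = 0;

    for (unsigned r = 0; r + nbytes <= 32; r++)
    {
        uint32_t span = ((1u << nbytes) - 1u) << r;
        if ((mask & span) == span)
            count++;
    }
    return count;
}

/* Solutions for a two-input insn with a single alternative: the product
   of the per-operand placements, doubled if the inputs may be swapped
   ('%').  Assumes the two classes are different.  */
static unsigned count_solutions (uint32_t mask1, uint32_t mask2,
                                 unsigned nbytes, bool commutative)
{
    unsigned n = count_placements (mask1, nbytes)
                 * count_placements (mask2, nbytes);
    return commutative ? 2 * n : n;
}
```

With the masks of "R22_4" (`15u << 22`) and "R18_4" (`15u << 18`) as used by "*mulsi3_call", each operand has exactly one placement; the '%' then gives the two solutions mentioned above.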

The patch avoids clobbers and scratches altogether.  The only insns that affect a register other than the output are transformed from single_set to parallels in split1.  The 2nd set describes setting a (reg:HI 26) to a useless value.  These insns are not expanded as parallels, because insn combine won't use parallels for combinations.

Is there a chance that register allocation gets worse just because so many register classes are added?

Johann

typedef __UINT8_TYPE__ u8;
typedef __INT8_TYPE__ s8;

typedef __UINT16_TYPE__ u16;
typedef __INT16_TYPE__ s16;

typedef __INT32_TYPE__ s32;

s32 muls16_const (s16 a)
{
    return (s32) a * 12345;
}

s32 mulu16_const (u16 a)
{
    return (s32) a * 12345;
}

s32 mul_s16_s16 (s16 a, s16 b)
{
    return (s32) a * b;
}

s32 mul_s16_u16 (s16 a, u16 b)
{
    return (s32) a * b;
}

s32 mul32 (s32 a, s32 b, s32 c)
{
    return a * b * c * 257;
}

s32 mul32_uconst (s32 a)
{
    return a * 12345;
}

s32 mul32_oconst (s32 a)
{
    return a * -60000;
}

s32 mul32_s8 (s32 a, s8 b)
{
    return a * b;
}

s32 mul32_u8 (s32 a, u8 b)
{
    return a * b;
}

s32 mul32_u8_const (u8 b)
{
    return -1024L * b;
}

s32 mul32_s8_const (s8 b)
{
    return 12345L * b;
}


u16 udiv10 (u16 a)
{
    return a / 10;
}
Index: avr-protos.h
===================================================================
--- avr-protos.h	(revision 222143)
+++ avr-protos.h	(working copy)
@@ -19,7 +19,7 @@
    along with GCC; see the file COPYING3.  If not see
    <http://www.gnu.org/licenses/>.  */
 
-
+extern bool avr_split1_completed (void);
 extern int avr_function_arg_regno_p (int r);
 extern void avr_cpu_cpp_builtins (struct cpp_reader * pfile);
 extern enum reg_class avr_regno_reg_class (int r);
Index: constraints.md
===================================================================
--- constraints.md	(revision 222143)
+++ constraints.md	(working copy)
@@ -19,6 +19,40 @@
 
 ;; Register constraints
 
+(define_register_constraint "R16_1" "REGS_R16" "")
+(define_register_constraint "R16_2" "REGS_R16_R17" "")
+(define_register_constraint "R16_3" "REGS_R16_R18" "")
+(define_register_constraint "R16_4" "REGS_R16_R19" "")
+
+(define_register_constraint "R18_1" "REGS_R18" "")
+(define_register_constraint "R18_2" "REGS_R18_R19" "")
+(define_register_constraint "R18_3" "REGS_R18_R20" "")
+(define_register_constraint "R18_4" "REGS_R18_R21" "")
+
+(define_register_constraint "R20_1" "REGS_R20" "")
+(define_register_constraint "R20_2" "REGS_R20_R21" "")
+(define_register_constraint "R20_3" "REGS_R20_R22" "")
+(define_register_constraint "R20_4" "REGS_R20_R23" "")
+
+(define_register_constraint "R22_1" "REGS_R22" "")
+(define_register_constraint "R22_2" "REGS_R22_R23" "")
+(define_register_constraint "R22_3" "REGS_R22_R24" "")
+(define_register_constraint "R22_4" "REGS_R22_R25" "")
+
+(define_register_constraint "R24_1" "REGS_R24" "")
+(define_register_constraint "R24_2" "REGS_R24_R25" "")
+(define_register_constraint "R24_3" "REGS_R24_R26" "")
+(define_register_constraint "R24_4" "REGS_R24_R27" "")
+
+(define_register_constraint "R17_1" "REGS_R17" "")
+(define_register_constraint "R19_1" "REGS_R19" "")
+(define_register_constraint "R21_1" "REGS_R21" "")
+(define_register_constraint "R23_1" "REGS_R23" "")
+(define_register_constraint "R25_1" "REGS_R25" "")
+(define_register_constraint "R26_1" "REGS_R26" "")
+(define_register_constraint "R27_1" "REGS_R27" "")
+
+
 (define_register_constraint "t" "R0_REG"
   "Temporary register r0")
 
Index: avr.c
===================================================================
--- avr.c	(revision 222143)
+++ avr.c	(working copy)
@@ -327,6 +327,53 @@ avr_to_int_mode (rtx x)
     : simplify_gen_subreg (int_mode_for_mode (mode), x, mode, 0);
 }
 
+static const pass_data avr_pass_data_after_split1 =
+{
+  RTL_PASS,      // type
+  "",            // name
+  OPTGROUP_NONE, // optinfo_flags
+  TV_NONE,       // tv_id
+  0, // properties_required
+  0, // properties_provided
+  0, // properties_destroyed
+  0, // todo_flags_start
+  0, // todo_flags_finish
+};
+
+class avr_pass_after_split1 : public rtl_opt_pass
+{
+public:
+  avr_pass_after_split1 (gcc::context *ctxt, const char *name)
+    : rtl_opt_pass (avr_pass_data_after_split1, ctxt)
+  {
+    this->name = name;
+  }
+
+  virtual unsigned int execute (function*)
+  {
+    cfun->machine->split1_completed = true;
+    return 0;
+  }
+}; // class avr_pass_after_split1
+
+/* FIXME: We compose insns by means of insn combine and split them in split1.
+      We don't want IRA/reload to combine them to the original insns again
+      because that avoids some CSE optimizations if constants are involved
+      (or even results in unrecognizable insns).
+      If IRA/reload combines, the recombined insns get split again after
+      reload, but then CSE does not take place.
+         It appears that at present there is no other way to take away the
+      insns from IRA.  Notice that split1 runs unconditionally so that all our
+      insns will get split no matter of command line options.  */
+
+bool
+avr_split1_completed (void)
+{
+  return (cfun
+          && cfun->machine
+          && cfun->machine->split1_completed);
+}
+
 
 static const pass_data avr_pass_data_recompute_notes =
 {
@@ -364,6 +411,12 @@ public:
 static void
 avr_register_passes (void)
 {
+  /* Sole purpose of this machine specific pass is to set
+     `cfun->machine->split1_completed' to true after pass split1 completed.  */
+
+  register_pass (new avr_pass_after_split1 (g, "avr-after-split1"),
+                 PASS_POS_INSERT_AFTER, "split1", 1);
+
   /* This avr-specific pass (re)computes insn notes, in particular REG_DEAD
      notes which are used by `avr.c::reg_unused_after' and branch offset
      computations.  These notes must be correct, i.e. there must be no
@@ -551,12 +604,12 @@ avr_regno_reg_class (int r)
       NO_LD_REGS, NO_LD_REGS, NO_LD_REGS, NO_LD_REGS,
       NO_LD_REGS, NO_LD_REGS, NO_LD_REGS, NO_LD_REGS,
       /* r16 - r23 */
-      SIMPLE_LD_REGS, SIMPLE_LD_REGS, SIMPLE_LD_REGS, SIMPLE_LD_REGS,
-      SIMPLE_LD_REGS, SIMPLE_LD_REGS, SIMPLE_LD_REGS, SIMPLE_LD_REGS,
+	  REGS_R16, REGS_R17, REGS_R18, REGS_R19, 
+	  REGS_R20, REGS_R21, REGS_R22, REGS_R23, 
       /* r24, r25 */
-      ADDW_REGS, ADDW_REGS,
+	  REGS_R24, REGS_R25,
       /* X: r26, 27 */
-      POINTER_X_REGS, POINTER_X_REGS,
+	  REGS_R26, REGS_R27,
       /* Y: r28, r29 */
       POINTER_Y_REGS, POINTER_Y_REGS,
       /* Z: r30, r31 */
Index: avr.h
===================================================================
--- avr.h	(revision 222143)
+++ avr.h	(working copy)
@@ -217,9 +217,46 @@ These two properties are reflected by bu
 enum reg_class {
   NO_REGS,
   R0_REG,			/* r0 */
+
+  // 1 byte
+  REGS_R16,
+  REGS_R17,
+  REGS_R18,
+  REGS_R19,
+  REGS_R20,
+  REGS_R21,
+  REGS_R22,
+  REGS_R23,
+  REGS_R24,
+  REGS_R25,
+  REGS_R26,
+  REGS_R27,
+
+  // 2 bytes
+  REGS_R16_R17,
+  REGS_R18_R19,
+  REGS_R20_R21,
+  REGS_R22_R23,
+  REGS_R24_R25,
+
   POINTER_X_REGS,		/* r26 - r27 */
   POINTER_Y_REGS,		/* r28 - r29 */
   POINTER_Z_REGS,		/* r30 - r31 */
+
+  // 3 bytes
+  REGS_R16_R18,
+  REGS_R18_R20,
+  REGS_R20_R22,
+  REGS_R22_R24,
+  REGS_R24_R26,
+
+  // 4 bytes
+  REGS_R16_R19,
+  REGS_R18_R21,
+  REGS_R20_R23,
+  REGS_R22_R25,
+  REGS_R24_R27,
+
   STACK_REG,			/* STACK */
   BASE_POINTER_REGS,		/* r28 - r31 */
   POINTER_REGS,			/* r26 - r31 */
@@ -237,9 +274,44 @@ enum reg_class {
 #define REG_CLASS_NAMES {					\
 		 "NO_REGS",					\
 		   "R0_REG",	/* r0 */                        \
-		   "POINTER_X_REGS", /* r26 - r27 */		\
+  /* 1 byte */                                                  \
+  "REGS_R16",                                                   \
+  "REGS_R17",                                                   \
+  "REGS_R18",                                                   \
+  "REGS_R19",                                                   \
+  "REGS_R20",                                                   \
+  "REGS_R21",                                                   \
+  "REGS_R22",                                                   \
+  "REGS_R23",                                                   \
+  "REGS_R24",                                                   \
+  "REGS_R25",                                                   \
+  "REGS_R26",                                                   \
+  "REGS_R27",                                                   \
+                                                                \
+  /* 2 bytes */                                                 \
+  "REGS_R16_R17",                                               \
+  "REGS_R18_R19",                                               \
+  "REGS_R20_R21",                                               \
+  "REGS_R22_R23",                                               \
+  "REGS_R24_R25",                                               \
+                                                                \
+                   "POINTER_X_REGS", /* r26 - r27 */		\
 		   "POINTER_Y_REGS", /* r28 - r29 */		\
 		   "POINTER_Z_REGS", /* r30 - r31 */		\
+  /* 3 bytes */                                                 \
+  "REGS_R16_R18",                                               \
+  "REGS_R18_R20",                                               \
+  "REGS_R20_R22",                                               \
+  "REGS_R22_R24",                                               \
+  "REGS_R24_R26",                                               \
+                                                                \
+  /* 4 bytes */                                                 \
+  "REGS_R16_R19",                                               \
+  "REGS_R18_R21",                                               \
+  "REGS_R20_R23",                                               \
+  "REGS_R22_R25",                                               \
+  "REGS_R24_R27",                                               \
+                                                                \
 		   "STACK_REG",	/* STACK */			\
 		   "BASE_POINTER_REGS",	/* r28 - r31 */		\
 		   "POINTER_REGS", /* r26 - r31 */		\
@@ -253,9 +325,42 @@ enum reg_class {
 #define REG_CLASS_CONTENTS {						\
   {0x00000000,0x00000000},	/* NO_REGS */				\
   {0x00000001,0x00000000},	/* R0_REG */                            \
+                                                                        \
+  { 1u << 16, 0x0},                                                     \
+  { 1u << 17, 0x0},                                                     \
+  { 1u << 18, 0x0},                                                     \
+  { 1u << 19, 0x0},                                                     \
+  { 1u << 20, 0x0},                                                     \
+  { 1u << 21, 0x0},                                                     \
+  { 1u << 22, 0x0},                                                     \
+  { 1u << 23, 0x0},                                                     \
+  { 1u << 24, 0x0},                                                     \
+  { 1u << 25, 0x0},                                                     \
+  { 1u << 26, 0x0},                                                     \
+  { 1u << 27, 0x0},                                                     \
+                                                                        \
+  { 3u << 16, 0x0},                                                     \
+  { 3u << 18, 0x0},                                                     \
+  { 3u << 20, 0x0},                                                     \
+  { 3u << 22, 0x0},                                                     \
+  { 3u << 24, 0x0},                                                     \
+                                                                        \
   {3u << REG_X,0x00000000},     /* POINTER_X_REGS, r26 - r27 */		\
   {3u << REG_Y,0x00000000},     /* POINTER_Y_REGS, r28 - r29 */		\
   {3u << REG_Z,0x00000000},     /* POINTER_Z_REGS, r30 - r31 */		\
+                                                                        \
+  { 7u << 16, 0x0},                                                     \
+  { 7u << 18, 0x0},                                                     \
+  { 7u << 20, 0x0},                                                     \
+  { 7u << 22, 0x0},                                                     \
+  { 7u << 24, 0x0},                                                     \
+                                                                        \
+  { 15u << 16, 0x0},                                                    \
+  { 15u << 18, 0x0},                                                    \
+  { 15u << 20, 0x0},                                                    \
+  { 15u << 22, 0x0},                                                    \
+  { 15u << 24, 0x0},                                                    \
+                                                                        \
   {0x00000000,0x00000003},	/* STACK_REG, STACK */			\
   {(3u << REG_Y) | (3u << REG_Z),					\
      0x00000000},		/* BASE_POINTER_REGS, r28 - r31 */	\
@@ -559,6 +664,8 @@ struct GTY(()) machine_function
   /* 'true' if the above is_foo predicates are sanity-checked to avoid
      multiple diagnose for the same function.  */
   int attributes_checked_p;
+
+  bool split1_completed;
 };
 
 /* AVR does not round pushes, but the existence of this macro is
Index: predicates.md
===================================================================
--- predicates.md	(revision 222143)
+++ predicates.md	(working copy)
@@ -277,3 +277,18 @@ (define_predicate "const_or_immediate_op
   (ior (match_code "const_fixed")
        (match_code "const_double")
        (match_operand 0 "immediate_operand")))
+
+;; Used in "*mulsi3"
+(define_predicate "reg_or_u16_o16_operand"
+  (and (match_code ("reg,subreg,const_int"))
+       (match_test "!avr_split1_completed()")
+       (ior (match_operand 0 "register_operand")
+            (match_operand 0 "u16_operand")
+            (match_operand 0 "o16_operand"))))
+
+(define_predicate "reg_or_u16_s16_operand"
+  (and (match_code ("reg,subreg,const_int"))
+       (match_test "!avr_split1_completed()")
+       (ior (match_operand 0 "register_operand")
+            (match_operand 0 "u16_operand")
+            (match_operand 0 "s16_operand"))))
Index: avr.md
===================================================================
--- avr.md	(revision 222143)
+++ avr.md	(working copy)
@@ -2008,6 +2008,212 @@ (define_insn_and_split "*sumsubqihi4.uco
     operands[2] = gen_int_mode (INTVAL (operands[2]), QImode);
   })
 
+
+;;;32-bit mul ;;;;;;;;;;;;;;
+
+(define_expand "mulsi3"
+  [(set (match_operand:SI 0 "register_operand" "")
+        (mult:SI (match_operand:SI 1 "register_operand" "")
+                 (match_operand:SI 2 "nonmemory_operand" "")))]
+  "AVR_HAVE_MUL"
+  {
+    if (u16_operand (operands[2], SImode))
+      {
+        operands[2] = force_reg (HImode, gen_int_mode (INTVAL (operands[2]), HImode));
+        emit_insn (gen_muluhisi3_call (operands[0], operands[2], operands[1]));
+        DONE;
+      }
+
+    if (o16_operand (operands[2], SImode))
+      {
+        operands[2] = force_reg (HImode, gen_int_mode (INTVAL (operands[2]), HImode));
+        emit_insn (gen_mulohisi3_call (operands[0], operands[2], operands[1]));
+        DONE;
+      }
+  })
+
+(define_insn_and_split "*mulsi3"
+  [(set (match_operand:SI 0 "register_operand"               "=r")
+        (mult:SI (match_operand:SI 1 "register_operand"       "r")
+                 (match_operand:SI 2 "reg_or_u16_o16_operand" "r")))]
+  "AVR_HAVE_MUL
+   && !avr_split1_completed()"
+  { gcc_unreachable(); }
+  "&& 1"
+  [;; "*mulsi3_call"
+   (parallel [(set (match_dup 0)
+                   (mult:SI (match_dup 1)
+                            (match_dup 2)))
+              (set (match_dup 3)
+                   (unspec:HI [(match_dup 1)
+                               (match_dup 2)] UNSPEC_IDENTITY))])]
+  {
+    if (u16_operand (operands[2], SImode))
+      {
+        operands[2] = gen_int_mode (INTVAL (operands[2]), HImode);
+        operands[2] = force_reg (HImode, operands[2]);
+        emit_insn (gen_muluhisi3_call (operands[0], operands[2], operands[1]));
+        DONE;
+      }
+
+    if (o16_operand (operands[2], SImode))
+      {
+        operands[2] = gen_int_mode (INTVAL (operands[2]), HImode);
+        operands[2] = force_reg (HImode, operands[2]);
+        emit_insn (gen_mulohisi3_call (operands[0], operands[2], operands[1]));
+        DONE;
+      }
+
+    gcc_assert (register_operand (operands[2], SImode));
+    operands[3] = gen_reg_rtx (HImode);
+  })
+
+;; "*mulsqisi3"
+;; "*muluqisi3"
+(define_insn_and_split "*mul<extend_su>qisi3"
+  [(set (match_operand:SI 0 "register_operand"                         "=r")
+        (mult:SI (any_extend:SI (match_operand:QI 1 "register_operand"  "r"))
+                 (match_operand:SI 2 ""                                 "r")))]
+  "AVR_HAVE_MUL
+   && !avr_split1_completed()
+   && (reg_or_u16_s16_operand (operands[2], SImode)
+       || ((ZERO_EXTEND == GET_CODE (operands[2])
+            || SIGN_EXTEND == GET_CODE (operands[2]))
+           && register_operand (XEXP (operands[2], 0), HImode)))"
+  { gcc_unreachable(); }
+  "&& 1"
+  [;; "extendqihi2"
+   ;; "zero_extendqihi2"
+   (set (match_dup 3)
+        (any_extend:HI (match_dup 1)))
+   ;; "*muluhisi3_call"
+   ;; "*mulshisi3_call"
+
+   ;; "*umulhisi3_call"
+   ;; "*mulhisi3_call"
+   ;; "*usmulhisi3_call"
+   ;; "*sumulhisi3_call"
+   (set (match_dup 0)
+        (mult:SI (any_extend:SI (match_dup 3))
+                 (match_dup 4)))]
+  {
+    operands[3] = gen_reg_rtx (HImode);
+    operands[4] = operands[2];
+
+    if (u16_operand (operands[2], SImode))
+      {
+        rtx xval = gen_int_mode (INTVAL (operands[2]), HImode);
+        operands[4] = gen_rtx_ZERO_EXTEND (SImode, force_reg (HImode, xval));
+      }
+    else if (s16_operand (operands[2], SImode))
+      {
+        rtx xval = gen_int_mode (INTVAL (operands[2]), HImode);
+        operands[4] = gen_rtx_SIGN_EXTEND (SImode, force_reg (HImode, xval));
+      }
+  })
+
+(define_insn "*mulsi3_call"
+  [(set (match_operand:SI 0 "register_operand"          "=R22_4")
+        (mult:SI (match_operand:SI 1 "register_operand" "%R22_4")
+                 (match_operand:SI 2 "register_operand"  "R18_4")))
+   ;; Just a clobber actually
+   (set (match_operand:HI 3 "register_operand"          "=x") ;; R26_2
+        (unspec:HI [(match_dup 1)
+                    (match_dup 2)] UNSPEC_IDENTITY))]
+  "AVR_HAVE_MUL"
+  "%~call __mulsi3"
+  [(set_attr "type" "xcall")
+   (set_attr "cc" "clobber")])
+
+;; "mulhisi3"
+;; "umulhisi3"
+(define_expand "<extend_u>mulhisi3"
+  [(set (match_operand:SI 0 "register_operand" "")
+        (mult:SI (any_extend:SI (match_operand:HI 1 "register_operand" ""))
+                 (any_extend:SI (match_operand:HI 2 "register_operand" ""))))]
+  "AVR_HAVE_MUL")
+
+;; "*mulhisi3_call"
+;; "*umulhisi3_call"
+(define_insn "*<extend_u>mulhisi3_call"
+  [(set (match_operand:SI 0 "register_operand"                         "=R22_4")
+        (mult:SI (any_extend:SI (match_operand:HI 1 "register_operand" "%R18_2"))
+                 (any_extend:SI (match_operand:HI 2 "register_operand"  "x"))))] ;; R26_2
+  "AVR_HAVE_MUL"
+  "%~call __<extend_u>mulhisi3"
+  [(set_attr "type" "xcall")
+   (set_attr "cc" "clobber")])
+
+(define_insn "*usmulhisi3_call"
+  [(set (match_operand:SI 0 "register_operand"                         "=R22_4")
+        (mult:SI (zero_extend:SI (match_operand:HI 1 "register_operand" "R18_2"))
+                 (sign_extend:SI (match_operand:HI 2 "register_operand" "x"))))] ;; R26_2
+  "AVR_HAVE_MUL"
+  "%~call __usmulhisi3"
+  [(set_attr "type" "xcall")
+   (set_attr "cc" "clobber")])
+
+(define_insn "*sumulhisi3_call"
+  [(set (match_operand:SI 0 "register_operand"                         "=R22_4")
+        (mult:SI (sign_extend:SI (match_operand:HI 2 "register_operand" "x")) ;; R26_2
+                 (zero_extend:SI (match_operand:HI 1 "register_operand" "R18_2"))))]
+  "AVR_HAVE_MUL"
+  "%~call __usmulhisi3"
+  [(set_attr "type" "xcall")
+   (set_attr "cc" "clobber")])
+
+(define_insn "mul<extend_su>hisi3_call"
+  [(set (match_operand:SI 0 "register_operand"                        "=R22_4")
+        (mult:SI (any_extend:SI (match_operand:HI 1 "register_operand" "x")) ;; R26_2
+                 (match_operand:SI 2 "register_operand"                "R18_4")))]
+  "AVR_HAVE_MUL"
+  "%~call __mul<extend_su>hisi3"
+  [(set_attr "type" "xcall")
+   (set_attr "cc" "clobber")])
+
+(define_insn "mulohisi3_call"
+  [(set (match_operand:SI 0 "register_operand"                                         "=R22_4")
+        (mult:SI (not:SI (zero_extend:SI (not:HI (match_operand:HI 1 "register_operand" "x")))) ;; R26_2
+                 (match_operand:SI 2 "register_operand"                                 "R18_4")))]
+  "AVR_HAVE_MUL"
+  "%~call __mulohisi3"
+  [(set_attr "type" "xcall")
+   (set_attr "cc" "clobber")])
+
+;; "smulhi3_highpart"
+;; "umulhi3_highpart"
+(define_expand "<extend_su>mulhi3_highpart"
+  [(parallel
+    [(set (match_operand:HI 0 "register_operand" "")
+          (truncate:HI (lshiftrt:SI (mult:SI (any_extend:SI (match_operand:HI 1 "register_operand" ""))
+                                             (any_extend:SI (match_operand:HI 2 "register_operand" "")))
+                                  (const_int 16))))
+     (set (match_dup 3)
+          (unspec:HI [(match_dup 1)
+                      (match_dup 2)] UNSPEC_IDENTITY))])]
+  "AVR_HAVE_MUL"
+  {
+    operands[3] = gen_reg_rtx (HImode);
+  })
+
+
+;; "*umulhi3_highpart_call"
+;; "*smulhi3_highpart_call"
+(define_insn "*<extend_su>mulhi3_highpart_call"
+  [(set (match_operand:HI 0 "register_operand"                                                   "=R24_2")
+        (truncate:HI (lshiftrt:SI (mult:SI (any_extend:SI (match_operand:HI 1 "register_operand" "%R18_2"))
+                                           (any_extend:SI (match_operand:HI 2 "register_operand"  "x"))) ;; R26_2
+                                  (const_int 16))))
+   (set (match_operand:HI 3 "register_operand"                                                   "=R22_2")
+        ;; Just a clobber actually
+        (unspec:HI [(match_dup 1)
+                    (match_dup 2)] UNSPEC_IDENTITY))]
+  "AVR_HAVE_MUL"
+  "%~call __<extend_u>mulhisi3"
+  [(set_attr "type" "xcall")
+   (set_attr "cc" "clobber")])
+
+
 ;******************************************************************************
 ; mul HI: $1 = sign/zero-extend, $2 = small constant
 ;******************************************************************************
@@ -2283,386 +2489,6 @@ (define_insn "*mulhi3_call"
   [(set_attr "type" "xcall")
    (set_attr "cc" "clobber")])
 
-;; To support widening multiplication with constant we postpone
-;; expanding to the implicit library call until post combine and
-;; prior to register allocation.  Clobber all hard registers that
-;; might be used by the (widening) multiply until it is split and
-;; it's final register footprint is worked out.
-
-(define_expand "mulsi3"
-  [(parallel [(set (match_operand:SI 0 "register_operand" "")
-                   (mult:SI (match_operand:SI 1 "register_operand" "")
-                            (match_operand:SI 2 "nonmemory_operand" "")))
-              (clobber (reg:HI 26))
-              (clobber (reg:DI 18))])]
-  "AVR_HAVE_MUL"
-  {
-    if (u16_operand (operands[2], SImode))
-      {
-        operands[2] = force_reg (HImode, gen_int_mode (INTVAL (operands[2]), HImode));
-        emit_insn (gen_muluhisi3 (operands[0], operands[2], operands[1]));
-        DONE;
-      }
-
-    if (o16_operand (operands[2], SImode))
-      {
-        operands[2] = force_reg (HImode, gen_int_mode (INTVAL (operands[2]), HImode));
-        emit_insn (gen_mulohisi3 (operands[0], operands[2], operands[1]));
-        DONE;
-      }
-
-    if (avr_emit3_fix_outputs (gen_mulsi3, operands, 1 << 0,
-                               regmask (DImode, 18) | regmask (HImode, 26)))
-      DONE;
-  })
-
-(define_insn_and_split "*mulsi3"
-  [(set (match_operand:SI 0 "pseudo_register_operand"                      "=r")
-        (mult:SI (match_operand:SI 1 "pseudo_register_operand"              "r")
-                 (match_operand:SI 2 "pseudo_register_or_const_int_operand" "rn")))
-   (clobber (reg:HI 26))
-   (clobber (reg:DI 18))]
-  "AVR_HAVE_MUL && !reload_completed"
-  { gcc_unreachable(); }
-  "&& 1"
-  [(set (reg:SI 18)
-        (match_dup 1))
-   (set (reg:SI 22)
-        (match_dup 2))
-   (parallel [(set (reg:SI 22)
-                   (mult:SI (reg:SI 22)
-                            (reg:SI 18)))
-              (clobber (reg:HI 26))])
-   (set (match_dup 0)
-        (reg:SI 22))]
-  {
-    if (u16_operand (operands[2], SImode))
-      {
-        operands[2] = force_reg (HImode, gen_int_mode (INTVAL (operands[2]), HImode));
-        emit_insn (gen_muluhisi3 (operands[0], operands[2], operands[1]));
-        DONE;
-      }
-
-    if (o16_operand (operands[2], SImode))
-      {
-        operands[2] = force_reg (HImode, gen_int_mode (INTVAL (operands[2]), HImode));
-        emit_insn (gen_mulohisi3 (operands[0], operands[2], operands[1]));
-        DONE;
-      }
-  })
-
-;; "muluqisi3"
-;; "muluhisi3"
-(define_expand "mulu<mode>si3"
-  [(parallel [(set (match_operand:SI 0 "pseudo_register_operand" "")
-                   (mult:SI (zero_extend:SI (match_operand:QIHI 1 "pseudo_register_operand" ""))
-                            (match_operand:SI 2 "pseudo_register_or_const_int_operand" "")))
-              (clobber (reg:HI 26))
-              (clobber (reg:DI 18))])]
-  "AVR_HAVE_MUL"
-  {
-    avr_fix_inputs (operands, (1 << 1) | (1 << 2), -1u);
-    if (avr_emit3_fix_outputs (gen_mulu<mode>si3, operands, 1 << 0,
-                               regmask (DImode, 18) | regmask (HImode, 26)))
-      DONE;
-  })
-
-;; "*muluqisi3"
-;; "*muluhisi3"
-(define_insn_and_split "*mulu<mode>si3"
-  [(set (match_operand:SI 0 "pseudo_register_operand"                           "=r")
-        (mult:SI (zero_extend:SI (match_operand:QIHI 1 "pseudo_register_operand" "r"))
-                 (match_operand:SI 2 "pseudo_register_or_const_int_operand"      "rn")))
-   (clobber (reg:HI 26))
-   (clobber (reg:DI 18))]
-  "AVR_HAVE_MUL && !reload_completed"
-  { gcc_unreachable(); }
-  "&& 1"
-  [(set (reg:HI 26)
-        (match_dup 1))
-   (set (reg:SI 18)
-        (match_dup 2))
-   (set (reg:SI 22)
-        (mult:SI (zero_extend:SI (reg:HI 26))
-                 (reg:SI 18)))
-   (set (match_dup 0)
-        (reg:SI 22))]
-  {
-    /* Do the QI -> HI extension explicitely before the multiplication.  */
-    /* Do the HI -> SI extension implicitely and after the multiplication.  */
-
-    if (QImode == <MODE>mode)
-      operands[1] = gen_rtx_ZERO_EXTEND (HImode, operands[1]);
-
-    if (u16_operand (operands[2], SImode))
-      {
-        operands[1] = force_reg (HImode, operands[1]);
-        operands[2] = force_reg (HImode, gen_int_mode (INTVAL (operands[2]), HImode));
-        emit_insn (gen_umulhisi3 (operands[0], operands[1], operands[2]));
-        DONE;
-      }
-  })
-
-;; "mulsqisi3"
-;; "mulshisi3"
-(define_expand "muls<mode>si3"
-  [(parallel [(set (match_operand:SI 0 "pseudo_register_operand" "")
-                   (mult:SI (sign_extend:SI (match_operand:QIHI 1 "pseudo_register_operand" ""))
-                            (match_operand:SI 2 "pseudo_register_or_const_int_operand" "")))
-              (clobber (reg:HI 26))
-              (clobber (reg:DI 18))])]
-  "AVR_HAVE_MUL"
-  {
-    avr_fix_inputs (operands, (1 << 1) | (1 << 2), -1u);
-    if (avr_emit3_fix_outputs (gen_muls<mode>si3, operands, 1 << 0,
-                               regmask (DImode, 18) | regmask (HImode, 26)))
-      DONE;
-  })
-
-;; "*mulsqisi3"
-;; "*mulshisi3"
-(define_insn_and_split "*muls<mode>si3"
-  [(set (match_operand:SI 0 "pseudo_register_operand"                           "=r")
-        (mult:SI (sign_extend:SI (match_operand:QIHI 1 "pseudo_register_operand" "r"))
-                 (match_operand:SI 2 "pseudo_register_or_const_int_operand"      "rn")))
-   (clobber (reg:HI 26))
-   (clobber (reg:DI 18))]
-  "AVR_HAVE_MUL && !reload_completed"
-  { gcc_unreachable(); }
-  "&& 1"
-  [(set (reg:HI 26)
-        (match_dup 1))
-   (set (reg:SI 18)
-        (match_dup 2))
-   (set (reg:SI 22)
-        (mult:SI (sign_extend:SI (reg:HI 26))
-                 (reg:SI 18)))
-   (set (match_dup 0)
-        (reg:SI 22))]
-  {
-    /* Do the QI -> HI extension explicitely before the multiplication.  */
-    /* Do the HI -> SI extension implicitely and after the multiplication.  */
-
-    if (QImode == <MODE>mode)
-      operands[1] = gen_rtx_SIGN_EXTEND (HImode, operands[1]);
-
-    if (u16_operand (operands[2], SImode)
-        || s16_operand (operands[2], SImode))
-      {
-        rtx xop2 = force_reg (HImode, gen_int_mode (INTVAL (operands[2]), HImode));
-
-        operands[1] = force_reg (HImode, operands[1]);
-
-        if (u16_operand (operands[2], SImode))
-          emit_insn (gen_usmulhisi3 (operands[0], xop2, operands[1]));
-        else
-          emit_insn (gen_mulhisi3 (operands[0], operands[1], xop2));
-
-        DONE;
-      }
-  })
-
-;; One-extend operand 1
-
-(define_expand "mulohisi3"
-  [(parallel [(set (match_operand:SI 0 "pseudo_register_operand" "")
-                   (mult:SI (not:SI (zero_extend:SI
-                                     (not:HI (match_operand:HI 1 "pseudo_register_operand" ""))))
-                            (match_operand:SI 2 "pseudo_register_or_const_int_operand" "")))
-              (clobber (reg:HI 26))
-              (clobber (reg:DI 18))])]
-  "AVR_HAVE_MUL"
-  {
-    avr_fix_inputs (operands, (1 << 1) | (1 << 2), -1u);
-    if (avr_emit3_fix_outputs (gen_mulohisi3, operands, 1 << 0,
-                               regmask (DImode, 18) | regmask (HImode, 26)))
-      DONE;
-  })
-
-(define_insn_and_split "*mulohisi3"
-  [(set (match_operand:SI 0 "pseudo_register_operand"                          "=r")
-        (mult:SI (not:SI (zero_extend:SI
-                          (not:HI (match_operand:HI 1 "pseudo_register_operand" "r"))))
-                 (match_operand:SI 2 "pseudo_register_or_const_int_operand"     "rn")))
-   (clobber (reg:HI 26))
-   (clobber (reg:DI 18))]
-  "AVR_HAVE_MUL && !reload_completed"
-  { gcc_unreachable(); }
-  "&& 1"
-  [(set (reg:HI 26)
-        (match_dup 1))
-   (set (reg:SI 18)
-        (match_dup 2))
-   (set (reg:SI 22)
-        (mult:SI (not:SI (zero_extend:SI (not:HI (reg:HI 26))))
-                 (reg:SI 18)))
-   (set (match_dup 0)
-        (reg:SI 22))])
-
-;; "mulhisi3"
-;; "umulhisi3"
-(define_expand "<extend_u>mulhisi3"
-  [(parallel [(set (match_operand:SI 0 "register_operand" "")
-                   (mult:SI (any_extend:SI (match_operand:HI 1 "register_operand" ""))
-                            (any_extend:SI (match_operand:HI 2 "register_operand" ""))))
-              (clobber (reg:HI 26))
-              (clobber (reg:DI 18))])]
-  "AVR_HAVE_MUL"
-  {
-    if (avr_emit3_fix_outputs (gen_<extend_u>mulhisi3, operands, 1 << 0,
-                               regmask (DImode, 18) | regmask (HImode, 26)))
-      DONE;
-  })
-
-(define_expand "usmulhisi3"
-  [(parallel [(set (match_operand:SI 0 "register_operand" "")
-                   (mult:SI (zero_extend:SI (match_operand:HI 1 "register_operand" ""))
-                            (sign_extend:SI (match_operand:HI 2 "register_operand" ""))))
-              (clobber (reg:HI 26))
-              (clobber (reg:DI 18))])]
-  "AVR_HAVE_MUL"
-  {
-    if (avr_emit3_fix_outputs (gen_usmulhisi3, operands, 1 << 0,
-                               regmask (DImode, 18) | regmask (HImode, 26)))
-      DONE;
-  })
-
-;; "*uumulqihisi3" "*uumulhiqisi3" "*uumulhihisi3" "*uumulqiqisi3"
-;; "*usmulqihisi3" "*usmulhiqisi3" "*usmulhihisi3" "*usmulqiqisi3"
-;; "*sumulqihisi3" "*sumulhiqisi3" "*sumulhihisi3" "*sumulqiqisi3"
-;; "*ssmulqihisi3" "*ssmulhiqisi3" "*ssmulhihisi3" "*ssmulqiqisi3"
-(define_insn_and_split
-  "*<any_extend:extend_su><any_extend2:extend_su>mul<QIHI:mode><QIHI2:mode>si3"
-  [(set (match_operand:SI 0 "pseudo_register_operand"                            "=r")
-        (mult:SI (any_extend:SI (match_operand:QIHI 1 "pseudo_register_operand"   "r"))
-                 (any_extend2:SI (match_operand:QIHI2 2 "pseudo_register_operand" "r"))))
-   (clobber (reg:HI 26))
-   (clobber (reg:DI 18))]
-  "AVR_HAVE_MUL && !reload_completed"
-  { gcc_unreachable(); }
-  "&& 1"
-  [(set (reg:HI 18)
-        (match_dup 1))
-   (set (reg:HI 26)
-        (match_dup 2))
-   (set (reg:SI 22)
-        (mult:SI (match_dup 3)
-                 (match_dup 4)))
-   (set (match_dup 0)
-        (reg:SI 22))]
-  {
-    rtx xop1 = operands[1];
-    rtx xop2 = operands[2];
-
-    /* Do the QI -> HI extension explicitely before the multiplication.  */
-    /* Do the HI -> SI extension implicitely and after the multiplication.  */
-
-    if (QImode == <QIHI:MODE>mode)
-      xop1 = gen_rtx_fmt_e (<any_extend:CODE>, HImode, xop1);
-
-    if (QImode == <QIHI2:MODE>mode)
-      xop2 = gen_rtx_fmt_e (<any_extend2:CODE>, HImode, xop2);
-
-    if (<any_extend:CODE> == <any_extend2:CODE>
-        || <any_extend:CODE> == ZERO_EXTEND)
-      {
-        operands[1] = xop1;
-        operands[2] = xop2;
-        operands[3] = gen_rtx_fmt_e (<any_extend:CODE>, SImode, gen_rtx_REG (HImode, 18));
-        operands[4] = gen_rtx_fmt_e (<any_extend2:CODE>, SImode, gen_rtx_REG (HImode, 26));
-      }
-    else
-      {
-        /* <any_extend:CODE>  = SIGN_EXTEND */
-        /* <any_extend2:CODE> = ZERO_EXTEND */
-
-        operands[1] = xop2;
-        operands[2] = xop1;
-        operands[3] = gen_rtx_ZERO_EXTEND (SImode, gen_rtx_REG (HImode, 18));
-        operands[4] = gen_rtx_SIGN_EXTEND (SImode, gen_rtx_REG (HImode, 26));
-      }
-  })
-
-;; "smulhi3_highpart"
-;; "umulhi3_highpart"
-(define_expand "<extend_su>mulhi3_highpart"
-  [(set (reg:HI 18)
-        (match_operand:HI 1 "nonmemory_operand" ""))
-   (set (reg:HI 26)
-        (match_operand:HI 2 "nonmemory_operand" ""))
-   (parallel [(set (reg:HI 24)
-                   (truncate:HI (lshiftrt:SI (mult:SI (any_extend:SI (reg:HI 18))
-                                                      (any_extend:SI (reg:HI 26)))
-                                             (const_int 16))))
-              (clobber (reg:HI 22))])
-   (set (match_operand:HI 0 "register_operand" "")
-        (reg:HI 24))]
-  "AVR_HAVE_MUL"
-  {
-    avr_fix_inputs (operands, 1 << 2, regmask (HImode, 18));
-  })
-
-
-(define_insn "*mulsi3_call"
-  [(set (reg:SI 22)
-        (mult:SI (reg:SI 22)
-                 (reg:SI 18)))
-   (clobber (reg:HI 26))]
-  "AVR_HAVE_MUL"
-  "%~call __mulsi3"
-  [(set_attr "type" "xcall")
-   (set_attr "cc" "clobber")])
-
-;; "*mulhisi3_call"
-;; "*umulhisi3_call"
-(define_insn "*<extend_u>mulhisi3_call"
-  [(set (reg:SI 22)
-        (mult:SI (any_extend:SI (reg:HI 18))
-                 (any_extend:SI (reg:HI 26))))]
-  "AVR_HAVE_MUL"
-  "%~call __<extend_u>mulhisi3"
-  [(set_attr "type" "xcall")
-   (set_attr "cc" "clobber")])
-
-;; "*umulhi3_highpart_call"
-;; "*smulhi3_highpart_call"
-(define_insn "*<extend_su>mulhi3_highpart_call"
-  [(set (reg:HI 24)
-        (truncate:HI (lshiftrt:SI (mult:SI (any_extend:SI (reg:HI 18))
-                                           (any_extend:SI (reg:HI 26)))
-                                  (const_int 16))))
-   (clobber (reg:HI 22))]
-  "AVR_HAVE_MUL"
-  "%~call __<extend_u>mulhisi3"
-  [(set_attr "type" "xcall")
-   (set_attr "cc" "clobber")])
-
-(define_insn "*usmulhisi3_call"
-  [(set (reg:SI 22)
-        (mult:SI (zero_extend:SI (reg:HI 18))
-                 (sign_extend:SI (reg:HI 26))))]
-  "AVR_HAVE_MUL"
-  "%~call __usmulhisi3"
-  [(set_attr "type" "xcall")
-   (set_attr "cc" "clobber")])
-
-(define_insn "*mul<extend_su>hisi3_call"
-  [(set (reg:SI 22)
-        (mult:SI (any_extend:SI (reg:HI 26))
-                 (reg:SI 18)))]
-  "AVR_HAVE_MUL"
-  "%~call __mul<extend_su>hisi3"
-  [(set_attr "type" "xcall")
-   (set_attr "cc" "clobber")])
-
-(define_insn "*mulohisi3_call"
-  [(set (reg:SI 22)
-        (mult:SI (not:SI (zero_extend:SI (not:HI (reg:HI 26))))
-                 (reg:SI 18)))]
-  "AVR_HAVE_MUL"
-  "%~call __mulohisi3"
-  [(set_attr "type" "xcall")
-   (set_attr "cc" "clobber")])
 
 ; / % / % / % / % / % / % / % / % / % / % / % / % / % / % / % / % / % / % / %
 ; divmod
