This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



[Patch,AVR]: Built-in for non-contiguous port layouts


This patch set removes the built-ins __builtin_avr_map8 and __builtin_avr_map16
and implements a built-in __builtin_avr_insert_bits instead.

There are several reasons for this:

* From user feedback I learned that speed matters more than size here.

* I found that the new built-in is more usable and a better fit for
  the intended use cases.

* Better code is generated by implementing the TARGET_FOLD_BUILTIN hook.

* The implementation is simpler (except the new folding part).

* There were issues with __builtin_avr_map*.  Instead of fixing these,
  I went ahead and removed them altogether.

* The new built-in is generic enough to provide the old ones'
  functionality easily.

There are two new test programs for this built-in; both pass.

Ok for trunk?

Johann


gcc/doc/
	* extend.texi (AVR Built-in Functions): Remove doc for
	__builtin_avr_map8, __builtin_avr_map16.
	Document __builtin_avr_insert_bits.

gcc/testsuite/
	* gcc.target/avr/torture/builtin_insert_bits-1.c: New test.
	* gcc.target/avr/torture/builtin_insert_bits-2.c: New test.

gcc/
	* config/avr/avr.md (map_bitsqi, map_bitshi): Remove.
	(insert_bits): New insn.
	(adjust_len.map_bits): Rename to insert_bits.
	(UNSPEC_MAP_BITS): Rename to UNSPEC_INSERT_BITS.

	* config/avr/avr-protos.h (avr_out_map_bits): Remove.
	(avr_out_insert_bits, avr_has_nibble_0xf): New.

	* config/avr/constraints.md (Cxf,C0f): New.

	* config/avr/avr-c.c (avr_cpu_cpp_builtins): Remove built-in
	defines __BUILTIN_AVR_MAP8, __BUILTIN_AVR_MAP16.
	New built-in define __BUILTIN_AVR_INSERT_BITS.

	* config/avr/avr.c (TARGET_FOLD_BUILTIN): New define.
	(enum avr_builtin_id): Add AVR_BUILTIN_INSERT_BITS.
	(avr_move_bits): Rewrite.
	(avr_fold_builtin, avr_map_metric, avr_map_decompose): New static
	functions.
	(avr_map_op_t): New typedef.
	(avr_map_op): New static variable.
	(avr_out_insert_bits, avr_has_nibble_0xf): New functions.
	(adjust_insn_length): Handle ADJUST_LEN_INSERT_BITS.
	(avr_init_builtins): Add definition for __builtin_avr_insert_bits.
	(bdesc_3arg, avr_expand_triop_builtin): New.
	(avr_expand_builtin): Use them. And handle AVR_BUILTIN_INSERT_BITS.

	(avr_revert_map, avr_swap_map, avr_id_map, avr_sig_map): Remove.
	(avr_map_hamming_byte, avr_map_hamming_nonstrict): Remove.
	(avr_map_equal_p, avr_map_sig_p): Remove.
	(avr_out_swap_bits, avr_out_revert_bits, avr_out_map_bits): Remove.
	(bdesc_2arg): Remove AVR_BUILTIN_MAP8, AVR_BUILTIN_MAP16.
	(adjust_insn_length): Remove handling for ADJUST_LEN_MAP_BITS.
	(enum avr_builtin_id): Remove AVR_BUILTIN_MAP8, AVR_BUILTIN_MAP16.
	(avr_init_builtins): Remove __builtin_avr_map8, __builtin_avr_map16.
	(avr_expand_builtin): Remove AVR_BUILTIN_MAP8, AVR_BUILTIN_MAP16.
Index: doc/extend.texi
===================================================================
--- doc/extend.texi	(revision 184156)
+++ doc/extend.texi	(working copy)
@@ -8810,33 +8810,53 @@ might increase delay time. @code{ticks}
 integer constant; delays with a variable number of cycles are not supported.
 
 @smallexample
-     unsigned char __builtin_avr_map8 (unsigned long map, unsigned char val)
+     unsigned char __builtin_avr_insert_bits (unsigned long map, unsigned char bits, unsigned char val)
 @end smallexample
 
 @noindent
-Each bit of the result is copied from a specific bit of @code{val}.
-@code{map} is a compile time constant that represents a map composed
-of 8 nibbles (4-bit groups):
-The @var{n}-th nibble of @code{map} specifies which bit of @code{val}
-is to be moved to the @var{n}-th bit of the result.
-For example, @code{map = 0x76543210} represents identity: The MSB of
-the result is read from the 7-th bit of @code{val}, the LSB is
-read from the 0-th bit to @code{val}, etc.
-Two more examples: @code{0x01234567} reverses the bit order and
-@code{0x32107654} is equivalent to a @code{swap} instruction.
+Insert bits from @var{bits} into @var{val} and return the resulting
+value. The nibbles of @var{map} determine how the insertion is
+performed. Let @var{X} be the @var{n}-th nibble of @var{map}:
+@enumerate
+@item If @var{X} is @code{0xf},
+then the @var{n}-th bit of @var{val} is returned unaltered.
+
+@item If @var{X} is in the range 0@dots{}7,
+then the @var{n}-th result bit is set to the @var{X}-th bit of @var{bits}.
+
+@item If @var{X} is in the range 8@dots{}@code{0xe},
+then the @var{n}-th result bit is undefined.
+@end enumerate
 
 @noindent
-One typical use case for this and the following built-in is adjusting input and
-output values to non-contiguous port layouts.
+One typical use case for this built-in is adjusting input and
+output values to non-contiguous port layouts. Some examples:
 
 @smallexample
-     unsigned int __builtin_avr_map16 (unsigned long long map, unsigned int val)
+// same as val, bits is unused
+__builtin_avr_insert_bits (0xffffffff, bits, val)
 @end smallexample
 
-@noindent
-Similar to the previous built-in except that it operates on @code{int}
-and thus 16 bits are involved.  Again, @code{map} must be a compile
-time constant.
+@smallexample
+// same as bits, val is unused
+__builtin_avr_insert_bits (0x76543210, bits, val)
+@end smallexample
+
+@smallexample
+// same as rotating bits by 4
+__builtin_avr_insert_bits (0x32107654, bits, 0)
+@end smallexample
+
+@smallexample
+// high-nibble of result is the high-nibble of val
+// low-nibble of result is the low-nibble of bits
+__builtin_avr_insert_bits (0xffff3210, bits, val)
+@end smallexample
+
+@smallexample
+// reverse the bit order of bits
+__builtin_avr_insert_bits (0x01234567, bits, 0)
+@end smallexample
 
 @node Blackfin Built-in Functions
 @subsection Blackfin Built-in Functions
Index: config/avr/avr.md
===================================================================
--- config/avr/avr.md	(revision 184156)
+++ config/avr/avr.md	(working copy)
@@ -68,7 +68,7 @@ (define_c_enum "unspec"
    UNSPEC_FMULSU
    UNSPEC_COPYSIGN
    UNSPEC_IDENTITY
-   UNSPEC_MAP_BITS
+   UNSPEC_INSERT_BITS
    ])
 
 (define_c_enum "unspecv"
@@ -144,7 +144,7 @@ (define_attr "adjust_len"
    ashlhi, ashrhi, lshrhi,
    ashlsi, ashrsi, lshrsi,
    ashlpsi, ashrpsi, lshrpsi,
-   map_bits,
+   insert_bits,
    no"
   (const_string "no"))
 
@@ -5264,28 +5264,20 @@ (define_insn "delay_cycles_4"
   [(set_attr "length" "9")
    (set_attr "cc" "clobber")])
 
-(define_insn "map_bitsqi"
-  [(set (match_operand:QI 0 "register_operand"             "=d")
-        (unspec:QI [(match_operand:SI 1 "const_int_operand" "n")
-                    (match_operand:QI 2 "register_operand"  "r")]
-                   UNSPEC_MAP_BITS))]
-  ""
-  {
-    return avr_out_map_bits (insn, operands, NULL);
-  }
-  [(set_attr "adjust_len" "map_bits")
-   (set_attr "cc" "clobber")])
 
-(define_insn "map_bitshi"
-  [(set (match_operand:HI 0 "register_operand"               "=&r")
-        (unspec:HI [(match_operand:DI 1 "const_double_operand" "n")
-                    (match_operand:HI 2 "register_operand"     "r")]
-                   UNSPEC_MAP_BITS))]
+;; __builtin_avr_insert_bits
+
+(define_insn "insert_bits"
+  [(set (match_operand:QI 0 "register_operand"              "=r  ,d  ,r")
+        (unspec:QI [(match_operand:SI 1 "const_int_operand"  "C0f,Cxf,C0f")
+                    (match_operand:QI 2 "register_operand"   "r  ,r  ,r")
+                    (match_operand:QI 3 "nonmemory_operand"  "n  ,0  ,0")]
+                   UNSPEC_INSERT_BITS))]
   ""
   {
-    return avr_out_map_bits (insn, operands, NULL);
+    return avr_out_insert_bits (operands, NULL);
   }
-  [(set_attr "adjust_len" "map_bits")
+  [(set_attr "adjust_len" "insert_bits")
    (set_attr "cc" "clobber")])
 
 
Index: config/avr/avr-c.c
===================================================================
--- config/avr/avr-c.c	(revision 184156)
+++ config/avr/avr-c.c	(working copy)
@@ -163,8 +163,7 @@ avr_cpu_cpp_builtins (struct cpp_reader
   cpp_define (pfile, "__BUILTIN_AVR_WDR");
   cpp_define (pfile, "__BUILTIN_AVR_SLEEP");
   cpp_define (pfile, "__BUILTIN_AVR_SWAP");
-  cpp_define (pfile, "__BUILTIN_AVR_MAP8");
-  cpp_define (pfile, "__BUILTIN_AVR_MAP16");
+  cpp_define (pfile, "__BUILTIN_AVR_INSERT_BITS");
   cpp_define (pfile, "__BUILTIN_AVR_DELAY_CYCLES");
 
   cpp_define (pfile, "__BUILTIN_AVR_FMUL");
Index: config/avr/avr-protos.h
===================================================================
--- config/avr/avr-protos.h	(revision 184156)
+++ config/avr/avr-protos.h	(working copy)
@@ -94,8 +94,9 @@ extern const char* avr_out_plus64 (rtx,
 extern const char* avr_out_addto_sp (rtx*, int*);
 extern const char* avr_out_xload (rtx, rtx*, int*);
 extern const char* avr_out_movmem (rtx, rtx*, int*);
-extern const char* avr_out_map_bits (rtx, rtx*, int*);
+extern const char* avr_out_insert_bits (rtx*, int*);
 extern bool avr_popcount_each_byte (rtx, int, int);
+extern bool avr_has_nibble_0xf (rtx);
 
 extern int extra_constraint_Q (rtx x);
 extern int adjust_insn_length (rtx insn, int len);
Index: config/avr/constraints.md
===================================================================
--- config/avr/constraints.md	(revision 184156)
+++ config/avr/constraints.md	(working copy)
@@ -182,3 +182,13 @@ (define_constraint "Csp"
   "Integer constant in the range -6 @dots{} 6."
   (and (match_code "const_int")
        (match_test "IN_RANGE (ival, -6, 6)")))
+
+(define_constraint "Cxf"
+  "32-bit integer constant where at least one nibble is 0xf."
+  (and (match_code "const_int")
+       (match_test "avr_has_nibble_0xf (op)")))
+
+(define_constraint "C0f"
+  "32-bit integer constant where no nibble equals 0xf."
+  (and (match_code "const_int")
+       (match_test "!avr_has_nibble_0xf (op)")))
Index: config/avr/avr.c
===================================================================
--- config/avr/avr.c	(revision 184156)
+++ config/avr/avr.c	(working copy)
@@ -305,6 +305,9 @@ bool avr_need_copy_data_p = false;
 #undef TARGET_EXPAND_BUILTIN
 #define TARGET_EXPAND_BUILTIN avr_expand_builtin
 
+#undef  TARGET_FOLD_BUILTIN
+#define TARGET_FOLD_BUILTIN avr_fold_builtin
+
 #undef TARGET_ASM_FUNCTION_RODATA_SECTION
 #define TARGET_ASM_FUNCTION_RODATA_SECTION avr_asm_function_rodata_section
 
@@ -6465,12 +6468,12 @@ adjust_insn_length (rtx insn, int len)
 
     case ADJUST_LEN_CALL: len = AVR_HAVE_JMP_CALL ? 2 : 1; break;
 
-    case ADJUST_LEN_MAP_BITS: avr_out_map_bits (insn, op, &len); break;
+    case ADJUST_LEN_INSERT_BITS: avr_out_insert_bits (op, &len); break;
 
     default:
       gcc_unreachable();
     }
-  
+
   return len;
 }
 
@@ -9945,193 +9948,220 @@ avr_map (double_int f, int x)
 }
 
 
-/* Return the map R that reverses the bits of byte B.
+/* Return some metrics of map A.  */
 
-   R(0)  =  (0  7)  o  (1  6)  o   (2  5)  o   (3  4)
-   R(1)  =  (8 15)  o  (9 14)  o  (10 13)  o  (11 12)
-            
-   Notice that R o R = id.  */
+enum
+  {
+    /* Number of fixed points in { 0 ... 7 } */
+    MAP_FIXED_0_7,
 
-static double_int
-avr_revert_map (int b)
+    /* Size of preimage of non-fixed points in { 0 ... 7 } */
+    MAP_NONFIXED_0_7,
+    
+    /* Mask representing the fixed points in { 0 ... 7 } */
+    MAP_MASK_FIXED_0_7,
+    
+    /* Size of the preimage of { 0 ... 7 } */
+    MAP_PREIMAGE_0_7,
+    
+    /* Mask that represents the preimage of { f } */
+    MAP_MASK_PREIMAGE_F
+  };
+
+static unsigned
+avr_map_metric (double_int a, int mode)
 {
-  int i;
-  double_int r = double_int_zero;
+  unsigned i, metric = 0;
 
-  for (i = 16-1; i >= 0; i--)
-    r = avr_double_int_push_digit (r, 16, i >> 3 == b ? i ^ 7 : i);
+  for (i = 0; i < 8; i++)
+    {
+      unsigned ai = avr_map (a, i);
 
-  return r;
+      if (mode == MAP_FIXED_0_7)
+        metric += ai == i;
+      else if (mode == MAP_NONFIXED_0_7)
+        metric += ai < 8 && ai != i;
+      else if (mode == MAP_MASK_FIXED_0_7)
+        metric |= ((unsigned) (ai == i)) << i;
+      else if (mode == MAP_PREIMAGE_0_7)
+        metric += ai < 8;
+      else if (mode == MAP_MASK_PREIMAGE_F)
+        metric |= ((unsigned) (ai == 0xf)) << i;
+      else
+        gcc_unreachable();
+    }
+  
+  return metric;
 }
 
 
-/* Return the map R that swaps bit-chunks of size SIZE in byte B.
+/* Return true if IVAL has a 0xf in its hexadecimal representation
+   and false otherwise.  Only nibbles 0..7 are taken into account.
+   Used as constraint helper for C0f and Cxf.  */
 
-   R(1,0)  =  (0 1)  o   (2  3)  o   (4  5)  o   (6  7)
-   R(1,1)  =  (8 9)  o  (10 11)  o  (12 13)  o  (14 15)
+bool
+avr_has_nibble_0xf (rtx ival)
+{
+  return 0 != avr_map_metric (rtx_to_double_int (ival), MAP_MASK_PREIMAGE_F);
+}
 
-   R(4,0)  =  (0  4)  o  (1  5)  o   (2  6)  o   (3  7)
-   R(4,1)  =  (8 12)  o  (9 13)  o  (10 14)  o  (11 15)
 
-   Notice that R o R = id.  */
+/* We have a set of bits that are mapped by a function F.
+   Try to decompose F by means of a second function G so that
 
-static double_int
-avr_swap_map (int size, int b)
-{
-  int i;
-  double_int r = double_int_zero;
+      F = F o G^-1 o G
 
-  for (i = 16-1; i >= 0; i--)
-    r = avr_double_int_push_digit (r, 16, i ^ (i >> 3 == b ? size : 0));
+   and
 
-  return r;
-}
+      cost (F o G^-1) + cost (G)  <  cost (F)
 
+   Example:  Suppose builtin insert_bits supplies us with the map
+   F = 0x3210ffff.  Instead of doing 4 bit insertions to get the high
+   nibble of the result, we can just as well rotate the bits before inserting
+   them and use the map 0x7654ffff which is cheaper than the original map.
+   For this example G = G^-1 = 0x32107654 and F o G^-1 = 0x7654ffff.  */
+   
+typedef struct
+{
+  /* tree code of binary function G */
+  enum tree_code code;
 
-/* Return Identity.  */
+  /* The constant second argument of G */
+  int arg;
 
-static double_int
-avr_id_map (void)
-{
-  int i;
-  double_int r = double_int_zero;
+  /* G^-1, the inverse of G (*, arg) */
+  unsigned ginv;
 
-  for (i = 16-1; i >= 0; i--)
-    r = avr_double_int_push_digit (r, 16, i);
+  /* The cost of applying G (*, arg) */
+  int cost;
 
-  return r;
-}
+  /* The composition F o G^-1 (*, arg) for some function F */
+  double_int map;
 
+  /* For debug purpose only */
+  const char *str;
+} avr_map_op_t;
 
-enum
+static const avr_map_op_t avr_map_op[] =
   {
-    SIG_ID        = 0,
-    /* for QI and HI */
-    SIG_ROL       = 0xf,
-    SIG_REVERT_0  = 1 << 4,
-    SIG_SWAP1_0   = 1 << 5,
-    /* HI only */
-    SIG_REVERT_1  = 1 << 6,
-    SIG_SWAP1_1   = 1 << 7,
-    SIG_SWAP4_0   = 1 << 8,
-    SIG_SWAP4_1   = 1 << 9
+    { LROTATE_EXPR, 0, 0x76543210, 0, { 0, 0 }, "id" },
+    { LROTATE_EXPR, 1, 0x07654321, 2, { 0, 0 }, "<<<" },
+    { LROTATE_EXPR, 2, 0x10765432, 4, { 0, 0 }, "<<<" },
+    { LROTATE_EXPR, 3, 0x21076543, 4, { 0, 0 }, "<<<" },
+    { LROTATE_EXPR, 4, 0x32107654, 1, { 0, 0 }, "<<<" },
+    { LROTATE_EXPR, 5, 0x43210765, 3, { 0, 0 }, "<<<" },
+    { LROTATE_EXPR, 6, 0x54321076, 5, { 0, 0 }, "<<<" },
+    { LROTATE_EXPR, 7, 0x65432107, 3, { 0, 0 }, "<<<" },
+    { RSHIFT_EXPR, 1, 0x6543210c, 1, { 0, 0 }, ">>" },
+    { RSHIFT_EXPR, 1, 0x7543210c, 1, { 0, 0 }, ">>" },
+    { RSHIFT_EXPR, 2, 0x543210cc, 2, { 0, 0 }, ">>" },
+    { RSHIFT_EXPR, 2, 0x643210cc, 2, { 0, 0 }, ">>" },
+    { RSHIFT_EXPR, 2, 0x743210cc, 2, { 0, 0 }, ">>" },
+    { LSHIFT_EXPR, 1, 0xc7654321, 1, { 0, 0 }, "<<" },
+    { LSHIFT_EXPR, 2, 0xcc765432, 2, { 0, 0 }, "<<" }
   };
 
 
-/* Return basic map with signature SIG.  */
-
-static double_int
-avr_sig_map (int n ATTRIBUTE_UNUSED, int sig)
+/* Try to decompose F as F = (F o G^-1) o G as described above.
+   The result is a struct representing F o G^-1 and G.
+   If result.cost < 0 then such a decomposition does not exist.  */
+   
+static avr_map_op_t
+avr_map_decompose (double_int f, const avr_map_op_t *g, bool val_const_p)
 {
-  if (sig == SIG_ID)            return avr_id_map ();
-  else if (sig == SIG_REVERT_0) return avr_revert_map (0);
-  else if (sig == SIG_REVERT_1) return avr_revert_map (1);
-  else if (sig == SIG_SWAP1_0)  return avr_swap_map (1, 0);
-  else if (sig == SIG_SWAP1_1)  return avr_swap_map (1, 1);
-  else if (sig == SIG_SWAP4_0)  return avr_swap_map (4, 0);
-  else if (sig == SIG_SWAP4_1)  return avr_swap_map (4, 1);
-  else
-    gcc_unreachable();
-}
-
-
-/* Return the Hamming distance between the B-th byte of A and C.  */
+  int i;
+  bool val_used_p = 0 != avr_map_metric (f, MAP_MASK_PREIMAGE_F);
+  avr_map_op_t f_ginv = *g;
+  double_int ginv = uhwi_to_double_int (g->ginv);
 
-static bool
-avr_map_hamming_byte (int n, int b, double_int a, double_int c, bool strict)
-{
-  int i, hamming = 0;
+  f_ginv.cost = -1;
+  
+  /* Step 1:  Computing F o G^-1  */
 
-  for (i = 8*b; i < n && i < 8*b + 8; i++)
+  for (i = 7; i >= 0; i--)
     {
-      int ai = avr_map (a, i);
-      int ci = avr_map (c, i);
+      int x = avr_map (f, i);
+      
+      if (x <= 7)
+        {
+          x = avr_map (ginv, x);
 
-      hamming += ai != ci && (strict || (ai < n && ci < n));
+          /* The bit is not in the image of G: no decomposition (cost = -1)  */
+          
+          if (x > 7)
+            return f_ginv;
+        }
+      
+      f_ginv.map = avr_double_int_push_digit (f_ginv.map, 16, x);
     }
-  
-  return hamming;
-}
 
+  /* Step 2:  Compute the cost of the operations.
+     The overall cost of doing an operation prior to the insertion is
+      the cost of the insertion plus the cost of the operation.  */
 
-/* Return the non-strict Hamming distance between A and B.  */
+  /* Step 2a:  Compute cost of F o G^-1  */
 
-#define avr_map_hamming_nonstrict(N,A,B)              \
-  (+ avr_map_hamming_byte (N, 0, A, B, false)         \
-   + avr_map_hamming_byte (N, 1, A, B, false))
-
-
-/* Return TRUE iff A and B represent the same mapping.  */
-
-#define avr_map_equal_p(N,A,B) (0 == avr_map_hamming_nonstrict (N, A, B))
-
-
-/* Return TRUE iff A is a map of signature S.  Notice that there is no
-   1:1 correspondance between maps and signatures and thus this is
-   only supported for basic signatures recognized by avr_sig_map().  */
-
-#define avr_map_sig_p(N,A,S) avr_map_equal_p (N, A, avr_sig_map (N, S))
+  if (0 == avr_map_metric (f_ginv.map, MAP_NONFIXED_0_7))
+    {
+      /* The mapping consists only of fixed points and can be folded
+         to AND/OR logic in the remainder.  Reasonable cost is 3. */
 
+      f_ginv.cost = 2 + (val_used_p && !val_const_p);
+    }
+  else
+    {
+      rtx xop[4];
 
-/* Swap odd/even bits of ld-reg %0:  %0 = bit-swap (%0)  */
+      /* Get the cost of the insn by calling the output worker with some
+         fake values.  Mimic effect of reloading xop[3]: Unused operands
+         are mapped to 0 and used operands are reloaded to xop[0].  */
 
-static const char*
-avr_out_swap_bits (rtx *xop, int *plen)
-{
-  xop[1] = tmp_reg_rtx;
+      xop[0] = all_regs_rtx[24];
+      xop[1] = gen_int_mode (double_int_to_uhwi (f_ginv.map), SImode);
+      xop[2] = all_regs_rtx[25];
+      xop[3] = val_used_p ? xop[0] : const0_rtx;
   
-  return avr_asm_len ("mov %1,%0"    CR_TAB
-                      "andi %0,0xaa" CR_TAB
-                      "eor %1,%0"    CR_TAB
-                      "lsr %0"       CR_TAB
-                      "lsl %1"       CR_TAB
-                      "or %0,%1", xop, plen, 6);
-}
+      avr_out_insert_bits (xop, &f_ginv.cost);
+      
+      f_ginv.cost += val_const_p && val_used_p ? 1 : 0;
+    }
+  
+  /* Step 2b:  Add cost of G  */
 
-/* Revert bit order:  %0 = Revert (%1) with %0 != %1 and clobber %1  */
+  f_ginv.cost += g->cost;
 
-static const char*
-avr_out_revert_bits (rtx *xop, int *plen)
-{
-  return avr_asm_len ("inc __zero_reg__" "\n"
-                      "0:\tror %1"       CR_TAB
-                      "rol %0"           CR_TAB
-                      "lsl __zero_reg__" CR_TAB
-                      "brne 0b", xop, plen, 5);
+  if (avr_log.builtin)
+    avr_edump (" %s%d=%d", g->str, g->arg, f_ginv.cost);
+
+  return f_ginv;
 }
 
 
-/* If OUT_P = true:  Output BST/BLD instruction according to MAP.
-   If OUT_P = false: Just dry-run and fix XOP[1] to resolve
-                     early-clobber conflicts if XOP[0] = XOP[1].  */
+/* Insert bits from XOP[1] into XOP[0] according to MAP.
+   XOP[0] and XOP[1] don't overlap.
+   If FIXP_P = true:  Move all bits according to MAP using BLD/BST sequences.
+   If FIXP_P = false: Just move the bit if its position in the destination
+   is different from its source position.  */
 
 static void
-avr_move_bits (rtx *xop, double_int map, int n_bits, bool out_p, int *plen)
+avr_move_bits (rtx *xop, double_int map, bool fixp_p, int *plen)
 {
-  int bit_dest, b, clobber = 0;
+  int bit_dest, b;
 
   /* T-flag contains this bit of the source, i.e. of XOP[1]  */
   int t_bit_src = -1;
 
-  if (!optimize && !out_p)
-    {
-      avr_asm_len ("mov __tmp_reg__,%1", xop, plen, 1);
-      xop[1] = tmp_reg_rtx;
-      return;
-    }
-  
   /* We order the operations according to the requested source bit b.  */
   
-  for (b = 0; b < n_bits; b++)
-    for (bit_dest = 0; bit_dest < n_bits; bit_dest++)
+  for (b = 0; b < 8; b++)
+    for (bit_dest = 0; bit_dest < 8; bit_dest++)
       {
         int bit_src = avr_map (map, bit_dest);
         
         if (b != bit_src
-            /* Same position: No need to copy as the caller did MOV.  */
-            || bit_dest == bit_src
-            /* Accessing bits 8..f for 8-bit version is void. */
-            || bit_src >= n_bits)
+            || bit_src >= 8
+            /* Same position: No need to copy as requested by FIXP_P.  */
+            || (bit_dest == bit_src && !fixp_p))
           continue;
 
         if (t_bit_src != bit_src)
@@ -10140,121 +10170,103 @@ avr_move_bits (rtx *xop, double_int map,
               
             t_bit_src = bit_src;
 
-            if (out_p)
-              {
-                xop[2] = GEN_INT (bit_src);
-                avr_asm_len ("bst %T1%T2", xop, plen, 1);
-              }
-            else if (clobber & (1 << bit_src))
-              {
-                /* Bit to be read was written already: Backup input
-                   to resolve early-clobber conflict.  */
-               
-                avr_asm_len ("mov __tmp_reg__,%1", xop, plen, 1);
-                xop[1] = tmp_reg_rtx;
-                return;
-              }
+            xop[3] = GEN_INT (bit_src);
+            avr_asm_len ("bst %T1%T3", xop, plen, 1);
           }
 
         /* Load destination bit with T.  */
         
-        if (out_p)
-          {
-            xop[2] = GEN_INT (bit_dest);
-            avr_asm_len ("bld %T0%T2", xop, plen, 1);
-          }
-        
-        clobber |= 1 << bit_dest;
+        xop[3] = GEN_INT (bit_dest);
+        avr_asm_len ("bld %T0%T3", xop, plen, 1);
       }
 }
 
 
-/* Print assembler code for `map_bitsqi' and `map_bitshi'.  */
+/* PLEN == 0: Print assembler code for `insert_bits'.
+   PLEN != 0: Compute code length in bytes.
+   
+   OP[0]:  Result
+   OP[1]:  The mapping composed of nibbles. If nibble no. N is
+           0:   Bit N of result is copied from bit OP[2].0
+           ...  ...
+           7:   Bit N of result is copied from bit OP[2].7
+           0xf: Bit N of result is copied from bit OP[3].N
+   OP[2]:  Bits to be inserted
+   OP[3]:  Target value  */
 
 const char*
-avr_out_map_bits (rtx insn, rtx *operands, int *plen)
+avr_out_insert_bits (rtx *op, int *plen)
 {
-  bool copy_0, copy_1;
-  int n_bits = GET_MODE_BITSIZE (GET_MODE (operands[0]));
-  double_int map = rtx_to_double_int (operands[1]);
-  rtx xop[3];
+  double_int map = rtx_to_double_int (op[1]);
+  unsigned mask_fixed;
+  bool fixp_p = true;
+  rtx xop[4];
 
-  xop[0] = operands[0];
-  xop[1] = operands[2];
+  xop[0] = op[0];
+  xop[1] = op[2];
+  xop[2] = op[3];
 
+  gcc_assert (REG_P (xop[2]) || CONST_INT_P (xop[2]));
+          
   if (plen)
     *plen = 0;
   else if (flag_print_asm_name)
-    avr_fdump (asm_out_file, ASM_COMMENT_START "%X\n", map);
+    fprintf (asm_out_file,
+             ASM_COMMENT_START "map = 0x%08" HOST_LONG_FORMAT "x\n",
+             double_int_to_uhwi (map) & GET_MODE_MASK (SImode));
 
-  switch (n_bits)
-    {
-    default:
-      gcc_unreachable();
-      
-    case 8:
-      if (avr_map_sig_p (n_bits, map, SIG_SWAP1_0))
-        {
-          return avr_out_swap_bits (xop, plen);
-        }
-      else if (avr_map_sig_p (n_bits, map, SIG_REVERT_0))
-        {
-          if (REGNO (xop[0]) == REGNO (xop[1])
-              || !reg_unused_after (insn, xop[1]))
-            {
-              avr_asm_len ("mov __tmp_reg__,%1", xop, plen, 1);
-              xop[1] = tmp_reg_rtx;
-            }
-          
-          return avr_out_revert_bits (xop, plen);
-        }
+  /* If MAP has fixed points it might be better to initialize the result
+     with the bits to be inserted instead of moving all bits by hand.  */
       
-      break; /* 8 */
+  mask_fixed = avr_map_metric (map, MAP_MASK_FIXED_0_7);
 
-    case 16:
+  if (REGNO (xop[0]) == REGNO (xop[1]))
+    {
+      /* Avoid early-clobber conflicts */
       
-      break; /* 16 */
+      avr_asm_len ("mov __tmp_reg__,%1", xop, plen, 1);
+      xop[1] = tmp_reg_rtx;
+      fixp_p = false;
     }
 
-  /* Copy whole byte is cheaper than moving bits that stay at the same
-     position.  Some bits in a byte stay at the same position iff the
-     strict Hamming distance to Identity is not 8.  */
-
-  copy_0 = 8 != avr_map_hamming_byte (n_bits, 0, map, avr_id_map(), true);
-  copy_1 = 8 != avr_map_hamming_byte (n_bits, 1, map, avr_id_map(), true);
-     
-  /* Perform the move(s) just worked out.  */
-
-  if (n_bits == 8)
+  if (avr_map_metric (map, MAP_MASK_PREIMAGE_F))
     {
-      if (REGNO (xop[0]) == REGNO (xop[1]))
-        {
-          /* Fix early-clobber clashes.
-             Notice XOP[0] hat no eary-clobber in its constraint.  */
-          
-          avr_move_bits (xop, map, n_bits, false, plen);
-        }
-      else if (copy_0)
+      /* XOP[2] is used and reloaded to XOP[0] already */
+      
+      int n_fix = 0, n_nofix = 0;
+      
+      gcc_assert (REG_P (xop[2]));
+      
+      /* Get the code size of the bit insertions; once with all bits
+         moved and once with fixed points omitted.  */
+  
+      avr_move_bits (xop, map, true, &n_fix);
+      avr_move_bits (xop, map, false, &n_nofix);
+
+      if (fixp_p && n_fix - n_nofix > 3)
         {
-          avr_asm_len ("mov %0,%1", xop, plen, 1);
+          xop[3] = gen_int_mode (~mask_fixed, QImode);
+        
+          avr_asm_len ("eor %0,%1"   CR_TAB
+                       "andi %0,%3"  CR_TAB
+                       "eor %0,%1", xop, plen, 3);
+          fixp_p = false;
         }
     }
-  else if (AVR_HAVE_MOVW && copy_0 && copy_1)
-    {
-      avr_asm_len ("movw %A0,%A1", xop, plen, 1);
-    }
   else
     {
-      if (copy_0)
-        avr_asm_len ("mov %A0,%A1", xop, plen, 1);
-
-      if (copy_1)
-        avr_asm_len ("mov %B0,%B1", xop, plen, 1);
+      /* XOP[2] is unused */
+      
+      if (fixp_p && mask_fixed)
+        {
+          avr_asm_len ("mov %0,%1", xop, plen, 1);
+          fixp_p = false;
+        }
     }
+  
+  /* Move/insert remaining bits.  */
 
-  /* Move individual bits.  */
-
-  avr_move_bits (xop, map, n_bits, true, plen);
+  avr_move_bits (xop, map, fixp_p, plen);
   
   return "";
 }
@@ -10270,8 +10282,7 @@ enum avr_builtin_id
     AVR_BUILTIN_WDR,
     AVR_BUILTIN_SLEEP,
     AVR_BUILTIN_SWAP,
-    AVR_BUILTIN_MAP8,
-    AVR_BUILTIN_MAP16,
+    AVR_BUILTIN_INSERT_BITS,
     AVR_BUILTIN_FMUL,
     AVR_BUILTIN_FMULS,
     AVR_BUILTIN_FMULSU,
@@ -10328,16 +10339,11 @@ avr_init_builtins (void)
                                 long_unsigned_type_node,
                                 NULL_TREE);
 
-  tree uchar_ftype_ulong_uchar
+  tree uchar_ftype_ulong_uchar_uchar
     = build_function_type_list (unsigned_char_type_node,
                                 long_unsigned_type_node,
                                 unsigned_char_type_node,
-                                NULL_TREE);
-
-  tree uint_ftype_ullong_uint
-    = build_function_type_list (unsigned_type_node,
-                                long_long_unsigned_type_node,
-                                unsigned_type_node,
+                                unsigned_char_type_node,
                                 NULL_TREE);
 
   DEF_BUILTIN ("__builtin_avr_nop", void_ftype_void, AVR_BUILTIN_NOP);
@@ -10356,10 +10362,8 @@ avr_init_builtins (void)
   DEF_BUILTIN ("__builtin_avr_fmulsu", int_ftype_char_uchar, 
                AVR_BUILTIN_FMULSU);
 
-  DEF_BUILTIN ("__builtin_avr_map8", uchar_ftype_ulong_uchar, 
-               AVR_BUILTIN_MAP8);
-  DEF_BUILTIN ("__builtin_avr_map16", uint_ftype_ullong_uint, 
-               AVR_BUILTIN_MAP16);
+  DEF_BUILTIN ("__builtin_avr_insert_bits", uchar_ftype_ulong_uchar_uchar,
+               AVR_BUILTIN_INSERT_BITS);
 
   avr_init_builtin_int24 ();
 }
@@ -10384,9 +10388,14 @@ bdesc_2arg[] =
   {
     { CODE_FOR_fmul, "__builtin_avr_fmul", AVR_BUILTIN_FMUL },
     { CODE_FOR_fmuls, "__builtin_avr_fmuls", AVR_BUILTIN_FMULS },
-    { CODE_FOR_fmulsu, "__builtin_avr_fmulsu", AVR_BUILTIN_FMULSU },
-    { CODE_FOR_map_bitsqi, "__builtin_avr_map8", AVR_BUILTIN_MAP8 },
-    { CODE_FOR_map_bitshi, "__builtin_avr_map16", AVR_BUILTIN_MAP16 }
+    { CODE_FOR_fmulsu, "__builtin_avr_fmulsu", AVR_BUILTIN_FMULSU }
+  };
+
+static const struct avr_builtin_description
+bdesc_3arg[] =
+  {
+    { CODE_FOR_insert_bits, "__builtin_avr_insert_bits",
+      AVR_BUILTIN_INSERT_BITS }
   };
 
 /* Subroutine of avr_expand_builtin to take care of unop insns.  */
@@ -10486,6 +10495,76 @@ avr_expand_binop_builtin (enum insn_code
   return target;
 }
 
+/* Subroutine of avr_expand_builtin to take care of 3-operand insns.  */
+
+static rtx
+avr_expand_triop_builtin (enum insn_code icode, tree exp, rtx target)
+{
+  rtx pat;
+  tree arg0 = CALL_EXPR_ARG (exp, 0);
+  tree arg1 = CALL_EXPR_ARG (exp, 1);
+  tree arg2 = CALL_EXPR_ARG (exp, 2);
+  rtx op0 = expand_expr (arg0, NULL_RTX, VOIDmode, EXPAND_NORMAL);
+  rtx op1 = expand_expr (arg1, NULL_RTX, VOIDmode, EXPAND_NORMAL);
+  rtx op2 = expand_expr (arg2, NULL_RTX, VOIDmode, EXPAND_NORMAL);
+  enum machine_mode op0mode = GET_MODE (op0);
+  enum machine_mode op1mode = GET_MODE (op1);
+  enum machine_mode op2mode = GET_MODE (op2);
+  enum machine_mode tmode = insn_data[icode].operand[0].mode;
+  enum machine_mode mode0 = insn_data[icode].operand[1].mode;
+  enum machine_mode mode1 = insn_data[icode].operand[2].mode;
+  enum machine_mode mode2 = insn_data[icode].operand[3].mode;
+
+  if (! target
+      || GET_MODE (target) != tmode
+      || ! (*insn_data[icode].operand[0].predicate) (target, tmode))
+    {
+      target = gen_reg_rtx (tmode);
+    }
+
+  if ((op0mode == SImode || op0mode == VOIDmode) && mode0 == HImode)
+    {
+      op0mode = HImode;
+      op0 = gen_lowpart (HImode, op0);
+    }
+  
+  if ((op1mode == SImode || op1mode == VOIDmode) && mode1 == HImode)
+    {
+      op1mode = HImode;
+      op1 = gen_lowpart (HImode, op1);
+    }
+  
+  if ((op2mode == SImode || op2mode == VOIDmode) && mode2 == HImode)
+    {
+      op2mode = HImode;
+      op2 = gen_lowpart (HImode, op2);
+    }
+  
+  /* In case the insn wants input operands in modes different from
+     the result, abort.  */
+  
+  gcc_assert ((op0mode == mode0 || op0mode == VOIDmode)
+              && (op1mode == mode1 || op1mode == VOIDmode)
+              && (op2mode == mode2 || op2mode == VOIDmode));
+
+  if (! (*insn_data[icode].operand[1].predicate) (op0, mode0))
+    op0 = copy_to_mode_reg (mode0, op0);
+  
+  if (! (*insn_data[icode].operand[2].predicate) (op1, mode1))
+    op1 = copy_to_mode_reg (mode1, op1);
+
+  if (! (*insn_data[icode].operand[3].predicate) (op2, mode2))
+    op2 = copy_to_mode_reg (mode2, op2);
+
+  pat = GEN_FCN (icode) (target, op0, op1, op2);
+  
+  if (! pat)
+    return 0;
+
+  emit_insn (pat);
+  return target;
+}
+
 
 /* Expand an expression EXP that calls a built-in function,
    with result going to TARGET if that's convenient
@@ -10541,7 +10620,7 @@ avr_expand_builtin (tree exp, rtx target
         return 0;
       }
 
-    case AVR_BUILTIN_MAP8:
+    case AVR_BUILTIN_INSERT_BITS:
       {
         arg0 = CALL_EXPR_ARG (exp, 0);
         op0 = expand_expr (arg0, NULL_RTX, VOIDmode, EXPAND_NORMAL);
@@ -10553,19 +10632,6 @@ avr_expand_builtin (tree exp, rtx target
             return target;
           }
       }
-
-    case AVR_BUILTIN_MAP16:
-      {
-        arg0 = CALL_EXPR_ARG (exp, 0);
-        op0 = expand_expr (arg0, NULL_RTX, VOIDmode, EXPAND_NORMAL);
-
-        if (!const_double_operand (op0, VOIDmode))
-          {
-            error ("%s expects a compile time long long integer constant"
-                   " as first argument", bname);
-            return target;
-          }
-      }
     }
 
   for (i = 0, d = bdesc_1arg; i < ARRAY_SIZE (bdesc_1arg); i++, d++)
@@ -10576,9 +10642,157 @@ avr_expand_builtin (tree exp, rtx target
     if (d->id == id)
       return avr_expand_binop_builtin (d->icode, exp, target);
 
+  for (i = 0, d = bdesc_3arg; i < ARRAY_SIZE (bdesc_3arg); i++, d++)
+    if (d->id == id)
+      return avr_expand_triop_builtin (d->icode, exp, target);
+
   gcc_unreachable ();
 }
 
+
+/* Implement `TARGET_FOLD_BUILTIN'.  */
+
+static tree
+avr_fold_builtin (tree fndecl, int n_args ATTRIBUTE_UNUSED, tree *arg,
+                  bool ignore ATTRIBUTE_UNUSED)
+{
+  unsigned int fcode = DECL_FUNCTION_CODE (fndecl);
+  tree val_type = TREE_TYPE (TREE_TYPE (fndecl));
+
+  if (!optimize)
+    return NULL_TREE;
+  
+  switch (fcode)
+    {
+    default:
+      break;
+
+    case AVR_BUILTIN_INSERT_BITS:
+      {
+        tree tbits = arg[1];
+        tree tval = arg[2];
+        tree tmap;
+        tree map_type = TREE_VALUE (TYPE_ARG_TYPES (TREE_TYPE (fndecl)));
+        double_int map = tree_to_double_int (arg[0]);
+        bool changed = false;
+        unsigned i;
+        avr_map_op_t best_g;
+        
+        tmap = double_int_to_tree (map_type, map);
+
+        if (TREE_CODE (tval) != INTEGER_CST
+            && 0 == avr_map_metric (map, MAP_MASK_PREIMAGE_F))
+          {
+            /* There is no F in the map, i.e. the 3rd operand is unused.
+               Replace that argument with some constant to render
+               respective input unused.  */
+            
+            tval = build_int_cst (val_type, 0);
+            changed = true;
+          }
+
+        if (TREE_CODE (tbits) != INTEGER_CST
+            && 0 == avr_map_metric (map, MAP_PREIMAGE_0_7))
+          {
+            /* Similarly for the bits to be inserted: if they are unused,
+               we can just as well pass 0.  */
+            
+            tbits = build_int_cst (val_type, 0);
+          }
+
+        if (TREE_CODE (tbits) == INTEGER_CST)
+          {
+            /* Inserting bits known at compile time is easy and can be
+               performed by AND and OR with appropriate masks.  */
+
+            int bits = TREE_INT_CST_LOW (tbits);
+            int mask_ior = 0, mask_and = 0xff;
+
+            for (i = 0; i < 8; i++)
+              {
+                int mi = avr_map (map, i);
+
+                if (mi < 8)
+                  {
+                    if (bits & (1 << mi))     mask_ior |=  (1 << i);
+                    else                      mask_and &= ~(1 << i);
+                  }
+              }
+
+            tval = fold_build2 (BIT_IOR_EXPR, val_type, tval,
+                                build_int_cst (val_type, mask_ior));
+            return fold_build2 (BIT_AND_EXPR, val_type, tval,
+                                build_int_cst (val_type, mask_and));
+          }
+
+        if (changed)
+          return build_call_expr (fndecl, 3, tmap, tbits, tval);
+
+        /* If bits don't change their position we can use vanilla logic
+           to merge the two arguments.  */
+
+        if (0 == avr_map_metric (map, MAP_NONFIXED_0_7))
+          {
+            int mask_f = avr_map_metric (map, MAP_MASK_PREIMAGE_F);
+            tree tres, tmask = build_int_cst (val_type, mask_f ^ 0xff);
+
+            tres = fold_build2 (BIT_XOR_EXPR, val_type, tbits, tval);
+            tres = fold_build2 (BIT_AND_EXPR, val_type, tres, tmask);
+            return fold_build2 (BIT_XOR_EXPR, val_type, tres, tval);
+          }
+
+        /* Try decomposing the map to reduce overall cost.  */
+
+        if (avr_log.builtin)
+          avr_edump ("\n%?: %X\n%?: ROL cost: ", map);
+        
+        best_g = avr_map_op[0];
+        best_g.cost = 1000;
+        
+        for (i = 0; i < sizeof (avr_map_op) / sizeof (*avr_map_op); i++)
+          {
+            avr_map_op_t g
+              = avr_map_decompose (map, avr_map_op + i,
+                                   TREE_CODE (tval) == INTEGER_CST);
+
+            if (g.cost >= 0 && g.cost < best_g.cost)
+              best_g = g;
+          }
+
+        if (avr_log.builtin)
+          avr_edump ("\n");
+                     
+        if (best_g.arg == 0)
+          /* No optimization found.  */
+          break;
+        
+        /* Apply operation G to the 2nd argument.  */
+              
+        if (avr_log.builtin)
+          avr_edump ("%?: using OP(%s%d, %X) cost %d\n",
+                     best_g.str, best_g.arg, best_g.map, best_g.cost);
+
+        /* Do right-shifts arithmetically: They copy the MSB instead of
+           shifting in a non-usable value (0) as with logic right-shift.  */
+        
+        tbits = fold_convert (signed_char_type_node, tbits);
+        tbits = fold_build2 (best_g.code, signed_char_type_node, tbits,
+                             build_int_cst (val_type, best_g.arg));
+        tbits = fold_convert (val_type, tbits);
+
+        /* Use map o G^-1 instead of original map to undo the effect of G.  */
+        
+        tmap = double_int_to_tree (map_type, best_g.map);
+        
+        return build_call_expr (fndecl, 3, tmap, tbits, tval);
+      } /* AVR_BUILTIN_INSERT_BITS */
+    }
+
+  return NULL_TREE;
+}
+
+
+
 struct gcc_target targetm = TARGET_INITIALIZER;
 
 #include "gt-avr.h"
Index: testsuite/gcc.target/avr/torture/builtin_insert_bits-1.c
===================================================================
--- testsuite/gcc.target/avr/torture/builtin_insert_bits-1.c	(revision 0)
+++ testsuite/gcc.target/avr/torture/builtin_insert_bits-1.c	(revision 0)
@@ -0,0 +1,97 @@
+/* { dg-do run } */
+
+#include <stdlib.h>
+
+#define MASK_F(M)                                       \
+  (0                                                    \
+   | ((0xf == (0xf & ((M) >> (4*0)))) ? (1 << 0) : 0)   \
+   | ((0xf == (0xf & ((M) >> (4*1)))) ? (1 << 1) : 0)   \
+   | ((0xf == (0xf & ((M) >> (4*2)))) ? (1 << 2) : 0)   \
+   | ((0xf == (0xf & ((M) >> (4*3)))) ? (1 << 3) : 0)   \
+   | ((0xf == (0xf & ((M) >> (4*4)))) ? (1 << 4) : 0)   \
+   | ((0xf == (0xf & ((M) >> (4*5)))) ? (1 << 5) : 0)   \
+   | ((0xf == (0xf & ((M) >> (4*6)))) ? (1 << 6) : 0)   \
+   | ((0xf == (0xf & ((M) >> (4*7)))) ? (1 << 7) : 0)   \
+   | 0)
+
+#define MASK_0_7(M)                                     \
+  (0                                                    \
+   | ((8 > (0xf & ((M) >> (4*0)))) ? (1 << 0) : 0)      \
+   | ((8 > (0xf & ((M) >> (4*1)))) ? (1 << 1) : 0)      \
+   | ((8 > (0xf & ((M) >> (4*2)))) ? (1 << 2) : 0)      \
+   | ((8 > (0xf & ((M) >> (4*3)))) ? (1 << 3) : 0)      \
+   | ((8 > (0xf & ((M) >> (4*4)))) ? (1 << 4) : 0)      \
+   | ((8 > (0xf & ((M) >> (4*5)))) ? (1 << 5) : 0)      \
+   | ((8 > (0xf & ((M) >> (4*6)))) ? (1 << 6) : 0)      \
+   | ((8 > (0xf & ((M) >> (4*7)))) ? (1 << 7) : 0)      \
+   | 0)
+
+#define INSERT_BITS(M,B,V)                                              \
+  (__extension__({                                                      \
+      unsigned char _n, _r = 0;                                         \
+      _n = 0xf & (M >> (4*0)); if (_n<8) _r |= (!!(B & (1 << _n))) << 0; \
+      _n = 0xf & (M >> (4*1)); if (_n<8) _r |= (!!(B & (1 << _n))) << 1; \
+      _n = 0xf & (M >> (4*2)); if (_n<8) _r |= (!!(B & (1 << _n))) << 2; \
+      _n = 0xf & (M >> (4*3)); if (_n<8) _r |= (!!(B & (1 << _n))) << 3; \
+      _n = 0xf & (M >> (4*4)); if (_n<8) _r |= (!!(B & (1 << _n))) << 4; \
+      _n = 0xf & (M >> (4*5)); if (_n<8) _r |= (!!(B & (1 << _n))) << 5; \
+      _n = 0xf & (M >> (4*6)); if (_n<8) _r |= (!!(B & (1 << _n))) << 6; \
+      _n = 0xf & (M >> (4*7)); if (_n<8) _r |= (!!(B & (1 << _n))) << 7; \
+      (unsigned char) ((V) & MASK_F(M)) | _r;                           \
+    }))
+
+#define MASK_USED(M) (MASK_F(M) | MASK_0_7(M))
+
+#define TEST2(M,B,V)                                    \
+  do {                                                  \
+    __asm volatile (";" #M);                            \
+    r1 = MASK_USED (M)                                  \
+      & __builtin_avr_insert_bits (M,B,V);              \
+    r2 = INSERT_BITS (M,B,V);                           \
+    if (r1 != r2)                                       \
+      abort ();                                         \
+  } while(0)
+
+#define TEST1(M,X)                                      \
+  do {                                                  \
+    TEST2 (M,X,0x00); TEST2 (M,0x00,X);                 \
+    TEST2 (M,X,0xff); TEST2 (M,0xff,X);                 \
+    TEST2 (M,X,0xaa); TEST2 (M,0xaa,X);                 \
+    TEST2 (M,X,0xcc); TEST2 (M,0xcc,X);                 \
+    TEST2 (M,X,0x96); TEST2 (M,0x96,X);                 \
+  } while(0)
+
+
+
+void test8 (void)
+{
+  unsigned char r1, r2;
+  unsigned char ib;
+
+  static const unsigned char V[] =
+    {
+      0, 0xaa, 0xcc, 0xf0, 0xff, 0x5b, 0x4d
+    };
+
+  for (ib = 0; ib < sizeof (V) / sizeof (*V); ib++)
+    {
+      unsigned char b = V[ib];
+      
+      TEST1 (0x76543210, b);
+      TEST1 (0x3210ffff, b);
+      TEST1 (0x67452301, b);
+      TEST1 (0xf0f1f2f3, b);
+      TEST1 (0xff10ff54, b);
+      TEST1 (0x01234567, b);
+      TEST1 (0xff765f32, b);
+    }
+}
+
+/****************************************************************/
+
+int main()
+{
+  test8();
+  
+  exit(0);
+}
Index: testsuite/gcc.target/avr/torture/builtin_insert_bits-2.c
===================================================================
--- testsuite/gcc.target/avr/torture/builtin_insert_bits-2.c	(revision 0)
+++ testsuite/gcc.target/avr/torture/builtin_insert_bits-2.c	(revision 0)
@@ -0,0 +1,94 @@
+/* { dg-do run } */
+
+#include <stdlib.h>
+
+#define MASK_F(M)                                       \
+  (0                                                    \
+   | ((0xf == (0xf & ((M) >> (4*0)))) ? (1 << 0) : 0)   \
+   | ((0xf == (0xf & ((M) >> (4*1)))) ? (1 << 1) : 0)   \
+   | ((0xf == (0xf & ((M) >> (4*2)))) ? (1 << 2) : 0)   \
+   | ((0xf == (0xf & ((M) >> (4*3)))) ? (1 << 3) : 0)   \
+   | ((0xf == (0xf & ((M) >> (4*4)))) ? (1 << 4) : 0)   \
+   | ((0xf == (0xf & ((M) >> (4*5)))) ? (1 << 5) : 0)   \
+   | ((0xf == (0xf & ((M) >> (4*6)))) ? (1 << 6) : 0)   \
+   | ((0xf == (0xf & ((M) >> (4*7)))) ? (1 << 7) : 0)   \
+   | 0)
+
+#define MASK_0_7(M)                                     \
+  (0                                                    \
+   | ((8 > (0xf & ((M) >> (4*0)))) ? (1 << 0) : 0)      \
+   | ((8 > (0xf & ((M) >> (4*1)))) ? (1 << 1) : 0)      \
+   | ((8 > (0xf & ((M) >> (4*2)))) ? (1 << 2) : 0)      \
+   | ((8 > (0xf & ((M) >> (4*3)))) ? (1 << 3) : 0)      \
+   | ((8 > (0xf & ((M) >> (4*4)))) ? (1 << 4) : 0)      \
+   | ((8 > (0xf & ((M) >> (4*5)))) ? (1 << 5) : 0)      \
+   | ((8 > (0xf & ((M) >> (4*6)))) ? (1 << 6) : 0)      \
+   | ((8 > (0xf & ((M) >> (4*7)))) ? (1 << 7) : 0)      \
+   | 0)
+
+#define INSERT_BITS(M,B,V)                                              \
+  (__extension__({                                                      \
+      unsigned char _n, _r = 0;                                         \
+      _n = 0xf & (M >> (4*0)); if (_n<8) _r |= (!!(B & (1 << _n))) << 0; \
+      _n = 0xf & (M >> (4*1)); if (_n<8) _r |= (!!(B & (1 << _n))) << 1; \
+      _n = 0xf & (M >> (4*2)); if (_n<8) _r |= (!!(B & (1 << _n))) << 2; \
+      _n = 0xf & (M >> (4*3)); if (_n<8) _r |= (!!(B & (1 << _n))) << 3; \
+      _n = 0xf & (M >> (4*4)); if (_n<8) _r |= (!!(B & (1 << _n))) << 4; \
+      _n = 0xf & (M >> (4*5)); if (_n<8) _r |= (!!(B & (1 << _n))) << 5; \
+      _n = 0xf & (M >> (4*6)); if (_n<8) _r |= (!!(B & (1 << _n))) << 6; \
+      _n = 0xf & (M >> (4*7)); if (_n<8) _r |= (!!(B & (1 << _n))) << 7; \
+      (unsigned char) ((V) & MASK_F(M)) | _r;                           \
+    }))
+
+#define MASK_USED(M) (MASK_F(M) | MASK_0_7(M))
+
+#define TEST2(M,B,V)                                    \
+  do {                                                  \
+    __asm volatile (";" #M);                            \
+    r1 = MASK_USED (M)                                  \
+      & __builtin_avr_insert_bits (M,B,V);              \
+    r2 = INSERT_BITS (M,B,V);                           \
+    if (r1 != r2)                                       \
+      abort ();                                         \
+  } while(0)
+
+void test8 (void)
+{
+  unsigned char r1, r2;
+  unsigned char ib, iv;
+
+  static const unsigned char V[] =
+    {
+      0, 0xaa, 0xcc, 0xf0, 0xff, 0x5b, 0x4d
+    };
+
+  for (ib = 0; ib < sizeof (V) / sizeof (*V); ib++)
+    {
+      unsigned char b = V[ib];
+      
+      for (iv = 0; iv < sizeof (V) / sizeof (*V); iv++)
+        {
+          unsigned char v = V[iv];
+          
+          TEST2 (0x76543210, b, v);
+          TEST2 (0xffffffff, b, v);
+          TEST2 (0x3210ffff, b, v);
+          TEST2 (0x67452301, b, v);
+          TEST2 (0xf0f1f2f3, b, v);
+          TEST2 (0xff10ff54, b, v);
+          TEST2 (0x0765f321, b, v);
+          TEST2 (0x11223344, b, v);
+          TEST2 (0x01234567, b, v);
+          TEST2 (0xff7765f3, b, v);
+        }
+    }
+}
+
+/****************************************************************/
+
+int main()
+{
+  test8();
+  
+  exit(0);
+}
