
Re: [Patch,AVR] PR54222: Add fixed point support


Denis Chertykov wrote:
> 2012/8/13 Georg-Johann Lay:
>> Denis Chertykov wrote:
>>> 2012/8/11 Georg-Johann Lay:
>>>> Weddington, Eric schrieb:
>>>>>> From: Georg-Johann Lay
>>>>>>
>>>>>>
>>>>>> The first step would be to bisect and find the patch that led to
>>>>>> PR53923.  It was not a change in the avr BE, so the question goes
>>>>>> to the authors of the respective patch.
>>>>>>
>>>>>> Up to now I didn't even try to bisect; that would take years on the
>>>>>> host that I have available...
>>>>>>
>>>>>>> My only real concern is that this is a major feature addition and
>>>>>>> the AVR port is currently broken.
>>>>>> I don't know if it's the avr port or some parts of the middle end that
>>>>>> don't cooperate with avr.
>>>>> I would really, really love to see fixed point support added in,
>>>>> especially since I know that Sean has worked on it for quite a while,
>>>>> and you've also done a lot of work in getting the patches in shape to
>>>>> get them committed.
>>>>>
>>>>> But, if the AVR port is currently broken (by whomever, and whatever
>>>>> patch) and a major feature like this can't be tested to make sure it
>>>>> doesn't break anything else in the AVR backend, then I'm hesitant to
>>>>> approve (even though I really want to approve).
>>>> I don't understand enough of DF to fix PR53923.  The insn that leads
>>>> to the ICE is (in df-problems.c:dead_debug_insert_temp):
>>>>
>>> Today I have updated the GCC svn tree and successfully compiled avr-gcc.
>>> The libgcc2-mulsc3.c from   also compiled without bugs.
>>>
>>> Denis.
>>>
>>> PS: Maybe I'm doing something wrong? (I had too long a vacation)
>> I am configuring with --target=avr --disable-nls --with-dwarf2
>> --enable-languages=c,c++ --enable-target-optspace=yes --enable-checking=yes,rtl
>>
>> Build GCC is "gcc version 4.3.2".
>> Build and host are i686-pc-linux-gnu.
>>
>> Maybe it's different on a 64-bit computer, but I only have a 32-bit host.
>>
> 
> I have been debugging PR53923 and in my opinion it's not an AVR port bug.
> Please commit fixed point support.
> 
> Denis.

Hi, here is an updated patch.

Some functions are reworked and there is some code cleanup.

The test results look good; there are no additional regressions.

The new test cases in gcc.dg/fixed-point pass, except for some
convert-*.c tests, which fail for two reasons:

* Some test cases suffer a loss of precision and therefore fail.
  One failure is that 0x3fffffffc0000000 is compared against
  0x4000000000000000.  Presumably it's a rounding error from
  going through float; see the sketch after this list.  I'd say
  this is not critical.

* PR54330: This leads to wrong code for __satfractudadq, and the
  wrong code is already present in .expand.  From a distance
  this looks like a middle-end or tree-ssa problem.
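
As a minimal host-side C sketch of that precision loss (illustrative
only, not the actual convert-*.c test; the AVR fixed-point layout is
left out): a 64-bit value routed through 32-bit float keeps only 24
significand bits, so nearby values collapse onto the same float and a
bit-exact comparison fails.

#include <stdint.h>
#include <stdio.h>

int main (void)
{
  /* 2^62 - 1 does not fit into float's 24-bit significand;
     it rounds up to 2^62, so the round-trip comes back as
     0x4000000000000000 instead of the original bit pattern.  */
  uint64_t exact = 0x3fffffffffffffffull;
  float f = (float) exact;
  uint64_t back = (uint64_t) f;

  printf ("0x%016llx -> 0x%016llx\n",
          (unsigned long long) exact,
          (unsigned long long) back);
  return 0;
}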

The new patch implements TARGET_BUILD_BUILTIN_VA_LIST.
The rationale is that avr-modes.def adjusts some modes, but these
changes are not reflected in the built-in macros emitted by gcc.
This leads to wrong code in libgcc because it deduces the type
layout from these built-in defines.  Thus, the respective type
nodes must be patched *before* the built-in macros are emitted.
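
A hedged sketch of that mechanism in gcc/config/avr/avr.c (function
names are taken from the ChangeLog below, but the body and the exact
avr_adjust_type_node signature are assumptions for illustration, not
the committed code):

/* Sketch only: patch the global fixed-point type nodes whose modes
   avr-modes.def adjusts (and similarly for the other adjusted nodes),
   then fall back to the standard va_list builder.  This hook runs
   before the built-in macros are emitted, so libgcc sees the
   adjusted layout.  */

static tree
avr_build_builtin_va_list (void)
{
  avr_adjust_type_node (&ta_type_node, TAmode, 48, "__ta");
  avr_adjust_type_node (&uta_type_node, UTAmode, 48, "__uta");

  /* The va_list type itself is unchanged.  */
  return std_build_builtin_va_list ();
}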

The changes to LIB2FUNCS_EXCLUDE currently have no effect; they
depend on http://gcc.gnu.org/ml/gcc-patches/2012-08/msg01580.html,
which is currently under review.

Ok to install?

Johann

libgcc/
	PR target/54222
	* config/avr/lib1funcs-fixed.S: New file.
	* config/avr/lib1funcs.S: Include it.  Undefine some divmodsi
	after they are used.
	(neg2, neg4): New macros.
	(__mulqihi3,__umulqihi3,__mulhi3): Rewrite non-MUL variants.
	(__mulhisi3,__umulhisi3,__mulsi3): Rewrite non-MUL variants.
	(__umulhisi3): Speed up MUL variant if there is enough flash.
	* config/avr/avr-lib.h (TA, UTA): Adjust according to gcc's
	avr-modes.def.
	* config/avr/t-avr (LIB1ASMFUNCS): Add: _fractqqsf, _fractuqqsf,
	_fracthqsf, _fractuhqsf, _fracthasf, _fractuhasf, _fractsasf,
	_fractusasf, _fractsfqq, _fractsfuqq, _fractsfhq, _fractsfuhq,
	_fractsfha, _fractsfsa, _mulqq3, _muluqq3, _mulhq3, _muluhq3,
	_mulha3, _muluha3, _mulsa3, _mulusa3, _divqq3, _udivuqq3, _divhq3,
	_udivuhq3, _divha3, _udivuha3, _divsa3, _udivusa3.
	(LIB2FUNCS_EXCLUDE): Add supported functions.

gcc/
	PR target/54222
	* avr-modes.def (HA, SA, DA, TA, UTA): Adjust modes.
	* avr/avr-fixed.md: New file.
	* avr/avr.md: Include it.
	(cc): Add: minus.
	(adjust_len): Add: minus, minus64, ufract, sfract.
	(ALL1, ALL2, ALL4, ORDERED234): New mode iterators.
	(MOVMODE): Add: QQ, UQQ, HQ, UHQ, HA, UHA, SQ, USQ, SA, USA.
	(MPUSH): Add: HQ, UHQ, HA, UHA, SQ, USQ, SA, USA.
	(pushqi1, xload8_A, xload_8, movqi_insn, *reload_inqi, addqi3,
	subqi3, ashlqi3, *ashlqi3, ashrqi3, lshrqi3, *lshrqi3, *cmpqi,
	cbranchqi4, *cpse.eq): Generalize to handle all 8-bit modes in ALL1.
	(*movhi, reload_inhi, addhi3, *addhi3, addhi3_clobber, subhi3,
	ashlhi3, *ashlhi3_const, ashrhi3, *ashrhi3_const, lshrhi3,
	*lshrhi3_const, *cmphi, cbranchhi4): Generalize to handle all
	16-bit modes in ALL2.
	(subhi3, casesi, strlenhi): Add clobber when expanding minus:HI.
	(*movsi, *reload_insi, addsi3, subsi3, ashlsi3, *ashlsi3_const,
	ashrsi3, *ashrhi3_const, *ashrsi3_const, lshrsi3, *lshrsi3_const,
	*reversed_tstsi, *cmpsi, cbranchsi4): Generalize to handle all
	32-bit modes in ALL4.
	* avr-dimode.md (ALL8): New mode iterator.
	(adddi3, adddi3_insn, adddi3_const_insn, subdi3, subdi3_insn,
	subdi3_const_insn, cbranchdi4, compare_di2,
	compare_const_di2, ashrdi3, lshrdi3, rotldi3, ashldi3_insn,
	ashrdi3_insn, lshrdi3_insn, rotldi3_insn): Generalize to handle
	all 64-bit modes in ALL8.
	* config/avr/avr-protos.h (avr_to_int_mode): New prototype.
	(avr_out_fract, avr_out_minus, avr_out_minus64): New prototypes.
	* config/avr/avr.c (TARGET_FIXED_POINT_SUPPORTED_P): Define to...
	(avr_fixed_point_supported_p): ...this new static function.
	(TARGET_BUILD_BUILTIN_VA_LIST): Define to...
	(avr_build_builtin_va_list): ...this new static function.
	(avr_adjust_type_node): New static function.
	(avr_scalar_mode_supported_p): Allow if ALL_FIXED_POINT_MODE_P.
	(avr_builtin_setjmp_frame_value): Use gen_subhi3 and return new
	pseudo instead of gen_rtx_MINUS.
	(avr_print_operand, avr_operand_rtx_cost): Handle: CONST_FIXED.
	(notice_update_cc): Handle: CC_MINUS.
	(output_movqi): Generalize to handle respective fixed-point modes.
	(output_movhi, output_movsisf, avr_2word_insn_p): Ditto.
	(avr_out_compare, avr_out_plus_1): Also handle fixed-point modes.
	(avr_assemble_integer): Ditto.
	(output_reload_in_const, output_reload_insisf): Ditto.
	(avr_compare_pattern): Skip all modes > 4 bytes.
	(avr_2word_insn_p): Skip movuqq_insn, movqq_insn.
	(avr_out_fract, avr_out_minus, avr_out_minus64): New functions.
	(avr_to_int_mode): New function.
	(adjust_insn_length): Handle: ADJUST_LEN_SFRACT,
	ADJUST_LEN_UFRACT, ADJUST_LEN_MINUS, ADJUST_LEN_MINUS64.
	* config/avr/predicates.md (const0_operand): Allow const_fixed.
	(const_operand, const_or_immediate_operand): New.
	(nonmemory_or_const_operand): New.
	* config/avr/constraints.md (Ynn, Y00, Y01, Y02, Ym1, Ym2, YIJ):
	New constraints.
	* config/avr/avr.h (LONG_LONG_ACCUM_TYPE_SIZE): Define.






Index: gcc/config/avr/predicates.md
===================================================================
--- gcc/config/avr/predicates.md	(revision 190535)
+++ gcc/config/avr/predicates.md	(working copy)
@@ -74,7 +74,7 @@ (define_predicate "nox_general_operand"
 
 ;; Return 1 if OP is the zero constant for MODE.
 (define_predicate "const0_operand"
-  (and (match_code "const_int,const_double")
+  (and (match_code "const_int,const_fixed,const_double")
        (match_test "op == CONST0_RTX (mode)")))
 
 ;; Return 1 if OP is the one constant integer for MODE.
@@ -248,3 +248,21 @@ (define_predicate "s16_operand"
 (define_predicate "o16_operand"
   (and (match_code "const_int")
        (match_test "IN_RANGE (INTVAL (op), -(1<<16), -1)")))
+
+;; Const int, fixed, or double operand
+(define_predicate "const_operand"
+  (ior (match_code "const_fixed")
+       (match_code "const_double")
+       (match_operand 0 "const_int_operand")))
+
+;; Const int, const fixed, or const double operand
+(define_predicate "nonmemory_or_const_operand"
+  (ior (match_code "const_fixed")
+       (match_code "const_double")
+       (match_operand 0 "nonmemory_operand")))
+
+;; Immediate, const fixed, or const double operand
+(define_predicate "const_or_immediate_operand"
+  (ior (match_code "const_fixed")
+       (match_code "const_double")
+       (match_operand 0 "immediate_operand")))
Index: gcc/config/avr/avr-fixed.md
===================================================================
--- gcc/config/avr/avr-fixed.md	(revision 0)
+++ gcc/config/avr/avr-fixed.md	(revision 0)
@@ -0,0 +1,287 @@
+;;   This file contains instructions that support fixed-point operations
+;;   for Atmel AVR micro controllers.
+;;   Copyright (C) 2012
+;;   Free Software Foundation, Inc.
+;;
+;;   Contributed by Sean D'Epagnier  (sean@depagnier.com)
+;;                  Georg-Johann Lay (avr@gjlay.de)
+
+;; This file is part of GCC.
+;;
+;; GCC is free software; you can redistribute it and/or modify
+;; it under the terms of the GNU General Public License as published by
+;; the Free Software Foundation; either version 3, or (at your option)
+;; any later version.
+;;
+;; GCC is distributed in the hope that it will be useful,
+;; but WITHOUT ANY WARRANTY; without even the implied warranty of
+;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+;; GNU General Public License for more details.
+;;
+;; You should have received a copy of the GNU General Public License
+;; along with GCC; see the file COPYING3.  If not see
+;; <http://www.gnu.org/licenses/>.
+
+(define_mode_iterator ALL1Q [(QQ "") (UQQ "")])
+(define_mode_iterator ALL2Q [(HQ "") (UHQ "")])
+(define_mode_iterator ALL2A [(HA "") (UHA "")])
+(define_mode_iterator ALL2QA [(HQ "") (UHQ "")
+                              (HA "") (UHA "")])
+(define_mode_iterator ALL4A [(SA "") (USA "")])
+
+;;; Conversions
+
+(define_mode_iterator FIXED_A
+  [(QQ "") (UQQ "")
+   (HQ "") (UHQ "") (HA "") (UHA "")
+   (SQ "") (USQ "") (SA "") (USA "")
+   (DQ "") (UDQ "") (DA "") (UDA "")
+   (TA "") (UTA "")
+   (QI "") (HI "") (SI "") (DI "")])
+
+;; Same, so that we can build cross products
+
+(define_mode_iterator FIXED_B
+  [(QQ "") (UQQ "")
+   (HQ "") (UHQ "") (HA "") (UHA "")
+   (SQ "") (USQ "") (SA "") (USA "")
+   (DQ "") (UDQ "") (DA "") (UDA "")
+   (TA "") (UTA "")
+   (QI "") (HI "") (SI "") (DI "")])
+
+(define_insn "fract<FIXED_B:mode><FIXED_A:mode>2"
+  [(set (match_operand:FIXED_A 0 "register_operand" "=r")
+        (fract_convert:FIXED_A
+         (match_operand:FIXED_B 1 "register_operand" "r")))]
+  "<FIXED_B:MODE>mode != <FIXED_A:MODE>mode"
+  {
+    return avr_out_fract (insn, operands, true, NULL);
+  }
+  [(set_attr "cc" "clobber")
+   (set_attr "adjust_len" "sfract")])
+
+(define_insn "fractuns<FIXED_B:mode><FIXED_A:mode>2"
+  [(set (match_operand:FIXED_A 0 "register_operand" "=r")
+        (unsigned_fract_convert:FIXED_A
+         (match_operand:FIXED_B 1 "register_operand" "r")))]
+  "<FIXED_B:MODE>mode != <FIXED_A:MODE>mode"
+  {
+    return avr_out_fract (insn, operands, false, NULL);
+  }
+  [(set_attr "cc" "clobber")
+   (set_attr "adjust_len" "ufract")])
+
+;******************************************************************************
+; mul
+
+;; "mulqq3" "muluqq3"
+(define_expand "mul<mode>3"
+  [(parallel [(match_operand:ALL1Q 0 "register_operand" "")
+              (match_operand:ALL1Q 1 "register_operand" "")
+              (match_operand:ALL1Q 2 "register_operand" "")])]
+  ""
+  {
+    emit_insn (AVR_HAVE_MUL
+      ? gen_mul<mode>3_enh (operands[0], operands[1], operands[2])
+      : gen_mul<mode>3_nomul (operands[0], operands[1], operands[2]));
+    DONE;
+  })
+
+(define_insn "mulqq3_enh"
+  [(set (match_operand:QQ 0 "register_operand"         "=r")
+        (mult:QQ (match_operand:QQ 1 "register_operand" "a")
+                 (match_operand:QQ 2 "register_operand" "a")))]
+  "AVR_HAVE_MUL"
+  "fmuls %1,%2\;dec r1\;brvs 0f\;inc r1\;0:\;mov %0,r1\;clr __zero_reg__"
+  [(set_attr "length" "6")
+   (set_attr "cc" "clobber")])
+
+(define_insn "muluqq3_enh"
+  [(set (match_operand:UQQ 0 "register_operand"          "=r")
+        (mult:UQQ (match_operand:UQQ 1 "register_operand" "r")
+                  (match_operand:UQQ 2 "register_operand" "r")))]
+  "AVR_HAVE_MUL"
+  "mul %1,%2\;mov %0,r1\;clr __zero_reg__"
+  [(set_attr "length" "3")
+   (set_attr "cc" "clobber")])
+
+(define_expand "mulqq3_nomul"
+  [(set (reg:QQ 24)
+        (match_operand:QQ 1 "register_operand" ""))
+   (set (reg:QQ 25)
+        (match_operand:QQ 2 "register_operand" ""))
+   ;; "*mulqq3.call"
+   (parallel [(set (reg:QQ 23)
+                   (mult:QQ (reg:QQ 24)
+                            (reg:QQ 25)))
+              (clobber (reg:QI 22))
+              (clobber (reg:HI 24))])
+   (set (match_operand:QQ 0 "register_operand" "")
+        (reg:QQ 23))]
+  "!AVR_HAVE_MUL")
+
+(define_expand "muluqq3_nomul"
+  [(set (reg:UQQ 22)
+        (match_operand:UQQ 1 "register_operand" ""))
+   (set (reg:UQQ 24)
+        (match_operand:UQQ 2 "register_operand" ""))
+   ;; "*umulqihi3.call"
+   (parallel [(set (reg:HI 24)
+                   (mult:HI (zero_extend:HI (reg:QI 22))
+                            (zero_extend:HI (reg:QI 24))))
+              (clobber (reg:QI 21))
+              (clobber (reg:HI 22))])
+   (set (match_operand:UQQ 0 "register_operand" "")
+        (reg:UQQ 25))]
+  "!AVR_HAVE_MUL")
+
+(define_insn "*mulqq3.call"
+  [(set (reg:QQ 23)
+        (mult:QQ (reg:QQ 24)
+                 (reg:QQ 25)))
+   (clobber (reg:QI 22))
+   (clobber (reg:HI 24))]
+  "!AVR_HAVE_MUL"
+  "%~call __mulqq3"
+  [(set_attr "type" "xcall")
+   (set_attr "cc" "clobber")])
+
+
+;; "mulhq3" "muluhq3"
+;; "mulha3" "muluha3"
+(define_expand "mul<mode>3"
+  [(set (reg:ALL2QA 18)
+        (match_operand:ALL2QA 1 "register_operand" ""))
+   (set (reg:ALL2QA 26)
+        (match_operand:ALL2QA 2 "register_operand" ""))
+   ;; "*mulhq3.call.enh"
+   (parallel [(set (reg:ALL2QA 24)
+                   (mult:ALL2QA (reg:ALL2QA 18)
+                                (reg:ALL2QA 26)))
+              (clobber (reg:HI 22))])
+   (set (match_operand:ALL2QA 0 "register_operand" "")
+        (reg:ALL2QA 24))]
+  "AVR_HAVE_MUL")
+
+;; "*mulhq3.call"  "*muluhq3.call"
+;; "*mulha3.call"  "*muluha3.call"
+(define_insn "*mul<mode>3.call"
+  [(set (reg:ALL2QA 24)
+        (mult:ALL2QA (reg:ALL2QA 18)
+                     (reg:ALL2QA 26)))
+   (clobber (reg:HI 22))]
+  "AVR_HAVE_MUL"
+  "%~call __mul<mode>3"
+  [(set_attr "type" "xcall")
+   (set_attr "cc" "clobber")])
+
+
+;; On the enhanced core, don't clobber either input and use a separate output
+
+;; "mulsa3" "mulusa3"
+(define_expand "mul<mode>3"
+  [(set (reg:ALL4A 16)
+        (match_operand:ALL4A 1 "register_operand" ""))
+   (set (reg:ALL4A 20)
+        (match_operand:ALL4A 2 "register_operand" ""))
+   (set (reg:ALL4A 24)
+        (mult:ALL4A (reg:ALL4A 16)
+                    (reg:ALL4A 20)))
+   (set (match_operand:ALL4A 0 "register_operand" "")
+        (reg:ALL4A 24))]
+  "AVR_HAVE_MUL")
+
+;; "*mulsa3.call" "*mulusa3.call"
+(define_insn "*mul<mode>3.call"
+  [(set (reg:ALL4A 24)
+        (mult:ALL4A (reg:ALL4A 16)
+                    (reg:ALL4A 20)))]
+  "AVR_HAVE_MUL"
+  "%~call __mul<mode>3"
+  [(set_attr "type" "xcall")
+   (set_attr "cc" "clobber")])
+
+; / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / /
+; div
+
+(define_code_iterator usdiv [udiv div])
+
+;; "divqq3" "udivuqq3"
+(define_expand "<code><mode>3"
+  [(set (reg:ALL1Q 25)
+        (match_operand:ALL1Q 1 "register_operand" ""))
+   (set (reg:ALL1Q 22)
+        (match_operand:ALL1Q 2 "register_operand" ""))
+   (parallel [(set (reg:ALL1Q 24)
+                   (usdiv:ALL1Q (reg:ALL1Q 25)
+                                (reg:ALL1Q 22)))
+              (clobber (reg:QI 25))])
+   (set (match_operand:ALL1Q 0 "register_operand" "")
+        (reg:ALL1Q 24))])
+
+;; "*divqq3.call" "*udivuqq3.call"
+(define_insn "*<code><mode>3.call"
+  [(set (reg:ALL1Q 24)
+        (usdiv:ALL1Q (reg:ALL1Q 25)
+                     (reg:ALL1Q 22)))
+   (clobber (reg:QI 25))]
+  ""
+  "%~call __<code><mode>3"
+  [(set_attr "type" "xcall")
+   (set_attr "cc" "clobber")])
+
+;; "divhq3" "udivuhq3"
+;; "divha3" "udivuha3"
+(define_expand "<code><mode>3"
+  [(set (reg:ALL2QA 26)
+        (match_operand:ALL2QA 1 "register_operand" ""))
+   (set (reg:ALL2QA 22)
+        (match_operand:ALL2QA 2 "register_operand" ""))
+   (parallel [(set (reg:ALL2QA 24)
+                   (usdiv:ALL2QA (reg:ALL2QA 26)
+                                 (reg:ALL2QA 22)))
+              (clobber (reg:HI 26))
+              (clobber (reg:QI 21))])
+   (set (match_operand:ALL2QA 0 "register_operand" "")
+        (reg:ALL2QA 24))])
+
+;; "*divhq3.call" "*udivuhq3.call"
+;; "*divha3.call" "*udivuha3.call"
+(define_insn "*<code><mode>3.call"
+  [(set (reg:ALL2QA 24)
+        (usdiv:ALL2QA (reg:ALL2QA 26)
+                      (reg:ALL2QA 22)))
+   (clobber (reg:HI 26))
+   (clobber (reg:QI 21))]
+  ""
+  "%~call __<code><mode>3"
+  [(set_attr "type" "xcall")
+   (set_attr "cc" "clobber")])
+
+;; Note the first parameter gets passed in already offset by 2 bytes
+
+;; "divsa3" "udivusa3"
+(define_expand "<code><mode>3"
+  [(set (reg:ALL4A 24)
+        (match_operand:ALL4A 1 "register_operand" ""))
+   (set (reg:ALL4A 18)
+        (match_operand:ALL4A 2 "register_operand" ""))
+   (parallel [(set (reg:ALL4A 22)
+                   (usdiv:ALL4A (reg:ALL4A 24)
+                                (reg:ALL4A 18)))
+              (clobber (reg:HI 26))
+              (clobber (reg:HI 30))])
+   (set (match_operand:ALL4A 0 "register_operand" "")
+        (reg:ALL4A 22))])
+
+;; "*divsa3.call" "*udivusa3.call"
+(define_insn "*<code><mode>3.call"
+  [(set (reg:ALL4A 22)
+        (usdiv:ALL4A (reg:ALL4A 24)
+                     (reg:ALL4A 18)))
+   (clobber (reg:HI 26))
+   (clobber (reg:HI 30))]
+  ""
+  "%~call __<code><mode>3"
+  [(set_attr "type" "xcall")
+   (set_attr "cc" "clobber")])
Index: gcc/config/avr/avr-dimode.md
===================================================================
--- gcc/config/avr/avr-dimode.md	(revision 190535)
+++ gcc/config/avr/avr-dimode.md	(working copy)
@@ -47,44 +47,58 @@ (define_constants
   [(ACC_A	18)
    (ACC_B	10)])
 
+;; Supported modes that are 8 bytes wide
+(define_mode_iterator ALL8 [(DI "")
+                            (DQ "") (UDQ "")
+                            (DA "") (UDA "")
+                            (TA "") (UTA "")])
+
 ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
 ;; Addition
 ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
 
-(define_expand "adddi3"
-  [(parallel [(match_operand:DI 0 "general_operand" "")
-              (match_operand:DI 1 "general_operand" "")
-              (match_operand:DI 2 "general_operand" "")])]
+;; "adddi3"
+;; "adddq3" "addudq3"
+;; "addda3" "adduda3"
+;; "addta3" "adduta3"
+(define_expand "add<mode>3"
+  [(parallel [(match_operand:ALL8 0 "general_operand" "")
+              (match_operand:ALL8 1 "general_operand" "")
+              (match_operand:ALL8 2 "general_operand" "")])]
   "avr_have_dimode"
   {
-    rtx acc_a = gen_rtx_REG (DImode, ACC_A);
+    rtx acc_a = gen_rtx_REG (<MODE>mode, ACC_A);
 
     emit_move_insn (acc_a, operands[1]);
 
-    if (s8_operand (operands[2], VOIDmode))
+    if (DImode == <MODE>mode
+        && s8_operand (operands[2], VOIDmode))
       {
         emit_move_insn (gen_rtx_REG (QImode, REG_X), operands[2]);
         emit_insn (gen_adddi3_const8_insn ());
       }        
-    else if (CONST_INT_P (operands[2])
-             || CONST_DOUBLE_P (operands[2]))
+    else if (const_operand (operands[2], GET_MODE (operands[2])))
       {
-        emit_insn (gen_adddi3_const_insn (operands[2]));
+        emit_insn (gen_add<mode>3_const_insn (operands[2]));
       }
     else
       {
-        emit_move_insn (gen_rtx_REG (DImode, ACC_B), operands[2]);
-        emit_insn (gen_adddi3_insn ());
+        emit_move_insn (gen_rtx_REG (<MODE>mode, ACC_B), operands[2]);
+        emit_insn (gen_add<mode>3_insn ());
       }
 
     emit_move_insn (operands[0], acc_a);
     DONE;
   })
 
-(define_insn "adddi3_insn"
-  [(set (reg:DI ACC_A)
-        (plus:DI (reg:DI ACC_A)
-                 (reg:DI ACC_B)))]
+;; "adddi3_insn"
+;; "adddq3_insn" "addudq3_insn"
+;; "addda3_insn" "adduda3_insn"
+;; "addta3_insn" "adduta3_insn"
+(define_insn "add<mode>3_insn"
+  [(set (reg:ALL8 ACC_A)
+        (plus:ALL8 (reg:ALL8 ACC_A)
+                   (reg:ALL8 ACC_B)))]
   "avr_have_dimode"
   "%~call __adddi3"
   [(set_attr "adjust_len" "call")
@@ -99,10 +113,14 @@ (define_insn "adddi3_const8_insn"
   [(set_attr "adjust_len" "call")
    (set_attr "cc" "clobber")])
 
-(define_insn "adddi3_const_insn"
-  [(set (reg:DI ACC_A)
-        (plus:DI (reg:DI ACC_A)
-                 (match_operand:DI 0 "const_double_operand" "n")))]
+;; "adddi3_const_insn"
+;; "adddq3_const_insn" "addudq3_const_insn"
+;; "addda3_const_insn" "adduda3_const_insn"
+;; "addta3_const_insn" "adduta3_const_insn"
+(define_insn "add<mode>3_const_insn"
+  [(set (reg:ALL8 ACC_A)
+        (plus:ALL8 (reg:ALL8 ACC_A)
+                   (match_operand:ALL8 0 "const_operand" "n Ynn")))]
   "avr_have_dimode
    && !s8_operand (operands[0], VOIDmode)"
   {
@@ -116,30 +134,62 @@ (define_insn "adddi3_const_insn"
 ;; Subtraction
 ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
 
-(define_expand "subdi3"
-  [(parallel [(match_operand:DI 0 "general_operand" "")
-              (match_operand:DI 1 "general_operand" "")
-              (match_operand:DI 2 "general_operand" "")])]
+;; "subdi3"
+;; "subdq3" "subudq3"
+;; "subda3" "subuda3"
+;; "subta3" "subuta3"
+(define_expand "sub<mode>3"
+  [(parallel [(match_operand:ALL8 0 "general_operand" "")
+              (match_operand:ALL8 1 "general_operand" "")
+              (match_operand:ALL8 2 "general_operand" "")])]
   "avr_have_dimode"
   {
-    rtx acc_a = gen_rtx_REG (DImode, ACC_A);
+    rtx acc_a = gen_rtx_REG (<MODE>mode, ACC_A);
 
     emit_move_insn (acc_a, operands[1]);
-    emit_move_insn (gen_rtx_REG (DImode, ACC_B), operands[2]);
-    emit_insn (gen_subdi3_insn ());
+
+    if (const_operand (operands[2], GET_MODE (operands[2])))
+      {
+        emit_insn (gen_sub<mode>3_const_insn (operands[2]));
+      }
+    else
+     {
+       emit_move_insn (gen_rtx_REG (<MODE>mode, ACC_B), operands[2]);
+       emit_insn (gen_sub<mode>3_insn ());
+     }
+
     emit_move_insn (operands[0], acc_a);
     DONE;
   })
 
-(define_insn "subdi3_insn"
-  [(set (reg:DI ACC_A)
-        (minus:DI (reg:DI ACC_A)
-                  (reg:DI ACC_B)))]
+;; "subdi3_insn"
+;; "subdq3_insn" "subudq3_insn"
+;; "subda3_insn" "subuda3_insn"
+;; "subta3_insn" "subuta3_insn"
+(define_insn "sub<mode>3_insn"
+  [(set (reg:ALL8 ACC_A)
+        (minus:ALL8 (reg:ALL8 ACC_A)
+                    (reg:ALL8 ACC_B)))]
   "avr_have_dimode"
   "%~call __subdi3"
   [(set_attr "adjust_len" "call")
    (set_attr "cc" "set_czn")])
 
+;; "subdi3_const_insn"
+;; "subdq3_const_insn" "subudq3_const_insn"
+;; "subda3_const_insn" "subuda3_const_insn"
+;; "subta3_const_insn" "subuta3_const_insn"
+(define_insn "sub<mode>3_const_insn"
+  [(set (reg:ALL8 ACC_A)
+        (minus:ALL8 (reg:ALL8 ACC_A)
+                    (match_operand:ALL8 0 "const_operand" "n Ynn")))]
+  "avr_have_dimode"
+  {
+    return avr_out_minus64 (operands[0], NULL);
+  }
+  [(set_attr "adjust_len" "minus64")
+   (set_attr "cc" "clobber")])
+
 
 ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
 ;; Negation
@@ -180,15 +230,19 @@ (define_expand "conditional_jump"
          (pc)))]
   "avr_have_dimode")
 
-(define_expand "cbranchdi4"
-  [(parallel [(match_operand:DI 1 "register_operand" "")
-              (match_operand:DI 2 "nonmemory_operand" "")
+;; "cbranchdi4"
+;; "cbranchdq4" "cbranchudq4"
+;; "cbranchda4" "cbranchuda4"
+;; "cbranchta4" "cbranchuta4"
+(define_expand "cbranch<mode>4"
+  [(parallel [(match_operand:ALL8 1 "register_operand" "")
+              (match_operand:ALL8 2 "nonmemory_operand" "")
               (match_operator 0 "ordered_comparison_operator" [(cc0)
                                                                (const_int 0)])
               (label_ref (match_operand 3 "" ""))])]
   "avr_have_dimode"
   {
-    rtx acc_a = gen_rtx_REG (DImode, ACC_A);
+    rtx acc_a = gen_rtx_REG (<MODE>mode, ACC_A);
 
     emit_move_insn (acc_a, operands[1]);
 
@@ -197,25 +251,28 @@ (define_expand "cbranchdi4"
         emit_move_insn (gen_rtx_REG (QImode, REG_X), operands[2]);
         emit_insn (gen_compare_const8_di2 ());
       }        
-    else if (CONST_INT_P (operands[2])
-             || CONST_DOUBLE_P (operands[2]))
+    else if (const_operand (operands[2], GET_MODE (operands[2])))
       {
-        emit_insn (gen_compare_const_di2 (operands[2]));
+        emit_insn (gen_compare_const_<mode>2 (operands[2]));
       }
     else
       {
-        emit_move_insn (gen_rtx_REG (DImode, ACC_B), operands[2]);
-        emit_insn (gen_compare_di2 ());
+        emit_move_insn (gen_rtx_REG (<MODE>mode, ACC_B), operands[2]);
+        emit_insn (gen_compare_<mode>2 ());
       }
 
     emit_jump_insn (gen_conditional_jump (operands[0], operands[3]));
     DONE;
   })
 
-(define_insn "compare_di2"
+;; "compare_di2"
+;; "compare_dq2" "compare_udq2"
+;; "compare_da2" "compare_uda2"
+;; "compare_ta2" "compare_uta2"
+(define_insn "compare_<mode>2"
   [(set (cc0)
-        (compare (reg:DI ACC_A)
-                 (reg:DI ACC_B)))]
+        (compare (reg:ALL8 ACC_A)
+                 (reg:ALL8 ACC_B)))]
   "avr_have_dimode"
   "%~call __cmpdi2"
   [(set_attr "adjust_len" "call")
@@ -230,10 +287,14 @@ (define_insn "compare_const8_di2"
   [(set_attr "adjust_len" "call")
    (set_attr "cc" "compare")])
 
-(define_insn "compare_const_di2"
+;; "compare_const_di2"
+;; "compare_const_dq2" "compare_const_udq2"
+;; "compare_const_da2" "compare_const_uda2"
+;; "compare_const_ta2" "compare_const_uta2"
+(define_insn "compare_const_<mode>2"
   [(set (cc0)
-        (compare (reg:DI ACC_A)
-                 (match_operand:DI 0 "const_double_operand" "n")))
+        (compare (reg:ALL8 ACC_A)
+                 (match_operand:ALL8 0 "const_operand" "n Ynn")))
    (clobber (match_scratch:QI 1 "=&d"))]
   "avr_have_dimode
    && !s8_operand (operands[0], VOIDmode)"
@@ -254,29 +315,39 @@ (define_code_iterator di_shifts
 ;; Shift functions from libgcc are called without defining these insns,
 ;; but with them we can describe their reduced register footprint.
 
-;; "ashldi3"
-;; "ashrdi3"
-;; "lshrdi3"
-;; "rotldi3"
-(define_expand "<code_stdname>di3"
-  [(parallel [(match_operand:DI 0 "general_operand" "")
-              (di_shifts:DI (match_operand:DI 1 "general_operand" "")
-                            (match_operand:QI 2 "general_operand" ""))])]
+;; "ashldi3"   "ashrdi3"   "lshrdi3"   "rotldi3"
+;; "ashldq3"   "ashrdq3"   "lshrdq3"   "rotldq3"
+;; "ashlda3"   "ashrda3"   "lshrda3"   "rotlda3"
+;; "ashlta3"   "ashrta3"   "lshrta3"   "rotlta3"
+;; "ashludq3"  "ashrudq3"  "lshrudq3"  "rotludq3"
+;; "ashluda3"  "ashruda3"  "lshruda3"  "rotluda3"
+;; "ashluta3"  "ashruta3"  "lshruta3"  "rotluta3"
+(define_expand "<code_stdname><mode>3"
+  [(parallel [(match_operand:ALL8 0 "general_operand" "")
+              (di_shifts:ALL8 (match_operand:ALL8 1 "general_operand" "")
+                              (match_operand:QI 2 "general_operand" ""))])]
   "avr_have_dimode"
   {
-    rtx acc_a = gen_rtx_REG (DImode, ACC_A);
+    rtx acc_a = gen_rtx_REG (<MODE>mode, ACC_A);
 
     emit_move_insn (acc_a, operands[1]);
     emit_move_insn (gen_rtx_REG (QImode, 16), operands[2]);
-    emit_insn (gen_<code_stdname>di3_insn ());
+    emit_insn (gen_<code_stdname><mode>3_insn ());
     emit_move_insn (operands[0], acc_a);
     DONE;
   })
 
-(define_insn "<code_stdname>di3_insn"
-  [(set (reg:DI ACC_A)
-        (di_shifts:DI (reg:DI ACC_A)
-                      (reg:QI 16)))]
+;; "ashldi3_insn"   "ashrdi3_insn"   "lshrdi3_insn"   "rotldi3_insn"
+;; "ashldq3_insn"   "ashrdq3_insn"   "lshrdq3_insn"   "rotldq3_insn"
+;; "ashlda3_insn"   "ashrda3_insn"   "lshrda3_insn"   "rotlda3_insn"
+;; "ashlta3_insn"   "ashrta3_insn"   "lshrta3_insn"   "rotlta3_insn"
+;; "ashludq3_insn"  "ashrudq3_insn"  "lshrudq3_insn"  "rotludq3_insn"
+;; "ashluda3_insn"  "ashruda3_insn"  "lshruda3_insn"  "rotluda3_insn"
+;; "ashluta3_insn"  "ashruta3_insn"  "lshruta3_insn"  "rotluta3_insn"
+(define_insn "<code_stdname><mode>3_insn"
+  [(set (reg:ALL8 ACC_A)
+        (di_shifts:ALL8 (reg:ALL8 ACC_A)
+                        (reg:QI 16)))]
   "avr_have_dimode"
   "%~call __<code_stdname>di3"
   [(set_attr "adjust_len" "call")
Index: gcc/config/avr/avr.md
===================================================================
--- gcc/config/avr/avr.md	(revision 190535)
+++ gcc/config/avr/avr.md	(working copy)
@@ -88,10 +88,10 @@ (define_c_enum "unspecv"
 
 (include "predicates.md")
 (include "constraints.md")
-  
+
 ;; Condition code settings.
 (define_attr "cc" "none,set_czn,set_zn,set_n,compare,clobber,
-                   out_plus, out_plus_noclobber,ldi"
+                   out_plus, out_plus_noclobber,ldi,minus"
   (const_string "none"))
 
 (define_attr "type" "branch,branch1,arith,xcall"
@@ -139,8 +139,10 @@ (define_attr "length" ""
 
 (define_attr "adjust_len"
   "out_bitop, out_plus, out_plus_noclobber, plus64, addto_sp,
+   minus, minus64,
    tsthi, tstpsi, tstsi, compare, compare64, call,
    mov8, mov16, mov24, mov32, reload_in16, reload_in24, reload_in32,
+   ufract, sfract,
    xload, movmem, load_lpm,
    ashlqi, ashrqi, lshrqi,
    ashlhi, ashrhi, lshrhi,
@@ -225,8 +227,20 @@ (define_mode_iterator QISI [(QI "") (HI
 (define_mode_iterator QIDI [(QI "") (HI "") (PSI "") (SI "") (DI "")])
 (define_mode_iterator HISI [(HI "") (PSI "") (SI "")])
 
+(define_mode_iterator ALL1 [(QI "") (QQ "") (UQQ "")])
+(define_mode_iterator ALL2 [(HI "") (HQ "") (UHQ "") (HA "") (UHA "")])
+(define_mode_iterator ALL4 [(SI "") (SQ "") (USQ "") (SA "") (USA "")])
+
 ;; All supported move-modes
-(define_mode_iterator MOVMODE [(QI "") (HI "") (SI "") (SF "") (PSI "")])
+(define_mode_iterator MOVMODE [(QI "") (HI "") (SI "") (SF "") (PSI "")
+                               (QQ "") (UQQ "")
+                               (HQ "") (UHQ "") (HA "") (UHA "")
+                               (SQ "") (USQ "") (SA "") (USA "")])
+
+;; Supported ordered modes that are 2, 3, 4 bytes wide
+(define_mode_iterator ORDERED234 [(HI "") (SI "") (PSI "")
+                                  (HQ "") (UHQ "") (HA "") (UHA "")
+                                  (SQ "") (USQ "") (SA "") (USA "")])
 
 ;; Define code iterators
 ;; Define two incarnations so that we can build the cross product.
@@ -317,9 +331,11 @@ (define_expand "nonlocal_goto"
   DONE;
 })
 
-(define_insn "pushqi1"
-  [(set (mem:QI (post_dec:HI (reg:HI REG_SP)))
-        (match_operand:QI 0 "reg_or_0_operand" "r,L"))]
+;; "pushqi1"
+;; "pushqq1"  "pushuqq1"
+(define_insn "push<mode>1"
+  [(set (mem:ALL1 (post_dec:HI (reg:HI REG_SP)))
+        (match_operand:ALL1 0 "reg_or_0_operand" "r,Y00"))]
   ""
   "@
 	push %0
@@ -334,7 +350,11 @@ (define_mode_iterator MPUSH
    (PSI "")
    (SI "") (CSI "")
    (DI "") (CDI "")
-   (SF "") (SC "")])
+   (SF "") (SC "")
+   (HA "") (UHA "") (HQ "") (UHQ "")
+   (SA "") (USA "") (SQ "") (USQ "")
+   (DA "") (UDA "") (DQ "") (UDQ "")
+   (TA "") (UTA "")])
 
 (define_expand "push<mode>1"
   [(match_operand:MPUSH 0 "" "")]
@@ -422,12 +442,14 @@ (define_insn "load_<mode>_clobber"
    (set_attr "cc" "clobber")])
 
 
-(define_insn_and_split "xload8_A"
-  [(set (match_operand:QI 0 "register_operand" "=r")
-        (match_operand:QI 1 "memory_operand"    "m"))
+;; "xload8qi_A"
+;; "xload8qq_A" "xload8uqq_A"
+(define_insn_and_split "xload8<mode>_A"
+  [(set (match_operand:ALL1 0 "register_operand" "=r")
+        (match_operand:ALL1 1 "memory_operand"    "m"))
    (clobber (reg:HI REG_Z))]
   "can_create_pseudo_p()
-   && !avr_xload_libgcc_p (QImode)
+   && !avr_xload_libgcc_p (<MODE>mode)
    && avr_mem_memx_p (operands[1])
    && REG_P (XEXP (operands[1], 0))"
   { gcc_unreachable(); }
@@ -441,16 +463,16 @@ (define_insn_and_split "xload8_A"
     emit_move_insn (reg_z, simplify_gen_subreg (HImode, addr, PSImode, 0));
     emit_move_insn (hi8, simplify_gen_subreg (QImode, addr, PSImode, 2));
 
-    insn = emit_insn (gen_xload_8 (operands[0], hi8));
+    insn = emit_insn (gen_xload<mode>_8 (operands[0], hi8));
     set_mem_addr_space (SET_SRC (single_set (insn)),
                                  MEM_ADDR_SPACE (operands[1]));
     DONE;
   })
 
-;; "xloadqi_A"
-;; "xloadhi_A"
+;; "xloadqi_A" "xloadqq_A" "xloaduqq_A"
+;; "xloadhi_A" "xloadhq_A" "xloaduhq_A" "xloadha_A" "xloaduha_A"
+;; "xloadsi_A" "xloadsq_A" "xloadusq_A" "xloadsa_A" "xloadusa_A"
 ;; "xloadpsi_A"
-;; "xloadsi_A"
 ;; "xloadsf_A"
 (define_insn_and_split "xload<mode>_A"
   [(set (match_operand:MOVMODE 0 "register_operand" "=r")
@@ -488,11 +510,13 @@ (define_insn_and_split "xload<mode>_A"
 ;; Move value from address space memx to a register
 ;; These insns must be prior to respective generic move insn.
 
-(define_insn "xload_8"
-  [(set (match_operand:QI 0 "register_operand"                   "=&r,r")
-        (mem:QI (lo_sum:PSI (match_operand:QI 1 "register_operand" "r,r")
-                            (reg:HI REG_Z))))]
-  "!avr_xload_libgcc_p (QImode)"
+;; "xloadqi_8"
+;; "xloadqq_8" "xloaduqq_8"
+(define_insn "xload<mode>_8"
+  [(set (match_operand:ALL1 0 "register_operand"                   "=&r,r")
+        (mem:ALL1 (lo_sum:PSI (match_operand:QI 1 "register_operand" "r,r")
+                              (reg:HI REG_Z))))]
+  "!avr_xload_libgcc_p (<MODE>mode)"
   {
     return avr_out_xload (insn, operands, NULL);
   }
@@ -504,11 +528,11 @@ (define_insn "xload_8"
 ;; R21:Z : 24-bit source address
 ;; R22   : 1-4 byte output
 
-;; "xload_qi_libgcc"
-;; "xload_hi_libgcc"
-;; "xload_psi_libgcc"
-;; "xload_si_libgcc"
+;; "xload_qi_libgcc" "xload_qq_libgcc" "xload_uqq_libgcc"
+;; "xload_hi_libgcc" "xload_hq_libgcc" "xload_uhq_libgcc" "xload_ha_libgcc" "xload_uha_libgcc"
+;; "xload_si_libgcc" "xload_sq_libgcc" "xload_usq_libgcc" "xload_sa_libgcc" "xload_usa_libgcc"
 ;; "xload_sf_libgcc"
+;; "xload_psi_libgcc"
 (define_insn "xload_<mode>_libgcc"
   [(set (reg:MOVMODE 22)
         (mem:MOVMODE (lo_sum:PSI (reg:QI 21)
@@ -528,9 +552,9 @@ (define_insn "xload_<mode>_libgcc"
 
 ;; General move expanders
 
-;; "movqi"
-;; "movhi"
-;; "movsi"
+;; "movqi" "movqq" "movuqq"
+;; "movhi" "movhq" "movuhq" "movha" "movuha"
+;; "movsi" "movsq" "movusq" "movsa" "movusa"
 ;; "movsf"
 ;; "movpsi"
 (define_expand "mov<mode>"
@@ -546,8 +570,7 @@ (define_expand "mov<mode>"
   
     /* One of the operands has to be in a register.  */
     if (!register_operand (dest, <MODE>mode)
-        && !(register_operand (src, <MODE>mode)
-             || src == CONST0_RTX (<MODE>mode)))
+        && !reg_or_0_operand (src, <MODE>mode))
       {
         operands[1] = src = copy_to_mode_reg (<MODE>mode, src);
       }
@@ -560,7 +583,9 @@ (define_expand "mov<mode>"
         src = replace_equiv_address (src, copy_to_mode_reg (PSImode, addr));
 
       if (!avr_xload_libgcc_p (<MODE>mode))
-        emit_insn (gen_xload8_A (dest, src));
+        /* ; No <mode> here because gen_xload8<mode>_A only iterates over ALL1.
+           ; insn-emit does not depend on the mode, it's all about operands.  */
+        emit_insn (gen_xload8qi_A (dest, src));
       else
         emit_insn (gen_xload<mode>_A (dest, src));
 
@@ -627,12 +652,13 @@ (define_expand "mov<mode>"
 ;; are call-saved registers, and most of LD_REGS are call-used registers,
 ;; so this may still be a win for registers live across function calls.
 
-(define_insn "movqi_insn"
-  [(set (match_operand:QI 0 "nonimmediate_operand" "=r ,d,Qm,r ,q,r,*r")
-        (match_operand:QI 1 "nox_general_operand"   "rL,i,rL,Qm,r,q,i"))]
-  "register_operand (operands[0], QImode)
-   || register_operand (operands[1], QImode)
-   || const0_rtx == operands[1]"
+;; "movqi_insn"
+;; "movqq_insn" "movuqq_insn"
+(define_insn "mov<mode>_insn"
+  [(set (match_operand:ALL1 0 "nonimmediate_operand" "=r    ,d    ,Qm   ,r ,q,r,*r")
+        (match_operand:ALL1 1 "nox_general_operand"   "r Y00,n Ynn,r Y00,Qm,r,q,i"))]
+  "register_operand (operands[0], <MODE>mode)
+   || reg_or_0_operand (operands[1], <MODE>mode)"
   {
     return output_movqi (insn, operands, NULL);
   }
@@ -643,9 +669,11 @@ (define_insn "movqi_insn"
 ;; This is used in peephole2 to optimize loading immediate constants
 ;; if a scratch register from LD_REGS happens to be available.
 
-(define_insn "*reload_inqi"
-  [(set (match_operand:QI 0 "register_operand" "=l")
-        (match_operand:QI 1 "immediate_operand" "i"))
+;; "*reload_inqi"
+;; "*reload_inqq" "*reload_inuqq"
+(define_insn "*reload_in<mode>"
+  [(set (match_operand:ALL1 0 "register_operand"    "=l")
+        (match_operand:ALL1 1 "const_operand"        "i"))
    (clobber (match_operand:QI 2 "register_operand" "=&d"))]
   "reload_completed"
   "ldi %2,lo8(%1)
@@ -655,14 +683,15 @@ (define_insn "*reload_inqi"
 
 (define_peephole2
   [(match_scratch:QI 2 "d")
-   (set (match_operand:QI 0 "l_register_operand" "")
-        (match_operand:QI 1 "immediate_operand" ""))]
-  "(operands[1] != const0_rtx
-    && operands[1] != const1_rtx
-    && operands[1] != constm1_rtx)"
-  [(parallel [(set (match_dup 0) (match_dup 1))
-              (clobber (match_dup 2))])]
-  "")
+   (set (match_operand:ALL1 0 "l_register_operand" "")
+        (match_operand:ALL1 1 "const_operand" ""))]
+  ; No need for a clobber reg for 0x0, 0x01 or 0xff
+  "!satisfies_constraint_Y00 (operands[1])
+   && !satisfies_constraint_Y01 (operands[1])
+   && !satisfies_constraint_Ym1 (operands[1])"
+  [(parallel [(set (match_dup 0)
+                   (match_dup 1))
+              (clobber (match_dup 2))])])
 
 ;;============================================================================
 ;; move word (16 bit)
@@ -693,18 +722,20 @@ (define_insn "movhi_sp_r"
 
 (define_peephole2
   [(match_scratch:QI 2 "d")
-   (set (match_operand:HI 0 "l_register_operand" "")
-        (match_operand:HI 1 "immediate_operand" ""))]
-  "(operands[1] != const0_rtx 
-    && operands[1] != constm1_rtx)"
-  [(parallel [(set (match_dup 0) (match_dup 1))
-              (clobber (match_dup 2))])]
-  "")
+   (set (match_operand:ALL2 0 "l_register_operand" "")
+        (match_operand:ALL2 1 "const_or_immediate_operand" ""))]
+  "operands[1] != CONST0_RTX (<MODE>mode)"
+  [(parallel [(set (match_dup 0)
+                   (match_dup 1))
+              (clobber (match_dup 2))])])
 
 ;; '*' because it is not used in rtl generation, only in above peephole
-(define_insn "*reload_inhi"
-  [(set (match_operand:HI 0 "register_operand" "=r")
-        (match_operand:HI 1 "immediate_operand" "i"))
+;; "*reload_inhi"
+;; "*reload_inhq" "*reload_inuhq"
+;; "*reload_inha" "*reload_inuha"
+(define_insn "*reload_in<mode>"
+  [(set (match_operand:ALL2 0 "l_register_operand"  "=l")
+        (match_operand:ALL2 1 "immediate_operand"    "i"))
    (clobber (match_operand:QI 2 "register_operand" "=&d"))]
   "reload_completed"
   {
@@ -712,14 +743,16 @@ (define_insn "*reload_inhi"
   }
   [(set_attr "length" "4")
    (set_attr "adjust_len" "reload_in16")
-   (set_attr "cc" "none")])
+   (set_attr "cc" "clobber")])
 
-(define_insn "*movhi"
-  [(set (match_operand:HI 0 "nonimmediate_operand" "=r,r,r,m ,d,*r,q,r")
-        (match_operand:HI 1 "nox_general_operand"   "r,L,m,rL,i,i ,r,q"))]
-  "register_operand (operands[0], HImode)
-   || register_operand (operands[1], HImode)
-   || const0_rtx == operands[1]"
+;; "*movhi"
+;; "*movhq" "*movuhq"
+;; "*movha" "*movuha"
+(define_insn "*mov<mode>"
+  [(set (match_operand:ALL2 0 "nonimmediate_operand" "=r,r  ,r,m    ,d,*r,q,r")
+        (match_operand:ALL2 1 "nox_general_operand"   "r,Y00,m,r Y00,i,i ,r,q"))]
+  "register_operand (operands[0], <MODE>mode)
+   || reg_or_0_operand (operands[1], <MODE>mode)"
   {
     return output_movhi (insn, operands, NULL);
   }
@@ -728,28 +761,30 @@ (define_insn "*movhi"
    (set_attr "cc" "none,none,clobber,clobber,none,clobber,none,none")])
 
 (define_peephole2 ; movw
-  [(set (match_operand:QI 0 "even_register_operand" "")
-        (match_operand:QI 1 "even_register_operand" ""))
-   (set (match_operand:QI 2 "odd_register_operand" "")
-        (match_operand:QI 3 "odd_register_operand" ""))]
+  [(set (match_operand:ALL1 0 "even_register_operand" "")
+        (match_operand:ALL1 1 "even_register_operand" ""))
+   (set (match_operand:ALL1 2 "odd_register_operand" "")
+        (match_operand:ALL1 3 "odd_register_operand" ""))]
   "(AVR_HAVE_MOVW
     && REGNO (operands[0]) == REGNO (operands[2]) - 1
     && REGNO (operands[1]) == REGNO (operands[3]) - 1)"
-  [(set (match_dup 4) (match_dup 5))]
+  [(set (match_dup 4)
+        (match_dup 5))]
   {
     operands[4] = gen_rtx_REG (HImode, REGNO (operands[0]));
     operands[5] = gen_rtx_REG (HImode, REGNO (operands[1]));
   })
 
 (define_peephole2 ; movw_r
-  [(set (match_operand:QI 0 "odd_register_operand" "")
-        (match_operand:QI 1 "odd_register_operand" ""))
-   (set (match_operand:QI 2 "even_register_operand" "")
-        (match_operand:QI 3 "even_register_operand" ""))]
+  [(set (match_operand:ALL1 0 "odd_register_operand" "")
+        (match_operand:ALL1 1 "odd_register_operand" ""))
+   (set (match_operand:ALL1 2 "even_register_operand" "")
+        (match_operand:ALL1 3 "even_register_operand" ""))]
   "(AVR_HAVE_MOVW
     && REGNO (operands[2]) == REGNO (operands[0]) - 1
     && REGNO (operands[3]) == REGNO (operands[1]) - 1)"
-  [(set (match_dup 4) (match_dup 5))]
+  [(set (match_dup 4)
+        (match_dup 5))]
   {
     operands[4] = gen_rtx_REG (HImode, REGNO (operands[2]));
     operands[5] = gen_rtx_REG (HImode, REGNO (operands[3]));
@@ -801,19 +836,21 @@ (define_insn "*movpsi"
 
 (define_peephole2 ; *reload_insi
   [(match_scratch:QI 2 "d")
-   (set (match_operand:SI 0 "l_register_operand" "")
-        (match_operand:SI 1 "const_int_operand" ""))
+   (set (match_operand:ALL4 0 "l_register_operand" "")
+        (match_operand:ALL4 1 "immediate_operand" ""))
    (match_dup 2)]
-  "(operands[1] != const0_rtx
-    && operands[1] != constm1_rtx)"
-  [(parallel [(set (match_dup 0) (match_dup 1))
-              (clobber (match_dup 2))])]
-  "")
+  "operands[1] != CONST0_RTX (<MODE>mode)"
+  [(parallel [(set (match_dup 0)
+                   (match_dup 1))
+              (clobber (match_dup 2))])])
 
 ;; '*' because it is not used in rtl generation.
+;; "*reload_insi"
+;; "*reload_insq" "*reload_inusq"
+;; "*reload_insa" "*reload_inusa"
 (define_insn "*reload_insi"
-  [(set (match_operand:SI 0 "register_operand" "=r")
-        (match_operand:SI 1 "const_int_operand" "n"))
+  [(set (match_operand:ALL4 0 "register_operand"   "=r")
+        (match_operand:ALL4 1 "immediate_operand"   "n Ynn"))
    (clobber (match_operand:QI 2 "register_operand" "=&d"))]
   "reload_completed"
   {
@@ -824,12 +861,14 @@ (define_insn "*reload_insi"
    (set_attr "cc" "clobber")])
 
 
-(define_insn "*movsi"
-  [(set (match_operand:SI 0 "nonimmediate_operand" "=r,r,r ,Qm,!d,r")
-        (match_operand:SI 1 "nox_general_operand"   "r,L,Qm,rL,i ,i"))]
-  "register_operand (operands[0], SImode)
-   || register_operand (operands[1], SImode)
-   || const0_rtx == operands[1]"
+;; "*movsi"
+;; "*movsq" "*movusq"
+;; "*movsa" "*movusa"
+(define_insn "*mov<mode>"
+  [(set (match_operand:ALL4 0 "nonimmediate_operand" "=r,r  ,r ,Qm   ,!d,r")
+        (match_operand:ALL4 1 "nox_general_operand"   "r,Y00,Qm,r Y00,i ,i"))]
+  "register_operand (operands[0], <MODE>mode)
+   || reg_or_0_operand (operands[1], <MODE>mode)"
   {
     return output_movsisf (insn, operands, NULL);
   }
@@ -844,8 +883,7 @@ (define_insn "*movsf"
   [(set (match_operand:SF 0 "nonimmediate_operand" "=r,r,r ,Qm,!d,r")
         (match_operand:SF 1 "nox_general_operand"   "r,G,Qm,rG,F ,F"))]
   "register_operand (operands[0], SFmode)
-   || register_operand (operands[1], SFmode)
-   || operands[1] == CONST0_RTX (SFmode)"
+   || reg_or_0_operand (operands[1], SFmode)"
   {
     return output_movsisf (insn, operands, NULL);
   }
@@ -861,8 +899,7 @@ (define_peephole2 ; *reload_insf
   "operands[1] != CONST0_RTX (SFmode)"
   [(parallel [(set (match_dup 0) 
                    (match_dup 1))
-              (clobber (match_dup 2))])]
-  "")
+              (clobber (match_dup 2))])])
 
 ;; '*' because it is not used in rtl generation.
 (define_insn "*reload_insf"
@@ -1015,9 +1052,10 @@ (define_expand "strlenhi"
    (set (match_dup 4)
         (plus:HI (match_dup 4)
                  (const_int -1)))
-   (set (match_operand:HI 0 "register_operand" "")
-        (minus:HI (match_dup 4)
-                  (match_dup 5)))]
+   (parallel [(set (match_operand:HI 0 "register_operand" "")
+                   (minus:HI (match_dup 4)
+                             (match_dup 5)))
+              (clobber (scratch:QI))])]
   ""
   {
     rtx addr;
@@ -1043,10 +1081,12 @@ (define_insn "*strlenhi"
 ;+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 ; add bytes
 
-(define_insn "addqi3"
-  [(set (match_operand:QI 0 "register_operand"          "=r,d,r,r,r,r")
-        (plus:QI (match_operand:QI 1 "register_operand" "%0,0,0,0,0,0")
-                 (match_operand:QI 2 "nonmemory_operand" "r,i,P,N,K,Cm2")))]
+;; "addqi3"
+;; "addqq3" "adduqq3"
+(define_insn "add<mode>3"
+  [(set (match_operand:ALL1 0 "register_operand"            "=r,d    ,r  ,r  ,r  ,r")
+        (plus:ALL1 (match_operand:ALL1 1 "register_operand" "%0,0    ,0  ,0  ,0  ,0")
+                   (match_operand:ALL1 2 "nonmemory_operand" "r,n Ynn,Y01,Ym1,Y02,Ym2")))]
   ""
   "@
 	add %0,%2
@@ -1058,11 +1098,13 @@ (define_insn "addqi3"
   [(set_attr "length" "1,1,1,1,2,2")
    (set_attr "cc" "set_czn,set_czn,set_zn,set_zn,set_zn,set_zn")])
 
-
-(define_expand "addhi3"
-  [(set (match_operand:HI 0 "register_operand" "")
-        (plus:HI (match_operand:HI 1 "register_operand" "")
-                 (match_operand:HI 2 "nonmemory_operand" "")))]
+;; "addhi3"
+;; "addhq3" "adduhq3"
+;; "addha3" "adduha3"
+(define_expand "add<mode>3"
+  [(set (match_operand:ALL2 0 "register_operand" "")
+        (plus:ALL2 (match_operand:ALL2 1 "register_operand" "")
+                   (match_operand:ALL2 2 "nonmemory_or_const_operand" "")))]
   ""
   {
     if (CONST_INT_P (operands[2]))
@@ -1079,6 +1121,12 @@ (define_expand "addhi3"
             DONE;
           }
       }
+
+    if (CONST_FIXED == GET_CODE (operands[2]))
+      {
+        emit_insn (gen_add<mode>3_clobber (operands[0], operands[1], operands[2]));
+        DONE;
+      }
   })
 
 
@@ -1124,24 +1172,22 @@ (define_insn "*addhi3_sp"
   [(set_attr "length" "6")
    (set_attr "adjust_len" "addto_sp")])
 
-(define_insn "*addhi3"
-  [(set (match_operand:HI 0 "register_operand"          "=r,d,!w,d")
-        (plus:HI (match_operand:HI 1 "register_operand" "%0,0,0 ,0")
-                 (match_operand:HI 2 "nonmemory_operand" "r,s,IJ,n")))]
+;; "*addhi3"
+;; "*addhq3" "*adduhq3"
+;; "*addha3" "*adduha3"
+(define_insn "*add<mode>3"
+  [(set (match_operand:ALL2 0 "register_operand"                     "=r,d,!w    ,d")
+        (plus:ALL2 (match_operand:ALL2 1 "register_operand"          "%0,0,0     ,0")
+                   (match_operand:ALL2 2 "nonmemory_or_const_operand" "r,s,IJ YIJ,n Ynn")))]
   ""
   {
-    static const char * const asm_code[] =
-      {
-        "add %A0,%A2\;adc %B0,%B2",
-        "subi %A0,lo8(-(%2))\;sbci %B0,hi8(-(%2))",
-        "",
-        ""
-      };
-
-    if (*asm_code[which_alternative])
-      return asm_code[which_alternative];
-
-    return avr_out_plus_noclobber (operands, NULL, NULL);
+    if (REG_P (operands[2]))
+      return "add %A0,%A2\;adc %B0,%B2";
+    else if (CONST_INT_P (operands[2])
+             || CONST_FIXED == GET_CODE (operands[2]))
+      return avr_out_plus_noclobber (operands, NULL, NULL);
+    else
+      return "subi %A0,lo8(-(%2))\;sbci %B0,hi8(-(%2))";
   }
   [(set_attr "length" "2,2,2,2")
    (set_attr "adjust_len" "*,*,out_plus_noclobber,out_plus_noclobber")
@@ -1152,41 +1198,44 @@ (define_insn "*addhi3"
 ;; itself because that insn is special to reload.
 
 (define_peephole2 ; addhi3_clobber
-  [(set (match_operand:HI 0 "d_register_operand" "")
-        (match_operand:HI 1 "const_int_operand" ""))
-   (set (match_operand:HI 2 "l_register_operand" "")
-        (plus:HI (match_dup 2)
-                 (match_dup 0)))]
+  [(set (match_operand:ALL2 0 "d_register_operand" "")
+        (match_operand:ALL2 1 "const_operand" ""))
+   (set (match_operand:ALL2 2 "l_register_operand" "")
+        (plus:ALL2 (match_dup 2)
+                   (match_dup 0)))]
   "peep2_reg_dead_p (2, operands[0])"
   [(parallel [(set (match_dup 2)
-                   (plus:HI (match_dup 2)
-                            (match_dup 1)))
+                   (plus:ALL2 (match_dup 2)
+                              (match_dup 1)))
               (clobber (match_dup 3))])]
   {
-    operands[3] = simplify_gen_subreg (QImode, operands[0], HImode, 0);
+    operands[3] = simplify_gen_subreg (QImode, operands[0], <MODE>mode, 0);
   })
 
 ;; Same, but with reload to NO_LD_REGS
 ;; Combine *reload_inhi with *addhi3
 
 (define_peephole2 ; addhi3_clobber
-  [(parallel [(set (match_operand:HI 0 "l_register_operand" "")
-                   (match_operand:HI 1 "const_int_operand" ""))
+  [(parallel [(set (match_operand:ALL2 0 "l_register_operand" "")
+                   (match_operand:ALL2 1 "const_operand" ""))
               (clobber (match_operand:QI 2 "d_register_operand" ""))])
-   (set (match_operand:HI 3 "l_register_operand" "")
-        (plus:HI (match_dup 3)
-                 (match_dup 0)))]
+   (set (match_operand:ALL2 3 "l_register_operand" "")
+        (plus:ALL2 (match_dup 3)
+                   (match_dup 0)))]
   "peep2_reg_dead_p (2, operands[0])"
   [(parallel [(set (match_dup 3)
-                   (plus:HI (match_dup 3)
-                            (match_dup 1)))
+                   (plus:ALL2 (match_dup 3)
+                              (match_dup 1)))
               (clobber (match_dup 2))])])
 
-(define_insn "addhi3_clobber"
-  [(set (match_operand:HI 0 "register_operand"           "=!w,d,r")
-        (plus:HI (match_operand:HI 1 "register_operand"   "%0,0,0")
-                 (match_operand:HI 2 "const_int_operand"  "IJ,n,n")))
-   (clobber (match_scratch:QI 3                           "=X,X,&d"))]
+;; "addhi3_clobber"
+;; "addhq3_clobber" "adduhq3_clobber"
+;; "addha3_clobber" "adduha3_clobber"
+(define_insn "add<mode>3_clobber"
+  [(set (match_operand:ALL2 0 "register_operand"            "=!w    ,d    ,r")
+        (plus:ALL2 (match_operand:ALL2 1 "register_operand"  "%0    ,0    ,0")
+                   (match_operand:ALL2 2 "const_operand"     "IJ YIJ,n Ynn,n Ynn")))
+   (clobber (match_scratch:QI 3                             "=X     ,X    ,&d"))]
   ""
   {
     gcc_assert (REGNO (operands[0]) == REGNO (operands[1]));
@@ -1198,29 +1247,24 @@ (define_insn "addhi3_clobber"
    (set_attr "cc" "out_plus")])
 
 
-(define_insn "addsi3"
-  [(set (match_operand:SI 0 "register_operand"          "=r,d ,d,r")
-        (plus:SI (match_operand:SI 1 "register_operand" "%0,0 ,0,0")
-                 (match_operand:SI 2 "nonmemory_operand" "r,s ,n,n")))
-   (clobber (match_scratch:QI 3                         "=X,X ,X,&d"))]
+;; "addsi3"
+;; "addsq3" "addusq3"
+;; "addsa3" "addusa3"
+(define_insn "add<mode>3"
+  [(set (match_operand:ALL4 0 "register_operand"            "=r,d ,r")
+        (plus:ALL4 (match_operand:ALL4 1 "register_operand" "%0,0 ,0")
+                   (match_operand:ALL4 2 "nonmemory_operand" "r,i ,n Ynn")))
+   (clobber (match_scratch:QI 3                             "=X,X ,&d"))]
   ""
   {
-    static const char * const asm_code[] =
-      {
-        "add %A0,%A2\;adc %B0,%B2\;adc %C0,%C2\;adc %D0,%D2",
-        "subi %0,lo8(-(%2))\;sbci %B0,hi8(-(%2))\;sbci %C0,hlo8(-(%2))\;sbci %D0,hhi8(-(%2))",
-        "",
-        ""
-      };
-
-    if (*asm_code[which_alternative])
-      return asm_code[which_alternative];
+    if (REG_P (operands[2]))
+      return "add %A0,%A2\;adc %B0,%B2\;adc %C0,%C2\;adc %D0,%D2";
 
     return avr_out_plus (operands, NULL, NULL);
   }
-  [(set_attr "length" "4,4,4,8")
-   (set_attr "adjust_len" "*,*,out_plus,out_plus")
-   (set_attr "cc" "set_n,set_czn,out_plus,out_plus")])
+  [(set_attr "length" "4,4,8")
+   (set_attr "adjust_len" "*,out_plus,out_plus")
+   (set_attr "cc" "set_n,out_plus,out_plus")])
 
 (define_insn "*addpsi3_zero_extend.qi"
   [(set (match_operand:PSI 0 "register_operand"                          "=r")
@@ -1329,27 +1373,38 @@ (define_insn "*subpsi3_sign_extend.hi"
 
 ;-----------------------------------------------------------------------------
 ; sub bytes
-(define_insn "subqi3"
-  [(set (match_operand:QI 0 "register_operand" "=r,d")
-        (minus:QI (match_operand:QI 1 "register_operand" "0,0")
-                  (match_operand:QI 2 "nonmemory_operand" "r,i")))]
+
+;; "subqi3"
+;; "subqq3" "subuqq3"
+(define_insn "sub<mode>3"
+  [(set (match_operand:ALL1 0 "register_operand"                      "=r,d    ,r  ,r  ,r  ,r")
+        (minus:ALL1 (match_operand:ALL1 1 "register_operand"           "0,0    ,0  ,0  ,0  ,0")
+                    (match_operand:ALL1 2 "nonmemory_or_const_operand" "r,n Ynn,Y01,Ym1,Y02,Ym2")))]
   ""
   "@
 	sub %0,%2
-	subi %0,lo8(%2)"
-  [(set_attr "length" "1,1")
-   (set_attr "cc" "set_czn,set_czn")])
+	subi %0,lo8(%2)
+	dec %0
+	inc %0
+	dec %0\;dec %0
+	inc %0\;inc %0"
+  [(set_attr "length" "1,1,1,1,2,2")
+   (set_attr "cc" "set_czn,set_czn,set_zn,set_zn,set_zn,set_zn")])
 
-(define_insn "subhi3"
-  [(set (match_operand:HI 0 "register_operand" "=r,d")
-        (minus:HI (match_operand:HI 1 "register_operand" "0,0")
-                  (match_operand:HI 2 "nonmemory_operand" "r,i")))]
+;; "subhi3"
+;; "subhq3" "subuhq3"
+;; "subha3" "subuha3"
+(define_insn "sub<mode>3"
+  [(set (match_operand:ALL2 0 "register_operand"                      "=r,d    ,*r")
+        (minus:ALL2 (match_operand:ALL2 1 "register_operand"           "0,0    ,0")
+                    (match_operand:ALL2 2 "nonmemory_or_const_operand" "r,i Ynn,Ynn")))
+   (clobber (match_scratch:QI 3                                       "=X,X    ,&d"))]
   ""
-  "@
-	sub %A0,%A2\;sbc %B0,%B2
-	subi %A0,lo8(%2)\;sbci %B0,hi8(%2)"
-  [(set_attr "length" "2,2")
-   (set_attr "cc" "set_czn,set_czn")])
+  {
+    return avr_out_minus (operands, NULL, NULL);
+  }
+  [(set_attr "adjust_len" "minus")
+   (set_attr "cc" "minus")])
 
 (define_insn "*subhi3_zero_extend1"
   [(set (match_operand:HI 0 "register_operand"                          "=r")
@@ -1373,13 +1428,23 @@ (define_insn "*subhi3.sign_extend2"
   [(set_attr "length" "5")
    (set_attr "cc" "clobber")])
 
-(define_insn "subsi3"
-  [(set (match_operand:SI 0 "register_operand"          "=r")
-        (minus:SI (match_operand:SI 1 "register_operand" "0")
-                  (match_operand:SI 2 "register_operand" "r")))]
+;; "subsi3"
+;; "subsq3" "subusq3"
+;; "subsa3" "subusa3"
+(define_insn "sub<mode>3"
+  [(set (match_operand:ALL4 0 "register_operand"                      "=r,d    ,r")
+        (minus:ALL4 (match_operand:ALL4 1 "register_operand"           "0,0    ,0")
+                    (match_operand:ALL4 2 "nonmemory_or_const_operand" "r,n Ynn,Ynn")))
+   (clobber (match_scratch:QI 3                                       "=X,X    ,&d"))]
   ""
-  "sub %0,%2\;sbc %B0,%B2\;sbc %C0,%C2\;sbc %D0,%D2"
+  {
+    if (REG_P (operands[2]))
+      return "sub %0,%2\;sbc %B0,%B2\;sbc %C0,%C2\;sbc %D0,%D2";
+    
+    return avr_out_minus (operands, NULL, NULL);
+  }
   [(set_attr "length" "4")
+   (set_attr "adjust_len" "*,minus,minus")
    (set_attr "cc" "set_czn")])
 
 (define_insn "*subsi3_zero_extend"
@@ -1515,8 +1580,18 @@ (define_insn "*addsi3.lt0"
 	adc %A0,__zero_reg__\;adc %B0,__zero_reg__\;adc %C0,__zero_reg__\;adc %D0,__zero_reg__"
   [(set_attr "length" "6")
    (set_attr "cc" "clobber")])
-  
 
+(define_insn "*umulqihi3.call"
+  [(set (reg:HI 24)
+        (mult:HI (zero_extend:HI (reg:QI 22))
+                 (zero_extend:HI (reg:QI 24))))
+   (clobber (reg:QI 21))
+   (clobber (reg:HI 22))]
+  "!AVR_HAVE_MUL"
+  "%~call __umulqihi3"
+  [(set_attr "type" "xcall")
+   (set_attr "cc" "clobber")])
+  
 ;; "umulqihi3"
 ;; "mulqihi3"
 (define_insn "<extend_u>mulqihi3"
@@ -3303,44 +3378,58 @@ (define_insn_and_split "*rotb<mode>"
 ;;<< << << << << << << << << << << << << << << << << << << << << << << << << <<
 ;; arithmetic shift left
 
-(define_expand "ashlqi3"
-  [(set (match_operand:QI 0 "register_operand"            "")
-        (ashift:QI (match_operand:QI 1 "register_operand" "")
-                   (match_operand:QI 2 "nop_general_operand" "")))])
+;; "ashlqi3"
+;; "ashlqq3"  "ashluqq3"
+(define_expand "ashl<mode>3"
+  [(set (match_operand:ALL1 0 "register_operand" "")
+        (ashift:ALL1 (match_operand:ALL1 1 "register_operand" "")
+                     (match_operand:QI 2 "nop_general_operand" "")))])
 
 (define_split ; ashlqi3_const4
-  [(set (match_operand:QI 0 "d_register_operand" "")
-        (ashift:QI (match_dup 0)
-                   (const_int 4)))]
+  [(set (match_operand:ALL1 0 "d_register_operand" "")
+        (ashift:ALL1 (match_dup 0)
+                     (const_int 4)))]
   ""
-  [(set (match_dup 0) (rotate:QI (match_dup 0) (const_int 4)))
-   (set (match_dup 0) (and:QI (match_dup 0) (const_int -16)))]
-  "")
+  [(set (match_dup 1)
+        (rotate:QI (match_dup 1)
+                   (const_int 4)))
+   (set (match_dup 1)
+        (and:QI (match_dup 1)
+                (const_int -16)))]
+  {
+    operands[1] = avr_to_int_mode (operands[0]);
+  })
 
 (define_split ; ashlqi3_const5
-  [(set (match_operand:QI 0 "d_register_operand" "")
-        (ashift:QI (match_dup 0)
-                   (const_int 5)))]
+  [(set (match_operand:ALL1 0 "d_register_operand" "")
+        (ashift:ALL1 (match_dup 0)
+                     (const_int 5)))]
   ""
-  [(set (match_dup 0) (rotate:QI (match_dup 0) (const_int 4)))
-   (set (match_dup 0) (ashift:QI (match_dup 0) (const_int 1)))
-   (set (match_dup 0) (and:QI (match_dup 0) (const_int -32)))]
-  "")
+  [(set (match_dup 1) (rotate:QI (match_dup 1) (const_int 4)))
+   (set (match_dup 1) (ashift:QI (match_dup 1) (const_int 1)))
+   (set (match_dup 1) (and:QI (match_dup 1) (const_int -32)))]
+  {
+    operands[1] = avr_to_int_mode (operands[0]);
+  })
 
 (define_split ; ashlqi3_const6
-  [(set (match_operand:QI 0 "d_register_operand" "")
-        (ashift:QI (match_dup 0)
-                   (const_int 6)))]
+  [(set (match_operand:ALL1 0 "d_register_operand" "")
+        (ashift:ALL1 (match_dup 0)
+                     (const_int 6)))]
   ""
-  [(set (match_dup 0) (rotate:QI (match_dup 0) (const_int 4)))
-   (set (match_dup 0) (ashift:QI (match_dup 0) (const_int 2)))
-   (set (match_dup 0) (and:QI (match_dup 0) (const_int -64)))]
-  "")
+  [(set (match_dup 1) (rotate:QI (match_dup 1) (const_int 4)))
+   (set (match_dup 1) (ashift:QI (match_dup 1) (const_int 2)))
+   (set (match_dup 1) (and:QI (match_dup 1) (const_int -64)))]
+  {
+    operands[1] = avr_to_int_mode (operands[0]);
+  })
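+
+;; Note (informal): the splits above rewrite the operand via
+;; avr_to_int_mode because the ROTATE/AND patterns only exist for QImode;
+;; for a pseudo like (reg:QQ 42) this is just (subreg:QI (reg:QQ 42) 0),
+;; i.e. the same bits viewed as an integer.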
 
-(define_insn "*ashlqi3"
-  [(set (match_operand:QI 0 "register_operand"              "=r,r,r,r,!d,r,r")
-        (ashift:QI (match_operand:QI 1 "register_operand"    "0,0,0,0,0 ,0,0")
-                   (match_operand:QI 2 "nop_general_operand" "r,L,P,K,n ,n,Qm")))]
+;; "*ashlqi3"
+;; "*ashlqq3"  "*ashluqq3"
+(define_insn "*ashl<mode>3"
+  [(set (match_operand:ALL1 0 "register_operand"              "=r,r,r,r,!d,r,r")
+        (ashift:ALL1 (match_operand:ALL1 1 "register_operand"  "0,0,0,0,0 ,0,0")
+                     (match_operand:QI 2 "nop_general_operand" "r,L,P,K,n ,n,Qm")))]
   ""
   {
     return ashlqi3_out (insn, operands, NULL);
@@ -3349,10 +3438,10 @@ (define_insn "*ashlqi3"
    (set_attr "adjust_len" "ashlqi")
    (set_attr "cc" "clobber,none,set_czn,set_czn,set_czn,set_czn,clobber")])
 
-(define_insn "ashlhi3"
-  [(set (match_operand:HI 0 "register_operand"              "=r,r,r,r,r,r,r")
-        (ashift:HI (match_operand:HI 1 "register_operand"    "0,0,0,r,0,0,0")
-                   (match_operand:QI 2 "nop_general_operand" "r,L,P,O,K,n,Qm")))]
+(define_insn "ashl<mode>3"
+  [(set (match_operand:ALL2 0 "register_operand"              "=r,r,r,r,r,r,r")
+        (ashift:ALL2 (match_operand:ALL2 1 "register_operand"  "0,0,0,r,0,0,0")
+                     (match_operand:QI 2 "nop_general_operand" "r,L,P,O,K,n,Qm")))]
   ""
   {
     return ashlhi3_out (insn, operands, NULL);
@@ -3377,8 +3466,7 @@ (define_insn_and_split "*ashl<extend_su>
   ""
   [(set (match_dup 0)
         (ashift:QI (match_dup 1)
-                   (match_dup 2)))]
-  "")
+                   (match_dup 2)))])
 
 ;; ??? Combiner does not recognize that it could split the following insn;
 ;;     presumably because he has no register handy?
@@ -3443,10 +3531,13 @@ (define_peephole2
   })
 
 
-(define_insn "ashlsi3"
-  [(set (match_operand:SI 0 "register_operand"              "=r,r,r,r,r,r,r")
-        (ashift:SI (match_operand:SI 1 "register_operand"    "0,0,0,r,0,0,0")
-                   (match_operand:QI 2 "nop_general_operand" "r,L,P,O,K,n,Qm")))]
+;; "ashlsi3"
+;; "ashlsq3"  "ashlusq3"
+;; "ashlsa3"  "ashlusa3"
+(define_insn "ashl<mode>3"
+  [(set (match_operand:ALL4 0 "register_operand"                "=r,r,r,r,r,r,r")
+        (ashift:ALL4 (match_operand:ALL4 1 "register_operand"    "0,0,0,r,0,0,0")
+                     (match_operand:QI 2 "nop_general_operand"   "r,L,P,O,K,n,Qm")))]
   ""
   {
     return ashlsi3_out (insn, operands, NULL);
@@ -3458,55 +3549,65 @@ (define_insn "ashlsi3"
 ;; Optimize if a scratch register from LD_REGS happens to be available.
 
 (define_peephole2 ; ashlqi3_l_const4
-  [(set (match_operand:QI 0 "l_register_operand" "")
-        (ashift:QI (match_dup 0)
-                   (const_int 4)))
+  [(set (match_operand:ALL1 0 "l_register_operand" "")
+        (ashift:ALL1 (match_dup 0)
+                     (const_int 4)))
    (match_scratch:QI 1 "d")]
   ""
-  [(set (match_dup 0) (rotate:QI (match_dup 0) (const_int 4)))
+  [(set (match_dup 2) (rotate:QI (match_dup 2) (const_int 4)))
    (set (match_dup 1) (const_int -16))
-   (set (match_dup 0) (and:QI (match_dup 0) (match_dup 1)))]
-  "")
+   (set (match_dup 2) (and:QI (match_dup 2) (match_dup 1)))]
+  {
+    operands[2] = avr_to_int_mode (operands[0]);
+  })
 
 (define_peephole2 ; ashlqi3_l_const5
-  [(set (match_operand:QI 0 "l_register_operand" "")
-        (ashift:QI (match_dup 0)
-                   (const_int 5)))
+  [(set (match_operand:ALL1 0 "l_register_operand" "")
+        (ashift:ALL1 (match_dup 0)
+                     (const_int 5)))
    (match_scratch:QI 1 "d")]
   ""
-  [(set (match_dup 0) (rotate:QI (match_dup 0) (const_int 4)))
-   (set (match_dup 0) (ashift:QI (match_dup 0) (const_int 1)))
+  [(set (match_dup 2) (rotate:QI (match_dup 2) (const_int 4)))
+   (set (match_dup 2) (ashift:QI (match_dup 2) (const_int 1)))
    (set (match_dup 1) (const_int -32))
-   (set (match_dup 0) (and:QI (match_dup 0) (match_dup 1)))]
-  "")
+   (set (match_dup 2) (and:QI (match_dup 2) (match_dup 1)))]
+  {
+    operands[2] = avr_to_int_mode (operands[0]);
+  })
 
 (define_peephole2 ; ashlqi3_l_const6
-  [(set (match_operand:QI 0 "l_register_operand" "")
-        (ashift:QI (match_dup 0)
-                   (const_int 6)))
+  [(set (match_operand:ALL1 0 "l_register_operand" "")
+        (ashift:ALL1 (match_dup 0)
+                     (const_int 6)))
    (match_scratch:QI 1 "d")]
   ""
-  [(set (match_dup 0) (rotate:QI (match_dup 0) (const_int 4)))
-   (set (match_dup 0) (ashift:QI (match_dup 0) (const_int 2)))
+  [(set (match_dup 2) (rotate:QI (match_dup 2) (const_int 4)))
+   (set (match_dup 2) (ashift:QI (match_dup 2) (const_int 2)))
    (set (match_dup 1) (const_int -64))
-   (set (match_dup 0) (and:QI (match_dup 0) (match_dup 1)))]
-  "")
+   (set (match_dup 2) (and:QI (match_dup 2) (match_dup 1)))]
+  {
+    operands[2] = avr_to_int_mode (operands[0]);
+  })
 
 (define_peephole2
   [(match_scratch:QI 3 "d")
-   (set (match_operand:HI 0 "register_operand" "")
-        (ashift:HI (match_operand:HI 1 "register_operand" "")
-                   (match_operand:QI 2 "const_int_operand" "")))]
+   (set (match_operand:ALL2 0 "register_operand" "")
+        (ashift:ALL2 (match_operand:ALL2 1 "register_operand" "")
+                     (match_operand:QI 2 "const_int_operand" "")))]
   ""
-  [(parallel [(set (match_dup 0) (ashift:HI (match_dup 1) (match_dup 2)))
-              (clobber (match_dup 3))])]
-  "")
-
-(define_insn "*ashlhi3_const"
-  [(set (match_operand:HI 0 "register_operand"            "=r,r,r,r,r")
-        (ashift:HI (match_operand:HI 1 "register_operand"  "0,0,r,0,0")
-                   (match_operand:QI 2 "const_int_operand" "L,P,O,K,n")))
-   (clobber (match_scratch:QI 3                           "=X,X,X,X,&d"))]
+  [(parallel [(set (match_dup 0)
+                   (ashift:ALL2 (match_dup 1)
+                                (match_dup 2)))
+              (clobber (match_dup 3))])])
+
+;; "*ashlhi3_const"
+;; "*ashlhq3_const"  "*ashluhq3_const"
+;; "*ashlha3_const"  "*ashluha3_const"
+(define_insn "*ashl<mode>3_const"
+  [(set (match_operand:ALL2 0 "register_operand"              "=r,r,r,r,r")
+        (ashift:ALL2 (match_operand:ALL2 1 "register_operand"  "0,0,r,0,0")
+                     (match_operand:QI 2 "const_int_operand"   "L,P,O,K,n")))
+   (clobber (match_scratch:QI 3                               "=X,X,X,X,&d"))]
   "reload_completed"
   {
     return ashlhi3_out (insn, operands, NULL);
@@ -3517,19 +3618,24 @@ (define_insn "*ashlhi3_const"
 
 (define_peephole2
   [(match_scratch:QI 3 "d")
-   (set (match_operand:SI 0 "register_operand" "")
-        (ashift:SI (match_operand:SI 1 "register_operand" "")
-                   (match_operand:QI 2 "const_int_operand" "")))]
+   (set (match_operand:ALL4 0 "register_operand" "")
+        (ashift:ALL4 (match_operand:ALL4 1 "register_operand" "")
+                     (match_operand:QI 2 "const_int_operand" "")))]
   ""
-  [(parallel [(set (match_dup 0) (ashift:SI (match_dup 1) (match_dup 2)))
+  [(parallel [(set (match_dup 0)
+                   (ashift:ALL4 (match_dup 1)
+                                (match_dup 2)))
               (clobber (match_dup 3))])]
   "")
 
-(define_insn "*ashlsi3_const"
-  [(set (match_operand:SI 0 "register_operand"            "=r,r,r,r")
-        (ashift:SI (match_operand:SI 1 "register_operand"  "0,0,r,0")
-                   (match_operand:QI 2 "const_int_operand" "L,P,O,n")))
-   (clobber (match_scratch:QI 3                           "=X,X,X,&d"))]
+;; "*ashlsi3_const"
+;; "*ashlsq3_const"  "*ashlusq3_const"
+;; "*ashlsa3_const"  "*ashlusa3_const"
+(define_insn "*ashl<mode>3_const"
+  [(set (match_operand:ALL4 0 "register_operand"              "=r,r,r,r")
+        (ashift:ALL4 (match_operand:ALL4 1 "register_operand"  "0,0,r,0")
+                     (match_operand:QI 2 "const_int_operand"   "L,P,O,n")))
+   (clobber (match_scratch:QI 3                               "=X,X,X,&d"))]
   "reload_completed"
   {
     return ashlsi3_out (insn, operands, NULL);
@@ -3580,10 +3686,12 @@ (define_insn "*ashlpsi3"
 ;; >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >>
 ;; arithmetic shift right
 
-(define_insn "ashrqi3"
-  [(set (match_operand:QI 0 "register_operand"                "=r,r,r,r,r          ,r      ,r")
-        (ashiftrt:QI (match_operand:QI 1 "register_operand"    "0,0,0,0,0          ,0      ,0")
-                     (match_operand:QI 2 "nop_general_operand" "r,L,P,K,C03 C04 C05,C06 C07,Qm")))]
+;; "ashrqi3"
+;; "ashrqq3"  "ashruqq3"
+(define_insn "ashr<mode>3"
+  [(set (match_operand:ALL1 0 "register_operand"                  "=r,r,r,r,r          ,r      ,r")
+        (ashiftrt:ALL1 (match_operand:ALL1 1 "register_operand"    "0,0,0,0,0          ,0      ,0")
+                       (match_operand:QI 2 "nop_general_operand"   "r,L,P,K,C03 C04 C05,C06 C07,Qm")))]
   ""
   {
     return ashrqi3_out (insn, operands, NULL);
@@ -3592,10 +3700,13 @@ (define_insn "ashrqi3"
    (set_attr "adjust_len" "ashrqi")
    (set_attr "cc" "clobber,none,set_czn,set_czn,set_czn,clobber,clobber")])
 
-(define_insn "ashrhi3"
-  [(set (match_operand:HI 0 "register_operand"                "=r,r,r,r,r,r,r")
-        (ashiftrt:HI (match_operand:HI 1 "register_operand"    "0,0,0,r,0,0,0")
-                     (match_operand:QI 2 "nop_general_operand" "r,L,P,O,K,n,Qm")))]
+;; "ashrhi3"
+;; "ashrhq3"  "ashruhq3"
+;; "ashrha3"  "ashruha3"
+(define_insn "ashr<mode>3"
+  [(set (match_operand:ALL2 0 "register_operand"                "=r,r,r,r,r,r,r")
+        (ashiftrt:ALL2 (match_operand:ALL2 1 "register_operand"  "0,0,0,r,0,0,0")
+                       (match_operand:QI 2 "nop_general_operand" "r,L,P,O,K,n,Qm")))]
   ""
   {
     return ashrhi3_out (insn, operands, NULL);
@@ -3616,10 +3727,13 @@ (define_insn "ashrpsi3"
   [(set_attr "adjust_len" "ashrpsi")
    (set_attr "cc" "clobber")])
 
-(define_insn "ashrsi3"
-  [(set (match_operand:SI 0 "register_operand"                "=r,r,r,r,r,r,r")
-        (ashiftrt:SI (match_operand:SI 1 "register_operand"    "0,0,0,r,0,0,0")
-                     (match_operand:QI 2 "nop_general_operand" "r,L,P,O,K,n,Qm")))]
+;; "ashrsi3"
+;; "ashrsq3"  "ashrusq3"
+;; "ashrsa3"  "ashrusa3"
+(define_insn "ashr<mode>3"
+  [(set (match_operand:ALL4 0 "register_operand"                  "=r,r,r,r,r,r,r")
+        (ashiftrt:ALL4 (match_operand:ALL4 1 "register_operand"    "0,0,0,r,0,0,0")
+                       (match_operand:QI 2 "nop_general_operand"   "r,L,P,O,K,n,Qm")))]
   ""
   {
     return ashrsi3_out (insn, operands, NULL);
@@ -3632,19 +3746,23 @@ (define_insn "ashrsi3"
 
 (define_peephole2
   [(match_scratch:QI 3 "d")
-   (set (match_operand:HI 0 "register_operand" "")
-        (ashiftrt:HI (match_operand:HI 1 "register_operand" "")
-                     (match_operand:QI 2 "const_int_operand" "")))]
+   (set (match_operand:ALL2 0 "register_operand" "")
+        (ashiftrt:ALL2 (match_operand:ALL2 1 "register_operand" "")
+                       (match_operand:QI 2 "const_int_operand" "")))]
   ""
-  [(parallel [(set (match_dup 0) (ashiftrt:HI (match_dup 1) (match_dup 2)))
-              (clobber (match_dup 3))])]
-  "")
-
-(define_insn "*ashrhi3_const"
-  [(set (match_operand:HI 0 "register_operand"              "=r,r,r,r,r")
-        (ashiftrt:HI (match_operand:HI 1 "register_operand"  "0,0,r,0,0")
-                     (match_operand:QI 2 "const_int_operand" "L,P,O,K,n")))
-   (clobber (match_scratch:QI 3                             "=X,X,X,X,&d"))]
+  [(parallel [(set (match_dup 0)
+                   (ashiftrt:ALL2 (match_dup 1)
+                                  (match_dup 2)))
+              (clobber (match_dup 3))])])
+
+;; "*ashrhi3_const"
+;; "*ashrhq3_const"  "*ashruhq3_const"
+;; "*ashrha3_const"  "*ashruha3_const"
+(define_insn "*ashr<mode>3_const"
+  [(set (match_operand:ALL2 0 "register_operand"                "=r,r,r,r,r")
+        (ashiftrt:ALL2 (match_operand:ALL2 1 "register_operand"  "0,0,r,0,0")
+                       (match_operand:QI 2 "const_int_operand"   "L,P,O,K,n")))
+   (clobber (match_scratch:QI 3                                 "=X,X,X,X,&d"))]
   "reload_completed"
   {
     return ashrhi3_out (insn, operands, NULL);
@@ -3655,19 +3773,23 @@ (define_insn "*ashrhi3_const"
 
 (define_peephole2
   [(match_scratch:QI 3 "d")
-   (set (match_operand:SI 0 "register_operand" "")
-        (ashiftrt:SI (match_operand:SI 1 "register_operand" "")
-                     (match_operand:QI 2 "const_int_operand" "")))]
+   (set (match_operand:ALL4 0 "register_operand" "")
+        (ashiftrt:ALL4 (match_operand:ALL4 1 "register_operand" "")
+                       (match_operand:QI 2 "const_int_operand" "")))]
   ""
-  [(parallel [(set (match_dup 0) (ashiftrt:SI (match_dup 1) (match_dup 2)))
-              (clobber (match_dup 3))])]
-  "")
-
-(define_insn "*ashrsi3_const"
-  [(set (match_operand:SI 0 "register_operand"              "=r,r,r,r")
-        (ashiftrt:SI (match_operand:SI 1 "register_operand"  "0,0,r,0")
-                     (match_operand:QI 2 "const_int_operand" "L,P,O,n")))
-   (clobber (match_scratch:QI 3                             "=X,X,X,&d"))]
+  [(parallel [(set (match_dup 0)
+                   (ashiftrt:ALL4 (match_dup 1)
+                                  (match_dup 2)))
+              (clobber (match_dup 3))])])
+
+;; "*ashrsi3_const"
+;; "*ashrsq3_const"  "*ashrusq3_const"
+;; "*ashrsa3_const"  "*ashrusa3_const"
+(define_insn "*ashr<mode>3_const"
+  [(set (match_operand:ALL4 0 "register_operand"                "=r,r,r,r")
+        (ashiftrt:ALL4 (match_operand:ALL4 1 "register_operand"  "0,0,r,0")
+                       (match_operand:QI 2 "const_int_operand"   "L,P,O,n")))
+   (clobber (match_scratch:QI 3                                 "=X,X,X,&d"))]
   "reload_completed"
   {
     return ashrsi3_out (insn, operands, NULL);
@@ -3679,44 +3801,59 @@ (define_insn "*ashrsi3_const"
 ;; >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >>
 ;; logical shift right
 
-(define_expand "lshrqi3"
-  [(set (match_operand:QI 0 "register_operand" "")
-        (lshiftrt:QI (match_operand:QI 1 "register_operand" "")
-                     (match_operand:QI 2 "nop_general_operand" "")))])
+;; "lshrqi3"
+;; "lshrqq3 "lshruqq3"
+(define_expand "lshr<mode>3"
+  [(set (match_operand:ALL1 0 "register_operand" "")
+        (lshiftrt:ALL1 (match_operand:ALL1 1 "register_operand" "")
+                       (match_operand:QI 2 "nop_general_operand" "")))])
 
 (define_split	; lshrqi3_const4
-  [(set (match_operand:QI 0 "d_register_operand" "")
-        (lshiftrt:QI (match_dup 0)
-                     (const_int 4)))]
+  [(set (match_operand:ALL1 0 "d_register_operand" "")
+        (lshiftrt:ALL1 (match_dup 0)
+                       (const_int 4)))]
   ""
-  [(set (match_dup 0) (rotate:QI (match_dup 0) (const_int 4)))
-   (set (match_dup 0) (and:QI (match_dup 0) (const_int 15)))]
-  "")
+  [(set (match_dup 1)
+        (rotate:QI (match_dup 1)
+                   (const_int 4)))
+   (set (match_dup 1)
+        (and:QI (match_dup 1)
+                (const_int 15)))]
+  {
+    operands[1] = avr_to_int_mode (operands[0]);
+  })
 
 (define_split	; lshrqi3_const5
-  [(set (match_operand:QI 0 "d_register_operand" "")
-        (lshiftrt:QI (match_dup 0)
-                     (const_int 5)))]
-  ""
-  [(set (match_dup 0) (rotate:QI (match_dup 0) (const_int 4)))
-   (set (match_dup 0) (lshiftrt:QI (match_dup 0) (const_int 1)))
-   (set (match_dup 0) (and:QI (match_dup 0) (const_int 7)))]
-  "")
+  [(set (match_operand:ALL1 0 "d_register_operand" "")
+        (lshiftrt:ALL1 (match_dup 0)
+                       (const_int 5)))]
+  ""
+  [(set (match_dup 1) (rotate:QI (match_dup 1) (const_int 4)))
+   (set (match_dup 1) (lshiftrt:QI (match_dup 1) (const_int 1)))
+   (set (match_dup 1) (and:QI (match_dup 1) (const_int 7)))]
+  {
+    operands[1] = avr_to_int_mode (operands[0]);
+  })
 
 (define_split	; lshrqi3_const6
   [(set (match_operand:QI 0 "d_register_operand" "")
         (lshiftrt:QI (match_dup 0)
                      (const_int 6)))]
   ""
-  [(set (match_dup 0) (rotate:QI (match_dup 0) (const_int 4)))
-   (set (match_dup 0) (lshiftrt:QI (match_dup 0) (const_int 2)))
-   (set (match_dup 0) (and:QI (match_dup 0) (const_int 3)))]
-  "")
+  [(set (match_dup 1) (rotate:QI (match_dup 1) (const_int 4)))
+   (set (match_dup 1) (lshiftrt:QI (match_dup 1) (const_int 2)))
+   (set (match_dup 1) (and:QI (match_dup 1) (const_int 3)))]
+  {
+    operands[1] = avr_to_int_mode (operands[0]);
+  })
 
-(define_insn "*lshrqi3"
-  [(set (match_operand:QI 0 "register_operand"                "=r,r,r,r,!d,r,r")
-        (lshiftrt:QI (match_operand:QI 1 "register_operand"    "0,0,0,0,0 ,0,0")
-                     (match_operand:QI 2 "nop_general_operand" "r,L,P,K,n ,n,Qm")))]
+;; "*lshrqi3"
+;; "*lshrqq3"
+;; "*lshruqq3"
+(define_insn "*lshr<mode>3"
+  [(set (match_operand:ALL1 0 "register_operand"                  "=r,r,r,r,!d,r,r")
+        (lshiftrt:ALL1 (match_operand:ALL1 1 "register_operand"    "0,0,0,0,0 ,0,0")
+                       (match_operand:QI 2 "nop_general_operand"   "r,L,P,K,n ,n,Qm")))]
   ""
   {
     return lshrqi3_out (insn, operands, NULL);
@@ -3725,10 +3862,13 @@ (define_insn "*lshrqi3"
    (set_attr "adjust_len" "lshrqi")
    (set_attr "cc" "clobber,none,set_czn,set_czn,set_czn,set_czn,clobber")])
 
-(define_insn "lshrhi3"
-  [(set (match_operand:HI 0 "register_operand"                "=r,r,r,r,r,r,r")
-        (lshiftrt:HI (match_operand:HI 1 "register_operand"    "0,0,0,r,0,0,0")
-                     (match_operand:QI 2 "nop_general_operand" "r,L,P,O,K,n,Qm")))]
+;; "lshrhi3"
+;; "lshrhq3"  "lshruhq3"
+;; "lshrha3"  "lshruha3"
+(define_insn "lshr<mode>3"
+  [(set (match_operand:ALL2 0 "register_operand"                "=r,r,r,r,r,r,r")
+        (lshiftrt:ALL2 (match_operand:ALL2 1 "register_operand"  "0,0,0,r,0,0,0")
+                       (match_operand:QI 2 "nop_general_operand" "r,L,P,O,K,n,Qm")))]
   ""
   {
     return lshrhi3_out (insn, operands, NULL);
@@ -3749,10 +3889,13 @@ (define_insn "lshrpsi3"
   [(set_attr "adjust_len" "lshrpsi")
    (set_attr "cc" "clobber")])
 
-(define_insn "lshrsi3"
-  [(set (match_operand:SI 0 "register_operand"                "=r,r,r,r,r,r,r")
-        (lshiftrt:SI (match_operand:SI 1 "register_operand"    "0,0,0,r,0,0,0")
-                     (match_operand:QI 2 "nop_general_operand" "r,L,P,O,K,n,Qm")))]
+;; "lshrsi3"
+;; "lshrsq3"  "lshrusq3"
+;; "lshrsa3"  "lshrusa3"
+(define_insn "lshr<mode>3"
+  [(set (match_operand:ALL4 0 "register_operand"                  "=r,r,r,r,r,r,r")
+        (lshiftrt:ALL4 (match_operand:ALL4 1 "register_operand"    "0,0,0,r,0,0,0")
+                       (match_operand:QI 2 "nop_general_operand"   "r,L,P,O,K,n,Qm")))]
   ""
   {
     return lshrsi3_out (insn, operands, NULL);
@@ -3764,55 +3907,65 @@ (define_insn "lshrsi3"
 ;; Optimize if a scratch register from LD_REGS happens to be available.
 
 (define_peephole2 ; lshrqi3_l_const4
-  [(set (match_operand:QI 0 "l_register_operand" "")
-        (lshiftrt:QI (match_dup 0)
-                     (const_int 4)))
+  [(set (match_operand:ALL1 0 "l_register_operand" "")
+        (lshiftrt:ALL1 (match_dup 0)
+                       (const_int 4)))
    (match_scratch:QI 1 "d")]
   ""
-  [(set (match_dup 0) (rotate:QI (match_dup 0) (const_int 4)))
+  [(set (match_dup 2) (rotate:QI (match_dup 2) (const_int 4)))
    (set (match_dup 1) (const_int 15))
-   (set (match_dup 0) (and:QI (match_dup 0) (match_dup 1)))]
-  "")
+   (set (match_dup 2) (and:QI (match_dup 2) (match_dup 1)))]
+  {
+    operands[2] = avr_to_int_mode (operands[0]);
+  })
 
 (define_peephole2 ; lshrqi3_l_const5
-  [(set (match_operand:QI 0 "l_register_operand" "")
-        (lshiftrt:QI (match_dup 0)
-                     (const_int 5)))
+  [(set (match_operand:ALL1 0 "l_register_operand" "")
+        (lshiftrt:ALL1 (match_dup 0)
+                       (const_int 5)))
    (match_scratch:QI 1 "d")]
   ""
-  [(set (match_dup 0) (rotate:QI (match_dup 0) (const_int 4)))
-   (set (match_dup 0) (lshiftrt:QI (match_dup 0) (const_int 1)))
+  [(set (match_dup 2) (rotate:QI (match_dup 2) (const_int 4)))
+   (set (match_dup 2) (lshiftrt:QI (match_dup 2) (const_int 1)))
    (set (match_dup 1) (const_int 7))
-   (set (match_dup 0) (and:QI (match_dup 0) (match_dup 1)))]
-  "")
+   (set (match_dup 2) (and:QI (match_dup 2) (match_dup 1)))]
+  {
+    operands[2] = avr_to_int_mode (operands[0]);
+  })
 
 (define_peephole2 ; lshrqi3_l_const6
-  [(set (match_operand:QI 0 "l_register_operand" "")
-        (lshiftrt:QI (match_dup 0)
-                     (const_int 6)))
+  [(set (match_operand:ALL1 0 "l_register_operand" "")
+        (lshiftrt:ALL1 (match_dup 0)
+                       (const_int 6)))
    (match_scratch:QI 1 "d")]
   ""
-  [(set (match_dup 0) (rotate:QI (match_dup 0) (const_int 4)))
-   (set (match_dup 0) (lshiftrt:QI (match_dup 0) (const_int 2)))
+  [(set (match_dup 2) (rotate:QI (match_dup 2) (const_int 4)))
+   (set (match_dup 2) (lshiftrt:QI (match_dup 2) (const_int 2)))
    (set (match_dup 1) (const_int 3))
-   (set (match_dup 0) (and:QI (match_dup 0) (match_dup 1)))]
-  "")
+   (set (match_dup 2) (and:QI (match_dup 2) (match_dup 1)))]
+  {
+    operands[2] = avr_to_int_mode (operands[0]);
+  })
 
 (define_peephole2
   [(match_scratch:QI 3 "d")
-   (set (match_operand:HI 0 "register_operand" "")
-        (lshiftrt:HI (match_operand:HI 1 "register_operand" "")
-                     (match_operand:QI 2 "const_int_operand" "")))]
+   (set (match_operand:ALL2 0 "register_operand" "")
+        (lshiftrt:ALL2 (match_operand:ALL2 1 "register_operand" "")
+                       (match_operand:QI 2 "const_int_operand" "")))]
   ""
-  [(parallel [(set (match_dup 0) (lshiftrt:HI (match_dup 1) (match_dup 2)))
-              (clobber (match_dup 3))])]
-  "")
-
-(define_insn "*lshrhi3_const"
-  [(set (match_operand:HI 0 "register_operand"              "=r,r,r,r,r")
-        (lshiftrt:HI (match_operand:HI 1 "register_operand"  "0,0,r,0,0")
-                     (match_operand:QI 2 "const_int_operand" "L,P,O,K,n")))
-   (clobber (match_scratch:QI 3                             "=X,X,X,X,&d"))]
+  [(parallel [(set (match_dup 0)
+                   (lshiftrt:ALL2 (match_dup 1)
+                                  (match_dup 2)))
+              (clobber (match_dup 3))])])
+
+;; "*lshrhi3_const"
+;; "*lshrhq3_const"  "*lshruhq3_const"
+;; "*lshrha3_const"  "*lshruha3_const"
+(define_insn "*lshr<mode>3_const"
+  [(set (match_operand:ALL2 0 "register_operand"                "=r,r,r,r,r")
+        (lshiftrt:ALL2 (match_operand:ALL2 1 "register_operand"  "0,0,r,0,0")
+                       (match_operand:QI 2 "const_int_operand"   "L,P,O,K,n")))
+   (clobber (match_scratch:QI 3                                 "=X,X,X,X,&d"))]
   "reload_completed"
   {
     return lshrhi3_out (insn, operands, NULL);
@@ -3823,19 +3976,23 @@ (define_insn "*lshrhi3_const"
 
 (define_peephole2
   [(match_scratch:QI 3 "d")
-   (set (match_operand:SI 0 "register_operand" "")
-        (lshiftrt:SI (match_operand:SI 1 "register_operand" "")
-                     (match_operand:QI 2 "const_int_operand" "")))]
+   (set (match_operand:ALL4 0 "register_operand" "")
+        (lshiftrt:ALL4 (match_operand:ALL4 1 "register_operand" "")
+                       (match_operand:QI 2 "const_int_operand" "")))]
   ""
-  [(parallel [(set (match_dup 0) (lshiftrt:SI (match_dup 1) (match_dup 2)))
-              (clobber (match_dup 3))])]
-  "")
-
-(define_insn "*lshrsi3_const"
-  [(set (match_operand:SI 0 "register_operand"              "=r,r,r,r")
-        (lshiftrt:SI (match_operand:SI 1 "register_operand"  "0,0,r,0")
-                     (match_operand:QI 2 "const_int_operand" "L,P,O,n")))
-   (clobber (match_scratch:QI 3                             "=X,X,X,&d"))]
+  [(parallel [(set (match_dup 0)
+                   (lshiftrt:ALL4 (match_dup 1)
+                                  (match_dup 2)))
+              (clobber (match_dup 3))])])
+
+;; "*lshrsi3_const"
+;; "*lshrsq3_const"  "*lshrusq3_const"
+;; "*lshrsa3_const"  "*lshrusa3_const"
+(define_insn "*lshr<mode>3_const"
+  [(set (match_operand:ALL4 0 "register_operand"               "=r,r,r,r")
+        (lshiftrt:ALL4 (match_operand:ALL4 1 "register_operand" "0,0,r,0")
+                       (match_operand:QI 2 "const_int_operand"  "L,P,O,n")))
+   (clobber (match_scratch:QI 3                                "=X,X,X,&d"))]
   "reload_completed"
   {
     return lshrsi3_out (insn, operands, NULL);
@@ -4278,24 +4435,29 @@ (define_insn "*negated_tstsi"
   [(set_attr "cc" "compare")
    (set_attr "length" "4")])
 
-(define_insn "*reversed_tstsi"
+;; "*reversed_tstsi"
+;; "*reversed_tstsq" "*reversed_tstusq"
+;; "*reversed_tstsa" "*reversed_tstusa"
+(define_insn "*reversed_tst<mode>"
   [(set (cc0)
-        (compare (const_int 0)
-                 (match_operand:SI 0 "register_operand" "r")))
-   (clobber (match_scratch:QI 1 "=X"))]
-  ""
-  "cp __zero_reg__,%A0
-	cpc __zero_reg__,%B0
-	cpc __zero_reg__,%C0
-	cpc __zero_reg__,%D0"
+        (compare (match_operand:ALL4 0 "const0_operand"   "Y00")
+                 (match_operand:ALL4 1 "register_operand" "r")))
+   (clobber (match_scratch:QI 2 "=X"))]
+  ""
+  "cp __zero_reg__,%A1
+	cpc __zero_reg__,%B1
+	cpc __zero_reg__,%C1
+	cpc __zero_reg__,%D1"
   [(set_attr "cc" "compare")
    (set_attr "length" "4")])
 
 
-(define_insn "*cmpqi"
+;; "*cmpqi"
+;; "*cmpqq" "*cmpuqq"
+(define_insn "*cmp<mode>"
   [(set (cc0)
-        (compare (match_operand:QI 0 "register_operand"  "r,r,d")
-                 (match_operand:QI 1 "nonmemory_operand" "L,r,i")))]
+        (compare (match_operand:ALL1 0 "register_operand"  "r  ,r,d")
+                 (match_operand:ALL1 1 "nonmemory_operand" "Y00,r,i")))]
   ""
   "@
 	tst %0
@@ -4313,11 +4475,14 @@ (define_insn "*cmpqi_sign_extend"
   [(set_attr "cc" "compare")
    (set_attr "length" "1")])
 
-(define_insn "*cmphi"
+;; "*cmphi"
+;; "*cmphq" "*cmpuhq"
+;; "*cmpha" "*cmpuha"
+(define_insn "*cmp<mode>"
   [(set (cc0)
-        (compare (match_operand:HI 0 "register_operand"  "!w,r,r,d ,r  ,d,r")
-                 (match_operand:HI 1 "nonmemory_operand" "L ,L,r,s ,s  ,M,n")))
-   (clobber (match_scratch:QI 2                         "=X ,X,X,&d,&d ,X,&d"))]
+        (compare (match_operand:ALL2 0 "register_operand"  "!w  ,r  ,r,d ,r  ,d,r")
+                 (match_operand:ALL2 1 "nonmemory_operand"  "Y00,Y00,r,s ,s  ,M,n Ynn")))
+   (clobber (match_scratch:QI 2                            "=X  ,X  ,X,&d,&d ,X,&d"))]
   ""
   {
     switch (which_alternative)
@@ -4330,11 +4495,15 @@ (define_insn "*cmphi"
         return "cp %A0,%A1\;cpc %B0,%B1";
 
       case 3:
+        if (<MODE>mode != HImode)
+          break;
         return reg_unused_after (insn, operands[0])
                ? "subi %A0,lo8(%1)\;sbci %B0,hi8(%1)"
                : "ldi %2,hi8(%1)\;cpi %A0,lo8(%1)\;cpc %B0,%2";
                
       case 4:
+        if (<MODE>mode != HImode)
+          break;
         return "ldi %2,lo8(%1)\;cp %A0,%2\;ldi %2,hi8(%1)\;cpc %B0,%2";
       }
       
@@ -4374,11 +4543,14 @@ (define_insn "*cmppsi"
    (set_attr "length" "3,3,5,6,3,7")
    (set_attr "adjust_len" "tstpsi,*,*,*,compare,compare")])
 
-(define_insn "*cmpsi"
+;; "*cmpsi"
+;; "*cmpsq" "*cmpusq"
+;; "*cmpsa" "*cmpusa"
+(define_insn "*cmp<mode>"
   [(set (cc0)
-        (compare (match_operand:SI 0 "register_operand"  "r,r ,d,r ,r")
-                 (match_operand:SI 1 "nonmemory_operand" "L,r ,M,M ,n")))
-   (clobber (match_scratch:QI 2                         "=X,X ,X,&d,&d"))]
+        (compare (match_operand:ALL4 0 "register_operand"  "r  ,r ,d,r ,r")
+                 (match_operand:ALL4 1 "nonmemory_operand" "Y00,r ,M,M ,n Ynn")))
+   (clobber (match_scratch:QI 2                           "=X  ,X ,X,&d,&d"))]
   ""
   {
     if (0 == which_alternative)
@@ -4398,55 +4570,33 @@ (define_insn "*cmpsi"
 ;; ----------------------------------------------------------------------
 ;; Conditional jump instructions
 
-(define_expand "cbranchsi4"
-  [(parallel [(set (cc0)
-                   (compare (match_operand:SI 1 "register_operand" "")
-                            (match_operand:SI 2 "nonmemory_operand" "")))
-              (clobber (match_scratch:QI 4 ""))])
+;; "cbranchqi4"
+;; "cbranchqq4"  "cbranchuqq4"
+(define_expand "cbranch<mode>4"
+  [(set (cc0)
+        (compare (match_operand:ALL1 1 "register_operand" "")
+                 (match_operand:ALL1 2 "nonmemory_operand" "")))
    (set (pc)
         (if_then_else
-              (match_operator 0 "ordered_comparison_operator" [(cc0)
-                                                               (const_int 0)])
-              (label_ref (match_operand 3 "" ""))
-              (pc)))]
- "")
-
-(define_expand "cbranchpsi4"
-  [(parallel [(set (cc0)
-                   (compare (match_operand:PSI 1 "register_operand" "")
-                            (match_operand:PSI 2 "nonmemory_operand" "")))
-              (clobber (match_scratch:QI 4 ""))])
-   (set (pc)
-        (if_then_else (match_operator 0 "ordered_comparison_operator" [(cc0)
-                                                                       (const_int 0)])
-                      (label_ref (match_operand 3 "" ""))
-                      (pc)))]
- "")
+         (match_operator 0 "ordered_comparison_operator" [(cc0)
+                                                          (const_int 0)])
+         (label_ref (match_operand 3 "" ""))
+         (pc)))])
 
-(define_expand "cbranchhi4"
+;; "cbranchhi4"  "cbranchhq4"  "cbranchuhq4"  "cbranchha4"  "cbranchuha4"
+;; "cbranchsi4"  "cbranchsq4"  "cbranchusq4"  "cbranchsa4"  "cbranchusa4"
+;; "cbranchpsi4"
+(define_expand "cbranch<mode>4"
   [(parallel [(set (cc0)
-                   (compare (match_operand:HI 1 "register_operand" "")
-                            (match_operand:HI 2 "nonmemory_operand" "")))
+                   (compare (match_operand:ORDERED234 1 "register_operand" "")
+                            (match_operand:ORDERED234 2 "nonmemory_operand" "")))
               (clobber (match_scratch:QI 4 ""))])
    (set (pc)
         (if_then_else
-              (match_operator 0 "ordered_comparison_operator" [(cc0)
-                                                               (const_int 0)])
-              (label_ref (match_operand 3 "" ""))
-              (pc)))]
- "")
-
-(define_expand "cbranchqi4"
-  [(set (cc0)
-        (compare (match_operand:QI 1 "register_operand" "")
-                 (match_operand:QI 2 "nonmemory_operand" "")))
-   (set (pc)
-        (if_then_else
-              (match_operator 0 "ordered_comparison_operator" [(cc0)
-                                                               (const_int 0)])
-              (label_ref (match_operand 3 "" ""))
-              (pc)))]
- "")
+         (match_operator 0 "ordered_comparison_operator" [(cc0)
+                                                          (const_int 0)])
+         (label_ref (match_operand 3 "" ""))
+         (pc)))])
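+
+;; Informal example: a C comparison such as
+;;
+;;     if (x < 0.25hk)   /* x of type short _Accum, i.e. HAmode */
+;;
+;; should expand through "cbranchha4" from this expander into a "*cmpha"
+;; compare against the CONST_FIXED 0.25hk followed by a conditional
+;; branch on cc0.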
 
 
 ;; Test a single bit in a QI/HI/SImode register.
@@ -4477,7 +4627,7 @@ (define_insn "*sbrx_branch<mode>"
                                     (const_int 4))))
    (set_attr "cc" "clobber")])
 
-;; Same test based on Bitwise AND RTL. Keep this incase gcc changes patterns.
+;; Same test based on bitwise AND.  Keep this in case gcc changes patterns.
 ;; or for old peepholes.
 ;; Fixme - bitwise Mask will not work for DImode
 
@@ -4492,12 +4642,12 @@ (define_insn "*sbrx_and_branch<mode>"
          (label_ref (match_operand 3 "" ""))
          (pc)))]
   ""
-{
+  {
     HOST_WIDE_INT bitnumber;
     bitnumber = exact_log2 (GET_MODE_MASK (<MODE>mode) & INTVAL (operands[2]));
     operands[2] = GEN_INT (bitnumber);
     return avr_out_sbxx_branch (insn, operands);
-}
+  }
   [(set (attr "length")
         (if_then_else (and (ge (minus (pc) (match_dup 3)) (const_int -2046))
                            (le (minus (pc) (match_dup 3)) (const_int 2046)))
@@ -4837,9 +4987,10 @@ (define_insn "*tablejump"
 
 
 (define_expand "casesi"
-  [(set (match_dup 6)
-        (minus:HI (subreg:HI (match_operand:SI 0 "register_operand" "") 0)
-                  (match_operand:HI 1 "register_operand" "")))
+  [(parallel [(set (match_dup 6)
+                   (minus:HI (subreg:HI (match_operand:SI 0 "register_operand" "") 0)
+                             (match_operand:HI 1 "register_operand" "")))
+              (clobber (scratch:QI))])
    (parallel [(set (cc0)
                    (compare (match_dup 6)
                             (match_operand:HI 2 "register_operand" "")))
@@ -5201,8 +5352,8 @@ (define_peephole ; "*dec-and-branchqi!=-
 
 (define_peephole ; "*cpse.eq"
   [(set (cc0)
-        (compare (match_operand:QI 1 "register_operand" "r,r")
-                 (match_operand:QI 2 "reg_or_0_operand" "r,L")))
+        (compare (match_operand:ALL1 1 "register_operand" "r,r")
+                 (match_operand:ALL1 2 "reg_or_0_operand" "r,Y00")))
    (set (pc)
         (if_then_else (eq (cc0)
                           (const_int 0))
@@ -5236,8 +5387,8 @@ (define_peephole ; "*cpse.eq"
 
 (define_peephole ; "*cpse.ne"
   [(set (cc0)
-        (compare (match_operand:QI 1 "register_operand" "")
-                 (match_operand:QI 2 "reg_or_0_operand" "")))
+        (compare (match_operand:ALL1 1 "register_operand" "")
+                 (match_operand:ALL1 2 "reg_or_0_operand" "")))
    (set (pc)
         (if_then_else (ne (cc0)
                           (const_int 0))
@@ -5246,7 +5397,7 @@ (define_peephole ; "*cpse.ne"
   "!AVR_HAVE_JMP_CALL
    || !avr_current_device->errata_skip"
   {
-    if (operands[2] == const0_rtx)
+    if (operands[2] == CONST0_RTX (<MODE>mode))
       operands[2] = zero_reg_rtx;
 
     return 3 == avr_jump_mode (operands[0], insn)
@@ -6265,4 +6416,8 @@ (define_insn_and_split "*extzv.qihi2"
   })
 
 
+;; Fixed-point instructions
+(include "avr-fixed.md")
+
+;; Operations on 64-bit registers
 (include "avr-dimode.md")
Index: gcc/config/avr/avr-modes.def
===================================================================
--- gcc/config/avr/avr-modes.def	(revision 190535)
+++ gcc/config/avr/avr-modes.def	(working copy)
@@ -1 +1,28 @@
 FRACTIONAL_INT_MODE (PSI, 24, 3);
+
+/* On 8-bit machines the fixed-point routines need fewer instructions
+   if the radix point lies on a byte boundary, which is not the default
+   for signed accum types.  */
+
+ADJUST_IBIT (HA, 7);
+ADJUST_FBIT (HA, 8);
+
+ADJUST_IBIT (SA, 15);
+ADJUST_FBIT (SA, 16);
+
+ADJUST_IBIT (DA, 31);
+ADJUST_FBIT (DA, 32);
+
+/* Make TA and UTA 64 bits wide.
+   128-bit wide modes would be insane on an 8-bit machine.
+   This needs special treatment in avr.c and avr-lib.h.  */
+
+ADJUST_BYTESIZE  (TA, 8);
+ADJUST_ALIGNMENT (TA, 1);
+ADJUST_IBIT (TA, 15);
+ADJUST_FBIT (TA, 48);
+
+ADJUST_BYTESIZE  (UTA, 8);
+ADJUST_ALIGNMENT (UTA, 1);
+ADJUST_IBIT (UTA, 16);
+ADJUST_FBIT (UTA, 48);
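+
+/* Resulting layouts, for reference (as implied by the adjustments above):
+   HA is s7.8 in 16 bits, SA is s15.16 in 32 bits, DA is s31.32 in 64 bits,
+   and TA is s15.48, UTA is u16.48 in 64 bits instead of 128.  */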
Index: gcc/config/avr/avr-protos.h
===================================================================
--- gcc/config/avr/avr-protos.h	(revision 190535)
+++ gcc/config/avr/avr-protos.h	(working copy)
@@ -79,6 +79,9 @@ extern const char* avr_load_lpm (rtx, rt
 
 extern bool avr_rotate_bytes (rtx operands[]);
 
+extern const char* avr_out_fract (rtx, rtx[], bool, int*);
+extern rtx avr_to_int_mode (rtx);
+
 extern void expand_prologue (void);
 extern void expand_epilogue (bool);
 extern bool avr_emit_movmemhi (rtx*);
@@ -92,6 +95,8 @@ extern const char* avr_out_plus (rtx*, i
 extern const char* avr_out_plus_noclobber (rtx*, int*, int*);
 extern const char* avr_out_plus64 (rtx, int*);
 extern const char* avr_out_addto_sp (rtx*, int*);
+extern const char* avr_out_minus (rtx*, int*, int*);
+extern const char* avr_out_minus64 (rtx, int*);
 extern const char* avr_out_xload (rtx, rtx*, int*);
 extern const char* avr_out_movmem (rtx, rtx*, int*);
 extern const char* avr_out_insert_bits (rtx*, int*);
Index: gcc/config/avr/constraints.md
===================================================================
--- gcc/config/avr/constraints.md	(revision 190535)
+++ gcc/config/avr/constraints.md	(working copy)
@@ -192,3 +192,47 @@ (define_constraint "C0f"
   "32-bit integer constant where no nibble equals 0xf."
   (and (match_code "const_int")
        (match_test "!avr_has_nibble_0xf (op)")))
+
+;; CONST_FIXED is not covered by 'n', so cook up constraints of our own.
+;; "i" or "s" would match, but because the insns use iterators that also
+;; cover the integer modes, "i" or "s" is not always possible.
+
+(define_constraint "Ynn"
+  "Fixed-point constant known at compile time."
+  (match_code "const_fixed"))
+
+(define_constraint "Y00"
+  "Fixed-point or integer constant with bit representation 0x0"
+  (and (match_code "const_fixed,const_int")
+       (match_test "op == CONST0_RTX (GET_MODE (op))")))
+
+(define_constraint "Y01"
+  "Fixed-point or integer constant with bit representation 0x1"
+  (ior (and (match_code "const_fixed")
+            (match_test "1 == INTVAL (avr_to_int_mode (op))"))
+       (match_test "satisfies_constraint_P (op)")))
+
+(define_constraint "Ym1"
+  "Fixed-point or integer constant with bit representation -0x1"
+  (ior (and (match_code "const_fixed")
+            (match_test "-1 == INTVAL (avr_to_int_mode (op))"))
+       (match_test "satisfies_constraint_N (op)")))
+
+(define_constraint "Y02"
+  "Fixed-point or integer constant with bit representation 0x2"
+  (ior (and (match_code "const_fixed")
+            (match_test "2 == INTVAL (avr_to_int_mode (op))"))
+       (match_test "satisfies_constraint_K (op)")))
+
+(define_constraint "Ym2"
+  "Fixed-point or integer constant with bit representation -0x2"
+  (ior (and (match_code "const_fixed")
+            (match_test "-2 == INTVAL (avr_to_int_mode (op))"))
+       (match_test "satisfies_constraint_Cm2 (op)")))
+
+;; Similar to "IJ" used with ADIW/SBIW, but for CONST_FIXED.
+
+(define_constraint "YIJ"
+  "Fixed-point constant from @minus{}0x003f to 0x003f."
+  (and (match_code "const_fixed")
+       (match_test "IN_RANGE (INTVAL (avr_to_int_mode (op)), -63, 63)")))
Index: gcc/config/avr/avr.c
===================================================================
--- gcc/config/avr/avr.c	(revision 190535)
+++ gcc/config/avr/avr.c	(working copy)
@@ -49,6 +49,10 @@
 #include "params.h"
 #include "df.h"
 
+#ifndef CONST_FIXED_P
+#define CONST_FIXED_P(X) (CONST_FIXED == GET_CODE (X))
+#endif
+
 /* Maximal allowed offset for an address in the LD command */
 #define MAX_LD_OFFSET(MODE) (64 - (signed)GET_MODE_SIZE (MODE))
 
@@ -264,6 +268,23 @@ avr_popcount_each_byte (rtx xval, int n_
   return true;
 }
 
+
+/* Access some RTX as INT_MODE.  If X is a CONST_FIXED we can get
+   the bit representation of X by "casting" it to CONST_INT.  */
+
+rtx
+avr_to_int_mode (rtx x)
+{
+  enum machine_mode mode = GET_MODE (x);
+
+  return VOIDmode == mode
+    ? x
+    : simplify_gen_subreg (int_mode_for_mode (mode), x, mode, 0);
+}
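+
+/* Example (illustrative only): for X = (const_fixed:HQ 0.5), MODE is
+   HQmode, int_mode_for_mode (HQmode) is HImode, and the subreg "cast"
+   yields (const_int 16384), i.e. the bit pattern 0x4000 of 0.5 in the
+   s.15 format.  */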
+
+
+/* Implement `TARGET_OPTION_OVERRIDE'.  */
+
 static void
 avr_option_override (void)
 {
@@ -389,9 +410,14 @@ avr_regno_reg_class (int r)
 }
 
 
+/* Implement `TARGET_SCALAR_MODE_SUPPORTED_P'.  */
+
 static bool
 avr_scalar_mode_supported_p (enum machine_mode mode)
 {
+  if (ALL_FIXED_POINT_MODE_P (mode))
+    return true;
+
   if (PSImode == mode)
     return true;
 
@@ -715,6 +741,58 @@ avr_initial_elimination_offset (int from
     }
 }
 
+
+/* Helper for the function below.  */
+
+static void
+avr_adjust_type_node (tree *node, enum machine_mode mode, int sat_p)
+{
+  *node = make_node (FIXED_POINT_TYPE);
+  TYPE_SATURATING (*node) = sat_p;
+  TYPE_UNSIGNED (*node) = UNSIGNED_FIXED_POINT_MODE_P (mode);
+  TYPE_IBIT (*node) = GET_MODE_IBIT (mode);
+  TYPE_FBIT (*node) = GET_MODE_FBIT (mode);
+  TYPE_PRECISION (*node) = GET_MODE_BITSIZE (mode);
+  TYPE_ALIGN (*node) = 8;
+  SET_TYPE_MODE (*node, mode);
+
+  layout_type (*node);
+}
+
+
+/* Implement `TARGET_BUILD_BUILTIN_VA_LIST'.  */
+
+static tree
+avr_build_builtin_va_list (void)
+{
+  /* avr-modes.def adjusts [U]TA to be 64-bit modes with 48 fractional bits.
+     This is more appropriate for the 8-bit machine AVR than 128-bit modes.
+     The ADJUST_IBIT/FBIT are handled in toplev:init_adjust_machine_modes()
+     which is auto-generated by genmodes, but the compiler assigns [U]DAmode
+     to the long long accum types instead of the desired [U]TAmode.
+
+     Fix this now, right after node setup in tree.c:build_common_tree_nodes().
+     This must run before c-cppbuiltin.c:builtin_define_fixed_point_constants()
+     which defines built-in macros like __ULLACCUM_FBIT__ that are used by
+     libgcc to detect IBIT and FBIT.  */
+
+  avr_adjust_type_node (&ta_type_node, TAmode, 0);
+  avr_adjust_type_node (&uta_type_node, UTAmode, 0);
+  avr_adjust_type_node (&sat_ta_type_node, TAmode, 1);
+  avr_adjust_type_node (&sat_uta_type_node, UTAmode, 1);
+
+  unsigned_long_long_accum_type_node = uta_type_node;
+  long_long_accum_type_node = ta_type_node;
+  sat_unsigned_long_long_accum_type_node = sat_uta_type_node;
+  sat_long_long_accum_type_node = sat_ta_type_node;
+
+  /* Dispatch to the default handler.  */
+  
+  return std_build_builtin_va_list ();
+}
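+
+/* Hypothetical check of the intended effect (not part of the patch):
+   with the node fixup above, avr-gcc should predefine __ULLACCUM_FBIT__
+   as 48, and user code like
+
+       _Static_assert (sizeof (long long _Accum) == 8, "TAmode is 64 bits");
+
+   should hold.  */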
+
+
+/* Implement `TARGET_BUILTIN_SETJMP_FRAME_VALUE'.  */
 /* Actual start of frame is virtual_stack_vars_rtx this is offset from 
    frame pointer by +STARTING_FRAME_OFFSET.
    Using saved frame = virtual_stack_vars_rtx - STARTING_FRAME_OFFSET
@@ -723,10 +801,13 @@ avr_initial_elimination_offset (int from
 static rtx
 avr_builtin_setjmp_frame_value (void)
 {
-  return gen_rtx_MINUS (Pmode, virtual_stack_vars_rtx, 
-                        gen_int_mode (STARTING_FRAME_OFFSET, Pmode));
+  rtx xval = gen_reg_rtx (Pmode);
+  emit_insn (gen_subhi3 (xval, virtual_stack_vars_rtx,
+                         gen_int_mode (STARTING_FRAME_OFFSET, Pmode)));
+  return xval;
 }
 
+
 /* Return contents of MEM at frame pointer + stack size + 1 (+2 if 3 byte PC).
    This is return address of function.  */
 rtx 
@@ -1580,7 +1661,7 @@ avr_legitimate_address_p (enum machine_m
                                   MEM, strict);
 
       if (strict
-          && DImode == mode
+          && GET_MODE_SIZE (mode) > 4
           && REG_X == REGNO (x))
         {
           ok = false;
@@ -2081,6 +2162,14 @@ avr_print_operand (FILE *file, rtx x, in
       /* Use normal symbol for direct address no linker trampoline needed */
       output_addr_const (file, x);
     }
+  else if (GET_CODE (x) == CONST_FIXED)
+    {
+      HOST_WIDE_INT ival = INTVAL (avr_to_int_mode (x));
+      if (code != 0)
+        output_operand_lossage ("Unsupported code '%c' for fixed-point:",
+                                code);
+      fprintf (file, HOST_WIDE_INT_PRINT_DEC, ival);
+    }
   else if (GET_CODE (x) == CONST_DOUBLE)
     {
       long val;
@@ -2116,6 +2205,7 @@ notice_update_cc (rtx body ATTRIBUTE_UNU
 
     case CC_OUT_PLUS:
     case CC_OUT_PLUS_NOCLOBBER:
+    case CC_MINUS:
     case CC_LDI:
       {
         rtx *op = recog_data.operand;
@@ -2139,6 +2229,11 @@ notice_update_cc (rtx body ATTRIBUTE_UNU
             cc = (enum attr_cc) icc;
             break;
 
+          case CC_MINUS:
+            avr_out_minus (op, &len_dummy, &icc);
+            cc = (enum attr_cc) icc;
+            break;
+
           case CC_LDI:
 
             cc = (op[1] == CONST0_RTX (GET_MODE (op[0]))
@@ -2779,9 +2874,11 @@ output_movqi (rtx insn, rtx operands[],
   if (real_l)
     *real_l = 1;
   
-  if (register_operand (dest, QImode))
+  gcc_assert (1 == GET_MODE_SIZE (GET_MODE (dest)));
+
+  if (REG_P (dest))
     {
-      if (register_operand (src, QImode)) /* mov r,r */
+      if (REG_P (src)) /* mov r,r */
 	{
 	  if (test_hard_reg_class (STACK_REG, dest))
 	    return "out %0,%1";
@@ -2803,7 +2900,7 @@ output_movqi (rtx insn, rtx operands[],
       rtx xop[2];
 
       xop[0] = dest;
-      xop[1] = src == const0_rtx ? zero_reg_rtx : src;
+      xop[1] = src == CONST0_RTX (GET_MODE (dest)) ? zero_reg_rtx : src;
 
       return out_movqi_mr_r (insn, xop, real_l);
     }
@@ -2825,6 +2922,8 @@ output_movhi (rtx insn, rtx xop[], int *
       return avr_out_lpm (insn, xop, plen);
     }
 
+  gcc_assert (2 == GET_MODE_SIZE (GET_MODE (dest)));
+
   if (REG_P (dest))
     {
       if (REG_P (src)) /* mov r,r */
@@ -2843,7 +2942,6 @@ output_movhi (rtx insn, rtx xop[], int *
               return TARGET_NO_INTERRUPTS
                 ? avr_asm_len ("out __SP_H__,%B1" CR_TAB
                                "out __SP_L__,%A1", xop, plen, -2)
-
                 : avr_asm_len ("in __tmp_reg__,__SREG__"  CR_TAB
                                "cli"                      CR_TAB
                                "out __SP_H__,%B1"         CR_TAB
@@ -2880,7 +2978,7 @@ output_movhi (rtx insn, rtx xop[], int *
       rtx xop[2];
 
       xop[0] = dest;
-      xop[1] = src == const0_rtx ? zero_reg_rtx : src;
+      xop[1] = src == CONST0_RTX (GET_MODE (dest)) ? zero_reg_rtx : src;
 
       return out_movhi_mr_r (insn, xop, plen);
     }
@@ -3403,9 +3501,10 @@ output_movsisf (rtx insn, rtx operands[]
   if (!l)
     l = &dummy;
   
-  if (register_operand (dest, VOIDmode))
+  gcc_assert (4 == GET_MODE_SIZE (GET_MODE (dest)));
+  if (REG_P (dest))
     {
-      if (register_operand (src, VOIDmode)) /* mov r,r */
+      if (REG_P (src)) /* mov r,r */
 	{
 	  if (true_regnum (dest) > true_regnum (src))
 	    {
@@ -3440,10 +3539,10 @@ output_movsisf (rtx insn, rtx operands[]
 	{
           return output_reload_insisf (operands, NULL_RTX, real_l);
         }
-      else if (GET_CODE (src) == MEM)
+      else if (MEM_P (src))
 	return out_movsi_r_mr (insn, operands, real_l); /* mov r,m */
     }
-  else if (GET_CODE (dest) == MEM)
+  else if (MEM_P (dest))
     {
       const char *templ;
 
@@ -4126,14 +4225,25 @@ avr_out_compare (rtx insn, rtx *xop, int
   rtx xval = xop[1];
   
   /* MODE of the comparison.  */
-  enum machine_mode mode = GET_MODE (xreg);
+  enum machine_mode mode;
 
   /* Number of bytes to operate on.  */
-  int i, n_bytes = GET_MODE_SIZE (mode);
+  int i, n_bytes = GET_MODE_SIZE (GET_MODE (xreg));
 
   /* Value (0..0xff) held in clobber register xop[2] or -1 if unknown.  */
   int clobber_val = -1;
 
+  /* Map fixed mode operands to integer operands with the same binary
+     representation.  They are easier to handle in the remainder.  */
+
+  if (CONST_FIXED == GET_CODE (xval))
+    {
+      xreg = avr_to_int_mode (xop[0]);
+      xval = avr_to_int_mode (xop[1]);
+    }
+  
+  mode = GET_MODE (xreg);
+
   gcc_assert (REG_P (xreg));
   gcc_assert ((CONST_INT_P (xval) && n_bytes <= 4)
               || (const_double_operand (xval, VOIDmode) && n_bytes == 8));
@@ -4143,7 +4253,7 @@ avr_out_compare (rtx insn, rtx *xop, int
 
   /* Comparisons == +/-1 and != +/-1 can be done similar to camparing
      against 0 by ORing the bytes.  This is one instruction shorter.
-     Notice that DImode comparisons are always against reg:DI 18
+     Notice that 64-bit comparisons are always against reg:ALL8 18 (ACC_A)
      and therefore don't use this.  */
 
   if (!test_hard_reg_class (LD_REGS, xreg)
@@ -5884,6 +5994,9 @@ avr_out_plus_1 (rtx *xop, int *plen, enu
   /* MODE of the operation.  */
   enum machine_mode mode = GET_MODE (xop[0]);
 
+  /* INT_MODE of the same size.  */
+  enum machine_mode imode = int_mode_for_mode (mode);
+
   /* Number of bytes to operate on.  */
   int i, n_bytes = GET_MODE_SIZE (mode);
 
@@ -5908,8 +6021,11 @@ avr_out_plus_1 (rtx *xop, int *plen, enu
   
   *pcc = (MINUS == code) ? CC_SET_CZN : CC_CLOBBER;
 
+  if (CONST_FIXED_P (xval))
+    xval = avr_to_int_mode (xval);
+
   if (MINUS == code)
-    xval = simplify_unary_operation (NEG, mode, xval, mode);
+    xval = simplify_unary_operation (NEG, imode, xval, imode);
 
   op[2] = xop[3];
 
@@ -5920,7 +6036,7 @@ avr_out_plus_1 (rtx *xop, int *plen, enu
     {
       /* We operate byte-wise on the destination.  */
       rtx reg8 = simplify_gen_subreg (QImode, xop[0], mode, i);
-      rtx xval8 = simplify_gen_subreg (QImode, xval, mode, i);
+      rtx xval8 = simplify_gen_subreg (QImode, xval, imode, i);
 
       /* 8-bit value to operate with this byte. */
       unsigned int val8 = UINTVAL (xval8) & GET_MODE_MASK (QImode);
@@ -5941,7 +6057,7 @@ avr_out_plus_1 (rtx *xop, int *plen, enu
           && i + 2 <= n_bytes
           && test_hard_reg_class (ADDW_REGS, reg8))
         {
-          rtx xval16 = simplify_gen_subreg (HImode, xval, mode, i);
+          rtx xval16 = simplify_gen_subreg (HImode, xval, imode, i);
           unsigned int val16 = UINTVAL (xval16) & GET_MODE_MASK (HImode);
 
           /* Registers R24, X, Y, Z can use ADIW/SBIW with constants < 64
@@ -6085,6 +6201,41 @@ avr_out_plus_noclobber (rtx *xop, int *p
 }
 
 
+/* Output subtraction of register XOP[0] and a register or compile-time
+   constant XOP[2]:
+
+      XOP[0] = XOP[0] - XOP[2]
+
+   This is basically the same as `avr_out_plus', except that we subtract.
+   It is needed because (minus x const) is not mapped to (plus x -const)
+   for the fixed-point modes.  */
+
+const char*
+avr_out_minus (rtx *xop, int *plen, int *pcc)
+{
+  rtx op[4];
+
+  if (pcc)
+    *pcc = (int) CC_SET_CZN;
+
+  if (REG_P (xop[2]))
+    return avr_asm_len ("sub %A0,%A2" CR_TAB
+                        "sbc %B0,%B2", xop, plen, -2);
+
+  if (!CONST_INT_P (xop[2])
+      && !CONST_FIXED_P (xop[2]))
+    return avr_asm_len ("subi %A0,lo8(%2)" CR_TAB
+                        "sbci %B0,hi8(%2)", xop, plen, -2);
+  
+  op[0] = avr_to_int_mode (xop[0]);
+  op[1] = avr_to_int_mode (xop[1]);
+  op[2] = gen_int_mode (-INTVAL (avr_to_int_mode (xop[2])),
+                        GET_MODE (op[0]));
+  op[3] = xop[3];
+
+  return avr_out_plus (op, plen, pcc);
+}
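+
+/* Worked example (informal): for XOP[2] = (const_fixed:HA 1.5), i.e.
+   bit pattern 0x0180 in the s7.8 layout, OP[2] becomes (const_int -384)
+   in HImode and the job is handed to avr_out_plus unchanged.  */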
+
+
 /* Prepare operands of adddi3_const_insn to be used with avr_out_plus_1.  */
 
 const char*
@@ -6103,6 +6254,19 @@ avr_out_plus64 (rtx addend, int *plen)
   return "";
 }
 
+
+/* Prepare operands of subdi3_const_insn to be used with avr_out_plus64.  */
+
+const char*
+avr_out_minus64 (rtx subtrahend, int *plen)
+{
+  rtx xneg = avr_to_int_mode (subtrahend);
+  xneg = simplify_unary_operation (NEG, DImode, xneg, DImode);
+
+  return avr_out_plus64 (xneg, plen);
+}
+
+
 /* Output bit operation (IOR, AND, XOR) with register XOP[0] and compile
    time constant XOP[2]:
 
@@ -6442,6 +6606,349 @@ avr_rotate_bytes (rtx operands[])
     return true;
 }
 
+
+/* Output the instructions needed for a fixed-point type conversion.
+   This includes converting between any fixed-point type, as well
+   as converting to any integer type.  Conversion between integer
+   types is not supported.
+
+   The number of instructions generated depends on the types
+   being converted and the registers assigned to them.
+
+   The conversion requires the fewest instructions if the source and
+   destination registers overlap and are aligned at the radix point,
+   because any actual movement of data is then avoided.  In some cases
+   the conversion is already complete without any instructions needed.
+
+   When converting a signed type to a signed type, sign extension
+   is performed.
+
+   Converting a signed fractional type to or from any unsigned
+   fractional type requires a shift by one bit because the radix
+   point moves by one bit.  When the destination is a signed fractional
+   type, the sign is stored in either the carry or the T bit.  */
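+
+/* Example (informal): widening HAmode (s7.8) to SAmode (s15.16) clears
+   the new low fraction byte, moves the two HA bytes up by one position,
+   and sign-extends into the remaining high byte; when source and
+   destination registers already overlap suitably, the moves drop out.  */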
+
+const char*
+avr_out_fract (rtx insn, rtx operands[], bool intsigned, int *plen)
+{
+  int i;
+  bool sbit[2];
+  /* ilen: Length of integral part (in bytes)
+     flen: Length of fractional part (in bytes)
+     tlen: Length of operand (in bytes)
+     blen: Length of operand (in bits) */
+  int ilen[2], flen[2], tlen[2], blen[2];
+  int rdest, rsource, offset;
+  int start, end, dir;
+  bool sign_in_T = false, sign_in_Carry = false, sign_done = false;
+  bool widening_sign_extend = false;
+  int clrword = -1, lastclr = 0, clr = 0;
+  rtx xop[6];
+
+  const int dest = 0;
+  const int src = 1;
+
+  xop[dest] = operands[dest];
+  xop[src] = operands[src];
+
+  if (plen)
+    *plen = 0;
+
+  /* Determine format (integer and fractional parts)
+     of types needing conversion.  */
+
+  for (i = 0; i < 2; i++)
+    {
+      enum machine_mode mode = GET_MODE (xop[i]);
+
+      tlen[i] = GET_MODE_SIZE (mode);
+      blen[i] = GET_MODE_BITSIZE (mode);
+
+      if (SCALAR_INT_MODE_P (mode))
+        {
+          sbit[i] = intsigned;
+          ilen[i] = GET_MODE_SIZE (mode);
+          flen[i] = 0;
+        }
+      else if (ALL_SCALAR_FIXED_POINT_MODE_P (mode))
+        {
+          sbit[i] = SIGNED_SCALAR_FIXED_POINT_MODE_P (mode);
+          ilen[i] = (GET_MODE_IBIT (mode) + 1) / 8;
+          flen[i] = (GET_MODE_FBIT (mode) + 1) / 8;
+        }
+      else
+        fatal_insn ("unsupported fixed-point conversion", insn);
+    }
+
+  /* Perform sign extension if source and dest are both signed,
+     and there are more integer parts in dest than in source.  */
+
+  widening_sign_extend = sbit[dest] && sbit[src] && ilen[dest] > ilen[src];
+
+  rdest = REGNO (xop[dest]);
+  rsource = REGNO (xop[src]);
+  offset = flen[src] - flen[dest];
+
+  /* Position of MSB resp. sign bit.  */
+
+  xop[2] = GEN_INT (blen[dest] - 1);
+  xop[3] = GEN_INT (blen[src] - 1);
+
+  /* Store the sign bit if the destination is a signed fract and the source
+     has a sign in the integer part.  */
+
+  if (sbit[dest] && ilen[dest] == 0 && sbit[src] && ilen[src] > 0)
+    {
+      /* To avoid using BST and BLD if the source and destination registers
+         overlap or the source is unused afterwards, we can use LSL to store
+         the sign bit in carry since we don't need the integral part of the
+         source.  Restoring the sign from carry saves one BLD instruction below.  */
+
+      if (reg_unused_after (insn, xop[src])
+          || (rdest < rsource + tlen[src]
+              && rdest + tlen[dest] > rsource))
+        {
+          avr_asm_len ("lsl %T1%t3", xop, plen, 1);
+          sign_in_Carry = true;
+        }
+      else
+        {
+          avr_asm_len ("bst %T1%T3", xop, plen, 1);
+          sign_in_T = true;
+        }
+    }
+
+  /* Pick the correct direction to shift bytes.  */
+
+  if (rdest < rsource + offset)
+    {
+      dir = 1;
+      start = 0;
+      end = tlen[dest];
+    }
+  else
+    {
+      dir = -1;
+      start = tlen[dest] - 1;
+      end = -1;
+    }
+
+  /* Perform conversion by moving registers into place, clearing
+     destination registers that do not overlap with any source.  */
+
+  for (i = start; i != end; i += dir)
+    {
+      int destloc = rdest + i;
+      int sourceloc = rsource + i + offset;
+
+      /* The source byte position lies outside the source register range,
+         so clear this byte in the dest.  */
+
+      if (sourceloc < rsource
+          || sourceloc >= rsource + tlen[src])
+        {
+          if (AVR_HAVE_MOVW
+              && i + dir != end
+              && (sourceloc + dir < rsource
+                  || sourceloc + dir >= rsource + tlen[src])
+              && ((dir == 1 && !(destloc % 2) && !(sourceloc % 2))
+                  || (dir == -1 && (destloc % 2) && (sourceloc % 2)))
+              && clrword != -1)
+            {
+              /* Use already cleared word to clear two bytes at a time.  */
+
+              int even_i = i & ~1;
+              int even_clrword = clrword & ~1;
+
+              xop[4] = GEN_INT (8 * even_i);
+              xop[5] = GEN_INT (8 * even_clrword);
+              avr_asm_len ("movw %T0%t4,%T0%t5", xop, plen, 1);
+              i += dir;
+            }
+          else
+            {
+              if (i == tlen[dest] - 1
+                  && widening_sign_extend
+                  && blen[src] - 1 - 8 * offset < 0)
+                {
+                  /* The SBRC below that sign-extends would come
+                     up with a negative bit number because the sign
+                     bit is out of reach.  Also avoid some early-clobber
+                     situations because of premature CLR.  */
+
+                  if (reg_unused_after (insn, xop[src]))
+                    avr_asm_len ("lsl %T1%t3" CR_TAB
+                                 "sbc %T0%t2,%T0%t2", xop, plen, 2);
+                  else
+                    avr_asm_len ("mov __tmp_reg__,%T1%t3"  CR_TAB
+                                 "lsl __tmp_reg__"         CR_TAB
+                                 "sbc %T0%t2,%T0%t2", xop, plen, 3);
+                  sign_done = true;
+
+                  continue;
+                }
+              
+              /* Do not clear the register if it is going to get
+                 sign extended with a MOV later.  */
+
+              if (sbit[dest] && sbit[src]
+                  && i != tlen[dest] - 1
+                  && i >= flen[dest])
+                {
+                  continue;
+                }
+
+              xop[4] = GEN_INT (8 * i);
+              avr_asm_len ("clr %T0%t4", xop, plen, 1);
+
+              /* If the last byte was cleared too, we have a cleared
+                 word we can MOVW to clear two bytes at a time.  */
+
+              if (lastclr) 
+                clrword = i;
+
+              clr = 1;
+            }
+        }
+      else if (destloc == sourceloc)
+        {
+          /* Source byte is already in destination:  Nothing needed.  */
+
+          continue;
+        }
+      else
+        {
+          /* Registers do not line up and source register location
+             is within range:  Perform move, shifting with MOV or MOVW.  */
+
+          if (AVR_HAVE_MOVW
+              && i + dir != end
+              && sourceloc + dir >= rsource
+              && sourceloc + dir < rsource + tlen[src]
+              && ((dir == 1 && !(destloc % 2) && !(sourceloc % 2))
+                  || (dir == -1 && (destloc % 2) && (sourceloc % 2))))
+            {
+              int even_i = i & ~1;
+              int even_i_plus_offset = (i + offset) & ~1;
+
+              xop[4] = GEN_INT (8 * even_i);
+              xop[5] = GEN_INT (8 * even_i_plus_offset);
+              avr_asm_len ("movw %T0%t4,%T1%t5", xop, plen, 1);
+              i += dir;
+            }
+          else
+            {
+              xop[4] = GEN_INT (8 * i);
+              xop[5] = GEN_INT (8 * (i + offset));
+              avr_asm_len ("mov %T0%t4,%T1%t5", xop, plen, 1);
+            }
+        }
+
+      lastclr = clr;
+      clr = 0;
+    }
+      
+  /* Perform sign extension if source and dest are both signed,
+     and there are more integer parts in dest than in source.  */
+
+  if (widening_sign_extend)
+    {
+      if (!sign_done)
+        {
+          xop[4] = GEN_INT (blen[src] - 1 - 8 * offset);
+
+          /* The register was cleared above, so COM can turn it into 0xff
+             to extend the sign.  Note:  Instead of the CLR/SBRC/COM the
+             sign extension could be performed after the LSL below by
+             means of an SBC if only one byte has to be shifted left.  */
+
+          avr_asm_len ("sbrc %T0%T4" CR_TAB
+                       "com %T0%t2", xop, plen, 2);
+        }
+
+      /* Sign extend additional bytes by MOV and MOVW.  */
+
+      start = tlen[dest] - 2;
+      end = flen[dest] + ilen[src] - 1;
+
+      for (i = start; i != end; i--)
+        {
+          if (AVR_HAVE_MOVW && i != start && i-1 != end)
+            {
+              i--;
+              xop[4] = GEN_INT (8 * i);
+              xop[5] = GEN_INT (8 * (tlen[dest] - 2));
+              avr_asm_len ("movw %T0%t4,%T0%t5", xop, plen, 1);
+            }
+          else
+            {
+              xop[4] = GEN_INT (8 * i);
+              xop[5] = GEN_INT (8 * (tlen[dest] - 1));
+              avr_asm_len ("mov %T0%t4,%T0%t5", xop, plen, 1);
+            }
+        }
+    }
+
+  /* If destination is a signed fract, and the source was not, a shift
+     by 1 bit is needed.  Also restore sign from carry or T.  */
+
+  if (sbit[dest] && !ilen[dest] && (!sbit[src] || ilen[src]))
+    {
+      /* We have flen[src] non-zero fractional bytes to shift.
+         Because of the right shift, handle one byte more so that the
+         LSB won't be lost.  */
+
+      int nonzero = flen[src] + 1;
+
+      /* If the LSB is in the T flag and there are no fractional
+         bits, the high byte is zero and no shift needed.  */
+      
+      if (flen[src] == 0 && sign_in_T)
+        nonzero = 0;
+
+      start = flen[dest] - 1;
+      end = start - nonzero;
+
+      for (i = start; i > end && i >= 0; i--)
+        {
+          xop[4] = GEN_INT (8 * i);
+          if (i == start && !sign_in_Carry)
+            avr_asm_len ("lsr %T0%t4", xop, plen, 1);
+          else
+            avr_asm_len ("ror %T0%t4", xop, plen, 1);
+        }
+
+      if (sign_in_T)
+        {
+          avr_asm_len ("bld %T0%T2", xop, plen, 1);
+        }
+    }
+  else if (sbit[src] && !ilen[src] && (!sbit[dest] || ilen[dest]))
+    {
+      /* If source was a signed fract and dest was not, shift 1 bit
+         other way.  */
+
+      start = flen[dest] - flen[src];
+
+      if (start < 0)
+        start = 0;
+
+      for (i = start; i < flen[dest]; i++)
+        {
+          xop[4] = GEN_INT (8 * i);
+
+          if (i == start)
+            avr_asm_len ("lsl %T0%t4", xop, plen, 1);
+          else
+            avr_asm_len ("rol %T0%t4", xop, plen, 1);
+        }
+    }
+
+  return "";
+}
+
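
To illustrate the byte-alignment reasoning in avr_out_fract: converting
UHA (8 integral + 8 fractional bits) to UQQ (8 fractional bits) just
keeps the fractional byte, so with overlapping, aligned registers no
instruction is emitted at all.  A rough C model of that value mapping
(illustration only, not part of the patch):

    /* UHA (8.8) -> UQQ (0.8): keep the fractional (low) byte,
       drop the integral byte.  */
    unsigned char uha_to_uqq (unsigned short uha_bits)
    {
      return (unsigned char) uha_bits;
    }
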
+
 /* Modifies the length assigned to instruction INSN
    LEN is the initially computed length of the insn.  */
 
@@ -6489,6 +6996,8 @@ adjust_insn_length (rtx insn, int len)
       
     case ADJUST_LEN_OUT_PLUS: avr_out_plus (op, &len, NULL); break;
     case ADJUST_LEN_PLUS64: avr_out_plus64 (op[0], &len); break;
+    case ADJUST_LEN_MINUS: avr_out_minus (op, &len, NULL); break;
+    case ADJUST_LEN_MINUS64: avr_out_minus64 (op[0], &len); break;
     case ADJUST_LEN_OUT_PLUS_NOCLOBBER:
       avr_out_plus_noclobber (op, &len, NULL); break;
 
@@ -6502,6 +7011,9 @@ adjust_insn_length (rtx insn, int len)
     case ADJUST_LEN_XLOAD: avr_out_xload (insn, op, &len); break;
     case ADJUST_LEN_LOAD_LPM: avr_load_lpm (insn, op, &len); break;
 
+    case ADJUST_LEN_SFRACT: avr_out_fract (insn, op, true, &len); break;
+    case ADJUST_LEN_UFRACT: avr_out_fract (insn, op, false, &len); break;
+
     case ADJUST_LEN_TSTHI: avr_out_tsthi (insn, op, &len); break;
     case ADJUST_LEN_TSTPSI: avr_out_tstpsi (insn, op, &len); break;
     case ADJUST_LEN_TSTSI: avr_out_tstsi (insn, op, &len); break;
@@ -6683,6 +7195,20 @@ avr_assemble_integer (rtx x, unsigned in
       
       return true;
     }
+  else if (CONST_FIXED_P (x))
+    {
+      unsigned n;
+
+      /* varasm fails to handle big fixed-point modes that don't fit
+         in a HOST_WIDE_INT.  */
+
+      for (n = 0; n < size; n++)
+        {
+          rtx xn = simplify_gen_subreg (QImode, x, GET_MODE (x), n);
+          default_assemble_integer (xn, 1, aligned_p);
+        }
+
+      return true;
+    }
   
   return default_assemble_integer (x, size, aligned_p);
 }
@@ -7489,6 +8015,7 @@ avr_operand_rtx_cost (rtx x, enum machin
       return 0;
 
     case CONST_INT:
+    case CONST_FIXED:
     case CONST_DOUBLE:
       return COSTS_N_INSNS (GET_MODE_SIZE (mode));
 
@@ -7518,6 +8045,7 @@ avr_rtx_costs_1 (rtx x, int codearg, int
   switch (code)
     {
     case CONST_INT:
+    case CONST_FIXED:
     case CONST_DOUBLE:
     case SYMBOL_REF:
     case CONST:
@@ -8446,11 +8974,17 @@ avr_compare_pattern (rtx insn)
   if (pattern
       && NONJUMP_INSN_P (insn)
       && SET_DEST (pattern) == cc0_rtx
-      && GET_CODE (SET_SRC (pattern)) == COMPARE
-      && DImode != GET_MODE (XEXP (SET_SRC (pattern), 0))
-      && DImode != GET_MODE (XEXP (SET_SRC (pattern), 1)))
+      && GET_CODE (SET_SRC (pattern)) == COMPARE)
     {
-      return pattern;
+      enum machine_mode mode0 = GET_MODE (XEXP (SET_SRC (pattern), 0));
+      enum machine_mode mode1 = GET_MODE (XEXP (SET_SRC (pattern), 1));
+
+      /* The 64-bit comparisons have fixed operands ACC_A and ACC_B.
+         They must not be swapped, thus skip them.  */
+
+      if ((mode0 == VOIDmode || GET_MODE_SIZE (mode0) <= 4)
+          && (mode1 == VOIDmode || GET_MODE_SIZE (mode1) <= 4))
+        return pattern;
     }
 
   return NULL_RTX;
@@ -8788,6 +9322,8 @@ avr_2word_insn_p (rtx insn)
       return false;
       
     case CODE_FOR_movqi_insn:
+    case CODE_FOR_movuqq_insn:
+    case CODE_FOR_movqq_insn:
       {
         rtx set  = single_set (insn);
         rtx src  = SET_SRC (set);
@@ -8796,7 +9332,7 @@ avr_2word_insn_p (rtx insn)
         /* Factor out LDS and STS from movqi_insn.  */
         
         if (MEM_P (dest)
-            && (REG_P (src) || src == const0_rtx))
+            && (REG_P (src) || src == CONST0_RTX (GET_MODE (dest))))
           {
             return CONSTANT_ADDRESS_P (XEXP (dest, 0));
           }
@@ -9021,7 +9557,7 @@ output_reload_in_const (rtx *op, rtx clo
   
   if (NULL_RTX == clobber_reg
       && !test_hard_reg_class (LD_REGS, dest)
-      && (! (CONST_INT_P (src) || CONST_DOUBLE_P (src))
+      && (! (CONST_INT_P (src) || CONST_FIXED_P (src) || CONST_DOUBLE_P (src))
           || !avr_popcount_each_byte (src, n_bytes,
                                       (1 << 0) | (1 << 1) | (1 << 8))))
     {
@@ -9048,6 +9584,7 @@ output_reload_in_const (rtx *op, rtx clo
       ldreg_p = test_hard_reg_class (LD_REGS, xdest[n]);
 
       if (!CONST_INT_P (src)
+          && !CONST_FIXED_P (src)
           && !CONST_DOUBLE_P (src))
         {
           static const char* const asm_code[][2] =
@@ -9239,6 +9776,7 @@ output_reload_insisf (rtx *op, rtx clobb
   if (AVR_HAVE_MOVW
       && !test_hard_reg_class (LD_REGS, op[0])
       && (CONST_INT_P (op[1])
+          || CONST_FIXED_P (op[1])
           || CONST_DOUBLE_P (op[1])))
     {
       int len_clr, len_noclr;
@@ -10834,6 +11372,12 @@ avr_fold_builtin (tree fndecl, int n_arg
 #undef  TARGET_SCALAR_MODE_SUPPORTED_P
 #define TARGET_SCALAR_MODE_SUPPORTED_P avr_scalar_mode_supported_p
 
+#undef  TARGET_BUILD_BUILTIN_VA_LIST
+#define TARGET_BUILD_BUILTIN_VA_LIST avr_build_builtin_va_list
+
+#undef  TARGET_FIXED_POINT_SUPPORTED_P
+#define TARGET_FIXED_POINT_SUPPORTED_P hook_bool_void_true
+
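
With TARGET_FIXED_POINT_SUPPORTED_P returning true, avr-gcc accepts the
ISO/IEC TR 18037 fixed-point types.  A minimal user-level example, for
illustration only:

    #include <stdfix.h>

    /* short _Accum maps to the HA layout handled above;
       the hk suffix denotes a short _Accum constant.  */
    short _Accum scale (short _Accum x)
    {
      return x * 1.5hk;
    }
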
 #undef  TARGET_ADDR_SPACE_SUBSET_P
 #define TARGET_ADDR_SPACE_SUBSET_P avr_addr_space_subset_p
 
Index: gcc/config/avr/avr.h
===================================================================
--- gcc/config/avr/avr.h	(revision 190535)
+++ gcc/config/avr/avr.h	(working copy)
@@ -261,6 +261,7 @@ enum
 #define FLOAT_TYPE_SIZE 32
 #define DOUBLE_TYPE_SIZE 32
 #define LONG_DOUBLE_TYPE_SIZE 32
+#define LONG_LONG_ACCUM_TYPE_SIZE 64
 
 #define DEFAULT_SIGNED_CHAR 1
 
Index: libgcc/config/avr/avr-lib.h
===================================================================
--- libgcc/config/avr/avr-lib.h	(revision 190620)
+++ libgcc/config/avr/avr-lib.h	(working copy)
@@ -4,3 +4,79 @@
 #define DI SI
 typedef int QItype __attribute__ ((mode (QI)));
 #endif
+
+/* fixed-bit.h does not define functions for TA and UTA because
+   that part is wrapped in #if MIN_UNITS_PER_WORD > 4.
+   This would lead to empty functions for TA and UTA.
+   Thus, supply appropriate defines as if HAVE_[U]TA == 1.
+   #define HAVE_[U]TA 1 won't work because avr-modes.def
+   uses ADJUST_BYTESIZE(TA,8) and fixed-bit.h is not generic enough
+   to arrange for such changes of the mode size.  */
+
+typedef unsigned _Fract UTAtype __attribute__ ((mode (UTA)));
+
+#if defined (UTA_MODE)
+#define FIXED_SIZE      8       /* in bytes */
+#define INT_C_TYPE      UDItype
+#define UINT_C_TYPE     UDItype
+#define HINT_C_TYPE     USItype
+#define HUINT_C_TYPE    USItype
+#define MODE_NAME       UTA
+#define MODE_NAME_S     uta
+#define MODE_UNSIGNED   1
+#endif
+
+#if defined (FROM_UTA)
+#define FROM_TYPE               4       /* ID for fixed-point */
+#define FROM_MODE_NAME          UTA
+#define FROM_MODE_NAME_S        uta
+#define FROM_INT_C_TYPE         UDItype
+#define FROM_SINT_C_TYPE        DItype
+#define FROM_UINT_C_TYPE        UDItype
+#define FROM_MODE_UNSIGNED      1
+#define FROM_FIXED_SIZE         8       /* in bytes */
+#elif defined (TO_UTA)
+#define TO_TYPE                 4       /* ID for fixed-point */
+#define TO_MODE_NAME            UTA
+#define TO_MODE_NAME_S          uta
+#define TO_INT_C_TYPE           UDItype
+#define TO_SINT_C_TYPE          DItype
+#define TO_UINT_C_TYPE          UDItype
+#define TO_MODE_UNSIGNED        1
+#define TO_FIXED_SIZE           8       /* in bytes */
+#endif
+
+/* Same for TAmode */
+
+typedef _Fract TAtype  __attribute__ ((mode (TA)));
+
+#if defined (TA_MODE)
+#define FIXED_SIZE      8       /* in bytes */
+#define INT_C_TYPE      DItype
+#define UINT_C_TYPE     UDItype
+#define HINT_C_TYPE     SItype
+#define HUINT_C_TYPE    USItype
+#define MODE_NAME       TA
+#define MODE_NAME_S     ta
+#define MODE_UNSIGNED   0
+#endif
+
+#if defined (FROM_TA)
+#define FROM_TYPE               4       /* ID for fixed-point */
+#define FROM_MODE_NAME          TA
+#define FROM_MODE_NAME_S        ta
+#define FROM_INT_C_TYPE         DItype
+#define FROM_SINT_C_TYPE        DItype
+#define FROM_UINT_C_TYPE        UDItype
+#define FROM_MODE_UNSIGNED      0
+#define FROM_FIXED_SIZE         8       /* in bytes */
+#elif defined (TO_TA)
+#define TO_TYPE                 4       /* ID for fixed-point */
+#define TO_MODE_NAME            TA
+#define TO_MODE_NAME_S          ta
+#define TO_INT_C_TYPE           DItype
+#define TO_SINT_C_TYPE          DItype
+#define TO_UINT_C_TYPE          UDItype
+#define TO_MODE_UNSIGNED        0
+#define TO_FIXED_SIZE           8       /* in bytes */
+#endif
Index: libgcc/config/avr/lib1funcs-fixed.S
===================================================================
--- libgcc/config/avr/lib1funcs-fixed.S	(revision 0)
+++ libgcc/config/avr/lib1funcs-fixed.S	(revision 0)
@@ -0,0 +1,874 @@
+/*  -*- Mode: Asm -*-  */
+;;    Copyright (C) 2012
+;;    Free Software Foundation, Inc.
+;;    Contributed by Sean D'Epagnier  (sean@depagnier.com)
+;;                   Georg-Johann Lay (avr@gjlay.de)
+
+;; This file is free software; you can redistribute it and/or modify it
+;; under the terms of the GNU General Public License as published by the
+;; Free Software Foundation; either version 3, or (at your option) any
+;; later version.
+
+;; In addition to the permissions in the GNU General Public License, the
+;; Free Software Foundation gives you unlimited permission to link the
+;; compiled version of this file into combinations with other programs,
+;; and to distribute those combinations without any restriction coming
+;; from the use of this file.  (The General Public License restrictions
+;; do apply in other respects; for example, they cover modification of
+;; the file, and distribution when not linked into a combine
+;; executable.)
+
+;; This file is distributed in the hope that it will be useful, but
+;; WITHOUT ANY WARRANTY; without even the implied warranty of
+;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+;; General Public License for more details.
+
+;; You should have received a copy of the GNU General Public License
+;; along with this program; see the file COPYING.  If not, write to
+;; the Free Software Foundation, 51 Franklin Street, Fifth Floor,
+;; Boston, MA 02110-1301, USA.
+
+;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
+;; Fixed point library routines for AVR
+;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
+
+.section .text.libgcc.fixed, "ax", @progbits
+
+;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
+;; Conversions to float
+;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
+
+#if defined (L_fractqqsf)
+DEFUN __fractqqsf
+    ;; Move in place for SA -> SF conversion
+    clr     r22
+    mov     r23, r24
+    lsl     r23
+    ;; Sign-extend
+    sbc     r24, r24
+    mov     r25, r24
+    XJMP    __fractsasf
+ENDF __fractqqsf
+#endif  /* L_fractqqsf */
+
+#if defined (L_fractuqqsf)
+DEFUN __fractuqqsf
+    ;; Move in place for USA -> SF conversion
+    clr     r22
+    mov     r23, r24
+    ;; Zero-extend
+    clr     r24
+    clr     r25
+    XJMP    __fractusasf
+ENDF __fractuqqsf
+#endif  /* L_fractuqqsf */
+
+#if defined (L_fracthqsf)
+DEFUN __fracthqsf
+    ;; Move in place for SA -> SF conversion
+    wmov    22, 24
+    lsl     r22
+    rol     r23
+    ;; Sign-extend
+    sbc     r24, r24
+    mov     r25, r24
+    XJMP    __fractsasf
+ENDF __fracthqsf
+#endif  /* L_fracthqsf */
+
+#if defined (L_fractuhqsf)
+DEFUN __fractuhqsf
+    ;; Move in place for USA -> SF conversion
+    wmov    22, 24
+    ;; Zero-extend
+    clr     r24
+    clr     r25
+    XJMP    __fractusasf
+ENDF __fractuhqsf
+#endif  /* L_fractuhqsf */
+
+#if defined (L_fracthasf)
+DEFUN __fracthasf
+    ;; Move in place for SA -> SF conversion
+    clr     r22
+    mov     r23, r24
+    mov     r24, r25
+    ;; Sign-extend
+    lsl     r25
+    sbc     r25, r25
+    XJMP    __fractsasf
+ENDF __fracthasf
+#endif  /* L_fracthasf */
+
+#if defined (L_fractuhasf)
+DEFUN __fractuhasf
+    ;; Move in place for USA -> SF conversion
+    clr     r22
+    mov     r23, r24
+    mov     r24, r25
+    ;; Zero-extend
+    clr     r25
+    XJMP    __fractusasf
+ENDF __fractuhasf
+#endif  /* L_fractuhasf */
+
+
+#if defined (L_fractsqsf)
+DEFUN __fractsqsf
+    XCALL   __floatsisf
+    ;; Divide non-zero results by 2^31 to move the
+    ;; decimal point into place
+    tst     r25
+    breq    0f
+    subi    r24, exp_lo (31)
+    sbci    r25, exp_hi (31)
+0:  ret
+ENDF __fractsqsf
+#endif  /* L_fractsqsf */
+
+#if defined (L_fractusqsf)
+DEFUN __fractusqsf
+    XCALL   __floatunsisf
+    ;; Divide non-zero results by 2^32 to move the
+    ;; decimal point into place
+    cpse    r25, __zero_reg__
+    subi    r25, exp_hi (32)
+    ret
+ENDF __fractusqsf
+#endif  /* L_fractusqsf */
+
+#if defined (L_fractsasf)
+DEFUN __fractsasf
+    XCALL   __floatsisf
+    ;; Divide non-zero results by 2^16 to move the
+    ;; decimal point into place
+    cpse    r25, __zero_reg__
+    subi    r25, exp_hi (16)
+    ret
+ENDF __fractsasf
+#endif  /* L_fractsasf */
+
+#if defined (L_fractusasf)
+DEFUN __fractusasf
+    XCALL   __floatunsisf
+    ;; Divide non-zero results by 2^16 to move the
+    ;; decimal point into place
+    cpse    r25, __zero_reg__
+    subi    r25, exp_hi (16)
+    ret
+ENDF __fractusasf
+#endif  /* L_fractusasf */
+
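
A rough C model of the accum-to-float conversions above, using
__fractsasf as the example (a sketch assuming the byte-aligned layout
with 16 fractional bits for SA that this patch's mode adjustments set
up; the asm gets the scaling by patching the IEEE exponent byte of a
non-zero result instead of multiplying):

    #include <stdint.h>

    float fractsasf_model (int32_t sa_bits)
    {
      /* SA -> SF: treat the raw bits as an integer, then scale
         by 2^-16 to place the decimal point.  */
      return (float) sa_bits * 0x1p-16f;
    }
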
+;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
+;; Conversions from float
+;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
+       
+#if defined (L_fractsfqq)
+DEFUN __fractsfqq
+    ;; Multiply with 2^{24+7} to get a QQ result in r25
+    subi    r24, exp_lo (-31)
+    sbci    r25, exp_hi (-31)
+    XCALL   __fixsfsi
+    mov     r24, r25
+    ret
+ENDF __fractsfqq
+#endif  /* L_fractsfqq */
+
+#if defined (L_fractsfuqq)
+DEFUN __fractsfuqq
+    ;; Multiply with 2^{24+8} to get a UQQ result in r25
+    subi    r25, exp_hi (-32)
+    XCALL   __fixunssfsi
+    mov     r24, r25
+    ret
+ENDF __fractsfuqq
+#endif  /* L_fractsfuqq */
+
+#if defined (L_fractsfha)
+DEFUN __fractsfha
+    ;; Multiply with 2^24 to get a HA result in r25:r24
+    subi    r25, exp_hi (-24)
+    XJMP    __fixsfsi
+ENDF __fractsfha
+#endif  /* L_fractsfha */
+
+#if defined (L_fractsfuha)
+DEFUN __fractsfuha
+    ;; Multiply with 2^24 to get a UHA result in r25:r24
+    subi    r25, exp_hi (-24)
+    XJMP    __fixunssfsi
+ENDF __fractsfuha
+#endif  /* L_fractsfuha */
+
+#if defined (L_fractsfhq)
+DEFUN __fractsfsq
+ENDF  __fractsfsq
+
+DEFUN __fractsfhq
+    ;; Multiply with 2^{16+15} to get a HQ result in r25:r24
+    ;; resp. with 2^31 to get a SQ result in r25:r22
+    subi    r24, exp_lo (-31)
+    sbci    r25, exp_hi (-31)
+    XJMP    __fixsfsi
+ENDF __fractsfhq
+#endif  /* L_fractsfhq */
+
+#if defined (L_fractsfuhq)
+DEFUN __fractsfusq
+ENDF  __fractsfusq
+
+DEFUN __fractsfuhq
+    ;; Multiply with 2^{16+16} to get a UHQ result in r25:r24
+    ;; resp. with 2^32 to get a USQ result in r25:r22
+    subi    r25, exp_hi (-32)
+    XJMP    __fixunssfsi
+ENDF __fractsfuhq
+#endif  /* L_fractsfuhq */
+
+#if defined (L_fractsfsa)
+DEFUN __fractsfsa
+    ;; Multiply with 2^16 to get a SA result in r25:r22
+    subi    r25, exp_hi (-16)
+    XJMP    __fixsfsi
+ENDF __fractsfsa
+#endif  /* L_fractsfsa */
+
+#if defined (L_fractsfusa)
+DEFUN __fractsfusa
+    ;; Multiply with 2^16 to get a USA result in r25:r22
+    subi    r25, exp_hi (-16)
+    XJMP    __fixunssfsi
+ENDF __fractsfusa
+#endif  /* L_fractsfusa */
+
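
The float-to-fixed direction scales up before fixing to an integer; a C
sketch of __fractsfsa under the same layout assumption:

    #include <stdint.h>

    int32_t fractsfsa_model (float f)
    {
      /* SF -> SA: multiply by 2^16 (the asm adds 16 to the
         exponent byte instead), then truncate to 32-bit integer.  */
      return (int32_t) (f * 0x1p16f);
    }
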
+
+;; For multiplication the functions here are called directly from
+;; avr-fixed.md instead of using the standard libcall mechanisms.
+;; This can make better code because GCC knows exactly which
+;; of the call-used registers (not all of them) are clobbered.
+
+/*******************************************************
+    Fractional  Multiplication  8 x 8  without MUL
+*******************************************************/
+
+#if defined (L_mulqq3) && !defined (__AVR_HAVE_MUL__)
+;;; R23 = R24 * R25
+;;; Clobbers: __tmp_reg__, R22, R24, R25
+;;; Rounding: ???
+DEFUN __mulqq3
+    XCALL   __fmuls
+    ;; TR 18037 requires that  (-1) * (-1)  does not overflow
+    ;; The only input that can produce  -1  is  (-1)^2.
+    dec     r23
+    brvs    0f
+    inc     r23
+0:  ret
+ENDF  __mulqq3
+#endif /* L_mulqq3 && ! HAVE_MUL */
+
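
The TR 18037 corner case handled above, as a C sketch (hypothetical
helper, rounding ignored):

    #include <stdint.h>

    int8_t mulqq3_model (int8_t a, int8_t b)
    {
      /* Q0.7 x Q0.7: only (-1) * (-1) can produce +1, which QQ
         cannot represent; saturate to 0x7f as TR 18037 requires.  */
      int16_t p = ((int16_t) a * b) >> 7;
      return p > 127 ? 127 : (int8_t) p;
    }
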
+/*******************************************************
+    Fractional Multiply  .16 x .16  with and without MUL
+*******************************************************/
+
+#if defined (L_mulhq3)
+;;; Same code with and without MUL, but the interfaces differ:
+;;; no MUL: (R25:R24) = (R22:R23) * (R24:R25)
+;;;         Clobbers: ABI, called by optabs
+;;; MUL:    (R25:R24) = (R19:R18) * (R27:R26)
+;;;         Clobbers: __tmp_reg__, R22, R23
+;;; Rounding:  -0.5 LSB  <= error  <=  0.5 LSB
+DEFUN   __mulhq3
+    XCALL   __mulhisi3
+    ;; Shift result into place
+    lsl     r23
+    rol     r24
+    rol     r25
+    brvs    1f
+    ;; Round
+    sbrc    r23, 7
+    adiw    r24, 1
+    ret
+1:  ;; Overflow.  TR 18037 requires  (-1)^2  not to overflow
+    ldi     r24, lo8 (0x7fff)
+    ldi     r25, hi8 (0x7fff)
+    ret
+ENDF __mulhq3
+#endif  /* defined (L_mulhq3) */
+
+#if defined (L_muluhq3)
+;;; Same code with and without MUL, but the interfaces differ:
+;;; no MUL: (R25:R24) *= (R23:R22)
+;;;         Clobbers: ABI, called by optabs
+;;; MUL:    (R25:R24) = (R19:R18) * (R27:R26)
+;;;         Clobbers: __tmp_reg__, R22, R23
+;;; Rounding:  -0.5 LSB  <  error  <=  0.5 LSB
+DEFUN   __muluhq3
+    XCALL   __umulhisi3
+    ;; Round
+    sbrc    r23, 7
+    adiw    r24, 1
+    ret
+ENDF __muluhq3
+#endif  /* L_muluhq3 */
+
+
+/*******************************************************
+    Fixed  Multiply  8.8 x 8.8  with and without MUL
+*******************************************************/
+
+#if defined (L_mulha3)
+;;; Same code with and without MUL, but the interfaces differ:
+;;; no MUL: (R25:R24) = (R22:R23) * (R24:R25)
+;;;         Clobbers: ABI, called by optabs
+;;; MUL:    (R25:R24) = (R19:R18) * (R27:R26)
+;;;         Clobbers: __tmp_reg__, R22, R23
+;;; Rounding:  -0.5 LSB  <=  error  <=  0.5 LSB
+DEFUN   __mulha3
+    XCALL   __mulhisi3
+    XJMP    __muluha3_round
+ENDF __mulha3
+#endif  /* L_mulha3 */
+
+#if defined (L_muluha3)
+;;; Same code with and without MUL, but the interfaces differ:
+;;; no MUL: (R25:R24) *= (R23:R22)
+;;;         Clobbers: ABI, called by optabs
+;;; MUL:    (R25:R24) = (R19:R18) * (R27:R26)
+;;;         Clobbers: __tmp_reg__, R22, R23
+;;; Rounding:  -0.5 LSB  <  error  <=  0.5 LSB
+DEFUN   __muluha3
+    XCALL   __umulhisi3
+    XJMP    __muluha3_round
+ENDF __muluha3
+#endif  /* L_muluha3 */
+
+#if defined (L_muluha3_round)
+DEFUN   __muluha3_round
+    ;; Shift result into place
+    mov     r25, r24
+    mov     r24, r23
+    ;; Round
+    sbrc    r22, 7
+    adiw    r24, 1
+    ret
+ENDF __muluha3_round
+#endif  /* L_muluha3_round */
+
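
In C terms, the 8.8 x 8.8 product above is the middle 16 bits of a
16.16 intermediate, rounded at bit 7 (sketch, hypothetical name):

    #include <stdint.h>

    uint16_t muluha3_model (uint16_t a, uint16_t b)
    {
      uint32_t p = (uint32_t) a * b;        /* 16.16 intermediate */
      return (uint16_t) ((p + 0x80) >> 8);  /* round to nearest 8.8 */
    }
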
+
+/*******************************************************
+    Fixed  Multiplication  16.16 x 16.16
+*******************************************************/
+
+#if defined (__AVR_HAVE_MUL__)
+
+;; Multiplier
+#define A0  16
+#define A1  A0+1
+#define A2  A1+1
+#define A3  A2+1
+
+;; Multiplicand
+#define B0  20
+#define B1  B0+1
+#define B2  B1+1
+#define B3  B2+1
+
+;; Result
+#define C0  24
+#define C1  C0+1
+#define C2  C1+1
+#define C3  C2+1
+
+#if defined (L_mulusa3)
+;;; (C3:C0) = (A3:A0) * (B3:B0)
+;;; Clobbers: __tmp_reg__
+;;; Rounding:  -0.5 LSB  <  error  <=  0.5 LSB
+DEFUN   __mulusa3
+    ;; Some of the MUL instructions have LSBs outside the result.
+    ;; Don't ignore these LSBs in order to tame rounding error.
+    ;; Use C2/C3 for these LSBs.
+
+    clr C0
+    clr C1
+    mul A0, B0  $  movw C2, r0
+
+    mul A1, B0  $  add  C3, r0  $  adc C0, r1
+    mul A0, B1  $  add  C3, r0  $  adc C0, r1  $  rol C1
+    
+    ;; Round
+    sbrc C3, 7
+    adiw C0, 1
+    
+    ;; The following MULs don't have LSBs outside the result.
+    ;; C2/C3 is the high part.
+
+    mul  A0, B2  $  add C0, r0  $  adc C1, r1  $  sbc  C2, C2
+    mul  A1, B1  $  add C0, r0  $  adc C1, r1  $  sbci C2, 0
+    mul  A2, B0  $  add C0, r0  $  adc C1, r1  $  sbci C2, 0
+    neg  C2
+
+    mul  A0, B3  $  add C1, r0  $  adc C2, r1  $  sbc  C3, C3
+    mul  A1, B2  $  add C1, r0  $  adc C2, r1  $  sbci C3, 0
+    mul  A2, B1  $  add C1, r0  $  adc C2, r1  $  sbci C3, 0
+    mul  A3, B0  $  add C1, r0  $  adc C2, r1  $  sbci C3, 0
+    neg  C3
+    
+    mul  A1, B3  $  add C2, r0  $  adc C3, r1
+    mul  A2, B2  $  add C2, r0  $  adc C3, r1
+    mul  A3, B1  $  add C2, r0  $  adc C3, r1
+    
+    mul  A2, B3  $  add C3, r0
+    mul  A3, B2  $  add C3, r0
+
+    clr  __zero_reg__
+    ret
+ENDF __mulusa3
+#endif /* L_mulusa3 */
+
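
The same scheme in C for the 16.16 case (a sketch; the asm keeps extra
guard bits in C2/C3 to reach the stated rounding bound):

    #include <stdint.h>

    uint32_t mulusa3_model (uint32_t a, uint32_t b)
    {
      /* 16.16 x 16.16: the exact product has 32 fractional bits;
         keep the middle 32, rounding at bit 15.  */
      uint64_t p = (uint64_t) a * b;
      return (uint32_t) ((p + 0x8000) >> 16);
    }
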
+#if defined (L_mulsa3)
+;;; (C3:C0) = (A3:A0) * (B3:B0)
+;;; Clobbers: __tmp_reg__
+;;; Rounding:  -0.5 LSB  <=  error  <=  0.5 LSB
+DEFUN __mulsa3
+    XCALL   __mulusa3
+    tst     B3
+    brpl    1f
+    sub     C2, A0
+    sbc     C3, A1
+1:  sbrs    A3, 7
+    ret
+    sub     C2, B0
+    sbc     C3, B1
+    ret
+ENDF __mulsa3
+#endif /* L_mulsa3 */
+
+#undef A0
+#undef A1
+#undef A2
+#undef A3
+#undef B0
+#undef B1
+#undef B2
+#undef B3
+#undef C0
+#undef C1
+#undef C2
+#undef C3
+
+#else /* __AVR_HAVE_MUL__ */
+
+#define A0 18
+#define A1 A0+1
+#define A2 A0+2
+#define A3 A0+3
+
+#define B0 22
+#define B1 B0+1
+#define B2 B0+2
+#define B3 B0+3
+
+#define C0  22
+#define C1  C0+1
+#define C2  C0+2
+#define C3  C0+3
+
+;; __tmp_reg__
+#define CC0  0
+;; __zero_reg__
+#define CC1  1
+#define CC2  16
+#define CC3  17
+
+#define AA0  26
+#define AA1  AA0+1
+#define AA2  30
+#define AA3  AA2+1
+
+#if defined (L_mulsa3)
+;;; (R25:R22)  *=  (R21:R18)
+;;; Clobbers: ABI, called by optabs
+;;; Rounding:  -1 LSB  <=  error  <=  1 LSB
+DEFUN   __mulsa3
+    push    B0
+    push    B1
+    bst     B3, 7
+    XCALL   __mulusa3
+    ;; A survived in  31:30:27:26
+    rcall 1f
+    pop     AA1
+    pop     AA0
+    bst     AA3, 7
+1:  brtc  9f
+    ;; 1-extend A/B
+    sub     C2, AA0
+    sbc     C3, AA1
+9:  ret
+ENDF __mulsa3
+#endif  /* L_mulsa3 */
+
+#if defined (L_mulusa3)
+;;; (R25:R22)  *=  (R21:R18)
+;;; Clobbers: ABI, called by optabs and __mulsa3
+;;; Rounding:  -1 LSB  <=  error  <=  1 LSB
+;;; Does not clobber T and A[] survives in 26, 27, 30, 31
+DEFUN   __mulusa3
+    push    CC2
+    push    CC3
+    ; clear result
+    clr     __tmp_reg__
+    wmov    CC2, CC0
+    ; save multiplicand
+    wmov    AA0, A0
+    wmov    AA2, A2
+    rjmp 3f
+
+    ;; Loop the integral part
+
+1:  ;; CC += A * 2^n;  n >= 0
+    add  CC0,A0  $  adc CC1,A1  $  adc  CC2,A2  $  adc  CC3,A3
+
+2:  ;; A <<= 1
+    lsl  A0      $  rol A1      $  rol  A2      $  rol  A3
+
+3:  ;; IBIT(B) >>= 1
+    ;; Carry = n-th bit of B;  n >= 0
+    lsr     B3
+    ror     B2
+    brcs 1b
+    sbci    B3, 0
+    brne 2b
+
+    ;; Loop the fractional part
+    ;; B2/B3 is 0 now, use as guard bits for rounding
+    ;; Restore multiplicand
+    wmov    A0, AA0
+    wmov    A2, AA2
+    rjmp 5f
+
+4:  ;; CC += A:Guard * 2^n;  n < 0
+    add  B3,B2 $  adc  CC0,A0  $  adc  CC1,A1  $  adc  CC2,A2  $  adc  CC3,A3
+5:
+    ;; A:Guard >>= 1
+    lsr  A3   $  ror  A2  $  ror  A1  $  ror   A0  $   ror  B2
+
+    ;; FBIT(B) <<= 1
+    ;; Carry = n-th bit of B;  n < 0
+    lsl     B0
+    rol     B1
+    brcs 4b
+    sbci    B0, 0
+    brne 5b
+
+    ;; Move result into place and round
+    lsl     B3
+    wmov    C2, CC2
+    wmov    C0, CC0
+    clr     __zero_reg__
+    adc     C0, __zero_reg__
+    adc     C1, __zero_reg__
+    adc     C2, __zero_reg__
+    adc     C3, __zero_reg__
+    
+    ;; Epilogue
+    pop     CC3
+    pop     CC2
+    ret
+ENDF __mulusa3
+#endif  /* L_mulusa3 */
+
+#undef A0
+#undef A1
+#undef A2
+#undef A3
+#undef B0
+#undef B1
+#undef B2
+#undef B3
+#undef C0
+#undef C1
+#undef C2
+#undef C3
+#undef AA0
+#undef AA1
+#undef AA2
+#undef AA3
+#undef CC0
+#undef CC1
+#undef CC2
+#undef CC3
+
+#endif /* __AVR_HAVE_MUL__ */
+
+/*******************************************************
+      Fractional Division 8 / 8
+*******************************************************/
+
+#define r_divd  r25     /* dividend */
+#define r_quo   r24     /* quotient */
+#define r_div   r22     /* divisor */
+
+#if defined (L_divqq3)
+DEFUN   __divqq3
+    mov     r0, r_divd
+    eor     r0, r_div
+    sbrc    r_div, 7
+    neg     r_div
+    sbrc    r_divd, 7
+    neg     r_divd
+    cp      r_divd, r_div
+    breq    __divqq3_minus1  ; if equal return -1
+    XCALL   __udivuqq3
+    lsr     r_quo
+    sbrc    r0, 7   ; negate result if needed
+    neg     r_quo
+    ret
+__divqq3_minus1:
+    ldi     r_quo, 0x80
+    ret
+ENDF __divqq3
+#endif  /* defined (L_divqq3) */
+
+#if defined (L_udivuqq3)
+DEFUN   __udivuqq3
+    clr     r_quo           ; clear quotient
+    inc     __zero_reg__    ; init loop counter, used per shift
+__udivuqq3_loop:
+    lsl     r_divd          ; shift dividend
+    brcs    0f              ; dividend overflow
+    cp      r_divd,r_div    ; compare dividend & divisor
+    brcc    0f              ; dividend >= divisor
+    rol     r_quo           ; shift quotient (with CARRY)
+    rjmp    __udivuqq3_cont
+0:
+    sub     r_divd,r_div    ; restore dividend
+    lsl     r_quo           ; shift quotient (without CARRY)
+__udivuqq3_cont:
+    lsl     __zero_reg__    ; shift loop-counter bit
+    brne    __udivuqq3_loop
+    com     r_quo           ; complement result
+                            ; because C flag was complemented in loop
+    ret
+ENDF __udivuqq3
+#endif  /* defined (L_udivuqq3) */
+
+#undef  r_divd
+#undef  r_quo
+#undef  r_div
+
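
For reference, the unsigned Q0.8 quotient computed by the loop above is,
in C (sketch; assumes a < b so the quotient fits):

    #include <stdint.h>

    uint8_t udivuqq3_model (uint8_t a, uint8_t b)
    {
      /* (a/256) / (b/256) == a/b; as UQQ that is floor (256*a / b).  */
      return (uint8_t) (((uint16_t) a << 8) / b);
    }
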
+
+/*******************************************************
+    Fractional Division 16 / 16
+*******************************************************/
+#define r_divdL 26     /* dividend Low */
+#define r_divdH 27     /* dividend High */
+#define r_quoL  24     /* quotient Low */
+#define r_quoH  25     /* quotient High */
+#define r_divL  22     /* divisor */
+#define r_divH  23     /* divisor */
+#define r_cnt   21
+
+#if defined (L_divhq3)
+DEFUN   __divhq3
+    mov     r0, r_divdH
+    eor     r0, r_divH
+    sbrs    r_divH, 7
+    rjmp    1f
+    NEG2    r_divL
+1:
+    sbrs    r_divdH, 7
+    rjmp    2f
+    NEG2    r_divdL
+2:
+    cp      r_divdL, r_divL
+    cpc     r_divdH, r_divH
+    breq    __divhq3_minus1  ; if equal return -1
+    XCALL   __udivuhq3
+    lsr     r_quoH
+    ror     r_quoL
+    brpl    9f
+    ;; negate result if needed
+    NEG2    r_quoL
+9:
+    ret
+__divhq3_minus1:
+    ldi     r_quoH, 0x80
+    clr     r_quoL
+    ret
+ENDF __divhq3
+#endif  /* defined (L_divhq3) */
+
+#if defined (L_udivuhq3)
+DEFUN   __udivuhq3
+    sub     r_quoH,r_quoH   ; clear quotient and carry
+    ;; FALLTHRU
+ENDF __udivuhq3
+
+DEFUN   __udivuha3_common
+    clr     r_quoL          ; clear quotient
+    ldi     r_cnt,16        ; init loop counter
+__udivuhq3_loop:
+    rol     r_divdL         ; shift dividend (with CARRY)
+    rol     r_divdH
+    brcs    __udivuhq3_ep   ; dividend overflow
+    cp      r_divdL,r_divL  ; compare dividend & divisor
+    cpc     r_divdH,r_divH
+    brcc    __udivuhq3_ep   ; dividend >= divisor
+    rol     r_quoL          ; shift quotient (with CARRY)
+    rjmp    __udivuhq3_cont
+__udivuhq3_ep:
+    sub     r_divdL,r_divL  ; restore dividend
+    sbc     r_divdH,r_divH
+    lsl     r_quoL          ; shift quotient (without CARRY)
+__udivuhq3_cont:
+    rol     r_quoH          ; shift quotient
+    dec     r_cnt           ; decrement loop counter
+    brne    __udivuhq3_loop
+    com     r_quoL          ; complement result
+    com     r_quoH          ; because C flag was complemented in loop
+    ret
+ENDF __udivuha3_common
+#endif  /* defined (L_udivuhq3) */
+
+/*******************************************************
+    Fixed Division 8.8 / 8.8
+*******************************************************/
+#if defined (L_divha3)
+DEFUN   __divha3
+    mov     r0, r_divdH
+    eor     r0, r_divH
+    sbrs    r_divH, 7
+    rjmp    1f
+    NEG2    r_divL
+1:
+    sbrs    r_divdH, 7
+    rjmp    2f
+    NEG2    r_divdL
+2:
+    XCALL   __udivuha3
+    sbrs    r0, 7   ; negate result if needed
+    ret
+    NEG2    r_quoL
+    ret
+ENDF __divha3
+#endif  /* defined (L_divha3) */
+
+#if defined (L_udivuha3)
+DEFUN   __udivuha3
+    mov     r_quoH, r_divdL
+    mov     r_divdL, r_divdH
+    clr     r_divdH
+    lsl     r_quoH     ; shift quotient into carry
+    XJMP    __udivuha3_common ; same as fractional after rearrange
+ENDF __udivuha3
+#endif  /* defined (L_udivuha3) */
+
+#undef  r_divdL
+#undef  r_divdH
+#undef  r_quoL
+#undef  r_quoH
+#undef  r_divL
+#undef  r_divH
+#undef  r_cnt
+
+/*******************************************************
+    Fixed Division 16.16 / 16.16
+*******************************************************/
+
+#define r_arg1L  24    /* arg1 gets passed already in place */
+#define r_arg1H  25
+#define r_arg1HL 26
+#define r_arg1HH 27
+#define r_divdL  26    /* dividend Low */
+#define r_divdH  27
+#define r_divdHL 30
+#define r_divdHH 31    /* dividend High */
+#define r_quoL   22    /* quotient Low */
+#define r_quoH   23
+#define r_quoHL  24
+#define r_quoHH  25    /* quotient High */
+#define r_divL   18    /* divisor Low */
+#define r_divH   19
+#define r_divHL  20
+#define r_divHH  21    /* divisor High */
+#define r_cnt  __zero_reg__  /* loop count (0 after the loop!) */
+
+#if defined (L_divsa3)
+DEFUN   __divsa3
+    mov     r0, r_arg1HH
+    eor     r0, r_divHH
+    sbrs    r_divHH, 7
+    rjmp    1f
+    NEG4    r_divL
+1:
+    sbrs    r_arg1HH, 7
+    rjmp    2f
+    NEG4    r_arg1L
+2:
+    XCALL   __udivusa3
+    sbrs    r0, 7   ; negate result if needed
+    ret
+    NEG4    r_quoL
+    ret
+ENDF __divsa3
+#endif  /* defined (L_divsa3) */
+
+#if defined (L_udivusa3)
+DEFUN   __udivusa3
+    ldi     r_divdHL, 32    ; init loop counter
+    mov     r_cnt, r_divdHL
+    clr     r_divdHL
+    clr     r_divdHH
+    wmov    r_quoL, r_divdHL
+    lsl     r_quoHL         ; shift quotient into carry
+    rol     r_quoHH
+__udivusa3_loop:
+    rol     r_divdL         ; shift dividend (with CARRY)
+    rol     r_divdH
+    rol     r_divdHL
+    rol     r_divdHH
+    brcs    __udivusa3_ep   ; dividend overflow
+    cp      r_divdL,r_divL  ; compare dividend & divisor
+    cpc     r_divdH,r_divH
+    cpc     r_divdHL,r_divHL
+    cpc     r_divdHH,r_divHH
+    brcc    __udivusa3_ep   ; dividend >= divisor
+    rol     r_quoL          ; shift quotient (with CARRY)
+    rjmp    __udivusa3_cont
+__udivusa3_ep:
+    sub     r_divdL,r_divL  ; restore dividend
+    sbc     r_divdH,r_divH
+    sbc     r_divdHL,r_divHL
+    sbc     r_divdHH,r_divHH
+    lsl     r_quoL          ; shift quotient (without CARRY)
+__udivusa3_cont:
+    rol     r_quoH          ; shift quotient
+    rol     r_quoHL
+    rol     r_quoHH
+    dec     r_cnt           ; decrement loop counter
+    brne    __udivusa3_loop
+    com     r_quoL          ; complement result
+    com     r_quoH          ; because C flag was complemented in loop
+    com     r_quoHL
+    com     r_quoHH
+    ret
+ENDF __udivusa3
+#endif  /* defined (L_udivusa3) */
+
+#undef  r_arg1L
+#undef  r_arg1H
+#undef  r_arg1HL
+#undef  r_arg1HH
+#undef  r_divdL
+#undef  r_divdH
+#undef  r_divdHL
+#undef  r_divdHH
+#undef  r_quoL
+#undef  r_quoH
+#undef  r_quoHL
+#undef  r_quoHH
+#undef  r_divL
+#undef  r_divH
+#undef  r_divHL
+#undef  r_divHH
+#undef  r_cnt
Index: libgcc/config/avr/lib1funcs.S
===================================================================
--- libgcc/config/avr/lib1funcs.S	(revision 190620)
+++ libgcc/config/avr/lib1funcs.S	(working copy)
@@ -91,6 +91,35 @@ see the files COPYING3 and COPYING.RUNTI
 .endfunc
 .endm
 
+;; Negate a 2-byte value held in consecutive registers
+.macro NEG2  reg
+    com     \reg+1
+    neg     \reg
+    sbci    \reg+1, -1
+.endm
+
+;; Negate a 4-byte value held in consecutive registers
+.macro NEG4  reg
+    com     \reg+3
+    com     \reg+2
+    com     \reg+1
+.if \reg >= 16
+    neg     \reg
+    sbci    \reg+1, -1
+    sbci    \reg+2, -1
+    sbci    \reg+3, -1
+.else
+    com     \reg
+    adc     \reg,   __zero_reg__
+    adc     \reg+1, __zero_reg__
+    adc     \reg+2, __zero_reg__
+    adc     \reg+3, __zero_reg__
+.endif
+.endm
+
+#define exp_lo(N)  hlo8 ((N) << 23)
+#define exp_hi(N)  hhi8 ((N) << 23)
+
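
These helpers pick the two bytes of N << 23, i.e. an increment of the
IEEE-754 single-precision exponent field, so a SUBI/SBCI against them
rescales a non-zero float by a power of two without a multiply.  The
idea in C (illustration only):

    #include <stdint.h>
    #include <string.h>

    float scale_pow2 (float f, int n)    /* f * 2^n for normal f */
    {
      uint32_t u;
      memcpy (&u, &f, sizeof (u));
      if (u << 1)                        /* leave 0.0 untouched */
        u += (uint32_t) n << 23;         /* adjust biased exponent */
      memcpy (&f, &u, sizeof (f));
      return f;
    }
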
 
 .section .text.libgcc.mul, "ax", @progbits
 
@@ -126,175 +155,246 @@ ENDF __mulqi3
 	
 #endif 	/* defined (L_mulqi3) */
 
-#if defined (L_mulqihi3)
-DEFUN __mulqihi3
-	clr	r25
-	sbrc	r24, 7
-	dec	r25
-	clr	r23
-	sbrc	r22, 7
-	dec	r22
-	XJMP	__mulhi3
-ENDF __mulqihi3:
-#endif /* defined (L_mulqihi3) */
+
+/*******************************************************
+    Widening Multiplication  16 = 8 x 8  without MUL
+    Multiplication  16 x 16  without MUL
+*******************************************************/
+
+#define A0  r22
+#define A1  r23
+#define B0  r24
+#define BB0 r20
+#define B1  r25
+;; Output overlaps input, thus expand result in CC0/1
+#define C0  r24
+#define C1  r25
+#define CC0  __tmp_reg__
+#define CC1  R21
 
 #if defined (L_umulqihi3)
+;;; R25:R24 = (unsigned int) R22 * (unsigned int) R24
+;;; (C1:C0) = (unsigned int) A0  * (unsigned int) B0
+;;; Clobbers: __tmp_reg__, R21..R23
 DEFUN __umulqihi3
-	clr	r25
-	clr	r23
-	XJMP	__mulhi3
+    clr     A1
+    clr     B1
+    XJMP    __mulhi3
 ENDF __umulqihi3
-#endif /* defined (L_umulqihi3) */
+#endif /* L_umulqihi3 */
 
-/*******************************************************
-    Multiplication  16 x 16  without MUL
-*******************************************************/
-#if defined (L_mulhi3)
-#define	r_arg1L	r24		/* multiplier Low */
-#define	r_arg1H	r25		/* multiplier High */
-#define	r_arg2L	r22		/* multiplicand Low */
-#define	r_arg2H	r23		/* multiplicand High */
-#define r_resL	__tmp_reg__	/* result Low */
-#define r_resH  r21		/* result High */
+#if defined (L_mulqihi3)
+;;; R25:R24 = (signed int) R22 * (signed int) R24
+;;; (C1:C0) = (signed int) A0  * (signed int) B0
+;;; Clobbers: __tmp_reg__, R20..R23
+DEFUN __mulqihi3
+    ;; Sign-extend B0
+    clr     B1
+    sbrc    B0, 7
+    com     B1
+    ;; The multiplication runs twice as fast if A1 is zero, thus:
+    ;; Zero-extend A0
+    clr     A1
+#ifdef __AVR_HAVE_JMP_CALL__
+    ;; Store  B0 * sign of A
+    clr     BB0
+    sbrc    A0, 7
+    mov     BB0, B0
+    call    __mulhi3
+#else /* have no CALL */
+    ;; Skip sign-extension of A if A >= 0
+    ;; Same size as with the first alternative but avoids errata skip
+    ;; and is faster if A >= 0
+    sbrs    A0, 7
+    rjmp    __mulhi3
+    ;; If  A < 0  store B
+    mov     BB0, B0
+    rcall   __mulhi3
+#endif /* HAVE_JMP_CALL */
+    ;; 1-extend A after the multiplication
+    sub     C1, BB0
+    ret
+ENDF __mulqihi3
+#endif /* L_mulqihi3 */
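
The sign-fixup trick above in C (sketch; exact in modulo-2^16
arithmetic):

    #include <stdint.h>

    int16_t mulqihi3_model (int8_t a, int8_t b)
    {
      /* Multiply with A zero-extended (twice as fast here), then
         "1-extend" afterwards:  a*b == (uint8_t)a * b - (b << 8)
         whenever a < 0, taken mod 2^16.  */
      uint16_t p = (uint16_t) ((uint8_t) a * (int16_t) b);
      if (a < 0)
        p -= (uint16_t) b << 8;
      return (int16_t) p;
    }
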
 
+#if defined (L_mulhi3)
+;;; R25:R24 = R23:R22 * R25:R24
+;;; (C1:C0) = (A1:A0) * (B1:B0)
+;;; Clobbers: __tmp_reg__, R21..R23
 DEFUN __mulhi3
-	clr	r_resH		; clear result
-	clr	r_resL		; clear result
-__mulhi3_loop:
-	sbrs	r_arg1L,0
-	rjmp	__mulhi3_skip1
-	add	r_resL,r_arg2L	; result + multiplicand
-	adc	r_resH,r_arg2H
-__mulhi3_skip1:	
-	add	r_arg2L,r_arg2L	; shift multiplicand
-	adc	r_arg2H,r_arg2H
-
-	cp	r_arg2L,__zero_reg__
-	cpc	r_arg2H,__zero_reg__
-	breq	__mulhi3_exit	; while multiplicand != 0
-
-	lsr	r_arg1H		; gets LSB of multiplier
-	ror	r_arg1L
-	sbiw	r_arg1L,0
-	brne	__mulhi3_loop	; exit if multiplier = 0
-__mulhi3_exit:
-	mov	r_arg1H,r_resH	; result to return register
-	mov	r_arg1L,r_resL
-	ret
-ENDF __mulhi3
 
-#undef r_arg1L
-#undef r_arg1H
-#undef r_arg2L
-#undef r_arg2H
-#undef r_resL 	
-#undef r_resH 
+    ;; Clear result
+    clr     CC0
+    clr     CC1
+    rjmp 3f
+1:
+    ;; Bit n of A is 1  -->  C += B << n
+    add     CC0, B0
+    adc     CC1, B1
+2:
+    lsl     B0
+    rol     B1
+3:
+    ;; If B == 0 we are ready
+    sbiw    B0, 0
+    breq 9f
+
+    ;; Carry = n-th bit of A
+    lsr     A1
+    ror     A0
+    ;; If bit n of A is set, then go add  B * 2^n  to  C
+    brcs 1b
+
+    ;; Carry = 0  -->  The ROR above acts like  CP A0, 0
+    ;; Thus, it is sufficient to CPC the high part to test A against 0
+    cpc     A1, __zero_reg__
+    ;; Only proceed if A != 0
+    brne    2b
+9:
+    ;; Move Result into place
+    mov     C0, CC0
+    mov     C1, CC1
+    ret
+ENDF  __mulhi3
+#endif /* L_mulhi3 */
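
The loop above is plain shift-and-add; a C rendering (sketch):

    #include <stdint.h>

    uint16_t mulhi3_model (uint16_t a, uint16_t b)
    {
      uint16_t c = 0;
      while (a && b)          /* the asm exits on either operand == 0 */
        {
          if (a & 1)          /* bit n of A set: add B << n */
            c += b;
          b <<= 1;
          a >>= 1;
        }
      return c;
    }
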
 
-#endif /* defined (L_mulhi3) */
+#undef A0
+#undef A1
+#undef B0
+#undef BB0
+#undef B1
+#undef C0
+#undef C1
+#undef CC0
+#undef CC1
+
+
+#define A0 22
+#define A1 A0+1
+#define A2 A0+2
+#define A3 A0+3
+
+#define B0 18
+#define B1 B0+1
+#define B2 B0+2
+#define B3 B0+3
+
+#define CC0 26
+#define CC1 CC0+1
+#define CC2 30
+#define CC3 CC2+1
+
+#define C0 22
+#define C1 C0+1
+#define C2 C0+2
+#define C3 C0+3
 
 /*******************************************************
     Widening Multiplication  32 = 16 x 16  without MUL
 *******************************************************/
 
-#if defined (L_mulhisi3)
-DEFUN __mulhisi3
-;;; FIXME: This is dead code (noone calls it)
-    mov_l   r18, r24
-    mov_h   r19, r25
-    clr     r24
-    sbrc    r23, 7
-    dec     r24
-    mov     r25, r24
-    clr     r20
-    sbrc    r19, 7
-    dec     r20
-    mov     r21, r20
-    XJMP    __mulsi3
-ENDF __mulhisi3
-#endif /* defined (L_mulhisi3) */
-
 #if defined (L_umulhisi3)
 DEFUN __umulhisi3
-;;; FIXME: This is dead code (noone calls it)
-    mov_l   r18, r24
-    mov_h   r19, r25
-    clr     r24
-    clr     r25
-    mov_l   r20, r24
-    mov_h   r21, r25
+    wmov    B0, 24
+    ;; Zero-extend B
+    clr     B2
+    clr     B3
+    ;; Zero-extend A
+    wmov    A2, B2
     XJMP    __mulsi3
 ENDF __umulhisi3
-#endif /* defined (L_umulhisi3) */
+#endif /* L_umulhisi3 */
+
+#if defined (L_mulhisi3)
+DEFUN __mulhisi3
+    wmov    B0, 24
+    ;; Sign-extend B
+    lsl     r25
+    sbc     B2, B2
+    mov     B3, B2
+#ifdef __AVR_ERRATA_SKIP_JMP_CALL__
+    ;; Sign-extend A
+    clr     A2
+    sbrc    A1, 7
+    com     A2
+    mov     A3, A2
+    XJMP __mulsi3
+#else /*  no __AVR_ERRATA_SKIP_JMP_CALL__ */
+    ;; Zero-extend A and __mulsi3 will run at least twice as fast
+    ;; compared to a sign-extended A.
+    clr     A2
+    clr     A3
+    sbrs    A1, 7
+    XJMP __mulsi3
+    ;; If  A < 0  then account for the  B * 0xffff....  part ahead
+    ;; of the multiplication proper by initializing the high part
+    ;; of the result CC with -B.
+    wmov    CC2, A2
+    sub     CC2, B0
+    sbc     CC3, B1
+    XJMP __mulsi3_helper
+#endif /*  __AVR_ERRATA_SKIP_JMP_CALL__ */
+ENDF __mulhisi3
+#endif /* L_mulhisi3 */
+
 
-#if defined (L_mulsi3)
 /*******************************************************
     Multiplication  32 x 32  without MUL
 *******************************************************/
-#define r_arg1L  r22		/* multiplier Low */
-#define r_arg1H  r23
-#define	r_arg1HL r24
-#define	r_arg1HH r25		/* multiplier High */
-
-#define	r_arg2L  r18		/* multiplicand Low */
-#define	r_arg2H  r19	
-#define	r_arg2HL r20
-#define	r_arg2HH r21		/* multiplicand High */
-	
-#define r_resL	 r26		/* result Low */
-#define r_resH   r27
-#define r_resHL	 r30
-#define r_resHH  r31		/* result High */
 
+#if defined (L_mulsi3)
 DEFUN __mulsi3
-	clr	r_resHH		; clear result
-	clr	r_resHL		; clear result
-	clr	r_resH		; clear result
-	clr	r_resL		; clear result
-__mulsi3_loop:
-	sbrs	r_arg1L,0
-	rjmp	__mulsi3_skip1
-	add	r_resL,r_arg2L		; result + multiplicand
-	adc	r_resH,r_arg2H
-	adc	r_resHL,r_arg2HL
-	adc	r_resHH,r_arg2HH
-__mulsi3_skip1:
-	add	r_arg2L,r_arg2L		; shift multiplicand
-	adc	r_arg2H,r_arg2H
-	adc	r_arg2HL,r_arg2HL
-	adc	r_arg2HH,r_arg2HH
-	
-	lsr	r_arg1HH	; gets LSB of multiplier
-	ror	r_arg1HL
-	ror	r_arg1H
-	ror	r_arg1L
-	brne	__mulsi3_loop
-	sbiw	r_arg1HL,0
-	cpc	r_arg1H,r_arg1L
-	brne	__mulsi3_loop		; exit if multiplier = 0
-__mulsi3_exit:
-	mov_h	r_arg1HH,r_resHH	; result to return register
-	mov_l	r_arg1HL,r_resHL
-	mov_h	r_arg1H,r_resH
-	mov_l	r_arg1L,r_resL
-	ret
-ENDF __mulsi3
+    ;; Clear result
+    clr     CC2
+    clr     CC3
+    ;; FALLTHRU
+ENDF  __mulsi3
 
-#undef r_arg1L 
-#undef r_arg1H 
-#undef r_arg1HL
-#undef r_arg1HH
-             
-#undef r_arg2L 
-#undef r_arg2H 
-#undef r_arg2HL
-#undef r_arg2HH
-             
-#undef r_resL  
-#undef r_resH  
-#undef r_resHL 
-#undef r_resHH 
+DEFUN __mulsi3_helper
+    clr     CC0
+    clr     CC1
+    rjmp 3f
+
+1:  ;; If bit n of A is set, then add  B * 2^n  to the result in CC
+    ;; CC += B
+    add  CC0,B0  $  adc  CC1,B1  $  adc  CC2,B2  $  adc  CC3,B3
+
+2:  ;; B <<= 1
+    lsl  B0      $  rol  B1      $  rol  B2      $  rol  B3
+    
+3:  ;; A >>= 1:  Carry = n-th bit of A
+    lsr  A3      $  ror  A2      $  ror  A1      $  ror  A0
+
+    brcs 1b
+    ;; Only continue if  A != 0
+    sbci    A1, 0
+    brne 2b
+    sbiw    A2, 0
+    brne 2b
+
+    ;; All bits of A are consumed:  Copy result to return register C
+    wmov    C0, CC0
+    wmov    C2, CC2
+    ret
+ENDF __mulsi3_helper
+#endif /* L_mulsi3 */
 
-#endif /* defined (L_mulsi3) */
+#undef A0
+#undef A1
+#undef A2
+#undef A3
+#undef B0
+#undef B1
+#undef B2
+#undef B3
+#undef C0
+#undef C1
+#undef C2
+#undef C3
+#undef CC0
+#undef CC1
+#undef CC2
+#undef CC3
 
 #endif /* !defined (__AVR_HAVE_MUL__) */
 ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
@@ -316,7 +416,7 @@ ENDF __mulsi3
 #define C3 C0+3
 
 /*******************************************************
-    Widening Multiplication  32 = 16 x 16
+    Widening Multiplication  32 = 16 x 16  with MUL
 *******************************************************/
                               
 #if defined (L_mulhisi3)
@@ -364,7 +464,17 @@ DEFUN __umulhisi3
     mul     A1, B1
     movw    C2, r0
     mul     A0, B1
+#ifdef __AVR_HAVE_JMP_CALL__
+    ;; This function is used by many other routines, often multiple times.
+    ;; Therefore, if the flash size is not too limited, avoid the RCALL
+    ;; and invest 6 bytes to speed things up.
+    add     C1, r0
+    adc     C2, r1
+    clr     __zero_reg__
+    adc     C3, __zero_reg__
+#else
     rcall   1f
+#endif
     mul     A1, B0
 1:  add     C1, r0
     adc     C2, r1
@@ -375,7 +485,7 @@ ENDF __umulhisi3
 #endif /* L_umulhisi3 */
 
 /*******************************************************
-    Widening Multiplication  32 = 16 x 32
+    Widening Multiplication  32 = 16 x 32  with MUL
 *******************************************************/
 
 #if defined (L_mulshisi3)
@@ -425,7 +535,7 @@ ENDF __muluhisi3
 #endif /* L_muluhisi3 */
 
 /*******************************************************
-    Multiplication  32 x 32
+    Multiplication  32 x 32  with MUL
 *******************************************************/
 
 #if defined (L_mulsi3)
@@ -468,7 +578,7 @@ ENDF __mulsi3
 #endif /* __AVR_HAVE_MUL__ */
 
 /*******************************************************
-       Multiplication 24 x 24
+       Multiplication 24 x 24 with MUL
 *******************************************************/
 
 #if defined (L_mulpsi3)
@@ -1247,6 +1357,19 @@ __divmodsi4_exit:
 ENDF __divmodsi4
 #endif /* defined (L_divmodsi4) */
 
+#undef r_remHH
+#undef r_remHL
+#undef r_remH
+#undef r_remL
+#undef r_arg1HH
+#undef r_arg1HL
+#undef r_arg1H
+#undef r_arg1L
+#undef r_arg2HH
+#undef r_arg2HL
+#undef r_arg2H
+#undef r_arg2L
+#undef r_cnt
 
 /*******************************************************
        Division 64 / 64
@@ -2757,9 +2880,7 @@ DEFUN __fmulsu_exit
     XJMP  __fmul
 1:  XCALL __fmul
     ;; C = -C iff A0.7 = 1
-    com  C1
-    neg  C0
-    sbci C1, -1
+    NEG2 C0
     ret
 ENDF __fmulsu_exit
 #endif /* L_fmulsu */
@@ -2794,3 +2915,5 @@ ENDF __fmul
 #undef B1
 #undef C0
 #undef C1
+
+#include "lib1funcs-fixed.S"
Index: libgcc/config/avr/t-avr
===================================================================
--- libgcc/config/avr/t-avr	(revision 190620)
+++ libgcc/config/avr/t-avr	(working copy)
@@ -2,6 +2,7 @@ LIB1ASMSRC = avr/lib1funcs.S
 LIB1ASMFUNCS = \
 	_mulqi3 \
 	_mulhi3 \
+	_mulqihi3 _umulqihi3 \
 	_mulpsi3 _mulsqipsi3 \
 	_mulhisi3 \
 	_umulhisi3 \
@@ -55,6 +56,24 @@ LIB1ASMFUNCS = \
 	_cmpdi2 _cmpdi2_s8 \
 	_fmul _fmuls _fmulsu
 
+# Fixed point routines in avr/lib1funcs-fixed.S
+LIB1ASMFUNCS += \
+	_fractqqsf _fractuqqsf \
+	_fracthqsf _fractuhqsf _fracthasf _fractuhasf \
+	_fractsasf _fractusasf _fractsqsf _fractusqsf \
+	\
+	_fractsfqq _fractsfuqq \
+	_fractsfhq _fractsfuhq _fractsfha _fractsfuha \
+	_fractsfsa _fractsfusa \
+	_mulqq3 \
+	_mulhq3 _muluhq3 \
+	_mulha3 _muluha3 _muluha3_round \
+	_mulsa3 _mulusa3 \
+	_divqq3 _udivuqq3 \
+	_divhq3 _udivuhq3 \
+	_divha3 _udivuha3 \
+	_divsa3 _udivusa3
+
 LIB2FUNCS_EXCLUDE = \
 	_moddi3 _umoddi3 \
 	_clz
@@ -81,3 +100,52 @@ libgcc-objects += $(patsubst %,%$(objext
 ifeq ($(enable_shared),yes)
 libgcc-s-objects += $(patsubst %,%_s$(objext),$(hiintfuncs16))
 endif
+
+
+# Filter out supported conversions from fixed-bit.c
+
+conv_XY=$(conv)$(mode1)$(mode2)
+conv_X=$(conv)$(mode)
+
+# Conversions supported by the compiler
+
+convf_modes =	 QI UQI QQ UQQ \
+		 HI UHI HQ UHQ HA UHA \
+		 SI USI SQ USQ SA USA \
+		 DI UDI DQ UDQ DA UDA \
+		 TI UTI TQ UTQ TA UTA
+
+LIB2FUNCS_EXCLUDE += \
+	$(foreach conv,_fract _fractuns,\
+	$(foreach mode1,$(convf_modes),\
+	$(foreach mode2,$(convf_modes),$(conv_XY))))
+
+# Conversions supported by lib1funcs-fixed.S
+
+conv_to_sf_modes   = QQ UQQ HQ UHQ HA UHA SQ USQ SA USA
+conv_from_sf_modes = QQ UQQ HQ UHQ HA UHA        SA USA
+
+LIB2FUNCS_EXCLUDE += \
+	$(foreach conv,_fract, \
+	$(foreach mode1,$(conv_to_sf_modes), \
+	$(foreach mode2,SF,$(conv_XY))))
+
+LIB2FUNCS_EXCLUDE += \
+	$(foreach conv,_fract,\
+	$(foreach mode1,SF,\
+	$(foreach mode2,$(conv_from_sf_modes),$(conv_XY))))
+
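+
As a concrete instance of the foreach pattern above: with conv=_fract,
mode1=SF and mode2=SA, $(conv_XY) expands to _fractSFSA, which removes
the C fallback from fixed-bit.c in favor of the assembly __fractsfsa
added to LIB1ASMFUNCS further up.
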
+# Arithmetic supported by the compiler
+
+allfix_modes = QQ UQQ HQ UHQ HA UHA SQ USQ SA USA DA UDA DQ UDQ TQ UTQ TA UTA
+
+LIB2FUNCS_EXCLUDE += \
+	$(foreach conv,_add _sub,\
+	$(foreach mode,$(allfix_modes),$(conv_X)3))
+
+LIB2FUNCS_EXCLUDE += \
+	$(foreach conv,_lshr _ashl _ashr _cmp,\
+	$(foreach mode,$(allfix_modes),$(conv_X)))
+
+#(error $(LIB2FUNCS_EXCLUDE))
+
