This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


PATCH RFC: lower-subreg patch


Here is the basic lower-subreg patch which I propose to check in.
Before I do, I would appreciate any feedback anybody cares to give.

I would also be interested in the results on platforms other than
i686-pc-linux-gnu.  I would like to see it tested on at least one
other primary platform before I check it in.

The patch I sent before included several conceptually unrelated
patches.  This is the basic lower-subreg patch, and is mostly the work
of Richard Henderson.  It does the basic splitting of multi-word
pseudo-registers into word-sized pseudo-registers.  Taking full
advantage of this requires the register allocator changes which were
included in my earlier patch.  I plan to check those in separately.
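
As a rough source-level illustration (not part of the patch), the
sketch below models a 64-bit value on a 32-bit target as two
independent word-sized pieces and adds two such values one word at a
time.  The struct and function names are hypothetical and exist only
for this example; the pass itself works on RTL pseudo-registers, not
on C structs.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a double-word value on a 32-bit target:
   two independent word-sized pieces, low word first.  */
struct split_u64
{
  uint32_t lo;
  uint32_t hi;
};

/* Break a 64-bit value into its two word-sized halves.  */
static struct split_u64
split_u64_from (uint64_t v)
{
  struct split_u64 s;
  s.lo = (uint32_t) v;
  s.hi = (uint32_t) (v >> 32);
  return s;
}

/* Reassemble the two halves into a 64-bit value.  */
static uint64_t
split_u64_join (struct split_u64 s)
{
  return ((uint64_t) s.hi << 32) | s.lo;
}

/* Add two split values one word at a time, carrying from the low
   word into the high word.  */
static struct split_u64
split_u64_add (struct split_u64 a, struct split_u64 b)
{
  struct split_u64 r;
  r.lo = a.lo + b.lo;
  r.hi = a.hi + b.hi + (r.lo < a.lo);
  return r;
}

/* Example use: 0x00000000ffffffff + 1 carries into the high word.  */
static void
split_u64_example (void)
{
  struct split_u64 r = split_u64_add (split_u64_from (0xffffffffu),
				      split_u64_from (1));
  assert (r.lo == 0 && r.hi == 1);
}
```

Once the halves are separate values, each one can be kept in whatever
word-sized register is convenient, which is the property the register
allocator changes are meant to exploit.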

In this patch, I get around slowness issues by first scanning the
pseudo-registers: if there are no multi-word pseudo-registers, then
the pass exits immediately without scanning any insns.  In my tests on
some real-world code which uses "long long" fairly often, this reduces
the time required for lower-subreg, as shown by -ftime-report, to less
than 1% of compile time.
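
The early-exit check can be sketched as a toy model.  This is a
simplification under stated assumptions, not the code in the patch:
pseudo-registers are represented only by their size in bytes (0 for an
unused slot), WORD_BYTES is a hypothetical stand-in for UNITS_PER_WORD,
and the function names are invented for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for UNITS_PER_WORD on a 32-bit target.  */
#define WORD_BYTES 4u

/* SIZES[i] is the size in bytes of pseudo-register i, or 0 if the
   slot is unused.  Return true if at least one pseudo-register is
   wider than a word.  */
static bool
any_multiword_pseudo (const unsigned int *sizes, size_t n)
{
  size_t i;

  assert (sizes != NULL || n == 0);
  for (i = 0; i < n; ++i)
    if (sizes[i] > WORD_BYTES)
      return true;
  return false;
}

/* Model of the pass entry point.  Return the number of insns the
   pass would scan: zero when the cheap register scan finds nothing
   to split, otherwise all of them.  */
static size_t
lower_subreg_model (const unsigned int *sizes, size_t n_regs,
		    size_t n_insns)
{
  if (!any_multiword_pseudo (sizes, n_regs))
    return 0;		/* Early exit: no insn is looked at.  */
  return n_insns;	/* Otherwise every insn is scanned.  */
}
```

The register scan is linear in the number of pseudo-registers, so a
function with no multi-word values pays only for that loop.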

After the dataflow branch has been merged, my plan is to change this
code to use the incrementally collected dataflow information to only
look at the insns which define or use a multi-word pseudo-register.
However, since I don't know when the dataflow branch merge will
happen, I don't want to gate this patch on it.
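
The idea of visiting only the insns which reference a multi-word
pseudo-register can be modelled as below.  This is a toy sketch and
makes no claim about the dataflow branch's actual interface: the
reg_refs structure, MAX_REFS, and collect_multiword_insns are all
hypothetical names invented for the example.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_REFS 8

/* Hypothetical per-register reference list: the ids of the insns
   which define or use the register.  */
struct reg_refs
{
  unsigned int size_bytes;
  size_t n_refs;
  int insn_ids[MAX_REFS];
};

/* Store in OUT, which has room for CAP entries, the ids of the insns
   which reference a register wider than WORD_BYTES.  Return the
   number of ids stored.  Duplicates are not removed.  */
static size_t
collect_multiword_insns (const struct reg_refs *regs, size_t n_regs,
			 unsigned int word_bytes, int *out, size_t cap)
{
  size_t r, k, n = 0;

  for (r = 0; r < n_regs; ++r)
    {
      assert (regs[r].n_refs <= MAX_REFS);
      if (regs[r].size_bytes <= word_bytes)
	continue;
      for (k = 0; k < regs[r].n_refs && n < cap; ++k)
	out[n++] = regs[r].insn_ids[k];
    }
  return n;
}
```

With per-register reference lists available, the work is proportional
to the number of references to wide registers rather than to the total
number of insns.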

This patch has been tested with bootstrap and testsuite run on
i686-pc-linux-gnu.

Ian


2006-12-29  Richard Henderson  <rth@redhat.com>
	    Ian Lance Taylor  <iant@google.com>

	* lower-subreg.c: New file.
	* rtl.def (CONCATN): Define.
	* passes.c (init_optimization_passes): Add pass_lower_subreg and
	pass_lower_subreg2.
	* emit-rtl.c (update_reg_offset): New static function, broken out
	of gen_rtx_REG_offset.
	(gen_rtx_REG_offset): Call update_reg_offset.
	(gen_reg_rtx_offset): New function.
	* regclass.c: Revert patch of 2006-03-05, restoring
	reg_scan_update.
	(clear_reg_info_regno): New function.
	* dwarf2out.c (concatn_loc_descriptor): New static function.
	(loc_descriptor): Handle CONCATN.
	* common.opt (fsplit-wide-types): New option.
	* opts.c (decode_options): Set flag_split_wide_types when
	optimizing.
	* timevar.def (TV_LOWER_SUBREG): Define.
	* rtl.h (gen_reg_rtx_offset): Declare.
	(reg_scan_update): Declare.
	* regs.h (clear_reg_info_regno): Declare.
	* tree-pass.h (pass_lower_subreg): Declare.
	(pass_lower_subreg2): Declare.
	* doc/invoke.texi (Option Summary): List -fno-split-wide-types.
	(Optimize Options): Add -fsplit-wide-types to -O1 list.  Document
	-fsplit-wide-types.
	* Makefile.in (OBJS-common): Add lower-subreg.o.
	(lower-subreg.o): New target.


Index: doc/invoke.texi
===================================================================
--- doc/invoke.texi	(revision 120281)
+++ doc/invoke.texi	(working copy)
@@ -338,7 +338,7 @@ Objective-C and Objective-C++ Dialects}.
 -fsched2-use-superblocks @gol
 -fsched2-use-traces -fsee -freschedule-modulo-scheduled-loops @gol
 -fsection-anchors  -fsignaling-nans  -fsingle-precision-constant @gol
--fstack-protector  -fstack-protector-all @gol
+-fno-split-wide-types -fstack-protector  -fstack-protector-all @gol
 -fstrict-aliasing  -ftracer  -fthread-jumps @gol
 -funroll-all-loops  -funroll-loops  -fpeel-loops @gol
 -fsplit-ivs-in-unroller -funswitch-loops @gol
@@ -4550,6 +4550,7 @@ compilation time.
 -fcprop-registers @gol
 -fif-conversion @gol
 -fif-conversion2 @gol
+-fsplit-wide-types @gol
 -ftree-ccp @gol
 -ftree-dce @gol
 -ftree-dominator-opts @gol
@@ -4887,6 +4888,16 @@ the condition is known to be true or fal
 
 Enabled at levels @option{-O2}, @option{-O3}, @option{-Os}.
 
+@item -fsplit-wide-types
+@opindex fsplit-wide-types
+When using a type that occupies multiple registers, such as @code{long
+long} on a 32-bit system, split the registers apart and allocate them
+independently.  This normally generates better code for those types,
+but may make debugging more difficult.
+
+Enabled at levels @option{-O}, @option{-O2}, @option{-O3},
+@option{-Os}.
+
 @item -fcse-follow-jumps
 @opindex fcse-follow-jumps
 In common subexpression elimination, scan through jump instructions
Index: tree-pass.h
===================================================================
--- tree-pass.h	(revision 120281)
+++ tree-pass.h	(working copy)
@@ -1,5 +1,5 @@
 /* Definitions for describing one tree-ssa optimization pass.
-   Copyright (C) 2004, 2005 Free Software Foundation, Inc.
+   Copyright (C) 2004, 2005, 2006 Free Software Foundation, Inc.
    Contributed by Richard Henderson <rth@redhat.com>
 
 This file is part of GCC.
@@ -332,6 +332,7 @@ extern struct tree_opt_pass pass_instant
 extern struct tree_opt_pass pass_rtl_fwprop;
 extern struct tree_opt_pass pass_rtl_fwprop_addr;
 extern struct tree_opt_pass pass_jump2;
+extern struct tree_opt_pass pass_lower_subreg;
 extern struct tree_opt_pass pass_cse;
 extern struct tree_opt_pass pass_gcse;
 extern struct tree_opt_pass pass_jump_bypass;
@@ -355,6 +356,7 @@ extern struct tree_opt_pass pass_if_afte
 extern struct tree_opt_pass pass_partition_blocks;
 extern struct tree_opt_pass pass_regmove;
 extern struct tree_opt_pass pass_split_all_insns;
+extern struct tree_opt_pass pass_lower_subreg2;
 extern struct tree_opt_pass pass_mode_switching;
 extern struct tree_opt_pass pass_see;
 extern struct tree_opt_pass pass_recompute_reg_usage;
Index: regs.h
===================================================================
--- regs.h	(revision 120281)
+++ regs.h	(working copy)
@@ -237,6 +237,9 @@ extern int caller_save_needed;
 /* Allocate reg_n_info tables */
 extern void allocate_reg_info (size_t, int, int);
 
+/* Clear the register information for regno.  */
+extern void clear_reg_info_regno (unsigned int);
+
 /* Specify number of hard registers given machine mode occupy.  */
 extern unsigned char hard_regno_nregs[FIRST_PSEUDO_REGISTER][MAX_MACHINE_MODE];
 
Index: rtl.def
===================================================================
--- rtl.def	(revision 120281)
+++ rtl.def	(working copy)
@@ -388,6 +388,12 @@ DEF_RTL_EXPR(STRICT_LOW_PART, "strict_lo
    in DECL_RTLs and during RTL generation, but not in the insn chain.  */
 DEF_RTL_EXPR(CONCAT, "concat", "ee", RTX_OBJ)
 
+/* (CONCATN [a1 a2 ... an]) represents the virtual concatenation of
+   all An to make a value.  This is an extension of CONCAT to a larger
+   number of components.  Like CONCAT, it should not appear in the
+   insn chain.  Every element of the CONCATN is the same size.  */
+DEF_RTL_EXPR(CONCATN, "concatn", "E", RTX_OBJ)
+
 /* A memory location; operand is the address.  The second operand is the
    alias set to which this MEM belongs.  We use `0' instead of `w' for this
    field so that the field need not be specified in machine descriptions.  */
Index: dwarf2out.c
===================================================================
--- dwarf2out.c	(revision 120281)
+++ dwarf2out.c	(working copy)
@@ -9043,6 +9043,32 @@ concat_loc_descriptor (rtx x0, rtx x1)
   return cc_loc_result;
 }
 
+/* Return a descriptor that describes the concatenation of N
+   locations.  */
+
+static dw_loc_descr_ref
+concatn_loc_descriptor (rtx concatn)
+{
+  unsigned int i;
+  dw_loc_descr_ref cc_loc_result = NULL;
+  unsigned int n = XVECLEN (concatn, 0);
+
+  for (i = 0; i < n; ++i)
+    {
+      dw_loc_descr_ref ref;
+      rtx x = XVECEXP (concatn, 0, i);
+
+      ref = loc_descriptor (x);
+      if (ref == NULL)
+	return NULL;
+
+      add_loc_descr (&cc_loc_result, ref);
+      add_loc_descr_op_piece (&cc_loc_result, GET_MODE_SIZE (GET_MODE (x)));
+    }
+
+  return cc_loc_result;
+}
+
 /* Output a proper Dwarf location descriptor for a variable or parameter
    which is either allocated in a register or in a memory location.  For a
    register, we just generate an OP_REG and the register number.  For a
@@ -9080,6 +9106,10 @@ loc_descriptor (rtx rtl)
       loc_result = concat_loc_descriptor (XEXP (rtl, 0), XEXP (rtl, 1));
       break;
 
+    case CONCATN:
+      loc_result = concatn_loc_descriptor (rtl);
+      break;
+
     case VAR_LOCATION:
       /* Single part.  */
       if (GET_CODE (XEXP (rtl, 1)) != PARALLEL)
Index: opts.c
===================================================================
--- opts.c	(revision 120281)
+++ opts.c	(working copy)
@@ -444,6 +444,7 @@ decode_options (unsigned int argc, const
       flag_if_conversion2 = 1;
       flag_ipa_pure_const = 1;
       flag_ipa_reference = 1;
+      flag_split_wide_types = 1;
       flag_tree_ccp = 1;
       flag_tree_dce = 1;
       flag_tree_dom = 1;
Index: timevar.def
===================================================================
--- timevar.def	(revision 120281)
+++ timevar.def	(working copy)
@@ -1,6 +1,6 @@
 /* This file contains the definitions for timing variables used to
    measure run-time performance of the compiler.
-   Copyright (C) 2000, 2001, 2002, 2003, 2004, 2005
+   Copyright (C) 2000, 2001, 2002, 2003, 2004, 2005, 2006
    Free Software Foundation, Inc.
    Contributed by Alex Samuel <samuel@codesourcery.com>
 
@@ -128,6 +128,7 @@ DEFTIMEVAR (TV_OVERLOAD              , "
 DEFTIMEVAR (TV_TEMPLATE_INSTANTIATION, "template instantiation")
 DEFTIMEVAR (TV_EXPAND		     , "expand")
 DEFTIMEVAR (TV_VARCONST              , "varconst")
+DEFTIMEVAR (TV_LOWER_SUBREG	     , "lower subreg")
 DEFTIMEVAR (TV_JUMP                  , "jump")
 DEFTIMEVAR (TV_FWPROP                , "forward prop")
 DEFTIMEVAR (TV_CSE                   , "CSE")
Index: lower-subreg.c
===================================================================
--- lower-subreg.c	(revision 0)
+++ lower-subreg.c	(revision 0)
@@ -0,0 +1,1021 @@
+/* Decompose multiword subregs.
+   Copyright (C) 2006 Free Software Foundation, Inc.
+   Contributed by Richard Henderson <rth@redhat.com>
+		  Ian Lance Taylor <iant@google.com>
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 2, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING.  If not, write to the Free
+Software Foundation, 51 Franklin Street, Fifth Floor, Boston, MA
+02110-1301, USA.  */
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "machmode.h"
+#include "tm.h"
+#include "rtl.h"
+#include "tm_p.h"
+#include "timevar.h"
+#include "flags.h"
+#include "insn-config.h"
+#include "obstack.h"
+#include "basic-block.h"
+#include "recog.h"
+#include "bitmap.h"
+#include "expr.h"
+#include "regs.h"
+#include "tree-pass.h"
+
+#ifdef STACK_GROWS_DOWNWARD
+# undef STACK_GROWS_DOWNWARD
+# define STACK_GROWS_DOWNWARD 1
+#else
+# define STACK_GROWS_DOWNWARD 0
+#endif
+
+DEF_VEC_P (bitmap);
+DEF_VEC_ALLOC_P (bitmap,heap);
+
+/* Decompose multi-word pseudo-registers into individual
+   pseudo-registers when possible.  This is possible when all the uses
+   of a multi-word register are via SUBREG, or are copies of the
+   register to another location.  Breaking apart the register permits
+   more CSE and permits better register allocation.  */
+
+/* Bit N in this bitmap is set if regno N is used in a context in
+   which we can decompose it.  */
+static bitmap decomposable_context;
+
+/* Bit N in this bitmap is set if regno N is used in a context in
+   which it can not be decomposed.  */
+static bitmap non_decomposable_context;
+
+/* Bit N in the bitmap in element M of this array is set if there is a
+   copy from reg M to reg N.  */
+static VEC(bitmap,heap) *reg_copy_graph;
+
+/* If INSN is a single set between two objects, return the single set.
+   Such an insn can always be decomposed.  */
+
+static rtx
+simple_move (rtx insn)
+{
+  rtx x;
+  rtx set;
+
+  set = single_set (insn);
+  if (!set)
+    return NULL_RTX;
+
+  x = SET_DEST (set);
+  if (!OBJECT_P (x) && GET_CODE (x) != SUBREG)
+    return NULL_RTX;
+  if (MEM_P (x) && MEM_VOLATILE_P (x))
+    return NULL_RTX;
+
+  x = SET_SRC (set);
+  if (!OBJECT_P (x)
+      && GET_CODE (x) != SUBREG
+      && GET_CODE (x) != ASM_OPERANDS)
+    return NULL_RTX;
+  if (MEM_P (x) && MEM_VOLATILE_P (x))
+    return NULL_RTX;
+
+  return set;
+}
+
+/* If SET is a copy from one multi-word pseudo-register to another,
+   record that in reg_copy_graph.  Return whether it is such a
+   copy.  */
+
+static bool
+find_pseudo_copy (rtx set)
+{
+  rtx dest = SET_DEST (set);
+  rtx src = SET_SRC (set);
+  unsigned int rd, rs;
+  bitmap b;
+
+  if (!REG_P (dest) || !REG_P (src))
+    return false;
+
+  rd = REGNO (dest);
+  rs = REGNO (src);
+  if (HARD_REGISTER_NUM_P (rd) || HARD_REGISTER_NUM_P (rs))
+    return false;
+
+  if (GET_MODE_SIZE (GET_MODE (dest)) <= UNITS_PER_WORD)
+    return false;
+
+  b = VEC_index (bitmap, reg_copy_graph, rs);
+  if (b == NULL)
+    {
+      b = BITMAP_ALLOC (NULL);
+      VEC_replace (bitmap, reg_copy_graph, rs, b);
+    }
+
+  bitmap_set_bit (b, rd);
+
+  return true;
+}
+
+/* Look through the registers in DECOMPOSABLE_CONTEXT.  For each case
+   where they are copied to another register, add the register to
+   which they are copied to DECOMPOSABLE_CONTEXT.  Use
+   NON_DECOMPOSABLE_CONTEXT to limit this--we don't bother to track
+   copies of registers which are in NON_DECOMPOSABLE_CONTEXT.  */
+
+static void
+propagate_pseudo_copies (void)
+{
+  bitmap queue, propagate;
+
+  queue = BITMAP_ALLOC (NULL);
+  propagate = BITMAP_ALLOC (NULL);
+
+  bitmap_copy (queue, decomposable_context);
+  do
+    {
+      bitmap_iterator iter;
+      unsigned int i;
+
+      bitmap_clear (propagate);
+
+      EXECUTE_IF_SET_IN_BITMAP (queue, 0, i, iter)
+	{
+	  bitmap b = VEC_index (bitmap, reg_copy_graph, i);
+	  if (b)
+	    bitmap_ior_and_compl_into (propagate, b, non_decomposable_context);
+	}
+
+      bitmap_and_compl (queue, propagate, decomposable_context);
+      bitmap_ior_into (decomposable_context, propagate);
+    }
+  while (!bitmap_empty_p (queue));
+
+  BITMAP_FREE (queue);
+  BITMAP_FREE (propagate);
+}
+
+/* A pointer to one of these values is passed to
+   find_decomposable_subregs via for_each_rtx.  */
+
+enum classify_move_insn
+{
+  /* Not a simple move from one location to another.  */
+  NOT_SIMPLE_MOVE,
+  /* A simple move from one pseudo-register to another with no
+     REG_RETVAL note.  */
+  SIMPLE_PSEUDO_REG_MOVE,
+  /* A simple move involving a non-pseudo-register, or from one
+     pseudo-register to another with a REG_RETVAL note.  */
+  SIMPLE_MOVE
+};
+
+/* This is called via for_each_rtx.  If we find a SUBREG which we
+   could use to decompose a pseudo-register, set a bit in
+   DECOMPOSABLE_CONTEXT.  If we find an unadorned register which is
+   not a simple pseudo-register copy, DATA will point at the type of
+   move, and we set a bit in DECOMPOSABLE_CONTEXT or
+   NON_DECOMPOSABLE_CONTEXT as appropriate.  */
+
+static int
+find_decomposable_subregs (rtx *px, void *data)
+{
+  enum classify_move_insn *pcmi = (enum classify_move_insn *) data;
+  rtx x = *px;
+
+  if (GET_CODE (x) == SUBREG)
+    {
+      rtx inner = SUBREG_REG (x);
+      unsigned int regno, outer_size, inner_size, outer_words, inner_words;
+
+      if (!REG_P (inner))
+	return 0;
+
+      regno = REGNO (inner);
+      if (HARD_REGISTER_NUM_P (regno))
+	return -1;
+
+      outer_size = GET_MODE_SIZE (GET_MODE (x));
+      inner_size = GET_MODE_SIZE (GET_MODE (inner));
+      outer_words = (outer_size + UNITS_PER_WORD - 1) / UNITS_PER_WORD;
+      inner_words = (inner_size + UNITS_PER_WORD - 1) / UNITS_PER_WORD;
+
+      /* We only try to decompose single word subregs of multi-word
+	 registers.  When we find one, we return -1 to avoid iterating
+	 over the inner register.
+
+	 ??? This doesn't allow, e.g., DImode subregs of TImode values
+	 on 32-bit targets.  We would need to record the way the
+	 pseudo-register was used, and only decompose if all the uses
+	 were the same number and size of pieces.  Hopefully this
+	 doesn't happen much.  */
+
+      if (outer_words == 1 && inner_words > 1)
+	{
+	  bitmap_set_bit (decomposable_context, regno);
+	  return -1;
+	}
+    }
+  else if (GET_CODE (x) == REG)
+    {
+      unsigned int regno;
+
+      /* We will see an outer SUBREG before we see the inner REG, so
+	 when we see a plain REG here it means a direct reference to
+	 the register.
+
+	 If this is not a simple copy from one location to another,
+	 then we can not decompose this register.  If this is a simple
+	 copy from one pseudo-register to another, with no REG_RETVAL
+	 note, and the mode is right, then we mark the register as
+	 decomposable.  Otherwise we don't say anything about this
+	 register--it could be decomposed, but whether that would be
+	 profitable depends upon how it is used elsewhere.
+
+	 We only set bits in the bitmap for multi-word
+	 pseudo-registers, since those are the only ones we care about
+	 and it keeps the size of the bitmaps down.  */
+
+      regno = REGNO (x);
+      if (!HARD_REGISTER_NUM_P (regno)
+	  && GET_MODE_SIZE (GET_MODE (x)) > UNITS_PER_WORD)
+	{
+	  switch (*pcmi)
+	    {
+	    case NOT_SIMPLE_MOVE:
+	      bitmap_set_bit (non_decomposable_context, regno);
+	      break;
+	    case SIMPLE_PSEUDO_REG_MOVE:
+	      if (MODES_TIEABLE_P (GET_MODE (x), word_mode))
+		bitmap_set_bit (decomposable_context, regno);
+	      break;
+	    case SIMPLE_MOVE:
+	      break;
+	    default:
+	      gcc_unreachable ();
+	    }
+	}
+    }
+
+  return 0;
+}
+
+/* Decompose REGNO into word-sized components.  We smash the REG node
+   in place.  This ensures that (1) something goes wrong quickly if we
+   fail to make some replacement, and (2) the debug information inside
+   the symbol table is automatically kept up to date.  */
+
+static void
+decompose_register (unsigned int regno)
+{
+  rtx reg;
+  unsigned int words, i;
+  rtvec v;
+
+  reg = regno_reg_rtx[regno];
+
+  regno_reg_rtx[regno] = NULL_RTX;
+  clear_reg_info_regno (regno);
+
+  words = GET_MODE_SIZE (GET_MODE (reg));
+  words = (words + UNITS_PER_WORD - 1) / UNITS_PER_WORD;
+
+  v = rtvec_alloc (words);
+  for (i = 0; i < words; ++i)
+    RTVEC_ELT (v, i) = gen_reg_rtx_offset (reg, word_mode, i * UNITS_PER_WORD);
+
+  PUT_CODE (reg, CONCATN);
+  XVEC (reg, 0) = v;
+
+  if (dump_file)
+    {
+      fprintf (dump_file, "; Splitting reg %u ->", regno);
+      for (i = 0; i < words; ++i)
+	fprintf (dump_file, " %u", REGNO (XVECEXP (reg, 0, i)));
+      fputc ('\n', dump_file);
+    }
+}
+
+/* Get a SUBREG of a CONCATN.  */
+
+static rtx
+simplify_subreg_concatn (enum machine_mode outermode, rtx op,
+			 unsigned int byte)
+{
+  unsigned int inner_size;
+  enum machine_mode innermode;
+  rtx part;
+  unsigned int final_offset;
+
+  gcc_assert (GET_CODE (op) == CONCATN);
+  gcc_assert (byte % GET_MODE_SIZE (outermode) == 0);
+
+  innermode = GET_MODE (op);
+  gcc_assert (byte < GET_MODE_SIZE (innermode));
+
+  inner_size = GET_MODE_SIZE (innermode) / XVECLEN (op, 0);
+  part = XVECEXP (op, 0, byte / inner_size);
+  final_offset = byte % inner_size;
+  if (final_offset + GET_MODE_SIZE (outermode) > inner_size)
+    return NULL_RTX;
+
+  return simplify_gen_subreg (outermode, part, GET_MODE (part), final_offset);
+}
+
+/* Wrapper around simplify_gen_subreg which handles CONCATN.  */
+
+static rtx
+simplify_gen_subreg_concatn (enum machine_mode outermode, rtx op,
+			     enum machine_mode innermode, unsigned int byte)
+{
+  /* We have to handle generating a SUBREG of a SUBREG of a CONCATN.
+     If OP is a SUBREG of a CONCATN, then it must be a simple mode
+     change with the same size and offset 0, or it must extract a
+     part.  We shouldn't see anything else here.  */
+  if (GET_CODE (op) == SUBREG && GET_CODE (SUBREG_REG (op)) == CONCATN)
+    {
+      if ((GET_MODE_SIZE (GET_MODE (op))
+	   == GET_MODE_SIZE (GET_MODE (SUBREG_REG (op))))
+	  && SUBREG_BYTE (op) == 0)
+	return simplify_gen_subreg_concatn (outermode, SUBREG_REG (op),
+					    GET_MODE (SUBREG_REG (op)), byte);
+
+      op = simplify_subreg_concatn (GET_MODE (op), SUBREG_REG (op),
+				    SUBREG_BYTE (op));
+      gcc_assert (op != NULL_RTX);
+      gcc_assert (innermode == GET_MODE (op));
+    }
+  if (GET_CODE (op) == CONCATN)
+    return simplify_subreg_concatn (outermode, op, byte);
+  return simplify_gen_subreg (outermode, op, innermode, byte);
+}
+
+/* Return whether we should resolve X into the registers into which it
+   was decomposed.  */
+
+static bool
+resolve_reg_p (rtx x)
+{
+  return GET_CODE (x) == CONCATN;
+}
+
+/* Return whether X is a SUBREG of a register which we need to
+   resolve.  */
+
+static bool
+resolve_subreg_p (rtx x)
+{
+  if (GET_CODE (x) != SUBREG)
+    return false;
+  return resolve_reg_p (SUBREG_REG (x));
+}
+
+/* This is called via for_each_rtx.  Look for SUBREGs which need to be
+   decomposed.  */
+
+static int
+resolve_subreg_use (rtx *px, void *data)
+{
+  rtx insn = (rtx) data;
+  rtx x = *px;
+
+  if (x == NULL_RTX)
+    return 0;
+
+  if (resolve_subreg_p (x))
+    {
+      x = simplify_subreg_concatn (GET_MODE (x), SUBREG_REG (x),
+				   SUBREG_BYTE (x));
+      gcc_assert (x);
+
+      validate_change (insn, px, x, 1);
+      return -1;
+    }
+
+  if (resolve_reg_p (x))
+    {
+      /* Return 1 to the caller to indicate that we found a direct
+	 reference to a register which is being decomposed.  This can
+	 happen inside notes.  */
+      gcc_assert (!insn);
+      return 1;
+    }
+
+  return 0;
+}
+
+/* If there is a REG_LIBCALL note on OLD_START, move it to NEW_START,
+   and link the corresponding REG_RETVAL note to NEW_START.  */
+
+static void
+move_libcall_note (rtx old_start, rtx new_start)
+{
+  rtx note0, note1, end;
+
+  note0 = find_reg_note (old_start, REG_LIBCALL, NULL);
+  if (note0 == NULL_RTX)
+    return;
+
+  remove_note (old_start, note0);
+  end = XEXP (note0, 0);
+  note1 = find_reg_note (end, REG_RETVAL, NULL);
+
+  XEXP (note0, 1) = REG_NOTES (new_start);
+  REG_NOTES (new_start) = note0;
+  XEXP (note1, 0) = new_start;
+}
+
+/* Remove any REG_RETVAL note, the corresponding REG_LIBCALL note, and
+   any markers for a no-conflict block.  We have decomposed the
+   registers so the non-conflict is now obvious.  */
+
+static void
+remove_retval_note (rtx insn1)
+{
+  rtx note0, insn0, note1, insn;
+
+  note1 = find_reg_note (insn1, REG_RETVAL, NULL);
+  if (note1 == NULL_RTX)
+    return;
+
+  insn0 = XEXP (note1, 0);
+  note0 = find_reg_note (insn0, REG_LIBCALL, NULL);
+
+  remove_note (insn0, note0);
+  remove_note (insn1, note1);
+
+  for (insn = insn0; insn != insn1; insn = NEXT_INSN (insn))
+    {
+      while (1)
+	{
+	  rtx note;
+
+	  note = find_reg_note (insn, REG_NO_CONFLICT, NULL);
+	  if (note == NULL_RTX)
+	    break;
+	  remove_note (insn, note);
+	}
+    }
+}
+
+/* Resolve any decomposed registers which appear in register notes on
+   INSN.  */
+
+static void
+resolve_reg_notes (rtx insn)
+{
+  rtx *pnote, note;
+
+  note = find_reg_equal_equiv_note (insn);
+  if (note)
+    {
+      if (for_each_rtx (&XEXP (note, 0), resolve_subreg_use, NULL))
+	{
+	  remove_note (insn, note);
+	  remove_retval_note (insn);
+	}
+    }
+
+  pnote = &REG_NOTES (insn);
+  while (*pnote != NULL_RTX)
+    {
+      bool delete = false;
+
+      note = *pnote;
+      switch (REG_NOTE_KIND (note))
+	{
+	case REG_NO_CONFLICT:
+	  if (resolve_reg_p (XEXP (note, 0)))
+	    delete = true;
+	  break;
+
+	default:
+	  break;
+	}
+
+      if (delete)
+	*pnote = XEXP (note, 1);
+      else
+	pnote = &XEXP (note, 1);
+    }
+}
+
+/* Return whether X can not be decomposed into subwords.  */
+
+static bool
+cannot_decompose_p (rtx x)
+{
+  if (REG_P (x))
+    {
+      unsigned int regno = REGNO (x);
+
+      if (HARD_REGISTER_NUM_P (regno))
+	return !validate_subreg (word_mode, GET_MODE (x), x, UNITS_PER_WORD);
+      else
+	return bitmap_bit_p (non_decomposable_context, regno);
+    }
+
+  return false;
+}
+
+/* Decompose the registers used in a simple move SET within INSN.  If
+   we don't change anything, return INSN, otherwise return the start
+   of the sequence of moves.  */
+
+static rtx
+resolve_simple_move (rtx set, rtx insn)
+{
+  rtx src, dest, real_dest, insns;
+  enum machine_mode orig_mode;
+  unsigned int words;
+  bool pushing;
+
+  src = SET_SRC (set);
+  dest = SET_DEST (set);
+  orig_mode = GET_MODE (dest);
+
+  words = (GET_MODE_SIZE (orig_mode) + UNITS_PER_WORD - 1) / UNITS_PER_WORD;
+  if (words <= 1)
+    return insn;
+
+  start_sequence ();
+
+  /* We have to handle copying from a SUBREG of a decomposed reg where
+     the SUBREG is larger than word size.  Rather than assume that we
+     can take a word_mode SUBREG of the destination, we copy to a new
+     register and then copy that to the destination.  */
+
+  real_dest = NULL_RTX;
+
+  if (GET_CODE (src) == SUBREG && resolve_reg_p (SUBREG_REG (src)))
+    {
+      real_dest = dest;
+      dest = gen_reg_rtx (orig_mode);
+    }
+
+  /* Similarly if we are copying to a SUBREG of a decomposed reg where
+     the SUBREG is larger than word size.  */
+
+  if (GET_CODE (dest) == SUBREG && resolve_reg_p (SUBREG_REG (dest)))
+    {
+      rtx reg, minsn;
+
+      reg = gen_reg_rtx (orig_mode);
+      minsn = emit_move_insn (reg, src);
+      resolve_simple_move (simple_move (minsn), minsn);
+
+      src = reg;
+    }
+
+  /* If we didn't have any big SUBREGS of decomposed registers, and
+     neither side of the move is a register we are decomposing, then
+     we don't have to do anything here.  */
+
+  if (src == SET_SRC (set)
+      && dest == SET_DEST (set)
+      && !resolve_reg_p (src)
+      && !resolve_reg_p (dest))
+    {
+      end_sequence ();
+      return insn;
+    }
+
+  /* If SRC is a register which we can't decompose, or has side
+     effects, we need to move via a temporary register.  */
+
+  if (cannot_decompose_p (src)
+      || side_effects_p (src)
+      || GET_CODE (src) == ASM_OPERANDS)
+    {
+      rtx reg;
+
+      reg = gen_reg_rtx (orig_mode);
+      emit_move_insn (reg, src);
+      src = reg;
+    }
+
+  /* If DEST is a register which we can't decompose, or has side
+     effects, we need to first move to a temporary register.  We
+     handle the common case of pushing an operand directly.  */
+
+  pushing = push_operand (dest, orig_mode);
+  if (cannot_decompose_p (dest)
+      || (side_effects_p (dest) && !pushing))
+    {
+      gcc_assert (real_dest == NULL_RTX);
+      real_dest = dest;
+      dest = gen_reg_rtx (orig_mode);
+    }
+
+  if (pushing)
+    {
+      unsigned int i, j, jinc;
+
+      gcc_assert (GET_MODE_SIZE (orig_mode) % UNITS_PER_WORD == 0);
+      gcc_assert (GET_CODE (XEXP (dest, 0)) != PRE_MODIFY);
+      gcc_assert (GET_CODE (XEXP (dest, 0)) != POST_MODIFY);
+
+      if (WORDS_BIG_ENDIAN == STACK_GROWS_DOWNWARD)
+	{
+	  j = 0;
+	  jinc = 1;
+	}
+      else
+	{
+	  j = words - 1;
+	  jinc = -1;
+	}
+
+      for (i = 0; i < words; ++i, j += jinc)
+	{
+	  rtx temp;
+
+	  temp = copy_rtx (XEXP (dest, 0));
+	  temp = adjust_automodify_address_nv (dest, word_mode, temp,
+					       j * UNITS_PER_WORD);
+	  emit_move_insn (temp,
+			  simplify_gen_subreg_concatn (word_mode, src,
+						       orig_mode,
+						       j * UNITS_PER_WORD));
+	}
+    }
+  else
+    {
+      unsigned int i;
+
+      if (REG_P (dest) && !HARD_REGISTER_NUM_P (REGNO (dest)))
+	emit_insn (gen_rtx_CLOBBER (VOIDmode, dest));
+
+      for (i = 0; i < words; ++i)
+	emit_move_insn (simplify_gen_subreg_concatn (word_mode, dest,
+						     orig_mode,
+						     i * UNITS_PER_WORD),
+			simplify_gen_subreg_concatn (word_mode, src,
+						     orig_mode,
+						     i * UNITS_PER_WORD));
+    }
+
+  if (real_dest != NULL_RTX)
+    {
+      rtx minsn;
+
+      minsn = emit_move_insn (real_dest, dest);
+      resolve_simple_move (simple_move (minsn), minsn);
+    }
+
+  insns = get_insns ();
+  end_sequence ();
+
+  emit_insn_before (insns, insn);
+
+  move_libcall_note (insn, insns);
+  remove_retval_note (insn);
+  delete_insn (insn);
+
+  return insns;
+}
+
+/* Change a CLOBBER of a decomposed register into a CLOBBER of the
+   component registers.  Return whether we changed something.  */
+
+static bool
+resolve_clobber (rtx pat, rtx insn)
+{
+  rtx reg;
+  enum machine_mode orig_mode;
+  unsigned int words, i;
+
+  reg = XEXP (pat, 0);
+  if (!resolve_reg_p (reg))
+    return false;
+
+  orig_mode = GET_MODE (reg);
+  words = GET_MODE_SIZE (orig_mode);
+  words = (words + UNITS_PER_WORD - 1) / UNITS_PER_WORD;
+
+  XEXP (pat, 0) = simplify_subreg_concatn (word_mode, reg, 0);
+  for (i = words - 1; i > 0; --i)
+    {
+      rtx x;
+
+      x = simplify_subreg_concatn (word_mode, reg, i * UNITS_PER_WORD);
+      x = gen_rtx_CLOBBER (VOIDmode, x);
+      emit_insn_after (x, insn);
+    }
+
+  return true;
+}
+
+/* A USE of a decomposed register is no longer meaningful.  Return
+   whether we changed something.  */
+
+static bool
+resolve_use (rtx pat, rtx insn)
+{
+  if (resolve_reg_p (XEXP (pat, 0)) || resolve_subreg_p (XEXP (pat, 0)))
+    {
+      delete_insn (insn);
+      return true;
+    }
+  return false;
+}
+
+/* Look for registers which are always accessed via word-sized SUBREGs
+   or via copies.  Decompose these registers into several word-sized
+   pseudo-registers.  */
+
+static void
+decompose_multiword_subregs (bool update_life)
+{
+  unsigned int max;
+  basic_block bb;
+
+  max = max_reg_num ();
+
+  /* First see if there are any multi-word pseudo-registers.  If there
+     aren't, there is nothing we can do.  This should speed up this
+     pass in the normal case, since it should be faster than scanning
+     all the insns.  */
+  {
+    unsigned int i;
+
+    for (i = FIRST_PSEUDO_REGISTER; i < max; ++i)
+      {
+	if (regno_reg_rtx[i] != NULL
+	    && GET_MODE_SIZE (GET_MODE (regno_reg_rtx[i])) > UNITS_PER_WORD)
+	  break;
+      }
+    if (i == max)
+      return;
+  }
+
+  /* FIXME: When the dataflow branch is merged, we can change this
+     code to look for each multi-word pseudo-register and to find each
+     insn which sets or uses that register.  That should be faster
+     than scanning all the insns.  */
+
+  decomposable_context = BITMAP_ALLOC (NULL);
+  non_decomposable_context = BITMAP_ALLOC (NULL);
+
+  reg_copy_graph = VEC_alloc (bitmap, heap, max);
+  VEC_safe_grow (bitmap, heap, reg_copy_graph, max);
+  memset (VEC_address (bitmap, reg_copy_graph), 0, sizeof (bitmap) * max);
+
+  FOR_EACH_BB (bb)
+    {
+      rtx insn;
+
+      FOR_BB_INSNS (bb, insn)
+	{
+	  rtx set;
+	  enum classify_move_insn cmi;
+	  int i, n;
+
+	  if (!INSN_P (insn)
+	      || GET_CODE (PATTERN (insn)) == CLOBBER
+	      || GET_CODE (PATTERN (insn)) == USE)
+	    continue;
+
+	  set = simple_move (insn);
+
+	  if (!set)
+	    cmi = NOT_SIMPLE_MOVE;
+	  else
+	    {
+	      bool retval;
+
+	      retval = find_reg_note (insn, REG_RETVAL, NULL_RTX) != NULL_RTX;
+
+	      if (find_pseudo_copy (set) && !retval)
+		cmi = SIMPLE_PSEUDO_REG_MOVE;
+	      else if (retval
+		       && REG_P (SET_SRC (set))
+		       && HARD_REGISTER_P (SET_SRC (set)))
+		{
+		  /* We don't want to decompose an assignment which
+		     copies the value returned by a libcall to a
+		     pseudo-register.  Doing that will lose the RETVAL
+		     note with no real gain.  */
+		  cmi = NOT_SIMPLE_MOVE;
+		}
+	      else
+		cmi = SIMPLE_MOVE;
+	    }
+
+	  recog_memoized (insn);
+	  extract_insn (insn);
+	  n = recog_data.n_operands;
+	  for (i = 0; i < n; ++i)
+	    {
+	      for_each_rtx (&recog_data.operand[i],
+			    find_decomposable_subregs,
+			    &cmi);
+
+	      /* We handle ASM_OPERANDS as a special case to support
+		 things like x86 rdtsc which returns a DImode value.
+		 We can decompose the output, which will certainly be
+		 operand 0, but not the inputs.  */
+
+	      if (cmi == SIMPLE_MOVE
+		  && GET_CODE (SET_SRC (set)) == ASM_OPERANDS)
+		{
+		  gcc_assert (i == 0);
+		  cmi = NOT_SIMPLE_MOVE;
+		}
+	    }
+	}
+    }
+
+  bitmap_and_compl_into (decomposable_context, non_decomposable_context);
+  if (!bitmap_empty_p (decomposable_context))
+    {
+      int hold_no_new_pseudos = no_new_pseudos;
+      int max_regno = max_reg_num ();
+      sbitmap blocks;
+      bitmap_iterator iter;
+      unsigned int regno;
+
+      propagate_pseudo_copies ();
+
+      no_new_pseudos = 0;
+      blocks = sbitmap_alloc (last_basic_block);
+      sbitmap_zero (blocks);
+
+      EXECUTE_IF_SET_IN_BITMAP (decomposable_context, 0, regno, iter)
+	decompose_register (regno);
+
+      FOR_EACH_BB (bb)
+	{
+	  rtx insn;
+
+	  FOR_BB_INSNS (bb, insn)
+	    {
+	      rtx next, pat;
+	      bool changed;
+
+	      if (!INSN_P (insn))
+		continue;
+
+	      next = NEXT_INSN (insn);
+	      changed = false;
+
+	      pat = PATTERN (insn);
+	      if (GET_CODE (pat) == CLOBBER)
+		{
+		  if (resolve_clobber (pat, insn))
+		    changed = true;
+		}
+	      else if (GET_CODE (pat) == USE)
+		{
+		  if (resolve_use (pat, insn))
+		    changed = true;
+		}
+	      else
+		{
+		  rtx set;
+		  int i;
+
+		  set = simple_move (insn);
+		  if (set)
+		    {
+		      rtx orig_insn = insn;
+
+		      insn = resolve_simple_move (set, insn);
+		      if (insn != orig_insn)
+			changed = true;
+		    }
+
+		  recog_memoized (insn);
+		  extract_insn (insn);
+		  for (i = recog_data.n_operands - 1; i >= 0; --i)
+		    for_each_rtx (recog_data.operand_loc[i],
+				  resolve_subreg_use,
+				  insn);
+
+		  resolve_reg_notes (insn);
+
+		  if (num_validated_changes () > 0)
+		    {
+		      for (i = recog_data.n_dups - 1; i >= 0; --i)
+			{
+			  rtx *pl = recog_data.dup_loc[i];
+			  int dup_num = recog_data.dup_num[i];
+			  rtx *px = recog_data.operand_loc[dup_num];
+
+			  validate_change (insn, pl, *px, 1);
+			}
+
+		      i = apply_change_group ();
+		      gcc_assert (i);
+
+		      changed = true;
+		    }
+		}
+
+	      if (changed)
+		{
+		  SET_BIT (blocks, bb->index);
+		  reg_scan_update (insn, next, max_regno);
+		}
+	    }
+	}
+
+      no_new_pseudos = hold_no_new_pseudos;
+
+      if (update_life)
+	update_life_info (blocks, UPDATE_LIFE_GLOBAL_RM_NOTES,
+			  PROP_DEATH_NOTES);
+
+      sbitmap_free (blocks);
+    }
+
+  {
+    unsigned int i;
+    bitmap b;
+
+    for (i = 0; VEC_iterate (bitmap, reg_copy_graph, i, b); ++i)
+      if (b)
+	BITMAP_FREE (b);
+  }
+
+  VEC_free (bitmap, heap, reg_copy_graph);
+
+  BITMAP_FREE (decomposable_context);
+  BITMAP_FREE (non_decomposable_context);
+}
+
+/* Gate function for lower subreg pass.  */
+
+static bool
+gate_handle_lower_subreg (void)
+{
+  return flag_split_wide_types != 0;
+}
+
+/* Implement first lower subreg pass.  */
+
+static unsigned int
+rest_of_handle_lower_subreg (void)
+{
+  decompose_multiword_subregs (false);
+  return 0;
+}
+
+/* Implement second lower subreg pass.  */
+
+static unsigned int
+rest_of_handle_lower_subreg2 (void)
+{
+  decompose_multiword_subregs (true);
+  return 0;
+}
+
+struct tree_opt_pass pass_lower_subreg =
+{
+  "subreg",	                        /* name */
+  gate_handle_lower_subreg,             /* gate */
+  rest_of_handle_lower_subreg,          /* execute */
+  NULL,                                 /* sub */
+  NULL,                                 /* next */
+  0,                                    /* static_pass_number */
+  TV_LOWER_SUBREG,                      /* tv_id */
+  0,                                    /* properties_required */
+  0,                                    /* properties_provided */
+  0,                                    /* properties_destroyed */
+  0,                                    /* todo_flags_start */
+  TODO_dump_func |
+  TODO_ggc_collect,                     /* todo_flags_finish */
+  'u'                                   /* letter */
+};
+
+struct tree_opt_pass pass_lower_subreg2 =
+{
+  "subreg2",	                        /* name */
+  gate_handle_lower_subreg,             /* gate */
+  rest_of_handle_lower_subreg2,         /* execute */
+  NULL,                                 /* sub */
+  NULL,                                 /* next */
+  0,                                    /* static_pass_number */
+  TV_LOWER_SUBREG,                      /* tv_id */
+  0,                                    /* properties_required */
+  0,                                    /* properties_provided */
+  0,                                    /* properties_destroyed */
+  0,                                    /* todo_flags_start */
+  TODO_dump_func |
+  TODO_ggc_collect,                     /* todo_flags_finish */
+  'U'                                   /* letter */
+};
Index: emit-rtl.c
===================================================================
--- emit-rtl.c	(revision 120281)
+++ emit-rtl.c	(working copy)
@@ -812,13 +812,12 @@ gen_reg_rtx (enum machine_mode mode)
   return val;
 }
 
-/* Generate a register with same attributes as REG, but offsetted by OFFSET.
+/* Update NEW with the same attributes as REG, but offset by OFFSET.
    Do the big endian correction if needed.  */
 
-rtx
-gen_rtx_REG_offset (rtx reg, enum machine_mode mode, unsigned int regno, int offset)
+static void
+update_reg_offset (rtx new, rtx reg, int offset)
 {
-  rtx new = gen_rtx_REG (mode, regno);
   tree decl;
   HOST_WIDE_INT var_size;
 
@@ -860,7 +859,7 @@ gen_rtx_REG_offset (rtx reg, enum machin
   if ((BYTES_BIG_ENDIAN || WORDS_BIG_ENDIAN)
       && decl != NULL
       && offset > 0
-      && GET_MODE_SIZE (GET_MODE (reg)) > GET_MODE_SIZE (mode)
+      && GET_MODE_SIZE (GET_MODE (reg)) > GET_MODE_SIZE (GET_MODE (new))
       && ((var_size = int_size_in_bytes (TREE_TYPE (decl))) > 0
 	  && var_size < GET_MODE_SIZE (GET_MODE (reg))))
     {
@@ -904,6 +903,30 @@ gen_rtx_REG_offset (rtx reg, enum machin
 
   REG_ATTRS (new) = get_reg_attrs (REG_EXPR (reg),
 				   REG_OFFSET (reg) + offset);
+}
+
+/* Generate a register with the same attributes as REG, but offset by
+   OFFSET.  */
+
+rtx
+gen_rtx_REG_offset (rtx reg, enum machine_mode mode, unsigned int regno,
+		    int offset)
+{
+  rtx new = gen_rtx_REG (mode, regno);
+
+  update_reg_offset (new, reg, offset);
+  return new;
+}
+
+/* Generate a new pseudo-register with the same attributes as REG, but
+   offset by OFFSET.  */
+
+rtx
+gen_reg_rtx_offset (rtx reg, enum machine_mode mode, int offset)
+{
+  rtx new = gen_reg_rtx (mode);
+
+  update_reg_offset (new, reg, offset);
   return new;
 }
 
Index: common.opt
===================================================================
--- common.opt	(revision 120281)
+++ common.opt	(working copy)
@@ -849,6 +849,10 @@ fsplit-ivs-in-unroller
 Common Report Var(flag_split_ivs_in_unroller) Init(1)
 Split lifetimes of induction variables when loops are unrolled
 
+fsplit-wide-types
+Common Report Var(flag_split_wide_types)
+Split wide types into independent registers
+
 fvariable-expansion-in-unroller
 Common Report Var(flag_variable_expansion_in_unroller)
 Apply variable expansion when loops are unrolled
Index: regclass.c
===================================================================
--- regclass.c	(revision 120281)
+++ regclass.c	(working copy)
@@ -1,6 +1,6 @@
 /* Compute register class preferences for pseudo-registers.
    Copyright (C) 1987, 1988, 1991, 1992, 1993, 1994, 1995, 1996
-   1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005
+   1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005, 2006
    Free Software Foundation, Inc.
 
 This file is part of GCC.
@@ -859,7 +859,7 @@ static void record_address_regs (enum ma
 #ifdef FORBIDDEN_INC_DEC_CLASSES
 static int auto_inc_dec_reg_p (rtx, enum machine_mode);
 #endif
-static void reg_scan_mark_refs (rtx, rtx, int);
+static void reg_scan_mark_refs (rtx, rtx, int, unsigned int);
 
 /* Wrapper around REGNO_OK_FOR_INDEX_P, to allow pseudo registers.  */
 
@@ -2296,6 +2296,14 @@ free_reg_info (void)
   regno_allocated = 0;
   reg_n_max = 0;
 }
+
+/* Clear the information stored for REGNO.  */
+void
+clear_reg_info_regno (unsigned int regno)
+{
+  if (regno < regno_allocated)
+    memset (VEC_index (reg_info_p, reg_n_info, regno), 0, sizeof (reg_info));
+}
 
 /* This is the `regscan' pass of the compiler, run just before cse
    and again just before loop.
@@ -2337,10 +2345,10 @@ reg_scan (rtx f, unsigned int nregs)
 	if (GET_CODE (pat) == PARALLEL
 	    && XVECLEN (pat, 0) > max_parallel)
 	  max_parallel = XVECLEN (pat, 0);
-	reg_scan_mark_refs (pat, insn, 0);
+	reg_scan_mark_refs (pat, insn, 0, 0);
 
 	if (REG_NOTES (insn))
-	  reg_scan_mark_refs (REG_NOTES (insn), insn, 1);
+	  reg_scan_mark_refs (REG_NOTES (insn), insn, 1, 0);
       }
 
   max_parallel += max_set_parallel;
@@ -2348,11 +2356,39 @@ reg_scan (rtx f, unsigned int nregs)
   timevar_pop (TV_REG_SCAN);
 }
 
+/* Update 'regscan' information by looking at the insns from FIRST
+   up to but not including LAST.  Some new REGs have been created,
+   and any REG numbered OLD_MAX_REGNO or greater is such a REG.  We
+   only update information for those.  */
+
+void
+reg_scan_update (rtx first, rtx last, unsigned int old_max_regno)
+{
+  rtx insn;
+
+  allocate_reg_info (max_reg_num (), FALSE, FALSE);
+
+  for (insn = first; insn != last; insn = NEXT_INSN (insn))
+    if (INSN_P (insn))
+      {
+	rtx pat = PATTERN (insn);
+	if (GET_CODE (pat) == PARALLEL
+	    && XVECLEN (pat, 0) > max_parallel)
+	  max_parallel = XVECLEN (pat, 0);
+	reg_scan_mark_refs (pat, insn, 0, old_max_regno);
+
+	if (REG_NOTES (insn))
+	  reg_scan_mark_refs (REG_NOTES (insn), insn, 1, old_max_regno);
+      }
+}
+
 /* X is the expression to scan.  INSN is the insn it appears in.
-   NOTE_FLAG is nonzero if X is from INSN's notes rather than its body.  */
+   NOTE_FLAG is nonzero if X is from INSN's notes rather than its body.
+   We should only record information for REGs with numbers
+   greater than or equal to MIN_REGNO.  */
 
 static void
-reg_scan_mark_refs (rtx x, rtx insn, int note_flag)
+reg_scan_mark_refs (rtx x, rtx insn, int note_flag, unsigned int min_regno)
 {
   enum rtx_code code;
   rtx dest;
@@ -2379,35 +2415,43 @@ reg_scan_mark_refs (rtx x, rtx insn, int
       {
 	unsigned int regno = REGNO (x);
 
-	if (!note_flag)
-	  REGNO_LAST_UID (regno) = INSN_UID (insn);
-	if (REGNO_FIRST_UID (regno) == 0)
-	  REGNO_FIRST_UID (regno) = INSN_UID (insn);
+	if (regno >= min_regno)
+	  {
+	    if (!note_flag)
+	      REGNO_LAST_UID (regno) = INSN_UID (insn);
+	    if (REGNO_FIRST_UID (regno) == 0)
+	      REGNO_FIRST_UID (regno) = INSN_UID (insn);
+	    /* If we are called by reg_scan_update() (indicated by min_regno
+	       being set), we also need to update the reference count.  */
+	    if (min_regno)
+	      REG_N_REFS (regno)++;
+	  }
       }
       break;
 
     case EXPR_LIST:
       if (XEXP (x, 0))
-	reg_scan_mark_refs (XEXP (x, 0), insn, note_flag);
+	reg_scan_mark_refs (XEXP (x, 0), insn, note_flag, min_regno);
       if (XEXP (x, 1))
-	reg_scan_mark_refs (XEXP (x, 1), insn, note_flag);
+	reg_scan_mark_refs (XEXP (x, 1), insn, note_flag, min_regno);
       break;
 
     case INSN_LIST:
       if (XEXP (x, 1))
-	reg_scan_mark_refs (XEXP (x, 1), insn, note_flag);
+	reg_scan_mark_refs (XEXP (x, 1), insn, note_flag, min_regno);
       break;
 
     case CLOBBER:
       {
 	rtx reg = XEXP (x, 0);
-	if (REG_P (reg))
+	if (REG_P (reg)
+	    && REGNO (reg) >= min_regno)
 	  {
 	    REG_N_SETS (REGNO (reg))++;
 	    REG_N_REFS (REGNO (reg))++;
 	  }
 	else if (MEM_P (reg))
-	  reg_scan_mark_refs (XEXP (reg, 0), insn, note_flag);
+	  reg_scan_mark_refs (XEXP (reg, 0), insn, note_flag, min_regno);
       }
       break;
 
@@ -2424,7 +2468,8 @@ reg_scan_mark_refs (rtx x, rtx insn, int
       if (GET_CODE (dest) == PARALLEL)
 	max_set_parallel = MAX (max_set_parallel, XVECLEN (dest, 0) - 1);
 
-      if (REG_P (dest))
+      if (REG_P (dest)
+	  && REGNO (dest) >= min_regno)
 	{
 	  REG_N_SETS (REGNO (dest))++;
 	  REG_N_REFS (REGNO (dest))++;
@@ -2444,6 +2489,7 @@ reg_scan_mark_refs (rtx x, rtx insn, int
 
       if (REG_P (SET_DEST (x))
 	  && REGNO (SET_DEST (x)) >= FIRST_PSEUDO_REGISTER
+	  && REGNO (SET_DEST (x)) >= min_regno
 	  /* If the destination pseudo is set more than once, then other
 	     sets might not be to a pointer value (consider access to a
 	     union in two threads of control in the presence of global
@@ -2504,12 +2550,12 @@ reg_scan_mark_refs (rtx x, rtx insn, int
 	for (i = GET_RTX_LENGTH (code) - 1; i >= 0; i--)
 	  {
 	    if (fmt[i] == 'e')
-	      reg_scan_mark_refs (XEXP (x, i), insn, note_flag);
+	      reg_scan_mark_refs (XEXP (x, i), insn, note_flag, min_regno);
 	    else if (fmt[i] == 'E' && XVEC (x, i) != 0)
 	      {
 		int j;
 		for (j = XVECLEN (x, i) - 1; j >= 0; j--)
-		  reg_scan_mark_refs (XVECEXP (x, i, j), insn, note_flag);
+		  reg_scan_mark_refs (XVECEXP (x, i, j), insn, note_flag, min_regno);
 	      }
 	  }
       }
Index: rtl.h
===================================================================
--- rtl.h	(revision 120281)
+++ rtl.h	(working copy)
@@ -1471,6 +1471,7 @@ extern int rtx_equal_p (rtx, rtx);
 extern rtvec gen_rtvec_v (int, rtx *);
 extern rtx gen_reg_rtx (enum machine_mode);
 extern rtx gen_rtx_REG_offset (rtx, enum machine_mode, unsigned int, int);
+extern rtx gen_reg_rtx_offset (rtx, enum machine_mode, int);
 extern rtx gen_label_rtx (void);
 extern rtx gen_lowpart_common (enum machine_mode, rtx);
 
@@ -2162,6 +2163,7 @@ extern void init_reg_sets (void);
 extern void regclass_init (void);
 extern void regclass (rtx, int);
 extern void reg_scan (rtx, unsigned int);
+extern void reg_scan_update (rtx, rtx, unsigned int);
 extern void fix_register (const char *, int, int);
 extern void init_subregs_of_mode (void);
 extern void record_subregs_of_mode (rtx);
Index: Makefile.in
===================================================================
--- Makefile.in	(revision 120281)
+++ Makefile.in	(working copy)
@@ -1018,7 +1018,7 @@ OBJS-common = \
  lambda-trans.o	lambda-code.o tree-loop-linear.o tree-ssa-sink.o 	   \
  tree-vrp.o tree-stdarg.o tree-cfgcleanup.o tree-ssa-reassoc.o		   \
  tree-ssa-structalias.o tree-object-size.o 				   \
- rtl-factoring.o
+ rtl-factoring.o lower-subreg.o
 
 
 OBJS-md = $(out_object_file)
@@ -2647,6 +2647,10 @@ hooks.o: hooks.c $(CONFIG_H) $(SYSTEM_H)
 pretty-print.o: $(CONFIG_H) $(SYSTEM_H) coretypes.h intl.h $(PRETTY_PRINT_H) \
    $(TREE_H)
 errors.o : errors.c $(CONFIG_H) $(SYSTEM_H) errors.h $(BCONFIG_H)
+lower-subreg.o : lower-subreg.c $(CONFIG_H) $(SYSTEM_H) coretypes.h \
+   $(MACHMODE_H) $(TM_H) $(RTL_H) $(TM_P_H) $(TIMEVAR_H) $(FLAGS_H) \
+   insn-config.h $(BASIC_BLOCK_H) $(RECOG_H) $(OBSTACK_H) bitmap.h \
+   $(EXPR_H) $(REGS_H) tree-pass.h
 
 $(out_object_file): $(out_file) $(CONFIG_H) coretypes.h $(TM_H) $(TREE_H) \
    $(RTL_H) $(REGS_H) hard-reg-set.h insn-config.h conditions.h \
Index: passes.c
===================================================================
--- passes.c	(revision 120281)
+++ passes.c	(working copy)
@@ -633,6 +633,7 @@ init_optimization_passes (void)
   NEXT_PASS (pass_unshare_all_rtl);
   NEXT_PASS (pass_instantiate_virtual_regs);
   NEXT_PASS (pass_jump2);
+  NEXT_PASS (pass_lower_subreg);
   NEXT_PASS (pass_cse);
   NEXT_PASS (pass_rtl_fwprop);
   NEXT_PASS (pass_gcse);
@@ -652,6 +653,7 @@ init_optimization_passes (void)
   NEXT_PASS (pass_partition_blocks);
   NEXT_PASS (pass_regmove);
   NEXT_PASS (pass_split_all_insns);
+  NEXT_PASS (pass_lower_subreg2);
   NEXT_PASS (pass_mode_switching);
   NEXT_PASS (pass_see);
   NEXT_PASS (pass_recompute_reg_usage);

