This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: [PR64164] drop copyrename, integrate into expand


On Sep 18, 2015, Alan Lawrence <alan.lawrence@arm.com> wrote:

> With the latest git commit 2b27ef197ece54c4573c5a748b0d40076e35412c on
> branch aoliva/pr64164, I am now able to build a cross toolchain for
> aarch64 and aarch64_be, and can confirm the ABI failure is fixed on
> the branch.

Thanks for the confirmation.  I've made one further tweak for cris and
lm32, dropping the assert that caused build failures for libstdc++
atomics parms that required more alignment than
MAX_SUPPORTED_STACK_ALIGNMENT, consolidated the patchset and retested it
with a more recent baseline (r228019), with native regstraps on
x86_64-linux-gnu, i686-linux-gnu, powerpc64-linux-gnu,
powerpc64le-linux-gnu, and cross toolchain builds for the following 73
platforms: aarch64_be-elf aarch64-elf arm-eabi armeb-eabihf
arm-symbianelf avr-elf bfin-elf c6x-elf cr16-elf cris-elf crisv32-elf
epiphany-elf fido-elf fr30-elf frv-elf ft32-elf h8300-elf i686-elf
ia64-elf iq2000-elf lm32-elf m32c-elf m32r-elf m32rle-elf m68k-elf
mcore-elf mep-elf microblaze-elf mips64el-elf mips64-elf mips64orion-elf
mips64vr-elf mipsel-elf mipsisa32-elfoabi mipsisa64-elfoabi
mipsisa64r2el-elf mipsisa64r2-sde-elf mipsisa64sb1-elf
mipsisa64sr71k-elf mipstx39-elf mn10300-elf moxie-elf msp430-elf
nds32be-elf nds32le-elf nios2-elf pdp11-aout powerpc-eabialtivec
powerpc-eabi powerpc-eabisimaltivec powerpc-eabisim powerpc-eabispe
powerpcle-eabi powerpcle-eabisim powerpcle-elf powerpc-xilinx-eabi
ppc64-eabi ppc-eabi ppc-elf rl78-elf rx-elf sh64-elf sh-elf
sh-superh-elf sparc64-elf sparc-elf sparc-leon-elf spu-elf v850e-elf
v850-elf visium-elf xstormy16-elf xtensa-elf.  Not all of them succeeded
in building, but those that didn't failed at the very same spots before
and after this patch.


This patch doesn't really add much functionality.  Rather, it
reimplements a lot of the ugly and fragile stuff I put in with the
previous big patchset in a far more robust and pleasant way.  It fixes a
number of regressions in the process, mainly because, instead of
modifying assign_parms so as to let cfgexpand do part of its job, it
reverts all of the RTL assignment for parameters and results to
assign_parms.  cfgexpand now leaves the RTL assignment of partitions
containing default defs or parms and results to assign_parms, and
assign_parms uses a single callback, set_parm_rtl, to tell cfgexpand the
assignment for the partition containing the default def of each
parameter.
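
As a rough illustration of the division of labour described above, here
is a toy C model.  It is not GCC code: every name in it (toy_init,
toy_set_parm_location, and so on) is hypothetical, and a location is
just an int standing in for an rtx.  The expander side skips partitions
that hold a parm or result default def, and the parameter-assignment
side reports its choice through a single callback.

```c
#include <assert.h>

/* Toy model only; none of these names exist in GCC.  */
enum { NUM_PARTITIONS = 4, NO_LOCATION = -1 };

/* Location chosen for each partition (an int standing in for an rtx).  */
static int partition_location[NUM_PARTITIONS];
/* Nonzero if the partition holds the default def of a parm or result.  */
static int holds_parm_default_def[NUM_PARTITIONS];

void
toy_init (void)
{
  for (int i = 0; i < NUM_PARTITIONS; i++)
    {
      partition_location[i] = NO_LOCATION;
      holds_parm_default_def[i] = 0;
    }
}

void
toy_mark_parm_default_def (int part)
{
  assert (part >= 0 && part < NUM_PARTITIONS);
  holds_parm_default_def[part] = 1;
}

/* Expander side: assign consecutive locations to every partition that
   does not hold a parm default def.  Returns the next free location.  */
int
toy_expand_other_partitions (int first_free_location)
{
  int next = first_free_location;
  for (int i = 0; i < NUM_PARTITIONS; i++)
    if (!holds_parm_default_def[i])
      partition_location[i] = next++;
  return next;
}

/* The single callback: the parameter-assignment side reports the
   location it picked for the partition of a parm's default def.  */
void
toy_set_parm_location (int part, int location)
{
  assert (part >= 0 && part < NUM_PARTITIONS);
  assert (holds_parm_default_def[part]);
  partition_location[part] = location;
}

int
toy_location (int part)
{
  assert (part >= 0 && part < NUM_PARTITIONS);
  return partition_location[part];
}
```

In this sketch, a partition marked with toy_mark_parm_default_def stays
unassigned after toy_expand_other_partitions and only gets a location
once toy_set_parm_location is called for it.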

This required introducing default defs for all parms and results, even
if unused; we could refrain from creating them, and refrain from
initializing those parameters (at least when optimizing), but that would
require messing with the fragile bits in assign_parms again, and it
would bring little benefit, since RTL optimization will likely notice
the initialization is unused and drop it anyway.  Besides, adding the
default defs was actually needed to fix a regression in the previous
patch, and even with the current patch it helps make sure we don't
assign more than one default def to the same SSA partition (the previous
patch attempted to do that, but there was a bug, fixed in the current
patch).  Having unused default defs makes it easier for us to decide
whether to use an entry_value rtx for the initial debug insn of a parm.
We track partitions holding default defs for parms and results with a
bitmap; we used to have a bitmap that tracked partitions holding default
defs, but it was unused!  I just renamed it and repurposed it.
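
The bookkeeping described above can be pictured with a small bit set.
This is only a sketch under simplifying assumptions: the type and
function names are made up for illustration, and it handles at most 64
partitions, unlike a real GCC bitmap.  Registering a partition reports
whether the bit was newly set, which is one way to notice a second
default def landing in a partition that already holds one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for a partition bitmap; supports partitions 0..63.
   All names here are hypothetical.  */
typedef struct
{
  uint64_t bits;
} toy_partition_set;

/* Mark PART as holding the default def of a parm or result.  Returns
   true if the bit was newly set, false if PART was already marked.  */
bool
toy_register_default_def (toy_partition_set *set, unsigned part)
{
  assert (part < 64);
  uint64_t mask = UINT64_C (1) << part;
  bool newly_set = (set->bits & mask) == 0;
  set->bits |= mask;
  return newly_set;
}

/* Return true if PART holds a parm or result default def.  */
bool
toy_holds_default_def (const toy_partition_set *set, unsigned part)
{
  assert (part < 64);
  return ((set->bits >> part) & 1) != 0;
}
```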

I've also added checking asserts to set_rtl, to verify that, when we
expect a REG, we get a REG, and that it has the expected mode.  set_rtl
was also adjusted to record anonymous SSA names or their base types in
attrs of REGs or MEMs, respectively, so that code that relied on the
attrs to detect properties of the decl types no longer regresses just
because we no longer generate decls for anonymous SSA names.  Since
there were prior uses of types in MEM attrs, that was expected to go
smoothly, but I was surprised at how smoothly adding SSA names to REG
attrs went.  No adjustments required!
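
A much simplified picture of the kind of check described above, written
as a standalone C sketch.  The struct and function names are
hypothetical, modes are plain ints, and the sketch ignores the CONCAT,
PARALLEL, SUBREG and pc_rtx cases that the real asserts in the patch
below account for.  It only captures the basic rule: if a register is
expected, the assignment must be a register in the promoted mode;
otherwise it must be memory, where unpromoted modes are accepted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins; none of these names exist in GCC.  */
enum toy_rtx_kind { TOY_REG, TOY_MEM };

struct toy_rtx
{
  enum toy_rtx_kind kind;
  int mode;			/* A plain int standing in for a machine mode.  */
};

/* Return true if assigning X is consistent with the expectation
   WANTS_REGISTER and with PROMOTED_MODE.  A null X means nothing has
   been assigned yet, which is always acceptable here.  */
bool
toy_assignment_ok (const struct toy_rtx *x, bool wants_register,
		   int promoted_mode)
{
  if (x == NULL)
    return true;

  if (wants_register)
    return x->kind == TOY_REG && x->mode == promoted_mode;

  /* For memory, the mode is not checked: unpromoted modes are fine.  */
  return x->kind == TOY_MEM;
}
```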

I also tightened the conditions for coalescing a bit: we used to require
the same canonical type; I've added tests for same alignment
requirements, and for same signedness.  OTOH, I've added a few more
coalesce candidates for RESULT_DECLs and the newly-added default defs of
parms and results.
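
The tightened conditions can be summarized as a predicate over three
properties.  The following is a sketch based only on the description
above, not on the actual gimple_can_coalesce_p code; the struct, its
fields and the function name are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical summary of the properties compared when deciding
   whether two names may share a partition.  */
struct toy_type_info
{
  int canonical_type_id;	/* Stand-in for the canonical type.  */
  unsigned align_bits;		/* Alignment requirement, in bits.  */
  bool is_unsigned;		/* Signedness.  */
};

/* Return true if A and B pass all three tests: same canonical type
   (the original requirement), plus same alignment and same
   signedness (the newly added ones).  */
bool
toy_can_coalesce (const struct toy_type_info *a,
		  const struct toy_type_info *b)
{
  if (a->canonical_type_id != b->canonical_type_id)
    return false;
  if (a->align_bits != b->align_bits)
    return false;
  if (a->is_unsigned != b->is_unsigned)
    return false;
  return true;
}
```

Under the old rule, only the first comparison would have been made, so
two names differing just in alignment or signedness would have been
accepted as candidates.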

Other relevant changes were in mode promotion.  TYPE_MODE would often
return BLKmode for some vector types, which was fine for some return
decl RTL with PARALLEL, but that didn't quite work for SSA partitions.
There were other cases of mode promotion of result decls that failed the
asserts in set_rtl, that revealed promote_decl_mode didn't call
promote_function_mode as expected for results.

The new asserts brought additional requirements: promoting the mode of
the RTL generated for the static chain, arranging for result decls to be
assigned to a pseudo where it would formerly have got a BLKmode PARALLEL
(as mentioned above), and arranging for parms set up by
assign_parm_setup_block, that would always get a MEM, to instead get a
REG when use_register_for_decl called for it.  In a few cases involving
complex parms, I couldn't figure out how to avoid a temporary MEM, used
to adjust the padding of the parms.  Although undesired, this is not a
regression, for we used to use the MEM anyway; now we just load the
parms into (coalescible) pseudos and use the pseudos, rather than
coalescing other vars that expected pseudos into the same MEM.

Is this ok to install?



revert to assign_parms assignments using default defs

From: Alexandre Oliva <aoliva@redhat.com>

Revert the fragile and complicated changes to assign_parms designed to
enable it to use RTL assignments chosen by cfgexpand, and instead have
cfgexpand use the RTL assignments by assign_parms, keying them off of
the default defs that are now necessarily introduced for each parm and
result.  The possible lack of a default def was already a problem, and
the fallbacks in place were not enough, as shown by PR67312.  We now
have checking asserts in set_rtl that verify that we're assigning to
each var a piece of RTL that matches the expectations set forth by
use_register_for_decl.

for  gcc/ChangeLog

	PR rtl-optimization/64164
	PR tree-optimization/67312
	PR middle-end/67340
	PR middle-end/67490
	PR bootstrap/67597
	* cfgexpand.c (parm_in_stack_slot_p): Remove.
	(ssa_default_def_partition): Remove.
	(get_rtl_for_parm_ssa_default_def): Remove.
	(set_rtl): Check that RTL assignments match expectations.
	Loop on SUBREGs, CONCATs and PARALLELs subexprs.  Set only the
	default def location for params and results.  Record SSA names
	or types in REG and MEM attrs, respectively.
	(set_parm_rtl): New.
	(expand_one_ssa_partition): Drop logic that assigned MEMs with
	unassigned addresses.
	(adjust_one_expanded_partition_var): Don't accept NULL RTL on
	deferred stack alloc vars.
	(expand_used_vars): Skip partitions holding parm default defs.
	Move adjust_one_expanded_partition_var loop...
	(pass_expand::execute): ... here.  Drop redundant assert.
	Adjust comments before the final loop over all ssa names.
	Require assigned rtl of parms and results to match exactly.
	Reset its attributes to match them, not any other variables in
	the same partition.
	(expand_debug_expr): Use entry value for PARM's default defs
	only iff they have zero nondebug uses.
	* cfgexpand.h (parm_in_stack_slot_p): Remove.
	(get_rtl_for_parm_ssa_default_def): Remove.
	(set_parm_rtl): Declare.
	* doc/invoke.texi: Improve wording.
	* explow.c (promote_decl_mode): Fix promote_function_mode for
	result decls not by reference.
	(promote_ssa_mode): Disregard BLKmode from promote_decl, and
	bypass TYPE_MODE to get the actual vector mode.
	* function.c: Include tree-dfa.h.  Revert 2015-08-14's and
	2015-08-19's changes as follows.  Drop include of
	basic-block.h and df.h.
	(rtl_for_parm): Remove.
	(maybe_reset_rtl_for_parm): Remove.
	(parm_in_unassigned_mem_p): Remove.
	(use_register_for_decl): Add logic for RESULT_DECLs matching
	assign_parms' behavior.
	(split_complex_args): Revert.
	(assign_parms_augmented_arg_list): Revert.  Add comment
	referencing the logic above.
	(assign_parm_adjust_stack_rtl): Revert.
	(assign_parm_setup_block): Revert.  Use set_parm_rtl instead
	of SET_DECL_RTL.  Set up a REG if the parm demands so.
	(assign_parm_setup_reg): Revert.  Consolidated SET_DECL_RTL
	calls into a single set_parm_rtl.  Set up a temporary RTL
	temporarily for expand_assignment.
	(assign_parm_setup_stack): Revert.  Use set_parm_rtl.
	(assign_parms_unsplit_complex): Revert.  Use set_parm_rtl.
	(assign_bounds): Revert.
	(assign_parms): Revert.  Use set_parm_rtl.
	(allocate_struct_function): Relayout result and parms of
	non-abstract functions.
	(expand_function_start): Revert.  Use set_parm_rtl.  If the
	result is not a hard reg, create a pseudo from the promoted
	mode of the default def.  Promote static chain mode.
	* tree-outof-ssa.c (remove_ssa_form): Drop unused
	partition_has_default_def.  Set up
	partitions_for_parm_default_defs.
	(finish_out_of_ssa): Remove partition_has_default_def.
	Release partitions_for_parm_default_defs.
	* tree-outof-ssa.h (struct ssaexpand): Remove
	partition_has_default_def.  Add
	partitions_for_parm_default_defs.
	* tree-ssa-coalesce.c: Include tree-dfa.h, tm_p.h and
	stor-layout.h.
	(build_ssa_conflict_graph): Fix conflict detection for default
	defs of params and results, even unused ones.
	(for_all_parms): New.
	(create_default_def): New.
	(register_default_def): New.
	(coalesce_with_default): New.
	(create_outofssa_var_map): Create default defs for all parms
	and results, and register their partitions.  Add GIMPLE_RETURN
	operands as coalesce candidates with results.  Add default
	defs of each parm or result as coalesce candidates with its
	other defs.  Mark each result def, and each default def of
	parms, as used_in_copy.
	(gimple_can_coalesce_p): Call it.  Call use_register_for_decl
	with the ssa names, even anonymous ones.  Drop
	parm_in_stack_slot_p calls.  Require same signedness and
	alignment.
	(coalesce_ssa_name): Add coalesce candidates for all defs of
	each parm and result, even unused ones.
	(parm_default_def_partition_arg): New type.
	(set_parm_default_def_partition): New.
	(get_parm_default_def_partitions): New.
	* tree-ssa-coalesce.h (get_parm_default_def_partitions): New.
	* tree-ssa-live.c (partition_view_init): Regard unused defs of
	parms and results as used.
	(verify_live_on_entry): Don't error out just because they're
	not live.

for  gcc/testsuite/ChangeLog

	PR rtl-optimization/64164
	PR tree-optimization/67312
	* gcc.dg/pr67312.c: New.  From Zdenek Sojka.
	* gcc.target/i386/stackalign/return-4.c: Add -O.
---
 gcc/cfgexpand.c                                    |  332 +++++++-------
 gcc/cfgexpand.h                                    |    3 
 gcc/doc/invoke.texi                                |    9 
 gcc/explow.c                                       |   19 +
 gcc/function.c                                     |  477 +++++++-------------
 gcc/testsuite/gcc.dg/pr67312.c                     |    7 
 .../gcc.target/i386/stackalign/return-4.c          |    9 
 gcc/tree-outof-ssa.c                               |   15 -
 gcc/tree-outof-ssa.h                               |    6 
 gcc/tree-ssa-coalesce.c                            |  231 ++++++++--
 gcc/tree-ssa-coalesce.h                            |    1 
 gcc/tree-ssa-live.c                                |   10 
 12 files changed, 582 insertions(+), 537 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/pr67312.c

diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
index 6c9284f..58e55d2 100644
--- a/gcc/cfgexpand.c
+++ b/gcc/cfgexpand.c
@@ -99,6 +99,8 @@ static rtx expand_debug_expr (tree);
 
 static bool defer_stack_allocation (tree, bool);
 
+static void record_alignment_for_reg_var (unsigned int);
+
 /* Return an expression tree corresponding to the RHS of GIMPLE
    statement STMT.  */
 
@@ -172,111 +174,86 @@ leader_merge (tree cur, tree next)
   return cur;
 }
 
-/* Return true if VAR is a PARM_DECL or a RESULT_DECL that ought to be
-   assigned to a stack slot.  We can't have expand_one_ssa_partition
-   choose their address: the pseudo holding the address would be set
-   up too late for assign_params to copy the parameter if needed.
-
-   Such parameters are likely passed as a pointer to the value, rather
-   than as a value, and so we must not coalesce them, nor allocate
-   stack space for them before determining the calling conventions for
-   them.
-
-   For their SSA_NAMEs, expand_one_ssa_partition emits RTL as MEMs
-   with pc_rtx as the address, and then it replaces the pc_rtx with
-   NULL so as to make sure the MEM is not used before it is adjusted
-   in assign_parm_setup_reg.  */
-
-bool
-parm_in_stack_slot_p (tree var)
-{
-  if (!var || VAR_P (var))
-    return false;
-
-  gcc_assert (TREE_CODE (var) == PARM_DECL
-	      || TREE_CODE (var) == RESULT_DECL);
-
-  return !use_register_for_decl (var);
-}
-
-/* Return the partition of the default SSA_DEF for decl VAR.  */
-
-static int
-ssa_default_def_partition (tree var)
-{
-  tree name = ssa_default_def (cfun, var);
-
-  if (!name)
-    return NO_PARTITION;
-
-  return var_to_partition (SA.map, name);
-}
-
-/* Return the RTL for the default SSA def of a PARM or RESULT, if
-   there is one.  */
-
-rtx
-get_rtl_for_parm_ssa_default_def (tree var)
-{
-  gcc_assert (TREE_CODE (var) == PARM_DECL || TREE_CODE (var) == RESULT_DECL);
-
-  if (!is_gimple_reg (var))
-    return NULL_RTX;
-
-  /* If we've already determined RTL for the decl, use it.  This is
-     not just an optimization: if VAR is a PARM whose incoming value
-     is unused, we won't find a default def to use its partition, but
-     we still want to use the location of the parm, if it was used at
-     all.  During assign_parms, until a location is assigned for the
-     VAR, RTL can only for a parm or result if we're not coalescing
-     across variables, when we know we're coalescing all SSA_NAMEs of
-     each parm or result, and we're not coalescing them with names
-     pertaining to other variables, such as other parms' default
-     defs.  */
-  if (DECL_RTL_SET_P (var))
-    {
-      gcc_assert (DECL_RTL (var) != pc_rtx);
-      return DECL_RTL (var);
-    }
-
-  int part = ssa_default_def_partition (var);
-  if (part == NO_PARTITION)
-    return NULL_RTX;
-
-  return SA.partition_to_pseudo[part];
-}
-
 /* Associate declaration T with storage space X.  If T is no
    SSA name this is exactly SET_DECL_RTL, otherwise make the
    partition of T associated with X.  */
 static inline void
 set_rtl (tree t, rtx x)
 {
-  if (x && SSAVAR (t))
+  gcc_checking_assert (!x
+		       || !(TREE_CODE (t) == SSA_NAME || is_gimple_reg (t))
+		       || (use_register_for_decl (t)
+			   ? (REG_P (x)
+			      || (GET_CODE (x) == CONCAT
+				  && (REG_P (XEXP (x, 0))
+				      || SUBREG_P (XEXP (x, 0)))
+				  && (REG_P (XEXP (x, 1))
+				      || SUBREG_P (XEXP (x, 1))))
+			      || (GET_CODE (x) == PARALLEL
+				  && SSAVAR (t)
+				  && TREE_CODE (SSAVAR (t)) == RESULT_DECL
+				  && !flag_tree_coalesce_vars))
+			   : (MEM_P (x) || x == pc_rtx
+			      || (GET_CODE (x) == CONCAT
+				  && MEM_P (XEXP (x, 0))
+				  && MEM_P (XEXP (x, 1))))));
+  /* Check that the RTL for SSA_NAMEs and gimple-reg PARM_DECLs and
+     RESULT_DECLs has the expected mode.  For memory, we accept
+     unpromoted modes, since that's what we're likely to get.  For
+     PARM_DECLs and RESULT_DECLs, we'll have been called by
+     set_parm_rtl, which will give us the default def, so we don't
+     have to compute it ourselves.  For RESULT_DECLs, we accept mode
+     mismatches too, as long as we're not coalescing across variables,
+     so that we don't reject BLKmode PARALLELs or unpromoted REGs.  */
+  gcc_checking_assert (!x || x == pc_rtx || TREE_CODE (t) != SSA_NAME
+		       || (SSAVAR (t) && TREE_CODE (SSAVAR (t)) == RESULT_DECL
+			   && !flag_tree_coalesce_vars)
+		       || !use_register_for_decl (t)
+		       || GET_MODE (x) == promote_ssa_mode (t, NULL));
+
+  if (x)
     {
       bool skip = false;
       tree cur = NULL_TREE;
-
-      if (MEM_P (x))
-	cur = MEM_EXPR (x);
-      else if (REG_P (x))
-	cur = REG_EXPR (x);
-      else if (GET_CODE (x) == CONCAT
-	       && REG_P (XEXP (x, 0)))
-	cur = REG_EXPR (XEXP (x, 0));
-      else if (GET_CODE (x) == PARALLEL)
-	cur = REG_EXPR (XVECEXP (x, 0, 0));
-      else if (x == pc_rtx)
+      rtx xm = x;
+
+    retry:
+      if (MEM_P (xm))
+	cur = MEM_EXPR (xm);
+      else if (REG_P (xm))
+	cur = REG_EXPR (xm);
+      else if (SUBREG_P (xm))
+	{
+	  gcc_assert (subreg_lowpart_p (xm));
+	  xm = SUBREG_REG (xm);
+	  goto retry;
+	}
+      else if (GET_CODE (xm) == CONCAT)
+	{
+	  xm = XEXP (xm, 0);
+	  goto retry;
+	}
+      else if (GET_CODE (xm) == PARALLEL)
+	{
+	  xm = XVECEXP (xm, 0, 0);
+	  gcc_assert (GET_CODE (xm) == EXPR_LIST);
+	  xm = XEXP (xm, 0);
+	  goto retry;
+	}
+      else if (xm == pc_rtx)
 	skip = true;
       else
 	gcc_unreachable ();
 
-      tree next = skip ? cur : leader_merge (cur, SSAVAR (t));
+      tree next = skip ? cur : leader_merge (cur, SSAVAR (t) ? SSAVAR (t) : t);
 
       if (cur != next)
 	{
 	  if (MEM_P (x))
-	    set_mem_attributes (x, next, true);
+	    set_mem_attributes (x,
+				next && TREE_CODE (next) == SSA_NAME
+				? TREE_TYPE (next)
+				: next, true);
 	  else
 	    set_reg_attrs_for_decl_rtl (next, x);
 	}
@@ -294,13 +271,11 @@ set_rtl (tree t, rtx x)
 	}
       /* For the benefit of debug information at -O0 (where
          vartracking doesn't run) record the place also in the base
-         DECL.  For PARMs and RESULTs, we may end up resetting these
-         in function.c:maybe_reset_rtl_for_parm, but in some rare
-         cases we may need them (unused and overwritten incoming
-         value, that at -O0 must share the location with the other
-         uses in spite of the missing default def), and this may be
-         the only chance to preserve them.  */
-      if (x && x != pc_rtx && SSA_NAME_VAR (t))
+         DECL.  For PARMs and RESULTs, do so only when setting the
+         default def.  */
+      if (x && x != pc_rtx && SSA_NAME_VAR (t)
+	  && (VAR_P (SSA_NAME_VAR (t))
+	      || SSA_NAME_IS_DEFAULT_DEF (t)))
 	{
 	  tree var = SSA_NAME_VAR (t);
 	  /* If we don't yet have something recorded, just record it now.  */
@@ -1242,6 +1217,49 @@ account_stack_vars (void)
   return size;
 }
 
+/* Record the RTL assignment X for the default def of PARM.  */
+
+extern void
+set_parm_rtl (tree parm, rtx x)
+{
+  gcc_assert (TREE_CODE (parm) == PARM_DECL
+	      || TREE_CODE (parm) == RESULT_DECL);
+
+  if (x && !MEM_P (x))
+    {
+      unsigned int align = MINIMUM_ALIGNMENT (TREE_TYPE (parm),
+					      TYPE_MODE (TREE_TYPE (parm)),
+					      TYPE_ALIGN (TREE_TYPE (parm)));
+
+      /* If the variable alignment is very large we'll dynamically
+	 allocate it, which means that in-frame portion is just a
+	 pointer.  ??? We've got a pseudo for sure here, do we
+	 actually dynamically allocate its spilling area if needed?
+	 ??? Isn't it a problem when POINTER_SIZE also exceeds
+	 MAX_SUPPORTED_STACK_ALIGNMENT, as on cris and lm32?  */
+      if (align > MAX_SUPPORTED_STACK_ALIGNMENT)
+	align = POINTER_SIZE;
+
+      record_alignment_for_reg_var (align);
+    }
+
+  if (!is_gimple_reg (parm))
+    return set_rtl (parm, x);
+
+  tree ssa = ssa_default_def (cfun, parm);
+  if (!ssa)
+    return set_rtl (parm, x);
+
+  int part = var_to_partition (SA.map, ssa);
+  gcc_assert (part != NO_PARTITION);
+
+  bool changed = bitmap_bit_p (SA.partitions_for_parm_default_defs, part);
+  gcc_assert (changed);
+
+  set_rtl (ssa, x);
+  gcc_assert (DECL_RTL (parm) == x);
+}
+
 /* A subroutine of expand_one_var.  Called to immediately assign rtl
    to a variable to be allocated in the stack frame.  */
 
@@ -1349,37 +1367,7 @@ expand_one_ssa_partition (tree var)
 
   if (!use_register_for_decl (var))
     {
-      /* We can't risk having the parm assigned to a MEM location
-	 whose address references a pseudo, for the pseudo will only
-	 be set up after arguments are copied to the stack slot.
-
-	 If the parm doesn't have a default def (e.g., because its
-	 incoming value is unused), then we want to let assign_params
-	 do the allocation, too.  In this case we want to make sure
-	 SSA_NAMEs associated with the parm don't get assigned to more
-	 than one partition, lest we'd create two unassigned stac
-	 slots for the same parm, thus the assert at the end of the
-	 block.  */
-      if (parm_in_stack_slot_p (SSA_NAME_VAR (var))
-	  && (ssa_default_def_partition (SSA_NAME_VAR (var)) == part
-	      || !ssa_default_def (cfun, SSA_NAME_VAR (var))))
-	{
-	  expand_one_stack_var_at (var, pc_rtx, 0, 0);
-	  rtx x = SA.partition_to_pseudo[part];
-	  gcc_assert (GET_CODE (x) == MEM);
-	  gcc_assert (XEXP (x, 0) == pc_rtx);
-	  /* Reset the address, so that any attempt to use it will
-	     ICE.  It will be adjusted in assign_parm_setup_reg.  */
-	  XEXP (x, 0) = NULL_RTX;
-	  /* If the RTL associated with the parm is not what we have
-	     just created, the parm has been split over multiple
-	     partitions.  In order for this to work, we must have a
-	     default def for the parm, otherwise assign_params won't
-	     know what to do.  */
-	  gcc_assert (DECL_RTL_IF_SET (SSA_NAME_VAR (var)) == x
-		      || ssa_default_def (cfun, SSA_NAME_VAR (var)));
-	}
-      else if (defer_stack_allocation (var, true))
+      if (defer_stack_allocation (var, true))
 	add_stack_var (var);
       else
 	expand_one_stack_var_1 (var);
@@ -1393,8 +1381,8 @@ expand_one_ssa_partition (tree var)
   set_rtl (var, x);
 }
 
-/* Record the association between the RTL generated for a partition
-   and the underlying variable of the SSA_NAME.  */
+/* Record the association between the RTL generated for partition PART
+   and the underlying variable of the SSA_NAME VAR.  */
 
 static void
 adjust_one_expanded_partition_var (tree var)
@@ -1410,12 +1398,7 @@ adjust_one_expanded_partition_var (tree var)
 
   rtx x = SA.partition_to_pseudo[part];
 
-  if (!x)
-    {
-      /* This var will get a stack slot later.  */
-      gcc_assert (defer_stack_allocation (var, true));
-      return;
-    }
+  gcc_assert (x);
 
   set_rtl (var, x);
 
@@ -2040,6 +2023,9 @@ expand_used_vars (void)
 
   for (i = 0; i < SA.map->num_partitions; i++)
     {
+      if (bitmap_bit_p (SA.partitions_for_parm_default_defs, i))
+	continue;
+
       tree var = partition_to_var (SA.map, i);
 
       gcc_assert (!virtual_operand_p (var));
@@ -2047,9 +2033,6 @@ expand_used_vars (void)
       expand_one_ssa_partition (var);
     }
 
-  for (i = 1; i < num_ssa_names; i++)
-    adjust_one_expanded_partition_var (ssa_name (i));
-
   if (flag_stack_protect == SPCT_FLAG_STRONG)
       gen_stack_protect_signal
 	= stack_protect_decl_p () || stack_protect_return_slot_p ();
@@ -4947,26 +4930,27 @@ expand_debug_expr (tree exp)
 	  }
 	else
 	  {
+	    /* If this is a reference to an incoming value of
+	       parameter that is never used in the code or where the
+	       incoming value is never used in the code, use
+	       PARM_DECL's DECL_RTL if set.  */
+	    if (SSA_NAME_IS_DEFAULT_DEF (exp)
+		&& SSA_NAME_VAR (exp)
+		&& TREE_CODE (SSA_NAME_VAR (exp)) == PARM_DECL
+		&& has_zero_uses (exp))
+	      {
+		op0 = expand_debug_parm_decl (SSA_NAME_VAR (exp));
+		if (op0)
+		  goto adjust_mode;
+		op0 = expand_debug_expr (SSA_NAME_VAR (exp));
+		if (op0)
+		  goto adjust_mode;
+	      }
+
 	    int part = var_to_partition (SA.map, exp);
 
 	    if (part == NO_PARTITION)
-	      {
-		/* If this is a reference to an incoming value of parameter
-		   that is never used in the code or where the incoming
-		   value is never used in the code, use PARM_DECL's
-		   DECL_RTL if set.  */
-		if (SSA_NAME_IS_DEFAULT_DEF (exp)
-		    && TREE_CODE (SSA_NAME_VAR (exp)) == PARM_DECL)
-		  {
-		    op0 = expand_debug_parm_decl (SSA_NAME_VAR (exp));
-		    if (op0)
-		      goto adjust_mode;
-		    op0 = expand_debug_expr (SSA_NAME_VAR (exp));
-		    if (op0)
-		      goto adjust_mode;
-		  }
-		return NULL;
-	      }
+	      return NULL;
 
 	    gcc_assert (part >= 0 && (unsigned)part < SA.map->num_partitions);
 
@@ -6216,9 +6200,26 @@ pass_expand::execute (function *fun)
       parm_birth_insn = var_seq;
     }
 
-  /* If we have a class containing differently aligned pointers
-     we need to merge those into the corresponding RTL pointer
-     alignment.  */
+  /* Now propagate the RTL assignment of each partition to the
+     underlying var of each SSA_NAME.  */
+  for (i = 1; i < num_ssa_names; i++)
+    {
+      tree name = ssa_name (i);
+
+      if (!name
+	  /* We might have generated new SSA names in
+	     update_alias_info_with_stack_vars.  They will have a NULL
+	     defining statements, and won't be part of the partitioning,
+	     so ignore those.  */
+	  || !SSA_NAME_DEF_STMT (name))
+	continue;
+
+      adjust_one_expanded_partition_var (name);
+    }
+
+  /* Clean up RTL of variables that straddle across multiple
+     partitions, and check that the rtl of any PARM_DECLs that are not
+     cleaned up is that of their default defs.  */
   for (i = 1; i < num_ssa_names; i++)
     {
       tree name = ssa_name (i);
@@ -6235,9 +6236,6 @@ pass_expand::execute (function *fun)
       if (part == NO_PARTITION)
 	continue;
 
-      gcc_assert (SA.partition_to_pseudo[part]
-		  || defer_stack_allocation (name, true));
-
       /* If this decl was marked as living in multiple places, reset
 	 this now to NULL.  */
       tree var = SSA_NAME_VAR (name);
@@ -6252,7 +6250,19 @@ pass_expand::execute (function *fun)
 	  rtx in = DECL_RTL_IF_SET (var);
 	  gcc_assert (in);
 	  rtx out = SA.partition_to_pseudo[part];
-	  gcc_assert (in == out || rtx_equal_p (in, out));
+	  gcc_assert (in == out);
+
+	  /* Now reset VAR's RTL to IN, so that the _EXPR attrs match
+	     those expected by debug backends for each parm and for
+	     the result.  This is particularly important for stabs,
+	     whose register elimination from parm's DECL_RTL may cause
+	     -fcompare-debug differences as SET_DECL_RTL changes reg's
+	     attrs.  So, make sure the RTL already has the parm as the
+	     EXPR, so that it won't change.  */
+	  SET_DECL_RTL (var, NULL_RTX);
+	  if (MEM_P (in))
+	    set_mem_attributes (in, var, true);
+	  SET_DECL_RTL (var, in);
 	}
     }
 
diff --git a/gcc/cfgexpand.h b/gcc/cfgexpand.h
index ff7f4bef..8852411 100644
--- a/gcc/cfgexpand.h
+++ b/gcc/cfgexpand.h
@@ -22,8 +22,7 @@ along with GCC; see the file COPYING3.  If not see
 
 extern tree gimple_assign_rhs_to_tree (gimple *);
 extern HOST_WIDE_INT estimated_stack_frame_size (struct cgraph_node *);
-extern bool parm_in_stack_slot_p (tree);
-extern rtx get_rtl_for_parm_ssa_default_def (tree var);
+extern void set_parm_rtl (tree, rtx);
 
 
 #endif /* GCC_CFGEXPAND_H */
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 09c58ee..aefb061 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -8866,12 +8866,13 @@ profitable to parallelize the loops.
 
 @item -ftree-coalesce-vars
 @opindex ftree-coalesce-vars
-Tell the compiler to attempt to combine small user-defined variables
-too, instead of just compiler temporaries.  This may severely limit the
-ability to debug an optimized program compiled with
+While transforming the program out of the SSA representation, attempt to
+reduce copying by coalescing versions of different user-defined
+variables, instead of just compiler temporaries.  This may severely
+limit the ability to debug an optimized program compiled with
 @option{-fno-var-tracking-assignments}.  In the negated form, this flag
 prevents SSA coalescing of user variables.  This option is enabled by
-default if optimization is enabled.
+default if optimization is enabled, and it does very little otherwise.
 
 @item -ftree-loop-if-convert
 @opindex ftree-loop-if-convert
diff --git a/gcc/explow.c b/gcc/explow.c
index 6941f4e..d104a79 100644
--- a/gcc/explow.c
+++ b/gcc/explow.c
@@ -830,8 +830,10 @@ promote_decl_mode (const_tree decl, int *punsignedp)
   machine_mode mode = DECL_MODE (decl);
   machine_mode pmode;
 
-  if (TREE_CODE (decl) == RESULT_DECL
-      || TREE_CODE (decl) == PARM_DECL)
+  if (TREE_CODE (decl) == RESULT_DECL && !DECL_BY_REFERENCE (decl))
+    pmode = promote_function_mode (type, mode, &unsignedp,
+                                   TREE_TYPE (current_function_decl), 1);
+  else if (TREE_CODE (decl) == RESULT_DECL || TREE_CODE (decl) == PARM_DECL)
     pmode = promote_function_mode (type, mode, &unsignedp,
                                    TREE_TYPE (current_function_decl), 2);
   else
@@ -857,12 +859,23 @@ promote_ssa_mode (const_tree name, int *punsignedp)
   if (SSA_NAME_VAR (name)
       && (TREE_CODE (SSA_NAME_VAR (name)) == PARM_DECL
 	  || TREE_CODE (SSA_NAME_VAR (name)) == RESULT_DECL))
-    return promote_decl_mode (SSA_NAME_VAR (name), punsignedp);
+    {
+      machine_mode mode = promote_decl_mode (SSA_NAME_VAR (name), punsignedp);
+      if (mode != BLKmode)
+	return mode;
+    }
 
   tree type = TREE_TYPE (name);
   int unsignedp = TYPE_UNSIGNED (type);
   machine_mode mode = TYPE_MODE (type);
 
+  /* Bypass TYPE_MODE when it maps vector modes to BLKmode.  */
+  if (mode == BLKmode)
+    {
+      gcc_assert (VECTOR_TYPE_P (type));
+      mode = type->type_common.mode;
+    }
+
   machine_mode pmode = promote_mode (type, mode, &unsignedp);
   if (punsignedp)
     *punsignedp = unsignedp;
diff --git a/gcc/function.c b/gcc/function.c
index 9b4c2b9..21304689 100644
--- a/gcc/function.c
+++ b/gcc/function.c
@@ -74,8 +74,6 @@ along with GCC; see the file COPYING3.  If not see
 #include "cfgbuild.h"
 #include "cfgcleanup.h"
 #include "cfgexpand.h"
-#include "basic-block.h"
-#include "df.h"
 #include "params.h"
 #include "bb-reorder.h"
 #include "shrink-wrap.h"
@@ -83,6 +81,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "rtl-iter.h"
 #include "tree-chkp.h"
 #include "rtl-chkp.h"
+#include "tree-dfa.h"
 
 /* So we can assign to cfun in this file.  */
 #undef cfun
@@ -152,9 +151,6 @@ static bool contains (const_rtx, hash_table<insn_cache_hasher> *);
 static void prepare_function_start (void);
 static void do_clobber_return_reg (rtx, void *);
 static void do_use_return_reg (rtx, void *);
-static rtx rtl_for_parm (struct assign_parm_data_all *, tree);
-static void maybe_reset_rtl_for_parm (tree);
-static bool parm_in_unassigned_mem_p (tree, rtx);
 
 
 /* Stack of nested functions.  */
@@ -2145,6 +2141,47 @@ use_register_for_decl (const_tree decl)
   if (TREE_ADDRESSABLE (decl))
     return false;
 
+  /* RESULT_DECLs are a bit special in that they're assigned without
+     regard to use_register_for_decl, but we generally only store in
+     them.  If we coalesce their SSA NAMEs, we'd better return a
+     result that matches the assignment in expand_function_start.  */
+  if (TREE_CODE (decl) == RESULT_DECL)
+    {
+      /* If it's not an aggregate, we're going to use a REG or a
+	 PARALLEL containing a REG.  */
+      if (!aggregate_value_p (decl, current_function_decl))
+	return true;
+
+      /* If expand_function_start determines the return value, we'll
+	 use MEM if it's not by reference.  */
+      if (cfun->returns_pcc_struct
+	  || (targetm.calls.struct_value_rtx
+	      (TREE_TYPE (current_function_decl), 1)))
+	return DECL_BY_REFERENCE (decl);
+
+      /* Otherwise, we're taking an extra all.function_result_decl
+	 argument.  It's set up in assign_parms_augmented_arg_list,
+	 under the (negated) conditions above, and then it's used to
+	 set up the RESULT_DECL rtl in assign_params, after looping
+	 over all parameters.  Now, if the RESULT_DECL is not by
+	 reference, we'll use a MEM either way.  */
+      if (!DECL_BY_REFERENCE (decl))
+	return false;
+
+      /* Otherwise, if RESULT_DECL is DECL_BY_REFERENCE, it will take
+	 the function_result_decl's assignment.  Since it's a pointer,
+	 we can short-circuit a number of the tests below, and we must
+	 duplicate them because we don't have the
+	 function_result_decl to test.  */
+      if (!targetm.calls.allocate_stack_slots_for_args ())
+	return true;
+      /* We don't set DECL_IGNORED_P for the function_result_decl.  */
+      if (optimize)
+	return true;
+      /* We don't set DECL_REGISTER for the function_result_decl.  */
+      return false;
+    }
+
   /* Decl is implicitly addressible by bound stores and loads
      if it is an aggregate holding bounds.  */
   if (chkp_function_instrumented_p (current_function_decl)
@@ -2272,7 +2309,7 @@ assign_parms_initialize_all (struct assign_parm_data_all *all)
    needed, else the old list.  */
 
 static void
-split_complex_args (struct assign_parm_data_all *all, vec<tree> *args)
+split_complex_args (vec<tree> *args)
 {
   unsigned i;
   tree p;
@@ -2283,7 +2320,6 @@ split_complex_args (struct assign_parm_data_all *all, vec<tree> *args)
       if (TREE_CODE (type) == COMPLEX_TYPE
 	  && targetm.calls.split_complex_arg (type))
 	{
-	  tree cparm = p;
 	  tree decl;
 	  tree subtype = TREE_TYPE (type);
 	  bool addressable = TREE_ADDRESSABLE (p);
@@ -2302,9 +2338,6 @@ split_complex_args (struct assign_parm_data_all *all, vec<tree> *args)
 	  DECL_ARTIFICIAL (p) = addressable;
 	  DECL_IGNORED_P (p) = addressable;
 	  TREE_ADDRESSABLE (p) = 0;
-	  /* Reset the RTL before layout_decl, or it may change the
-	     mode of the RTL of the original argument copied to P.  */
-	  SET_DECL_RTL (p, NULL_RTX);
 	  layout_decl (p, 0);
 	  (*args)[i] = p;
 
@@ -2316,41 +2349,6 @@ split_complex_args (struct assign_parm_data_all *all, vec<tree> *args)
 	  DECL_IGNORED_P (decl) = addressable;
 	  layout_decl (decl, 0);
 	  args->safe_insert (++i, decl);
-
-	  /* If we are expanding a function, rather than gimplifying
-	     it, propagate the RTL of the complex parm to the split
-	     declarations, and set their contexts so that
-	     maybe_reset_rtl_for_parm can recognize them and refrain
-	     from resetting their RTL.  */
-	  if (currently_expanding_to_rtl)
-	    {
-	      maybe_reset_rtl_for_parm (cparm);
-	      rtx rtl = rtl_for_parm (all, cparm);
-	      if (rtl)
-		{
-		  /* If this is parm is unassigned, assign it now: the
-		     newly-created decls wouldn't expect the need for
-		     assignment, and if they were assigned
-		     independently, they might not end up in adjacent
-		     slots, so unsplit wouldn't be able to fill in the
-		     unassigned address of the complex MEM.  */
-		  if (parm_in_unassigned_mem_p (cparm, rtl))
-		    {
-		      int align = STACK_SLOT_ALIGNMENT
-			(TREE_TYPE (cparm), GET_MODE (rtl), MEM_ALIGN (rtl));
-		      rtx loc = assign_stack_local
-			(GET_MODE (rtl), GET_MODE_SIZE (GET_MODE (rtl)),
-			 align);
-		      XEXP (rtl, 0) = XEXP (loc, 0);
-		    }
-
-		  SET_DECL_RTL (p, read_complex_part (rtl, false));
-		  SET_DECL_RTL (decl, read_complex_part (rtl, true));
-
-		  DECL_CONTEXT (p) = cparm;
-		  DECL_CONTEXT (decl) = cparm;
-		}
-	    }
 	}
     }
 }
@@ -2386,6 +2384,9 @@ assign_parms_augmented_arg_list (struct assign_parm_data_all *all)
       DECL_ARTIFICIAL (decl) = 1;
       DECL_NAMELESS (decl) = 1;
       TREE_CONSTANT (decl) = 1;
+      /* We don't set DECL_IGNORED_P or DECL_REGISTER here.  If this
+	 changes, the end of the RESULT_DECL handling block in
+	 use_register_for_decl must be adjusted to match.  */
 
       DECL_CHAIN (decl) = all->orig_fnargs;
       all->orig_fnargs = decl;
@@ -2413,7 +2414,7 @@ assign_parms_augmented_arg_list (struct assign_parm_data_all *all)
 
   /* If the target wants to split complex arguments into scalars, do so.  */
   if (targetm.calls.split_complex_arg)
-    split_complex_args (all, &fnargs);
+    split_complex_args (&fnargs);
 
   return fnargs;
 }
@@ -2816,98 +2817,23 @@ assign_parm_remove_parallels (struct assign_parm_data_one *data)
   data->entry_parm = entry_parm;
 }
 
-/* Wrapper for use_register_for_decl, that special-cases the
-   .result_ptr as the function's RESULT_DECL when the RESULT_DECL is
-   passed by reference.  */
-
-static bool
-use_register_for_parm_decl (struct assign_parm_data_all *all, tree parm)
-{
-  if (parm == all->function_result_decl)
-    {
-      tree result = DECL_RESULT (current_function_decl);
-
-      if (DECL_BY_REFERENCE (result))
-	parm = result;
-    }
-
-  return use_register_for_decl (parm);
-}
-
-/* Wrapper for get_rtl_for_parm_ssa_default_def, that special-cases
-   the .result_ptr as the function's RESULT_DECL when the RESULT_DECL
-   is passed by reference.  */
-
-static rtx
-rtl_for_parm (struct assign_parm_data_all *all, tree parm)
-{
-  if (parm == all->function_result_decl)
-    {
-      tree result = DECL_RESULT (current_function_decl);
-
-      if (!DECL_BY_REFERENCE (result))
-	return NULL_RTX;
-
-      parm = result;
-    }
-
-  return get_rtl_for_parm_ssa_default_def (parm);
-}
-
-/* Reset the location of PARM_DECLs and RESULT_DECLs that had
-   SSA_NAMEs in multiple partitions, so that assign_parms will choose
-   the default def, if it exists, or create new RTL to hold the unused
-   entry value.  If we are coalescing across variables, we want to
-   reset the location too, because a parm without a default def
-   (incoming value unused) might be coalesced with one with a default
-   def, and then assign_parms would copy both incoming values to the
-   same location, which might cause the wrong value to survive.  */
-static void
-maybe_reset_rtl_for_parm (tree parm)
-{
-  gcc_assert (TREE_CODE (parm) == PARM_DECL
-	      || TREE_CODE (parm) == RESULT_DECL);
-
-  /* This is a split complex parameter, and its context was set to its
-     original PARM_DECL in split_complex_args so that we could
-     recognize it here and not reset its RTL.  */
-  if (DECL_CONTEXT (parm) && TREE_CODE (DECL_CONTEXT (parm)) == PARM_DECL)
-    {
-      DECL_CONTEXT (parm) = DECL_CONTEXT (DECL_CONTEXT (parm));
-      return;
-    }
-
-  if ((flag_tree_coalesce_vars
-       || (DECL_RTL_SET_P (parm) && DECL_RTL (parm) == pc_rtx))
-      && is_gimple_reg (parm))
-    SET_DECL_RTL (parm, NULL_RTX);
-}
-
 /* A subroutine of assign_parms.  Adjust DATA->STACK_RTL such that it's
    always valid and properly aligned.  */
 
 static void
-assign_parm_adjust_stack_rtl (struct assign_parm_data_all *all, tree parm,
-			      struct assign_parm_data_one *data)
+assign_parm_adjust_stack_rtl (struct assign_parm_data_one *data)
 {
   rtx stack_parm = data->stack_parm;
 
-  /* If out-of-SSA assigned RTL to the parm default def, make sure we
-     don't use what we might have computed before.  */
-  rtx ssa_assigned = rtl_for_parm (all, parm);
-  if (ssa_assigned)
-    stack_parm = NULL;
-
   /* If we can't trust the parm stack slot to be aligned enough for its
      ultimate type, don't use that slot after entry.  We'll make another
      stack slot, if we need one.  */
-  else if (stack_parm
-	   && ((STRICT_ALIGNMENT
-		&& (GET_MODE_ALIGNMENT (data->nominal_mode)
-		    > MEM_ALIGN (stack_parm)))
-	       || (data->nominal_type
-		   && TYPE_ALIGN (data->nominal_type) > MEM_ALIGN (stack_parm)
-		   && MEM_ALIGN (stack_parm) < PREFERRED_STACK_BOUNDARY)))
+  if (stack_parm
+      && ((STRICT_ALIGNMENT
+	   && GET_MODE_ALIGNMENT (data->nominal_mode) > MEM_ALIGN (stack_parm))
+	  || (data->nominal_type
+	      && TYPE_ALIGN (data->nominal_type) > MEM_ALIGN (stack_parm)
+	      && MEM_ALIGN (stack_parm) < PREFERRED_STACK_BOUNDARY)))
     stack_parm = NULL;
 
   /* If parm was passed in memory, and we need to convert it on entry,
@@ -2952,27 +2878,6 @@ assign_parm_setup_block_p (struct assign_parm_data_one *data)
   return false;
 }
 
-/* Return true if FROM_EXPAND is a MEM with an address to be filled in
-   by assign_params.  This should be the case if, and only if,
-   parm_in_stack_slot_p holds for the parm DECL that expanded to
-   FROM_EXPAND, so we check that, too.  */
-
-static bool
-parm_in_unassigned_mem_p (tree decl, rtx from_expand)
-{
-  bool result = MEM_P (from_expand) && !XEXP (from_expand, 0);
-
-  gcc_assert (result == parm_in_stack_slot_p (decl)
-	      /* Maybe it was already assigned.  That's ok, especially
-		 for split complex args.  */
-	      || (!result && MEM_P (from_expand)
-		  && (XEXP (from_expand, 0) == virtual_stack_vars_rtx
-		      || (GET_CODE (XEXP (from_expand, 0)) == PLUS
-			  && XEXP (XEXP (from_expand, 0), 0) == virtual_stack_vars_rtx))));
-
-  return result;
-}
-
 /* A subroutine of assign_parms.  Arrange for the parameter to be
    present and valid in DATA->STACK_RTL.  */
 
@@ -2982,38 +2887,39 @@ assign_parm_setup_block (struct assign_parm_data_all *all,
 {
   rtx entry_parm = data->entry_parm;
   rtx stack_parm = data->stack_parm;
+  rtx target_reg = NULL_RTX;
   HOST_WIDE_INT size;
   HOST_WIDE_INT size_stored;
 
   if (GET_CODE (entry_parm) == PARALLEL)
     entry_parm = emit_group_move_into_temps (entry_parm);
 
+  /* If we want the parameter in a pseudo, don't use a stack slot.  */
+  if (is_gimple_reg (parm) && use_register_for_decl (parm))
+    {
+      tree def = ssa_default_def (cfun, parm);
+      gcc_assert (def);
+      machine_mode mode = promote_ssa_mode (def, NULL);
+      rtx reg = gen_reg_rtx (mode);
+      if (GET_CODE (reg) != CONCAT)
+	stack_parm = reg;
+      else
+	/* This will use or allocate a stack slot that we'd rather
+	   avoid.  FIXME: Could we avoid it in more cases?  */
+	target_reg = reg;
+      data->stack_parm = NULL;
+    }
+
   size = int_size_in_bytes (data->passed_type);
   size_stored = CEIL_ROUND (size, UNITS_PER_WORD);
-
   if (stack_parm == 0)
     {
       DECL_ALIGN (parm) = MAX (DECL_ALIGN (parm), BITS_PER_WORD);
-      rtx from_expand = rtl_for_parm (all, parm);
-      if (from_expand && !parm_in_unassigned_mem_p (parm, from_expand))
-	stack_parm = copy_rtx (from_expand);
-      else
-	{
-	  stack_parm = assign_stack_local (BLKmode, size_stored,
-					   DECL_ALIGN (parm));
-	  if (GET_MODE_SIZE (GET_MODE (entry_parm)) == size)
-	    PUT_MODE (stack_parm, GET_MODE (entry_parm));
-	  if (from_expand)
-	    {
-	      gcc_assert (GET_CODE (stack_parm) == MEM);
-	      gcc_assert (parm_in_unassigned_mem_p (parm, from_expand));
-	      XEXP (from_expand, 0) = XEXP (stack_parm, 0);
-	      PUT_MODE (from_expand, GET_MODE (stack_parm));
-	      stack_parm = copy_rtx (from_expand);
-	    }
-	  else
-	    set_mem_attributes (stack_parm, parm, 1);
-	}
+      stack_parm = assign_stack_local (BLKmode, size_stored,
+				       DECL_ALIGN (parm));
+      if (GET_MODE_SIZE (GET_MODE (entry_parm)) == size)
+	PUT_MODE (stack_parm, GET_MODE (entry_parm));
+      set_mem_attributes (stack_parm, parm, 1);
     }
 
   /* If a BLKmode arrives in registers, copy it to a stack slot.  Handle
@@ -3054,11 +2960,6 @@ assign_parm_setup_block (struct assign_parm_data_all *all,
       else if (size == 0)
 	;
 
-      /* MEM may be a REG if coalescing assigns the param's partition
-	 to a pseudo.  */
-      else if (REG_P (mem))
-	emit_move_insn (mem, entry_parm);
-
       /* If SIZE is that of a mode no bigger than a word, just use
 	 that mode's store operation.  */
       else if (size <= UNITS_PER_WORD)
@@ -3113,10 +3014,14 @@ assign_parm_setup_block (struct assign_parm_data_all *all,
 	      tem = change_address (mem, word_mode, 0);
 	      emit_move_insn (tem, x);
 	    }
+	  else if (!MEM_P (mem))
+	    emit_move_insn (mem, entry_parm);
 	  else
 	    move_block_from_reg (REGNO (entry_parm), mem,
 				 size_stored / UNITS_PER_WORD);
 	}
+      else if (!MEM_P (mem))
+	emit_move_insn (mem, entry_parm);
       else
 	move_block_from_reg (REGNO (entry_parm), mem,
 			     size_stored / UNITS_PER_WORD);
@@ -3131,8 +3036,14 @@ assign_parm_setup_block (struct assign_parm_data_all *all,
       end_sequence ();
     }
 
+  if (target_reg)
+    {
+      emit_move_insn (target_reg, stack_parm);
+      stack_parm = target_reg;
+    }
+
   data->stack_parm = stack_parm;
-  SET_DECL_RTL (parm, stack_parm);
+  set_parm_rtl (parm, stack_parm);
 }
 
 /* A subroutine of assign_parms.  Allocate a pseudo to hold the current
@@ -3148,6 +3059,7 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
   int unsignedp = TYPE_UNSIGNED (TREE_TYPE (parm));
   bool did_conversion = false;
   bool need_conversion, moved;
+  rtx rtl;
 
   /* Store the parm in a pseudoregister during the function, but we may
      need to do it in a wider mode.  Using 2 here makes the result
@@ -3156,40 +3068,19 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
     = promote_function_mode (data->nominal_type, data->nominal_mode, &unsignedp,
 			     TREE_TYPE (current_function_decl), 2);
 
-  rtx from_expand = parmreg = rtl_for_parm (all, parm);
-
-  if (from_expand && !data->passed_pointer)
-    {
-      if (GET_MODE (parmreg) != promoted_nominal_mode)
-	parmreg = gen_lowpart (promoted_nominal_mode, parmreg);
-    }
-  else if (!from_expand || parm_in_unassigned_mem_p (parm, from_expand))
-    {
-      parmreg = gen_reg_rtx (promoted_nominal_mode);
-      if (!DECL_ARTIFICIAL (parm))
-	mark_user_reg (parmreg);
-
-      if (from_expand)
-	{
-	  gcc_assert (data->passed_pointer);
-	  gcc_assert (GET_CODE (from_expand) == MEM
-		      && XEXP (from_expand, 0) == NULL_RTX);
-	  XEXP (from_expand, 0) = parmreg;
-	}
-    }
+  parmreg = gen_reg_rtx (promoted_nominal_mode);
+  if (!DECL_ARTIFICIAL (parm))
+    mark_user_reg (parmreg);
 
   /* If this was an item that we received a pointer to,
-     set DECL_RTL appropriately.  */
-  if (from_expand)
-    SET_DECL_RTL (parm, from_expand);
-  else if (data->passed_pointer)
+     set rtl appropriately.  */
+  if (data->passed_pointer)
     {
-      rtx x = gen_rtx_MEM (TYPE_MODE (TREE_TYPE (data->passed_type)), parmreg);
-      set_mem_attributes (x, parm, 1);
-      SET_DECL_RTL (parm, x);
+      rtl = gen_rtx_MEM (TYPE_MODE (TREE_TYPE (data->passed_type)), parmreg);
+      set_mem_attributes (rtl, parm, 1);
     }
   else
-    SET_DECL_RTL (parm, parmreg);
+    rtl = parmreg;
 
   assign_parm_remove_parallels (data);
 
@@ -3197,13 +3088,10 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
      assign_parm_find_data_types and expand_expr_real_1.  */
 
   equiv_stack_parm = data->stack_parm;
-  if (!equiv_stack_parm)
-    equiv_stack_parm = data->entry_parm;
   validated_mem = validize_mem (copy_rtx (data->entry_parm));
 
   need_conversion = (data->nominal_mode != data->passed_mode
 		     || promoted_nominal_mode != data->promoted_mode);
-  gcc_assert (!(need_conversion && data->passed_pointer && from_expand));
   moved = false;
 
   if (need_conversion
@@ -3327,7 +3215,9 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
 
       /* TREE_USED gets set erroneously during expand_assignment.  */
       save_tree_used = TREE_USED (parm);
+      SET_DECL_RTL (parm, rtl);
       expand_assignment (parm, make_tree (data->nominal_type, tempreg), false);
+      SET_DECL_RTL (parm, NULL_RTX);
       TREE_USED (parm) = save_tree_used;
       all->first_conversion_insn = get_insns ();
       all->last_conversion_insn = get_last_insn ();
@@ -3335,28 +3225,16 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
 
       did_conversion = true;
     }
-  /* We don't want to copy the incoming pointer to a parmreg expected
-     to hold the value rather than the pointer.  */
-  else if (!data->passed_pointer || parmreg != from_expand)
+  else
     emit_move_insn (parmreg, validated_mem);
 
   /* If we were passed a pointer but the actual value can safely live
      in a register, retrieve it and use it directly.  */
-  if (data->passed_pointer
-      && (from_expand || TYPE_MODE (TREE_TYPE (parm)) != BLKmode))
+  if (data->passed_pointer && TYPE_MODE (TREE_TYPE (parm)) != BLKmode)
     {
-      rtx src = DECL_RTL (parm);
-
       /* We can't use nominal_mode, because it will have been set to
 	 Pmode above.  We must use the actual mode of the parm.  */
-      if (from_expand)
-	{
-	  parmreg = from_expand;
-	  gcc_assert (GET_MODE (parmreg) == TYPE_MODE (TREE_TYPE (parm)));
-	  src = gen_rtx_MEM (GET_MODE (parmreg), validated_mem);
-	  set_mem_attributes (src, parm, 1);
-	}
-      else if (use_register_for_decl (parm))
+      if (use_register_for_decl (parm))
 	{
 	  parmreg = gen_reg_rtx (TYPE_MODE (TREE_TYPE (parm)));
 	  mark_user_reg (parmreg);
@@ -3373,14 +3251,14 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
 	  set_mem_attributes (parmreg, parm, 1);
 	}
 
-      if (GET_MODE (parmreg) != GET_MODE (src))
+      if (GET_MODE (parmreg) != GET_MODE (rtl))
 	{
-	  rtx tempreg = gen_reg_rtx (GET_MODE (src));
+	  rtx tempreg = gen_reg_rtx (GET_MODE (rtl));
 	  int unsigned_p = TYPE_UNSIGNED (TREE_TYPE (parm));
 
 	  push_to_sequence2 (all->first_conversion_insn,
 			     all->last_conversion_insn);
-	  emit_move_insn (tempreg, src);
+	  emit_move_insn (tempreg, rtl);
 	  tempreg = convert_to_mode (GET_MODE (parmreg), tempreg, unsigned_p);
 	  emit_move_insn (parmreg, tempreg);
 	  all->first_conversion_insn = get_insns ();
@@ -3389,18 +3267,18 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
 
 	  did_conversion = true;
 	}
-      else if (GET_MODE (parmreg) == BLKmode)
-	gcc_assert (parm_in_stack_slot_p (parm));
       else
-	emit_move_insn (parmreg, src);
+	emit_move_insn (parmreg, rtl);
 
-      SET_DECL_RTL (parm, parmreg);
+      rtl = parmreg;
 
       /* STACK_PARM is the pointer, not the parm, and PARMREG is
 	 now the parm.  */
-      data->stack_parm = equiv_stack_parm = NULL;
+      data->stack_parm = NULL;
     }
 
+  set_parm_rtl (parm, rtl);
+
   /* Mark the register as eliminable if we did no conversion and it was
      copied from memory at a fixed offset, and the arg pointer was not
      copied to a pseudo-reg.  If the arg pointer is a pseudo reg or the
@@ -3408,11 +3286,11 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
      make here would screw up life analysis for it.  */
   if (data->nominal_mode == data->passed_mode
       && !did_conversion
-      && equiv_stack_parm != 0
-      && MEM_P (equiv_stack_parm)
+      && data->stack_parm != 0
+      && MEM_P (data->stack_parm)
       && data->locate.offset.var == 0
       && reg_mentioned_p (virtual_incoming_args_rtx,
-			  XEXP (equiv_stack_parm, 0)))
+			  XEXP (data->stack_parm, 0)))
     {
       rtx_insn *linsn = get_last_insn ();
       rtx_insn *sinsn;
@@ -3425,8 +3303,8 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
 	    = GET_MODE_INNER (GET_MODE (parmreg));
 	  int regnor = REGNO (XEXP (parmreg, 0));
 	  int regnoi = REGNO (XEXP (parmreg, 1));
-	  rtx stackr = adjust_address_nv (equiv_stack_parm, submode, 0);
-	  rtx stacki = adjust_address_nv (equiv_stack_parm, submode,
+	  rtx stackr = adjust_address_nv (data->stack_parm, submode, 0);
+	  rtx stacki = adjust_address_nv (data->stack_parm, submode,
 					  GET_MODE_SIZE (submode));
 
 	  /* Scan backwards for the set of the real and
@@ -3444,7 +3322,7 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm,
 		set_unique_reg_note (sinsn, REG_EQUIV, stackr);
 	    }
 	}
-      else 
+      else
 	set_dst_reg_note (linsn, REG_EQUIV, equiv_stack_parm, parmreg);
     }
 
@@ -3496,16 +3374,6 @@ assign_parm_setup_stack (struct assign_parm_data_all *all, tree parm,
   if (data->entry_parm != data->stack_parm)
     {
       rtx src, dest;
-      rtx from_expand = NULL_RTX;
-
-      if (data->stack_parm == 0)
-	{
-	  from_expand = rtl_for_parm (all, parm);
-	  if (from_expand)
-	    gcc_assert (GET_MODE (from_expand) == GET_MODE (data->entry_parm));
-	  if (from_expand && !parm_in_unassigned_mem_p (parm, from_expand))
-	    data->stack_parm = from_expand;
-	}
 
       if (data->stack_parm == 0)
 	{
@@ -3516,16 +3384,7 @@ assign_parm_setup_stack (struct assign_parm_data_all *all, tree parm,
 	    = assign_stack_local (GET_MODE (data->entry_parm),
 				  GET_MODE_SIZE (GET_MODE (data->entry_parm)),
 				  align);
-	  if (!from_expand)
-	    set_mem_attributes (data->stack_parm, parm, 1);
-	  else
-	    {
-	      gcc_assert (GET_CODE (data->stack_parm) == MEM);
-	      gcc_assert (parm_in_unassigned_mem_p (parm, from_expand));
-	      XEXP (from_expand, 0) = XEXP (data->stack_parm, 0);
-	      PUT_MODE (from_expand, GET_MODE (data->stack_parm));
-	      data->stack_parm = copy_rtx (from_expand);
-	    }
+	  set_mem_attributes (data->stack_parm, parm, 1);
 	}
 
       dest = validize_mem (copy_rtx (data->stack_parm));
@@ -3554,7 +3413,7 @@ assign_parm_setup_stack (struct assign_parm_data_all *all, tree parm,
       end_sequence ();
     }
 
-  SET_DECL_RTL (parm, data->stack_parm);
+  set_parm_rtl (parm, data->stack_parm);
 }
 
 /* A subroutine of assign_parms.  If the ABI splits complex arguments, then
@@ -3580,21 +3439,11 @@ assign_parms_unsplit_complex (struct assign_parm_data_all *all,
 	  imag = DECL_RTL (fnargs[i + 1]);
 	  if (inner != GET_MODE (real))
 	    {
-	      real = simplify_gen_subreg (inner, real, GET_MODE (real),
-					  subreg_lowpart_offset
-					  (inner, GET_MODE (real)));
-	      imag = simplify_gen_subreg (inner, imag, GET_MODE (imag),
-					  subreg_lowpart_offset
-					  (inner, GET_MODE (imag)));
+	      real = gen_lowpart_SUBREG (inner, real);
+	      imag = gen_lowpart_SUBREG (inner, imag);
 	    }
 
-	  if ((tmp = rtl_for_parm (all, parm)) != NULL_RTX
-	      && rtx_equal_p (real,
-			      read_complex_part (tmp, false))
-	      && rtx_equal_p (imag,
-			      read_complex_part (tmp, true)))
-	    ; /* We now have the right rtl in tmp.  */
-	  else if (TREE_ADDRESSABLE (parm))
+	  if (TREE_ADDRESSABLE (parm))
 	    {
 	      rtx rmem, imem;
 	      HOST_WIDE_INT size = int_size_in_bytes (TREE_TYPE (parm));
@@ -3618,7 +3467,7 @@ assign_parms_unsplit_complex (struct assign_parm_data_all *all,
 	    }
 	  else
 	    tmp = gen_rtx_CONCAT (DECL_MODE (parm), real, imag);
-	  SET_DECL_RTL (parm, tmp);
+	  set_parm_rtl (parm, tmp);
 
 	  real = DECL_INCOMING_RTL (fnargs[i]);
 	  imag = DECL_INCOMING_RTL (fnargs[i + 1]);
@@ -3740,7 +3589,7 @@ assign_bounds (vec<bounds_parm_data> &bndargs,
 	  assign_parm_setup_block (&all, pbdata->bounds_parm,
 				   &pbdata->parm_data);
 	else if (pbdata->parm_data.passed_pointer
-		 || use_register_for_parm_decl (&all, pbdata->bounds_parm))
+		 || use_register_for_decl (pbdata->bounds_parm))
 	  assign_parm_setup_reg (&all, pbdata->bounds_parm,
 				 &pbdata->parm_data);
 	else
@@ -3784,8 +3633,6 @@ assign_parms (tree fndecl)
 	  DECL_INCOMING_RTL (parm) = DECL_RTL (parm);
 	  continue;
 	}
-      else
-	maybe_reset_rtl_for_parm (parm);
 
       /* Estimate stack alignment from parameter alignment.  */
       if (SUPPORTS_STACK_ALIGNMENT)
@@ -3835,7 +3682,7 @@ assign_parms (tree fndecl)
       else
 	set_decl_incoming_rtl (parm, data.entry_parm, false);
 
-      assign_parm_adjust_stack_rtl (&all, parm, &data);
+      assign_parm_adjust_stack_rtl (&data);
 
       /* Bounds should be loaded in the particular order to
 	 have registers allocated correctly.  Collect info about
@@ -3856,8 +3703,7 @@ assign_parms (tree fndecl)
 	{
 	  if (assign_parm_setup_block_p (&data))
 	    assign_parm_setup_block (&all, parm, &data);
-	  else if (data.passed_pointer
-		   || use_register_for_parm_decl (&all, parm))
+	  else if (data.passed_pointer || use_register_for_decl (parm))
 	    assign_parm_setup_reg (&all, parm, &data);
 	  else
 	    assign_parm_setup_stack (&all, parm, &data);
@@ -3954,7 +3800,7 @@ assign_parms (tree fndecl)
 
       DECL_HAS_VALUE_EXPR_P (result) = 1;
 
-      SET_DECL_RTL (result, x);
+      set_parm_rtl (result, x);
     }
 
   /* We have aligned all the args, so add space for the pretend args.  */
@@ -4986,6 +4832,18 @@ allocate_struct_function (tree fndecl, bool abstract_p)
   if (fndecl != NULL_TREE)
     {
       tree result = DECL_RESULT (fndecl);
+
+      if (!abstract_p)
+	{
+	  /* Now that we have activated any function-specific attributes
+	     that might affect layout, particularly vector modes, relayout
+	     each of the parameters and the result.  */
+	  relayout_decl (result);
+	  for (tree parm = DECL_ARGUMENTS (fndecl); parm;
+	       parm = DECL_CHAIN (parm))
+	    relayout_decl (parm);
+	}
+
       if (!abstract_p && aggregate_value_p (result, fndecl))
 	{
 #ifdef PCC_STATIC_STRUCT_RETURN
@@ -5189,7 +5047,6 @@ expand_function_start (tree subr)
 
   /* Decide whether to return the value in memory or in a register.  */
   tree res = DECL_RESULT (subr);
-  maybe_reset_rtl_for_parm (res);
   if (aggregate_value_p (res, subr))
     {
       /* Returning something that won't go in a register.  */
@@ -5210,10 +5067,7 @@ expand_function_start (tree subr)
 	     it.  */
 	  if (sv)
 	    {
-	      if (DECL_BY_REFERENCE (res))
-		value_address = get_rtl_for_parm_ssa_default_def (res);
-	      if (!value_address)
-		value_address = gen_reg_rtx (Pmode);
+	      value_address = gen_reg_rtx (Pmode);
 	      emit_move_insn (value_address, sv);
 	    }
 	}
@@ -5222,33 +5076,35 @@ expand_function_start (tree subr)
 	  rtx x = value_address;
 	  if (!DECL_BY_REFERENCE (res))
 	    {
-	      x = get_rtl_for_parm_ssa_default_def (res);
-	      if (!x)
-		{
-		  x = gen_rtx_MEM (DECL_MODE (res), value_address);
-		  set_mem_attributes (x, res, 1);
-		}
+	      x = gen_rtx_MEM (DECL_MODE (res), x);
+	      set_mem_attributes (x, res, 1);
 	    }
-	  SET_DECL_RTL (res, x);
+	  set_parm_rtl (res, x);
 	}
     }
   else if (DECL_MODE (res) == VOIDmode)
     /* If return mode is void, this decl rtl should not be used.  */
-    SET_DECL_RTL (res, NULL_RTX);
-  else
+    set_parm_rtl (res, NULL_RTX);
+  else 
     {
       /* Compute the return values into a pseudo reg, which we will copy
 	 into the true return register after the cleanups are done.  */
       tree return_type = TREE_TYPE (res);
-      rtx x = get_rtl_for_parm_ssa_default_def (res);
-      if (x)
-	/* Use it.  */;
+      /* If we may coalesce this result, make sure it has the expected
+	 mode.  */
+      if (flag_tree_coalesce_vars && is_gimple_reg (res))
+	{
+	  tree def = ssa_default_def (cfun, res);
+	  gcc_assert (def);
+	  machine_mode mode = promote_ssa_mode (def, NULL);
+	  set_parm_rtl (res, gen_reg_rtx (mode));
+	}
       else if (TYPE_MODE (return_type) != BLKmode
 	       && targetm.calls.return_in_msb (return_type))
 	/* expand_function_end will insert the appropriate padding in
 	   this case.  Use the return value's natural (unpadded) mode
 	   within the function proper.  */
-	x = gen_reg_rtx (TYPE_MODE (return_type));
+	set_parm_rtl (res, gen_reg_rtx (TYPE_MODE (return_type)));
       else
 	{
 	  /* In order to figure out what mode to use for the pseudo, we
@@ -5259,16 +5115,14 @@ expand_function_start (tree subr)
 	  /* Structures that are returned in registers are not
 	     aggregate_value_p, so we may see a PARALLEL or a REG.  */
 	  if (REG_P (hard_reg))
-	    x = gen_reg_rtx (GET_MODE (hard_reg));
+	    set_parm_rtl (res, gen_reg_rtx (GET_MODE (hard_reg)));
 	  else
 	    {
 	      gcc_assert (GET_CODE (hard_reg) == PARALLEL);
-	      x = gen_group_rtx (hard_reg);
+	      set_parm_rtl (res, gen_group_rtx (hard_reg));
 	    }
 	}
 
-      SET_DECL_RTL (res, x);
-
       /* Set DECL_REGISTER flag so that expand_function_end will copy the
 	 result to the real return register(s).  */
       DECL_REGISTER (res) = 1;
@@ -5291,22 +5145,23 @@ expand_function_start (tree subr)
     {
       tree parm = cfun->static_chain_decl;
       rtx local, chain;
-     rtx_insn *insn;
+      rtx_insn *insn;
+      int unsignedp;
 
-      local = get_rtl_for_parm_ssa_default_def (parm);
-      if (!local)
-	local = gen_reg_rtx (Pmode);
+      local = gen_reg_rtx (promote_decl_mode (parm, &unsignedp));
       chain = targetm.calls.static_chain (current_function_decl, true);
 
       set_decl_incoming_rtl (parm, chain, false);
-      SET_DECL_RTL (parm, local);
+      set_parm_rtl (parm, local);
       mark_reg_pointer (local, TYPE_ALIGN (TREE_TYPE (TREE_TYPE (parm))));
 
-      if (GET_MODE (local) != Pmode)
-	local = convert_to_mode (Pmode, local,
-				 TYPE_UNSIGNED (TREE_TYPE (parm)));
-
-      insn = emit_move_insn (local, chain);
+      if (GET_MODE (local) != GET_MODE (chain))
+	{
+	  convert_move (local, chain, unsignedp);
+	  insn = get_last_insn ();
+	}
+      else
+	insn = emit_move_insn (local, chain);
 
       /* Mark the register as eliminable, similar to parameters.  */
       if (MEM_P (chain)
diff --git a/gcc/testsuite/gcc.dg/pr67312.c b/gcc/testsuite/gcc.dg/pr67312.c
new file mode 100644
index 0000000..f1c9fde
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr67312.c
@@ -0,0 +1,7 @@
+/* { dg-do compile } */
+/* { dg-options "-O0 -ftree-coalesce-vars" } */
+
+void foo (int x, int y)
+{
+    y = x;
+}
diff --git a/gcc/testsuite/gcc.target/i386/stackalign/return-4.c b/gcc/testsuite/gcc.target/i386/stackalign/return-4.c
index a1e35dc..d14eb2f 100644
--- a/gcc/testsuite/gcc.target/i386/stackalign/return-4.c
+++ b/gcc/testsuite/gcc.target/i386/stackalign/return-4.c
@@ -1,6 +1,13 @@
 /* { dg-do compile } */
-/* { dg-options "-mpreferred-stack-boundary=4" } */
+/* { dg-options "-mpreferred-stack-boundary=4 -O" } */
 /* { dg-final { scan-assembler-not "and\[lq\]?\[^\\n\]*-64,\[^\\n\]*sp" } } */
+/* We only guarantee we won't generate the stack alignment when
+   optimizing.  When not optimizing, the return value will be assigned
+   to a pseudo with the specified alignment, which in turn will force
+   stack alignment since the pseudo might have to be spilled.  Without
+   optimization, we wouldn't compute the actual stack requirements
+   after register allocation and reload, and just use the conservative
+   estimate.  */
 
 /* This compile only test is to detect an assertion failure in stack branch
    development.  */
diff --git a/gcc/tree-outof-ssa.c b/gcc/tree-outof-ssa.c
index fd00883..8dc4908 100644
--- a/gcc/tree-outof-ssa.c
+++ b/gcc/tree-outof-ssa.c
@@ -980,7 +980,6 @@ remove_ssa_form (bool perform_ter, struct ssaexpand *sa)
 {
   bitmap values = NULL;
   var_map map;
-  unsigned i;
 
   map = coalesce_ssa_name ();
 
@@ -1005,17 +1004,7 @@ remove_ssa_form (bool perform_ter, struct ssaexpand *sa)
 
   sa->map = map;
   sa->values = values;
-  sa->partition_has_default_def = BITMAP_ALLOC (NULL);
-  for (i = 1; i < num_ssa_names; i++)
-    {
-      tree t = ssa_name (i);
-      if (t && SSA_NAME_IS_DEFAULT_DEF (t))
-	{
-	  int p = var_to_partition (map, t);
-	  if (p != NO_PARTITION)
-	    bitmap_set_bit (sa->partition_has_default_def, p);
-	}
-    }
+  sa->partitions_for_parm_default_defs = get_parm_default_def_partitions (map);
 }
 
 
@@ -1190,7 +1179,7 @@ finish_out_of_ssa (struct ssaexpand *sa)
   if (sa->values)
     BITMAP_FREE (sa->values);
   delete_var_map (sa->map);
-  BITMAP_FREE (sa->partition_has_default_def);
+  BITMAP_FREE (sa->partitions_for_parm_default_defs);
   memset (sa, 0, sizeof *sa);
 }
 
diff --git a/gcc/tree-outof-ssa.h b/gcc/tree-outof-ssa.h
index 687e5a5..60b6379 100644
--- a/gcc/tree-outof-ssa.h
+++ b/gcc/tree-outof-ssa.h
@@ -39,9 +39,9 @@ struct ssaexpand
      a pseudos REG).  */
   rtx *partition_to_pseudo;
 
-  /* If partition I contains an SSA name that has a default def,
-     bit I will be set in this bitmap.  */
-  bitmap partition_has_default_def;
+  /* If partition I contains an SSA name that has a default def for a
+     parameter, bit I will be set in this bitmap.  */
+  bitmap partitions_for_parm_default_defs;
 };
 
 /* This is the singleton described above.  */
diff --git a/gcc/tree-ssa-coalesce.c b/gcc/tree-ssa-coalesce.c
index 8af6583..ff75877 100644
--- a/gcc/tree-ssa-coalesce.c
+++ b/gcc/tree-ssa-coalesce.c
@@ -39,7 +39,9 @@ along with GCC; see the file COPYING3.  If not see
 #include "cfgexpand.h"
 #include "explow.h"
 #include "diagnostic-core.h"
-
+#include "tree-dfa.h"
+#include "tm_p.h"
+#include "stor-layout.h"
 
 /* This set of routines implements a coalesce_list.  This is an object which
    is used to track pairs of ssa_names which are desirable to coalesce
@@ -877,26 +879,30 @@ build_ssa_conflict_graph (tree_live_info_p liveinfo)
 	}
 
       /* Pretend there are defs for params' default defs at the start
-	 of the (post-)entry block.  */
+	 of the (post-)entry block.  This will prevent PARM_DECLs from
+	 coalescing into the same partition.  Although RESULT_DECLs'
+	 default defs don't have a useful initial value, we have to
+	 prevent them from coalescing with PARM_DECLs' default defs
+	 too, otherwise assign_parms would attempt to assign different
+	 RTL to the same partition.  */
       if (bb == entry)
 	{
-	  unsigned base;
-	  bitmap_iterator bi;
-	  EXECUTE_IF_SET_IN_BITMAP (live->live_base_var, 0, base, bi)
+	  unsigned i;
+	  for (i = 1; i < num_ssa_names; i++)
 	    {
-	      bitmap_iterator bi2;
-	      unsigned part;
-	      EXECUTE_IF_SET_IN_BITMAP (live->live_base_partitions[base],
-					0, part, bi2)
-		{
-		  tree var = partition_to_var (map, part);
-		  if (!SSA_NAME_VAR (var)
-		      || (TREE_CODE (SSA_NAME_VAR (var)) != PARM_DECL
-			  && TREE_CODE (SSA_NAME_VAR (var)) != RESULT_DECL)
-		      || !SSA_NAME_IS_DEFAULT_DEF (var))
-		    continue;
-		  live_track_process_def (live, var, graph);
-		}
+	      tree var = ssa_name (i);
+
+	      if (!var
+		  || !SSA_NAME_IS_DEFAULT_DEF (var)
+		  || !SSA_NAME_VAR (var)
+		  || VAR_P (SSA_NAME_VAR (var)))
+		continue;
+
+	      live_track_process_def (live, var, graph);
+	      /* Process a use too, so that it remains live and
+		 conflicts with other parms' default defs, even unused
+		 ones.  */
+	      live_track_process_use (live, var);
 	    }
 	}
 
@@ -937,6 +943,71 @@ fail_abnormal_edge_coalesce (int x, int y)
   internal_error ("SSA corruption");
 }
 
+/* Call CALLBACK for all PARM_DECLs and RESULT_DECLs for which
+   assign_parms may ask for a default partition.  */
+
+static void
+for_all_parms (void (*callback)(tree var, void *arg), void *arg)
+{
+  for (tree var = DECL_ARGUMENTS (current_function_decl); var;
+       var = DECL_CHAIN (var))
+    callback (var, arg);
+  if (!VOID_TYPE_P (TREE_TYPE (DECL_RESULT (current_function_decl))))
+    callback (DECL_RESULT (current_function_decl), arg);
+  if (cfun->static_chain_decl)
+    callback (cfun->static_chain_decl, arg);
+}
+
+/* Create a default def for VAR.  */
+
+static void
+create_default_def (tree var, void *arg ATTRIBUTE_UNUSED)
+{
+  if (!is_gimple_reg (var))
+    return;
+
+  tree ssa = get_or_create_ssa_default_def (cfun, var);
+  gcc_assert (ssa);
+}
+
+/* Register VAR's default def in MAP.  */
+
+static void
+register_default_def (tree var, void *map_)
+{
+  var_map map = (var_map)map_;
+
+  if (!is_gimple_reg (var))
+    return;
+
+  tree ssa = ssa_default_def (cfun, var);
+  gcc_assert (ssa);
+
+  register_ssa_partition (map, ssa);
+}
+
+/* If VAR is an SSA_NAME associated with a PARM_DECL or a RESULT_DECL,
+   and the DECL's default def is unused (i.e., it was introduced by
+   create_default_def), mark VAR and the default def for
+   coalescing.  */
+
+static void
+coalesce_with_default (tree var, coalesce_list_p cl, bitmap used_in_copy)
+{
+  if (SSA_NAME_IS_DEFAULT_DEF (var)
+      || !SSA_NAME_VAR (var)
+      || VAR_P (SSA_NAME_VAR (var)))
+    return;
+
+  tree ssa = ssa_default_def (cfun, SSA_NAME_VAR (var));
+  if (!has_zero_uses (ssa))
+    return;
+
+  add_cost_one_coalesce (cl, SSA_NAME_VERSION (ssa), SSA_NAME_VERSION (var));
+  bitmap_set_bit (used_in_copy, SSA_NAME_VERSION (var));
+  /* Default defs will have their used_in_copy bits set at the end of
+     create_outofssa_var_map.  */
+}
 
 /* This function creates a var_map for the current function as well as creating
    a coalesce list for use later in the out of ssa process.  */
@@ -954,8 +1025,12 @@ create_outofssa_var_map (coalesce_list_p cl, bitmap used_in_copy)
   int v1, v2, cost;
   unsigned i;
 
+  for_all_parms (create_default_def, NULL);
+
   map = init_var_map (num_ssa_names);
 
+  for_all_parms (register_default_def, map);
+
   FOR_EACH_BB_FN (bb, cfun)
     {
       tree arg;
@@ -1034,6 +1109,30 @@ create_outofssa_var_map (coalesce_list_p cl, bitmap used_in_copy)
 	      }
 	      break;
 
+	    case GIMPLE_RETURN:
+	      {
+		tree res = DECL_RESULT (current_function_decl);
+		if (VOID_TYPE_P (TREE_TYPE (res))
+		    || !is_gimple_reg (res))
+		  break;
+		tree rhs1 = gimple_return_retval (as_a <greturn *> (stmt));
+		if (!rhs1)
+		  break;
+		tree lhs = ssa_default_def (cfun, res);
+		gcc_assert (lhs);
+		if (TREE_CODE (rhs1) == SSA_NAME
+		    && gimple_can_coalesce_p (lhs, rhs1))
+		  {
+		    v1 = SSA_NAME_VERSION (lhs);
+		    v2 = SSA_NAME_VERSION (rhs1);
+		    cost = coalesce_cost_bb (bb);
+		    add_coalesce (cl, v1, v2, cost);
+		    bitmap_set_bit (used_in_copy, v1);
+		    bitmap_set_bit (used_in_copy, v2);
+		  }
+		break;
+	      }
+
 	    case GIMPLE_ASM:
 	      {
 		gasm *asm_stmt = as_a <gasm *> (stmt);
@@ -1100,10 +1199,13 @@ create_outofssa_var_map (coalesce_list_p cl, bitmap used_in_copy)
       var = ssa_name (i);
       if (var != NULL_TREE && !virtual_operand_p (var))
         {
+	  coalesce_with_default (var, cl, used_in_copy);
+
 	  /* Add coalesces between all the result decls.  */
 	  if (SSA_NAME_VAR (var)
 	      && TREE_CODE (SSA_NAME_VAR (var)) == RESULT_DECL)
 	    {
+	      bitmap_set_bit (used_in_copy, SSA_NAME_VERSION (var));
 	      if (first == NULL_TREE)
 		first = var;
 	      else
@@ -1111,8 +1213,6 @@ create_outofssa_var_map (coalesce_list_p cl, bitmap used_in_copy)
 		  gcc_assert (gimple_can_coalesce_p (var, first));
 		  v1 = SSA_NAME_VERSION (first);
 		  v2 = SSA_NAME_VERSION (var);
-		  bitmap_set_bit (used_in_copy, v1);
-		  bitmap_set_bit (used_in_copy, v2);
 		  cost = coalesce_cost_bb (EXIT_BLOCK_PTR_FOR_FN (cfun));
 		  add_coalesce (cl, v1, v2, cost);
 		}
@@ -1121,7 +1221,9 @@ create_outofssa_var_map (coalesce_list_p cl, bitmap used_in_copy)
 	     since they will have to be coalesced with the base variable.  If
 	     not marked as present, they won't be in the coalesce view. */
 	  if (SSA_NAME_IS_DEFAULT_DEF (var)
-	      && !has_zero_uses (var))
+	      && (!has_zero_uses (var)
+		  || (SSA_NAME_VAR (var)
+		      && !VAR_P (SSA_NAME_VAR (var)))))
 	    bitmap_set_bit (used_in_copy, SSA_NAME_VERSION (var));
 	}
     }
@@ -1367,30 +1469,38 @@ gimple_can_coalesce_p (tree name1, tree name2)
 
       /* We don't want to coalesce two SSA names if one of the base
 	 variables is supposed to be a register while the other is
-	 supposed to be on the stack.  Anonymous SSA names take
-	 registers, but when not optimizing, user variables should go
-	 on the stack, so coalescing them with the anonymous variable
-	 as the partition leader would end up assigning the user
-	 variable to a register.  Don't do that!  */
-      bool reg1 = !var1 || use_register_for_decl (var1);
-      bool reg2 = !var2 || use_register_for_decl (var2);
+	 supposed to be on the stack.  Anonymous SSA names most often
+	 take registers, but when not optimizing, user variables
+	 should go on the stack, so coalescing them with the anonymous
+	 variable as the partition leader would end up assigning the
+	 user variable to a register.  Don't do that!  */
+      bool reg1 = use_register_for_decl (name1);
+      bool reg2 = use_register_for_decl (name2);
       if (reg1 != reg2)
 	return false;
 
-      /* Check that the promoted modes are the same.  We don't want to
-	 coalesce if the promoted modes would be different.  Only
+      /* Check that the promoted modes and unsignedness are the same.
+	 We don't want to coalesce if the promoted modes would be
+	 different, or if they would sign-extend differently.  Only
 	 PARM_DECLs and RESULT_DECLs have different promotion rules,
 	 so skip the test if both are variables, or both are anonymous
-	 SSA_NAMEs.  Now, if a parm or result has BLKmode, do not
-	 coalesce its SSA versions with those of any other variables,
-	 because it may be passed by reference.  */
+	 SSA_NAMEs.  */
+      int unsigned1, unsigned2;
       return ((!var1 || VAR_P (var1)) && (!var2 || VAR_P (var2)))
-	|| (/* The case var1 == var2 is already covered above.  */
-	    !parm_in_stack_slot_p (var1)
-	    && !parm_in_stack_slot_p (var2)
-	    && promote_ssa_mode (name1, NULL) == promote_ssa_mode (name2, NULL));
+	|| ((promote_ssa_mode (name1, &unsigned1)
+	     == promote_ssa_mode (name2, &unsigned2))
+	    && unsigned1 == unsigned2);
     }
 
+  /* If alignment requirements are different, we can't coalesce.  */
+  if (MINIMUM_ALIGNMENT (t1,
+			 var1 ? DECL_MODE (var1) : TYPE_MODE (t1),
+			 var1 ? LOCAL_DECL_ALIGNMENT (var1) : TYPE_ALIGN (t1))
+      != MINIMUM_ALIGNMENT (t2,
+			    var2 ? DECL_MODE (var2) : TYPE_MODE (t2),
+			    var2 ? LOCAL_DECL_ALIGNMENT (var2) : TYPE_ALIGN (t2)))
+    return false;
+
   /* If the types are not the same, check for a canonical type match.  This
      (for example) allows coalescing when the types are fundamentally the
      same, but just have different names. 
@@ -1639,7 +1749,8 @@ coalesce_ssa_name (void)
 	  if (a
 	      && SSA_NAME_VAR (a)
 	      && !DECL_IGNORED_P (SSA_NAME_VAR (a))
-	      && (!has_zero_uses (a) || !SSA_NAME_IS_DEFAULT_DEF (a)))
+	      && (!has_zero_uses (a) || !SSA_NAME_IS_DEFAULT_DEF (a)
+		  || !VAR_P (SSA_NAME_VAR (a))))
 	    {
 	      tree *slot = ssa_name_hash.find_slot (a, INSERT);
 
@@ -1721,3 +1832,47 @@ coalesce_ssa_name (void)
 
   return map;
 }
+
+/* We need to pass two arguments to set_parm_default_def_partition,
+   but for_all_parms only supports one.  Use a pair.  */
+
+typedef std::pair<var_map, bitmap> parm_default_def_partition_arg;
+
+/* Set in ARG's PARTS bitmap the bit corresponding to the partition in
+   ARG's MAP containing VAR's default def.  */
+
+static void
+set_parm_default_def_partition (tree var, void *arg_)
+{
+  parm_default_def_partition_arg *arg = (parm_default_def_partition_arg *)arg_;
+  var_map map = arg->first;
+  bitmap parts = arg->second;
+
+  if (!is_gimple_reg (var))
+    return;
+
+  tree ssa = ssa_default_def (cfun, var);
+  gcc_assert (ssa);
+
+  int version = var_to_partition (map, ssa);
+  gcc_assert (version != NO_PARTITION);
+
+  bool changed = bitmap_set_bit (parts, version);
+  gcc_assert (changed);
+}
+
+/* Allocate and return a bitmap that has a bit set for each partition
+   that contains a default def for a parameter.  */
+
+extern bitmap
+get_parm_default_def_partitions (var_map map)
+{
+  bitmap parm_default_def_parts = BITMAP_ALLOC (NULL);
+
+  parm_default_def_partition_arg
+    arg = std::make_pair (map, parm_default_def_parts);
+
+  for_all_parms (set_parm_default_def_partition, &arg);
+
+  return parm_default_def_parts;
+}
diff --git a/gcc/tree-ssa-coalesce.h b/gcc/tree-ssa-coalesce.h
index ae289b4..8316f34 100644
--- a/gcc/tree-ssa-coalesce.h
+++ b/gcc/tree-ssa-coalesce.h
@@ -22,5 +22,6 @@ along with GCC; see the file COPYING3.  If not see
 
 extern var_map coalesce_ssa_name (void);
 extern bool gimple_can_coalesce_p (tree, tree);
+extern bitmap get_parm_default_def_partitions (var_map);
 
 #endif /* GCC_TREE_SSA_COALESCE_H */
diff --git a/gcc/tree-ssa-live.c b/gcc/tree-ssa-live.c
index e031725..25b548b 100644
--- a/gcc/tree-ssa-live.c
+++ b/gcc/tree-ssa-live.c
@@ -200,7 +200,9 @@ partition_view_init (var_map map)
       tmp = partition_find (map->var_partition, x);
       if (ssa_name (tmp) != NULL_TREE && !virtual_operand_p (ssa_name (tmp))
 	  && (!has_zero_uses (ssa_name (tmp))
-	      || !SSA_NAME_IS_DEFAULT_DEF (ssa_name (tmp))))
+	      || !SSA_NAME_IS_DEFAULT_DEF (ssa_name (tmp))
+	      || (SSA_NAME_VAR (ssa_name (tmp))
+		  && !VAR_P (SSA_NAME_VAR (ssa_name (tmp))))))
 	bitmap_set_bit (used, tmp);
     }
 
@@ -1404,6 +1406,12 @@ verify_live_on_entry (tree_live_info_p live)
 		  }
 		if (ok)
 		  continue;
+		/* Expand adds unused default defs for PARM_DECLs and
+		   RESULT_DECLs.  They're ok.  */
+		if (has_zero_uses (var)
+		    && SSA_NAME_VAR (var)
+		    && !VAR_P (SSA_NAME_VAR (var)))
+		  continue;
 	        num++;
 		print_generic_expr (stderr, var, TDF_SLIM);
 		fprintf (stderr, " is not marked live-on-entry to entry BB%d ",
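
For readers unfamiliar with the idiom, the sketch below shows in isolation how set_parm_default_def_partition and get_parm_default_def_partitions pass two values through the single void * argument of for_all_parms by packing them into a std::pair. It is not GCC code: every toy_* name is hypothetical, with a std::vector standing in for var_map and a std::bitset for bitmap, and it leaves out the is_gimple_reg filter and the gcc_assert checks that the patch performs.

```cpp
#include <bitset>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-ins for illustration only; none of these are GCC types.
using toy_map = std::vector<int>;      // parm index -> partition number
using toy_bitmap = std::bitset<64>;    // one bit per partition
using toy_arg = std::pair<const toy_map *, toy_bitmap *>;

// Call CALLBACK once per "parameter", forwarding one opaque pointer,
// in the same shape as for_all_parms in the patch.
static void
toy_for_all_parms (std::size_t nparms,
                   void (*callback) (std::size_t parm, void *arg), void *arg)
{
  for (std::size_t parm = 0; parm < nparms; parm++)
    callback (parm, arg);
}

// Unpack the pair and set the bit for the partition holding PARM.
static void
toy_set_partition (std::size_t parm, void *arg_)
{
  toy_arg *arg = static_cast<toy_arg *> (arg_);
  const toy_map &map = *arg->first;
  toy_bitmap &parts = *arg->second;

  parts.set (static_cast<std::size_t> (map[parm]));
}

// Build the pair on the stack, walk the parms, and return the bitmap.
static toy_bitmap
toy_get_partitions (const toy_map &map)
{
  toy_bitmap parts;
  toy_arg arg = std::make_pair (&map, &parts);

  toy_for_all_parms (map.size (), toy_set_partition, &arg);

  return parts;
}
```

With a toy map such as {3, 5, 7}, toy_get_partitions returns a bitmap with exactly those three bits set. The pair lives on the caller's stack for the duration of the walk, so nothing needs to be allocated or freed for it.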


-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer

