Bug 9997 - Coelesce stack slots for disjoint scopes.
Summary: Coelesce stack slots for disjoint scopes.
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: middle-end (show other bugs)
Version: unknown
: P3 enhancement
Target Milestone: 4.0.0
Assignee: Richard Henderson
URL:
Keywords: missed-optimization
: 15400 (view as bug list)
Depends on:
Blocks: 16269
  Show dependency treegraph
 
Reported: 2003-03-08 01:06 UTC by Richard Henderson
Modified: 2004-11-15 00:02 UTC (History)
10 users (show)

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2004-04-04 07:32:46


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Richard Henderson 2003-03-08 01:06:00 UTC
Given

struct big { char x[1024]; };
extern void bar(struct big *);

void foo(void)
{
  {
    struct big one;
    bar(&one);
  }
  {
    struct big one;
    bar(&one);
  }
}

we should allocate only one structure on the stack.

Release:
all

Environment:
any
Comment 1 Wolfgang Bangerth 2003-03-08 06:11:52 UTC
State-Changed-From-To: open->analyzed
State-Changed-Why: Confirmed. I think it is a usual observation that gcc often
    allocates far too much stack space, which often enough isn't
    even used -- see, for example, PRs 8935 and 8936 (in
    particular the latter, without optimizations).
    
    W.
Comment 2 Daniel Jacobowitz 2003-03-08 12:54:07 UTC
From: Daniel Jacobowitz <drow@mvista.com>
To: rth@redhat.com
Cc: gcc-gnats@gcc.gnu.org
Subject: Re: middle-end/9997: Coelesce stack slots for disjoint scopes.
Date: Sat, 8 Mar 2003 12:54:07 -0500

 On Sat, Mar 08, 2003 at 12:56:24AM -0000, rth@redhat.com wrote:
 > 
 > >Number:         9997
 > >Category:       middle-end
 > >Synopsis:       Coelesce stack slots for disjoint scopes.
 > >Confidential:   no
 > >Severity:       non-critical
 > >Priority:       medium
 > >Responsible:    unassigned
 > >State:          open
 > >Class:          change-request
 > >Submitter-Id:   net
 > >Arrival-Date:   Sat Mar 08 01:06:00 UTC 2003
 > >Closed-Date:
 > >Last-Modified:
 > >Originator:     Richard Henderson
 > >Release:        all
 > >Organization:
 > >Environment:
 > any
 > >Description:
 > Given
 > 
 > struct big { char x[1024]; };
 > extern void bar(struct big *);
 > 
 > void foo(void)
 > {
 >   {
 >     struct big one;
 >     bar(&one);
 >   }
 >   {
 >     struct big one;
 >     bar(&one);
 >   }
 > }
 > 
 > we should allocate only one structure on the stack.
 
 Here's some interesting numbers I got for this testcase.  HEAD+stack is
 the -fstack-reorg patch posted some time ago.
 
 Compiler	Flags			Stack size
 
 gcc 3.2.2	-O0			2068
 gcc 3.2.2	-O2			2068
 gcc HEAD+stack	-O0			2072
 gcc HEAD+stack	-O2			2072
 gcc HEAD+stack	-O0 -fstack-reorg	2072
 gcc HEAD+stack	-O2 -fstack-reorg	2072
 
 Here's a patch I've been playing with:
 gcc HEAD+patch	-O0			1048
 gcc HEAD+patch	-O2			1044
 
 The patch bootstrapped on i686-pc-linux-gnu, and I'm running make check
 now.  I think it's correct; basically, when we close the SCOPE_EXPR,
 mark all stack slots associated with locals as dead.
 
 -- 
 Daniel Jacobowitz
 MontaVista Software                         Debian GNU/Linux Developer
 
 2003-03-08  Daniel Jacobowitz  <drow@mvista.com>
 
 	* c-semantics.c (genrtl_scope_stmt): Call free_named_temp_slots.
 	* expr.c (expand_expr): Update calls to assign_stack_temp_for_type.
 	* function.c (struct temp_slot): Add DECL.
 	(assign_stack_temp_for_type): Add DECL argument; save it in the
 	temp_slot.
 	(assign_stack_temp): Update call to assign_stack_temp_for_type.
 	(assign_temp): Pass the DECL to assign_stack_temp_for_type.
 	(free_named_stack_slots): New function.
 	* rtl.h (assign_stack_temp_for_type): Update prototype.
 	* tree.h (assign_stack_temp_for_type): Add prototype.
 
 Index: c-semantics.c
 ===================================================================
 RCS file: /cvs/gcc/gcc/gcc/c-semantics.c,v
 retrieving revision 1.53
 diff -u -p -r1.53 c-semantics.c
 --- c-semantics.c	22 Feb 2003 00:32:02 -0000	1.53
 +++ c-semantics.c	8 Mar 2003 17:48:53 -0000
 @@ -622,7 +622,11 @@ genrtl_scope_stmt (t)
        if (SCOPE_BEGIN_P (t))
  	expand_start_bindings_and_block (2 * SCOPE_NULLIFIED_P (t), block);
        else if (SCOPE_END_P (t))
 -	expand_end_bindings (NULL_TREE, !SCOPE_NULLIFIED_P (t), 0);
 +	{
 +	  expand_end_bindings (NULL_TREE, !SCOPE_NULLIFIED_P (t), 0);
 +	  if (block)
 +	    free_named_temp_slots (block);
 +	}
      }
    else if (!SCOPE_NULLIFIED_P (t))
      {
 Index: expr.c
 ===================================================================
 RCS file: /cvs/gcc/gcc/gcc/expr.c,v
 retrieving revision 1.510
 diff -u -p -r1.510 expr.c
 --- expr.c	5 Mar 2003 00:12:40 -0000	1.510
 +++ expr.c	8 Mar 2003 17:48:55 -0000
 @@ -7863,7 +7863,8 @@ expand_expr (exp, target, tmode, modifie
  	    target
  	      = assign_stack_temp_for_type
  		(TYPE_MODE (inner_type),
 -		 GET_MODE_SIZE (TYPE_MODE (inner_type)), 0, inner_type);
 +		 GET_MODE_SIZE (TYPE_MODE (inner_type)), 0, inner_type,
 +		 NULL_TREE);
  
  	  emit_move_insn (target, op0);
  	  op0 = target;
 @@ -7887,7 +7888,8 @@ expand_expr (exp, target, tmode, modifie
  		= MAX (int_size_in_bytes (inner_type),
  		       (HOST_WIDE_INT) GET_MODE_SIZE (TYPE_MODE (type)));
  	      rtx new = assign_stack_temp_for_type (TYPE_MODE (type),
 -						    temp_size, 0, type);
 +						    temp_size, 0, type,
 +						    NULL_TREE);
  	      rtx new_with_op0_mode = adjust_address (new, GET_MODE (op0), 0);
  
  	      if (TREE_ADDRESSABLE (exp))
 @@ -9189,7 +9191,8 @@ expand_expr (exp, target, tmode, modifie
  		   : int_size_in_bytes (inner_type),
  		   1, build_qualified_type (inner_type,
  					    (TYPE_QUALS (inner_type)
 -					     | TYPE_QUAL_CONST)));
 +					     | TYPE_QUAL_CONST)),
 +		   NULL_TREE);
  
  	      if (TYPE_ALIGN_OK (inner_type))
  		abort ();
 Index: function.c
 ===================================================================
 RCS file: /cvs/gcc/gcc/gcc/function.c,v
 retrieving revision 1.405
 diff -u -p -r1.405 function.c
 --- function.c	4 Mar 2003 06:20:12 -0000	1.405
 +++ function.c	8 Mar 2003 17:48:57 -0000
 @@ -165,7 +165,8 @@ static GTY(()) varray_type sibcall_epilo
  
     Automatic variables are also assigned temporary slots, at the nesting
     level where they are defined.  They are marked a "kept" so that
 -   free_temp_slots will not free them.  */
 +   free_temp_slots will not free them, although a decl is associated
 +   with each so that it can be freed when the decl goes out of scope.  */
  
  struct temp_slot GTY(())
  {
 @@ -201,6 +202,9 @@ struct temp_slot GTY(())
    /* The size of the slot, including extra space for alignment.  This
       info is for combine_temp_slots.  */
    HOST_WIDE_INT full_size;
 +  /* The decl associated with this stack slot, for automatic variables.  This
 +     info is for free_named_temp_slots.  */
 +  tree decl;
  };
  
  /* This structure is used to record MEMs or pseudos used to replace VAR, any
 @@ -649,14 +653,17 @@ assign_stack_local (mode, size, align)
     if we are to allocate something at an inner level to be treated as
     a variable in the block (e.g., a SAVE_EXPR).
  
 -   TYPE is the type that will be used for the stack slot.  */
 +   TYPE is the type that will be used for the stack slot.
 +
 +   DECL is the decl that will be associated with the stack slot.  */
  
  rtx
 -assign_stack_temp_for_type (mode, size, keep, type)
 +assign_stack_temp_for_type (mode, size, keep, type, decl)
       enum machine_mode mode;
       HOST_WIDE_INT size;
       int keep;
       tree type;
 +     tree decl;
  {
    unsigned int align;
    struct temp_slot *p, *best_p = 0;
 @@ -789,6 +796,7 @@ assign_stack_temp_for_type (mode, size, 
    p->addr_taken = 0;
    p->rtl_expr = seq_rtl_expr;
    p->type = type;
 +  p->decl = decl;
  
    if (keep == 2)
      {
 @@ -838,7 +846,7 @@ assign_stack_temp (mode, size, keep)
       HOST_WIDE_INT size;
       int keep;
  {
 -  return assign_stack_temp_for_type (mode, size, keep, NULL_TREE);
 +  return assign_stack_temp_for_type (mode, size, keep, NULL_TREE, NULL_TREE);
  }
  
  /* Assign a temporary.
 @@ -904,7 +912,7 @@ assign_temp (type_or_decl, keep, memory_
  	  size = 1;
  	}
  
 -      tmp = assign_stack_temp_for_type (mode, size, keep, type);
 +      tmp = assign_stack_temp_for_type (mode, size, keep, type, decl);
        return tmp;
      }
  
 @@ -1177,6 +1185,23 @@ preserve_temp_slots (x)
    for (p = temp_slots; p; p = p->next)
      if (p->in_use && p->level == temp_slot_level && ! p->keep)
        p->level--;
 +}
 +
 +/* For every VAR_DECL in BLOCK, mark the associated temporary slot as dead
 +   so that it can be reused.  */
 +
 +void
 +free_named_temp_slots (block)
 +     tree block;
 +{
 +  struct temp_slot *p;
 +  tree decl;
 +
 +  for (decl = BLOCK_VARS (block); decl; decl = TREE_CHAIN (decl))
 +    if (TREE_CODE (decl) == VAR_DECL)
 +      for (p = temp_slots; p; p = p->next)
 +	if (p->decl == decl)
 +	    p->in_use = 0;
  }
  
  /* X is the result of an RTL_EXPR.  If it is a temporary slot associated
 Index: rtl.h
 ===================================================================
 RCS file: /cvs/gcc/gcc/gcc/rtl.h,v
 retrieving revision 1.386
 diff -u -p -r1.386 rtl.h
 --- rtl.h	28 Feb 2003 07:06:33 -0000	1.386
 +++ rtl.h	8 Mar 2003 17:48:57 -0000
 @@ -1459,7 +1459,8 @@ extern rtx assign_stack_local		PARAMS ((
  extern rtx assign_stack_temp		PARAMS ((enum machine_mode,
  					       HOST_WIDE_INT, int));
  extern rtx assign_stack_temp_for_type	PARAMS ((enum machine_mode,
 -						 HOST_WIDE_INT, int, tree));
 +						 HOST_WIDE_INT, int, tree,
 +						 tree));
  extern rtx assign_temp			PARAMS ((tree, int, int, int));
  /* In emit-rtl.c */
  extern rtx emit_insn_before		PARAMS ((rtx, rtx));
 Index: tree.h
 ===================================================================
 RCS file: /cvs/gcc/gcc/gcc/tree.h,v
 retrieving revision 1.385
 diff -u -p -r1.385 tree.h
 --- tree.h	2 Mar 2003 21:18:15 -0000	1.385
 +++ tree.h	8 Mar 2003 17:49:00 -0000
 @@ -2999,6 +2999,7 @@ extern void pop_temp_slots		PARAMS ((voi
  extern void push_temp_slots		PARAMS ((void));
  extern void preserve_temp_slots		PARAMS ((rtx));
  extern void preserve_rtl_expr_temps	PARAMS ((tree));
 +extern void free_named_temp_slots	PARAMS ((tree));
  extern int aggregate_value_p		PARAMS ((tree));
  extern void free_temps_for_rtl_expr	PARAMS ((tree));
  extern void instantiate_virtual_regs	PARAMS ((tree, rtx));

Comment 3 Daniel Jacobowitz 2003-03-08 15:22:00 UTC
From: Daniel Jacobowitz <drow@mvista.com>
To: rth@redhat.com, gcc-gnats@gcc.gnu.org, gcc-bugs@gcc.gnu.org
Cc:  
Subject: Re: middle-end/9997: Coelesce stack slots for disjoint scopes.
Date: Sat, 8 Mar 2003 15:22:00 -0500

 On Sat, Mar 08, 2003 at 12:54:06PM -0500, Daniel Jacobowitz wrote:
 > On Sat, Mar 08, 2003 at 12:56:24AM -0000, rth@redhat.com wrote:
 > > 
 > > >Number:         9997
 > > >Category:       middle-end
 > > >Synopsis:       Coelesce stack slots for disjoint scopes.
 > > >Confidential:   no
 > > >Severity:       non-critical
 > > >Priority:       medium
 > > >Responsible:    unassigned
 > > >State:          open
 > > >Class:          change-request
 > > >Submitter-Id:   net
 > > >Arrival-Date:   Sat Mar 08 01:06:00 UTC 2003
 > > >Closed-Date:
 > > >Last-Modified:
 > > >Originator:     Richard Henderson
 > > >Release:        all
 > > >Organization:
 > > >Environment:
 > > any
 > > >Description:
 > > Given
 > > 
 > > struct big { char x[1024]; };
 > > extern void bar(struct big *);
 > > 
 > > void foo(void)
 > > {
 > >   {
 > >     struct big one;
 > >     bar(&one);
 > >   }
 > >   {
 > >     struct big one;
 > >     bar(&one);
 > >   }
 > > }
 > > 
 > > we should allocate only one structure on the stack.
 > 
 > Here's some interesting numbers I got for this testcase.  HEAD+stack is
 > the -fstack-reorg patch posted some time ago.
 > 
 > Compiler	Flags			Stack size
 > 
 > gcc 3.2.2	-O0			2068
 > gcc 3.2.2	-O2			2068
 > gcc HEAD+stack	-O0			2072
 > gcc HEAD+stack	-O2			2072
 > gcc HEAD+stack	-O0 -fstack-reorg	2072
 > gcc HEAD+stack	-O2 -fstack-reorg	2072
 > 
 > Here's a patch I've been playing with:
 > gcc HEAD+patch	-O0			1048
 > gcc HEAD+patch	-O2			1044
 > 
 > The patch bootstrapped on i686-pc-linux-gnu, and I'm running make check
 > now.  I think it's correct; basically, when we close the SCOPE_EXPR,
 > mark all stack slots associated with locals as dead.
 
 It's not quite right, since it re-introduces the failure at
 gcc.c-torture/execute/20020320-1.c.  It probably just needs to be
 tweaked for statement expressions though.
 
 > 
 > -- 
 > Daniel Jacobowitz
 > MontaVista Software                         Debian GNU/Linux Developer
 > 
 > 2003-03-08  Daniel Jacobowitz  <drow@mvista.com>
 > 
 > 	* c-semantics.c (genrtl_scope_stmt): Call free_named_temp_slots.
 > 	* expr.c (expand_expr): Update calls to assign_stack_temp_for_type.
 > 	* function.c (struct temp_slot): Add DECL.
 > 	(assign_stack_temp_for_type): Add DECL argument; save it in the
 > 	temp_slot.
 > 	(assign_stack_temp): Update call to assign_stack_temp_for_type.
 > 	(assign_temp): Pass the DECL to assign_stack_temp_for_type.
 > 	(free_named_stack_slots): New function.
 > 	* rtl.h (assign_stack_temp_for_type): Update prototype.
 > 	* tree.h (assign_stack_temp_for_type): Add prototype.
 > 
 > Index: c-semantics.c
 > ===================================================================
 > RCS file: /cvs/gcc/gcc/gcc/c-semantics.c,v
 > retrieving revision 1.53
 > diff -u -p -r1.53 c-semantics.c
 > --- c-semantics.c	22 Feb 2003 00:32:02 -0000	1.53
 > +++ c-semantics.c	8 Mar 2003 17:48:53 -0000
 > @@ -622,7 +622,11 @@ genrtl_scope_stmt (t)
 >        if (SCOPE_BEGIN_P (t))
 >  	expand_start_bindings_and_block (2 * SCOPE_NULLIFIED_P (t), block);
 >        else if (SCOPE_END_P (t))
 > -	expand_end_bindings (NULL_TREE, !SCOPE_NULLIFIED_P (t), 0);
 > +	{
 > +	  expand_end_bindings (NULL_TREE, !SCOPE_NULLIFIED_P (t), 0);
 > +	  if (block)
 > +	    free_named_temp_slots (block);
 > +	}
 >      }
 >    else if (!SCOPE_NULLIFIED_P (t))
 >      {
 > Index: expr.c
 > ===================================================================
 > RCS file: /cvs/gcc/gcc/gcc/expr.c,v
 > retrieving revision 1.510
 > diff -u -p -r1.510 expr.c
 > --- expr.c	5 Mar 2003 00:12:40 -0000	1.510
 > +++ expr.c	8 Mar 2003 17:48:55 -0000
 > @@ -7863,7 +7863,8 @@ expand_expr (exp, target, tmode, modifie
 >  	    target
 >  	      = assign_stack_temp_for_type
 >  		(TYPE_MODE (inner_type),
 > -		 GET_MODE_SIZE (TYPE_MODE (inner_type)), 0, inner_type);
 > +		 GET_MODE_SIZE (TYPE_MODE (inner_type)), 0, inner_type,
 > +		 NULL_TREE);
 >  
 >  	  emit_move_insn (target, op0);
 >  	  op0 = target;
 > @@ -7887,7 +7888,8 @@ expand_expr (exp, target, tmode, modifie
 >  		= MAX (int_size_in_bytes (inner_type),
 >  		       (HOST_WIDE_INT) GET_MODE_SIZE (TYPE_MODE (type)));
 >  	      rtx new = assign_stack_temp_for_type (TYPE_MODE (type),
 > -						    temp_size, 0, type);
 > +						    temp_size, 0, type,
 > +						    NULL_TREE);
 >  	      rtx new_with_op0_mode = adjust_address (new, GET_MODE (op0), 0);
 >  
 >  	      if (TREE_ADDRESSABLE (exp))
 > @@ -9189,7 +9191,8 @@ expand_expr (exp, target, tmode, modifie
 >  		   : int_size_in_bytes (inner_type),
 >  		   1, build_qualified_type (inner_type,
 >  					    (TYPE_QUALS (inner_type)
 > -					     | TYPE_QUAL_CONST)));
 > +					     | TYPE_QUAL_CONST)),
 > +		   NULL_TREE);
 >  
 >  	      if (TYPE_ALIGN_OK (inner_type))
 >  		abort ();
 > Index: function.c
 > ===================================================================
 > RCS file: /cvs/gcc/gcc/gcc/function.c,v
 > retrieving revision 1.405
 > diff -u -p -r1.405 function.c
 > --- function.c	4 Mar 2003 06:20:12 -0000	1.405
 > +++ function.c	8 Mar 2003 17:48:57 -0000
 > @@ -165,7 +165,8 @@ static GTY(()) varray_type sibcall_epilo
 >  
 >     Automatic variables are also assigned temporary slots, at the nesting
 >     level where they are defined.  They are marked a "kept" so that
 > -   free_temp_slots will not free them.  */
 > +   free_temp_slots will not free them, although a decl is associated
 > +   with each so that it can be freed when the decl goes out of scope.  */
 >  
 >  struct temp_slot GTY(())
 >  {
 > @@ -201,6 +202,9 @@ struct temp_slot GTY(())
 >    /* The size of the slot, including extra space for alignment.  This
 >       info is for combine_temp_slots.  */
 >    HOST_WIDE_INT full_size;
 > +  /* The decl associated with this stack slot, for automatic variables.  This
 > +     info is for free_named_temp_slots.  */
 > +  tree decl;
 >  };
 >  
 >  /* This structure is used to record MEMs or pseudos used to replace VAR, any
 > @@ -649,14 +653,17 @@ assign_stack_local (mode, size, align)
 >     if we are to allocate something at an inner level to be treated as
 >     a variable in the block (e.g., a SAVE_EXPR).
 >  
 > -   TYPE is the type that will be used for the stack slot.  */
 > +   TYPE is the type that will be used for the stack slot.
 > +
 > +   DECL is the decl that will be associated with the stack slot.  */
 >  
 >  rtx
 > -assign_stack_temp_for_type (mode, size, keep, type)
 > +assign_stack_temp_for_type (mode, size, keep, type, decl)
 >       enum machine_mode mode;
 >       HOST_WIDE_INT size;
 >       int keep;
 >       tree type;
 > +     tree decl;
 >  {
 >    unsigned int align;
 >    struct temp_slot *p, *best_p = 0;
 > @@ -789,6 +796,7 @@ assign_stack_temp_for_type (mode, size, 
 >    p->addr_taken = 0;
 >    p->rtl_expr = seq_rtl_expr;
 >    p->type = type;
 > +  p->decl = decl;
 >  
 >    if (keep == 2)
 >      {
 > @@ -838,7 +846,7 @@ assign_stack_temp (mode, size, keep)
 >       HOST_WIDE_INT size;
 >       int keep;
 >  {
 > -  return assign_stack_temp_for_type (mode, size, keep, NULL_TREE);
 > +  return assign_stack_temp_for_type (mode, size, keep, NULL_TREE, NULL_TREE);
 >  }
 >  
 >  /* Assign a temporary.
 > @@ -904,7 +912,7 @@ assign_temp (type_or_decl, keep, memory_
 >  	  size = 1;
 >  	}
 >  
 > -      tmp = assign_stack_temp_for_type (mode, size, keep, type);
 > +      tmp = assign_stack_temp_for_type (mode, size, keep, type, decl);
 >        return tmp;
 >      }
 >  
 > @@ -1177,6 +1185,23 @@ preserve_temp_slots (x)
 >    for (p = temp_slots; p; p = p->next)
 >      if (p->in_use && p->level == temp_slot_level && ! p->keep)
 >        p->level--;
 > +}
 > +
 > +/* For every VAR_DECL in BLOCK, mark the associated temporary slot as dead
 > +   so that it can be reused.  */
 > +
 > +void
 > +free_named_temp_slots (block)
 > +     tree block;
 > +{
 > +  struct temp_slot *p;
 > +  tree decl;
 > +
 > +  for (decl = BLOCK_VARS (block); decl; decl = TREE_CHAIN (decl))
 > +    if (TREE_CODE (decl) == VAR_DECL)
 > +      for (p = temp_slots; p; p = p->next)
 > +	if (p->decl == decl)
 > +	    p->in_use = 0;
 >  }
 >  
 >  /* X is the result of an RTL_EXPR.  If it is a temporary slot associated
 > Index: rtl.h
 > ===================================================================
 > RCS file: /cvs/gcc/gcc/gcc/rtl.h,v
 > retrieving revision 1.386
 > diff -u -p -r1.386 rtl.h
 > --- rtl.h	28 Feb 2003 07:06:33 -0000	1.386
 > +++ rtl.h	8 Mar 2003 17:48:57 -0000
 > @@ -1459,7 +1459,8 @@ extern rtx assign_stack_local		PARAMS ((
 >  extern rtx assign_stack_temp		PARAMS ((enum machine_mode,
 >  					       HOST_WIDE_INT, int));
 >  extern rtx assign_stack_temp_for_type	PARAMS ((enum machine_mode,
 > -						 HOST_WIDE_INT, int, tree));
 > +						 HOST_WIDE_INT, int, tree,
 > +						 tree));
 >  extern rtx assign_temp			PARAMS ((tree, int, int, int));
 >  /* In emit-rtl.c */
 >  extern rtx emit_insn_before		PARAMS ((rtx, rtx));
 > Index: tree.h
 > ===================================================================
 > RCS file: /cvs/gcc/gcc/gcc/tree.h,v
 > retrieving revision 1.385
 > diff -u -p -r1.385 tree.h
 > --- tree.h	2 Mar 2003 21:18:15 -0000	1.385
 > +++ tree.h	8 Mar 2003 17:49:00 -0000
 > @@ -2999,6 +2999,7 @@ extern void pop_temp_slots		PARAMS ((voi
 >  extern void push_temp_slots		PARAMS ((void));
 >  extern void preserve_temp_slots		PARAMS ((rtx));
 >  extern void preserve_rtl_expr_temps	PARAMS ((tree));
 > +extern void free_named_temp_slots	PARAMS ((tree));
 >  extern int aggregate_value_p		PARAMS ((tree));
 >  extern void free_temps_for_rtl_expr	PARAMS ((tree));
 >  extern void instantiate_virtual_regs	PARAMS ((tree, rtx));
 
 -- 
 Daniel Jacobowitz
 MontaVista Software                         Debian GNU/Linux Developer

Comment 4 Jim Wilson 2003-03-09 10:52:33 UTC
From: Jim Wilson <wilson@tuliptree.org>
To: gcc-gnats@gcc.gnu.org
Cc:  
Subject: Re: middle-end/9997: Coelesce stack slots for disjoint scopes.
Date: Sun, 09 Mar 2003 10:52:33 -0500

 I believe we have tried this in the past, and ran into aliasing
 problems.  Suppose you have two variables of different types defined in
 different scopes.  Suppose the scheduler reorders instructions causing
 the scopes to overlap.  Now you do an alias test, and the alias code
 says that two references can't possibly alias because they have
 different types, but they do, because they were allocated to the same
 stack slot.  You can't rely on a test for the same address, because we
 may have lost address info in the RTL.
 
 We have different and better alias code now, so perhaps there is a way
 to address this.  Maybe if we add scope nesting level info to the alias
 info, and then assume that locals alias if they are in different scopes
 with a common parent, and neither is a parent of the other.
 
 Jim
 

Comment 5 Jim Wilson 2003-03-09 11:32:47 UTC
From: Jim Wilson <wilson@tuliptree.org>
To: Daniel Jacobowitz <drow@mvista.com>
Cc: gcc-gnats@gcc.gnu.org
Subject: Re: middle-end/9997: Coelesce stack slots for disjoint scopes.
Date: Sun, 09 Mar 2003 11:32:47 -0500

 This patch seems more complicated than necessary.  expand_end_bindings 
 already calls pop_temp_slots.  So all you have to do is arrange for 
 locals to have the right temp_slot_level.
 
 I tried looking at this.  We are calling preserve_stack_temps, and that 
 preserves all temp slots whose address was taken.  This means a local of 
 array type won't be freed when its scope terminates.  The question then 
 is whether this is really necessary to preserve values whose address was 
       taken.
 
 This code is used to preserve values that might be needed by an 
 expression statement result.  In this case, the result is const0_rtx. 
 This code has a test for !MEM.  This is apparently to handle the case 
 where the result is in a REG that might contain an address.  However, 
 !MEM also includes constants, and constants can't possibly contain the 
 address of a stack local.  I think this problem would go away if we 
 fixed this code to do nothing for constant results.  The problematic 
 code is here:
    /* If X is not in memory or is at a constant address, it cannot be in
       a temporary slot, but it can contain something whose address was
       taken.  */
    if (p == 0 && (GET_CODE (x) != MEM || CONSTANT_P (XEXP (x, 0))))
 We should do nothing here if x is a constant, but there is no code for that.
 
 Jim
 

Comment 6 Daniel Jacobowitz 2003-03-09 15:45:47 UTC
From: Daniel Jacobowitz <drow@mvista.com>
To: Jim Wilson <wilson@tuliptree.org>
Cc: gcc-gnats@gcc.gnu.org
Subject: Re: middle-end/9997: Coelesce stack slots for disjoint scopes.
Date: Sun, 9 Mar 2003 15:45:47 -0500

 On Sun, Mar 09, 2003 at 11:32:47AM -0500, Jim Wilson wrote:
 > This patch seems more complicated than necessary.  expand_end_bindings 
 > already calls pop_temp_slots.  So all you have to do is arrange for 
 > locals to have the right temp_slot_level.
 > 
 > I tried looking at this.  We are calling preserve_stack_temps, and that 
 > preserves all temp slots whose address was taken.  This means a local of 
 > array type won't be freed when its scope terminates.  The question then 
 > is whether this is really necessary to preserve values whose address was 
 >      taken.
 > 
 > This code is used to preserve values that might be needed by an 
 > expression statement result.  In this case, the result is const0_rtx. 
 > This code has a test for !MEM.  This is apparently to handle the case 
 > where the result is in a REG that might contain an address.  However, 
 > !MEM also includes constants, and constants can't possibly contain the 
 > address of a stack local.  I think this problem would go away if we 
 > fixed this code to do nothing for constant results.  The problematic 
 > code is here:
 >   /* If X is not in memory or is at a constant address, it cannot be in
 >      a temporary slot, but it can contain something whose address was
 >      taken.  */
 >   if (p == 0 && (GET_CODE (x) != MEM || CONSTANT_P (XEXP (x, 0))))
 > We should do nothing here if x is a constant, but there is no code for that.
 
 I think I was looking at free_temp_slots instead of pop_temp_slots, so
 I just assumed p->keep was causing the problem; I didn't realize that
 pop_temp_slots overrode keep.  Thanks.  Your suggestion does fix
 Richard's test, so I bootstrapped and tested it on i686-pc-linux-gnu. 
 The patch is below; however, it causes one new failure:
 
 FAIL: g++.dg/opt/alias2.C execution test
 
 The interesting segment of the diff is:
 -       movl    %eax, -56(%ebp)
 -       movl    -24(%ebp), %eax
 -       movl    %eax, -44(%ebp)
 +       movl    %eax, -24(%ebp)
 +       movl    %eax, -12(%ebp)
         call    _ZN11_Deque_baseD2Ev
 
 I'm not quite sure what's going on.  0x123 ends up in eax instead of, I
 think, the "this" pointer.
 
 -- 
 Daniel Jacobowitz
 MontaVista Software                         Debian GNU/Linux Developer
 
 2003-03-09  Daniel Jacobowitz  <drow@mvista.com>
 
 	* function.c (preserve_temp_slots): If X is constant, don't
 	preserve temp slots.  Suggested by Jim Wilson.
 
 Index: function.c
 ===================================================================
 RCS file: /cvs/gcc/gcc/gcc/function.c,v
 retrieving revision 1.405
 diff -u -p -r1.405 function.c
 --- function.c	4 Mar 2003 06:20:12 -0000	1.405
 +++ function.c	9 Mar 2003 16:52:29 -0000
 @@ -1132,6 +1132,10 @@ preserve_temp_slots (x)
        return;
      }
  
 +  /* If X is a constant, then we don't need to preserve stack slots.  */
 +  if (CONSTANT_P (x))
 +    return;
 +
    /* If X is a register that is being used as a pointer, see if we have
       a temporary slot we know it points to.  To be consistent with
       the code below, we really should preserve all non-kept slots

Comment 7 Jim Wilson 2003-03-09 22:12:03 UTC
From: Jim Wilson <wilson@tuliptree.org>
To: Daniel Jacobowitz <drow@mvista.com>
Cc: gcc-gnats@gcc.gnu.org
Subject: Re: middle-end/9997: Coelesce stack slots for disjoint scopes.
Date: 09 Mar 2003 22:12:03 -0500

 On Sun, 2003-03-09 at 15:45, Daniel Jacobowitz wrote:
 > FAIL: g++.dg/opt/alias2.C execution test
 
 I wasn't able to reproduce this with a quick stage1 build.  This might
 take a little time to track down.  I am leaving on a trip tomorrow, so I
 may not be able to look at this until late next week.
 
 Jim
 
 

Comment 8 Jim Wilson 2003-04-29 23:12:45 UTC
From: Jim Wilson <wilson@tuliptree.org>
To: Daniel Jacobowitz <drow@mvista.com>
Cc: gcc-gnats@gcc.gnu.org
Subject: Re: middle-end/9997: Coelesce stack slots for disjoint scopes.
Date: 29 Apr 2003 23:12:45 -0400

 --=-CssSwvW0ZmvCLU9WCSS5
 Content-Type: text/plain
 Content-Transfer-Encoding: 7bit
 
 On Sun, 2003-03-09 at 15:45, Daniel Jacobowitz wrote:
 > On Sun, Mar 09, 2003 at 11:32:47AM -0500, Jim Wilson wrote:
 > The patch is below; however, it causes one new failure:
 > FAIL: g++.dg/opt/alias2.C execution test
 
 This is failing because we have two objects on the stack, which are
 allocated to overlapping slots, and which have overlapping lifetimes.
 When we try to copy fields from one to the other, we accidentally
 clobber a field in the first one before we read it.
 
 This is obvious using the 2003-04-14 snapshot.  I get:
        movl    -36(%ebp), %eax
        movl    %eax, -24(%ebp)
        /* movl  -24(%ebp), %eax deleted by optimizer as redundant */
        movl    %eax, -12(%ebp)
 Note that the objects are 12 bytes apart, and we are copying two fields
 which are 12 bytes apart.  Thus the first copy clobbers the source of
 the second copy before we read it.
 
 The same problem exists in current sources, but is less obvious, because
 the two objects are not 12 bytes apart, and thus they still overlap, but
 it works by accident.
         movl    -36(%ebp), %eax
         movl    %eax, -40(%ebp)
         movl    -24(%ebp), %eax
         movl    %eax, -28(%ebp)
 
 The failure occurs because we have a TARGET_EXPR with a cleanup.  In
 genrtl_stmt_expr_value, we call expand_start_target_temps which calls
 push_temp_slots.  The current temp_slot_level is stored in
 target_temp_slot_level.  Then we call expand_stmt_expr_value which
 evaluates the stmt expr.  Along the way, we evaluate a TARGET_EXPR which
 creates a stack level at target_temp_slot_level, and gives it a cleanup.
 When we get to the end of expand_stmt_expr_value, we call
 free_temp_slots.  Since there were no push_temp_slots calls along the
 way, target_temp_slot_level == temp_slot_level, and this frees the slot
 allocated to the TARGET_EXPR.  Then when we call the cleanup
 (destructor), we are reading from a freed stack slot, and we have a
 problem.
 
 It appears that we need another set of push_temp_slot/pop_temp_slot
 calls here, or else we need to mark the TARGET_EXPR stack slot as kept
 until the end of the binding level.  I have mostly convinced myself that
 marking it as kept is the right solution.
 
 It appears that this was broken when the C++ front end was converted to
 generate an entire function as a tree.  This was such a large change
 that I don't think it makes any sense to try to track down exactly why
 or how this got broken.
 
 This gives me the following patch.  I haven't tried testing it yet,
 other than against the alias2.C testcase.  Since this has one part that
 reduces stack space usage and one part that increases stack space usage,
 we need to check to see what kind of effect this has on C++
 performance.  Hopefully it doesn't make things worse.  It seems OK for
 the alias2.C testcase.  Main uses the same stack space before and after
 the patch.
 
 Jim
 
 
 --=-CssSwvW0ZmvCLU9WCSS5
 Content-Disposition: attachment; filename=tmp.file.2
 Content-Transfer-Encoding: quoted-printable
 Content-Type: text/x-patch; name=tmp.file.2; charset=UTF-8
 
 Index: function.c
 =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
 =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
 =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D
 RCS file: /cvs/gcc/gcc/gcc/function.c,v
 retrieving revision 1.421
 diff -p -r1.421 function.c
 *** function.c	22 Apr 2003 12:09:09 -0000	1.421
 --- function.c	30 Apr 2003 05:36:43 -0000
 *************** assign_stack_temp_for_type (mode, size,=20
 *** 796,802 ****
     if (keep =3D=3D 2)
       {
         p->level =3D target_temp_slot_level;
 !       p->keep =3D 0;
       }
     else if (keep =3D=3D 3)
       {
 --- 796,802 ----
     if (keep =3D=3D 2)
       {
         p->level =3D target_temp_slot_level;
 !       p->keep =3D 1;
       }
     else if (keep =3D=3D 3)
       {
 *************** preserve_temp_slots (x)
 *** 1134,1139 ****
 --- 1134,1143 ----
  =20
         return;
       }
 +=20
 +   /* If X is a constant, then we don't need to preserve stack slots.=C2=
 =A0 */
 +   if (CONSTANT_P (x))
 +     return;
  =20
     /* If X is a register that is being used as a pointer, see if we have
        a temporary slot we know it points to.  To be consistent with
 
 --=-CssSwvW0ZmvCLU9WCSS5--
 

Comment 9 Jim Wilson 2003-04-29 23:19:28 UTC
From: Jim Wilson <wilson@tuliptree.org>
To: Daniel Jacobowitz <drow@mvista.com>
Cc: gcc-gnats@gcc.gnu.org
Subject: Re: middle-end/9997: Coelesce stack slots for disjoint scopes.
Date: 29 Apr 2003 23:19:28 -0400

 While looking at this, I noticed that the var_temp_slot_level stuff
 seems to be broken.  In expand_expr, case SAVE_EXPR, we call assign_temp
 with keep == 3.  This ends up using var_temp_slot_level.  However, there
 is no place in the compiler that ever sets var_temp_slot_level.  It is
 initialized to 0, so all SAVE_TEMP temps end up at level 0, which means
 they live until the end of the function.  This can't be right.  I
 suspect this was also broken by the functions-as-trees stuff.  I haven't
 looked into this yet.  We can perhaps open a new PR for this problem.
 
 Jim
Comment 10 Andrew Pinski 2003-07-10 02:54:11 UTC
This looks fixed on the tree-ssa branch.
Comment 11 Andrew Pinski 2003-12-11 02:24:28 UTC
For some reason this is no longer fixed on the tree-ssa but it could be and should be able 
to be fixed.
Comment 12 Andrew Pinski 2004-05-12 17:33:49 UTC
*** Bug 15400 has been marked as a duplicate of this bug. ***
Comment 13 Mike LeBlanc 2004-05-12 17:44:35 UTC
Would someone please assign this?  This bug can severely impact systems with limited memory.
Comment 14 Wolfgang Bangerth 2004-05-12 18:11:26 UTC
Rather than looking just at the scopes of variables, this should properly  
be fixed by looking at the live ranges of variables.  
  
W.  
Comment 15 Jim Wilson 2004-05-12 22:54:58 UTC
The only issue with the patch from Daniel Jacobowitz was a problem with
TARGET_EXPR, and I wasn't sure how to fix that.  Meanwhile, Ulrich Weigand
checked in a patch May 1 to fix an aliasing problem with TARGET_EXPRs that was
apparently related to the problem I identified last year, as his patch is a
superset of mine.  Therefore, it may be the case that Daniel's patch is OK now.
 It needs retesting.

The SAVE_EXPR/var_temp_slot_level stuff is still broken.  Someone needs to look
into that.
Comment 16 Jim Wilson 2004-05-12 23:00:16 UTC
Subject: Re:  Coelesce stack slots for disjoint scopes.

On Wed, 2004-05-12 at 10:44, leblanc at skycomputers dot com wrote:
> Would someone please assign this?  This bug can severely impact systems with limited memory.

See the patch in comment #6.  It should work, but this stuff is tricky,
as there is no way to prove that this patch is safe.  We may just have
to try it and see what happens.
Comment 17 Daniel Jacobowitz 2004-05-12 23:47:42 UTC
Subject: Re:  Coelesce stack slots for disjoint scopes.

On Wed, May 12, 2004 at 10:55:00PM -0000, wilson at gcc dot gnu dot org wrote:
> 
> ------- Additional Comments From wilson at gcc dot gnu dot org  2004-05-12 22:54 -------
> The only issue with the patch from Daniel Jacobowitz was a problem with
> TARGET_EXPR, and I wasn't sure how to fix that.  Meanwhile, Ulrich Weigand
> checked in a patch May 1 to fix an aliasing problem with TARGET_EXPRs that was
> apparently related to the problem I identified last year, as his patch is a
> superset of mine.  Therefore, it may be the case that Daniel's patch is OK now.
>  It needs retesting.
> 
> The SAVE_EXPR/var_temp_slot_level stuff is still broken.  Someone needs to look
> into that.

I'll try to take a look at these after the merge.

Comment 18 naveens 2004-07-07 16:09:14 UTC
 > we should allocate only one structure on the stack.
 > Here's some interesting numbers I got for this testcase.  HEAD+stack is
 > the -fstack-reorg patch posted some time ago.
 
A small clarification.
-fstack-reorg doesn't affect stack height significantly. It had a slightly 
different objective (i.e. reorganizing frame to reduce address arithemtic
for processors with limited offsets). Stack height reduction,
if any, occurs due only due alignment holes elimination. (not in this
case anyway)

I realize that work can be used to reduce stack height as well.
Myself and Rakesh Kumar plan to work on this problem shortly.

If the proposed patch (for this PR) works, it is fine, otherwise we can look 
into this. However, it seems it will not be a bug-fixing exercise, 
because this is actually a design deficiency.

Cannot give a commitment on timeline though.
Comment 19 Jim Wilson 2004-07-08 02:05:09 UTC
A related but slightly different problem was exposed by a testcase from Ian
Taylor.  The message with the testcase is here:
    http://gcc.gnu.org/ml/gcc/2004-06/msg01827.html
My analysis is here:
    http://gcc.gnu.org/ml/gcc/2004-06/msg01913.html

Briefly, expand_builtin returns a value even when the ignore parameter is true.
 This confuses other code.
Comment 20 Richard Henderson 2004-09-02 07:15:22 UTC
Working on it.  Finally.
Comment 21 CVS Commits 2004-09-03 23:50:15 UTC
Subject: Bug 9997

CVSROOT:	/cvs/gcc
Module name:	gcc
Changes by:	rth@gcc.gnu.org	2004-09-03 23:50:11

Modified files:
	gcc            : ChangeLog Makefile.in cfgexpand.c 

Log message:
	PR middle-end/9997
	* cfgexpand.c (LOCAL_ALIGNMENT): Provide default.
	(STACK_ALIGNMENT_NEEDED, FRAME_GROWS_DOWNWARD): Likewise.
	(struct stack_var, EOC, stack_vars, stack_vars_alloc, stack_vars_num,
	stack_vars_sorted, stack_vars_conflict, stack_vars_conflict_alloc,
	frame_phase, get_decl_align_unit, add_stack_var, triangular_index,
	resize_stack_vars_conflict, add_stack_var_conflict,
	stack_var_conflict_p, add_alias_set_conflicts, stack_var_size_cmp,
	union_stack_vars, partition_stack_vars, dump_stack_var_partition,
	expand_one_stack_var_at, expand_stack_vars, expand_one_stack_var,
	expand_one_static_var, expand_one_hard_reg_var,
	expand_one_register_var, expand_one_error_var, defer_stack_allocation,
	expand_one_var, expand_used_vars_for_block, clear_tree_used): New.
	(expand_used_vars): Rewrite.
	* Makefile.in (cfgexpand.o): Update dependencies.

Patches:
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/ChangeLog.diff?cvsroot=gcc&r1=2.5243&r2=2.5244
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/Makefile.in.diff?cvsroot=gcc&r1=1.1367&r2=1.1368
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/cfgexpand.c.diff?cvsroot=gcc&r1=2.18&r2=2.19

Comment 22 Richard Henderson 2004-09-03 23:52:44 UTC
http://gcc.gnu.org/ml/gcc-patches/2004-09/msg00440.html