This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]

CSE memory leaks and old problems


So while doing some more cleanups, I noticed two things:

1. CSE never frees the memory for the "struct table_elt *table"
   elements, which it xmallocs.
2. We call flush_hash_table every 1000 insns.

flush_hash_table, as one would thing, flushes the hash table, but the
table remove functions it uses just throws the entries on a free list
(free_element_chain).

We *never* actually free them.
And if we had < 1000 insns in the function, we never flushed the hash
table, and put those things on the free list, either.

This is a hard to notice problem, because we invalidate enough, and it
seems we have workarounds in place specifically to deal with invalid
info in the table, so that you never notice the effect of not
flushing.

In fact, even though nothing says it specifically, I get the feeling
this not flushing is done on purpose.

However, even if this is the case, if we never compile functions >
1000 insns, we'll never flush the table.  Which seems, well, wrong.

Whether we *should* be doing this on purpose is another story.
Changing it to flush the table at the end of cse_main unconditionally,
and then free the free element chain elements, reduces memory usage
while not increasing compile time.

Anyone know the story on this?

Similarly, in relation to #2 above, the comment above the flush every
1000 insns said that this was a kludge to prevent quadratic behavior,
and that it would be fixed for 2.9.

I removed this flush, and can't see the quadratic behavior it's
referring to.  I tried functions with 15k insns in them, and compile
time stayed exactly the same (+- 1 percent) whether we flushed every
1000 insns or not.   gprof showed cse_insn not changing in terms of
what percent of the time it makes up.
Generated code for the 15k insn functions generally had one less
register used than before, sometimes a few more (no more than 3).
Other than that, it hand't changed (except, obviously, for the extra
register moves eliminated and such).

Given all of this, and the fact that CSE was not really
understandable, I up and started replacing it with a cselib based
pass.
Since we don't need the jump following anymore (PRE and SSA CCP can
both take care of this)
It almost works, too. 
I believe the part i'm missing is the renumbering of registers in the
cselib hash table after an update.

If someone wants to take this code, and just make it work (it'll
compile stuff, but generate incorrect code due to the lack of
renumbering), we can replace CSE easily.



int
cse_main (f, nregs, after_loop, file)
     rtx f;
     int nregs;
     int after_loop;
     FILE *file;
{
  rtx insn;
  init_alias_analysis();
  cselib_init();
      for (insn = get_insns (); insn ; insn = NEXT_INSN (insn))
	{
	  if (INSN_P (insn))
	    {
	      rtx temp;
	      temp = simplify_rtx (PATTERN (insn));
	      if (temp != NULL)
		PATTERN (insn) = temp;
	    }
	  if (INSN_P (insn) && GET_CODE (PATTERN (insn)) == SET)
	    {
	      rtx pattern;
	      cselib_val *val;
	      pattern = PATTERN (insn);
	      val = 
		cselib_lookup (SET_SRC (pattern), 
			       GET_MODE (SET_DEST (pattern)), 0);
	      if (val != NULL)
		{
		  int found_temp = 0;
		  struct elt_loc_list *curr=val->locs;		      
		  while (curr)
		    {
		      if (curr->temp != NULL)
			{
			  if ( validate_replace_src (SET_SRC (pattern), 
						     curr->temp, insn))
			    {
			      if (file)
				fprintf (file, 
					 "CSE replacing value at location %d with old temporary %d\n",
					 INSN_UID (insn), 
					 REGNO (curr->temp));
			      found_temp = 1;
			      break;
			    }
			}
		      curr =  curr->next;
		    }
		  if (!found_temp)
		    {
		      /* Generate a new temporary, change the existing
		       * locs to use it */
		      rtx newreg;
		      rtx gobefore;
		      rtx set;
		      curr = val->locs;
		      while (curr->next != NULL)
			{
			  curr = curr->next;
			}
		      gobefore = curr->setting_insn;
		      set = single_set (curr->setting_insn);
		      newreg = gen_reg_rtx (GET_MODE (SET_DEST (set)));
		      cselib_update_varray_sizes();
		      if (file)
			fprintf (file, "CSE generated new temporary %d\n", 
				 REGNO (newreg));
		      
		      if (!(GET_CODE (PATTERN (gobefore)) == SET))
			{
			  
			  /* IF it's a parallel or sequence, replace
			     the reg in the set in the sequence, and emit an
			     insn after it to set the original reg, so we
			     don't have to make a new parallel/sequence
			     here. */
			  rtx origdest;
			  origdest = SET_DEST (set);
			  if (!validate_change (gobefore, &SET_DEST (set), newreg, 0))
			    abort();
			  if (!emit_insn_after (gen_rtx_SET (GET_MODE (SET_DEST (set)),  
							     origdest, newreg), gobefore))
			    abort();
			  
			}
		      else
			{
			  
			  if (!emit_insn_before (gen_rtx_SET 
						 (GET_MODE (SET_DEST (set)), 
						  newreg, SET_SRC (set)), 
						 gobefore))
			    abort();
			}
		      
		      
		      
		      curr = val->locs;
		      while (curr)
			{
			  set = single_set (curr->setting_insn);
			  if (validate_replace_src (SET_SRC (set), 
						    newreg, 
						    curr->setting_insn))
			    {
			      
			      if (file)
				fprintf (file, 
					 "CSE replaced src in %d with temporary %d\n", 
					 INSN_UID (curr->setting_insn), REGNO (newreg));
			      curr->temp = newreg;
			    }
			  curr = curr->next;
			}
		    }
		}
	    }
	  cselib_process_insn (insn);
	}
  cselib_finish();
  end_alias_analysis();
  return 1;
}





-- 
"I hate it when my foot falls asleep during the day because that
means it's going to be up all night.
"-Steven Wright


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]