This is the mail archive of the
mailing list for the GCC project.
Re: [patch] cse.c: Speed up cse_reg_info maintainance - Part 2
- From: bernds_cb1 at t-online dot de (Bernd Schmidt)
- To: Kazu Hirata <kazu at cs dot umass dot edu>
- Cc: gcc-patches at gcc dot gnu dot org
- Date: Mon, 31 Jan 2005 15:38:51 +0100
- Subject: Re: [patch] cse.c: Speed up cse_reg_info maintainance - Part 2
- References: <email@example.com>
Kazu Hirata wrote:
Attached is the second patch to speed up cse_reg_info maintenance.
A collection of cse_reg_info instances forms a mapping from a register
number to a quantity number among other things. Since this mapping is
from an integer to a struct, you might guess that it is implemented as
an array indexed by a register number, but that's not the case.
Instead, CSE has a fixed-sized hash table which maps a register number
to a linked list of cse_reg_info entries with the same hash key. If
you are looking for information associated with a register, you have
to go through one of these collision chains.
This patch simplifies all this by implementing the collection as a
single big array of cse_reg_info called cse_reg_info_table indexed by
a register number. For example, if you are looking for a quantity
number associated with (reg:xx 37), you would say
Funny. This is more or less what this code looked like in older
versions of gcc, except it was using several arrays instead of a
structure. Then, in
Mark changed it to use a splay tree for speed reasons, and subsequently
that got changed to a hash table.
Mark's reason for his change was overhead in new_basic_block, where we
had an O(max_reg) operation to reset these vectors. That problem is
also addressed by the sequence number trick you are using in your patch,
so your change should not reintroduce the problem Mark was seeing.
Very simple. Unfortunately the story doesn't end here. Here are
problems that we have to address:
1. We don't know the highest register number at compile time of GCC.
Uhh, can't we just allocate the array once at the start of cse_main
(where we know the highest current register number, NREGS) and
deallocate it when we're done? That shouldn't be too much overhead, and
we wouldn't keep stuff allocated while we're not using it.
It's probably reasonable to assume that the set of registers numbers
is compact. I expect most dead code to be eliminated at tree level.
Also, most constants should be propagated at tree level, so we
shouldn't have many pseudo registers that are created but gone due to
DCE, constant propagation, etc. This property should even improve the
Also, since we're currently not freeing cse_reg_info entries, total
memory usage should not be much (if at all) increased, if the assumption
is correct that the register numbers which are actually used aren't too