This is the mail archive of the
mailing list for the GCC project.
Re: gcc compile-time performance
- From: law at redhat dot com
- To: "David S. Miller" <davem at redhat dot com>
- Cc: gcc at gcc dot gnu dot org, bernds at redhat dot com
- Date: Sun, 19 May 2002 11:11:08 -0600
- Subject: Re: gcc compile-time performance
- Reply-to: law at redhat dot com
In message <firstname.lastname@example.org>, "David S. Miller" writes:
> The biggest offender on x86 for some of my profiles seems to be
> for_each_rtx(), which actually stems a tiny bit from stupidity in
> for_each_rtx (some recursion can be eliminated) and a lot of stupidity
> in CSE.
> CSE's approx_reg_cost() is the true problem... That thing gets called
> on every expression CSE inserts into its tables, every address that
> the cost is computed for, etc. (which is "a lot"). What's more it is
> implemented stupidly (construct a reg set just to check for bits set,
> ummm why not just do the counter bumping in the for_each_rtx helper
> function, duh...)
> Furthermore, when approx_reg_cost does get a REG, it should just
> return -1 to for_each_rtx so it doesn't look into subexpressions
> of the REG (translating into 1 or 2 useless recursive calls to
> for_each_rtx depending upon whether simple recursion has been
> eliminated from for_each_rtx).
> Finally, the whole hardreg counting thing is just to see if there
> is ONE hard reg present in the expression. This also only matters
> on SMALL_REGISTER_CLASSES machines, so we can just return '1' from
> approx_reg_cost_1() if we see a hard reg on a S_R_C target and check
> that in the top-level return from for_each_rtx().
> I've begun to hack up most of this... patch below in case anyone
> wants to play along at home.
> One big disappointment is that, because approx_reg_cost runs on
> just about any RTX, we can't use note_uses() just like gcse.c
> does to find REGs. note_uses only works on the toplevel pattern
> of an INSN.
> Even with my fixes below, for_each_rtx still is the third function
> listed in the x86 profiles (right under memset() and cse_insn()).
> I started to look into the memset() issues, but got distracted
> when I noticed this approx_reg_cost() business...
> Basically, I discovered that the compiler spends most of its time
> computing a heuristic... I think that speaks for itself :-)
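For anyone following along, here is a minimal sketch in C of the three ideas
quoted above: bump a counter in the walker callback instead of building a
register set, return -1 on a REG so the walker does not descend into it, and
return a nonzero value to stop the walk as soon as a hard register shows up on
a SMALL_REGISTER_CLASSES target. This is not the GCC code. The node type,
walk(), count_regs_1(), approx_cost(), FIRST_PSEUDO and EXPENSIVE_COST are all
hypothetical stand-ins, and walk() only imitates the for_each_rtx() return
convention. One simplification to keep in mind: the sketch counts register
occurrences, whereas a register set would count distinct registers.

```c
#include <stddef.h>

/* Toy expression tree; a hypothetical stand-in for RTL, not GCC's rtx. */
enum node_kind { NODE_REG, NODE_PLUS, NODE_MEM, NODE_CONST };

struct node {
    enum node_kind kind;
    int regno;              /* meaningful only for NODE_REG */
    struct node *ops[2];    /* operands, NULL when absent */
};

#define FIRST_PSEUDO 16     /* assumption: regnos below this are hard regs */
#define EXPENSIVE_COST 1000 /* assumption: arbitrary "too costly" value */

typedef int (*walk_fn)(struct node *, void *);

/* Walker imitating the for_each_rtx() convention:
   0 = keep going, -1 = skip this node's operands, other = stop and return. */
static int walk(struct node *x, walk_fn f, void *data)
{
    int i, r;

    if (x == NULL)
        return 0;
    r = f(x, data);
    if (r == -1)
        return 0;           /* operands skipped, traversal continues */
    if (r != 0)
        return r;           /* abort the whole traversal */
    for (i = 0; i < 2; i++) {
        r = walk(x->ops[i], f, data);
        if (r != 0)
            return r;
    }
    return 0;
}

struct cost_info {
    int regs_seen;               /* bumped directly in the callback */
    int small_register_classes;  /* nonzero on an S_R_C-like target */
};

static int count_regs_1(struct node *x, void *data)
{
    struct cost_info *info = data;

    if (x->kind != NODE_REG)
        return 0;
    if (x->regno < FIRST_PSEUDO && info->small_register_classes)
        return 1;           /* one hard reg is enough; stop right here */
    info->regs_seen++;
    return -1;              /* nothing useful below a REG */
}

static int approx_cost(struct node *x, int small_register_classes)
{
    struct cost_info info = { 0, small_register_classes };

    if (walk(x, count_regs_1, &info) != 0)
        return EXPENSIVE_COST;
    return info.regs_seen;
}
```

In this sketch, a PLUS of two pseudo registers costs 2, and a MEM whose
address is a hard register costs EXPENSIVE_COST only when the
small_register_classes flag is set.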
[ ... ]
Whoops. It is this change that gets me 1-2%.
The change to cselib causes my PAs to hang in cselib_invalidate_regno.