This is the mail archive of the
gcc@gcc.gnu.org
mailing list for the GCC project.
Re: CPROP and spec2000 slowdown in mid february
- To: Richard Kenner <kenner at vlsi1 dot ultra dot nyu dot edu>
- Subject: Re: CPROP and spec2000 slowdown in mid february
- From: Jan Hubicka <jh at suse dot cz>
- Date: Sat, 28 Apr 2001 20:53:37 +0200
- Cc: jh at suse dot cz, gcc at gcc dot gnu dot org
- References: <10104281356.AA26577@vlsi1.ultra.nyu.edu>
> Currently GCSE and CPROP identify _much_ more opportunities for
> optimization than previously, but due to the somewhat crazy try_replace_reg
> implementation they throw away most of these.
>
> True, but what I'm confused about is why that change would make things
> *worse* since it does identify many more opportunities and uses
> slightly *more* of them. Can you explain the performance regression?
I tried to explain the problem below.
The problem is that try_replace_reg believes that it did the replacement, but
due to missing bits in simplify_replace_rtx and forgetting about set
destinations it doesn't.
So the insn is reported as CPROPed, but it isn't.
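To make the failure mode concrete, here is a minimal sketch in C of a toy
model. It is not GCC code: struct toy_insn and the toy_* functions are
hypothetical names, and the "insn" is reduced to two register uses, one inside
the SET destination address and one in the source. It only illustrates the
difference between claiming success and checking for it.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for an insn like (set (mem (reg A)) (reg B)).
   Hypothetical type for illustration only, not GCC's rtx.  */
struct toy_insn
{
  int dest_addr_reg;   /* register used inside the destination address */
  int src_reg;         /* register used as the source */
};

/* Count how many times register REGNO is still used by INSN.  */
int
toy_count_uses (const struct toy_insn *insn, int regno)
{
  return (insn->dest_addr_reg == regno) + (insn->src_reg == regno);
}

/* Optimistic variant: looks only at the source and always claims
   success, so a use left in the destination goes unnoticed.  */
bool
toy_replace_optimistic (struct toy_insn *insn, int from, int to)
{
  if (insn->src_reg == from)
    insn->src_reg = to;
  return true;
}

/* Checked variant: also rewrites the use in the destination and
   reports success only if something was replaced and no use of
   FROM remains afterwards.  */
bool
toy_replace_checked (struct toy_insn *insn, int from, int to)
{
  int before = toy_count_uses (insn, from);

  assert (from != to);
  if (insn->dest_addr_reg == from)
    insn->dest_addr_reg = to;
  if (insn->src_reg == from)
    insn->src_reg = to;
  return before > 0 && toy_count_uses (insn, from) == 0;
}
```

With the optimistic variant, an insn whose only use of the register sits in
the destination is counted as propagated although nothing changed; the checked
variant reports exactly what happened.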
> I was trying to do this in the new functions I (later) added in simplify-rtx.c
> and would suggest that those are probably slightly better starting places.
> We should keep validate_replace_rtx as just the very simple case meant
> to replace one register with another.
Yes, I guessed that this was probably your goal. Could you explain to
me the main rationale for having two completely different
infrastructure bits, one designed to substitute a register by a register
and the other to substitute a register by a constant?
Currently validate_replace_rtx is used by many parts of the compiler to replace
an expression by a constant, a register or even a different expression, and I
believe we should share as much code as possible and not fork such usages.
A function like the one I've suggested, which calls your callback on each
subexpression and "fixes up" the RTL to be correct after substituting, should
IMO be good for all purposes and reasonably cheap and simple to implement, which
makes it worthwhile. (In fact I had it implemented in my tree, but the source
probably got lost, as it was more than a year ago.)
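A rough sketch of what such a callback-driven walker could look like, written
in C against a toy expression tree. Everything here is an assumption made for
illustration: struct toy_expr, toy_for_each_subexpr, toy_fixup and
toy_replace_reg are hypothetical names, not GCC interfaces, and the only
"fixup" shown is folding a PLUS of two constants.

```c
#include <assert.h>
#include <stddef.h>

enum toy_code { TOY_REG, TOY_CONST, TOY_PLUS, TOY_MEM };

/* Toy expression node (hypothetical, not GCC's rtx).  */
struct toy_expr
{
  enum toy_code code;
  int value;                  /* register number or constant value */
  struct toy_expr *op[2];     /* operands; unused slots are NULL */
};

/* Callback: may overwrite *LOC; returns nonzero if it did.  */
typedef int (*toy_subst_fn) (struct toy_expr **loc, void *data);

/* Fix up X after a change below it: fold (plus (const a) (const b))
   into a single constant, in place.  Returns nonzero if folded.  */
int
toy_fixup (struct toy_expr *x)
{
  if (x->code == TOY_PLUS
      && x->op[0] != NULL && x->op[1] != NULL
      && x->op[0]->code == TOY_CONST && x->op[1]->code == TOY_CONST)
    {
      x->value = x->op[0]->value + x->op[1]->value;
      x->code = TOY_CONST;
      x->op[0] = x->op[1] = NULL;
      return 1;
    }
  return 0;
}

/* Call FN on *LOC and, unless FN replaced it, on every operand below.
   Nodes are modified in place and nothing is copied.  A node is fixed
   up only when something beneath it changed.  Returns the number of
   replacements made by FN.  */
int
toy_for_each_subexpr (struct toy_expr **loc, toy_subst_fn fn, void *data)
{
  int changes, i;

  assert (loc != NULL && *loc != NULL);
  changes = fn (loc, data);
  if (changes)
    return changes;           /* do not recurse into the replacement */
  for (i = 0; i < 2; i++)
    if ((*loc)->op[i] != NULL)
      changes += toy_for_each_subexpr (&(*loc)->op[i], fn, data);
  if (changes)
    toy_fixup (*loc);
  return changes;
}

/* Example callback data: replace register REGNO by REPLACEMENT.  */
struct toy_reg_subst
{
  int regno;
  struct toy_expr *replacement;
};

int
toy_replace_reg (struct toy_expr **loc, void *data)
{
  struct toy_reg_subst *s = data;

  if ((*loc)->code == TOY_REG && (*loc)->value == s->regno)
    {
      *loc = s->replacement;
      return 1;
    }
  return 0;
}
```

Because the walker recurses over the generic operand slots, a MEM is handled
like any other code, and the return value tells the caller whether a
replacement really took place.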
Your implementation IMO has two problems. Most importantly, it works
way too hard, making an extra copy of the whole rtx even if there is not a
single change (because everything gets called through simplify_*_rtx).
The second problem is that, as currently implemented, it stops recursing at
the first unknown element. The one I caught was a memory expression, but there
are many more. Getting it to recurse over generic rtxes is somewhat tricky.
The validate_replace_rtx scheme doesn't have such problems.
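For illustration, one way to avoid the needless copy is to allocate only along
the path to an actual change and return the original pointer otherwise. The C
sketch below works on a toy tree and is neither what simplify_replace_rtx nor
what validate_replace_rtx does; struct cow_expr, cow_make and cow_subst are
hypothetical names, and freeing the nodes is omitted to keep it short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

enum cow_code { COW_REG, COW_CONST, COW_PLUS, COW_MEM };

/* Toy immutable expression node (hypothetical, not GCC's rtx).  */
struct cow_expr
{
  enum cow_code code;
  int value;                       /* register number or constant */
  const struct cow_expr *op[2];    /* operands; unused slots are NULL */
};

int cow_allocations;               /* number of nodes allocated so far */

/* Allocate a new node.  Nodes are never freed in this sketch.  */
const struct cow_expr *
cow_make (enum cow_code code, int value,
          const struct cow_expr *op0, const struct cow_expr *op1)
{
  struct cow_expr *x = malloc (sizeof *x);

  assert (x != NULL);
  x->code = code;
  x->value = value;
  x->op[0] = op0;
  x->op[1] = op1;
  cow_allocations++;
  return x;
}

/* Return X with every use of register REGNO replaced by TO.
   Unchanged subtrees are shared, and X itself is returned (with no
   allocation at all) when REGNO does not occur in it.  The recursion
   goes through the generic operand slots, so it does not stop at
   MEM or any other code.  */
const struct cow_expr *
cow_subst (const struct cow_expr *x, int regno, const struct cow_expr *to)
{
  const struct cow_expr *new_op[2];
  int i, changed = 0;

  if (x->code == COW_REG && x->value == regno)
    return to;
  for (i = 0; i < 2; i++)
    {
      new_op[i] = x->op[i] ? cow_subst (x->op[i], regno, to) : NULL;
      changed |= (new_op[i] != x->op[i]);
    }
  return changed ? cow_make (x->code, x->value, new_op[0], new_op[1]) : x;
}
```

A caller can compare the result with the input pointer to learn whether any
substitution happened, and the no-change case costs only the traversal.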
>
> In the longer term perhaps even replace combine's subst function by this
> beast, if Richard Kenner succeeds with his plan of making combine's
> simplifiers independent of the rest of the infrastructure.
>
> That is certainly something I plan to keep working on and may be getting back
> to it to take the next step soon.
That would be wonderful. The only thing I am afraid of is a performance
regression in combine. Currently combine IMO spends most of its time doing
substitutions, and these will get much more costly by going through memory
allocation, memset, re-filling of members etc.
Honza