This is the mail archive of the
mailing list for the GCC project.
Re: PR 23551: why should we coalesce inlined variables?
On Mon, 2007-07-09 at 12:01 -0300, Alexandre Oliva wrote:
> On Jul 9, 2007, Andrew MacLeod <firstname.lastname@example.org> wrote:
> > The only optimization I know of that cares about the base name is out
> > of ssa when it builds conflict graphs for coalescing,
> The point I was getting at is that there's no good reason to do
> coalescing. Generating RTL with one pseudo per SSA is quite natural,
> and then register allocation will take care of all the coalescing,
You are kidding, right? You had better write us a new register allocator
before doing that!
Besides, the coalescing that out-of-ssa does is primarily PHI-node
coalescing. Otherwise you will have a copy in every block for every PHI
argument in a successor block. That is a TON of extra copies you will
then have to issue, and then remove. Your program will become 90+%
a_2 = PHI <a_5(1), a_6(2), a_9(3)>
will generate a copy in each of blocks 1, 2, and 3. Out of ssa coalesces
those all together to simply a_2 (or whatever), since the copies are
non-overlapping and introducing those copies would be a lot of
overhead, and our current back end may very well not lose the copies.
You need to be more practical than that. We are by no means in any state
to allow those kinds of unnecessary copies through. The only way it
would be acceptable is if we SSA-ified the entire back end as well, so
that the PHIs get through. Since the PHIs must be removed, the copies
should be removed as well.
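
To make the PHI example above concrete, here is a minimal Python sketch.
It is not GCC code: the function names, the (argument, block) pairs, and
the conflict set are all hypothetical, and the conflict test is reduced
to "does this argument conflict with the PHI result", which is simpler
than what a real out-of-ssa pass has to check.

```python
# Hypothetical model of the PHI node  a_2 = PHI <a_5(1), a_6(2), a_9(3)>.
# Not GCC code: names and data layout are illustrative assumptions.

def naive_phi_copies(result, args):
    """Without coalescing: one copy at the end of each predecessor block."""
    return {block: f"{result} = {arg}" for arg, block in args}

def coalesced_phi_copies(result, args, conflicts):
    """With PHI coalescing: a copy is only needed when the argument
    conflicts with the result, so the two cannot share one variable."""
    copies = {}
    for arg, block in args:
        if frozenset((result, arg)) in conflicts:
            copies[block] = f"{result} = {arg}"
    return copies

phi_args = [("a_5", 1), ("a_6", 2), ("a_9", 3)]

naive = naive_phi_copies("a_2", phi_args)
coalesced = coalesced_phi_copies("a_2", phi_args, conflicts=set())

print(naive)      # one copy per predecessor block
print(coalesced)  # no copies: all four names share one variable
```

With an empty conflict set, the sketch issues three copies in the naive
case and none in the coalesced case, which is the difference the
paragraph above is describing.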
The rest of the coalesces we do are also done because we do not have
perfect allocation/copy removal in the back end. If and when we do,
then we can start considering something like that. Until then, I see no
reason not to follow a route that can make significant improvements to
the debug info with a minimal amount of impact.
> Which is why I added the /* var = */ comments before the SSAs in my
> latest suggestion (see below). Wouldn't this address your concern?
> How these comments are implemented is not relevant at this point. So
> far I'm just trying to figure out what additional information is
> needed to enable us to generate correct debug info. I haven't put
> much thought into dealing with PHIs, or situations in which an
With SSA, you had better think hard about PHIs; they are key.
> assignment that holds a debug info is fully optimized away. So far
> I'm focusing on single basic blocks, and it's already tricky (see
As I said, it is orthogonal, though. You can do all the debug info work,
and then address the listings. There is no need to go changing the base
variables to a non-name base, especially when it will make out of ssa
explode (or whoever does the coalescing; the issue is the same wherever
it is done).
> > 1 - Add a link for each ssa name to the debug symbol.
> This only works as long as we retain the assignment. I'm trying to
> cover cases in which we don't. As in, the optimized version of
> http://gcc.gnu.org/ml/gcc-patches/2007-07/msg00811.html that I
> envision would be something like:
> /* foo = */ foo_1 = <initial value>;
> 0: /* bar = */ bar_1 = <expr>;
> 1: /* foo = bar_1; */
> 2: /* bar = bar_1 + 1; */
> 3: /* foo = bar_1 - 1; */
> use (bar_1 + 1);
> use (bar_1 - 1);
There is a reason we don't want to do this for this example:
foo_3 = bar_2
bar_4 = bar_2 + 1
foo_5 = foo_3 - 1
If you ignore base variables, and coalesce this, the assignment goes
away, but as I said, we can address things like that by linking
ssa_names to more than one debug symbol.
When the assignment to foo_3 goes away, we are left with:
bar_4 = bar_2 + 1
foo_5 = bar_2 - 1
When we coalesce foo_3 and bar_2 into bar_2, we update the debug list
for bar_2 to be 'foo' and 'bar'.
Then, in out of ssa, we issue the debug info for both 'foo' and 'bar' to
match the range of bar_2. We may have to do some twiddling to detect the
overlap between bar_4 and bar_2, but we will have some twiddling to do
with multiple names anyway. I claim it's a place to start.
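
As a rough illustration of the "debug list" idea above, here is a small
Python sketch. It is only a model under assumptions: the dictionary from
SSA names to sets of debug symbols and the coalesce() helper are
hypothetical and do not correspond to any existing GCC data structure.

```python
# Hypothetical sketch: each SSA name carries a set of debug symbols.
debug_syms = {
    "bar_2": {"bar"},
    "foo_3": {"foo"},
    "bar_4": {"bar"},
    "foo_5": {"foo"},
}

def coalesce(debug_syms, keep, drop):
    """Merge `drop` into `keep`; `keep` then describes both user variables.
    Returns a new mapping and leaves the input untouched."""
    merged = dict(debug_syms)
    merged[keep] = merged[keep] | merged.pop(drop)
    return merged

after = coalesce(debug_syms, keep="bar_2", drop="foo_3")
print(sorted(after["bar_2"]))  # ['bar', 'foo']
```

After the merge, whatever location is chosen for bar_2 would be reported
for both 'foo' and 'bar' over bar_2's range.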
This comes at a cost. NOW the live range of bar_2 goes past bar_4, so NOW
bar_2 and bar_4 have to be different variables and cannot be coalesced
together to 'bar' anymore. So we are now using a third variable:
bar.1 = bar + 1
foo = bar - 1
In this simple example it's possible to further optimize that and make
the third variable go away, but it's also just as possible that you can't,
or that you miss the opportunity, since all the tree optimizations have run.
And if they are going to find this case, they might just as well have
found the copy in the original case and reduced it to the same result.
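
The live-range argument above can be checked with a small Python sketch.
This is a simplified model, not GCC's conflict graph: it handles only
straight-line code, the helper names are hypothetical, and it assumes
that a value last used in the statement that defines another name does
not conflict with that name.

```python
import re

def live_ranges(stmts, live_in=()):
    """Map each name to [def_index, last_use_index] in straight-line code.
    Names in `live_in` are treated as defined at index -1."""
    ranges = {name: [-1, -1] for name in live_in}
    for i, stmt in enumerate(stmts):
        lhs, rhs = (side.strip() for side in stmt.split("=", 1))
        for name in re.findall(r"[a-z]+_\d+", rhs):
            ranges[name][1] = i      # extend the range to this use
        ranges[lhs] = [i, i]         # a definition starts a new range
    return ranges

def conflict(ranges, a, b):
    """True if one name is still live after the other is defined."""
    (da, ua), (db, ub) = ranges[a], ranges[b]
    return da <= db < ua or db <= da < ub

before = live_ranges(
    ["foo_3 = bar_2", "bar_4 = bar_2 + 1", "foo_5 = foo_3 - 1"],
    live_in=("bar_2",),
)
after = live_ranges(
    ["bar_4 = bar_2 + 1", "foo_5 = bar_2 - 1"],
    live_in=("bar_2",),
)

print(conflict(before, "bar_2", "bar_4"))  # False: both can become 'bar'
print(conflict(after, "bar_2", "bar_4"))   # True: a third variable is needed
```

In the first sequence bar_2 dies at the statement that defines bar_4, so
the two can share 'bar'. Once the copy is removed, bar_2 is still read
after bar_4 is defined, which is the conflict described above.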