This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Re: RFA: Reduce iterations of elimination bookkeeping

From: Jeff Law <law at redhat dot com>
To: Vladimir Makarov <vmakarov at redhat dot com>
Cc: gcc-patches <gcc-patches at gcc dot gnu dot org>
Date: Tue, 07 Apr 2009 12:57:09 -0600
Subject: Re: RFA: Reduce iterations of elimination bookkeeping
References: <49D6394D.3020509@redhat.com> <49D675DA.6050708@redhat.com>

Vladimir Makarov wrote:

Jeff, sorry. I also thought the patch would make the compiler faster (it seemed quite obvious to me). But valgrind --tool=lackey actually shows 0.3% more executed insns for -O2 combine.i on x86 after applying the patch. I think the problem may be (at least partially) larger generated code, e.g. for combine.i:

   text    data     bss     dec     hex filename
  83082       0     792   83874   147a2 c0.o
  83098       0     792   83890   147b2 c1.o   <- after the patch

I also see that the compiler allocates more stack space for many functions after applying the patch, even in cases where all stack displacements are the same. I have no idea why that is.

I'm withdrawing the patch. I've probably already spent more time dorking around with it than I should.

I considered just moving the caller-save setup, but that actually is a net loss. The best theory I've got is that setup_save_areas is significantly more costly than elimination bookkeeping. Enough that the extra calls we're making to setup_save_areas are offsetting the savings we're getting from fewer iterations of elimination bookkeeping. Basically we had something like this:

 start:
   elimination bookkeeping
   if (something changed)
     goto start;
   caller-save setup
   if (something changed)
     goto start;
 ...

Note this is (effectively) a two-loop nest with a common header. After my patch it looks like this:

 start:
   caller-save setup
   elimination bookkeeping
   if (something changed)
     goto start;

Note how we've effectively pulled the caller-save setup into the inner loop. Ugh. We could make caller-save setup faster, but as I mentioned, I think I've already spent more time on this than I should.
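To make the trade-off concrete, here is a toy model of the two loop shapes. It is only a sketch under stated assumptions, not the reload code: the names toy_state, toy_counts, run_before and run_after are hypothetical, each pass simply reports "changed" a fixed number of times, and the patched loop is assumed to restart when either pass changed something.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy model, not GCC code.  Each pass reports "changed"
   a fixed number of times and then settles.  */
struct toy_state { int elim_changes_left; int save_changes_left; };
struct toy_counts { int elim_calls; int save_calls; };

/* Stand-in for elimination bookkeeping: count the call, report a change
   while any remain.  */
static bool toy_elim(struct toy_state *s, struct toy_counts *c)
{
  c->elim_calls++;
  if (s->elim_changes_left > 0) { s->elim_changes_left--; return true; }
  return false;
}

/* Stand-in for caller-save setup, modeled the same way.  */
static bool toy_save(struct toy_state *s, struct toy_counts *c)
{
  c->save_calls++;
  if (s->save_changes_left > 0) { s->save_changes_left--; return true; }
  return false;
}

/* Original shape: caller-save setup only runs once elimination
   bookkeeping has stopped changing things.  */
struct toy_counts run_before(struct toy_state s)
{
  struct toy_counts c = { 0, 0 };
  assert(s.elim_changes_left >= 0 && s.save_changes_left >= 0);
  for (;;) {
    if (toy_elim(&s, &c)) continue;   /* goto start */
    if (toy_save(&s, &c)) continue;   /* goto start */
    break;
  }
  return c;
}

/* Patched shape: both passes run on every trip around the loop
   (assumption: a change in either pass restarts the loop).  */
struct toy_counts run_after(struct toy_state s)
{
  struct toy_counts c = { 0, 0 };
  assert(s.elim_changes_left >= 0 && s.save_changes_left >= 0);
  for (;;) {
    bool save_changed = toy_save(&s, &c);
    bool elim_changed = toy_elim(&s, &c);
    if (!save_changed && !elim_changed) break;
  }
  return c;
}
```

With three pending elimination changes and one pending caller-save change, run_before makes 5 elimination calls and 2 caller-save calls, while run_after makes 4 of each. In this model the patch saves one cheap call and adds two expensive ones, which is consistent with the theory above if setup_save_areas really is the costlier pass.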

Regarding the stack alignment bits: we're actually dependent on the multiple alignments right now, so my desire to allocate the alignment slot once clearly won't fly. Too bad, that would have been a definite savings in space and time.

For future reference, the multiple stack alignments occur when we spill a pseudo to memory after aligning the stack. The spill allocates a new stack slot and forces the big loop to iterate. If on that next iteration the stack isn't properly aligned, then we align it again (and again and again if we continue to have to spill pseudos to memory on subsequent iterations).
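The space cost of that re-alignment can be sketched with a small model. This is an illustration only, under simplifying assumptions: the frame is a plain byte count, every iteration spills exactly one slot, and the function names (align_up, frame_align_each_iteration, frame_align_once) are hypothetical rather than anything in reload.

```c
#include <assert.h>
#include <stddef.h>

/* Round n up to a multiple of align (align must be a power of two).  */
static size_t align_up(size_t n, size_t align)
{
  assert(align != 0 && (align & (align - 1)) == 0);
  return (n + align - 1) & ~(align - 1);
}

/* Toy model of the current behavior: align, then on each later
   iteration a spill adds a slot and the frame is aligned again.  */
size_t frame_align_each_iteration(size_t initial, const size_t *spills,
                                  size_t n_spills, size_t align)
{
  size_t frame = align_up(initial, align);
  for (size_t i = 0; i < n_spills; i++) {
    frame += spills[i];              /* new stack slot for the pseudo */
    frame = align_up(frame, align);  /* stack no longer aligned: pad again */
  }
  return frame;
}

/* Toy model of the ideal: allocate every slot first, align once.  */
size_t frame_align_once(size_t initial, const size_t *spills,
                        size_t n_spills, size_t align)
{
  size_t frame = initial;
  for (size_t i = 0; i < n_spills; i++)
    frame += spills[i];
  return align_up(frame, align);
}
```

For a 20-byte initial frame, three 4-byte spills and 16-byte alignment, the first function returns 80 and the second returns 32. The gap is padding added by the repeated alignments, which is the waste a single final alignment would avoid.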

Ideally, we'd align the stack once, after everything has been reloaded and not iterate. The current structure of this code doesn't allow for that possibility.

Jeff

Follow-Ups:
- Re: RFA: Reduce iterations of elimination bookkeeping
  - From: Dave Korn

References:
- RFA: Reduce iterations of elimination bookkeeping
  - From: Jeff Law
- Re: RFA: Reduce iterations of elimination bookkeeping
  - From: Vladimir Makarov
