This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Re: Gcc 3.1 performance regressions with respect to 2.95.3
- From: David Edelsohn <dje at watson dot ibm dot com>
- To: Dale Johannesen <dalej at apple dot com>
- Cc: law at redhat dot com, Michael Matz <matzmich at cs dot tu-berlin dot de>, gcc-patches at gcc dot gnu dot org
- Date: Tue, 26 Mar 2002 13:50:45 -0500
- Subject: Re: Gcc 3.1 performance regressions with respect to 2.95.3
>>>>> Dale Johannesen writes:
Dale> I don't think the problem has anything to do with clobbers per se; it's
Dale> more general than that. I have half a dozen bug reports of cases where
Dale> the scheduler pass1 makes the code worse by introducing too many stack
Dale> temps. The basic problem, IMO, is that it has no concept that increasing
Dale> register lifetimes too much is a bad thing. (And this is something I
Dale> intend to work on, but I've looked at it enough to conclude there's no
Dale> quick fix, and other things keep coming up.) Perhaps it should keep
Dale> track of how many values are live simultaneously, and avoid increasing
Dale> that number.
This is what the Haifa register pressure code did, which Jeff
considers junk. [No comment.]
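As a rough illustration of the bookkeeping Dale suggests (counting how many values are live at once), here is a minimal Python sketch. It is not the Haifa register pressure code and not GCC code; the `peak_live` function, the (defined, used) instruction encoding, and the two toy instruction orders are all made up for this example.

```python
def peak_live(schedule):
    """Return the peak number of simultaneously live values.

    `schedule` is a list of (defined, used) pairs in issue order
    (hypothetical encoding, not GCC's). A value is live from the
    instruction that defines it up to its last use. A value that is
    defined but never used stays live to the end of the list.
    """
    # Record the index of the last instruction that reads each value.
    last_use = {}
    for idx, (_, used) in enumerate(schedule):
        for value in used:
            last_use[value] = idx

    live = set()
    peak = 0
    for idx, (defined, used) in enumerate(schedule):
        # Operands whose last use is this instruction die here.
        live -= {value for value in used if last_use[value] == idx}
        if defined is not None:
            live.add(defined)
        # Measure pressure at the point just after this instruction.
        peak = max(peak, len(live))
    return peak


# Toy source order: each load is consumed right away.
compact = [
    ("a", []), ("x", ["a"]), (None, ["x"]),
    ("b", []), ("y", ["b"]), (None, ["y"]),
]

# Same instructions after an aggressive first pass hoists both loads.
hoisted = [
    ("a", []), ("b", []),
    ("x", ["a"]), ("y", ["b"]),
    (None, ["x"]), (None, ["y"]),
]

print(peak_live(compact), peak_live(hoisted))
```

A scheduler following the suggestion would compare the two counts and reject or penalize a reordering whose peak exceeds that of the original order. How to do that cheaply inside a real list scheduler is exactly the part with no quick fix.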
>> The LIBCALLs I have examined look like:
>>
>> (clobber (reg:DI X)) [LIBCALL]
>> (set (subreg:SI (reg:DI X) 0) (CONST))
>> (set (subreg:SI (reg:DI X) 4) (reg:SI Y))
>> (set (reg:DI X) (reg:DI X)) [RETVAL]
>> (set ... (reg:DI X))
Dale> I don't see why there needs to be a clobber here at all. Wouldn't you
Dale> have the same problem without it? Also, as I read the subreg docs,
Dale> there should be a strict_low_part around at least the store
Dale> into (subreg 4).
One will have the same problem without a CLOBBER, but I have
mainly seen this problem with LIBCALL blocks, which use the stylized
CLOBBER -> DEAD STORE bookends to mark the blocks.
Dale> You might be able to hack around the specific problem shown by using
Dale> SCHED_GROUP_P to keep the two subreg stores and the clobber together.
Yes, that is what Michael Matz and I originally did, except that
marking these instructions as a scheduler group sometimes causes the
second scheduler pass to go into an endless loop. No one has bothered to
debug that failure mode, which prevents scheduler groups from being used.
I think the question is whether one needs to teach the scheduler
about register lifetimes to solve this problem "well enough" or whether
one can introduce some other feedback which will produce a similar effect.
For example, one could artificially elevate the "cost" of the CLOBBER
until the SET is ready to dispatch on its own. One could possibly ignore
the real latency of the SET of a CONST and defer that detailed scheduling
until the second pass, so that the CLOBBER, SET CONST, and SET REG issue in
successive cycles in the first scheduler pass. I do not know if one can
find a balance between moving instructions forward and register pressure
without actually having a real policy for register pressure.
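To make the "elevated cost" idea concrete, here is a toy single-issue list scheduler in Python. It is only a sketch under stated assumptions, not the Haifa scheduler: the `Insn` class, the `schedule` function, the instruction names, and the latencies are all hypothetical. The cost elevation is modelled as deprioritizing a CLOBBER while any SET that consumes it is still waiting on another input; the CLOBBER is still issued if nothing else is ready.

```python
from dataclasses import dataclass


@dataclass
class Insn:
    name: str
    deps: tuple = ()        # names of instructions this one waits for
    latency: int = 1        # cycles until the result is available
    is_clobber: bool = False


def schedule(insns, defer_clobbers):
    """Toy single-issue list scheduler; returns {name: issue cycle}.

    List order doubles as priority. This is an illustrative model only.
    """
    ready_at = {}           # name -> first cycle the result is usable
    issued = {}
    pending = list(insns)
    cycle = 0

    def available(name):
        return name in ready_at and ready_at[name] <= cycle

    def deferred(insn):
        # A CLOBBER is held back while some consumer of it still
        # waits on an input other than the CLOBBER itself.
        if not (defer_clobbers and insn.is_clobber):
            return False
        for other in pending:
            if insn.name in other.deps:
                rest = [d for d in other.deps if d != insn.name]
                if not all(available(d) for d in rest):
                    return True
        return False

    while pending:
        ready = [i for i in pending if all(available(d) for d in i.deps)]
        preferred = [i for i in ready if not deferred(i)]
        # Fall back to a deferred CLOBBER only if nothing else is ready.
        for insn in (preferred or ready)[:1]:
            issued[insn.name] = cycle
            ready_at[insn.name] = cycle + insn.latency
            pending.remove(insn)
        cycle += 1
    return issued


# Hypothetical block shaped like the LIBCALL sequence in this thread.
insns = [
    Insn("clobber_X", is_clobber=True),
    Insn("load_Y", latency=3),
    Insn("work1"),
    Insn("work2"),
    Insn("set_X_lo_const", deps=("clobber_X",)),
    Insn("set_X_hi_Y", deps=("clobber_X", "load_Y")),
    Insn("use_X", deps=("set_X_lo_const", "set_X_hi_Y")),
]

plain = schedule(insns, defer_clobbers=False)
held = schedule(insns, defer_clobbers=True)
print(plain["clobber_X"], held["clobber_X"], held["use_X"])
```

In this toy input the CLOBBER, SET CONST, and SET REG end up in successive cycles when the CLOBBER is held back, and the final use issues in the same cycle either way. Whether that carries over to real code, without an actual register pressure policy, is the open question.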
Thanks, David