This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: Gcc 3.1 performance regressions with respect to 2.95.3


>>>>> Dale Johannesen writes:

Dale> I don't think the problem has anything to do with clobbers per se; it's
Dale> more general than that.  I have half a dozen bug reports of cases where
Dale> the scheduler pass1 makes the code worse by introducing too many stack
Dale> temps.  The basic problem, IMO, is that it has no concept that increasing
Dale> register lifetimes too much is a bad thing.  (And this is something I
Dale> intend to work on, but I've looked at it enough to conclude there's no
Dale> quick fix, and other things keep coming up.)  Perhaps it should keep
Dale> track of how many values are live simultaneously, and avoid increasing
Dale> that number.

	This is what the Haifa register pressure code did, which Jeff
considers junk.  [No comment.]
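
	To make "how many values are live simultaneously" concrete, here
is a rough sketch in Python.  It is not GCC code: the function name
max_live, the instruction names, and the dict-based representation are
all made up for illustration.  It only counts values defined inside the
block and treats a value as live from its definition through its last
use.

```python
def max_live(order, defs, uses):
    """Peak number of simultaneously live values for one instruction order.

    order: instruction names in issue order (hypothetical names)
    defs:  insn -> set of values it defines
    uses:  insn -> set of values it reads
    Values that are live on entry to the block are ignored (an assumption).
    """
    # Position of the last instruction that reads each value.
    last_use = {}
    for idx, insn in enumerate(order):
        for value in uses.get(insn, ()):
            last_use[value] = idx

    live, peak = set(), 0
    for idx, insn in enumerate(order):
        live |= defs.get(insn, set())
        peak = max(peak, len(live))
        # A value dies after its last use; unused values die immediately.
        live = {v for v in live if last_use.get(v, -1) > idx}
    return peak


# Two independent load/use pairs, in two different orders.
defs = {"load_a": {"a"}, "load_b": {"b"}}
uses = {"use_a": {"a"}, "use_b": {"b"}}

hoisted = ["load_a", "load_b", "use_a", "use_b"]      # both loads moved up
interleaved = ["load_a", "use_a", "load_b", "use_b"]  # original order

print(max_live(hoisted, defs, uses))      # 2
print(max_live(interleaved, defs, uses))  # 1
```

A scheduler that tracked this number could refuse, or penalize, a move
that raises the peak; where to set that threshold is the hard part.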

>> The LIBCALLs I have examined look like:
>> 
>> (clobber (reg:DI X)) [LIBCALL]
>> (set (subreg:SI (reg:DI X) 0) (CONST))
>> (set (subreg:SI (reg:DI X) 4) (reg:SI Y))
>> (set (reg:DI X) (reg:DI X)) [RETVAL]
>> (set ... (reg:DI X))

Dale> I don't see why there needs to be a clobber here at all.  Wouldn't you
Dale> have the same problem without it?  Also, as I read the subreg docs,
Dale> there should be a strict_low_part around at least the store
Dale> into (subreg 4).

	One will have the same problem without a CLOBBER, but I have
mainly seen it with LIBCALL blocks, which use the stylized CLOBBER
-> DEAD STORE bookends to mark the blocks.

Dale> You might be able to hack around the specific problem shown by using
Dale> SCHED_GROUP_P to keep the two subreg stores and the clobber together.

	Yes, that is what Michael Matz and I originally did, but
marking these instructions as a scheduler group sometimes causes the
second scheduler pass to go into an endless loop.  No one has bothered to
debug that failure mode, which prevents scheduler groups from being used.

	I think the question is whether one needs to teach the scheduler
about register lifetimes to solve this problem well enough, or whether
one can introduce some other feedback which will produce a similar effect.
E.g., artificially elevating the "cost" of the CLOBBER until the SET is
ready to dispatch on its own.  One could possibly ignore the real latency
of the SET of a CONST and defer that detailed scheduling until the second
pass, so that the CLOBBER, SET CONST, and SET REG issue in successive
cycles in the first scheduler pass.  I do not know if one can find a
balance between moving instructions forward and register pressure without
actually having a real policy for register pressure.
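
	A toy model of the "hold the CLOBBER back" idea, again in Python
and again not GCC code.  The list scheduler, the names (list_schedule,
clobber_x, set_lo, y1, ...) and the priorities are invented for the
example, and it ignores cycles and latencies entirely.  A held
instruction is not issued until every instruction that directly depends
on it has all of its other dependences satisfied, which is one possible
reading of "until the SET is ready to dispatch on its own".

```python
def list_schedule(insns, deps, priority, hold=frozenset()):
    """Toy forward list scheduler (no latencies, one insn per step).

    insns:    instruction names in original order
    deps:     insn -> set of insns that must issue before it
    priority: insn -> number; the highest-priority ready insn issues first
    hold:     insns to keep back until their direct dependents are
              otherwise ready (the hypothetical CLOBBER rule)
    """
    issued, order = set(), []
    remaining = list(insns)

    def otherwise_ready(succ, insn):
        # True if succ is waiting on nothing except insn itself.
        return deps.get(succ, set()) - {insn} <= issued

    while remaining:
        ready = [i for i in remaining if deps.get(i, set()) <= issued]
        free = []
        for i in ready:
            succs = [s for s in remaining if i in deps.get(s, set())]
            if i in hold and not all(otherwise_ready(s, i) for s in succs):
                continue  # dependents are not ready yet; keep holding
            free.append(i)
        pool = free or ready  # never stall if only held insns are ready
        pick = max(pool, key=lambda i: priority.get(i, 0))
        order.append(pick)
        issued.add(pick)
        remaining.remove(pick)
    return order


# Shape of the LIBCALL block above; y1 -> y2 computes (reg:SI Y).
insns = ["clobber_x", "set_lo", "y1", "y2", "set_hi", "retval"]
deps = {
    "set_lo": {"clobber_x"},
    "y2": {"y1"},
    "set_hi": {"clobber_x", "y2"},
    "retval": {"set_lo", "set_hi"},
}
# Priorities chosen so the plain scheduler hoists the CLOBBER first.
priority = {"clobber_x": 5, "set_lo": 4, "y1": 3, "y2": 2, "set_hi": 1}

plain = list_schedule(insns, deps, priority)
held = list_schedule(insns, deps, priority, hold={"clobber_x"})
print(plain)  # ['clobber_x', 'set_lo', 'y1', 'y2', 'set_hi', 'retval']
print(held)   # ['y1', 'y2', 'clobber_x', 'set_lo', 'set_hi', 'retval']
```

In the held order the CLOBBER and both subreg SETs issue back to back,
so X is not live across the computation of Y.  Whether a rule this
simple behaves well on real code is exactly the open question.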

Thanks, David

