This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.

Re: register allocation versus scheduling

From: Geoff Keating <geoffk at geoffk dot org>
To: Brad Lucier <lucier at math dot purdue dot edu>
Cc: feeley at iro dot umontreal dot ca, matz at suse dot de, gcc at gcc dot gnu dot org
Date: 06 Jan 2003 14:57:28 -0800
Subject: Re: register allocation versus scheduling
References: <200301060136.h061aNL21961@banach.math.purdue.edu>

Brad Lucier <lucier@math.purdue.edu> writes:

...
> For a molecular energy minimization code, where an affine transformation
> is applied consecutively to the location of each atom in the molecule, 
> 
> -O1 -fno-trapping-math -fschedule-insns2 -fnew-ra -mcpu=7400
> 
> works very well: the code is a bunch of overlapped loads, stores,
> and floating-point operations. But
> 
> -O1 -fno-trapping-math -fschedule-insns -fschedule-insns2 -fnew-ra -mcpu=7400
> 
> which also schedules *before* register allocation is 50% slower,
> since the schedule pass before hard register allocation loads *all*
> the x-y-z information for all the atoms into pseudo-registers at the
> top of the routine, and requires many moves between the stack and
> registers when these values are actually needed for computations.

This is very much a known problem, and is not specific to FP code.
I've seen it happen when optimising code for the (purely integer)
SHA-1 hash algorithm; normally this fits entirely in the PowerPC
register set, but if -fschedule-insns is used it needs twice as many
registers, more than are available, and runs much slower.
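
As a rough illustration of the effect described above, here is a toy
model in C. It is only a sketch, not GCC's scheduler or register
allocator: the event encoding and the names max_live,
peak_loads_near_use and peak_loads_hoisted are hypothetical, chosen
for this example. It counts how many values are live at once when each
atom's x-y-z loads sit next to their uses, versus when every load is
moved to the top of the routine.

```c
#include <stddef.h>

enum { COORDS = 3, MAX_ATOMS = 64 };

/* Toy model (an assumption of this sketch): each event is +1 when a
   value is loaded into a new pseudo-register and -1 at its last use.
   The peak of the running sum is the number of registers needed. */
static int max_live(const int *events, size_t n)
{
    int live = 0, peak = 0;
    for (size_t i = 0; i < n; i++) {
        live += events[i];
        if (live > peak)
            peak = live;
    }
    return peak;
}

/* Loads for one atom are issued just before that atom is transformed. */
int peak_loads_near_use(int atoms)
{
    int ev[2 * COORDS * MAX_ATOMS];
    size_t n = 0;
    if (atoms < 0 || atoms > MAX_ATOMS)
        return -1;                      /* outside the toy model's range */
    for (int a = 0; a < atoms; a++) {
        for (int c = 0; c < COORDS; c++)
            ev[n++] = +1;               /* load x, y, z */
        for (int c = 0; c < COORDS; c++)
            ev[n++] = -1;               /* last use of x, y, z */
    }
    return max_live(ev, n);
}

/* All loads for all atoms are moved to the top of the routine. */
int peak_loads_hoisted(int atoms)
{
    int ev[2 * COORDS * MAX_ATOMS];
    size_t n = 0;
    if (atoms < 0 || atoms > MAX_ATOMS)
        return -1;                      /* outside the toy model's range */
    for (int i = 0; i < atoms * COORDS; i++)
        ev[n++] = +1;                   /* every load first */
    for (int i = 0; i < atoms * COORDS; i++)
        ev[n++] = -1;                   /* uses only afterwards */
    return max_live(ev, n);
}
```

In this model, 16 atoms need 3 live values with loads near their uses
but 48 with all loads hoisted. PowerPC has 32 floating-point registers,
so the second ordering cannot fit and the surplus values would have to
be spilled to the stack and reloaded.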

-- 
- Geoffrey Keating <geoffk@geoffk.org>

References:
- register allocation versus scheduling
  - From: Brad Lucier
