This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



Re: An issue for the SC: horrible documentation quality of GCC


> On Sunday 11 May 2003 03:03 pm, Richard Kenner wrote:
>> On Sunday 11 May 2003 12:29 pm, Kai Henningsen wrote:
>>     As there is so often, there is an answer that is perfectly obvious, but
>>     not helpful in practice.
Unfortunately, the question is not perfectly obvious.

>>
>>     Have a cost function that determines how good the optimization of a
>>     program is.
Taking the phrase "cost function" to mean a general concept, agreed.

>>
>>     Run all optimizations in some predetermined sequence.
>>
>>     Compare the cost function of the output to the one of the input.
>>
>>     As long as there was some improvement, do it again.
Better would be a single "costing function principle", computed "on the
fly" and refined at each stage of optimization.
That is: a single-pass, stepwise-refined "costing function principle"
rather than an iteration.

>>
>>     Then use the best result.
>>
>>     Compile times? What do you mean, compile times?
>>
> Even aside from compile times, since some optimizations are performed
> before register allocation and some after, there's no such iteration
> possible.
Ah... but there could be a non-iterative "costing function" for the
optimizing transformations: one which begins in the hardware-independent,
generic transformations and converges to code suitable for a specific
processor within the hardware-dependent transformations.

Following the long-standing CS tradition of deceptive clarity, I will use
a well-established, widely used term to mean something entirely different.

"Address Space"

The highest level (least refined) concept of "Address Space" would be 
where program logic transforms are performed for an abstract machine
with a (virtually) unlimited register set.
Those "Address Spaces" could be numbered:
0 - The address space with the property of "use in place".
1 - The address space with the property of "fetch to use".
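
As a sketch, in C (the names are mine, purely illustrative), that
numbering might look like:

/* Illustrative only: the two most abstract "Address Spaces".  */
enum address_space
{
  AS_USE_IN_PLACE = 0,  /* operand can be used where it already is   */
  AS_FETCH_TO_USE = 1   /* operand must be fetched before it is used */
};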

"Address Space" in action:

When a specific constant is encountered, we have no hardware knowledge
yet, so it must be presumed to reside in Address Space 1 - fetch to use.
So assign it to a pseudo register.  Now that constant is in Address
Space 0.
That movement from Address Space 1 to Address Space 0 incurs a cost.
That cost may be represented by a small integer value.

Unfortunately, costing algorithms often stop at accumulating that
sense of "cost".

Instead, record that cost with the pseudo register - this small integer
value now represents the "investment" made in having this constant
in Address Space 0.

Each additional time this specific constant is encountered, it is found
to already be in Address Space 0.  Still having no hardware knowledge,
presume that the pseudo register is the source of the constant.

The use of an Address Space 0 pseudo register has a cost.  That cost
may be represented by a small integer value.  That value should be
numerically less than the Address Space 1 cost.

Each of those additional times this specific constant is encountered,
record the DIFFERENCE between the two costs.  Literally, the interest
earned on the investment of moving the constant into
Address Space 0.
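
A sketch of that bookkeeping, in C (the names and the cost numbers are
mine, not anything in GCC - just to show the shape of the idea):

#include <stdio.h>

/* Hypothetical abstract costs of a reference in each "Address Space".  */
#define AS1_COST 4   /* fetch to use */
#define AS0_COST 1   /* use in place (pseudo register) */

struct pseudo_cost
{
  int address_space;  /* which "Address Space" the value occupies now */
  int investment;     /* cumulative cost of moving it there           */
  int interest;       /* cumulative savings from re-using it there    */
};

/* Called once per appearance of the constant.  The first appearance
   pays to move it into Address Space 0; every later appearance earns
   the difference between the two costs.  */
static void
note_constant_use (struct pseudo_cost *p, int first_use)
{
  if (first_use)
    {
      p->address_space = 0;
      p->investment += AS1_COST;        /* the move is the investment  */
    }
  else
    p->interest += AS1_COST - AS0_COST; /* interest on that investment */
}

int
main (void)
{
  struct pseudo_cost c = { 1, 0, 0 };
  note_constant_use (&c, 1);  /* first encounter: invest 4  */
  note_constant_use (&c, 0);  /* three re-uses: earn 3 each */
  note_constant_use (&c, 0);
  note_constant_use (&c, 0);
  printf ("investment %d, interest %d\n", c.investment, c.interest);
  return 0;
}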

So far, our new pseudo register has three fields associated with
it:  1) the address space it occupies;  2) the cumulative cost it has
acquired during its address space movements;  3) the cumulative
interest earned.

Now when an optimization decision has to be made, it can be
based on the "rate of return".  NOT on the cost (in instructions,
cycles, whatever - we don't know that yet).  NOR on the interest
(which, at this point, is equivalent to frequency of re-use).
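
So a decision point might look no more complicated than this (again,
only a sketch):

/* Illustrative only: the "rate of return" on a pseudo register, i.e.
   interest earned per unit of investment.  Neither number is a cycle
   or instruction count yet; both are still abstract costs.  */
double
rate_of_return (int investment, int interest)
{
  if (investment <= 0)
    return 0.0;   /* nothing was spent, nothing to rank */
  return (double) interest / (double) investment;
}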

So far, not very interesting with only a two-address-space costing
function.  Also, the results would probably not be much different
from not bothering to do any of this in the first place.

Now the interesting part...

Question: "Who said our abstract machine only has two 
address spaces?"
Not I.

Question: "Who said that an abstract machine with a (virtually)
unlimited set of registers doesn't have a limit on the number
of registers that may occupy a single address space?"
Not I.

At this generic level, there would be an Address Space roughly
corresponding to each of what the hardware people call an
"Addressing mode".
While there is great variation in hardware instructions, there
are relatively few addressing modes.
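
For instance, the generic level might know about a handful of them
(again, the names are mine, purely illustrative):

/* Illustrative only: abstract "Address Spaces", one per common
   addressing mode rather than one per instruction.  */
enum abstract_address_space
{
  AAS_REGISTER,       /* value lives in a (pseudo) register        */
  AAS_IMMEDIATE,      /* value can appear as an immediate operand  */
  AAS_REG_INDIRECT,   /* value reached through a register          */
  AAS_REG_OFFSET,     /* register plus constant displacement       */
  AAS_AUTO_INCREMENT  /* register that steps after (or before) use */
};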

Which brings me around to what I called "hardware dependent
parameterization of the generic, hardware independent
optimizations" mentioned in:
Ref: <http://gcc.gnu.org/ml/gcc/2003-05/msg01009.html>

There would be a small table for each supported processor,
giving the cost and the pseudo register limit for each
Address Space.
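
Such a table might be as small as this (the numbers are invented,
for a hypothetical processor, just to show the shape):

/* Illustrative only: per-target parameters, one row per "Address
   Space".  A negative limit means "no limit".  */
struct as_param
{
  int cost;       /* abstract cost of a reference in this space */
  int reg_limit;  /* how many pseudo registers may occupy it    */
};

static const struct as_param hypothetical_target[] =
{
  /* cost, limit */
  {  1,  32 },  /* 0: use in place (register)         */
  {  4,  -1 },  /* 1: fetch to use (memory, no limit) */
  {  2,  -1 },  /* 2: register + offset               */
  {  3,   4 },  /* 3: auto-increment addressing       */
};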

Such a table, under this scheme, should be adequate to keep the
early optimizations from generating something that the
hardware-dependent backend might have to totally rewrite.

Note that these "Address Spaces", by the time they get to
the hardware code generation, would encompass the existing
"Register Mode Classes".  
One might want to turn this description inside-out by calling 
my "Address Spaces" instead "Pseudo Register Modes".

Note that this "costing function principle" can be carried
through (and used throughout) the entire compilation process.

For instance: what choice do you make when you have a
limited resource to spend (either money or hard registers)?

Easy (and, as Kai said above, "obvious"): you use that
resource on the investment that has the highest rate of return.

Similarly for deciding what to keep in a register, spill to
the stack, or re-compute.
Or what to hoist out of a loop, or when a pointer increment
can stay in a loop because the hardware has auto-increment
registers.
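
A sketch of that choice, say when handing out one more hard register
among competing pseudo registers (all names mine, just illustrative):

/* Illustrative only: given the investment and interest recorded for
   each candidate pseudo register, pick the one with the highest rate
   of return to receive the scarce resource (a hard register, say).  */
static int
best_candidate (const int *investment, const int *interest, int n)
{
  int best = -1;
  double best_rate = -1.0;
  for (int i = 0; i < n; i++)
    {
      double rate = investment[i] > 0
                    ? (double) interest[i] / investment[i]
                    : 0.0;
      if (rate > best_rate)
        {
          best = i;
          best_rate = rate;
        }
    }
  return best;  /* index of the pseudo that earns the most per cost unit */
}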

It is a matter of asking the right question, not the obviousness
of the answer.

Mike

