This is the mail archive of the
gcc@gcc.gnu.org
mailing list for the GCC project.
Exploiting dual mode operation
- From: Mircea Namolaru <NAMOLARU at il dot ibm dot com>
- To: gcc at gcc dot gnu dot org
- Cc: joern dot rennecke at superh dot com, Leehod Baruch <LEEHOD at il dot ibm dot com>
- Date: Mon, 7 Jun 2004 19:17:38 +0200
- Subject: Exploiting dual mode operation
Hello,
Following our message ( http://gcc.gnu.org/ml/gcc/2004-04/msg00970.html )
regarding exploiting dual mode operations, we enclose an overview of our
algorithm.
In order to support 32 bit computations on a 64 bit machine, sign extension
instructions are generated to ensure the correctness of the computation.
A possible policy (currently implemented in gcc) is to generate a sign
extension after each 32 bit computation. Depending on the instruction set of
the architecture, some of these sign extension instructions may be redundant.
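To make the need for the extension concrete, here is a minimal Python sketch (not part of our patch). It models a 64 bit register as an unsigned integer and assumes wrapping 32 bit arithmetic; the helper names are purely illustrative.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def sign_extend_32_to_64(reg):
    """Copy bit 31 of the register into bits 32..63 (illustrative helper)."""
    low = reg & MASK32
    if low & 0x80000000:
        return low | 0xFFFFFFFF00000000
    return low


def add64(x, y):
    """Model a 64 bit add instruction on register images."""
    return (x + y) & MASK64


def to_signed64(reg):
    """Interpret a 64 bit register image as a signed value."""
    return reg - (1 << 64) if reg & (1 << 63) else reg


# INT32_MAX + 1 computed with a 64 bit add: as an int32 the result is
# negative, but the upper half of the register does not reflect that yet.
a = add64(0x7FFFFFFF, 1)
raw = to_signed64(a)                           # value seen by a 64 bit user
fixed = to_signed64(sign_extend_32_to_64(a))   # value after the se
```

Here `raw` is 2147483648 while `fixed` is -2147483648, which is why a 64 bit consumer of a 32 bit result needs the extension unless the instruction itself takes care of it.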
There are two cases:
Case1:
The instruction using the 64 bit operands (after they are sign-extended)
has a dual mode that works with 32 bit operands.
For example:
        int32 a, b;

        a = ....                        a = ....
        a = sign extend a      <-->
        b = ....                        b = ....
        b = sign extend b      <-->
        cmpd a, b                       cmpw a, b    // 32 bit (word) compare
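The equivalence behind Case 1 can be checked with a small Python model. This is only a sketch: `cmpd` and `cmpw` here are simplified stand-ins for the 64 bit and 32 bit signed compares, returning -1, 0 or 1 instead of setting condition register bits.

```python
def signed(value, bits):
    """Interpret the low `bits` bits of value as a two's complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def sign_extend_32(reg):
    """64 bit register image holding the sign-extended low 32 bits of reg."""
    return signed(reg, 32) & ((1 << 64) - 1)


def cmpd(x, y):
    """Model of a signed compare on the full 64 bit registers."""
    a, b = signed(x, 64), signed(y, 64)
    return (a > b) - (a < b)


def cmpw(x, y):
    """Model of a signed compare that looks only at the low 32 bits."""
    a, b = signed(x, 32), signed(y, 32)
    return (a > b) - (a < b)


# A register whose low word is INT32_MIN and whose upper half holds garbage.
x = 0x180000000
y = 1
with_extension = cmpd(sign_extend_32(x), sign_extend_32(y))
dual_mode = cmpw(x, y)
without_extension = cmpd(x, y)
```

In this model `dual_mode` equals `with_extension` (both report x < y), while `without_extension` reports the opposite, so the two sign extensions are only needed when the 64 bit compare is used.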
Case2:
The instruction defining the 64 bit operand (which is later sign-extended)
has a dual mode that defines and sign-extends a 32 bit operand.
For example:
        int32 a;

        ld a                            lwa a        // load word and sign extend
        a = sign extend a      <-->
        return a                        return a
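Case 2 can be illustrated the same way. The sketch below assumes a big-endian memory image and assumes that the plain load leaves the upper half of the register zero; both function names are hypothetical models, not real instructions or gcc interfaces.

```python
MASK64 = (1 << 64) - 1


def sign_extend_32(reg):
    """64 bit register image holding the sign-extended low 32 bits of reg."""
    low = reg & 0xFFFFFFFF
    return (low - (1 << 32)) & MASK64 if low & 0x80000000 else low


def load_word_zero_extended(memory, addr):
    """Model of a 32 bit load that does not sign-extend (assumption)."""
    return int.from_bytes(memory[addr:addr + 4], "big")


def load_word_algebraic(memory, addr):
    """Model of a dual mode load that sign-extends while loading."""
    return int.from_bytes(memory[addr:addr + 4], "big", signed=True) & MASK64


# Memory holding the int32 value -5.
memory = (-5).to_bytes(4, "big", signed=True)

two_step = sign_extend_32(load_word_zero_extended(memory, 0))  # load, then se
one_step = load_word_algebraic(memory, 0)                      # se folded into load
```

Both `two_step` and `one_step` yield the same register image, so the separate sign extension after the definition is redundant when the extending load is available.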
We'll present the algorithm on the following example:
Example:
We have two definitions of int32, and multiple uses as below:
Source:
        def1           def2
         | \          / |
         |  \        /  |
         |   \      /   |
         |    \    /    |
         |     \  /     |
         |      \/      |
        use1   use12   use2
Assume that only a single sign extend (se) instruction is needed on the path
between def1 and use12 - for all other paths, the se can be combined with
either the def or the use.
0. Initial code, as currently generated by gcc.
(There is a sign extension after each definition.)
        def1           def2
         se             se
         | \          / |
         |  \        /  |
         |   \      /   |
         |    \    /    |
         |     \  /     |
         |      \/      |
        use1   use12   use2
1. Combine
1.a prepare for combine
(Redundant) sign extensions are also generated before uses.
        def1           def2
         se             se
         | \          / |
         |  \        /  |
         |   \      /   |
         |    \    /    |
         |     \  /     |
         |      \/      |
         se     se      se
        use1   use12   use2
1.b combine
- Combine tries to merge se's with defs and with uses.
Assume it succeeds for def2 and use1:
        def1           def2
         se        [se removed]
         | \          / |
         |  \        /  |
         |   \      /   |
         |    \    /    |
         |     \  /     |
         |      \/      |
    [se removed] se     se
        use1   use12   use2
2. PRE
2.a prepare for PRE
- If combine did not remove an se after a def (as in def1), we remove it now
  (recall that it is redundant) - PRE should compute an optimal placement of
  se's, which may or may not include this after-the-def location.
- If combine did remove an se after a def (as in def2), we want PRE to know
  that the new def contains an se, so that it will remove other redundant
  se's. So we regenerate the se explicitly, and remove it eventually.
- Note that if combine removed an se before a use, this use no longer requires
  an se, so PRE need not consider this use any longer (and hence we do not
  regenerate the se).
        def1           def2
    [se removed]        se [regenerated]
         | \          / |
         |  \        /  |
         |   \      /   |
         |    \    /    |
         |     \  /     |
         |      \/      |
                se      se
        use1   use12   use2
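The three rules of step 2.a can be summarized as a small Python sketch. It is only an illustration of the bookkeeping on this example, not the actual pass: the function name, the string markers and the set-based inputs are all assumptions made for the sketch.

```python
def prepare_for_pre(defs, uses, merged_defs, merged_uses):
    """Decide which explicit se's PRE gets to see (illustrative only).

    merged_defs / merged_uses are the defs and uses into which combine
    managed to fold a sign extension.
    """
    se_after_def = {}
    for d in defs:
        if d in merged_defs:
            # The def now extends by itself: regenerate an explicit se so
            # PRE knows the value is already extended (removed in cleanup).
            se_after_def[d] = "regenerated"
        else:
            # Redundant se after the def: drop it and let PRE place it.
            se_after_def[d] = "removed"

    # A use whose se was folded in by combine needs no se any more.
    se_before_use = {u: u not in merged_uses for u in uses}
    return se_after_def, se_before_use


after_def, before_use = prepare_for_pre(
    defs=["def1", "def2"],
    uses=["use1", "use12", "use2"],
    merged_defs={"def2"},
    merged_uses={"use1"},
)
```

For the example this marks the se after def1 as removed and the one after def2 as regenerated, and keeps an se only before use12 and use2, matching the picture above.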
2.b PRE
An optimal placement is:
        def1           def2
                        se [regenerated]
         | \          / |
         |  \        /  |
         |   se     /   |
         |    \    /    |
         |     \  /     |
         |      \/      |
        use1   use12   use2
3. Cleanup
- The regenerated sign extension after def2 is removed.
        def1           def2
                   [se removed]
         | \          / |
         |  \        /  |
         |   se     /   |
         |    \    /    |
         |     \  /     |
         |      \/      |
        use1   use12   use2
Finally we get the best placement for the sign extensions, as required.
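The final result for this example can be reproduced with a very simplified Python model. Note that this is not PRE: real PRE works on the control flow graph with availability and anticipability information, whereas the sketch below just checks each def-use edge, under the assumption that an edge needs an explicit se exactly when the use still requires one and the def does not provide it. All names are hypothetical.

```python
def place_sign_extensions(edges, defs_with_se, uses_needing_se):
    """Return the def->use edges that still need an explicit se.

    Simplified edge-by-edge stand-in for the placement PRE computes
    on the example; not a general algorithm.
    """
    placements = set()
    for d, u in edges:
        if u in uses_needing_se and d not in defs_with_se:
            placements.add((d, u))
    return placements


# The def-use graph of the example.
edges = [
    ("def1", "use1"),
    ("def1", "use12"),
    ("def2", "use12"),
    ("def2", "use2"),
]

# combine folded an se into def2 and into use1.
defs_with_se = {"def2"}
uses_needing_se = {"use12", "use2"}

placements = place_sign_extensions(edges, defs_with_se, uses_needing_se)
```

On the example, `placements` contains only the edge from def1 to use12, which is the single se left in the last picture.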
We will soon post part of the code for review.
Comments are welcome. Thanks,
Mircea and Leehod