This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: {PING] [PATCH] Sign extension elimination
- From: "H. J. Lu" <hjl at lucon dot org>
- To: Leehod Baruch <leehod dot baruch at weizmann dot ac dot il>
- Cc: Toon Moene <toon at moene dot indiv dot nluug dot nl>, Daniel Berlin <dberlin at dberlin dot org>, Mircea Namolaru <namolaru at il dot ibm dot com>, gcc-patches at gcc dot gnu dot org, leehod at gmail dot com, mark at codesourcery dot com, Roger Sayle <roger at eyesopen dot com>
- Date: Thu, 20 Apr 2006 06:31:05 -0700
- Subject: Re: {PING] [PATCH] Sign extension elimination
- References: <1331.84.108.234.162.1145524287.squirrel@84.108.234.162>
On Thu, Apr 20, 2006 at 12:11:27PM +0300, Leehod Baruch wrote:
> If what you say here:
> > We may be able to teach the x86-64 backend about SEE. If we can tell
> > SEE that SI is always zero-extended to DI for a backend, SEE can do a
> > much better job for x86-64.
> is true, then the high part of reg 73 is a zero extension of the low part
> automatically without the need for an extension instruction or even an
> extension that is embedded into the definition instruction, like the one
> SEE is currently producing:
> >> (set (reg:DI 73 [ t.42 ])
> >> (zero_extend:DI (xor:SI (mem/s:SI (plus:DI (reg:DI 66 [
> >> ivtmp.37 ])
> And all the uses of reg:DI 73 may stay unchanged.
>
> Did I understand you correctly?
Yes.
>
> On PPC the only implicit extensions are in instructions that
> set a register with a constant value, e.g. setting a DI register with
> the value 1 has an implicit extension.
>
That is the information I need. Basically, the current SEE is trying
to solve a different problem from the one x86-64 has. It tries to
eliminate sign extensions (SE) by placing them better. It looks very
similar to
http://www.trl.ibm.com/projects/jit/paper/sxt.pdf
But on x86-64, we have a different problem, so it is no wonder that SEE
doesn't work there. We are investigating a different, simple approach
based on the same infrastructure. I think it should work much better on
x86-64. That means that we may end up with 2 different SEE passes,
depending on the target.
H.J.
-----
Problem description:
--------------------
A 64-bit machine may have instructions that implicitly sign/zero
extend their 32-bit register operands to 64 bits. GCC may still
generate explicit sign/zero extension instructions to convert a 32-bit
value into a 64-bit value. Depending on the instruction set of the
architecture, some of these explicit extension instructions may be
redundant.
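The redundancy described above can be illustrated with a small Python model. This is only a sketch under stated assumptions: the helper names (write32_implicit_zext, explicit_zext, explicit_sext) are hypothetical, and the model assumes a target where a 32-bit register write clears the upper 32 bits of the 64-bit register, as x86-64 does.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def write32_implicit_zext(value):
    """Model a 32-bit register write on a target that implicitly
    zero-extends the result to 64 bits (assumed target behavior)."""
    return value & MASK32


def explicit_zext(reg64):
    """Model an explicit zero extension of the low 32 bits to 64 bits."""
    return reg64 & MASK32


def explicit_sext(reg64):
    """Model an explicit sign extension of the low 32 bits to 64 bits."""
    low = reg64 & MASK32
    if low & 0x80000000:
        return (low | ~MASK32) & MASK64
    return low


# The implicit extension already produced a zero-extended 64-bit value.
reg = write32_implicit_zext(0xDEADBEEF80000001)

# An explicit zero extension afterwards changes nothing: it is redundant.
zext_is_redundant = explicit_zext(reg) == reg

# An explicit sign extension is not redundant here, because bit 31 is set.
sext_is_redundant = explicit_sext(reg) == reg
```

Which explicit extensions are redundant therefore depends on what the target's implicit extension does: in this model only the zero extension can be dropped.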
General idea for solution:
--------------------------
Replace the explicit extension with two instructions: a pseudo
extension instruction, which is similar to the original extension but
is a nop for the backend, and a simple move.
Implementation by example:
--------------------------
Phase 0: Initial code, as currently generated by gcc.
     implicit ext
          |
         ...
          |
     explicit ext
implicit ext:
     set ((reg:SI 10) (..defrhs..))
explicit ext:
     set ((reg:DI 100) (extend:DI (reg:SI 10)))
Phase 1: We replace the explicit extension with a pseudo extension and
a simple move if there are no updates on the destination of the
original implicit extension before the explicit extension.
     implicit ext
          |
         ...
          |
     pseudo ext
     simple move
implicit ext:
     set ((reg:SI 10) (..defrhs..))
pseudo ext:
     set ((reg:DI 10) (pseudo_extend:DI (reg:SI 10)))
simple move:
     set ((reg:DI 100) (reg:DI 10))
SEE should be enabled only for backends that provide pseudo extensions.
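The Phase 1 rewrite can be sketched on a toy instruction list. This is not GCC code: the Insn tuple, the op names, and replace_explicit_extensions are hypothetical, and the sketch assumes a single straight-line block in which every write to a register is visible in the list.

```python
from typing import NamedTuple, Optional


class Insn(NamedTuple):
    op: str                    # "implicit_ext", "explicit_ext", "pseudo_ext", "move", "other"
    dest: int                  # destination register number
    src: Optional[int] = None  # source register number, if any


def replace_explicit_extensions(insns):
    """Replace each explicit extension with a pseudo extension plus a
    simple move, provided the source register was last written by an
    implicit extension (no update in between)."""
    out = []
    last_writer = {}  # register number -> op of the last insn that wrote it
    for insn in insns:
        if (insn.op == "explicit_ext"
                and last_writer.get(insn.src) == "implicit_ext"):
            # The pseudo extension is a nop for the backend; it only
            # changes the mode of the source register.
            out.append(Insn("pseudo_ext", insn.src, insn.src))
            out.append(Insn("move", insn.dest, insn.src))
        else:
            out.append(insn)
        # Any write to a register is recorded, so a later "other" write
        # to the source register blocks the replacement.
        last_writer[insn.dest] = insn.op
    return out


# Phase 0 code: implicit ext on reg 10, unrelated insn, explicit ext.
phase0 = [
    Insn("implicit_ext", 10),
    Insn("other", 11),
    Insn("explicit_ext", 100, 10),
]
phase1 = replace_explicit_extensions(phase0)

# If reg 10 is updated before the explicit ext, nothing is replaced.
blocked = [
    Insn("implicit_ext", 10),
    Insn("other", 10),
    Insn("explicit_ext", 100, 10),
]
unchanged = replace_explicit_extensions(blocked)
```

A real pass would have to track definitions across basic blocks using the compiler's dataflow information; the dictionary here stands in for that only within one block.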