[PING] [PATCH] Sign extension elimination

Leehod Baruch leehod.baruch@weizmann.ac.il
Thu Apr 20 09:12:00 GMT 2006


>> For x86-64, the first instruction in
>>
>> (insn:HI 23 22 25 3 (parallel [
>>             (set (reg/v:SI 64 [ t.42 ])
>>                 (xor:SI (mem/s:SI (plus:DI (reg:DI 66 [ ivtmp.37 ])
>>                             (symbol_ref:DI ("state") [flags 0x40]
>> <var_decl 0x2a9853cc60 state>)) [3 state S4
>> A8])
>>                     (reg/v:SI 64 [ t.42 ])))
>>             (clobber (reg:CC 17 flags))
>>         ]) 340 {*xorsi_1} (insn_list:REG_DEP_TRUE 22 (nil))
>>     (expr_list:REG_UNUSED (reg:CC 17 flags)
>>         (expr_list:REG_EQUAL (xor:SI (mem/s:SI (plus:DI (reg:DI 66 [
>> ivtmp.37 ])
>>                         (symbol_ref:DI ("state") [flags 0x40] <var_decl
>> 0x2a9853cc60 state>)) [3 state S4 A8])
>>                 (mem/s:SI (plus:DI (mult:DI (reg:DI 69 [ t.45 ])
>>                             (const_int 4 [0x4]))
>>                         (symbol_ref:DI ("S") [flags 0x40] <var_decl
>> 0x2a9853cb00 S>)) [3 S S4 A32]))
>>             (nil))))
>>
>> (insn:HI 25 23 27 3 (set (mem/s:SI (plus:DI (reg:DI 66 [ ivtmp.37 ])
>>                 (symbol_ref:DI ("state") [flags 0x40] <var_decl
>> 0x2a9853cc60 state>)) [3 state S4 A8])
>>         (reg/v:SI 64 [ t.42 ])) 40 {*movsi_1} (insn_list:REG_DEP_TRUE 23
>> (nil))
>>     (nil))
>>
>> (insn:HI 27 25 30 3 (set (reg:DI 73 [ t.42 ])
>>         (zero_extend:DI (reg/v:SI 64 [ t.42 ]))) 111
>> {zero_extendsidi2_rex64} (nil)
>>     (expr_list:REG_DEAD (reg/v:SI 64 [ t.42 ])
>>         (nil)))
>>
>> will zero extend t.42 to DI implicitly. But SEE doesn't know that and
>> it transforms them to
>>
>> (insn:HI 23 22 86 3 (parallel [
>>             (set (reg:DI 73 [ t.42 ])
>>                 (zero_extend:DI (xor:SI (mem/s:SI (plus:DI (reg:DI 66 [
>> ivtmp.37 ])
>>                                 (symbol_ref:DI ("state") [flags 0x40]
>> <var_decl 0x2a9853cc60 state>)) [3 state
>> S4 A8])
>>                         (reg/v:SI 64 [ t.42 ]))))
>>             (clobber (reg:CC 17 flags))
>>         ]) 341 {*xorsi_1_zext} (insn_list:REG_DEP_TRUE 22 (nil))
>>     (expr_list:REG_DEAD (reg/v:SI 64 [ t.42 ])
>>         (expr_list:REG_UNUSED (reg:CC 17 flags)
>>             (expr_list:REG_EQUAL (xor:SI (mem/s:SI (plus:DI (reg:DI 66 [
>> ivtmp.37 ])
>>                             (symbol_ref:DI ("state") [flags 0x40]
>> <var_decl 0x2a9853cc60 state>)) [3 state S4
>> A8])
>>                     (mem/s:SI (plus:DI (mult:DI (reg:DI 69 [ t.45 ])
>>                                 (const_int 4 [0x4]))
>>                             (symbol_ref:DI ("S") [flags 0x40] <var_decl
>> 0x2a9853cb00 S>)) [3 S S4 A32]))
>>                 (nil)))))
>>
>> (insn 86 23 25 3 (set (reg/v:SI 64 [ t.42 ])
>>         (subreg:SI (reg:DI 73 [ t.42 ]) 0)) -1 (nil)
>>     (nil))
>>
>> (insn:HI 25 86 30 3 (set (mem/s:SI (plus:DI (reg:DI 66 [ ivtmp.37 ])
>>                 (symbol_ref:DI ("state") [flags 0x40] <var_decl
>> 0x2a9853cc60 state>)) [3 state S4 A8])
>>         (reg/v:SI 64 [ t.42 ])) 40 {*movsi_1} (insn_list:REG_DEP_TRUE 23
>> (nil))
>>     (expr_list:REG_DEAD (reg/v:SI 64 [ t.42 ])
>>         (nil)))
>>
>> If we don't want to add codes to the x86-64 backend to deal with SEE
>> transformation, is there a way for a backend to pass such information
>> to SEE so that SEE can transform them to
>>
>> (insn:HI 23 22 25 3 (parallel [
>>             (set (reg/v:SI 64 [ t.42 ])
>>                 (xor:SI (mem/s:SI (plus:DI (reg:DI 66 [ ivtmp.37 ])
>>                             (symbol_ref:DI ("state") [flags 0x40]
>> <var_decl 0x2a9853cc60 state>)) [3 state S4
>> A8])
>>                     (reg/v:SI 64 [ t.42 ])))
>>             (clobber (reg:CC 17 flags))
>>         ]) 340 {*xorsi_1} (insn_list:REG_DEP_TRUE 22 (nil))
>>     (expr_list:REG_UNUSED (reg:CC 17 flags)
>>         (expr_list:REG_EQUAL (xor:SI (mem/s:SI (plus:DI (reg:DI 66 [
>> ivtmp.37 ])
>>                         (symbol_ref:DI ("state") [flags 0x40] <var_decl
>> 0x2a9853cc60 state>)) [3 state S4 A8])
>>                 (mem/s:SI (plus:DI (mult:DI (reg:DI 69 [ t.45 ])
>>                             (const_int 4 [0x4]))
>>                         (symbol_ref:DI ("S") [flags 0x40] <var_decl
>> 0x2a9853cb00 S>)) [3 S S4 A32]))
>>             (nil))))
>>
>> (insn:HI 25 23 27 3 (set (mem/s:SI (plus:DI (reg:DI 66 [ ivtmp.37 ])
>>                 (symbol_ref:DI ("state") [flags 0x40] <var_decl
>> 0x2a9853cc60 state>)) [3 state S4 A8])
>>         (reg/v:SI 64 [ t.42 ])) 40 {*movsi_1} (insn_list:REG_DEP_TRUE 23
>> (nil))
>>     (nil))

If this is the optimal code, what should happen to the uses of reg 73,
such as the one in this instruction?
(insn:HI 31 30 33 3 (parallel [
            (set (reg/v:SI 63 [ t.43 ])
                (xor:SI (mem/s:SI (plus:DI (mult:DI (reg:DI 73 [ t.42 ])
                                (const_int 4 [0x4]))
                            (symbol_ref:DI ("S") [flags 0x40] <var_decl 0x2a9853cb00 S>)) [3 S S4 A32])
                    (reg/v:SI 63 [ t.43 ])))
            (clobber (reg:CC 17 flags))
        ]) 340 {*xorsi_1} (insn_list:REG_DEP_TRUE 27 (insn_list:REG_DEP_TRUE 30 (nil)))
    (expr_list:REG_DEAD (reg:DI 73 [ t.42 ])
        (expr_list:REG_UNUSED (reg:CC 17 flags)
            (expr_list:REG_EQUAL (xor:SI (mem/s:SI (plus:DI (mult:DI (reg:DI 73 [ t.42 ])
                                (const_int 4 [0x4]))
                            (symbol_ref:DI ("S") [flags 0x40] <var_decl 0x2a9853cb00 S>)) [3 S S4 A32])
                    (mem/s:SI (plus:DI (reg:DI 66 [ ivtmp.37 ])
                            (const:DI (plus:DI (symbol_ref:DI ("state") [flags 0x40] <var_decl 0x2a9853cc60 state>)
                                    (const_int 4 [0x4])))) [3 state S4 A8]))
                (nil)))))

Maybe the transformation should be to this pattern:
(insn:HI 23 22 25 3 (parallel [
            (set (subreg:SI (reg:DI 73 [ t.42 ]) 0)
                (xor:SI (mem/s:SI (plus:DI (reg:DI 66 [ ivtmp.37 ])
                            (symbol_ref:DI ("state") [flags 0x40]
<var_decl 0x2a9853cc60 state>)) [3 state S4
A8])
                    (reg/v:SI 64 [ t.42 ])))
            (clobber (reg:CC 17 flags))
        ]) 340 {*xorsi_1} (insn_list:REG_DEP_TRUE 22 (nil))
    (expr_list:REG_UNUSED (reg:CC 17 flags)
        (expr_list:REG_EQUAL (xor:SI (mem/s:SI (plus:DI (reg:DI 66 [
ivtmp.37 ])
                        (symbol_ref:DI ("state") [flags 0x40] <var_decl
0x2a9853cc60 state>)) [3 state S4 A8])
                (mem/s:SI (plus:DI (mult:DI (reg:DI 69 [ t.45 ])
                            (const_int 4 [0x4]))
                        (symbol_ref:DI ("S") [flags 0x40] <var_decl
0x2a9853cb00 S>)) [3 S S4 A32]))
            (nil))))

(insn:HI 25 23 27 3 (set (mem/s:SI (plus:DI (reg:DI 66 [ ivtmp.37 ])
                (symbol_ref:DI ("state") [flags 0x40] <var_decl
0x2a9853cc60 state>)) [3 state S4 A8])
        (subreg:SI (reg:DI 73 [ t.42 ]) 0)) 40 {*movsi_1} (insn_list:REG_DEP_TRUE 23 (nil))
    (nil))

If what you say here:
> We may be able to teach the x86-64 backend about SEE. If we can tell SEE
> that SI is always zero-extended to DI for a backend, SEE can do a much
> better job for x86-64.
is true, then the high part of reg 73 is automatically a zero extension of
the low part. No extension instruction is needed, not even one embedded in
the defining instruction like the one SEE currently produces:
>>             (set (reg:DI 73 [ t.42 ])
>>                 (zero_extend:DI (xor:SI (mem/s:SI (plus:DI (reg:DI 66 [
>> ivtmp.37 ])
And all the uses of reg:DI 73 may stay unchanged.

Did I understand you correctly?
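
To make the question concrete, here is a minimal C sketch of the semantics
I have in mind. It only models the register behaviour; it is not SEE code.
The names write_si, xor_zext and xor_subreg are hypothetical, and the model
assumes the x86-64 rule that a 32-bit register write clears the upper 32 bits.

```c
#include <stdint.h>

/* Hypothetical model of an SImode write to a 64-bit x86-64 register:
   the low 32 bits receive the value and the high 32 bits become zero. */
uint64_t write_si(uint64_t old_reg, uint32_t value)
{
    (void)old_reg;           /* previous contents, high part included, are lost */
    return (uint64_t)value;  /* implicit zero extension */
}

/* What SEE emits today: (zero_extend:DI (xor:SI mem t)). */
uint64_t xor_zext(uint32_t mem, uint32_t t)
{
    return (uint64_t)(mem ^ t);
}

/* The suggested form: a plain SImode xor whose destination is
   (subreg:SI (reg:DI 73) 0), relying on the implicit extension. */
uint64_t xor_subreg(uint64_t reg73, uint32_t mem, uint32_t t)
{
    return write_si(reg73, mem ^ t);
}
```

Under this model both forms leave the same DImode value in reg 73, whatever
the register held before, which is why the DImode uses would not need to change.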

On PPC, the only implicit extensions are in instructions that
set a register to a constant value; e.g. setting a DI register to
the value 1 has an implicit extension.

Leehod.


