[RFC] Type promotion pass and elimination of zext/sext
Kugan Vivekanandarajah
kugan.vivekanandarajah@linaro.org
Mon May 23 00:25:00 GMT 2016
Hi Jeff,
On 20 May 2016 at 04:17, Jeff Law <law@redhat.com> wrote:
> On 05/15/2016 06:45 PM, Kugan Vivekanandarajah wrote:
>>
>> Hi Richard,
>>
>> Now that stage1 is open, I would like to get the type promotion passes
>> reviewed again. I have tested the patches on aarch64, x86-64, and
>> ppc64le without any new execution failures. There are some test cases that
>> fail in pattern scans. I will address them after getting feedback on the
>> basic structure.
>
> I find myself wondering if this will eliminate some of the cases where Kai's
> type casting motion was useful. And just to be clear, that would be a good
> thing.
Let me try with the above patch.
However, the aarch64 testsuite diff is:
# Comparing 11 common sum files
## /bin/sh ./gcc/contrib/compare_tests /tmp/gxx-sum1.20074 /tmp/gxx-sum2.20074
Tests that now fail, but worked before:
gcc.dg/tree-ssa/pr54245.c scan-tree-dump-times slsr "Inserting initializer" 0
gcc.dg/tree-ssa/pr69270-3.c scan-tree-dump-times uncprop1 ", 1" 1
New tests that PASS:
gcc.c-torture/unsorted/dump-noaddr.c.*t.promotion, -O1 comparison
gcc.c-torture/unsorted/dump-noaddr.c.*t.promotion, -O2 comparison
gcc.c-torture/unsorted/dump-noaddr.c.*t.promotion, -O2 -flto
-fno-use-linker-plugin -flto-partition=none comparison
gcc.c-torture/unsorted/dump-noaddr.c.*t.promotion, -O3
-fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer
-finline-functions comparison
gcc.c-torture/unsorted/dump-noaddr.c.*t.promotion, -O3 -g comparison
gcc.c-torture/unsorted/dump-noaddr.c.*t.promotion, -Os comparison
## Differences found:
# 1 differences in 11 common sum files found
ppc64 had more.
I see improvements and some regressions too.
For example:
short unPack( unsigned char c )
{
  c = c & (unsigned char)0x0F ;
  if( c > 7 ) {
    return( ( short )( c - 8 ) ) ;
  }
  else
  {
    return( ( short )c ) ;
  }
}
asm difference:
and w0, w0, 15
- cmp w0, 7
- bhi .L5
- sxth w0, w0
- ret
- .p2align 3
-.L5:
- sub w0, w0, #8
- sxth w0, w0
+ sub w1, w0, #8
+ cmp w0, 8
+ csel w0, w1, w0, cs
ret
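To make the asm difference above easier to check, here is a minimal sketch that models the new and/sub/cmp/csel sequence in C and compares it with the source function for every input. The names unpack_ref, unpack_promoted and check_unpack_equivalence are hypothetical, and the register model is an assumption made for illustration, not compiler output.

```c
#include <assert.h>
#include <stdint.h>

/* The function from the example, kept as written. */
static short unpack_ref(unsigned char c)
{
    c = c & (unsigned char)0x0F;
    if (c > 7)
        return (short)(c - 8);
    else
        return (short)c;
}

/* Hypothetical 32-bit model of the new sequence; no sxth is needed
   because the selected value is always in the range 0..7. */
static int32_t unpack_promoted(uint32_t w0)
{
    w0 &= 15;                             /* and  w0, w0, 15      */
    uint32_t w1 = w0 - 8;                 /* sub  w1, w0, #8      */
    return (int32_t)(w0 >= 8 ? w1 : w0);  /* cmp  w0, 8
                                             csel w0, w1, w0, cs  */
}

/* Compare both versions for every unsigned char value. */
static void check_unpack_equivalence(void)
{
    for (unsigned c = 0; c <= 255; c++)
        assert((int32_t)unpack_ref((unsigned char)c) == unpack_promoted(c));
}
```

Calling check_unpack_equivalence() from a small test driver exercises all 256 inputs. In this model the mask also makes the result independent of whatever is in the upper bits of w0.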
short bar (short y);
int foo (short x)
{
  short y = bar (x) + 15;
  return y;
}
Base - optimized gimple
foo (short int x)
{
short int y;
short int _1;
unsigned short _2;
unsigned short _3;
int _8;
<bb 2>:
_1 = bar (x_5(D));
_2 = (unsigned short) _1;
_3 = _2 + 15;
y_7 = (short int) _3;
_8 = (int) y_7;
return _8;
}
With type promotion - optimized gimple
foo (short int x)
{
unsigned int _2;
unsigned int _3;
signed int _7;
int _8;
short int _10;
unsigned int _12;
<bb 2>:
_10 = bar (x_9(D));
_2 = (unsigned int) _10;
_3 = _2 + 15;
_12 = _3 & 65535;
_7 = (signed int) _12;
_8 = (_7) sext (16);
return _8;
}
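The two dumps above can be modelled in plain C to confirm that they compute the same value. This is only a sketch: foo_base, foo_promoted, sext16 and check_foo_equivalence are hypothetical names, the parameter r stands for the value returned by bar, and sext16 is my reading of the "sext (16)" operation in the promoted dump.

```c
#include <assert.h>
#include <stdint.h>

/* Sign-extend the low 16 bits of v (assumed meaning of "sext (16)"). */
static int32_t sext16(uint32_t v)
{
    v &= 0xFFFFu;
    return (v & 0x8000u) ? (int32_t)v - 0x10000 : (int32_t)v;
}

/* Base dump: the addition is done in unsigned short. */
static int32_t foo_base(int16_t r)
{
    uint16_t t2 = (uint16_t)r;          /* _2 = (unsigned short) _1;       */
    uint16_t t3 = (uint16_t)(t2 + 15);  /* _3 = _2 + 15;                   */
    return sext16(t3);                  /* y_7 = (short int) _3;
                                           _8 = (int) y_7;                 */
}

/* Promoted dump: the addition is done in unsigned int, then masked. */
static int32_t foo_promoted(int16_t r)
{
    uint32_t t2 = (uint32_t)r;          /* _2 = (unsigned int) _10;        */
    uint32_t t3 = t2 + 15;              /* _3 = _2 + 15;                   */
    uint32_t t12 = t3 & 65535;          /* _12 = _3 & 65535;               */
    return sext16(t12);                 /* _7 = (signed int) _12;
                                           _8 = (_7) sext (16);            */
}

/* Compare both models for every possible short value. */
static void check_foo_equivalence(void)
{
    for (int32_t r = INT16_MIN; r <= INT16_MAX; r++)
        assert(foo_base((int16_t)r) == foo_promoted((int16_t)r));
}
```

The loop covers all 65536 short values, including the wrap-around cases near INT16_MAX where the 16-bit addition overflows.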
ASM difference:
stp x29, x30, [sp, -16]!
add x29, sp, 0
bl bar
+ sxth w0, w0
add w0, w0, 15
ldp x29, x30, [sp], 16
sxth w0, w0
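As a rough check that the added sxth is redundant rather than a correctness fix, the sketch below models the tail of both sequences on a 32-bit register value. The helper names (sext16, tail_base, tail_promoted, check_tail_equivalence) and the chosen upper-bit patterns are assumptions for illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Sign-extend the low 16 bits of v, modelling "sxth w0, w0". */
static int32_t sext16(uint32_t v)
{
    v &= 0xFFFFu;
    return (v & 0x8000u) ? (int32_t)v - 0x10000 : (int32_t)v;
}

/* Base: add w0, w0, 15 ; sxth w0, w0 */
static int32_t tail_base(uint32_t w0)
{
    return sext16(w0 + 15);
}

/* Promoted: sxth w0, w0 ; add w0, w0, 15 ; sxth w0, w0 */
static int32_t tail_promoted(uint32_t w0)
{
    uint32_t e = (uint32_t)sext16(w0);
    return sext16(e + 15);
}

/* The final sxth only looks at the low 16 bits of the sum, and those
   depend only on the low 16 bits of the input, so both tails agree. */
static void check_tail_equivalence(void)
{
    static const uint32_t upper[] = { 0x00000000u, 0xFFFF0000u,
                                      0xABCD0000u, 0x7FFF0000u };
    for (unsigned i = 0; i < sizeof upper / sizeof upper[0]; i++)
        for (uint32_t low = 0; low <= 0xFFFFu; low++)
            assert(tail_base(upper[i] | low) == tail_promoted(upper[i] | low));
}
```

Under this model the first sxth never changes the result, which matches reading the extra instruction as a code-size regression only.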
Thanks,
Kugan
>
>>
>> 1. When we promote an SSA name as part of promote_ssa, we either promote the
>> definition or create a copy stmt that is inserted after the stmt that
>> defines it, i.e., we want to promote the SSA name and reflect the promotion
>> on all the uses (we promote in place). We do this because we don’t
>> want to change all the uses.
>>
>> +/* Promote definition DEF to promoted type. If the stmt that defines def
>> + is def_stmt, make the type of def promoted type. If the stmt is such
>> + that, result of the def_stmt cannot be of promoted type, create a new_def
>> + of the original_type and make the def_stmt assign its value to new_def.
>> + Then, create a NOP_EXPR to convert new_def to def of promoted type.
>> +
>> + For example, for stmt with original_type char and promoted_type int:
>> + char _1 = mem;
>> + becomes:
>> + char _2 = mem;
>> + int _1 = (int)_2;
>
> When does this case happen, and how is this any better than PRE or other
> elimination/code motion algorithms in improving the generated code?
>
> I would hazard a guess that it could happen if you still needed the char
> sized used in a small number of cases, but generally wanted to promote most
> uses to int?
>
>> +
>>
>> However, if the defining stmt has to be the last stmt in the basic
>> block (e.g., a stmt that can throw), and if there is more than one normal
>> edge where we use this value, we can't insert the copy on all the
>> edges. Please note that the copy stmt copies the value to the promoted SSA
>> with the same name.
>>
>> Therefore I had to return false in this case for promote_ssa and fix up
>> the uses. I ran into this while testing ppc64le. I am sure it can happen
>> in other cases.
>
> Right.
>
> Jeff