[Bug target/65162] New: [5 Regression][SH] Redundant tests when storing bswapped T bit result in unaligned mem
olegendo at gcc dot gnu.org
gcc-bugzilla@gcc.gnu.org
Sun Feb 22 12:21:00 GMT 2015
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65162
Bug ID: 65162
Summary: [5 Regression][SH] Redundant tests when storing
bswapped T bit result in unaligned mem
Product: gcc
Version: 5.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: olegendo at gcc dot gnu.org
Target: sh*-*-*
The following example is taken from libav, which contains quite some uses of
this code pattern.
typedef unsigned short int uint16_t;
union unaligned_16 { uint16_t l; } __attribute__((packed));
int
test (unsigned char* buf, int bits_per_component)
{
(((union unaligned_16 *)(buf))->l) =
__builtin_bswap16 (bits_per_component == 10 ? 1 : 0);
return 0;
}
compiled with 4.8 / 4.9 and -m4 -O2:
mov r5,r0
cmp/eq #10,r0
movt r0
swap.b r0,r0
extu.w r0,r0
extu.b r0,r1
shlr8 r0
mov.b r1,@r4
mov.b r0,@(1,r4)
rts
mov #0,r0
compiled with 5.0 r220892:
mov r5,r0
cmp/eq #10,r0
movt r1
swap.b r1,r1 // r1 = T << 8
extu.w r1,r1
tst r1,r1 // T = r1 == 0 = 1-T
mov #-1,r0
negc r0,r0 // r0 = 1-T = r1 != 0 = cmp/eq #10,r0
extu.b r1,r2
mov.b r0,@(1,r4) // @(1,r4) = cmp/eq #10,r0
mov.b r2,@r4 // @r4 = 0
rts
mov #0,r0
This is caused by the introduction of the treg_set_expr stuff.
For some reason, combine tries to recalculate the 'other' stored byte and tries
this pattern:
Successfully matched this instruction:
(set (reg:SI 179)
(ne:SI (reg:SI 164 [ D.1476 ])
(const_int 0 [0])))
The resulting insn sequence looks roughly like this:
(set (reg:SI 169) (eq:SI (reg:SI 5 r5) (const_int 10)))
(set (reg:HI 170) (rotate:HI (subreg:HI (reg:SI 169) 0) (const_int 8)))
(set (reg:SI 164) (zero_extend:SI (reg:HI 170)))
(set (reg:SI 171) (zero_extend:SI (subreg:QI (reg:SI 164) 0)))
(set (mem:QI (reg/v/f:SI 166) (subreg:QI (reg:SI 171) 0)))
(set (reg:SI 179) (ne:SI (reg:SI 164) (const_int 0)))
(set (mem:QI (plus:SI (reg/v/f:SI 166) (const_int 1)))
(subreg:QI (reg:SI 179) 0))
If the ne:SI pattern is not matched, the redundant comparison/test is not
emitted.
The sh_treg_combine.cc RTL was actually meant to handle such cases of repeated
T bit inversions/tests. However, at the moment it is triggered by conditional
insns only. Moreover, it currently will not look through zero_extend and the
rotate insns.
On the other hand, this seems to happen only for unaligned stores. Examples
such as
typedef unsigned short int uint16_t;
int
test_00 (unsigned short* buf, int bits_per_component)
{
buf[0] =
__builtin_bswap16 (bits_per_component == 10 ? 1 : 0);
return 0;
}
int
test_01 (unsigned char* buf, int bits_per_component)
{
buf[0] =
__builtin_bswap16 (bits_per_component == 10 ? 1 : 0);
return 0;
}
int
test_02 (unsigned char* buf, int bits_per_component)
{
buf[0] =
__builtin_bswap16 (bits_per_component == 10 ? 1 : 0) >> 8;
return 0;
}
int
test_03 (unsigned char* buf, int bits_per_component)
{
buf[1] = __builtin_bswap16 (bits_per_component == 10 ? 1 : 0) >> 0;
buf[0] = __builtin_bswap16 (bits_per_component == 10 ? 1 : 0) >> 8;
return 0;
}
.. do not suffer from the problem.
Probably this problem will not be triggered after unaligned loads/stores have
been improved (PR 64306).
More information about the Gcc-bugs
mailing list