User account creation filtered due to spam.
The following example is taken from libav, which contains quite some uses of this code pattern. typedef unsigned short int uint16_t; union unaligned_16 { uint16_t l; } __attribute__((packed)); int test (unsigned char* buf, int bits_per_component) { (((union unaligned_16 *)(buf))->l) = __builtin_bswap16 (bits_per_component == 10 ? 1 : 0); return 0; } compiled with 4.8 / 4.9 and -m4 -O2: mov r5,r0 cmp/eq #10,r0 movt r0 swap.b r0,r0 extu.w r0,r0 extu.b r0,r1 shlr8 r0 mov.b r1,@r4 mov.b r0,@(1,r4) rts mov #0,r0 compiled with 5.0 r220892: mov r5,r0 cmp/eq #10,r0 movt r1 swap.b r1,r1 // r1 = T << 8 extu.w r1,r1 tst r1,r1 // T = r1 == 0 = 1-T mov #-1,r0 negc r0,r0 // r0 = 1-T = r1 != 0 = cmp/eq #10,r0 extu.b r1,r2 mov.b r0,@(1,r4) // @(1,r4) = cmp/eq #10,r0 mov.b r2,@r4 // @r4 = 0 rts mov #0,r0 This is caused by the introduction of the treg_set_expr stuff. For some reason, combine tries to recalculate the 'other' stored byte and tries this pattern: Successfully matched this instruction: (set (reg:SI 179) (ne:SI (reg:SI 164 [ D.1476 ]) (const_int 0 [0]))) The resulting insn sequence looks roughly like this: (set (reg:SI 169) (eq:SI (reg:SI 5 r5) (const_int 10))) (set (reg:HI 170) (rotate:HI (subreg:HI (reg:SI 169) 0) (const_int 8))) (set (reg:SI 164) (zero_extend:SI (reg:HI 170))) (set (reg:SI 171) (zero_extend:SI (subreg:QI (reg:SI 164) 0))) (set (mem:QI (reg/v/f:SI 166) (subreg:QI (reg:SI 171) 0))) (set (reg:SI 179) (ne:SI (reg:SI 164) (const_int 0))) (set (mem:QI (plus:SI (reg/v/f:SI 166) (const_int 1))) (subreg:QI (reg:SI 179) 0)) If the ne:SI pattern is not matched, the redundant comparison/test is not emitted. The sh_treg_combine.cc RTL was actually meant to handle such cases of repeated T bit inversions/tests. However, at the moment it is triggered by conditional insns only. Moreover, it currently will not look through zero_extend and the rotate insns. On the other hand, this seems to happen only for unaligned stores. Examples such as typedef unsigned short int uint16_t; int test_00 (unsigned short* buf, int bits_per_component) { buf[0] = __builtin_bswap16 (bits_per_component == 10 ? 1 : 0); return 0; } int test_01 (unsigned char* buf, int bits_per_component) { buf[0] = __builtin_bswap16 (bits_per_component == 10 ? 1 : 0); return 0; } int test_02 (unsigned char* buf, int bits_per_component) { buf[0] = __builtin_bswap16 (bits_per_component == 10 ? 1 : 0) >> 8; return 0; } int test_03 (unsigned char* buf, int bits_per_component) { buf[1] = __builtin_bswap16 (bits_per_component == 10 ? 1 : 0) >> 0; buf[0] = __builtin_bswap16 (bits_per_component == 10 ? 1 : 0) >> 8; return 0; } .. do not suffer from the problem. Probably this problem will not be triggered after unaligned loads/stores have been improved (PR 64306).
(In reply to Oleg Endo from comment #0) > The following example is taken from libav, which contains quite some uses of > this code pattern. > > typedef unsigned short int uint16_t; > union unaligned_16 { uint16_t l; } __attribute__((packed)); > > int > test (unsigned char* buf, int bits_per_component) > { > (((union unaligned_16 *)(buf))->l) = > __builtin_bswap16 (bits_per_component == 10 ? 1 : 0); > > return 0; > } > BTW, it should actually translate to something like: mov r6,r0 cmp/eq #10,r0 movt r0 mov.b r0,@(1,r4) mov #0,r0 rts mov.b r0,@r4 or mov r6,r0 cmp/eq #10,r0 movt r0 mov.b r0,@(1,r4) shlr8 r0 rts mov.b r0,@r4
GCC 5.1 has been released.
GCC 5.2 is being released, adjusting target milestone to 5.3.
GCC 5.3 is being released, adjusting target milestone.
GCC 5.4 is being released, adjusting target milestone.