[Bug tree-optimization/78821] GCC7: Copying whole 32 bits structure field by field not optimised into copying whole 32 bits at once
jakub at gcc dot gnu.org
gcc-bugzilla@gcc.gnu.org
Mon Nov 20 11:28:00 GMT 2017
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78821
--- Comment #19 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
(In reply to Uroš Bizjak from comment #17)
> Hm, even with the latest patch, the testcase from comment #5
> still compiles to:
>
>         movl    %esi, %eax
>         movw    %si, (%rdi)
>         notl    %esi
>         notl    %eax
>         movb    %sil, 3(%rdi)
>         movb    %ah, 2(%rdi)
>         ret
The reason for that is that the IL is something the bswap framework can't
handle. Let's look just at the simplified:
void baz (char *buf, unsigned int data)
{
  buf[2] = ~data >> 8;
  buf[3] = ~data;
}
  _1 = ~data_6(D);
  _2 = _1 >> 8;
  _3 = (char) _2;
  MEM[(char *)buf_7(D) + 2B] = _3;
  _4 = (char) data_6(D);
  _5 = ~_4;
  MEM[(char *)buf_7(D) + 3B] = _5;
If it were instead:
  _1 = ~data_6(D);
  _2 = _1 >> 8;
  _3 = (char) _2;
  MEM[(char *)buf_7(D) + 2B] = _3;
  _4 = (char) _1;
  MEM[(char *)buf_7(D) + 3B] = _4;
then the bswap framework would handle it. So I think this is a missed
optimization in FRE (or whatever else performs SCCVN), or something match.pd
should handle.
As for:
> void baz (char *buf, unsigned int data)
> {
>   buf[0] = data >> 8;
>   buf[1] = data;
> }
not using movbew, that is something that should be done in the backend.
The middle-end has no bswap16 and considers {L,R}ROTATE_EXPR by 8 the
canonical 16-bit byte swap. Please also have a look at:
unsigned short
baz (unsigned short *buf)
{
  unsigned short a = buf[0];
  return ((unsigned short) (a >> 8)) | (unsigned short) (a << 8);
}
where we could also emit movbew instead of movw + rolw (if that is actually a
win). Thus, I think i386.md should provide combine patterns (or peephole2s,
if combine doesn't work for some reason) for this.