I tested an svn build from 20100813 with the following code:

struct bar {
        unsigned int a:1, b:1, c:1, d:1, e:28;
};

void foo(struct bar * __restrict__ src, struct bar * __restrict__ dst)
{
        dst->a = src->a;
        dst->b = src->b;
        dst->c = src->c;
        dst->d = src->d;
        dst->e = src->e;
}

Built as 32-bit, we see redundant loads and stores, as if the compiler were following pointer aliasing rules despite __restrict__:

# gcc -m32 -O2 -S foo.c

foo:
        lwz 9,0(3)
        lwz 0,0(4)
        rlwimi 0,9,0,0,0
        stw 0,0(4)
        lwz 9,0(3)
        rlwimi 0,9,0,1,1
        stw 0,0(4)
        lwz 9,0(3)
        rlwimi 0,9,0,2,2
        stw 0,0(4)
        lwz 9,0(3)
        rlwimi 0,9,0,3,3
        stw 0,0(4)
        lwz 9,0(3)
        rlwimi 0,9,0,4,31
        stw 0,0(4)
        blr

Apologies if I am misusing or misinterpreting the use of __restrict__ here.

Also, when built as 64-bit, things are considerably more complex. Is there a reason why we can't use the same code as 32-bit?

# gcc -m64 -O2 -S foo.c
...
.L.foo:
        lwz 9,0(4)
        lwz 0,0(3)
        rlwinm 9,9,0,1,31
        rlwinm 0,0,0,0,0
        or 0,9,0
        stw 0,0(4)
        rlwinm 0,0,1,1,31
        rlwinm 0,0,31,0xffffffff
        lwz 9,0(3)
        rldicl 9,9,34,63
        slwi 9,9,30
        or 0,0,9
        stw 0,0(4)
        rlwinm 9,0,2,1,31
        rlwinm 9,9,30,0xffffffff
        lwz 0,0(3)
        rldicl 0,0,35,63
        slwi 0,0,29
        or 0,9,0
        stw 0,0(4)
        rlwinm 0,0,3,1,31
        rlwinm 0,0,29,0xffffffff
        lwz 9,0(3)
        rldicl 9,9,36,63
        slwi 9,9,28
        or 0,0,9
        stw 0,0(4)
        rlwinm 0,0,0,0,3
        lwz 9,0(3)
        rlwinm 9,9,0,4,31
        or 0,0,9
        stw 0,0(4)
        blr
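For reference, the five fields sum to exactly 32 bits (1+1+1+1+28), so foo is semantically a whole-word copy. A minimal sketch of the code one would hope for (foo_wordcopy is a hypothetical name, and it assumes a 32-bit unsigned int with no padding):

#include <string.h>

struct bar {
        unsigned int a:1, b:1, c:1, d:1, e:28;
};

_Static_assert(sizeof(struct bar) == 4,
               "all five bitfields pack into one 32-bit word");

void foo_wordcopy(struct bar * __restrict__ src, struct bar * __restrict__ dst)
{
        /* one 32-bit load plus one 32-bit store */
        memcpy(dst, src, sizeof *dst);
}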
I don't think this has anything to do with restrict; it is all about bitfield accesses being lowered only during expansion, and at the RTL level the bitfield operations being too big for the combiner to optimize.
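To illustrate what lowering only at expansion time produces (my sketch, not actual compiler output): each single-field store such as dst->a = src->a becomes a word-sized read-modify-write, which is exactly the lwz/rlwimi/stw triple in the 32-bit listing above. Roughly, in C (store_field_a and MASK_A are illustrative; the real bit position depends on target endianness and layout):

#include <stdint.h>
#include <string.h>

#define MASK_A 0x1u   /* illustrative mask for field a within the word */

void store_field_a(void *dst, const void *src)
{
        uint32_t d, s;
        memcpy(&d, dst, sizeof d);        /* load the containing word of dst */
        memcpy(&s, src, sizeof s);        /* load the containing word of src */
        d = (d & ~MASK_A) | (s & MASK_A); /* insert the field (the rlwimi)   */
        memcpy(dst, &d, sizeof d);        /* store the whole word back       */
}

The combiner then only sees these word-sized operations, and five of them chained together are too large a pattern for it to collapse.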
Expand has the issue:

(insn 7 6 8 t2.c:7 (set (reg:SI 197)
        (mem/s:SI (reg/v/f:DI 193 [ src ]) [0 S4 A32])) -1 (nil))

Notice the alias set of 0, which means the access is assumed to conflict with every other access and so defeats restrict. Confirmed.
(In reply to comment #1)
> I don't think this has anything to do with restrict; it is all about
> bitfield accesses being lowered only during expansion, and at the RTL
> level the bitfield operations being too big for the combiner to optimize.

No, this is unrelated to the combiner not being able to optimize the bitfield accesses. Rather, it is related to how stores and loads happen on bitfields: we don't keep track of which individual bits a store actually changes.
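A sketch of what tracking the changed bits would buy (my illustration, not code from the compiler): each store in the chain only changes the bits under its field mask, and the five masks together cover the whole word, so the chain is equivalent to a plain word copy:

#include <assert.h>
#include <stdint.h>

int main(void)
{
        /* Illustrative masks: five disjoint fields covering all 32 bits
           (actual positions depend on the target's bitfield layout). */
        const uint32_t m[5] = { 0x1u, 0x2u, 0x4u, 0x8u, 0xfffffff0u };
        uint32_t s = 0xdeadbeefu;   /* source word      */
        uint32_t d = 0x12345678u;   /* destination word */

        /* The read-modify-write chain the compiler currently emits: */
        for (int i = 0; i < 5; i++)
                d = (d & ~m[i]) | (s & m[i]);

        /* The masks union to ~0u, so the chain collapses to d = s. */
        assert(d == s);
        return 0;
}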
The alias-set issue hasn't occurred for quite some time (it's using alias set 1 for me). Also, restrict is working. GCC 6 optimizes this on x86_64 to

foo:
.LFB0:
        .cfi_startproc
        movzbl  (%rdi), %edx
        movzbl  (%rsi), %eax
        movl    %edx, %r8d
        movl    %edx, %ecx
        andl    $-4, %eax
        andl    $1, %r8d
        andl    $2, %ecx
        orl     %r8d, %eax
        orl     %ecx, %eax
        movl    %edx, %ecx
        andl    $8, %edx
        andl    $4, %ecx
        andl    $-13, %eax
        orl     %ecx, %eax
        orl     %edx, %eax
        movb    %al, (%rsi)
        movl    (%rdi), %eax
        andl    $-16, %eax
        movl    %eax, %edx
        movl    (%rsi), %eax
        andl    $15, %eax
        orl     %edx, %eax
        movl    %eax, (%rsi)
        ret

so it is doing a good job of copying the struct piecewise. It does not detect that it can simply use a single 32-bit load/store, but there are duplicates in bugzilla for that issue.

Interestingly, with some bitfield lowering work plus some match.pd hackery I get:

foo:
.LFB0:
        .cfi_startproc
        movl    (%rdi), %eax
        movl    %eax, (%rsi)
        ret

whee.
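The key local rewrite such lowering plus match.pd work would rely on is merging adjacent masked inserts; whether match.pd spells it exactly this way is my assumption, but the algebra holds for any masks m1, m2:

#include <stdint.h>

/* Two consecutive bitfield inserts into the same word ... */
uint32_t insert_twice(uint32_t d, uint32_t s, uint32_t m1, uint32_t m2)
{
        uint32_t t = (d & ~m1) | (s & m1);   /* insert field 1 */
        return (t & ~m2) | (s & m2);         /* insert field 2 */
}

/* ... folds to a single insert with the union of the masks: */
uint32_t insert_once(uint32_t d, uint32_t s, uint32_t m1, uint32_t m2)
{
        return (d & ~(m1 | m2)) | (s & (m1 | m2));
}

Applied four times to foo, the five inserts merge into one full-word insert, i.e. the single movl load/store pair above.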
Mine. With my bit-field lowering and a patch to reassociation to handle some BIT_INSERT optimizations, we are able to optimize this to just a load/store.