In a compiler built from svn revision 114878, a memset(,0,) followed by storing 0 into some of the just-cleared locations produces redundant stores.

#include <string.h>
struct blob { int a[3]; void *p; };
void foo (struct blob *bp) {
  int i;
  memset (bp, 0, 1024 * sizeof(*bp));
  /* Null pointer not required by ANSI to be all-bits-0, so: */
  for (i = 0; i < 1024; i++) bp[i].p = 0;
}

With "gcc -O9 -fomit-frame-pointer -march=pentiumpro -mtune=pentiumpro" the generated assembly contains a call to memset, then a loop storing 0 into the pointer slots. But on this platform a null pointer has all bits clear, so the loop is redundant.

Adding "-minline-all-stringops" doesn't help; the memset call is replaced by a sequence using "rep stosl", and the following loop is still there.

If I change the array size from 1024 to 1, then gcc expands the memset inline (no loop) and figures out the redundancy.

Same issue with storing zero in bit fields after memset:

#include <string.h>
struct blob { unsigned char a:1, b:7; };
void foo (struct blob *bp) {
  int i;
  memset(bp, 0, 1024 * sizeof(*bp));
  for (i = 0; i < 1024; i++) bp[i].a = 0;
}

The memset is followed by a loop with "andb $-2,...".

A possible optimization I'm less confident would be allowed, for odd cases: if I change the second example to use "bp[i].a = 1", is the compiler allowed to optimize this into memset(,1,)? If so, add that to the wish list. :-)
s/I'm less would be allowed/I'm less confident would be allowed/
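For anyone who wants to check the two target-specific assumptions behind this request, here is a minimal sketch. It is not part of the original report, and the function names and the struct name "bits" are made up for illustration. It tests whether a null pointer is stored as all-zero bytes, and whether setting the 1-bit field after a zeroing memset leaves the byte equal to 1, which is what the memset(,1,) idea would need. Bit-field layout is implementation-defined, so the second result can differ between targets.

```c
#include <string.h>

/* Same layout as the second example in the report (renamed here). */
struct bits { unsigned char a:1, b:7; };

/* Returns 1 if a null pointer's object representation is all zero bytes,
   i.e. if the explicit "bp[i].p = 0" loop after memset is redundant. */
int null_pointer_is_all_bits_zero(void)
{
    void *p = 0;
    unsigned char zero[sizeof p];

    memset(zero, 0, sizeof zero);
    return memcmp(&p, zero, sizeof p) == 0;
}

/* Returns 1 if "memset 0, then a = 1" yields a single byte with value 1,
   i.e. if the loop could in principle be folded into memset(,1,).
   Returns 0 if the struct is larger than one byte or the bit lands
   elsewhere (for example in the high bit). */
int set_bit_matches_memset_1(void)
{
    struct bits b;
    unsigned char byte;

    if (sizeof b != 1)
        return 0;
    memset(&b, 0, sizeof b);
    b.a = 1;
    memcpy(&byte, &b, 1);
    return byte == 1;
}
```

Calling both functions from a small main and printing the results shows whether the assumptions hold for a given compiler and target. The compiler knows both facts at compile time, so it would not need a runtime check like this.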
Confirmed:

<bb 2>:
memset (bp_2(D), 0, 24576);

<bb 3>:
# ivtmp.14_9 = PHI <ivtmp.14_20(3), 0(2)>
MEM[base: bp_2(D), index: ivtmp.14_9, offset: 16B] = 0B;
ivtmp.14_20 = ivtmp.14_9 + 24;
if (ivtmp.14_20 != 24576)
  goto <bb 3>;
else
  goto <bb 4>;
*** Bug 104276 has been marked as a duplicate of this bug. ***