The attached code is a modified gcc.dg/pr46647.c. It shows that memset on short (int-sized) unaligned data isn't optimized the way it is for aligned data, even on non-strict-alignment targets such as cris-* and x86_64-linux. Observe the emitted assembly code: it uses the same instructions for the aligned and the unaligned case, because later optimizations cover up the miss (on both cris-* and x86_64-linux). Hence, I guess this bug isn't really that important when it comes to just the generated code; it's just an annoying middle-end miss and an annoyingly failing test-case. (Whether the overly strict alignment checks miss other optimization opportunities is another issue.)
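For readers without the attachment at hand, here is a minimal sketch of the kind of comparison described above. It is not the attached file: the names (aligned_obj, struct unaligned_int, struct holder, set_aligned, set_unaligned) are made up for illustration, and using a packed struct to force 1-byte alignment is an assumption about how to provoke the unaligned case.

```c
#include <string.h>

/* Hypothetical illustration, not the contents of the attachment.  */

/* Naturally aligned int-sized object.  */
int aligned_obj;

/* Int-sized object whose alignment is forced down to one byte
   (GNU C packed attribute), placed at an odd offset.  */
struct __attribute__ ((packed)) unaligned_int { int v; };
struct holder { char pad; struct unaligned_int u; };
struct holder h;

/* memset of an aligned int: expected to fold to a plain store.  */
void
set_aligned (void)
{
  memset (&aligned_obj, -1, sizeof aligned_obj);
}

/* The same memset on the unaligned object: the case this report is about.  */
void
set_unaligned (void)
{
  memset (&h.u, -1, sizeof h.u);
}
```

One way to inspect the difference would be to compile with something like -O2 -fdump-tree-optimized and check whether a memset call survives in either function, then compare with the final assembly.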
Background: I stumbled upon this when changing the CRIS port to align global data by default. This made the previously always-failing gcc.dg/pr46647.c pass, for no good reason: alignment of data should not make a difference for emitted code (except for atomic support, WIP for CRIS).
This may be related to PR 52861.
Created attachment 27528
>alignment of data should not make a difference for emitted code
Unless unaligned loads are much slower. I'm thinking of cases where two aligned half-word loads are better than one unaligned word load. I think there are targets like that.
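To make the trade-off concrete, here is a rough C sketch of the two access strategies. The function names load_word and load_two_halves are hypothetical, and which variant is cheaper depends entirely on the target; this only shows that both read the same value.

```c
#include <stdint.h>
#include <string.h>

/* One 32-bit load from a possibly unaligned address.  memcpy leaves
   the choice of instructions to the compiler.  */
uint32_t
load_word (const void *p)
{
  uint32_t w;
  memcpy (&w, p, sizeof w);
  return w;
}

/* Two 16-bit loads from a 2-byte-aligned address, combined in host
   byte order so the result matches load_word.  */
uint32_t
load_two_halves (const uint16_t *p)
{
  const uint16_t probe = 1;
  unsigned char first_byte;
  uint32_t first = p[0];
  uint32_t second = p[1];

  /* first_byte is 1 on a little-endian host, 0 on a big-endian one.  */
  memcpy (&first_byte, &probe, 1);

  return first_byte ? (first | (second << 16))
                    : ((first << 16) | second);
}
```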
(In reply to comment #0)
> Observe the
> emitted assembly code, which uses the same instructions for aligned and
> unaligned code
...(compare with the code from gcc.dg/pr46647.c here)...
SVN revision 188039.
(In reply to comment #2)
> Unless unaligned loads are much slower.
Well, yes, I missed adding !SLOW_UNALIGNED_ACCESS in the title. :)
(In reply to comment #4)
> Well, yes, I missed adding !SLOW_UNALIGNED_ACCESS in the title. :)
Never mind: SLOW_UNALIGNED_ACCESS != 0 implies something much more severe than the cost of single insns. Bah. Maybe this is an issue of a missing cost metric.
Date: Thu Jun 7 20:44:01 2012
New Revision: 188317
* gcc.dg/pr46647.c: xfail for cris-* and crisv32-*.