Sadly, I had been using the following code, presuming that I was telling gcc that some data was aligned:

void copy_something(void *p, const void *s)
{
    struct some_struct __aligned__((aligned(8))) *_d = d;
    struct some_struct __aligned__((aligned(8))) *_s = s;

    *_d = *_s;
}

However, it continually generates copy code with an alignment prologue and epilogue. Then I learned (on freenode's ##gcc) that the proper way to tell gcc this is via __builtin_assume_aligned(), and my problem was solved:

void copy_something(void *p, const void *s)
{
    struct some_struct *_d = __builtin_assume_aligned(d, 8);
    struct some_struct *_s = __builtin_assume_aligned(s, 8);

    *_d = *_s;
}

It would seem to me that gcc should issue a warning when such an attribute is assigned but has no effect, as it does in some other cases. This seems to apply to all cases where the attribute is used to define a type from which you derive a pointer, e.g.:

int __aligned__((aligned(1))) *i;

Here, I am actually expecting to get a pointer through which I can safely access ints that are not aligned to the machine word, but the access will indeed blow up on machines that do not allow unaligned word access. The quandary here is that simply treating the data as an array of bytes is less efficient on x86, where an unaligned 32-bit mov would be faster than a rep movsb, so such a request is often highly reasonable.
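For the unaligned int case described above, one portable way to express the access is to go through memcpy() with a constant size, which compilers commonly turn into a plain load on targets that allow unaligned access. This is only a sketch of that workaround, not a statement about how the attribute should behave; load_u32_unaligned is a hypothetical helper name that does not appear in this report.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not from the report): read a 32-bit value from an
 * address that may not be 4-byte aligned. The fixed-size memcpy() makes no
 * alignment assumption about p, so it is safe on strict-alignment targets. */
static inline uint32_t load_u32_unaligned(const void *p)
{
    uint32_t v;

    memcpy(&v, p, sizeof v);
    return v;
}
```

Whether this compiles down to a single mov depends on the target and optimization level, so it is worth checking the generated assembly.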
Confirmed still occurs in 6.2.1 on the following C++ code:

#include <numeric>
#include <vector>

float f(std::vector<float>& A, std::vector<float>& B)
{
    __builtin_assume_aligned(A.data(), 64);
    __builtin_assume_aligned(B.data(), 64);
    return std::inner_product(A.begin(), A.end(), B.begin(), 0.f);
}

Compiled using -O3 -ffast-math -mavx2.
(In reply to Vedran Miletic from comment #1)
> #include <numeric>
> #include <vector>
> float f(std::vector<float>& A, std::vector<float>& B)
> {
>     __builtin_assume_aligned(A.data(), 64);
>     __builtin_assume_aligned(B.data(), 64);
>     return std::inner_product(A.begin(), A.end(), B.begin(), 0.f);
> }

You are doing it wrong. __builtin_assume_aligned() returns void* and you must use its return value for it to be effective. So your code should be something like this:

float f(std::vector<float>& A, std::vector<float>& B)
{
    // In C++ the void* result must be cast back explicitly.
    float *a_data = static_cast<float *>(__builtin_assume_aligned(A.data(), 64));
    float *b_data = static_cast<float *>(__builtin_assume_aligned(B.data(), 64));
    return std::inner_product(a_data, a_data + A.size(), b_data, 0.f);
}

Of course, this assumes that the buffer that your vector<> implementation supplies is 64-byte aligned.
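The same point in plain C, as a minimal sketch: the alignment hint is attached only to the pointer that __builtin_assume_aligned() returns, so the loop has to read through that returned pointer. dot_aligned64 is a hypothetical name, and the caller is assumed to really pass 64-byte-aligned buffers.

```c
#include <stddef.h>

/* Hypothetical helper: dot product of two float arrays that the caller
 * promises are 64-byte aligned. Passing misaligned pointers is undefined. */
float dot_aligned64(const float *a, const float *b, size_t n)
{
    /* Keep and use the returned pointers; calling the builtin and
     * discarding its result gives the compiler nothing to work with. */
    const float *pa = __builtin_assume_aligned(a, 64);
    const float *pb = __builtin_assume_aligned(b, 64);
    float sum = 0.0f;

    for (size_t i = 0; i < n; i++)
        sum += pa[i] * pb[i];
    return sum;
}
```

A caller could obtain suitable buffers with, for example, _Alignas(64) arrays or aligned_alloc().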
I had to modify the original testcase a bit to get it to compile:

$ cat 61939.c
struct some_struct { int foo; };
void copy_something(void *p, const void *s) {
    struct some_struct __attribute__((aligned(8))) *_d = p;
    struct some_struct __attribute__((aligned(8))) *_s = s;
    *_d = *_s;
}
$ /usr/local/bin/gcc -c -Wall -Wextra -pedantic -Wcast-align -Wattributes 61939.c
61939.c: In function ‘copy_something’:
61939.c:4:58: warning: initialization discards ‘const’ qualifier from pointer target type [-Wdiscarded-qualifiers]
     struct some_struct __attribute__((aligned(8))) *_s = s;
                                                          ^
$

But beyond that, yeah, confirmed. I think there's probably a duplicate around here somewhere but I've forgotten the number already...
(In reply to Daniel Santos from comment #2)
> (In reply to Vedran Miletic from comment #1)
> > #include <numeric>
> > #include <vector>
> > float f(std::vector<float>& A, std::vector<float>& B)
> > {
> >     __builtin_assume_aligned(A.data(), 64);
> >     __builtin_assume_aligned(B.data(), 64);
> >     return std::inner_product(A.begin(), A.end(), B.begin(), 0.f);
> > }
>
> You are doing it wrong. __builtin_assume_aligned() returns void* and you
> must use its return value for it to be effective.

Sounds like __builtin_assume_aligned() should be marked up with __attribute__((warn_unused_result)).
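Until the builtin itself carries such a marking, user code can approximate the suggestion with a small wrapper. The following is only a sketch: assume_aligned64 is a hypothetical name, not part of GCC, and whether the alignment hint survives through the wrapper depends on it being inlined.

```c
/* Hypothetical wrapper (not a GCC API): with warn_unused_result, a call
 * whose return value is ignored is diagnosed under -Wunused-result. */
__attribute__((warn_unused_result))
static inline void *assume_aligned64(const void *p)
{
    return __builtin_assume_aligned(p, 64);
}
```

With this in place, a bare statement such as assume_aligned64(ptr); should draw a warning, while float *q = assume_aligned64(ptr); should compile quietly.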