gcc -march=core2 -O3 -ftree-vectorizer-verbose=6 for this code:

#define SIZE 10000
signed short a[SIZE];
signed short b[SIZE];
signed short c[SIZE];

void add()
{
  int i;
  for (i = 0; i < SIZE; ++i)
    a[i] = b[i] + c[i];
}

cannot vectorize the loop:

add_sshort.c:9: note: vect_model_load_cost: aligned.
add_sshort.c:9: note: vect_model_load_cost: inside_cost = 1, outside_cost = 0 .
add_sshort.c:9: note: not vectorized: relevant stmt not supported: D.1580_6 = (short unsigned int) D.1579_5
add_sshort.c:7: note: vectorized 0 loops in function.

The same happens if the type for a, b and c is "signed char". But if the type is "unsigned short" or "unsigned char" the loop is vectorized.
*** Bug 39069 has been marked as a duplicate of this bug. ***
(reminds me of a couple missed-optimization PRs where vectorization is also failing due to casts - PR31873 , PR26128 - don't know if this is related)
(In reply to comment #2)
> (reminds me of a couple missed-optimization PRs where vectorization is also
> failing due to casts - PR31873 , PR26128 - don't know if this is related)

Are the casts actually needed in this case? It seems they get introduced very early on; the .original dump already has:

a[i] = (short int) ((short unsigned int) b[i] + (short unsigned int) c[i]);
(In reply to comment #3)
> Are the casts actually needed in this case? It seems they get introduced very
> early on, the .original dump already has:

Yes, because char = char + char is really char = (char)((int)char + (int)char);

So this is a dup of bug 26128.

*** This bug has been marked as a duplicate of 26128 ***
(In reply to comment #4)
> Yes because char = char + char is really char = (char)((int)char + (int)char);

Let me expand on that. ((char)CHAR_MAX) + 1 is well defined: the operands are promoted to int, so no overflow occurs. Since GCC internally assumes signed integer overflow is undefined, when it narrows the addition back to the smaller type it has to convert it to the well-defined unsigned integer version.
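
To illustrate the point about the casts, here is a minimal sketch (not taken from GCC itself) comparing the promoted-to-int form of the addition with the narrow unsigned form seen in the .original dump. The function names add_promoted, add_unsigned and count_mismatches are made up for this example. It uses signed char rather than signed short so that every operand pair can be checked quickly. Note that converting an out-of-range value back to a signed type is implementation-defined in C; GCC documents it as reduction modulo 2^N, which is what this sketch assumes.

```c
#include <limits.h>

/* The form the C standard describes: promote both operands to int,
   add without overflow, then convert the result back to the narrow type. */
signed char add_promoted(signed char x, signed char y)
{
    return (signed char)((int)x + (int)y);
}

/* The form seen in the .original dump: add in the unsigned narrow type,
   where wraparound is well defined, then convert back to signed. */
signed char add_unsigned(signed char x, signed char y)
{
    unsigned char sum = (unsigned char)((unsigned char)x + (unsigned char)y);
    return (signed char)sum;
}

/* Count the operand pairs for which the two forms disagree. */
int count_mismatches(void)
{
    int x, y, mismatches = 0;
    for (x = SCHAR_MIN; x <= SCHAR_MAX; ++x)
        for (y = SCHAR_MIN; y <= SCHAR_MAX; ++y)
            if (add_promoted((signed char)x, (signed char)y)
                != add_unsigned((signed char)x, (signed char)y))
                ++mismatches;
    return mismatches;
}
```

Under the assumption above, count_mismatches() should return 0, i.e. the unsigned form computes the same values as the promoted form while never relying on signed overflow in the narrow type. A narrow signed addition would not have that guarantee, which is why the casts cannot simply be dropped.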