I'm working on GCC135 from the compile farm. I'm experiencing a UBSan runtime error for signed overflow when using unsigned vectors. The algorithms in the real program depend on unsigned wrap, not signed overflow.

gcc135:~$ cat test.cxx
#include <stdio.h>
#include <altivec.h>
#undef vector
#undef pixel
#undef bool

typedef unsigned char byte;
typedef __vector unsigned char uint8x16_p;
typedef __vector unsigned int uint32x4_p;

int main(int argc, char* argv[])
{
    uint32x4_p a = { 1795745381, 0, 0, 0 };
    uint32x4_p b = { 1359893119, 0, 0, 0 };
    uint32x4_p c = vec_add(a, b);

    byte x[16];
    vec_vsx_st((uint8x16_p)c, 0, x);

    for (size_t i=0; i<16; ++i)
        printf("%02x ", x[i]);
    printf("\n");

    return 0;
}

The values 1795745381 and 1359893119 were taken from the first finding at https://github.com/weidai11/cryptopp/issues/749.

And:

gcc135:~$ /opt/at12.0/bin/c++ -O1 -fsanitize=undefined -std=c++14 test.cxx -o test.exe
gcc135:~$ ./test.exe
test.cxx:15:25: runtime error: signed integer overflow: 1795745381 + 1359893119 cannot be represented in type 'int'
e4 38 17 bc 00 00 00 00 00 00 00 00 00 00 00 00

Finally:

gcc135:~$ /opt/at12.0/bin/c++ --version
c++ (GCC) 8.2.1 20180813 (Advance-Toolchain-at12.0) [revision 263510]
Copyright (C) 2018 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
About:

typedef __vector unsigned char uint8x16_p;
[...]
vec_vsx_st((uint8x16_p)c, 0, x);

Isn't this an integer promotion issue, unsigned char being promoted to int, thus becoming signed?
Possibly a target issue in the way vec_vsx_st/vec_add are implemented.
I guess it depends on how vec_add is supposed to be defined. It is certainly expanded as signed int vector addition; already the gimple dump (and the original dump too) has something like:

_1 = VIEW_CONVERT_EXPR<__vector signed int>(a);
_2 = VIEW_CONVERT_EXPR<__vector signed int>(b);
_3 = _1 + _2;
c = VIEW_CONVERT_EXPR<__vector unsigned int>(_3);
_4 = VIEW_CONVERT_EXPR<__vector signed char>(c);

So either the testcase is wrong (if vec_add is defined to operate on signed vectors), or the powerpc __builtin_vec_add lowering is incorrect.
This is documented in the ELFv2 ABI, linked from https://gcc.gnu.org/readings.html . vec_add is overloaded and defined as X vec_add(X, X) for pretty much every vector type X, so I would expect everything to be vector unsigned int here, but something is going wrong.

Confirmed.

(The vec_vsx_st doesn't have anything to do with the problem, btw. Not that it is good or anything ;-) )
The problem is that the builtin has __vector signed int arguments:

<function_decl 0x7fffefda2700 __builtin_altivec_vadduwm
    type <function_type 0x7fffefda07e0
        type <vector_type 0x7fffefd547e0 __vector signed int
            type <integer_type 0x7fffefc8f5e8 int>
            V4SI size <integer_cst 0x7fffefc91000 constant 128>
            unit-size <integer_cst 0x7fffefc91018 constant 16>
            align:128 warn_if_not_align:0 symtab:0 alias-set -1
            canonical-type 0x7fffefd54738 nunits:4>
        SI size <integer_cst 0x7fffefc911f8 constant 32>
        unit-size <integer_cst 0x7fffefc91210 constant 4>
        align:32 warn_if_not_align:0 symtab:0 alias-set -1
        canonical-type 0x7fffefd742a0
        arg-types <tree_list 0x7fffefd9d618 value <vector_type 0x7fffefd547e0 __vector signed int>
            chain <tree_list 0x7fffefd9d640 value <vector_type 0x7fffefd547e0 __vector signed int>
                chain <tree_list 0x7fffefc92050 value <void_type 0x7fffefc8ff18 void>>>>
        pointer_to_this <pointer_type 0x7fffefe53348>>
    readonly nothrow public external built-in decl_6 SI <built-in>:0:0
    align:32 warn_if_not_align:0 built-in: BUILT_IN_MD:48
    context <translation_unit_decl 0x7fffefe1aac8 pr88234.c>
    chain <function_decl 0x7fffefda2800 __builtin_altivec_vaddfp>>

Guess most of the builtins have those, so the easiest fix to me seems to be to cast it in the gimple_fold targhook.
#define vec_add(a,b) ((a)+(b))

IIRC that only fails for __vector bool, but I think the ppc people didn't like this approach.
--- gcc/config/rs6000/rs6000.c.jj	2018-11-27 16:40:23.821758250 +0100
+++ gcc/config/rs6000/rs6000.c	2018-11-28 15:15:08.254736153 +0100
@@ -15328,6 +15328,7 @@ rs6000_gimple_fold_builtin (gimple_stmt_
   enum rs6000_builtins fn_code
     = (enum rs6000_builtins) DECL_FUNCTION_CODE (fndecl);
   tree arg0, arg1, lhs, temp;
+  enum tree_code bcode;
   gimple *g;
   size_t uns_fncode = (size_t) fn_code;
@@ -15366,10 +15367,32 @@ rs6000_gimple_fold_builtin (gimple_stmt_
     case P8V_BUILTIN_VADDUDM:
     case ALTIVEC_BUILTIN_VADDFP:
     case VSX_BUILTIN_XVADDDP:
+      bcode = PLUS_EXPR;
+    do_binary:
       arg0 = gimple_call_arg (stmt, 0);
       arg1 = gimple_call_arg (stmt, 1);
       lhs = gimple_call_lhs (stmt);
-      g = gimple_build_assign (lhs, PLUS_EXPR, arg0, arg1);
+      if (INTEGRAL_TYPE_P (TREE_TYPE (TREE_TYPE (lhs)))
+	  && !TYPE_OVERFLOW_WRAPS (TREE_TYPE (TREE_TYPE (lhs))))
+	{
+	  /* Ensure the binary operation is performed in a type
+	     that wraps if it is integral type.  */
+	  gimple_seq stmts = NULL;
+	  tree type = unsigned_type_for (TREE_TYPE (lhs));
+	  tree uarg0 = gimple_build (&stmts, VIEW_CONVERT_EXPR,
+				     type, arg0);
+	  tree uarg1 = gimple_build (&stmts, VIEW_CONVERT_EXPR,
+				     type, arg1);
+	  tree res = gimple_build (&stmts, gimple_location (stmt), bcode,
+				   type, uarg0, uarg1);
+	  gsi_insert_seq_before (gsi, stmts, GSI_SAME_STMT);
+	  g = gimple_build_assign (lhs, VIEW_CONVERT_EXPR,
+				   build1 (VIEW_CONVERT_EXPR,
+					   TREE_TYPE (lhs), res));
+	  gsi_replace (gsi, g, true);
+	  return true;
+	}
+      g = gimple_build_assign (lhs, bcode, arg0, arg1);
       gimple_set_location (g, gimple_location (stmt));
       gsi_replace (gsi, g, true);
       return true;
@@ -15381,13 +15404,8 @@ rs6000_gimple_fold_builtin (gimple_stmt_
     case P8V_BUILTIN_VSUBUDM:
     case ALTIVEC_BUILTIN_VSUBFP:
     case VSX_BUILTIN_XVSUBDP:
-      arg0 = gimple_call_arg (stmt, 0);
-      arg1 = gimple_call_arg (stmt, 1);
-      lhs = gimple_call_lhs (stmt);
-      g = gimple_build_assign (lhs, MINUS_EXPR, arg0, arg1);
-      gimple_set_location (g, gimple_location (stmt));
-      gsi_replace (gsi, g, true);
-      return true;
+      bcode = MINUS_EXPR;
+      goto do_binary;
     case VSX_BUILTIN_XVMULSP:
     case VSX_BUILTIN_XVMULDP:
       arg0 = gimple_call_arg (stmt, 0);

handles this, but I'm not sure if it is correct for all the builtins that can show up, and if it isn't needed for some other builtins too (e.g. some multiplications).
(In reply to Segher Boessenkool from comment #4)
> This is documented in the ELFv2 ABI, linked from
> https://gcc.gnu.org/readings.html .
> ...
> (The vec_vsx_st doesn't have to do with the problem btw. Not that it is
> good or anything ;-) )

Yeah, I tried to convert to the C/C++ pointer dereference pattern several times, as shown in the literature. I.e., this:

byte x[16];
uint8x16_p cc = (uint8x16_p)c;
*(uint8x16_p*) &x[0] = cc;

Instead of this:

byte x[16];
vec_vsx_st((uint8x16_p)c, 0, x);

It breaks a lot of our self tests for certain versions of a particular compiler. I narrowed it down to the loads, but I could not take it further and develop a reduced case. I don't have the skill to untangle what is happening in ppc asm at -O3.

If you want to see something really obscene, then take a look at https://github.com/weidai11/cryptopp/blob/master/ppc_simd.h#L251 .

(If you are talking about something else that is wrong or bad, then please point it out.)
(In reply to Jeffrey Walton from comment #8)
> Yeah, I tried to convert to the C/C++ pointer dereference pattern several
> times as shown in the literature. I.e., this:
>
> byte x[16];
> uint8x16_p cc = (uint8x16_p)c;
> *(uint8x16_p*) &x[0] = cc;

That is the preferred way, in general.

> Instead of this:
>
> byte x[16];
> vec_vsx_st((uint8x16_p)c, 0, x);
>
> It breaks a lot of our self tests for certain versions of a particular
> compiler.

Yeah, no doubt :-(  But vec_vsx_st isn't even ABI. If it doesn't work well on some maintained GCC version, please tell me, or open a PR even.

> I narrowed it down to the loads but I could not take it further
> and develop a reduced case. I don't have the skill to untangle what is
> happening in ppc asm at -O3.

It seems the vec_add itself is handled incorrectly. We'll figure it out :-)

> If you want to see something really obscene, then take a look at
> https://github.com/weidai11/cryptopp/blob/master/ppc_simd.h#L251 .

Heh. Well, you have it nicely abstracted away to this one function :-)

> (If you are talking about something else that is wrong or bad, then please
> point it out.)

Loads and stores should preferably be written as just assignments. That way, the compiler can figure out what is best to use for the selected ABI and CPU.
Author: jakub
Date: Thu Nov 29 14:23:21 2018
New Revision: 266619

URL: https://gcc.gnu.org/viewcvs?rev=266619&root=gcc&view=rev
Log:
	PR target/88234
	* config/rs6000/rs6000.c (rs6000_gimple_fold_builtin): For
	vec_add and vec_sub builtins, perform PLUS_EXPR or MINUS_EXPR
	in unsigned_type_for instead of vector integral type where overflow
	doesn't wrap.

	* gcc.dg/ubsan/pr88234.c: New test.

Added:
    trunk/gcc/testsuite/gcc.dg/ubsan/pr88234.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/rs6000/rs6000.c
    trunk/gcc/testsuite/ChangeLog
Fixed on the trunk so far, queued for backports.
(In reply to Jakub Jelinek from comment #10)
> Author: jakub
> Date: Thu Nov 29 14:23:21 2018
> New Revision: 266619
>
> URL: https://gcc.gnu.org/viewcvs?rev=266619&root=gcc&view=rev
> Log:
> PR target/88234
> * config/rs6000/rs6000.c (rs6000_gimple_fold_builtin): For
> vec_add and vec_sub builtins, perform PLUS_EXPR or MINUS_EXPR
> in unsigned_type_for instead of vector integral type where overflow
> doesn't wrap.
>
> * gcc.dg/ubsan/pr88234.c: New test.
> ...

In case it is useful, here are some additional datapoints from the external bug report. They could help provide a little more coverage for the test case.

signed integer overflow: 127 + 1 cannot be represented in type 'signed char'
signed integer overflow: 7640891576939301132 + 5840696475078001361 cannot be represented in type 'long int'
signed integer overflow: -4942790177534073029 + -7262195936335986787 cannot be represented in type 'long int'
signed integer overflow: 6625583534739731862 + 7088136013498015297 cannot be represented in type 'long int'
signed integer overflow: -5587509538033248007 + -3814479935187347353 cannot be represented in type 'long int'
The issue is cleared in GCC 9 from Dec 08 2018. Our self tests pass where they formerly failed. (The failures are the reason I wanted a diagnostic run with UBSan.)

On GCC110 and GCC112, with gcc-9 located at /home/mik/misc/gcc-9, I was able to confirm there are no UBSan findings.

I was not able to confirm the fix on GCC135, which is the original problematic machine. I did not find an updated gcc-8 or gcc-9 on GCC135.

Thanks for the help.
Closing then.
Oh sorry, backports are still pending. Reopening.
Author: jakub
Date: Tue Jan 8 10:03:34 2019
New Revision: 267692

URL: https://gcc.gnu.org/viewcvs?rev=267692&root=gcc&view=rev
Log:
	Backported from mainline
	2018-11-29  Jakub Jelinek  <jakub@redhat.com>

	PR target/88234
	* config/rs6000/rs6000.c (rs6000_gimple_fold_builtin): For
	vec_add and vec_sub builtins, perform PLUS_EXPR or MINUS_EXPR
	in unsigned_type_for instead of vector integral type where overflow
	doesn't wrap.

	* gcc.dg/ubsan/pr88234.c: New test.

Added:
    branches/gcc-8-branch/gcc/testsuite/gcc.dg/ubsan/pr88234.c
Modified:
    branches/gcc-8-branch/gcc/ChangeLog
    branches/gcc-8-branch/gcc/config/rs6000/rs6000.c
    branches/gcc-8-branch/gcc/testsuite/ChangeLog
Fixed for 8.3+ too.