gcc expects _subborrow_u64 arguments in a different order than the one described in intel intrinsics reference and implemented by icc. I couldn't find any documentation about it, so I am writing this issue under the presumption it is not intentional. The two full-width input arguments are swapped. If what icc does is a-b-carry, gcc does b-a-carry. Needless to say, this most likely breaks any code using _subborrow_u64 on one of the two compilers. Curiously, it seems that the intel intrinsics guide *used to* (in April 2016) describe the same behavior that gcc implements: https://web.archive.org/web/20160422045348/https://software.intel.com/sites/landingpage/IntrinsicsGuide/#text=subb&expand=5283 Now, however, the reference describes icc behavior: https://software.intel.com/sites/landingpage/IntrinsicsGuide/#text=subborrow_u64&expand=5283,5304,5304 To determine what gcc does, I used the following test program: #include <stdio.h> #include <stdint.h> #include <x86intrin.h> int main(int argc, char** argv) { if (argc != 3) { return 111; } uint64_t a = 0; uint64_t b = 0; sscanf(argv[1], "%llu", &a); sscanf(argv[2], "%llu", &b); unsigned long long s = 0; uint8_t c = 0; c = _subborrow_u64(1, a, b, &s); printf("_subborrow_u64(1, %llu, %llu) = %llx + (%llx<<64)\n", a, b, s, c); } Output if ran with gcc: $ ./sbb 0 8 _subborrow_u64(1, 2, 8) = 5 + (0<<64) Compiled with gcc 7.1: https://godbolt.org/g/Usq5Jb Compiled with icc 17: https://godbolt.org/g/uMdFFm The difference can be seen by looking at the address that is in rdx when sscanf is called and then tracing which argument of sbb that number ends up in. According to https://stackoverflow.com/questions/29029572/multi-word-addition-using-the-carry-flag/29212615#comment61187795_29212615, MSVC seems to agree with icc. If that behavior is the consensus of other implementations (and the intel reference change was fixing an error), I think it would make sense for gcc to change to match. The arguments seem to get swapped at https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=gcc/config/i386/adxintrin.h;h=9c4152b9f360c0f9be408c84da4950ded8ad5654;hb=HEAD#l58 _subborrow_u64 (unsigned char __CF, unsigned long long __X, unsigned long long __Y, unsigned long long *__P) { return __builtin_ia32_sbb_u64 (__CF, __Y, __X, __P); } _subborrow_u32 seems to be affected as well
Seems LLVM also agrees with ICC implementation (rather than the past documentation).
It looks like (now fixed) bug in the documentation. So, following patch should sync gcc with the fixed docs: --cut here-- Index: adxintrin.h =================================================================== --- adxintrin.h (revision 249844) +++ adxintrin.h (working copy) @@ -33,7 +33,7 @@ _subborrow_u32 (unsigned char __CF, unsigned int __X, unsigned int __Y, unsigned int *__P) { - return __builtin_ia32_sbb_u32 (__CF, __Y, __X, __P); + return __builtin_ia32_sbb_u32 (__CF, __X, __Y, __P); } extern __inline unsigned char @@ -58,7 +58,7 @@ _subborrow_u64 (unsigned char __CF, unsigned long long __X, unsigned long long __Y, unsigned long long *__P) { - return __builtin_ia32_sbb_u64 (__CF, __Y, __X, __P); + return __builtin_ia32_sbb_u64 (__CF, __X, __Y, __P); } extern __inline unsigned char --cut here--
Author: uros Date: Tue Jul 4 20:46:38 2017 New Revision: 249976 URL: https://gcc.gnu.org/viewcvs?rev=249976&root=gcc&view=rev Log: PR target/81294 * config/i386/adxintrin.h (_subborrow_u32): Swap _X and _Y arguments in the call to __builtin_ia32_sbb_u32. (_subborrow_u64): Swap _X and _Y arguments in the call to __builtin_ia32_sbb_u64. testsuite/ChangeLog: PR target/81294 * gcc.target/i386/adx-addcarryx32-2.c (adx_test): Swap x and y arguments in the call to _subborrow_u32. * gcc.target/i386/adx-addcarryx64-2.c (adx_test): Swap x and y arguments in the call to _subborrow_u64. * gcc.target/i386/pr81294-1.c: New test. * gcc.target/i386/pr81294-2.c: Ditto. Added: trunk/gcc/testsuite/gcc.target/i386/pr81294-1.c trunk/gcc/testsuite/gcc.target/i386/pr81294-2.c Modified: trunk/gcc/ChangeLog trunk/gcc/config/i386/adxintrin.h trunk/gcc/testsuite/ChangeLog trunk/gcc/testsuite/gcc.target/i386/adx-addcarryx32-2.c trunk/gcc/testsuite/gcc.target/i386/adx-addcarryx64-2.c
Author: uros Date: Tue Jul 4 21:05:17 2017 New Revision: 249978 URL: https://gcc.gnu.org/viewcvs?rev=249978&root=gcc&view=rev Log: PR target/81300 * config/i386/i386.md (setcc + movzbl/and to xor + setcc peepholes): Require dead FLAGS_REG at the beginning of a peephole. PR target/81294 * config/i386/adxintrin.h (_subborrow_u32): Swap _X and _Y arguments in the call to __builtin_ia32_sbb_u32. (_subborrow_u64): Swap _X and _Y arguments in the call to __builtin_ia32_sbb_u64. testsuite/ChangeLog: PR target/81300 * gcc.target/i386/pr81300.c: New test. PR target/81294 * gcc.target/i386/adx-addcarryx32-2.c (adx_test): Swap x and y arguments in the call to _subborrow_u32. * gcc.target/i386/adx-addcarryx64-2.c (adx_test): Swap x and y arguments in the call to _subborrow_u64. * gcc.target/i386/pr81294-1.c: New test. * gcc.target/i386/pr81294-2.c: Ditto. Added: branches/gcc-7-branch/gcc/testsuite/gcc.target/i386/pr81294-1.c branches/gcc-7-branch/gcc/testsuite/gcc.target/i386/pr81294-2.c branches/gcc-7-branch/gcc/testsuite/gcc.target/i386/pr81300.c Modified: branches/gcc-7-branch/gcc/ChangeLog branches/gcc-7-branch/gcc/config/i386/adxintrin.h branches/gcc-7-branch/gcc/config/i386/i386.md branches/gcc-7-branch/gcc/testsuite/ChangeLog branches/gcc-7-branch/gcc/testsuite/gcc.target/i386/adx-addcarryx32-2.c branches/gcc-7-branch/gcc/testsuite/gcc.target/i386/adx-addcarryx64-2.c
Fixed for gcc-7.2+.