Bug 81294

Summary: _subborrow_u64 argument order inconsistent with intrinsic reference, icc
Product: gcc Reporter: andreser-gccbugs
Component: targetAssignee: Not yet assigned to anyone <unassigned>
Status: RESOLVED FIXED    
Severity: normal CC: hjl.tools, jakub, kyukhin, uros
Priority: P3 Keywords: wrong-code
Version: 7.1.1   
Target Milestone: 7.2   
Host: Target: x86
Build: Known to work:
Known to fail: Last reconfirmed:

Description andreser-gccbugs 2017-07-03 15:49:49 UTC
gcc expects _subborrow_u64 arguments in a different order than the one described in intel intrinsics reference and implemented by icc. I couldn't find any documentation about it, so I am writing this issue under the presumption it is not intentional.

The two full-width input arguments are swapped. If what icc does is a-b-carry, gcc does b-a-carry. Needless to say, this most likely breaks any code using _subborrow_u64 on one of the two compilers.

Curiously, it seems that the intel intrinsics guide *used to* (in April 2016) describe the same behavior that gcc implements: https://web.archive.org/web/20160422045348/https://software.intel.com/sites/landingpage/IntrinsicsGuide/#text=subb&expand=5283
Now, however, the reference describes icc behavior: https://software.intel.com/sites/landingpage/IntrinsicsGuide/#text=subborrow_u64&expand=5283,5304,5304

To determine what gcc does, I used the following test program:

#include <stdio.h>
#include <stdint.h>
#include <x86intrin.h>

int main(int argc, char** argv) {
	if (argc != 3) {
		return 111;
	}
	uint64_t a = 0;
	uint64_t b = 0;
	sscanf(argv[1], "%llu", &a);
	sscanf(argv[2], "%llu", &b);
	unsigned long long s = 0;
	uint8_t c = 0;
	c = _subborrow_u64(1, a, b, &s);
	printf("_subborrow_u64(1, %llu, %llu) = %llx + (%llx<<64)\n", a, b, s, c);
}


Output if ran with gcc:
$ ./sbb 0 8
_subborrow_u64(1, 2, 8) = 5 + (0<<64)


Compiled with gcc 7.1: https://godbolt.org/g/Usq5Jb
Compiled with icc  17: https://godbolt.org/g/uMdFFm
The difference can be seen by looking at the address that is in rdx when sscanf is called and then tracing which argument of sbb that number ends up in.

According to https://stackoverflow.com/questions/29029572/multi-word-addition-using-the-carry-flag/29212615#comment61187795_29212615, MSVC seems to agree with icc. If that behavior is the consensus of other implementations (and the intel reference change was fixing an error), I think it would make sense for gcc to change to match.

The arguments seem to get swapped at
https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=gcc/config/i386/adxintrin.h;h=9c4152b9f360c0f9be408c84da4950ded8ad5654;hb=HEAD#l58

_subborrow_u64 (unsigned char __CF, unsigned long long __X,
                unsigned long long __Y, unsigned long long *__P)
{ return __builtin_ia32_sbb_u64 (__CF, __Y, __X, __P); }

_subborrow_u32 seems to be affected as well
Comment 1 Jakub Jelinek 2017-07-03 16:19:18 UTC
Seems LLVM also agrees with ICC implementation (rather than the past documentation).
Comment 2 UroŇ° Bizjak 2017-07-03 18:51:42 UTC
It looks like (now fixed) bug in the documentation. So, following patch should sync gcc with the fixed docs:

--cut here--
Index: adxintrin.h
===================================================================
--- adxintrin.h (revision 249844)
+++ adxintrin.h (working copy)
@@ -33,7 +33,7 @@
 _subborrow_u32 (unsigned char __CF, unsigned int __X,
                unsigned int __Y, unsigned int *__P)
 {
-  return __builtin_ia32_sbb_u32 (__CF, __Y, __X, __P);
+  return __builtin_ia32_sbb_u32 (__CF, __X, __Y, __P);
 }
 
 extern __inline unsigned char
@@ -58,7 +58,7 @@
 _subborrow_u64 (unsigned char __CF, unsigned long long __X,
                unsigned long long __Y, unsigned long long *__P)
 {
-  return __builtin_ia32_sbb_u64 (__CF, __Y, __X, __P);
+  return __builtin_ia32_sbb_u64 (__CF, __X, __Y, __P);
 }
 
 extern __inline unsigned char
--cut here--
Comment 3 uros 2017-07-04 20:47:10 UTC
Author: uros
Date: Tue Jul  4 20:46:38 2017
New Revision: 249976

URL: https://gcc.gnu.org/viewcvs?rev=249976&root=gcc&view=rev
Log:
	PR target/81294
	* config/i386/adxintrin.h (_subborrow_u32): Swap _X and _Y
	arguments in the call to __builtin_ia32_sbb_u32.
	(_subborrow_u64): Swap _X and _Y arguments in the call to
	__builtin_ia32_sbb_u64.

testsuite/ChangeLog:

	PR target/81294
	* gcc.target/i386/adx-addcarryx32-2.c (adx_test): Swap
	x and y arguments in the call to _subborrow_u32.
	* gcc.target/i386/adx-addcarryx64-2.c (adx_test): Swap
	x and y arguments in the call to _subborrow_u64.
	* gcc.target/i386/pr81294-1.c: New test.
	* gcc.target/i386/pr81294-2.c: Ditto.


Added:
    trunk/gcc/testsuite/gcc.target/i386/pr81294-1.c
    trunk/gcc/testsuite/gcc.target/i386/pr81294-2.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/i386/adxintrin.h
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/testsuite/gcc.target/i386/adx-addcarryx32-2.c
    trunk/gcc/testsuite/gcc.target/i386/adx-addcarryx64-2.c
Comment 4 uros 2017-07-04 21:05:49 UTC
Author: uros
Date: Tue Jul  4 21:05:17 2017
New Revision: 249978

URL: https://gcc.gnu.org/viewcvs?rev=249978&root=gcc&view=rev
Log:
	PR target/81300
	* config/i386/i386.md (setcc + movzbl/and to xor + setcc peepholes):
	Require dead FLAGS_REG at the beginning of a peephole.

	PR target/81294
	* config/i386/adxintrin.h (_subborrow_u32): Swap _X and _Y
	arguments in the call to __builtin_ia32_sbb_u32.
	(_subborrow_u64): Swap _X and _Y arguments in the call to
	__builtin_ia32_sbb_u64.

testsuite/ChangeLog:

	PR target/81300
	* gcc.target/i386/pr81300.c: New test.

	PR target/81294
	* gcc.target/i386/adx-addcarryx32-2.c (adx_test): Swap
	x and y arguments in the call to _subborrow_u32.
	* gcc.target/i386/adx-addcarryx64-2.c (adx_test): Swap
	x and y arguments in the call to _subborrow_u64.
	* gcc.target/i386/pr81294-1.c: New test.
	* gcc.target/i386/pr81294-2.c: Ditto.


Added:
    branches/gcc-7-branch/gcc/testsuite/gcc.target/i386/pr81294-1.c
    branches/gcc-7-branch/gcc/testsuite/gcc.target/i386/pr81294-2.c
    branches/gcc-7-branch/gcc/testsuite/gcc.target/i386/pr81300.c
Modified:
    branches/gcc-7-branch/gcc/ChangeLog
    branches/gcc-7-branch/gcc/config/i386/adxintrin.h
    branches/gcc-7-branch/gcc/config/i386/i386.md
    branches/gcc-7-branch/gcc/testsuite/ChangeLog
    branches/gcc-7-branch/gcc/testsuite/gcc.target/i386/adx-addcarryx32-2.c
    branches/gcc-7-branch/gcc/testsuite/gcc.target/i386/adx-addcarryx64-2.c
Comment 5 UroŇ° Bizjak 2017-07-13 20:31:48 UTC
Fixed for gcc-7.2+.