Built with '-march=x86-64-v3 -O1', the following code generates a bsr instruction, which has undefined behavior when the source operand is zero, thus gives wrong result (code also in https://gcc.godbolt.org/z/zzT7x57MT): static inline int test_clz32(uint32_t value) { if (value != 0) { return __builtin_clz(value); } else { return 32; } } /* returns -1 if x == 0 */ int firstone(uint32_t x) { return 31 - test_clz32(x); } The result assembly: firstone: bsr eax, edi ret Note that the lzcnt instruction has a defined behavior to return the operand size when operand is zero.
Created attachment 51052 [details] Proposed patch Patch enhances BSR insn pattern with ZF setting to prevent unwanted combinations with LZCNT insn pattern.
(In reply to Iru Cai from comment #0) > Built with '-march=x86-64-v3 -O1', the following code generates a bsr > instruction, which has undefined behavior when the source operand is zero, > thus gives wrong result The documentation for __builtin_clz(x) states "If x is 0, the result is undefined".
(In reply to Mikael Pettersson from comment #2) > (In reply to Iru Cai from comment #0) > > Built with '-march=x86-64-v3 -O1', the following code generates a bsr > > instruction, which has undefined behavior when the source operand is zero, > > thus gives wrong result > > The documentation for __builtin_clz(x) states "If x is 0, the result is > undefined". The testcase from Comment #0 does: if (value != 0) { return __builtin_clz(value);
(In reply to Uroš Bizjak from comment #3) > (In reply to Mikael Pettersson from comment #2) > > (In reply to Iru Cai from comment #0) > > > Built with '-march=x86-64-v3 -O1', the following code generates a bsr > > > instruction, which has undefined behavior when the source operand is zero, > > > thus gives wrong result > > > > The documentation for __builtin_clz(x) states "If x is 0, the result is > > undefined". > > The testcase from Comment #0 does: > > if (value != 0) { > return __builtin_clz(value); Yes I just noticed. My mistake.
The master branch has been updated by Uros Bizjak <uros@gcc.gnu.org>: https://gcc.gnu.org/g:1e16f2b472c7d253d564556a048dc4ae16119c00 commit r12-1743-g1e16f2b472c7d253d564556a048dc4ae16119c00 Author: Uros Bizjak <ubizjak@gmail.com> Date: Wed Jun 23 12:50:53 2021 +0200 i386: Prevent unwanted combine from LZCNT to BSR [PR101175] The current RTX pattern for BSR allows combine pass to convert LZCNT insn to BSR. Note that the LZCNT has a defined behavior to return the operand size when operand is zero, where BSR has not. Add a BSR specific setting of zero-flag to RTX pattern of BSR insn in order to avoid matching unwanted combinations. 2021-06-23 Uroš Bizjak <ubizjak@gmail.com> gcc/ PR target/101175 * config/i386/i386.md (bsr_rex64): Add zero-flag setting RTX. (bsr): Ditto. (*bsrhi): Remove. (clz<mode>2): Update RTX pattern for additions. gcc/testsuite/ PR target/101175 * gcc.target/i386/pr101175.c: New test.
The releases/gcc-11 branch has been updated by Uros Bizjak <uros@gcc.gnu.org>: https://gcc.gnu.org/g:e99256fc5eab1cf8958223d79b23e359b6d5ca60 commit r11-8644-ge99256fc5eab1cf8958223d79b23e359b6d5ca60 Author: Uros Bizjak <ubizjak@gmail.com> Date: Wed Jun 23 12:50:53 2021 +0200 i386: Prevent unwanted combine from LZCNT to BSR [PR101175] The current RTX pattern for BSR allows combine pass to convert LZCNT insn to BSR. Note that the LZCNT has a defined behavior to return the operand size when operand is zero, where BSR has not. Add a BSR specific setting of zero-flag to RTX pattern of BSR insn in order to avoid matching unwanted combinations. 2021-06-23 Uroš Bizjak <ubizjak@gmail.com> gcc/ PR target/101175 * config/i386/i386.md (bsr_rex64): Add zero-flag setting RTX. (bsr): Ditto. (*bsrhi): Remove. (clz<mode>2): Update RTX pattern for additions. gcc/testsuite/ PR target/101175 * gcc.target/i386/pr101175.c: New test. (cherry picked from commit 1e16f2b472c7d253d564556a048dc4ae16119c00)
The releases/gcc-10 branch has been updated by Uros Bizjak <uros@gcc.gnu.org>: https://gcc.gnu.org/g:ab383ecb4a45413fcc0012bc1791c094fe7fed29 commit r10-9955-gab383ecb4a45413fcc0012bc1791c094fe7fed29 Author: Uros Bizjak <ubizjak@gmail.com> Date: Wed Jun 23 12:50:53 2021 +0200 i386: Prevent unwanted combine from LZCNT to BSR [PR101175] The current RTX pattern for BSR allows combine pass to convert LZCNT insn to BSR. Note that the LZCNT has a defined behavior to return the operand size when operand is zero, where BSR has not. Add a BSR specific setting of zero-flag to RTX pattern of BSR insn in order to avoid matching unwanted combinations. 2021-06-23 Uroš Bizjak <ubizjak@gmail.com> gcc/ PR target/101175 * config/i386/i386.md (bsr_rex64): Add zero-flag setting RTX. (bsr): Ditto. (*bsrhi): Remove. (clz<mode>2): Update RTX pattern for additions. gcc/testsuite/ PR target/101175 * gcc.target/i386/pr101175.c: New test. (cherry picked from commit 1e16f2b472c7d253d564556a048dc4ae16119c00)
The releases/gcc-9 branch has been updated by Uros Bizjak <uros@gcc.gnu.org>: https://gcc.gnu.org/g:9b997caa72498bc3a14a064648b721fe0f11945e commit r9-9603-g9b997caa72498bc3a14a064648b721fe0f11945e Author: Uros Bizjak <ubizjak@gmail.com> Date: Wed Jun 23 12:50:53 2021 +0200 i386: Prevent unwanted combine from LZCNT to BSR [PR101175] The current RTX pattern for BSR allows combine pass to convert LZCNT insn to BSR. Note that the LZCNT has a defined behavior to return the operand size when operand is zero, where BSR has not. Add a BSR specific setting of zero-flag to RTX pattern of BSR insn in order to avoid matching unwanted combinations. 2021-06-23 Uroš Bizjak <ubizjak@gmail.com> gcc/ PR target/101175 * config/i386/i386.md (bsr_rex64): Add zero-flag setting RTX. (bsr): Ditto. (*bsrhi): Remove. (clz<mode>2): Update RTX pattern for additions. gcc/testsuite/ PR target/101175 * gcc.target/i386/pr101175.c: New test. (cherry picked from commit 1e16f2b472c7d253d564556a048dc4ae16119c00)
Fixed everywhere.