Bug 101175 - builtin_clz generates wrong bsr instruction
Summary: builtin_clz generates wrong bsr instruction
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: target (show other bugs)
Version: 11.1.0
: P3 normal
Target Milestone: 9.5
Assignee: Uroš Bizjak
URL:
Keywords: wrong-code
Depends on:
Blocks:
 
Reported: 2021-06-23 05:56 UTC by Iru Cai
Modified: 2021-06-24 06:18 UTC (History)
1 user (show)

See Also:
Host:
Target: x86_64-*-*
Build:
Known to work:
Known to fail:
Last reconfirmed: 2021-06-23 00:00:00


Attachments
Proposed patch (661 bytes, patch)
2021-06-23 07:33 UTC, Uroš Bizjak
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Iru Cai 2021-06-23 05:56:57 UTC
Built with '-march=x86-64-v3 -O1', the following code generates a bsr instruction, which has undefined behavior when the source operand is zero, thus gives wrong result (code also in https://gcc.godbolt.org/z/zzT7x57MT):

static inline int test_clz32(uint32_t value)
{
	if (value != 0) {
		return __builtin_clz(value);
	} else {
		return 32;
	}
}

/* returns -1 if x == 0 */
int firstone(uint32_t x)
{
	return 31 - test_clz32(x);
}

The result assembly:

firstone:
        bsr     eax, edi
        ret

Note that the lzcnt instruction has a defined behavior to return the operand size when operand is zero.
Comment 1 Uroš Bizjak 2021-06-23 07:33:52 UTC
Created attachment 51052 [details]
Proposed patch

Patch enhances BSR insn pattern with ZF setting to prevent unwanted combinations with LZCNT insn pattern.
Comment 2 Mikael Pettersson 2021-06-23 08:33:47 UTC
(In reply to Iru Cai from comment #0)
> Built with '-march=x86-64-v3 -O1', the following code generates a bsr
> instruction, which has undefined behavior when the source operand is zero,
> thus gives wrong result

The documentation for __builtin_clz(x) states "If x is 0, the result is undefined".
Comment 3 Uroš Bizjak 2021-06-23 08:35:51 UTC
(In reply to Mikael Pettersson from comment #2)
> (In reply to Iru Cai from comment #0)
> > Built with '-march=x86-64-v3 -O1', the following code generates a bsr
> > instruction, which has undefined behavior when the source operand is zero,
> > thus gives wrong result
> 
> The documentation for __builtin_clz(x) states "If x is 0, the result is
> undefined".

The testcase from Comment #0 does:

	if (value != 0) {
		return __builtin_clz(value);
Comment 4 Mikael Pettersson 2021-06-23 08:38:06 UTC
(In reply to Uroš Bizjak from comment #3)
> (In reply to Mikael Pettersson from comment #2)
> > (In reply to Iru Cai from comment #0)
> > > Built with '-march=x86-64-v3 -O1', the following code generates a bsr
> > > instruction, which has undefined behavior when the source operand is zero,
> > > thus gives wrong result
> > 
> > The documentation for __builtin_clz(x) states "If x is 0, the result is
> > undefined".
> 
> The testcase from Comment #0 does:
> 
> 	if (value != 0) {
> 		return __builtin_clz(value);

Yes I just noticed. My mistake.
Comment 5 GCC Commits 2021-06-23 10:51:45 UTC
The master branch has been updated by Uros Bizjak <uros@gcc.gnu.org>:

https://gcc.gnu.org/g:1e16f2b472c7d253d564556a048dc4ae16119c00

commit r12-1743-g1e16f2b472c7d253d564556a048dc4ae16119c00
Author: Uros Bizjak <ubizjak@gmail.com>
Date:   Wed Jun 23 12:50:53 2021 +0200

    i386: Prevent unwanted combine from LZCNT to BSR [PR101175]
    
    The current RTX pattern for BSR allows combine pass to convert LZCNT insn
    to BSR. Note that the LZCNT has a defined behavior to return the operand
    size when operand is zero, where BSR has not.
    
    Add a BSR specific setting of zero-flag to RTX pattern of BSR insn
    in order to avoid matching unwanted combinations.
    
    2021-06-23  Uroš Bizjak  <ubizjak@gmail.com>
    
    gcc/
            PR target/101175
            * config/i386/i386.md (bsr_rex64): Add zero-flag setting RTX.
            (bsr): Ditto.
            (*bsrhi): Remove.
            (clz<mode>2): Update RTX pattern for additions.
    
    gcc/testsuite/
    
            PR target/101175
            * gcc.target/i386/pr101175.c: New test.
Comment 6 GCC Commits 2021-06-23 18:11:32 UTC
The releases/gcc-11 branch has been updated by Uros Bizjak <uros@gcc.gnu.org>:

https://gcc.gnu.org/g:e99256fc5eab1cf8958223d79b23e359b6d5ca60

commit r11-8644-ge99256fc5eab1cf8958223d79b23e359b6d5ca60
Author: Uros Bizjak <ubizjak@gmail.com>
Date:   Wed Jun 23 12:50:53 2021 +0200

    i386: Prevent unwanted combine from LZCNT to BSR [PR101175]
    
    The current RTX pattern for BSR allows combine pass to convert LZCNT insn
    to BSR. Note that the LZCNT has a defined behavior to return the operand
    size when operand is zero, where BSR has not.
    
    Add a BSR specific setting of zero-flag to RTX pattern of BSR insn
    in order to avoid matching unwanted combinations.
    
    2021-06-23  Uroš Bizjak  <ubizjak@gmail.com>
    
    gcc/
            PR target/101175
            * config/i386/i386.md (bsr_rex64): Add zero-flag setting RTX.
            (bsr): Ditto.
            (*bsrhi): Remove.
            (clz<mode>2): Update RTX pattern for additions.
    
    gcc/testsuite/
    
            PR target/101175
            * gcc.target/i386/pr101175.c: New test.
    
    (cherry picked from commit 1e16f2b472c7d253d564556a048dc4ae16119c00)
Comment 7 GCC Commits 2021-06-24 06:17:30 UTC
The releases/gcc-10 branch has been updated by Uros Bizjak <uros@gcc.gnu.org>:

https://gcc.gnu.org/g:ab383ecb4a45413fcc0012bc1791c094fe7fed29

commit r10-9955-gab383ecb4a45413fcc0012bc1791c094fe7fed29
Author: Uros Bizjak <ubizjak@gmail.com>
Date:   Wed Jun 23 12:50:53 2021 +0200

    i386: Prevent unwanted combine from LZCNT to BSR [PR101175]
    
    The current RTX pattern for BSR allows combine pass to convert LZCNT insn
    to BSR. Note that the LZCNT has a defined behavior to return the operand
    size when operand is zero, where BSR has not.
    
    Add a BSR specific setting of zero-flag to RTX pattern of BSR insn
    in order to avoid matching unwanted combinations.
    
    2021-06-23  Uroš Bizjak  <ubizjak@gmail.com>
    
    gcc/
            PR target/101175
            * config/i386/i386.md (bsr_rex64): Add zero-flag setting RTX.
            (bsr): Ditto.
            (*bsrhi): Remove.
            (clz<mode>2): Update RTX pattern for additions.
    
    gcc/testsuite/
    
            PR target/101175
            * gcc.target/i386/pr101175.c: New test.
    
    (cherry picked from commit 1e16f2b472c7d253d564556a048dc4ae16119c00)
Comment 8 GCC Commits 2021-06-24 06:17:54 UTC
The releases/gcc-9 branch has been updated by Uros Bizjak <uros@gcc.gnu.org>:

https://gcc.gnu.org/g:9b997caa72498bc3a14a064648b721fe0f11945e

commit r9-9603-g9b997caa72498bc3a14a064648b721fe0f11945e
Author: Uros Bizjak <ubizjak@gmail.com>
Date:   Wed Jun 23 12:50:53 2021 +0200

    i386: Prevent unwanted combine from LZCNT to BSR [PR101175]
    
    The current RTX pattern for BSR allows combine pass to convert LZCNT insn
    to BSR. Note that the LZCNT has a defined behavior to return the operand
    size when operand is zero, where BSR has not.
    
    Add a BSR specific setting of zero-flag to RTX pattern of BSR insn
    in order to avoid matching unwanted combinations.
    
    2021-06-23  Uroš Bizjak  <ubizjak@gmail.com>
    
    gcc/
            PR target/101175
            * config/i386/i386.md (bsr_rex64): Add zero-flag setting RTX.
            (bsr): Ditto.
            (*bsrhi): Remove.
            (clz<mode>2): Update RTX pattern for additions.
    
    gcc/testsuite/
    
            PR target/101175
            * gcc.target/i386/pr101175.c: New test.
    
    (cherry picked from commit 1e16f2b472c7d253d564556a048dc4ae16119c00)
Comment 9 Uroš Bizjak 2021-06-24 06:18:28 UTC
Fixed everywhere.