This is the mail archive of the
gcc-help@gcc.gnu.org
mailing list for the GCC project.
Re: Using bt,bts
On Wed, Sep 26, 2012 at 04:20:52PM -0700, Ian Lance Taylor wrote:
> On Wed, Sep 26, 2012 at 10:34 AM, Ondřej Bílka <neleai@seznam.cz> wrote:
>
> > is there a reason why for example
> > x=x|(1<<11);
> > is not expanded into
> > bts rax,11
> > ?
>
> The bts instruction is never faster than the corresponding or
> instruction. There's no reason to use it when setting a bit in the
> low 32 bits.
>
> Ian
The following benchmark says otherwise. On Ivy Bridge the bts variant is
twice as fast as the or variant.
I used
for(i=0;i<100000000;i++)
  x=x|(1<<i);
implemented as
	.globl	main
	.type	main, @function
main:
.LFB0:
	.cfi_startproc
	xorl	%eax, %eax		# return value = 0
	xorl	%ecx, %ecx		# i = 0
	movl	$1, %edx		# x = 1
	.p2align 4,,10
	.p2align 3
.L2:
	bts	%ecx, %edx		# set bit (i mod 32) of x
	addl	$1, %ecx		# i++
	cmpl	$100000000, %ecx
	jne	.L2
	rep
	ret
	.cfi_endproc
and
	.globl	main
	.type	main, @function
main:
.LFB0:
	.cfi_startproc
	xorl	%eax, %eax		# x = 0
	xorl	%ecx, %ecx		# i = 0
	movl	$1, %edx		# constant 1
	.p2align 4,,10
	.p2align 3
.L2:
	movl	%edx, %esi		# esi = 1
	sall	%cl, %esi		# esi = 1 << (i mod 32)
	addl	$1, %ecx		# i++
	orl	%esi, %eax		# x |= esi
	cmpl	$100000000, %ecx
	jne	.L2
	rep
	ret
	.cfi_endproc
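
For anyone who wants to try the same comparison from C rather than from
hand-edited assembly, here is a rough sketch of the two loop bodies. The
function names set_bits_or and set_bits_bts are made up for illustration,
and the inline asm assumes a GCC-compatible compiler targeting x86; on
other targets the second function simply falls back to the shift-and-or
form. The count is masked with 31 because 1<<i is undefined in C once i
reaches 32, whereas both sall and register-operand bts use the bit index
modulo 32 for 32-bit operands. This only checks that the two forms compute
the same value; it is not a timing harness and makes no claim about which
is faster on a given CPU.

```c
#include <stdint.h>

/* Shift-and-or form: build the mask with a shift, then OR it into x.
   The index is masked so the C code has defined behavior for i >= 32. */
uint32_t set_bits_or(uint32_t x, uint32_t n)
{
    for (uint32_t i = 0; i < n; i++)
        x |= UINT32_C(1) << (i & 31);
    return x;
}

/* bts form: set bit (i mod 32) of x with a single instruction.
   Assumption: GCC-style extended asm on x86; elsewhere use the
   portable shift-and-or expression instead. */
uint32_t set_bits_bts(uint32_t x, uint32_t n)
{
    for (uint32_t i = 0; i < n; i++) {
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
        /* AT&T operand order: bit index first, then the word to modify. */
        __asm__("btsl %1, %0" : "+r"(x) : "r"(i) : "cc");
#else
        x |= UINT32_C(1) << (i & 31);
#endif
    }
    return x;
}
```

Calling both functions with the same start value and iteration count
should give identical results, which is a cheap sanity check before
timing either loop.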