Re: sqrt via SSE2 registers

> Is there any way of telling gcc that I care so little about precision that
> I'm prepared to have it compute the square roots of objects in SSE2
> registers by using SQRTSD rather than by storing the register to memory,
> loading it onto the FP stack and calling fsqrt?
> I've tried (on the principle of sticking in anything that might have an
> effect)
> -O3 -march=pentium4 -msse2 -mfpmath=sse{,387} -ffast-math \
> -mno-fancy-math-387
> but nonetheless always get the worst-of-both-worlds behavior described
> above.
> Am I just being too optimistic about mainline gcc's current level of support
> for the P4? There is a sqrtdf2_1 instruction in which looks as if it
> should behave correctly.

Isn't this a library issue? Some glibc versions redefine sqrt as an asm statement.
I get:
u-pl5:/tmp/egcs/build/gcc$ more t.c
double a;
u-pl5:/tmp/egcs/build/gcc$ ./xgcc t.c -O2 -S -B ./ -march=pentium4  -mfpmath=sse -ffast-math
u-pl5:/tmp/egcs/build/gcc$ more t.s
        .file   "t.c"
        .align 2
.globl main
        .type   main,@function
        pushl   %ebp
        movl    %esp, %ebp
        subl    $8, %esp
        sqrtsd  a, %xmm0
        andl    $-16, %esp
        movsd   %xmm0, a
        movl    %ebp, %esp
        popl    %ebp
        .size   main,.Lfe1-main
        .comm   a,8,8
        .ident  "GCC: (GNU) 3.1 20020217 (experimental)"

