[Bug c++/23793] New: Unhealthy optimization. Accessing double with reinterpret_cast.

Fri Sep 9 07:51:00 GMT 2005

This is a bug report. However, I will first provide some background
for my little piece of code. (The code itself is very simple.)

I will try to post on comp.std.c++ in order to make this a part of C++
(maybe C); otherwise I might come back to beg you to implement it just in your
compiler.

There are many reasons. (The main one is to switch on a double based on
intervals.) Therefore I would like a VERY FAST FUNCTION to return the sign of a
double (and float and long double) ...

A compare on a double such as d<=-0.0 (without a branch) won't do the trick, and
I can't blame you, because you have to respect NaN.

Since C/C++ does not have this fast function (skipping NaN)
(which I hope will come), I have no other option than to
write it myself (and cheat!)

That means that my trick will only work on doubles in IEEE 754 format whose size is twice that of an int.

The size assumptions for double (in my case 8 bytes) and int (in my case 4 bytes) could
probably be handled with the right macros.
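
One way the size and format assumptions might be checked at compile time is sketched below. This is not part of the original report: it assumes a C++11 compiler (for static_assert), and the helper name layout_assumptions_hold is hypothetical, present only so the checks can be exercised.

```cpp
#include <climits>
#include <limits>

// Hypothetical compile-time guards (not from the original report).
// Compilation stops here if the platform does not match the assumptions.
static_assert(std::numeric_limits<double>::is_iec559,
              "double must use an IEEE 754 format");
static_assert(sizeof(double) == 2 * sizeof(unsigned int),
              "double must be exactly two unsigned ints wide");
static_assert(sizeof(unsigned int) * CHAR_BIT == 32,
              "unsigned int must be 32 bits");

// Hypothetical helper: if this translation unit compiles, every check passed.
constexpr bool layout_assumptions_hold() { return true; }
```

These checks cover size and format only. Which of the two ints holds the sign bit still depends on byte order, which the sketch does not test.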

However, my code will only work on x86 and other machines that follow the IEEE 754
standard. I think Motorola does not follow this, but never mind.

The code with the bug is: (read the sign bit and shift it down to be one or zero)
int is_not_positive(double v)
{
  return ((reinterpret_cast<unsigned int*>(&v)[1]) >> 31);
}

This works with option O1 (and below)
but fails with O2 (and above)
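
The O2 failure is consistent with the cast breaking the aliasing rules of the language: reading a double through an unsigned int lvalue is undefined behavior, so the optimizer is allowed to drop the store. For comparison, here is a minimal sketch of the same sign-bit read done through std::memcpy, which is well defined. It is not from the original report: it assumes a C++11 compiler and an 8-byte IEEE 754 double, and the name is_sign_bit_set is hypothetical.

```cpp
#include <cstdint>
#include <cstring>

// Sketch only: assumes double is an 8-byte IEEE 754 value.
// The function name is hypothetical, not from the report.
int is_sign_bit_set(double v)
{
    static_assert(sizeof(double) == sizeof(std::uint64_t),
                  "this sketch expects a 64-bit double");
    std::uint64_t bits;
    // Copy the object representation instead of casting the pointer;
    // this avoids reading the double through an incompatible type.
    std::memcpy(&bits, &v, sizeof bits);
    // The most significant bit of the 64-bit pattern is the sign bit.
    return static_cast<int>(bits >> 63);
}
```

Like the original function, this reports 1 for -0.0 and for any NaN with the sign bit set. Because it reads the whole value as one 64-bit integer, it does not depend on which half of the double comes first in memory. Compilers that offer it may also provide std::signbit from <cmath>, which returns the same information portably.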

The correct (but not fast) O1 code looks like this:

    .file    "bug.cpp"
    .text
    .align 2
.globl _Z15is_not_positived
    .type    _Z15is_not_positived, @function
_Z15is_not_positived:
.LFB3:
    pushl    %ebp
.LCFI0:
    movl    %esp, %ebp
.LCFI1:
    subl    $8, %esp
.LCFI2:
    fldl    8(%ebp)
    fstpl    -8(%ebp)
    movl    -4(%ebp), %eax
    shrl    $31, %eax
    movl    %ebp, %esp
    popl    %ebp
    ret
.LFE3:
    .size    _Z15is_not_positived, .-_Z15is_not_positived
    .section    .note.GNU-stack,"",@progbits
    .ident    "GCC: (GNU) 3.3.5-20050130 (Gentoo 3.3.5.20050130-r1, ssp-3.3.5.20050130-1, pie-8.7.7.1)"

The wrong optimization simply removes:
fldl    8(%ebp)
fstpl    -8(%ebp)
// I guess that it removes the store. 

---------------------------------------
The optimized code I would like to see is (note that this is partly hand-written, so
I might be wrong. I am not an assembler expert)

.LFB4:
    pushl    %ebp
.LCFI0:
    movl    %esp, %ebp
.LCFI1:
    movl    12(%ebp), %eax
    popl    %ebp
    shrl    $31, %eax
    ret
.LFE4:

I am sorry that I have not tested it with a newer version.
(However, I am not too bright, and the last time I emerged gcc (accepting the newest
version) I had problems compiling my kernel.)
I hope the answer is "just upgrade, you stupid man..."

Regards 
Thorbjørn Martsum

PS: 
BTW...
I have found a workaround using unsigned long long. 
This works with O3 and has only one extra instruction compared to "my best"

    movl    12(%ebp), %eax
    popl    %ebp

expands to

    movl    12(%ebp), %ecx
    popl    %ebp
    movl    %ecx, %eax

(a missing peephole pattern?)
But still WAY WAY better than what MS VS6 gives =)

-- 
           Summary: Unhealthy optimization. Accessing double with
                    reinterpret_cast.
           Product: gcc
           Version: 3.3.5
            Status: UNCONFIRMED
          Severity: normal
          Priority: P2
         Component: c++
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: martsummsw at hotmail dot com
                CC: gcc-bugs at gcc dot gnu dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23793