[v3] libstdc++/44413 (for ext/vstring)

Paolo Carlini paolo.carlini@oracle.com
Wed Jun 9 18:23:00 GMT 2010


Hi,

first of all, thanks a lot for the time you spent on this.
> I don't know the OP (actually I think it's a bugzilla submission).
> Maybe there is/could be a missed optimization -- I don't know.  But:
>
> For example changing the original code to:
>
> return (__n1 >= __n2) ? (__n1 - __n2) : -(__n2 - __n1);
>
> Doesn't change the behavior at *all* for 32 bit while reducing the 64
> bit code to what the 32 bit code generates (TOTALLY not runtime
> tested, of course). The caveat, however, is that that particular
> line of code is taking advantage of gcc's implementation defined
> behavior for the integral conversions of the unsigned subtraction into
> the 32 bit int (section 4.7, I think). On the other hand, it's
> exactly what the 32 bit version decays into *for gcc*.
>   
Agreed. Personally, I would like to stay away from this
implementation-defined behavior, if at all possible. But let's CC a
compiler person I trust and see whether he believes it would make sense
to use the above *in GCC* (though we also have quite a few users of the
library with other compilers, ICC for example). I'm also worried that
once command line switches like -ftrapv fully work (I don't think that's
the case yet), the code may stop working completely.

What do you think, Richard? In the expression above, the return type is
int, and __n1 and __n2 are unsigned long, on 64-bit.
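To make the conversion in question concrete, here is a minimal sketch of the quoted one-liner wrapped in a function. The name compare_lengths_narrowing is hypothetical, and std::uint64_t stands in for "unsigned long on 64-bit" so the sketch does not depend on the target; a 32-bit int is assumed. It is not the library code, only an illustration of where the implementation-defined step sits.

```cpp
#include <cstdint>

// Hypothetical stand-in for the quoted expression; not the libstdc++ code.
// std::uint64_t models "unsigned long on 64-bit" (an assumption of this sketch).
inline int compare_lengths_narrowing(std::uint64_t n1, std::uint64_t n2)
{
    // Both branches are unsigned 64-bit values (unary minus on an unsigned
    // value wraps modulo 2^64). The cast to int is the integral conversion
    // whose result is implementation-defined when the value does not fit
    // (GCC documents reduction modulo 2^N).
    return static_cast<int>((n1 >= n2) ? (n1 - n2) : -(n2 - n1));
}
```

With GCC's documented modulo reduction, small differences come out as expected (5 vs 3 gives 2, 3 vs 5 gives -2), but a difference of exactly 2^32 is reported as 0 and a difference of 2^31 comes out negative even though the first length is the larger one.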
> In addition, maintaining the (standard's defined) behavior for
> string.compare (IOW, returning any value less-than, zero, or
> greater-than), the 64 bit can be reduced to something like the
> following (can be changed around a bit by switching the order of
> comparisons...IOW the best code to represent this isn't necessarily
> selected)
>
> return (__n2 != __n1) ? (__n1 < __n2 ? -1 : 1) : 0;
>
>         xorl    %eax, %eax
>         cmpq    %rsi, %rdi
>         je      .L24
>         sbbl    %eax, %eax
>         orl     $1, %eax
> .L24:
>         rep
>         ret
>
>
> But again, the 32 bit *suffers* because it emits very similar code
> with an additional compare and branch in place of the original
> subtract (and cmove).
>   
I would say that the latter, modulo maybe some minor performance-neutral
reshuffles in the assembly, should be equivalent to the code I quickly
hacked earlier today, 32-bit problems included.
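For reference, two ways of producing the result without converting an out-of-range unsigned value to int might look like the sketch below. Both function names are hypothetical. The first is the sign-only expression from the quoted text. The second, a difference saturated to the range of int, is an assumption added here for comparison and is not the code I hacked earlier; it also assumes size_t is at least as wide as int.

```cpp
#include <cstddef>
#include <limits>

// Hypothetical helpers for illustration; neither is the libstdc++ code.

// Sign-only result, as in the quoted one-liner: -1, 0 or 1.
inline int compare_lengths_sign(std::size_t n1, std::size_t n2)
{
    return (n2 != n1) ? (n1 < n2 ? -1 : 1) : 0;
}

// Difference saturated to the range of int. The unsigned difference is
// only cast to int after checking that it fits, so no out-of-range
// conversion ever takes place.
inline int compare_lengths_clamped(std::size_t n1, std::size_t n2)
{
    const int int_max = std::numeric_limits<int>::max();
    const int int_min = std::numeric_limits<int>::min();
    const std::size_t limit = static_cast<std::size_t>(int_max);

    if (n1 >= n2) {
        const std::size_t d = n1 - n2;
        return d > limit ? int_max : static_cast<int>(d);
    }
    const std::size_t d = n2 - n1;
    return d > limit ? int_min : -static_cast<int>(d);
}
```

Either form satisfies the requirement that compare only return a negative value, zero, or a positive value; the clamped form additionally keeps the exact difference whenever it fits in int, at the cost of the extra compares and branches.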

Paolo.


