[v3] libstdc++/44413 (for ext/vstring)
Wed Jun 9 18:23:00 GMT 2010
First, thanks a lot for the time you spent on this.
> I don't know the OP (actually I think it's a bugzilla submission).
> Maybe there is/could be a missed optimization -- I don't know. But:
> For example changing the original code to:
> return (__n1 >= __n2) ? (__n1 - __n2) : -(__n2 - __n1);
> Doesn't change the behavior at *all* for 32 bit while reducing the 64
> bit code to what the 32 bit code generates (TOTALLY not runtime
> tested, of course). The caveat, however, is that that particular
> line of code is taking advantage of gcc's implementation defined
> behavior for the integral conversions of the unsigned subtraction into
> the 32 bit int (section 4.7 i think??). On the other hand, it's
> exactly what the 32 bit version decays into *for gcc*.
Agreed. Personally, I would like to stay away from this implementation
defined behavior, if at all possible. But let's CC a compiler
person I trust and see what he thinks, whether he believes it
would make sense to use the above *in GCC* (but, well, we also have quite
a few users of the library with ICC, for example). I'm also
worried that once command line switches like -ftrapv fully work (I don't
think that is the case yet), the code may stop working completely.
What do you think, Richard? In the code above the return type is int, and
__n1 and __n2 are unsigned long, on 64-bit.
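To make the concern concrete, here is a minimal sketch of what the narrowing keeps. The helper name length_diff_low32 and the sample lengths are made up for illustration, and std::uint64_t stands in for the 64-bit unsigned long. It inspects the low 32 bits directly instead of relying on the conversion to int, on the assumption (which matches GCC's documented behavior) that the conversion reduces the value modulo 2^32:

```cpp
#include <cstdint>

// Hypothetical helper, not library code: returns the low 32 bits of the
// unsigned difference, i.e. the bits that would survive if the 64-bit
// result were narrowed to a 32-bit int by modulo-2^32 reduction.
inline std::uint32_t length_diff_low32(std::uint64_t n1, std::uint64_t n2)
{
    // Unsigned subtraction wraps, and unsigned narrowing is well defined.
    return static_cast<std::uint32_t>(n1 - n2);
}
```

With lengths 2^32 and 0 the helper yields 0, so the narrowed result would claim the lengths are equal. With 3 * 2^30 and 0 the top bit of the result is set, so the int would come out negative although the first length is the larger one.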
> In addition, maintaining the (standard's defined) behavior for
> string.compare (IOW, returning any value less-than, zero, or
> greater-than), the 64 bit can be reduced to something like the
> following (can be changed around a bit by switching the order of
> comparisons...IOW the best code to represent this isn't necessarily
> return (__n2 != __n1) ? (__n1 < __n2 ? -1 : 1) : 0;
> xorl %eax, %eax
> cmpq %rsi, %rdi
> je .L24
> sbbl %eax, %eax
> orl $1, %eax
> But again, the 32 bit *suffers* because it emits very similar code
> with an additional compare and branch in place of the original
> subtract (and cmove).
I would say that the latter, modulo maybe minor performance-neutral
reshuffles in the assembly, should be equivalent to the code I quickly
hacked earlier today, 32-bit problems included.
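For reference, the quoted expression can be wrapped in a small free function as below. This is only a sketch: compare_lengths is a hypothetical name, std::size_t is assumed for the length type, and it is not meant to reproduce the actual patch.

```cpp
#include <cstddef>

// Hypothetical stand-in for the length comparison discussed above.
// It returns only -1, 0 or 1, so the difference of the two lengths is
// never narrowed to int and no implementation-defined conversion occurs.
inline int compare_lengths(std::size_t n1, std::size_t n2)
{
    return (n2 != n1) ? (n1 < n2 ? -1 : 1) : 0;
}
```

Since string.compare only has to return a value less than, equal to, or greater than zero, giving up the magnitude of the difference stays within what the standard requires.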