This is the mail archive of the
gcc@gcc.gnu.org
mailing list for the GCC project.
Re: Slow memcmp for aligned strings on Pentium 3
"Kevin Atkinson" <kevina at gnu dot org> wrote in message
news:Pine dot LNX dot 4 dot 44 dot 0304040923350 dot 912-300000 at kevin-pc dot atkinson dot dhs dot org dot dot dot
> Here you go. This still assumes the memory is aligned:
>
> This is what I did:
>
> int cmpa2(const unsigned int * x, const unsigned int * y, size_t size)
> {
>   int i = 0;
>   size_t s = size / 4;
>   while (i < s && x[i] == y[i]) ++i;
>   size -= i * 4;
>   if (size == 0) return 0;
>   // hopefully if this is inline expanded when size is known
>   // the compiler can eliminate many of these conditionals
>   else if (size >= 4) { // if original size % 4 == 0 this should
>                         // always be the case
>     unsigned int xx = x[i], yy = y[i];
>     asm("bswap %0" : "+r"(xx));
>     asm("bswap %0" : "+r"(yy));
>     return xx - yy;
>   } else {
>     const unsigned char * xb = (const unsigned char *)(x + i);
>     const unsigned char * yb = (const unsigned char *)(y + i);
>     // if size is known at compile time then the compiler should be
>     // able to select the correct choice at compile time
>     switch (size) {
>     case 1:
>       return *xb - *yb;
>     case 2:
>       return ((xb[0] - yb[0]) << 8) + (xb[1] - yb[1]);
>     case 3:
>       return ((xb[0] - yb[0]) << 16) + ((xb[1] - yb[1]) << 8)
>         + xb[2] - yb[2];
>     }
>   }
> }
There is still a flaw: you assume that the difference of two unsigned
integers gives the correct signed result. That fails if, for instance,
in your "return xx - yy" statement xx is 0xf0000000 and yy is 0x10000000.
In that case xx is clearly greater than yy, but the difference is
0xe0000000, which converted to a signed integer is negative and so
indicates that xx is smaller than yy, which is wrong.
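
To make the problem concrete, here is a minimal sketch. It assumes a
32-bit int and GCC's wrap-around behaviour when an out-of-range unsigned
value is converted to int. Both function names are made up for
illustration and are not part of the quoted code. The first helper
repeats the subtraction approach with xx = 0xf0000000 and
yy = 0x10000000 in mind; the second derives the sign from two
comparisons, so it cannot wrap.

```c
#include <stdint.h>

/* Hypothetical helper mirroring "return xx - yy": the unsigned
 * difference is converted to int, so any difference with the top bit
 * set comes out negative (assumes GCC's modular conversion). */
static int cmp_by_subtraction(uint32_t a, uint32_t b)
{
    return (int)(a - b);
}

/* Hypothetical replacement: yields -1, 0 or 1 from two comparisons,
 * which is correct for every pair of unsigned values. */
static int cmp_u32(uint32_t a, uint32_t b)
{
    return (a > b) - (a < b);
}
```

With the values above, cmp_by_subtraction() returns a negative number
even though the first argument is larger, while cmp_u32() returns 1.
Since memcmp only promises the sign of its result, returning -1/0/1 in
place of the raw difference should be acceptable for callers.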
Marcel