This is the mail archive of the
gcc@gcc.gnu.org
mailing list for the GCC project.
Re: Slow memcmp for aligned strings on Pentium 3
- From: Kevin Atkinson <kevina at gnu dot org>
- To: Roger Sayle <roger at www dot eyesopen dot com>
- Cc: gcc at gcc dot gnu dot org
- Date: Fri, 04 Apr 2003 08:15:44 -0500 (EST)
- Subject: Re: Slow memcmp for aligned strings on Pentium 3
On Thu, 3 Apr 2003, Roger Sayle wrote:
>
> Hi Kevin,
> > I did some tests and discovered that using cmps was rather slow,
> > compared to a simple loop and then a bswap and subtract at the end.
>
> I'm sure that GCC's memcmp implementations could be improved, but
> from reading the code examples in your patch it looks like you
> are always assuming that either the length is a multiple of four,
> or that the bytes following the memory sections to be compared
> contain identical values (i.e. you're hoping they're all zero).
>
> i.e., if p and q are suitably 4-byte aligned
>
> memcpy(p,"abcd",4);
> memcpy(q,"abef",4);
> memcmp(p,q,2)
>
> should compare equal, but doesn't when using bswaps and subtractions.
> Similarly, when two words mismatch, whether the return value is <0
> or >0 should depend upon the first byte that differs, not the
> values of the bytes that come after it.
You're right. It can be fixed by testing whether the length is a multiple
of 4 and, if it is not, doing a bytewise comparison at the end.
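
A rough sketch of that fix in portable C, for illustration only. It is
not the code from the patch; the names word_memcmp and load_be32 are
made up here, and load_be32 stands in for what a load followed by bswap
gives on little-endian x86. Whole words are compared first, and any
leftover 1 to 3 bytes are compared bytewise. The mismatching words are
ordered with unsigned comparisons rather than a plain subtraction,
since a 32-bit difference can overflow the int return value.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: build a word whose most significant byte is the
   byte at the lowest address (the effect of bswap on little-endian x86). */
static uint32_t load_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Illustrative name; a sketch, not GCC's or the patch's implementation. */
static int word_memcmp(const void *a, const void *b, size_t n)
{
    const unsigned char *p = a;
    const unsigned char *q = b;
    size_t i;

    /* Compare one 4-byte word at a time while a whole word remains. */
    while (n >= 4) {
        uint32_t x, y;
        memcpy(&x, p, 4);   /* memcpy avoids alignment/aliasing problems */
        memcpy(&y, q, 4);
        if (x != y) {
            /* Order by the first differing byte: compare the
               byte-swapped words as unsigned values. */
            uint32_t bx = load_be32(p);
            uint32_t by = load_be32(q);
            return (bx > by) - (bx < by);
        }
        p += 4;
        q += 4;
        n -= 4;
    }

    /* Length was not a multiple of 4: compare the remaining bytes one
       at a time so bytes past the end are never looked at. */
    for (i = 0; i < n; i++) {
        if (p[i] != q[i])
            return (int)p[i] - (int)q[i];
    }
    return 0;
}
```

With p holding "abcd" and q holding "abef", word_memcmp(p, q, 2) takes
only the bytewise path and returns 0, while word_memcmp(p, q, 4) returns
a negative value because 'c' < 'e'.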
> I suspect it should be possible to fix your code to handle these
> termination conditions correctly, and a comparison of your
> routine's performance with these fixes vs. __builtin_memcmp
> would be of interest.
I will see what I can do.
---
http://kevin.atkinson.dhs.org