This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



Re: Slow memcmp for aligned strings on Pentium 3


On Thu, 3 Apr 2003, Roger Sayle wrote:

> 
> Hi Kevin,
> > I did some tests and discovered that using cmps was rather slow,
> > compared to a simple loop and then a bswap and subtract at the end.
> 
> I'm sure that GCC's memcmp implementations could be improved, but
> from reading the code examples in your patch it looks like you
> are always assuming that either the length is a multiple of four,
> or that the bytes following the memory sections to be compared
> contain identical values (i.e. you're hoping they're all zero).
> 
> i.e., if p and q are suitably 4-byte aligned
> 
>   memcpy(p,"abcd",4);
>   memcpy(q,"abef",4);
>   memcmp(p,q,2)
> 
> should compare equal, but doesn't when using bswaps and subtractions.
> Similarly, when two words mismatch, whether the return value is
> negative or positive should depend upon the first byte that differs,
> not the values of the bytes that come after it.

You're right.  It can be fixed by testing whether the length is a multiple
of 4 and, if not, doing a bytewise comparison of the remaining bytes at the
end.
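
A minimal sketch of that fix in portable C, not the code from the patch.
The names word_memcmp and load_be32 are hypothetical. load_be32 stands in
for what a bswap would produce on a little-endian machine, and the result
is derived from an unsigned comparison rather than a subtraction so that
it cannot overflow. The return value is normalized to -1, 0 or 1, which
memcmp permits but does not require.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read 4 bytes as a big-endian 32-bit value, so that
   the first byte in memory is the most significant one.  On little-endian
   x86 this is the value a load followed by bswap would give. */
static uint32_t load_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Hypothetical word-at-a-time memcmp with a bytewise tail. */
int word_memcmp(const void *a, const void *b, size_t n)
{
    const unsigned char *p = a;
    const unsigned char *q = b;
    size_t words = n / 4;   /* number of complete 4-byte words */
    size_t tail  = n % 4;   /* leftover bytes, compared one by one */
    size_t i;

    for (i = 0; i < words; i++, p += 4, q += 4) {
        uint32_t x, y;
        memcpy(&x, p, 4);   /* memcpy avoids alignment and aliasing issues */
        memcpy(&y, q, 4);
        if (x != y) {
            /* Order the words by their first differing byte. */
            return load_be32(p) < load_be32(q) ? -1 : 1;
        }
    }

    /* Only the bytes inside the requested length are examined, so
       whatever follows the buffers cannot affect the result. */
    for (i = 0; i < tail; i++) {
        if (p[i] != q[i])
            return p[i] < q[i] ? -1 : 1;
    }
    return 0;
}
```

With this version, word_memcmp("abcd", "abef", 2) returns 0 because the
third and fourth bytes are never read, while a length of 4 gives a
negative result because 'c' < 'e'.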

> I suspect it should be possible to fix your code to handle these
> termination conditions correctly, and a comparison of your
> routine's performance with these fixes vs. __builtin_memcmp
> would be of interest.

I will see what I can do.

--- 
http://kevin.atkinson.dhs.org

