This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: Flag for handling inlining of strcmp/memcmp on i386
- From: Andrew Pinski <pinskia at gmail dot com>
- To: Martin Thuresson <martint at google dot com>
- Cc: gcc-patches at gcc dot gnu dot org
- Date: Tue, 29 Sep 2009 15:40:55 -0700
- Subject: Re: Flag for handling inlining of strcmp/memcmp on i386
- References: <e73979930909291131x3b4de3re41bc420fe478811@mail.gmail.com> <e73979930909291309x7b882bd1m46af513e16b56860@mail.gmail.com> <e73979930909291332o50246831q4a551525662c88e2@mail.gmail.com>
On Tue, Sep 29, 2009 at 1:32 PM, Martin Thuresson <martint@google.com> wrote:
> Gcc currently inlines memcmp and strcmp to repz cmpsb during
> optimization. Since the library call has optimizations, such as
> reading full, aligned words, it turns out that byte-by-byte
> comparison is usually slower than calling the library functions.
>
> The diagrams show performance numbers for the library
> call and the inlined version. The numbers are from a
> microbenchmark that compares buffers (both equal and unequal)
> of various lengths.
>
> http://www.ce.chalmers.se/~martin/foo/amd_opteron_call_repz.png
> http://www.ce.chalmers.se/~martin/foo/intel_core_call_repz.png
Did these microbenchmarks include aligned and unaligned addresses?
You mention that the out-of-line version handles aligned accesses
better, but I can't tell whether you tested the unaligned case.
Thanks,
Andrew Pinski
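
To make the aligned/unaligned question concrete, here is a minimal sketch of how such a microbenchmark could cover both cases. It is not the benchmark used for the graphs above. The names bytewise_memcmp and time_compare are hypothetical, and the byte loop is only a portable stand-in for the inlined repz cmpsb sequence, not the code GCC actually emits. It assumes malloc returns suitably aligned memory, so a start offset of 0 is the aligned case and an odd offset is the unaligned case.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

typedef int (*cmp_fn)(const void *, const void *, size_t);

/* Hypothetical stand-in for the inlined byte-by-byte comparison.
   Returns <0, 0 or >0 like memcmp. */
static int bytewise_memcmp(const void *a, const void *b, size_t n)
{
    const unsigned char *p = a;
    const unsigned char *q = b;
    for (size_t i = 0; i < n; i++) {
        if (p[i] != q[i])
            return p[i] < q[i] ? -1 : 1;
    }
    return 0;
}

/* Time `iters` calls of `fn` on two `len`-byte buffers that start
   `off_a` and `off_b` bytes past a malloc-aligned base.
   If `equal` is zero, the buffers differ in their last byte.
   Returns elapsed CPU seconds, or -1.0 if allocation fails. */
static double time_compare(cmp_fn fn, size_t len, size_t off_a,
                           size_t off_b, int equal, long iters)
{
    assert(iters > 0);

    unsigned char *a = malloc(len + off_a + 1);
    unsigned char *b = malloc(len + off_b + 1);
    if (a == NULL || b == NULL) {
        free(a);
        free(b);
        return -1.0;
    }
    memset(a, 'x', len + off_a + 1);
    memset(b, 'x', len + off_b + 1);
    if (!equal && len > 0)
        b[off_b + len - 1] = 'y';

    /* Calling through a volatile pointer keeps the compiler from
       resolving the target and hoisting the call out of the loop. */
    cmp_fn volatile call = fn;
    volatile int sink = 0;

    clock_t start = clock();
    for (long i = 0; i < iters; i++)
        sink += call(a + off_a, b + off_b, len);
    clock_t end = clock();

    (void)sink;
    free(a);
    free(b);
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

Calling time_compare(memcmp, len, 0, 0, ...) and time_compare(memcmp, len, 1, 3, ...) for a range of lengths, and then the same with bytewise_memcmp, would give one aligned and one unaligned series per implementation. clock() is coarse, so the iteration count needs to be large enough for the differences to show.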