This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.
Here is a revised version; thanks to everybody for the quick feedback!

I am now using EAX to hold the decreasing loop counter, and checking for <16 before checking for =0, as Jakub suggested. It could be interesting to check whether working on -EAX and using indexed addressing modes is worthwhile (I'm not sure, because you'd lose the ability to decrease the loop counter).

I performed the tests with 100 repetitions, since on machines only slightly newer than mine a single repetition is far too short to observe anything; the number is customizable. A single program does both testing and benchmarking (to avoid clashes, my implementation is now called my_memcmp); compile it with -fno-builtin, of course!

As Roger Sayle suggested, I'm measuring user time instead of wall time (though my machine is so lightly loaded that I cannot see the difference). Note that I'm using -mcpu=i686 -march=i686 (Andreas Jaeger suggested that, but since I built my GCC myself it is already the default).

Results:

    utente@engineer:~/esperimenti$ gcc -O3 -fno-builtin test.c memcmp.S
    utente@engineer:~/esperimenti$ ./a.out
    100 repetitions
    Testing custom memcmp ..... done, 8.130 seconds
    Testing libc memcmp ....... done, 10.250 seconds
    Testing builtin memcmp .... done, 9.620 seconds
    Testing loop overhead ..... done, 5.580 seconds

(CCed to the gcc people because of the dismaying __builtin_memcmp performance on the PII.)

Paolo
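For readers without the attachment at hand, here is a minimal sketch of how user time (rather than wall time) can be measured around a repeated memcmp call. It is not the code from test.c: the helper names user_seconds and time_memcmp are hypothetical, and the use of POSIX getrusage is an assumption (the original may well use times() or clock() instead).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/resource.h>
#include <sys/time.h>

/* User CPU time consumed so far by this process, in seconds.
   Assumption: a POSIX system providing getrusage(). */
static double user_seconds(void)
{
    struct rusage ru;
    getrusage(RUSAGE_SELF, &ru);
    return (double)ru.ru_utime.tv_sec + (double)ru.ru_utime.tv_usec / 1e6;
}

/* Signature shared by memcmp and any replacement such as my_memcmp. */
typedef int (*memcmp_fn)(const void *, const void *, size_t);

/* Hypothetical harness: call fn(a, b, n) `reps` times and return the
   user time spent. The volatile sink keeps the compiler from dropping
   calls whose result would otherwise be unused. */
static double time_memcmp(memcmp_fn fn, const void *a, const void *b,
                          size_t n, long reps)
{
    volatile int sink = 0;
    double start;
    long i;

    assert(reps > 0);
    start = user_seconds();
    for (i = 0; i < reps; i++)
        sink = fn(a, b, n);
    (void)sink;
    return user_seconds() - start;
}
```

Calling time_memcmp(memcmp, a, b, n, reps) and then the same with my_memcmp gives comparable figures; timing a do-nothing function with the same signature gives an estimate of the loop overhead, in the spirit of the last line of the results above.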
Attachment:
memcmp.S
Description: Binary data
Attachment:
test.c
Description: Text document