This is the mail archive of the
gcc@gcc.gnu.org
mailing list for the GCC project.
Re: GCC performance regression - its memset!
- From: Jan Hubicka <jh at suse dot cz>
- To: Michel LESPINASSE <walken at zoy dot org>
- Cc: Roger Sayle <roger at eyesopen dot com>, gcc at gcc dot gnu dot org, Richard Henderson <rth at redhat dot com>, Jan Hubicka <jh at suse dot cz>, gcc-patches at gcc dot gnu dot org
- Date: Tue, 23 Apr 2002 11:51:45 +0200
- Subject: Re: GCC performance regression - its memset!
- References: <Pine.LNX.4.33.0204222307450.2893-100000@www.eyesopen.com> <20020423060709.GA21922@zoy.org>
> rep
> stosl
> testl $2, %edi <- ooops, its really meant to test the remainder
> not the address !!! so test will always fail.
> je .L9
> movw $0, (%edi)
> addl $2, %edi
> .L9:
> testl $1, %edi <- that one too.
> je .L10
> movb $0, (%edi)
> .L10:
Hmm, a pasto.
In the memcpy case I got it right, while in memset I broke it. I am attaching the patch
I am currently testing. OK for mainline/branch, assuming it passes?
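For readers following along, the mix-up can be modelled in plain C. This is only a rough sketch and not the code GCC emits: clear_model and count_cleared are hypothetical names, the 4-byte loop stands in for "rep stosl", and the 16-byte test buffer is assumed to be 4-byte aligned (the case where the address test skips the tail). With use_count_bits set to 0 the tail tests look at the destination address, as in the quoted assembly; with 1 they look at the byte count, as the patch intends.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of the inline clear sequence: bulk 4-byte stores,
   then an optional 2-byte store and an optional 1-byte store.  */
static void clear_model(unsigned char *dst, size_t count, int use_count_bits)
{
    unsigned char *p = dst;
    size_t i;

    /* Stands in for "rep stosl": count / 4 four-byte stores.  */
    for (i = 0; i < count / 4; i++) {
        memset(p, 0, 4);
        p += 4;
    }

    /* testl $2, ... ; movw $0, (%edi) ; addl $2, %edi  */
    if ((use_count_bits ? (uintptr_t)count : (uintptr_t)p) & 2) {
        memset(p, 0, 2);
        p += 2;
    }

    /* testl $1, ... ; movb $0, (%edi)  */
    if ((use_count_bits ? (uintptr_t)count : (uintptr_t)p) & 1)
        memset(p, 0, 1);
}

/* Fill an aligned 16-byte buffer with 0xFF, clear `count` bytes with the
   chosen variant, and report how many bytes actually ended up zero.  */
static size_t count_cleared(size_t count, int use_count_bits)
{
    _Alignas(4) unsigned char buf[16];   /* assumption: 4-byte aligned */
    size_t i, zeros = 0;

    memset(buf, 0xFF, sizeof buf);
    clear_model(buf, count, use_count_bits);
    for (i = 0; i < sizeof buf; i++)
        if (buf[i] == 0)
            zeros++;
    return zeros;
}
```

Under these assumptions, count_cleared(7, 0) reports 4 bytes cleared, because the address after the bulk stores is 4-byte aligned and both tail tests fail, while count_cleared(7, 1) reports all 7.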
Concerning the inlining, gcc inlines all memcpys with size smaller than 64 bytes.
Perhaps this should be extended to 128 bytes in case we are still about 2 times as bad.
This is partly due to the lame implementation of memset in glibc too :(
Tue Apr 23 11:48:53 CEST 2002 Jan Hubicka <jh@suse.cz>
* i386.c (ix86_expand_clrstr): Fix pasto.
Index: i386.c
===================================================================
RCS file: /cvs/gcc/egcs/gcc/config/i386/i386.c,v
retrieving revision 1.353.2.14
diff -c -3 -p -r1.353.2.14 i386.c
*** i386.c 16 Apr 2002 18:16:36 -0000 1.353.2.14
--- i386.c 23 Apr 2002 09:47:50 -0000
*************** ix86_expand_clrstr (src, count_exp, alig
*** 9451,9457 ****
gen_rtx_SUBREG (SImode, zeroreg, 0)));
if (TARGET_64BIT && (align <= 4 || count == 0))
{
! rtx label = ix86_expand_aligntest (destreg, 2);
emit_insn (gen_strsetsi (destreg,
gen_rtx_SUBREG (SImode, zeroreg, 0)));
emit_label (label);
--- 9451,9457 ----
gen_rtx_SUBREG (SImode, zeroreg, 0)));
if (TARGET_64BIT && (align <= 4 || count == 0))
{
! rtx label = ix86_expand_aligntest (countreg, 2);
emit_insn (gen_strsetsi (destreg,
gen_rtx_SUBREG (SImode, zeroreg, 0)));
emit_label (label);
*************** ix86_expand_clrstr (src, count_exp, alig
*** 9462,9468 ****
gen_rtx_SUBREG (HImode, zeroreg, 0)));
if (align <= 2 || count == 0)
{
! rtx label = ix86_expand_aligntest (destreg, 2);
emit_insn (gen_strsethi (destreg,
gen_rtx_SUBREG (HImode, zeroreg, 0)));
emit_label (label);
--- 9462,9468 ----
gen_rtx_SUBREG (HImode, zeroreg, 0)));
if (align <= 2 || count == 0)
{
! rtx label = ix86_expand_aligntest (countreg, 2);
emit_insn (gen_strsethi (destreg,
gen_rtx_SUBREG (HImode, zeroreg, 0)));
emit_label (label);
*************** ix86_expand_clrstr (src, count_exp, alig
*** 9473,9479 ****
gen_rtx_SUBREG (QImode, zeroreg, 0)));
if (align <= 1 || count == 0)
{
! rtx label = ix86_expand_aligntest (destreg, 1);
emit_insn (gen_strsetqi (destreg,
gen_rtx_SUBREG (QImode, zeroreg, 0)));
emit_label (label);
--- 9473,9479 ----
gen_rtx_SUBREG (QImode, zeroreg, 0)));
if (align <= 1 || count == 0)
{
! rtx label = ix86_expand_aligntest (countreg, 1);
emit_insn (gen_strsetqi (destreg,
gen_rtx_SUBREG (QImode, zeroreg, 0)));
emit_label (label);