This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



Re: gcc 3.1 is still very slow, compared to 2.95.3


>    From: Neil Booth <neil@daikokuya.demon.co.uk>
>    Date: Sun, 19 May 2002 08:07:03 +0100
>    
>    Results John posted had memset very high on the list, so I suspect
>    someone is.  I think every tree and rtx allocated is memset to
>    zero.
> 
> What overkill, it's clearing out one word. 
> 
> When optimizing, GCC should turn that into an inline
> store into the first word of the rtx though....

I guess rewriting memset into something like
switch (size)
{
case 4: memset (dest, 0, 4); break;
case 8: memset (dest, 0, 8); break;
...
default: memset (dest, 0, size); break;
}

will lead GCC to inline the common calls to memset reasonably, at least for
i386.  Perhaps we can teach GCC to do such a trick automatically (expand memset
inline for small sizes and make a library call for large ones), but I am not
convinced this is a win in general....
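
A minimal sketch of the dispatch idea above, written as a standalone helper.
The names clear_small and all_zero are hypothetical and not part of GCC, and
the set of special-cased sizes (4, 8, 16) is an arbitrary choice for
illustration.  The point is only that each case passes a compile-time
constant size to memset, which gives the compiler the chance to expand it
inline; whether it actually does depends on target and optimization level.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of GCC: zero `size` bytes at `dest`.
   Each case calls memset with a constant size so the compiler may expand
   it inline; any other size falls through to the library call.  */
static inline void
clear_small (void *dest, size_t size)
{
  assert (dest != NULL);
  switch (size)
    {
    case 4:  memset (dest, 0, 4);  break;
    case 8:  memset (dest, 0, 8);  break;
    case 16: memset (dest, 0, 16); break;   /* arbitrary extra case */
    default: memset (dest, 0, size); break; /* variable size */
    }
}

/* Hypothetical checker: return 1 if the first n bytes at p are all zero.  */
static int
all_zero (const unsigned char *p, size_t n)
{
  for (size_t i = 0; i < n; i++)
    if (p[i] != 0)
      return 0;
  return 1;
}
```

Note the argument order: memset takes the fill value second and the byte
count third.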

Putting some __builtin_prefetch calls there should also be a win.
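
One possible reading of the prefetch suggestion, as a rough sketch only: a
toy bump allocator that zeroes each block and hints that the memory for the
next allocation will soon be written.  The pool, its size, and the name
bump_alloc_zeroed are hypothetical and are not GCC's allocator.
__builtin_prefetch (addr, rw, locality) is the GCC builtin; rw = 1 means
prepare for a write and locality = 3 asks to keep the data in cache.  Whether
this helps at all has to be measured on the target.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical fixed pool, for illustration only.  */
static _Alignas (16) unsigned char pool[4096];
static size_t pool_used;

/* Hypothetical allocator: return `size` zeroed bytes from the pool, or
   NULL when the pool is exhausted.  Sizes are rounded up to 8 bytes.  */
static void *
bump_alloc_zeroed (size_t size)
{
  size = (size + 7) & ~(size_t) 7;
  if (size > sizeof pool - pool_used)
    return NULL;

  void *p = pool + pool_used;
  pool_used += size;

  /* Hint that the next block will be written soon.  Prefetching never
     faults, so pointing one past the end of the pool is harmless.  */
  __builtin_prefetch (pool + pool_used, 1, 3);

  memset (p, 0, size);
  return p;
}
```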
> 
> (Dave checks...)
> 
> Yes, it does optimize this, but into 3 byte stores.  One of
> which overlaps with the PUT_CODE (rt, code) rtx_alloc does.
> :-(
That is probably because GCC is unable to detect the alignment for some
reason.  I don't see why :(

Honza
> 
> On Sparc this is:
> 
> 	stb	%g0, [%rt + 1]
> 	sth	code, [%rt]
> 	stb	%g0, [%rt + 2]
> 	stb	%g0, [%rt + 3]
> 
> When it should be optimized into:
> 
> 	sth	code, [%rt]
> 	sth	%g0, [%rt + 2]
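
The two store sequences above can be mirrored in C as a rough sketch.  The
struct below is a hypothetical 4-byte header (a 16-bit code followed by 16
bits of mode and flags) chosen to match the offsets in the assembly; it is
not the real rtx definition, and both function names are made up.  The first
function is the clear-then-set pattern (memset followed by a PUT_CODE-style
store); the second writes each halfword exactly once, which is the shape of
the desired output.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical header layout, for illustration only.  */
struct node_header
{
  uint16_t code;            /* bytes 0-1 */
  uint16_t mode_and_flags;  /* bytes 2-3 */
};

_Static_assert (sizeof (struct node_header) == 4, "header must be one word");

/* Clear the whole word, then store the code on top of the cleared bytes.  */
static void
init_clear_then_set (struct node_header *h, uint16_t code)
{
  assert (h != NULL);
  memset (h, 0, sizeof *h);
  h->code = code;
}

/* Two non-overlapping halfword stores with the same final contents.  */
static void
init_two_stores (struct node_header *h, uint16_t code)
{
  assert (h != NULL);
  h->code = code;
  h->mode_and_flags = 0;
}
```

Both functions leave the header in the same state, so a compiler that knows
the object is at least 2-byte aligned is free to emit the second form for the
first.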

