This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



Re: gcc 3.1 is still very slow, compared to 2.95.3


>    From: Jan Hubicka <jh@suse.cz>
>    Date: Sun, 19 May 2002 13:52:06 +0200
> 
>    > Yes, it does optimize this, but into 3 byte stores.  One of
>    > which overlaps with the PUT_CODE (rt, code) rtx_alloc does.
>    > :-(
> 
>    That is probably because GCC is unable to detect the alignment for some
>    reason.  I don't see why :(
> 
> I promise to look more deeply into this.  It may be a Sparc specific
> problem because x86 outputs:
> 
> 	movl	$0, (%eax)
> 	movw	code, (%eax)
> 
> But I think GCC should really give us:
> 
> 	movw	$0, 2(%eax)
> 	movw	code, (%eax)
> 
> Well, better yet:
> 
> 	movw	code, (%eax)
> 	movw	$0, 2(%eax)

I don't think GCC has any framework for handling partially dead stores.
I also believe that at least the Athlon will combine stores in any order,
and that the majority of recent chips (P3/P4) do as well.
On the other hand, I think it may be better to construct the value
in a register and store it once.
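
To make the three alternatives concrete, here is a minimal C sketch.  The
struct hdr layout is hypothetical (a 16-bit code followed by 16 bits of
flags); it is not the real rtx layout, and the compiler is free to emit a
different store sequence than the one each function suggests.  Roughly,
init_overlapping corresponds to the movl/movw pair above, init_split to the
two movw stores, and init_single to building the value first and storing it
once.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 4-byte header, NOT the real rtx layout:
   a 16-bit code followed by 16 bits of flags. */
struct hdr {
    uint16_t code;
    uint16_t flags;
};

/* Variant 1: clear the whole word, then overwrite the code half.
   The second store partially overlaps the first. */
static void init_overlapping(struct hdr *h, uint16_t code)
{
    memset(h, 0, sizeof *h);
    h->code = code;
}

/* Variant 2: two halfword stores that do not overlap. */
static void init_split(struct hdr *h, uint16_t code)
{
    h->code = code;
    h->flags = 0;
}

/* Variant 3: build the complete value in a local, then copy it
   out in one piece. */
static void init_single(struct hdr *h, uint16_t code)
{
    struct hdr tmp = { code, 0 };
    memcpy(h, &tmp, sizeof tmp);
}

/* All three variants must leave the same bytes in memory. */
void check_variants_agree(uint16_t code)
{
    struct hdr a, b, c;

    /* Start from garbage so a missing store would be noticed. */
    memset(&a, 0xff, sizeof a);
    memset(&b, 0xff, sizeof b);
    memset(&c, 0xff, sizeof c);

    init_overlapping(&a, code);
    init_split(&b, code);
    init_single(&c, code);

    assert(a.code == code && a.flags == 0);
    assert(memcmp(&a, &b, sizeof a) == 0);
    assert(memcmp(&a, &c, sizeof a) == 0);
}
```

The variants differ only in how many stores hit memory and whether they
overlap, which is exactly the part under discussion.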
> 
> so it actually combines in the store buffer of the processor.  (This
> is one area GCC really needs to improve, ordering consecutive stores
> to the same area)
> 
> The Sparc output is very perplexing because GCC eliminated one of
> the byte stores that overlapped the store of "code" but not both
> of them!
:) I think this comes from the dead store elimination in flow.c - it checks
the addresses for equivalence, so presumably a store that merely overlaps a
later one is not recognized as dead.
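
As a rough illustration of that guess, here is a toy model in C.  It is not
the flow.c code; struct store and both predicates are hypothetical names
made up for this sketch.  It assumes four byte stores clearing a word,
followed by a 2-byte store of "code" at offset 0, and compares a test that
only matches equal start addresses with one that checks byte coverage.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a store: byte offset from a common base
   address, and width in bytes.  Not a GCC data structure. */
struct store {
    size_t offset;
    size_t size;
};

/* Address-equivalence test: the earlier store counts as dead only if
   the later store starts at the same address and is at least as wide. */
static bool dead_by_same_address(struct store earlier, struct store later)
{
    return earlier.offset == later.offset && earlier.size <= later.size;
}

/* Coverage test: the earlier store is dead whenever the later store
   overwrites every byte it wrote. */
static bool dead_by_coverage(struct store earlier, struct store later)
{
    return later.offset <= earlier.offset
        && earlier.offset + earlier.size <= later.offset + later.size;
}

/* Four byte stores clear a word; a 2-byte store of "code" follows. */
void check_partial_elimination(void)
{
    struct store bytes[4] = { {0, 1}, {1, 1}, {2, 1}, {3, 1} };
    struct store code = { 0, 2 };
    size_t same = 0, covered = 0;

    for (size_t i = 0; i < 4; i++) {
        same += dead_by_same_address(bytes[i], code);
        covered += dead_by_coverage(bytes[i], code);
    }

    assert(same == 1);     /* only the byte store at offset 0 is removed */
    assert(covered == 2);  /* offsets 0 and 1 are both overwritten */
}
```

Under this model the equal-address test removes one of the two overwritten
byte stores and leaves three behind, which would match the Sparc output
described above, assuming the word was originally cleared bytewise.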

Honza

