This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: Congerie of performance improvements, take two
- From: Jakub Jelinek <jakub at redhat dot com>
- To: Zack Weinberg <zack at codesourcery dot com>
- Cc: gcc-patches at gcc dot gnu dot org
- Date: Tue, 18 Feb 2003 17:07:49 -0500
- Subject: Re: Congerie of performance improvements, take two
- References: <87znp0c2om.fsf@egil.codesourcery.com>
- Reply-to: Jakub Jelinek <jakub at redhat dot com>
On Wed, Feb 12, 2003 at 06:52:25PM -0800, Zack Weinberg wrote:
>
> This is a revised version of the patch I posted last night. The
> changes are: (1) I didn't mess up the GTY markers, so it actually
> works; (2) I reordered the fields in the alist structure in i386.c to
> enable tail recursion; (3) I redid bfs_walk from scratch; it's now
> reentrant again, and even faster to boot.
>
> The changes to bfs_walk deserve a bit of explanation. I eliminated
> the varray entirely, in favor of an open-coded circular queue. (The
> older code just kept enlarging the varray forever, instead of reusing
> slots that were no longer holding live data.) The initial storage for
> this queue is on the stack. If it needs to grow past ten slots it
> gets moved to the malloc arena. This never happens in my sp.ii test
> case, but does happen several times in the g++ test suite (so I'm
> confident there aren't bugs in the implementation). Then, for
> additional savings, I put BINFO_BASETYPES pointers on the queue
> instead of unpacking their contents onto the queue. This reduces the
> chance that the queue will need to grow.
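
For readers following along, the queue described above can be sketched roughly as below. This is not the bfs_walk code from the patch. It is a minimal illustration of a circular queue of pointers that starts in a fixed ten-slot buffer and moves to malloc'd storage only when it overflows. The names ptr_queue, pq_init, pq_push, pq_pop and pq_free are hypothetical, and the doubling growth policy is an assumption.

```c
#include <stdlib.h>

#define INLINE_SLOTS 10   /* initial capacity, matching the "ten slots" above */

/* Hypothetical circular queue of pointers.  When the struct is a local
   variable, inline_buf lives on the stack.  The struct points into
   itself, so it must not be copied after pq_init.  */
struct ptr_queue
{
  void **slots;                    /* current storage */
  size_t cap;                      /* number of slots in storage */
  size_t head;                     /* index of the oldest element */
  size_t count;                    /* number of live elements */
  void *inline_buf[INLINE_SLOTS];  /* initial storage */
};

static void
pq_init (struct ptr_queue *q)
{
  q->slots = q->inline_buf;
  q->cap = INLINE_SLOTS;
  q->head = 0;
  q->count = 0;
}

/* Append P.  Returns 0 on success, -1 if memory runs out.  */
static int
pq_push (struct ptr_queue *q, void *p)
{
  if (q->count == q->cap)
    {
      /* Full: move to a heap block twice as large (assumed policy),
         unwrapping the live elements so the oldest lands at index 0.  */
      size_t ncap = q->cap * 2;
      void **n = malloc (ncap * sizeof *n);
      if (n == NULL)
        return -1;
      for (size_t i = 0; i < q->count; i++)
        n[i] = q->slots[(q->head + i) % q->cap];
      if (q->slots != q->inline_buf)
        free (q->slots);
      q->slots = n;
      q->cap = ncap;
      q->head = 0;
    }
  /* Slots freed by pq_pop are reused here because the index wraps.  */
  q->slots[(q->head + q->count) % q->cap] = p;
  q->count++;
  return 0;
}

/* Remove and return the oldest element, or NULL if the queue is empty.  */
static void *
pq_pop (struct ptr_queue *q)
{
  if (q->count == 0)
    return NULL;
  void *p = q->slots[q->head];
  q->head = (q->head + 1) % q->cap;
  q->count--;
  return p;
}

/* Release heap storage, if the queue ever grew past the inline buffer.  */
static void
pq_free (struct ptr_queue *q)
{
  if (q->slots != q->inline_buf)
    free (q->slots);
}
```

A breadth-first walk would push the root, then loop: pop one element, visit it, and push its children. A walk that never has more than ten elements pending stays entirely in the inline buffer.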
>
> The upshot for sp.ii is a 2.67% performance improvement relative to
> current CVS, and a 0.31% improvement relative to the previous patch
> (these are "estimated cycles" as calculated by cachegrind).
>
> zw
>
> * emit-rtl.c (init_emit): Use ggc_alloc for regno_reg_rtx.
> * function.h (struct emit_status): Length of regno_pointer_align
> and x_regno_reg_rtx as seen by gengtype is only x_reg_rtx_no,
> not regno_pointer_align_length (i.e. length actually used, not
> length as allocated)
Shouldn't init_emit's

  f->emit->regno_pointer_align
    = (unsigned char *) ggc_alloc_cleared (f->emit->regno_pointer_align_length
                                           * sizeof (unsigned char));

then be replaced with:

  f->emit->regno_pointer_align
    = (unsigned char *) ggc_alloc (f->emit->regno_pointer_align_length);
  memset (f->emit->regno_pointer_align, 0, LAST_VIRTUAL_REGISTER + 1);

and both memsets removed from gen_reg_rtx, where it reallocates the arrays?
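
To spell out the pattern in question, here is a minimal stand-alone sketch. It uses malloc and realloc as stand-ins for the GGC allocators, and the names align_table, at_init and at_append are hypothetical. It allocates the full length uncleared, zeroes only the prefix that is in use, and does no clearing on growth. The sketch is only safe because every slot is written before it is read. Whether the same holds for each array touched in gen_reg_rtx is an assumption that would need checking.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical byte table that tracks allocated and used lengths
   separately, as the ChangeLog entry above describes.  */
struct align_table
{
  unsigned char *data;
  size_t allocated;   /* bytes allocated */
  size_t used;        /* bytes actually in use */
};

/* Allocate ALLOCATED bytes but clear only the first INITIAL_USED.
   Returns 0 on success, -1 on bad arguments or allocation failure.  */
static int
at_init (struct align_table *t, size_t allocated, size_t initial_used)
{
  if (allocated == 0 || initial_used > allocated)
    return -1;
  t->data = malloc (allocated);          /* uncleared allocation */
  if (t->data == NULL)
    return -1;
  memset (t->data, 0, initial_used);     /* clear only the used prefix */
  t->allocated = allocated;
  t->used = initial_used;
  return 0;
}

/* Store VALUE in the next free slot, doubling the table when it is full.
   The new tail is left uncleared; each slot is written here before
   anything reads it.  */
static int
at_append (struct align_table *t, unsigned char value)
{
  if (t->used == t->allocated)
    {
      size_t n = t->allocated * 2;
      unsigned char *p = realloc (t->data, n);
      if (p == NULL)
        return -1;
      t->data = p;
      t->allocated = n;
    }
  t->data[t->used++] = value;
  return 0;
}
```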
Jakub