This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.
Re: Transformations to increase parallelism
- From: "Ayal Zaks" <ZAKS at il dot ibm dot com>
- To: Jan Hoogerbrugge <hoogerbrugge at hotmail dot com>
- Cc: "Dorit Naishlos" <DORIT at il dot ibm dot com>, gcc at gcc dot gnu dot org
- Date: Wed, 23 Jul 2003 18:27:21 +0300
- Subject: Re: Transformations to increase parallelism
In response to: http://gcc.gnu.org/ml/gcc/2003-07/msg01606.html
>On Tue, 22 Jul 2003, Dorit Naishlos wrote:
>
>> Other compiler stages may be able to generate better code w/o these
>> address forms (we encountered such a situation when trying to optimize
>> addressing during combine); in fact, it may even be beneficial if this
>> decision would take place as late as possible (possibly even after
>> sched2...?).
>
>Can you give an example of this? Is this a power4-specific problem?
>
>Toshi
Yes, and possibly yes again.
In general, instead of generating a series of pairwise dependent insns:
load_inc r2,4(r1)
...
load_inc r3,4(r1)
...
load_inc r4,4(r1)
we prefer to generate:
load r2,4(r1)
...
load r3,8(r1)
...
load_inc r4,12(r1)
because on power4 (1) load_inc uses more resources than a plain load,
and (2) removing the data dependencies between the insns lets
(out-of-order) execution of the loads start sooner.
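To illustrate, here is a toy Python model of the two sequences. It assumes
load_inc behaves like a load-with-update (load from disp(base), then write
that effective address back to the base register). The helper names, the
memory contents and the chain-depth count are made up for illustration; they
are not taken from GCC or from power4 documentation.

```python
# Toy model only: "load_inc" is ASSUMED to be a load-with-update, i.e.
# dst = mem[base + disp], then base = base + disp.

def load(regs, mem, dst, disp, base):
    """Plain load: dst = mem[base + disp]; the base register is untouched."""
    regs[dst] = mem[regs[base] + disp]

def load_inc(regs, mem, dst, disp, base):
    """Load with update: dst = mem[base + disp], then base = base + disp."""
    addr = regs[base] + disp
    regs[dst] = mem[addr]
    regs[base] = addr

def run(seq, mem, r1=0):
    """Execute a list of (op, dst, disp, base) tuples; return the registers."""
    regs = {"r1": r1}
    for op, dst, disp, base in seq:
        op(regs, mem, dst, disp, base)
    return regs

def base_chain_depth(seq):
    """Hypothetical metric: length of the dependence chain through the base
    register, assuming every insn in seq uses the same base register."""
    depth = 1
    for op, _dst, _disp, _base in seq[:-1]:
        if op is load_inc:
            depth += 1  # every later insn must wait for this base update
    return depth

# Made-up memory contents, keyed by byte address.
MEM = {4: 10, 8: 20, 12: 30}

# The pairwise dependent series: each insn needs the previous r1 update.
CHAINED = [
    (load_inc, "r2", 4, "r1"),
    (load_inc, "r3", 4, "r1"),
    (load_inc, "r4", 4, "r1"),
]

# The preferred series: all three insns read the original r1.
INDEPENDENT = [
    (load, "r2", 4, "r1"),
    (load, "r3", 8, "r1"),
    (load_inc, "r4", 12, "r1"),
]
```

Under these assumptions both sequences leave the same values in r1..r4, but
base_chain_depth(CHAINED) is 3 while base_chain_depth(INDEPENDENT) is 1, so
in the second form nothing has to wait for an earlier update of r1.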
I think we ran across such redundant pre-increment modes when compiling
gap/integer.c.
Ayal.