This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.

RE: target cost model tuning for x86

From: Dorit Nuzman <DORIT at il dot ibm dot com>
To: "Jagasia, Harsha" <harsha dot jagasia at amd dot com>
Cc: gcc-patches at gcc dot gnu dot org, "Jan Hubicka" <jh at suse dot cz>, "Sebastian Pop" <sebpop at gmail dot com>
Date: Mon, 10 Sep 2007 21:13:11 +0300
Subject: RE: target cost model tuning for x86

"Jagasia, Harsha" <harsha.jagasia@amd.com> wrote on 10/09/2007 05:07:32:

> Hi Dorit,
>
...
> If we go with it, I suppose this could be a new guard [if
> scalar_loop_iters <= th] and we could go either to the epilogue or
> prologue. Or we could emit the whole original scalar loop body as an
> alternative code path, but then there might be some code bloat. I can
> get started with a patch for this, thoughts?
>
> If we go with the early separate test, the penalties need only be
> considered on the vector side even for the run-time case and we will be
> able to keep the target.m.builtin.vectorization cost because there will
> be at least one new taken branch for the run time scalar loop. Also both
> the compile and run time cases can just use the threshold based on
> scalar loop iterations and the cost model equation can remain in terms
> of scalar loop iterations. And only the guard costs need to be split
> into taken, not taken and considered differently as you suggest for
> when:
> - neither alignment is known nor iterations
> - alignment is known but iterations are not
> - both alignment and iterations are known.
>

An early test is clearly better, because it spares the scalar version the
extra overhead. It doesn't necessarily have to be a separate test, though.
We could use one of the guards we already create (there are enough of
those...):
- if we do versioning, we can add our threshold comparison to the guard
that controls the versioning,
- otherwise, if we do peeling for alignment, we can determine the
loop-count of the prolog loop according to the threshold test,
- otherwise, we'll have to create new guard code.
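
To make the three options above concrete, here is a minimal sketch in C of
the selection order. All names (threshold_site, choose_threshold_site,
prolog_iters) are hypothetical and are not vectorizer identifiers. The
prolog_iters helper assumes one possible reading of the peeling case: when
the scalar iteration count is at or below the threshold, the prolog loop
simply runs every iteration, so the vector loop is never entered.

```c
#include <stdbool.h>

/* Hypothetical: where the threshold comparison ends up. */
enum threshold_site {
  SITE_VERSIONING_GUARD, /* folded into the guard that controls versioning */
  SITE_PROLOG_NITERS,    /* folded into the prolog loop-count computation */
  SITE_NEW_GUARD         /* no existing guard to reuse: emit a new one */
};

/* Pick the first existing guard that can absorb the threshold test,
   in the order given in the list above.  */
static enum threshold_site
choose_threshold_site (bool do_versioning, bool do_peeling)
{
  if (do_versioning)
    return SITE_VERSIONING_GUARD;
  if (do_peeling)
    return SITE_PROLOG_NITERS;
  return SITE_NEW_GUARD;
}

/* Assumed model of the peeling case: if the loop is too short to be
   worth vectorizing, let the prolog consume all scalar iterations;
   otherwise peel only what alignment requires (capped at the total).  */
static unsigned
prolog_iters (unsigned scalar_iters, unsigned align_iters, unsigned th)
{
  if (scalar_iters <= th)
    return scalar_iters;
  return align_iters < scalar_iters ? align_iters : scalar_iters;
}
```

For example, under these assumptions a loop with 3 scalar iterations and a
threshold of 4 would run all 3 iterations in the prolog, while a loop with
100 iterations would peel only the alignment iterations.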

> Let me know what you think. I can submit the patch with the changes you
> recommended before end of stage 2 tomorrow, except what we decide to do
> for the above discussion. Given Mark's response, the fix for the above
> discussion can be a part of stage 3 since it really is a bug fix of some
> sort, which could likely bring some improvement to the run time cases.
>

sounds good

thanks,
dorit

> Once we are done with the vectorizer changes, the backend target costs
> may also need to change a bit. But these would be fairly non intrusive
> and we can put them in before stage 2 closes and tweak them as needed in
> stage 3.
>
> Thanks,
> Harsha
>
>

Follow-Ups:
- RE: target cost model tuning for x86
  - From: Jagasia, Harsha

References:
- RE: target cost model tuning for x86
  - From: Jagasia, Harsha