This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.

Re: SPEC 456.hmmer vectorization question

From: Richard Biener <richard dot guenther at gmail dot com>
To: Jakub Jelinek <jakub at redhat dot com>
Cc: Steve Ellcey <sellcey at caviumnetworks dot com>, Michael Matz <matz at suse dot de>, GCC Development <gcc at gcc dot gnu dot org>, Jeff Law <law at redhat dot com>
Date: Thu, 9 Mar 2017 09:19:07 +0100
Subject: Re: SPEC 456.hmmer vectorization question
Authentication-results: sourceware.org; auth=none
References: <201703062237.v26MbW5e008866@sellcey-dt.caveonetworks.com> <alpine.LSU.2.20.1703071423440.13579@wotan.suse.de> <1489002090.22552.19.camel@caviumnetworks.com> <CAFiYyc3H+DZPdKGq6hw+1cbhv=dSEfbaq0oD55hW8hfasUmwDA@mail.gmail.com> <20170309081243.GE22703@tucnak>

On Thu, Mar 9, 2017 at 9:12 AM, Jakub Jelinek <jakub@redhat.com> wrote:
> On Thu, Mar 09, 2017 at 09:02:38AM +0100, Richard Biener wrote:
>> It would need to be done before graphite, and yes, the question is when
>> to do this (given the non-trivial text size and runtime cost).  One option is
>> to do something similar to what we do with IFN_LOOP_VECTORIZED, that is, after
>> followup transforms decide whether the specialized version received any
>> important optimization.  Another option is to add value profile counters
>> for aliasing and only do this with FDO when we know at runtime there
>> is no aliasing.
>
> It doesn't have to be either/or.  If we have FDO, we can do it
> unconditionally if we have gathered info that there is likely no aliasing,
> and optimize the other loop (for the case of aliasing) for size.
> If we don't have FDO, we could do the IFN_LOOP_VERSIONED way.
> For IFN_LOOP_VERSIONED, if we check all aliasing cases we could then either
> use the OpenMP/Cilk/ivdep pragma loop properties (loop->safelen etc.),
> or even have something stronger (that would say that there aren't
> any inter-iteration memory dependencies).

We can use MR_DEPENDENCE_* to partition the dependences properly
as well.

For loop distribution we can also check profitability before adding any
dependence related edges and version according to them.  Of course
that needs a meaningful cost model...
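
To make the versioning idea concrete, here is a source-level sketch in C of
what a versioned, distributed loop could look like.  It is only an
illustration: the compiler would do this on its internal representation, and
the names no_overlap, fused and versioned are made up for this example.  The
address comparison through uintptr_t is an assumption that holds on common
flat-address targets.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: nonzero if the n-element int ranges at p and q
   do not overlap.  Assumes a flat address space and no wraparound.  */
static int no_overlap(const int *p, const int *q, size_t n)
{
    uintptr_t a = (uintptr_t)p, b = (uintptr_t)q;
    uintptr_t bytes = n * sizeof(int);
    return a + bytes <= b || b + bytes <= a;
}

/* Original loop: two statements whose memory references may alias.  */
static void fused(int *a, const int *c, int *b, const int *d, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        a[i] = c[i] + 1;
        b[i] = d[i] * 2;
    }
}

/* Versioned loop: distribute only when the runtime checks prove that
   the stores of one statement cannot touch the data of the other.  */
static void versioned(int *a, const int *c, int *b, const int *d, size_t n)
{
    if (no_overlap(a, d, n) && no_overlap(b, c, n) && no_overlap(a, b, n)) {
        /* Specialized version: each loop is now a simple candidate
           for further transforms.  */
        for (size_t i = 0; i < n; i++)
            a[i] = c[i] + 1;
        for (size_t i = 0; i < n; i++)
            b[i] = d[i] * 2;
    } else {
        /* Fallback: keep the original statement order.  */
        fused(a, c, b, d, n);
    }
}
```

The three checks and the duplicated loop body are the text size and runtime
cost mentioned earlier in the thread, which is why a cost model is needed
before emitting them.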

Similarly you can run the ISL optimizer as if there were no dependences
and compare the resulting code to the original one with a cost model.

This is what the vectorizer does before doing the versioning.  For enablement
transforms cost modeling is of course hard unless you can chain analysis
parts of multiple passes (basically integrate loop passes into "one").
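
As a rough sketch of the kind of comparison meant here (not GCC code; the
struct, its fields and worth_versioning are all hypothetical names for this
example), a pass could estimate the cost of the loop as-is, the cost of the
loop optimized under a no-dependence assumption, and only version when the
estimated saving pays for the runtime checks:

```c
#include <stdbool.h>

/* Hypothetical cost summary; none of these fields are GCC data
   structures, and the units are arbitrary.  */
struct loop_cost {
    double per_iter_original;   /* estimated cost per iteration as-is */
    double per_iter_optimized;  /* estimated cost assuming no aliasing */
    double runtime_check;       /* one-time cost of the alias checks */
    double expected_iters;      /* estimated trip count */
};

/* Version only if the estimated saving over the whole loop exceeds
   the one-time cost of the runtime alias checks.  */
static bool worth_versioning(const struct loop_cost *c)
{
    double saved = (c->per_iter_original - c->per_iter_optimized)
                   * c->expected_iters;
    return saved > c->runtime_check;
}
```

A real model would also have to account for text size, which this toy
version leaves out.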

Of course this breaks down in complexity once you consider disambiguating
only a few of the unknown dependences rather than all of them (in case the
transform can still handle some of those cases; the vectorizer, for
example, cannot deal with any unknown dependences).

Richard.

>
>         Jakub

References:
- SPEC 456.hmmer vectorization question
  - From: Steve Ellcey
- Re: SPEC 456.hmmer vectorization question
  - From: Michael Matz
- Re: SPEC 456.hmmer vectorization question
  - From: Steve Ellcey
- Re: SPEC 456.hmmer vectorization question
  - From: Richard Biener
- Re: SPEC 456.hmmer vectorization question
  - From: Jakub Jelinek
