This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Re: PATCH: Add SSE4.1 support

From: Jan Hubicka <jh at suse dot cz>
To: Jan Hubicka <jh at suse dot cz>
Cc: Richard Henderson <rth at redhat dot com>, "H. J. Lu" <hjl at lucon dot org>, Jan Hubicka <hubicka at ucw dot cz>, gcc-patches at gcc dot gnu dot org
Date: Fri, 20 Apr 2007 23:15:50 +0200
Subject: Re: PATCH: Add SSE4.1 support
References: <20070418160052.GA10054@lucon.org> <20070418221809.GB12902@atrey.karlin.mff.cuni.cz> <20070419055639.GA19742@lucon.org> <20070420000542.GA25665@redhat.com> <20070420190059.GA18893@lucon.org> <20070420195642.GA17189@redhat.com> <20070420201809.GA19220@lucon.org> <20070420203050.GD17189@redhat.com> <20070420210131.GA23507@kam.mff.cuni.cz>

> > On Fri, Apr 20, 2007 at 01:18:09PM -0700, H. J. Lu wrote:
> > > But "mov xmm, gr" is always a win, 
> > 
> > Um, this assertion is FALSE for AMD.
> 
> Indeed.  AMD has different lengths of reg and XMM queues (the xmm one is
> longer).  Instructions affecting both units need to synchronize the
> two, which is a bit expensive.
> 
> It seems to me that for !INTER_UNIT_MOVES targets, the intrinsics
> representing xmm->gr or gr->xmm moves of some form should be
> automatically optimized into xmm->mem->gr or gr->mem->xmm form, so we
> won't run into this problem at all.

Just for the record, the other sane behaviour I can think of (and what I
think H. J. is shooting for) is to make GCC closely follow what the user
wrote, on the expectation that the user knows why he writes an XMM->gr
move (for example, by verifying that the code is not running on an AMD
chip or that the particular code path is cold).  That would need some
wrapping in unspecs to keep the generic simplifiers from optimizing the
intrinsics.

It would make sense because XMM code is most likely CPU specific and the
intrinsics look like a C encoding of assembly language.  Our blended
model defaults to !INTER_UNIT_MOVES, so to use the xmm->gr or gr->xmm
builtins effectively the user would need to separate the code into a
unit compiled with the appropriate -march flag, which is a bit difficult.

However, this would require quite a large reorganization of the SSE
builtin patterns, and I think it is better to give the optimizer freedom
to optimize the user's SSE code, as we do now.

We probably should not stay somewhere in between those two cases.
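
To make the two cases concrete, here is a minimal sketch of the two ways
the low 32-bit element of an XMM register can reach a general register.
The function names are hypothetical and only for illustration.  It uses
plain SSE2 intrinsics rather than the SSE4.1 ones from the patch, and it
falls back to scalar code when SSE2 is not available.  Note that with
the current "optimizer is free" behaviour, the compiler may turn either
function into the other form depending on the tuning flags, so the
source only shows the intent, not the instructions actually emitted.

```c
#include <stdint.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

#if defined(__SSE2__)
/* Direct form: the user asks for an xmm->gr move (typically movd). */
static int32_t low_lane_direct(__m128i v)
{
    return _mm_cvtsi128_si32(v);
}

/* Memory form: xmm->mem->gr, i.e. store the vector, reload the element. */
static int32_t low_lane_via_memory(__m128i v)
{
    int32_t buf[4];
    _mm_storeu_si128((__m128i *)buf, v);
    return buf[0];
}
#endif

/* Hypothetical driver: put x in the low element and read it back. */
int32_t low_lane_demo(int32_t x, int use_memory)
{
#if defined(__SSE2__)
    __m128i v = _mm_set_epi32(0, 0, 0, x);
    return use_memory ? low_lane_via_memory(v) : low_lane_direct(v);
#else
    /* Assumption: without SSE2 there is nothing to illustrate. */
    (void)use_memory;
    return x;
#endif
}
```

Both paths return the same value; the only difference is which
instruction sequence the user is asking for, which is exactly what the
two behaviours above disagree about preserving.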

Honza
> 
> Honza
> > 
> > 
> > r~

