This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]
Other format:	[Raw text]

Re: [i386] scalar ops that preserve the high part of a vector

From: Marc Glisse <marc dot glisse at inria dot fr>
To: Uros Bizjak <ubizjak at gmail dot com>
Cc: gcc-patches at gcc dot gnu dot org, "H.J. Lu" <hjl dot tools at gmail dot com>
Date: Tue, 4 Dec 2012 15:05:14 +0100 (CET)
Subject: Re: [i386] scalar ops that preserve the high part of a vector
References: <alpine.DEB.2.02.1210131032460.9651@stedding.saclay.inria.fr> <CAFULd4YHdLF1ZyxrMG8MhRjo40f-EfAJZnDOEBc80pOGa4WNGQ@mail.gmail.com> <alpine.DEB.2.02.1210141057010.3752@laptop-mg.saclay.inria.fr> <alpine.DEB.2.02.1211301317160.3783@laptop-mg.saclay.inria.fr> <CAFULd4bVFbr1KC6FrjWOhEfmqEr_v2VuQMFjg4+TpTB681=HhA@mail.gmail.com> <alpine.DEB.2.02.1211302244290.3783@laptop-mg.saclay.inria.fr> <alpine.DEB.2.02.1212011800400.19206@stedding.saclay.inria.fr> <CAFULd4Z=8xeVScUM7JSqkHHXJrmO7qVa40q-h-mUE+Vehb86vw@mail.gmail.com> <alpine.DEB.2.02.1212021215330.3747@laptop-mg.saclay.inria.fr> <CAFULd4Z4Ywk+BQupB+=tDKiWx5QrCt9VL83XF4Xtgu03=y0VZw@mail.gmail.com> <alpine.DEB.2.02.1212030958050.3737@laptop-mg.saclay.inria.fr> <CAFULd4a5my_Hre6k5kTUm2UqpBX06kKKe9jjMiFisBJSUJPxyw@mail.gmail.com>

On Mon, 3 Dec 2012, Uros Bizjak wrote:

On Mon, Dec 3, 2012 at 4:34 PM, Marc Glisse <marc.glisse@inria.fr> wrote:

However, looking a bit more into the usage cases for these patterns,
they are only used through intrinsics with _m128 operands. While your
proposed patch makes these patterns more general (they can use 64bit
aligned memory), this is not their usual usage, and for their intended
usage, your proposed improvement complicates these patterns
unnecessarily. Following on these facts, I'd say that we leave these
special patters (since they serve their purpose well) and rather
introduce new patterns for "other" uses.

You mean like in the original patch?
http://gcc.gnu.org/ml/gcc-patches/2012-10/msg01279.html

(it only had the V2DF version, not the V4SF one)

Funny how we switched sides, now I am the one who would rather have a single
pattern instead of having one for the builtin and one for recog. It seems
that once we add the new pattern, keeping the old one is a waste of
maintenance time, and the few extra rtx from the slightly longer pattern for
these seldomly used builtins should be negligible.


Yes,  I didn't notice at the time that the intention of existing
patterns was to implement intrinsics that exclusively use _m128
operands.

But I don't mind, if that's the version you prefer, I'll update the patch.


Actually, both approaches have their benefits and drawbacks.
Specialized vec_merge patterns can be efficiently macroized, and
support builtins with _m128 operands in a simple and efficient way.
You are proposing patterns that do not macroize well (this is what was
learned from your last patch) and require breakup of existing
macroized patterns.

So, we are actually adding new functionality - operations on an array
of values. IMO, this warrants new patterns, but please find a way for
V2DF and V4SF to macroize in the same way.


I am still confused as to what is wanted. If the quantity to minimize is
the number of entries in sse.md, we should replace the existing
vec_merge pattern with this one: it macroizes just as well, it directly
matches for V4SF, and the piece of code needed in simplify-rtx for V2DF
isn't too absurd. (then we need to adjust the builtins as in one of the
previous patches)

[(set (match_operand:VF_128 0 "register_operand" "=x,x")
      (vec_merge:VF_128
	(vec_duplicate:VF_128
	  (plusminus:<ssescalarmode>
	    (vec_select:<ssescalarmode>
	      (match_operand:VF_128 1 "register_operand" "0,x")
	      (parallel [(const_int 0)]))
	    (match_operand:<ssescalarmode> 2 "nonimmediate_operand" "xm,xm"))))
      (match_dup 1)
      (const_int 1)))]

Then there is the question (i) of possibly introducing a specialized
version for V2DF (different pattern) instead of adding code to
simplify-rtx.

And finally there is the question (ii) of keeping the old define_insn in
addition to the new one(s), just for the builtins.

My preference is:
(i) specialized pattern for V2DF
(ii) remove

It seems like you might be ok with:
(i) simplify-rtx
(ii) remove

Do you agree?

--
Marc Glisse

Follow-Ups:
- Re: [i386] scalar ops that preserve the high part of a vector
  - From: Marc Glisse

References:
- Re: [i386] scalar ops that preserve the high part of a vector
  - From: Marc Glisse
- Re: [i386] scalar ops that preserve the high part of a vector
  - From: Uros Bizjak
- Re: [i386] scalar ops that preserve the high part of a vector
  - From: Marc Glisse
- Re: [i386] scalar ops that preserve the high part of a vector
  - From: Uros Bizjak
- Re: [i386] scalar ops that preserve the high part of a vector
  - From: Marc Glisse
- Re: [i386] scalar ops that preserve the high part of a vector
  - From: Uros Bizjak

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]