This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [i386] scalar ops that preserve the high part of a vector


On Mon, 3 Dec 2012, Uros Bizjak wrote:

On Mon, Dec 3, 2012 at 4:34 PM, Marc Glisse <marc.glisse@inria.fr> wrote:

However, looking a bit more into the usage cases for these patterns,
they are only used through intrinsics with _m128 operands. While your
proposed patch makes these patterns more general (they can use 64bit
aligned memory), this is not their usual usage, and for their intended
usage, your proposed improvement complicates these patterns
unnecessarily. Following on these facts, I'd say that we leave these
special patters (since they serve their purpose well) and rather
introduce new patterns for "other" uses.


You mean like in the original patch?
http://gcc.gnu.org/ml/gcc-patches/2012-10/msg01279.html

(it only had the V2DF version, not the V4SF one)

Funny how we switched sides, now I am the one who would rather have a single
pattern instead of having one for the builtin and one for recog. It seems
that once we add the new pattern, keeping the old one is a waste of
maintenance time, and the few extra rtx from the slightly longer pattern for
these seldomly used builtins should be negligible.

Yes, I didn't notice at the time that the intention of existing patterns was to implement intrinsics that exclusively use _m128 operands.

But I don't mind, if that's the version you prefer, I'll update the patch.

Actually, both approaches have their benefits and drawbacks. Specialized vec_merge patterns can be efficiently macroized, and support builtins with _m128 operands in a simple and efficient way. You are proposing patterns that do not macroize well (this is what was learned from your last patch) and require breakup of existing macroized patterns.

So, we are actually adding new functionality - operations on an array
of values. IMO, this warrants new patterns, but please find a way for
V2DF and V4SF to macroize in the same way.

I am still confused as to what is wanted. If the quantity to minimize is the number of entries in sse.md, we should replace the existing vec_merge pattern with this one: it macroizes just as well, it directly matches for V4SF, and the piece of code needed in simplify-rtx for V2DF isn't too absurd. (then we need to adjust the builtins as in one of the previous patches)

[(set (match_operand:VF_128 0 "register_operand" "=x,x")
      (vec_merge:VF_128
	(vec_duplicate:VF_128
	  (plusminus:<ssescalarmode>
	    (vec_select:<ssescalarmode>
	      (match_operand:VF_128 1 "register_operand" "0,x")
	      (parallel [(const_int 0)]))
	    (match_operand:<ssescalarmode> 2 "nonimmediate_operand" "xm,xm"))))
      (match_dup 1)
      (const_int 1)))]

Then there is the question (i) of possibly introducing a specialized
version for V2DF (different pattern) instead of adding code to
simplify-rtx.

And finally there is the question (ii) of keeping the old define_insn in
addition to the new one(s), just for the builtins.

My preference is:
(i) specialized pattern for V2DF
(ii) remove

It seems like you might be ok with:
(i) simplify-rtx
(ii) remove

Do you agree?

--
Marc Glisse


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]