This is the mail archive of the mailing list for the GCC project.

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [i386] scalar ops that preserve the high part of a vector

On Sun, 14 Oct 2012, Marc Glisse wrote:

On Sun, 14 Oct 2012, Uros Bizjak wrote:

On Sat, Oct 13, 2012 at 10:52 AM, Marc Glisse <> wrote:

this patch provides an alternate pattern to let combine recognize scalar
operations that preserve the high part of a vector. If the strategy is all
right, I could do the same for more operations (mul, div, ...). Something
similar is also possible for V4SF (different pattern though), but probably
not as useful.

But, we _do_ have vec_merge pattern that describes the operation. Adding another one to each operation just to satisfy combine is IMO not correct approach.

At some point I wondered about _replacing_ the existing pattern, so there would only be one ;-)

The vec_merge pattern takes as argument 2 vectors instead of a vector and a scalar, and describes the operation as a vector operation where we drop half of the result, instead of a scalar operation where we re-add the top half of the vector. I don't know if that's the most convenient choice. Adding code in simplify-rtx to replace vec_merge with vec_concat / vec_select might be easier than the other way around.

If the middle-end somehow gave us: (plus X (vec_concat Y 0)) it would seem a bit strange to add an optimization that turns it into: (vec_merge (plus X (subreg:V2DF Y)) X 1) but then producing: (vec_concat (plus (vec_select X 0) Y) (vec_select X 1)) would be strange as well. (ignoring the signed zero issues here)

I'd rather see generic RTX simplification that
simplifies your proposed pattern to vec_merge pattern.

Ok, I'll see what I can do.

Also, as you mention in PR54855, Comment #5, the approach is too fragile...

I am not sure I can make the RTX simplification much less fragile... Whenever I see (vec_concat X (vec_select Y 1)), I would have to check whether X is some (possibly large) tree of scalar computations involving Y[0], move it all to vec_merge computations, and fix other users of some of those scalars to now use S[0]. Seems too hard, I would stop at single-operation X that is used only once. Besides, the gain is larger in proportion when there is a single operation :-)

Thank you for your comments,


I experimented with the simplify-rtx transformation you suggested, see:

It works when the argument is a register, but not for memory (which is where the constant is in the testcase). And the description of the operation in does seem problematic. It says the second argument is:

(match_operand:VF_128 2 "nonimmediate_operand" "xm,xm"))

but Intel's documentation says "The source operand can be an XMM register or a 64-bit memory location", not quite the same.

Do you think the .md description should really stay this way, or could we change it to something that better reflects "64-bit memory location"?

Marc Glisse

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]