[PATCH] Change vcond<mode> to vcond<mode1><mode2>

Tue Aug 30 09:19:00 GMT 2011

On Tue, 30 Aug 2011, Uros Bizjak wrote:

> On Mon, Aug 29, 2011 at 9:44 PM, Richard Guenther <rguenther@suse.de> wrote:
> 
> >> > This patch makes a conversion optab from the direct optabs vcond
> >> > and vcondu. Â This allows to specify different modes for the
> >> > actual comparison and the value that is selected.
> >> >
> >> > All targets but i386 are trivially converted by
> >> > s/vcond<mode>/vcond<mode><mode>/. Â The i386 port is enhanced
> >> > to support a OP b ? c : d as ({ mask = a OP b; (c & mask) | (d & ~mask);
> >> > }), constraining it to what the middle-end constrained itself to
> >> > (matching number of vector elements in the comparison operands with
> >> > the result vector types) would explode patterns too much.
> >> > Thus, only a subset of mode combinations will be excercised
> >> > (but none at the moment - a followup will fix the vectorizer,
> >> > and generic vectors from the C extensions have a patch pending).
> >> >
> >> > Bootstrapped on x86_64-unknown-linux-gnu, tests are currently
> >> > running for {,-m32}.
> >> >
> >> > Ok if that succeeds?
> >> >
> >> > Thanks,
> >> > Richard.
> >> >
> >> > 2011-08-29 Â Richard Guenther Â <rguenther@suse.de>
> >> >
> >> > Â  Â  Â  Â * genopinit.c (optabs): Turn vcond{,u}_optab into a conversion
> >> > Â  Â  Â  Â optab with two modes.
> >> > Â  Â  Â  Â * optabs.h (enum convert_optab_index): Add COI_vcond, COI_vcondu.
> >> > Â  Â  Â  Â (enum direct_optab_index): Remove DOI_vcond, DOI_vcondu.
> >> > Â  Â  Â  Â (vcond_optab): Adjust.
> >> > Â  Â  Â  Â (vcondu_optab): Likewise.
> >> > Â  Â  Â  Â (expand_vec_cond_expr_p): Adjust prototype.
> >> > Â  Â  Â  Â * optabs.c (get_vcond_icode): Adjust.
> >> > Â  Â  Â  Â (expand_vec_cond_expr_p): Likewise.
> >> > Â  Â  Â  Â (expand_vec_cond_expr): Likewise.
> >> > Â  Â  Â  Â * tree-vect-stmt.c (vectorizable_condition): Adjust.
> >> >
> >> > Â  Â  Â  Â * config/i386/sse.md (vcond<mode>): Split to
> >> > Â  Â  Â  Â vcond<V_256:mode><VF_256:mode>, vcond<V_128:mode><VF_128:mode>,
> >> > Â  Â  Â  Â vcond<V_128:mode><VI124_128:mode> and
> >> > Â  Â  Â  Â vcondu<V_128:mode><VI124_128:mode>.
> >> > Â  Â  Â  Â (vcondv2di): Change to vcond<VI8F_128:mode>v2di.
> >> > Â  Â  Â  Â (vconduv2di): Likewise.
> >> > Â  Â  Â  Â * config/arm/neon.md (vcond<mode>): Change to vcond*<mode><mode>.
> >> > Â  Â  Â  Â (vcondu<mode>): Likewise.
> >> > Â  Â  Â  Â * config/ia64/vect.md (vcond<mode>): Likewise.
> >> > Â  Â  Â  Â (vcondu<mode>): Likewise.
> >> > Â  Â  Â  Â (vcondv2sf): Likewise.
> >> > Â  Â  Â  Â * config/mips/mips-ps-3d.md (vcondv2sf): Likewise.
> >> > Â  Â  Â  Â * config/rs6000/paired.md (vcondv2sf): Likewise.
> >> > Â  Â  Â  Â * config/rs6000/vector.md (vcond<mode>): Likewise.
> >> > Â  Â  Â  Â (vcondu<mode>): Likewise.
> >> > Â  Â  Â  Â * config/spu/spu.md (vcond<mode>): Likewise.
> >> > Â  Â  Â  Â (vcondu<mode>): Likewise.
> >>
> >> Do we really want to introduce stuff like:
> >>
> >> ! (define_expand "vcond<V_128:mode><VF_128:mode>"
> >>
> >> You are in fact introducing 6x2 = 12 patterns, many of them (i.e.
> >> v16qiv2df combination) invalid.
> >
> > Well, in principle they are not "invalid" they would be a short-hand -
> > the example of (subreg:V16QI (vcond:V2DI (... )).
> >
> >> I'd prefer a pattern with mode-less operands 4 and 5, rejected in insn
> >> constraints for invalid combinations:
> >
> > Hm, ok - that was the first variant I tried (well, but with modeless
> > operands 1 and 2, to keep 4 and 5 selcting int vs. fp compare). Â But
> > modeless operands get you that annoying warning from the gen* programs.
> 
> Only for define_insn, if your c_test does not include string "operands".
> 
> > How'd you ask if a pattern is available for vcondv4si with v4sf
> > operands 4 and 5? Â The vectorizer would need to be able to ask this
> > question.
> 
> Maybe with something along the lines of:
> 
> (define_expand "vcond<mode>"
>   [(set (match_operand:VI124_128 0 "register_operand" "")
> 	(if_then_else:VI124_128
> 	  (match_operator 3 ""
> 	    [(match_operand 4 "nonimmediate_operand" "")
> 	     (match_operand 5 "nonimmediate_operand" "")])
> 	  (match_operand:VI124_128 1 "general_operand" "")
> 	  (match_operand:VI124_128 2 "general_operand" "")))]
>   "TARGET_SSE2"
> {
>   if (GET_MODE (operands[4]) != GET_MODE (operands[5])
>       || (GET_MODE_NUNITS (GET_MODE (operands[4]))
>       	  != GET_MODE_NUNITS (GET_MODE (operands[0]))))
>     FAIL;
> 
>   bool ok = ix86_expand_int_vcond (operands);
>   gcc_assert (ok);
>   DONE;
> })
> 
> This means that vcond pattern is allowed to FAIL, so when vectorizer
> tentatively tries to expand the pattern, FAIL signalizes that operand
> modes are not supported.

Hmm.  But then I'd have to try emit an insn, right?  Currently
the vectorizer simply looks for an optab handler ... the
operands are not readily available (but their mode is known).
So I'd create some fake regs, setup operands and call GEN_FCN
on it?  If it succeds I'd have to delete emitted insns, etc.
Or I could add a target hook ...

That shifts the ugliness towards the users, for just saving
a few redundant patterns?

> > In the end putting both mask generation and apply into one
> > instruction pattern causes all this issues - but it helps
> > not exposing the mask representation of the HW.
> 
> No, but we should be sure that the widths are the same...

Yes.  We'd guarantee that we'd never try to expand something
where that doesn't hold.

I just saw PR29269 - vcond is not documented ... I'd document
it for the two-mode variant as

@cindex @code{vcond@var{m}@var{n}} instruction pattern
@item @samp{vcond@var{m}@var{n}}
Output a conditional vector move.  Operand 0 is the destination to
receive a combination of operand 1 and operand 2, which are of mode 
@var{m},
dependent on the outcome of the predicate in operand 3 which is a
vector comparison with operands of mode @var{n} in operands 4 and 5.  The
modes @var{m} and @var{n} should have the same size.  Operand 0
will be set to the value @var{op1} & @var{msk} | @var{op2} & ~@var{msk}
where @var{msk} is computed by element-wise evaluation of the vector
comparison with a truth value of all-ones and a false value of all-zeros.

not constraining the modes further than requiring the same size.

The two patches tested ok on x86_64-unknown-linux-gnu with {,-m32}.

Richard.

> Uros.