This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH][RFC] Come up with VEC_COND_OP_EXPRs.
Richard Biener <richard.guenther@gmail.com> writes:
> On Tue, Sep 24, 2019 at 1:11 PM Richard Sandiford
> <richard.sandiford@arm.com> wrote:
>>
>> Martin Liška <mliska@suse.cz> writes:
>> > Hi.
>> >
>> > The patch introduces couple of new TREE_CODEs that will help us to have
>> > a proper GIMPLE representation of current VECT_COND_EXPR. Right now,
>> > the first argument is typically a GENERIC tcc_expression tree with 2 operands
>> > that are visited at various places in GIMPLE code. That said, based on the discussion
>> > with Richi, I'm suggesting to come up with e.g.
>> > VECT_COND_LT_EXPR<COND_LHS, COND_RHS, IF_CLAUSE, ELSE_CLAUSE>. Such a change logically
>> > introduces new GIMPLE_QUATERNARY_RHS gassignments. For now, the VEC_COND_EXPR remains
>> > and is only valid in GENERIC and gimplifier will take care of the corresponding transition.
>> >
>> > The patch is a prototype and missing bits are:
>> > - folding support addition for GIMPLE_QUATERNARY_RHS is missing
>> > - fancy tcc_comparison expressions like LTGT_EXPR, UNORDERED_EXPR, ORDERED_EXPR,
>> > UNLT_EXPR and others are not supported right now
>> > - comments are missing for various functions added
>> >
>> > Apart from that I was able to bootstrap and run tests with a quite small fallout.
>> > Thoughts?
>> > Martin
>>
>> I think this is going in the wrong direction. There are some targets
>> that can only handle VEC_COND_EXPRs well if we know the associated
>> condition, and others where a compare-and-VEC_COND_EXPR will always be
>> two operations. In that situation, it seems like the native gimple
>> representation should be the simpler representation rather than the
>> more complex one. That way the comparisons can be optimised
>> independently of any VEC_COND_EXPRs on targets that benefit from that.
>>
>> So IMO it would be better to use three-operand VEC_COND_EXPRs with
>> no embedded conditions as the preferred gimple representation and
>> have internal functions for the fused operations that some targets
>> prefer. This means that using fused operations is "just" an instruction
>> selection decision rather than hard-coded throughout gimple. (And that
>> fits in well with the idea of doing more instruction selection in gimple.)
>
> So I've been doing that before, but more generally also for COND_EXPR.
> We cannot rely on TER and the existing RTL expansion "magic" for the
> instruction selection issue you mention because TER isn't reliable. With
> IFNs for optabs we could do actual [vector] condition instruction selection
> before RTL expansion, ignoring "single-use" issues - is that what you are
> hinting at?
Yeah. It'd be similar to how most FMA selection happens after
vectorisation but before expand.
> How should the vectorizer deal with this? Should it directly
> use the optab IFNs then when facing "split" COND_EXPRs? IIRC the
> most fallout of a simple patch (adjusting is_gimple_condexpr) is in the
> vectorizer.
I guess that would be down to how well the vector costings work if we
just stick to VEC_COND_EXPR and cost the comparison separately. Using
optabs directly in the vectoriser definitely sounds OK if that ends up
being necessary for good code. But if (like you say) the COND_EXPR is
also split apart, we'd be costing the scalar comparison and selection
separately as well.
> Note I'm specifically looking for a solution that applies to both COND_EXPR
> and VEC_COND_EXPR since both suffer from the same issues.
Yeah, think the same approach would work for COND_EXPR if it's needed.
(And I think the same trade-off applies there too. Some targets will
always need a separate comparison to implement a four-operand COND_EXPR.)
> There was also recent work in putting back possibly trapping comparisons
> into [VEC_]COND_EXPR because it doesn't interfere with EH and allows
> better code.
OK, that's a good counter-reason :-) But it seems quite special-purpose.
I assume this works even for targets that do split the VEC_COND_EXPR
because the result is undefined on entry to the EH receiver if the
operation didn't complete. But that should be true of any non-trapping
work done after the comparison, with the same proviso.
So this still seems like an instruction-selection issue. We're just
saying that it's OK to combine a trapping comparison and a VEC_COND_EXPR
from the non-trapping path. The same would be true for any other
instruction selection that combines trapping and non-trapping
operations, provided that the speculated parts can never trap.
> Also you SVE people had VN issues with cond-exprs and
> VN runs into the exact same issue (but would handle separate comparisons
> better - with the caveat of breaking TER).
The VN thing turned out to be a red herring there, sorry. I think
I was remembering the state before ifcvt did its own value numbering.
The remaining issue for the vectoriser is that we don't avoid duplicate
cast conversions in vect_recog_mask_conversion_pattern, but that's
mostly a cost thing. The redundancies do get removed by later passes.
Thanks,
Richard