[PATCH] match.pd: Add ~(X - Y) -> ~X + Y simplification [PR96685]

Sun Jan 10 00:43:11 GMT 2021

On Mon, 21 Dec 2020, Jakub Jelinek wrote:

> > > This patch adds the ~(X - Y) -> ~X + Y simplification requested
> > > in the PR (plus also ~(X + C) -> ~X + (-C) for constants C that can
> > > be safely negated.
> > 
> >  This regresses VAX code produced by the cmpelim-eq-notsi.c test case (and 
> > its similar counterparts) with the `vax-netbsdelf' target.
> 
> The point of the match.pd changes is to canonicalize GIMPLE on some form
> when there are several from GIMPLE POV equivalent or better forms of writing
> the same thing.  The advantage of having one canonical way is that ICF,
> SCCVN etc. optimizations can then understand the different forms are
> equivalent.

 Fair enough, though in cases like this I think it is unclear which of the 
two forms is going to be ultimately better, especially as it may depend on 
the exact form of the operands used, e.g. values of any immediates, so I 
think a way to make the reverse transformation (whether to undo one made 
here or genuinely) needs to be available at a later compilation stage.  
One size doesn't fit all.

 With this in mind...

> If another form is then better for a particular machine, it should be done
> either during expansion (being able to produce both RTLs and computing their
> costs), or during combine with either combine splitters or
> define_insn_and_split in the backend, or, if it can't be done in RTL, during
> the isel pass.

 Hmm, maybe it has been discussed before, so please forgive me if I write 
something silly, but it seems to me like this should be done in a generic 
way like match.pd so that all the targets do not have to track the changes 
made there and then perhaps repeat the same or similar code each.  So I 
think it would make sense to make a change like this include that reverse 
transformation as well, so that ultimately both forms are tried with RTL, 
as there is no clear advantage to either here.

 However expand AFAICT essentially boils down to `expand_expr_real_2', 
which uses a bijective mapping between individual operations in the tree 
and the RTL form respectively rather than sequences as with the unary one 
complement and an addition/subtraction covered here, for two operations 
total.  So this cannot be done with expand, by design, unless I'm missing 
something.

 So from the look at combine code I infer the way to handle this within 
the existing infrastructure would be with `try_combine'/`find_split_point' 
(as we have one sequence of insns to be replaced with another sequence 
rather than combined into a single insn), right?

 Overall it seems to me it would be the best if we were able to produce a 
reverse RTL transformation automatically straight from match.pd, however I 
am not sure whether the syntax used is suitable for reverse interpretation 
in the general case.  And in any case this would be a major effort, but it 
would be hugely beneficial, as we have lots of knowledge already collected 
in match.pd.

 Have I missed anything here?

 NB I've seen it often enough already to irritate me that someone claims 
we ought to do something "because LLVM does it".  Well, if someone wants 
things done the LLVM way, then there's LLVM readily available.  If there 
is a technical advantage in doing something, then surely we ought to do 
that, however whether LLVM does it too or does not is I think irrelevant.

  Maciej