This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH 2/3] Add simplify rules for wrapped binary operations.
- From: Robin Dapp <rdapp at linux dot ibm dot com>
- To: Richard Biener <richard dot guenther at gmail dot com>, Marc Glisse <marc dot glisse at inria dot fr>
- Cc: GCC Patches <gcc-patches at gcc dot gnu dot org>
- Date: Fri, 16 Aug 2019 14:17:25 +0200
- Subject: Re: [PATCH 2/3] Add simplify rules for wrapped binary operations.
- References: <20190813083326.10833-1-rdapp@linux.ibm.com> <20190813083326.10833-3-rdapp@linux.ibm.com> <CAFiYyc3ZQ_HHKhCCq-qZ8kGfs-HpqEuBoioWEZQ=xPnB1-V40A@mail.gmail.com>
> So - what are you really after? (sorry if I don't remeber, testcase(s)
> are missing
> from this patch)
>
> To me it seems that 1) loses information if A + CST was done in a signed type
> and we know that overflow doesn't happen because of that. For the reverse
> transformation we don't. Btw, if you make A == A' + CST' then
> you get (T)A + CST -> (T)(A' + CST' + CST) which is again trivially handled
> so why do you need both transforms again?
As far as I know the second transformation resulted from some suggestion
to split what was the status back then. Some things have changed in
match.pd so I checked again:
What I want to achieve in the end is getting rid of
add r1,-1
extend r1
add r1,1.
emitted via doloop.
We have
(1) (T)(A + CST1) + CST2 -> (T)(A) + (T)CST1 + CST2
(2) (T)(A) + (T)CST1 + CST2 -> (T)(A) + (T)(CST1 + CST2)
(2better) (A +- CST1) +- CST2 -> A + CST3 (the one Marc mentioned)
(3) (T)(A) + CST -> T(A + CST)
Considering
unsigned long foo(unsigned int a)
{
if (a > 3 && a < INT_MAX - 100)
return (unsigned long)(a - 1) + 1;
}.
This results in s390 assembly
ahi %r2,-1
llgfr %r2,%r2
aghi %r2,1
br %r14.
With the second transform it becomes just
br %r14
due to
(3) Applying pattern match.pd:2091, gimple-match.c:36515
Matching expression match.pd:111, gimple-match.c:65
(2better) Applying pattern match.pd:1980, gimple-match.c:1011
in evrp.
I believe when I last tried long time ago it would not work that way but
now it does, so (3) is basically an enabler of (2better). As far as I
can tell (2better) does not make use of range information but I didn't
really now why it works the way it does using (3).
Now for
unsigned long bar(unsigned int a)
{
if (a > 3 && a < INT_MAX - 100)
return (unsigned long)(a + 1) + ULONG_MAX;
}
we have
Folded into: _2 = a_7 + 1;
Folding statement: _3 = (long unsigned int) _2;
Folded into: _3 = (long unsigned int) _2;
Folding statement: _6 = _3 + 18446744073709551615;
(1) Applying pattern match.pd:2060, gimple-match.c:36437
in vrp1.
This is not being handled by (2alt) but by (1). The same applies for
long/int instead of ulong/uint except that we don't need range
information there.
Now, when removing (2) from my patch I run into nasty out-of-mem errors
during bootstrap which I didn't find the reason for yet. I thought of
something like two match.md patterns that keep reverting the other's
result but I think we have a recursion depth check in place there.
> Also do
>
> if ((did_replace || fold_all_stmts)
> && fold_stmt (...))
> {
> }
>
> to avoid extra work when folding does nothing.
Does this work like this? We want to mark the statement as modified even
if folding wasn't successful, I guess, and the gimple_set_modified ()
call should not be that expensive? Could do a separate
if (fold_all_stmts && fold_stmt (...))
of course.
Regards
Robin