[PATCH 2/8] [RS6000] rs6000_rtx_costs for AND
Segher Boessenkool
segher@kernel.crashing.org
Wed Oct 21 20:29:11 GMT 2020
On Wed, Oct 21, 2020 at 01:27:42PM +1030, Alan Modra wrote:
> On Tue, Oct 20, 2020 at 01:55:56PM -0500, Segher Boessenkool wrote:
> > On Thu, Oct 08, 2020 at 09:27:54AM +1030, Alan Modra wrote:
> > > The existing "case AND" in this function is not sufficient for
> > > optabs.c:avoid_expensive_constant usage, where the AND is passed in
> > > outer_code. We'd like to cost AND of rs6000_is_valid_and_mask
> > > or rs6000_is_valid_2insn_and variety there, so that those masks aren't
> > > seen as expensive (ie. better to load to a reg then AND).
> > >
> > > * config/rs6000/rs6000.c (rs6000_rtx_costs): Combine CONST_INT
> > > AND handling with IOR/XOR. Move costing for AND with
> > > rs6000_is_valid_and_mask or rs6000_is_valid_2insn_and to
> > > CONST_INT.
> >
> > Sorry this took so long to review :-(
> >
> > On 64-bit BE this leads to *bigger* code, and closer observation shows
> > that some common sequences degrade on all configs. This seems to mostly
> > be about "andc" (and its dot form). It wasn't costed properly before,
> > but after your patch, a single instruction is replaced by three.
> >
> > Could you look into this?
>
> ~/build/gcc-alan/gcc$ for z in *.o; do if test `objdump -dr $z | grep andc | wc -l` != `objdump -dr ../../gcc/gcc/$z | grep andc | wc -l`; then echo $z; fi; done
> gimplify.o
> insn-emit.o
> insn-opinit.o
> insn-recog.o
> rs6000-string.o
>
> All of these are exactly the case I talked about in
> https://gcc.gnu.org/pipermail/gcc-patches/2020-September/553919.html
For a kernel build (my testcase) it happens more often.
> "Sometimes correct insn cost leads to unexpected results. For
> example:
>
> extern unsigned bar (void);
> unsigned
> f1 (unsigned a)
> {
> if ((a & 0x01000200) == 0x01000200)
> return bar ();
> return 0;
> }
>
> emits for a & 0x01000200
> (set (reg) (and (reg) (const_int 0x01000200)))
> at expand time (two rlwinm insns) rather than the older
> (set (reg) (const_int 0x01000200))
> (set (reg) (and (reg) (reg)))
And that is bad. Why on earth does expand "optimise" this? It should
not, it hinders various *real* optimisations!
> which is three insns. However, since 0x01000200 is needed later the
> older code after optimisation is smaller."
>
> Things have changed slightly since I wrote the above, with the two
> rlwinm insns being emitted at expand time, so you see
> (set (reg) (and (reg) (const_int 0xffffffffff0003ff)))
> (set (reg) (and (reg) (const_int 0x0000000001fffe00)))
It has done that for many years?
> but of course that doesn't change anything regarding the cost of
> "a & 0x01000200".
Yeah. But the problem is that cost that are "better", "closer to
reality", sometimes result in worse results :-(
Anyway:
+ || (outer_code == AND
+ && rs6000_is_valid_2insn_and (x, mode)))
{
*total = COSTS_N_INSNS (1);
return true;
It should return COSTS_N_INSNS (2) for that?
Testing with that now.
Segher
More information about the Gcc-patches
mailing list