97071 – Fails to CSE / inherit constant pool load

Summary: Fails to CSE / inherit constant pool load

Status:	NEW

Alias:	None

Product:	gcc
Classification:	Unclassified
Component:	rtl-optimization
Version:	11.0

Importance:	P3 normal
Target Milestone:	---
Assignee:	Not yet assigned to anyone

URL:
Keywords:	missed-optimization

Depends on:
Blocks:

Reported:	2020-09-16 13:45 UTC by Richard Biener
Modified:	2022-01-11 13:14 UTC
CC List:	3 users

See Also:
Host:
Target:	powerpc64le, x86_64--
Build:
Known to work:
Known to fail:
Last reconfirmed:	2020-09-16 00:00:00

Description Richard Biener 2020-09-16 13:45:34 UTC

double foo (double x)
{
  return x * -3. + 3.;
}

compiles to

0:      addis 2,12,.TOC.-.LCF0@ha
        addi 2,2,.TOC.-.LCF0@l
        .localentry     foo,.-foo
        addis 9,2,.LC0@toc@ha
        lfd 12,.LC0@toc@l(9)
        addis 9,2,.LC2@toc@ha
        lfd 0,.LC2@toc@l(9)
        fmadd 1,1,12,0
        blr

...

.LC0:   
        .long   0
        .long   -1073217536
        .align 3
.LC2:   
        .long   0
        .long   1074266112

but CSE or reload inheritance could have replaced the addition of 3. with a
subtraction of the already available -3. constant.  This might be more
difficult to pull off on x86, where the add/mul has memory operands.
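
The proposed rewrite can be checked at the source level. The following is a minimal sketch, assuming only IEEE arithmetic; the function names are made up for illustration and the volatile is just a device to keep the compiler from folding the negation back into a second constant. Adding 3. is the same as subtracting -3., so a single register holding -3. can feed both operands (on powerpc64le this would presumably end up as one lfd feeding a multiply-subtract, but that is an assumption, not something shown in this report).

```c
/* Original form: needs both -3. and 3. from the constant pool.  */
double foo_two_constants (double x)
{
  return x * -3. + 3.;
}

/* Hand-rewritten form: only -3. is materialized.  Adding 3. becomes
   subtracting the value that is already available in a register.
   The volatile read only stops the compiler from folding -c into 3.  */
double foo_one_constant (double x)
{
  volatile double m3 = -3.;
  double c = m3;
  return x * c - c;
}
```

Both functions should return identical results for every input, since x * c - c and x * c + (-c) are the same IEEE operation.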

Comment 1 Richard Biener 2020-09-16 13:49:24 UTC

Right before combine we see the following, with the REG_EQUAL notes still nicely intact:

(insn 7 17 9 2 (set (reg:DF 119)
        (mem/u/c:DF (unspec:DI [
                    (symbol_ref/u:DI ("*.LC0") [flags 0x82])
                    (reg:DI 2 2)
                ] UNSPEC_TOCREL) [0  S8 A64])) "t.c":3:20 533 {*movdf_hardfloat64}   
     (expr_list:REG_EQUAL (const_double:DF -3.0e+0 [-0x0.cp+2])
        (nil)))
(insn 9 7 14 2 (set (reg:DF 121)
        (mem/u/c:DF (unspec:DI [
                    (symbol_ref/u:DI ("*.LC2") [flags 0x82])
                    (reg:DI 2 2)
                ] UNSPEC_TOCREL) [0  S8 A64])) "t.c":3:20 533 {*movdf_hardfloat64}   
     (expr_list:REG_EQUAL (const_double:DF 3.0e+0 [0x0.cp+2])
        (nil)))
(insn 14 9 15 2 (set (reg/i:DF 33 1)
        (fma:DF (reg:DF 124)
            (reg:DF 119)
            (reg:DF 121))) "t.c":4:1 894 {*fmadf4_fpr}
     (expr_list:REG_DEAD (reg:DF 124)
        (expr_list:REG_DEAD (reg:DF 121)
            (expr_list:REG_DEAD (reg:DF 119)
                (nil)))))

Possibly the easiest pass to teach this to is fwprop, though, as it already
works DF DEF -> USE.  Alternatively, PRE could make the subtract and/or the
negated value anticipated.

Comment 2 Jakub Jelinek 2020-09-16 14:07:19 UTC

REG_EQUAL notes aren't really needed for that, we have functions to query the values from the constant pool for loads from it.
So guess it is a matter of looking at the constant pool entry, if the negation of it is already emitted and the current value is not, try if instruction with the negation can be recognized.
If neither of the constant pool entries is emitted already, but both are present, it should try to canonicalize to one of them...
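
The decision described in this comment can be modeled outside the compiler. What follows is a toy sketch in plain C, not GCC code: `struct pool`, `struct pool_ref` and `pool_request` are hypothetical names, and GCC's real constant pool has a different interface. It only shows the order of checks: reuse an entry with the same value, else reuse an entry holding the negation and tell the consumer to use it negated, else emit a new entry. In the real compiler the second case would additionally require that the instruction using the negated value can be recognized.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define POOL_MAX 16

/* Toy constant pool: the values that already have an entry.  */
struct pool
{
  double emitted[POOL_MAX];
  size_t count;
};

/* Answer to a request for a value.  */
struct pool_ref
{
  size_t index;   /* entry to load from */
  bool negate;    /* true if the consumer must negate the loaded value */
};

/* Compare bit patterns so that 0. and -0. stay distinct.  */
static bool same_bits (double a, double b)
{
  return memcmp (&a, &b, sizeof a) == 0;
}

struct pool_ref pool_request (struct pool *p, double value)
{
  struct pool_ref ref = { 0, false };

  /* 1. The value itself already has an entry: share the load.  */
  for (size_t i = 0; i < p->count; i++)
    if (same_bits (p->emitted[i], value))
      {
        ref.index = i;
        return ref;
      }

  /* 2. Only the negation has an entry: load it and negate at the use.  */
  for (size_t i = 0; i < p->count; i++)
    if (same_bits (p->emitted[i], -value))
      {
        ref.index = i;
        ref.negate = true;
        return ref;
      }

  /* 3. Neither is present: emit a new entry.  */
  assert (p->count < POOL_MAX);
  ref.index = p->count;
  p->emitted[p->count++] = value;
  return ref;
}
```

With this model, requesting -3. and then 3. leaves a single pool entry; the second request comes back with `negate` set.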

Comment 3 Richard Biener 2020-09-16 14:14:23 UTC

So on targets where the FP constant loads are separate insns, the load of the
negated constant could be replaced by a (neg:DF ..), which might even be
profitable when not combined with the following add.  As said, targets like
x86 might be more difficult in this regard, though it looks like the
memory operands in this case only appear during LRA.

Comment 4 Richard Biener 2020-09-16 14:15:28 UTC

(In reply to Jakub Jelinek from comment #2)
> REG_EQUAL notes aren't really needed for that, we have functions to query
> the values from the constant pool for loads from it.
> So guess it is a matter of looking at the constant pool entry, if the
> negation of it is already emitted and the current value is not, try if
> instruction with the negation can be recognized.
> If neither of the constant pool entries is emitted already, but both are
> present, it should try to canonicalize to one of them...

Note it's not mainly about optimizing the size of the constant pool but
about reducing the number of loads from it and possibly shrinking code size.

Comment 5 Richard Biener 2020-09-16 14:18:55 UTC

A related and more difficult case is where the add comes first and we'd want to
change the load of -3. to a load of 3. so we can CSE the 3. for the multiplication.

double foo (double x)
{
  return (x + -3.) * 3.;
}
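
At the source level, the desired outcome for this variant would look roughly like the sketch below. The function names are illustrative, and the volatile read is only there to keep the compiler from re-creating the second constant. The load of -3. is replaced by a use of 3., which turns the add into a subtract, so the same value then feeds the multiplication.

```c
/* Original form: the add needs -3., the multiply needs 3.  */
double foo_add_first (double x)
{
  return (x + -3.) * 3.;
}

/* Hand-rewritten form: only 3. is materialized.  Adding -3. becomes
   subtracting 3., and the same register is reused for the multiply.
   The volatile read only stops the compiler from folding constants.  */
double foo_add_first_one_constant (double x)
{
  volatile double p3 = 3.;
  double c = p3;
  return (x - c) * c;
}
```

As in the first testcase, both forms compute the same IEEE result; the difference is only in how many constants need to be loaded.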

Comment 6 Segher Boessenkool 2020-09-16 17:46:35 UTC

Confirmed.

Maybe cse2 should do this?

Comment 7 Richard Biener 2022-01-11 12:12:00 UTC

Another possibility would be to do this on GIMPLE, creating parts of the constant pool early with CONST_DECLs and loads from them for constants that are never legitimate (immediate) in instructions.

Comment 8 Segher Boessenkool 2022-01-11 12:54:48 UTC

(In reply to Richard Biener from comment #7)
> Another possibility would be to do this on GIMPLE, creating parts of the
> constant pool early with CONST_DECLs and loads from them for constants that
> are never legitimate (immediate) in instructions.

How can Gimple know this though?  Gimple does not know what instructions will
be generated.

The constant pools are a very machine-specific concept, poorly suited to Gimple.

What abstraction does Gimple have for immediates currently?

Comment 9 Richard Biener 2022-01-11 13:14:25 UTC

(In reply to Segher Boessenkool from comment #8)
> (In reply to Richard Biener from comment #7)
> > Another possibility would be to do this on GIMPLE, creating parts of the
> > constant pool early with CONST_DECLs and loads from them for constants that
> > are never legitimate (immediate) in instructions.
> 
> How can Gimple know this though?  Gimple does not know what instructions will
> be generated.
> 
> The constant pools are a very machine-specific concept, poorly suited to
> Gimple.

Sure.

> What abstraction does Gimple have for immediates currently?

There's no "abstraction" for immediates in GIMPLE, likewise for symbolic
addresses.