[Bug middle-end/19988] [4.0 Regression] pessimizes fp multiply-add/subtract combo

roger at eyesopen dot com gcc-bugzilla@gcc.gnu.org
Wed Feb 16 23:47:00 GMT 2005

------- Additional Comments From roger at eyesopen dot com  2005-02-16 19:17 -------
Hmm.  I don't think the problem in this case is at the tree level, where I think
keeping X-(Y*C) and -(Y*C) in the more canonical forms X + (Y*C') and Y*C' (with
C' = -C) should help with reassociation and other tree-ssa optimizations.  Indeed,
it's these types of transformations that have enabled the use of fmadd on the
PowerPC for mainline.
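
As a rough illustration of the two forms, here is a minimal C sketch.  The
function names and the constant 3.0 are made up for this example and are not
taken from the PR's testcase; the "canonical" functions simply spell out by
hand what the tree-level folding is described as doing.

```c
/* X - (Y*C) as it might appear in the source. */
double sub_form(double x, double y)
{
    return x - y * 3.0;
}

/* The canonical X + (Y*C'), with C' = -C folded into the constant. */
double add_form(double x, double y)
{
    return x + y * -3.0;
}

/* -(Y*C) as it might appear in the source. */
double neg_form(double y)
{
    return -(y * 3.0);
}

/* The canonical Y*C', again with C' = -C. */
double negconst_form(double y)
{
    return y * -3.0;
}
```

Under the default round-to-nearest mode each pair should compute the same
value, since negating a constant only flips the sign of the product.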

The regression, however, comes from the (rare) case where a floating point
constant and its negative now both need to be stored in the constant pool.  It's
only when X and -X are required in a function (potentially in short succession)
that this is a problem, and then only on machines that need to load floating
point constants from memory (AVR and other platforms with immediate floating
point constants, for example, are unaffected).

Some aspects of keeping X and -X in the constant pool were addressed by my
patch quoted in comment #1, which attempts to keep floating point constants
positive *when* this doesn't interfere with GCC's other optimizations.

I think the correct solution to this regression is to improve CSE/GCSE to
recognize that X*C can be synthesized from a previously available X*(-C) at
the cost of a negation, which is presumably cheaper than a multiplication on
most platforms.  Indeed, there's probably a set of targets for which loading
a positive constant from the constant pool and then negating it is cheaper
than loading both a positive constant and a negative constant.
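
The rewrite suggested above can be sketched by hand in C.  This is only an
illustration of the idea, not code from GCC's CSE/GCSE passes; the function
names, the out-parameter and the constant 1.25 are hypothetical.

```c
/* As written: two multiplications, needing both -C and +C as constants. */
double two_products(double x, double *neg_out)
{
    *neg_out = x * -1.25;   /* X*(-C), computed first */
    return x * 1.25;        /* X*C, a second multiply and a second constant */
}

/* Hand-applied rewrite: X*C is obtained by negating the available X*(-C). */
double product_and_negation(double x, double *neg_out)
{
    double t = x * -1.25;   /* X*(-C), the only multiply and only constant */
    *neg_out = t;
    return -t;              /* X*C at the cost of one negation */
}
```

In the default round-to-nearest mode the two functions should return the same
values; with directed rounding modes (e.g. under -frounding-math) the negated
product need not match the directly computed one, so the rewrite would have to
be guarded accordingly.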

Unfortunately, I doubt whether it'll be possible to address this performance
regression without also reintroducing the 3.x issue mentioned in the original
"PS".  I doubt that on many platforms two multiply-adds are much faster than a
single floating point multiplication whose result is shared by two additions.
Though again it might be possible to do something at the RTL level, especially
if duplicating the multiplication is a win with -Os.
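
For reference, the two shapes being compared look roughly like this in C.  The
function names and the constant 0.5 are invented for the sketch, and whether a
given target actually fuses the second form into multiply-add instructions
depends on the target and on options such as -ffp-contract.

```c
/* One multiplication whose result is shared by an add and a subtract. */
void shared_product(double *a, double *b, double y)
{
    double t = y * 0.5;
    *a = *a + t;
    *b = *b - t;
}

/* Two independent multiply-add candidates, using both C and -C. */
void two_muladds(double *a, double *b, double y)
{
    *a = *a + y * 0.5;
    *b = *b + y * -0.5;
}
```

The first form needs one constant and one multiply; the second duplicates the
multiply and needs both constants, but each statement can become a single
fused instruction where the hardware has one.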


