RFA: -mfpmath=sse -fpic vs double constants
Dale Johannesen
dalej@apple.com
Fri Jul 8 00:29:00 GMT 2005
Compiling a simple function like

    double foo(double x) { return x+1.0; }

on x86 with -O2 -march=pentium4 -mtune=prescott -mfpmath=sse -fpic, the
load of 1.0 is done as

    cvtss2sd .LC0@GOTOFF(%ecx), %xmm0

(This is on Linux; the same happens on Darwin.)
This is not really a good idea, as a movsd of a double-precision 1.0 is
faster.
The change from double to single precision is done in
compress_float_constant, and there's no cost computation there;
presumably the RTL optimizers are expected to change it back if that's
beneficial.
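To illustrate why 1.0 is a candidate for narrowing at all: as far as I
understand it, compress_float_constant only narrows a constant when the
narrower value converts back exactly. The sketch below mimics that
condition in plain C. fits_in_float and check_examples are hypothetical
names made up for this illustration, not GCC functions, and the real
check works on GCC's internal representation of the constant rather
than on host doubles.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper, not a GCC function: returns true when
   narrowing d to float and widening it back yields exactly d.  */
static bool fits_in_float(double d)
{
    /* volatile forces the value to be rounded to float even on
       targets that evaluate with excess precision (e.g. x87).  */
    volatile float narrowed = (float)d;
    return (double)narrowed == d;
}

/* A few sample constants (also assumptions for illustration).  */
static void check_examples(void)
{
    assert(fits_in_float(1.0));   /* exact: can be stored as SF and extended */
    assert(fits_in_float(0.5));   /* exact: a power of two */
    assert(!fits_in_float(0.1));  /* inexact: must stay in DF */
}
```

So 1.0 qualifies for the single-precision pool entry plus FLOAT_EXTEND,
whereas something like 0.1 would stay a double-precision load. Whether
the narrowed form is actually cheaper is a separate question that this
check does not answer.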
Without -fpic, this does happen in cse_insn. The operand
(mem/u/i:SF (symbol_ref/u:SI ("*.LC0")
gets run through fold_rtx, which recognizes it as a pool constant.
This causes the known equivalent CONST_DOUBLE 1.0 to be run through
force_const_mem, producing (mem/u/i:DF (symbol_ref/u:SI ("*.LC1"),
which is then tried in place of the FLOAT_EXTEND and selected as valid
and cheaper. This all seems to be working as expected.
With -fpic, there are two problems. First, fold_rtx doesn't recognize
the PIC form as representing a constant, so cse_insn never tries
forcing the CONST_DOUBLE into memory. Second, hacking around that
doesn't help, because force_const_mem doesn't produce the PIC form of
the constant reference, even though we're in PIC mode; we get the same
(mem/u/i:DF (symbol_ref/u:SI ("*.LC1"), which (correctly) doesn't test
as valid in PIC mode.
At this point I'm wondering if this is the right place to be attacking
the problem at all.
Advice? Thanks.