This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.


RFA: -mfpmath=sse -fpic vs double constants


When a simple function like

double foo(double x) { return x+1.0; }

is compiled on x86 with -O2 -march=pentium4 -mtune=prescott -mfpmath=sse -fpic, the load of 1.0 is done as

cvtss2sd .LC0@GOTOFF(%ecx), %xmm0

(This is Linux; the same happens on Darwin.)
This is not really a good idea, as movsd of a double-precision 1.0 is faster.
The change from double to single precision is done in compress_float_constant,
and there's no cost computation there; presumably the RTL optimizers are expected
to change it back if that's beneficial.
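
As a rough illustration of why the constant can be narrowed at all: 1.0 survives
a round trip through single precision unchanged, so storing it as a float and
widening it on load loses nothing. The helper below is a stand-alone sketch of
that condition, not the code in compress_float_constant; the name fits_in_float
is made up for this example.

```c
#include <stdbool.h>

/* Hypothetical helper, not GCC code: returns true when d can be stored
   as a float and widened back to double without changing its value.  */
bool fits_in_float(double d)
{
    /* volatile forces an actual single-precision store, so excess
       precision in x87 registers cannot hide the rounding.  */
    volatile float narrowed = (float)d;
    return (double)narrowed == d;
}
```

Under this check 1.0 and 0.5 qualify, while 0.1 does not, since 0.1 rounds
differently in single and double precision. Note that the check says only that
narrowing is safe, not that it is profitable, which is the missing cost question
raised above.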


Without -fpic, this does happen in cse_insn: (mem/u/i:SF (symbol_ref/u:SI ("*.LC0")
gets run through fold_rtx, which recognizes it as a pool constant. This causes the
known equivalent CONST_DOUBLE 1.0 to be run through force_const_mem,
producing (mem/u/i:DF (symbol_ref/u:SI ("*.LC1"), which is then tried in place
of the FLOAT_EXTEND and selected as valid and cheaper. This all seems to
be working as expected.


With -fpic, two things go wrong. First, fold_rtx doesn't recognize the PIC form as
representing a constant, so cse_insn never tries forcing the CONST_DOUBLE into memory.
Second, hacking around that doesn't help, because force_const_mem doesn't produce the
PIC form of the constant reference, even though we're in PIC mode; we get the same
(mem/u/i:DF (symbol_ref/u:SI ("*.LC1"), which (correctly) doesn't test as valid in PIC mode.
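
The difference between the two cases can be pictured with a toy model of the
"try the replacement, keep it if valid and cheaper" step. Everything here is
hypothetical: the struct, the function names, the cost numbers, and the rule
that PIC mode accepts only GOT-relative references are simplifications for
illustration, not GCC's actual data structures or cost model.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for one way of loading 1.0; not a GCC type.  */
struct candidate {
    const char *desc;  /* human-readable form of the operand */
    bool pic_form;     /* true if the address is GOT-relative */
    int cost;          /* made-up relative cost, lower is better */
};

/* Assumed rule: in PIC mode only GOT-relative pool references are valid.  */
bool is_valid(const struct candidate *c, bool pic_mode)
{
    return !pic_mode || c->pic_form;
}

/* Keep the current form unless the replacement is valid and cheaper.  */
const struct candidate *choose(const struct candidate *current,
                               const struct candidate *replacement,
                               bool pic_mode)
{
    if (replacement != NULL
        && is_valid(replacement, pic_mode)
        && replacement->cost < current->cost)
        return replacement;
    return current;
}
```

In this model, a plain (non-GOT-relative) DF reference wins over the
FLOAT_EXTEND form when pic_mode is false but is rejected when pic_mode is true,
so the FLOAT_EXTEND form survives. A replacement with pic_form set would be
accepted in PIC mode, which is roughly what force_const_mem would have to
produce for this approach to work.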


At this point I'm wondering if this is the right place to be attacking the problem at all.
Advice? Thanks.


