This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH] Complex move by parts (PR rtl-optimization/20306)
- From: David Edelsohn <dje at watson dot ibm dot com>
- To: Roger Sayle <roger at eyesopen dot com>, Richard Henderson <rth at redhat dot com>
- Cc: gcc-patches at gcc dot gnu dot org
- Date: Thu, 10 Mar 2005 15:02:40 -0500
- Subject: Re: [PATCH] Complex move by parts (PR rtl-optimization/20306)
- References: <Pine.LNX.4.44.0503101238060.13922-100000@www.eyesopen.com>
>>>>> Roger Sayle writes:
Roger> The issue is that different processors have different preferences,
Roger> even for the mem->mem case. On PowerPC, apparently it's faster to
Roger> block move an array of doubles via FP registers than via integer
Roger> registers, as the peak FPU<->MEM bandwidth is higher than the peak
Roger> CPU<->MEM bandwidth. Clearly, this isn't the case on IA-32 for
Roger> memory to memory moves, which are most efficiently implemented by
Roger> using integer load/stores and/or IA-32's block move instructions.
To expand on this further, the current algorithm will set try_int
to true for many of the complex float cases of interest.
I can make block_move conditional on try_int and add another test
to the decision tree setting try_int to false for MODE_FLOAT, but the
question is whether this generic cost model is correct for all targets.
Do we still need a target hook to force try_int to false instead of a
generic MODE_FLOAT test?
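
To make the two alternatives concrete, here is a rough standalone sketch
of the decision. It is not the patch and not the code in expr.c: the
struct, the enum, the hook type and every function name below are
hypothetical stand-ins, and the starting value of try_int is an
assumption made only so the example compiles on its own.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not GCC's real definitions.  */
enum part_class { PART_INT, PART_FLOAT };

struct move_info
{
  enum part_class part_class;  /* class of the complex mode's components */
  bool src_is_mem;             /* source operand is in memory */
  bool dst_is_mem;             /* destination operand is in memory */
  bool int_mode_available;     /* assumed: an integer mode of full width exists */
};

/* Hypothetical target hook: return true if the target prefers to move
   the parts through FP registers, i.e. try_int should be forced to false.  */
typedef bool (*prefer_fp_parts_fn) (const struct move_info *);

/* Example hook for a target that always prefers integer moves.  */
static bool
hook_prefers_int (const struct move_info *m)
{
  (void) m;
  return false;
}

/* Example hook for a target that prefers FP registers for float parts.  */
static bool
hook_prefers_fp (const struct move_info *m)
{
  return m->part_class == PART_FLOAT;
}

/* Decide try_int.  With a hook, the target decides; with hook == NULL,
   fall back to the generic test that clears try_int for float parts.  */
static bool
choose_try_int (const struct move_info *m, prefer_fp_parts_fn hook)
{
  assert (m != NULL);
  bool try_int = m->int_mode_available;

  if (hook != NULL)
    {
      if (hook (m))
        try_int = false;
    }
  else if (m->part_class == PART_FLOAT)
    try_int = false;

  return try_int;
}

/* The block move is attempted only for mem->mem and only when try_int
   survived the tests above.  */
static bool
use_block_move (const struct move_info *m, prefer_fp_parts_fn hook)
{
  return m->src_is_mem && m->dst_is_mem && choose_try_int (m, hook);
}
```

With no hook, a mem->mem complex float move never takes the block move
path on any target. With a hook, a target like the IA-32 case Roger
describes could keep try_int true, while a PowerPC-like target could
clear it.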
David