This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.
[Bug rtl-optimization/74585] powerpc64: Very poor code generation for homogeneous vector aggregates passed in registers
- From: "wschmidt at gcc dot gnu.org" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: Mon, 15 Aug 2016 17:35:21 +0000
- Subject: [Bug rtl-optimization/74585] powerpc64: Very poor code generation for homogeneous vector aggregates passed in registers
- References: <bug-74585-4@http.gcc.gnu.org/bugzilla/>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=74585
--- Comment #11 from Bill Schmidt <wschmidt at gcc dot gnu.org> ---
With the original test case, -mcpu=power8 is problematic because of the use of
the "swapping stores," whose RHS is a vec_select rather than a register or
subreg. This prevents us from saving the RHS of the store for use in replacing
subsequent loads, running afoul of this logic in dse.c:record_store ():
  if (GET_CODE (body) == SET
      /* No place to keep the value after ra.  */
      && !reload_completed
      && (REG_P (SET_SRC (body))                  <= this part
          || GET_CODE (SET_SRC (body)) == SUBREG
          || CONSTANT_P (SET_SRC (body)))
      && !MEM_VOLATILE_P (mem)
      /* Sometimes the store and reload is used for truncation and
         rounding.  */
      && !(FLOAT_MODE_P (GET_MODE (mem)) && (flag_float_store)))
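To make the effect of that condition concrete, here is a toy model of it in plain C. This is only a sketch: the enum, the struct, and can_keep_rhs are hypothetical names invented for illustration, not anything in GCC, and the model assumes the insn body is already known to be a SET. It only shows that a vec_select RHS fails the test that a register, subreg, or constant RHS passes.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the RTX code of a store's right-hand side.
   These are illustration-only names, not GCC's.  */
enum rhs_kind { RHS_REG, RHS_SUBREG, RHS_CONSTANT, RHS_VEC_SELECT };

/* Toy description of one store insn; the fields mirror the terms of the
   quoted condition and are assumptions of this sketch.  */
struct store_model
{
  enum rhs_kind rhs;       /* what SET_SRC (body) looks like */
  bool reload_completed;   /* register allocation has already run */
  bool mem_volatile;       /* MEM_VOLATILE_P (mem) */
  bool float_mode;         /* FLOAT_MODE_P (GET_MODE (mem)) */
  bool flag_float_store;   /* -ffloat-store in effect */
};

/* Returns true when the stored value could be remembered and used to
   replace later loads, following the shape of the quoted condition.  */
bool
can_keep_rhs (const struct store_model *s)
{
  return !s->reload_completed
         && (s->rhs == RHS_REG
             || s->rhs == RHS_SUBREG
             || s->rhs == RHS_CONSTANT)
         && !s->mem_volatile
         && !(s->float_mode && s->flag_float_store);
}
```

With every other field false, a store whose rhs is RHS_REG is accepted, while the same store with RHS_VEC_SELECT (the swapping-store case) is rejected.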
We can circumvent this if we can use stvx to force the parameters to the stack,
which is legal since the stack slots are properly aligned.
However, even using -mcpu=power9, we don't handle removing the stores and
replacing the partial loads with register logic.
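For readers without the bug's attachment at hand, the general shape of code involved might look like the following. This is a hypothetical sketch, not the test case from the bug: struct hva and first_plus_last are made-up names, and it uses the GCC generic vector extension rather than altivec.h so that it compiles on any target. The function takes a homogeneous vector aggregate by value and reads only single elements from it, which is the "parameters forced to the stack, then partially loaded" pattern discussed above.

```c
/* GCC/Clang generic vector extension: four floats in one 16-byte vector.  */
typedef float v4sf __attribute__ ((vector_size (16)));

/* Hypothetical homogeneous vector aggregate: every member has the same
   vector type, so it is the kind of struct the bug title describes as
   being passed in registers on powerpc64.  */
struct hva
{
  v4sf a;
  v4sf b;
};

/* Reads one element from each member.  If the argument is first stored
   to the stack, these element reads become partial loads from those
   stack slots.  noinline keeps the parameter passing visible.  */
__attribute__ ((noinline)) float
first_plus_last (struct hva x)
{
  return x.a[0] + x.b[3];
}
```

Compiling something of this shape with -O2 -S and comparing -mcpu=power8 against -mcpu=power9 is one way to look at the stores and partial loads in question; the exact output will depend on the compiler version and ABI.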