This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.
[Bug target/72863] Powerpc64le: redundant swaps when using vec_vsx_ld/st
- From: "wschmidt at gcc dot gnu.org" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: Wed, 10 Aug 2016 20:20:14 +0000
- Subject: [Bug target/72863] Powerpc64le: redundant swaps when using vec_vsx_ld/st
- Auto-submitted: auto-generated
- References: <bug-72863-4@http.gcc.gnu.org/bugzilla/>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=72863
--- Comment #3 from Bill Schmidt <wschmidt at gcc dot gnu.org> ---
This is a phase-ordering issue involving the expanders for the built-ins. In
vsx.md:
;; Explicit load/store expanders for the builtin functions
(define_expand "vsx_load_<mode>"
  [(set (match_operand:VSX_M 0 "vsx_register_operand" "")
        (match_operand:VSX_M 1 "memory_operand" ""))]
  "VECTOR_MEM_VSX_P (<MODE>mode)"
  "")

(define_expand "vsx_store_<mode>"
  [(set (match_operand:VSX_M 0 "memory_operand" "")
        (match_operand:VSX_M 1 "vsx_register_operand" ""))]
  "VECTOR_MEM_VSX_P (<MODE>mode)"
  "")
This delays introducing the swaps until the split pass that follows expand,
instead of emitting them right at expand time. Since the swap optimization
runs immediately after expand, the swaps show up too late to be optimized.
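For context, a minimal reproducer along these lines exercises those expanders (the function name is illustrative, not taken from the bug report; compile for powerpc64le with VSX enabled, e.g. -mcpu=power8 -O2):

```c
#include <altivec.h>

/* Hypothetical reproducer: each vec_vsx_ld / vec_vsx_st goes through
   the vsx_load_<mode> / vsx_store_<mode> expanders above, so on
   little-endian the swaps they imply only materialize at the split
   pass after expand -- after the swap optimization pass has already
   run, leaving a pair of swaps that cancel but are not removed.  */
void
copy (vector double *dst, vector double *src)
{
  vec_vsx_st (vec_vsx_ld (0, src), 0, dst);
}
```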
A normal assignment, on the other hand, goes through the mov expander in
vector.md, which takes us here:
  if (!BYTES_BIG_ENDIAN
      && VECTOR_MEM_VSX_P (<MODE>mode)
      && !TARGET_P9_VECTOR
      && !gpr_or_gpr_p (operands[0], operands[1])
      && (memory_operand (operands[0], <MODE>mode)
          ^ memory_operand (operands[1], <MODE>mode)))
    {
      rs6000_emit_le_vsx_move (operands[0], operands[1], <MODE>mode);
      DONE;
    }
thus generating the permuting load/store with the register permute.
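By contrast, a plain vector assignment like the following (hypothetical function name) takes that mov-expander path, so the explicit swap exists in the RTL at expand time and is visible to the swap optimization pass:

```c
#include <altivec.h>

/* Plain assignment: handled by the mov<mode> expander in vector.md,
   which calls rs6000_emit_le_vsx_move during expand, so the permuting
   load/store and its register permute are emitted in time for the
   swap optimization pass to clean them up.  */
void
copy_assign (vector double *dst, vector double *src)
{
  *dst = *src;
}
```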
We should be able to add similar logic to the built-in expanders so that the
swaps show up in time to be optimized.