[Bug target/72863] Powerpc64le: redundant swaps when using vec_vsx_ld/st

wschmidt at gcc dot gnu.org
Wed Aug 10 20:20:00 GMT 2016


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=72863

--- Comment #3 from Bill Schmidt <wschmidt at gcc dot gnu.org> ---
This is a phase-ordering issue involving the expanders for the built-ins.  In
vsx.md:

;; Explicit  load/store expanders for the builtin functions
(define_expand "vsx_load_<mode>"
  [(set (match_operand:VSX_M 0 "vsx_register_operand" "")
        (match_operand:VSX_M 1 "memory_operand" ""))]
  "VECTOR_MEM_VSX_P (<MODE>mode)"
  "")

(define_expand "vsx_store_<mode>"
  [(set (match_operand:VSX_M 0 "memory_operand" "")
        (match_operand:VSX_M 1 "vsx_register_operand" ""))]
  "VECTOR_MEM_VSX_P (<MODE>mode)"
  "")

Because these expanders have empty bodies, the expansion into swaps is delayed
until the next split phase rather than happening at expand time.  Since the
swap optimization pass runs immediately after expand, the swaps appear too
late to be optimized.
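For concreteness, here is a small example of source code that takes this path
(function name is mine; assume powerpc64le with VSX enabled, e.g.
-mcpu=power8):

#include <altivec.h>

vector double
load_via_builtin (const vector double *p)
{
  /* vec_vsx_ld routes through the vsx_load_<mode> expander above, so on
     little endian the element-reversing load and its fixup swap do not
     appear in the RTL until split time.  */
  return vec_vsx_ld (0, p);
}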

A normal assignment, on the other hand, goes through the mov expander in
vector.md, which takes us here:

  if (!BYTES_BIG_ENDIAN
      && VECTOR_MEM_VSX_P (<MODE>mode)
      && !TARGET_P9_VECTOR
      && !gpr_or_gpr_p (operands[0], operands[1])
      && (memory_operand (operands[0], <MODE>mode)
          ^ memory_operand (operands[1], <MODE>mode)))
    {
      rs6000_emit_le_vsx_move (operands[0], operands[1], <MODE>mode);
      DONE;
    }

thus generating the permuting load/store with the explicit register permute
at expand time, where the swap optimization pass can see it.
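The contrasting case, again as a sketch (function name mine):

vector double
load_via_assignment (const vector double *p)
{
  /* A plain vector copy goes through the mov<mode> expander in vector.md,
     so rs6000_emit_le_vsx_move makes the swap explicit in the RTL right at
     expand time, in time for the swap optimization to clean it up.  */
  return *p;
}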

We should be able to add similar logic to the built-in expanders so that the
swaps show up in time to be optimized; a sketch follows.
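For vsx_load_<mode>, one plausible shape for that change (a sketch only, not
a tested patch; it carries over the !TARGET_P9_VECTOR guard from the mov
expander on the assumption that it applies here as well, and vsx_store_<mode>
would be changed the same way):

(define_expand "vsx_load_<mode>"
  [(set (match_operand:VSX_M 0 "vsx_register_operand" "")
        (match_operand:VSX_M 1 "memory_operand" ""))]
  "VECTOR_MEM_VSX_P (<MODE>mode)"
{
  /* Expand to swaps here, before the swap optimization runs, rather
     than leaving it to the later split phase.  */
  if (!BYTES_BIG_ENDIAN && !TARGET_P9_VECTOR)
    {
      rs6000_emit_le_vsx_move (operands[0], operands[1], <MODE>mode);
      DONE;
    }
})

Since operand 0 is always a register and operand 1 always memory in this
pattern, the gpr_or_gpr_p and memory_operand checks from the mov expander
should be unnecessary here.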

