Bug 84361

Summary: Fails to use vfmaddsub* for complex multiplication
Product: gcc Reporter: Richard Biener <rguenth>
Component: target Assignee: Not yet assigned to anyone <unassigned>
Status: RESOLVED DUPLICATE    
Severity: normal CC: crazylht, dimhen, hjl.tools, jakub, kyukhin
Priority: P3 Keywords: missed-optimization
Version: 8.0   
Target Milestone: ---   
See Also: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81904
Host: Target: x86_64-*-*, i?86-*-*
Build: Known to work:
Known to fail: Last reconfirmed:
Bug Depends on: 81904    
Bug Blocks: 53947, 54939    

Description Richard Biener 2018-02-13 12:29:51 UTC
I see

        vfmadd132ps     %ymm12, %ymm8, %ymm2
        vfmsub132ps     %ymm12, %ymm8, %ymm7
        vblendps        $170, %ymm2, %ymm7, %ymm7

generated from

  _298 = -vect__174.663_871;
  vect__38.664_872 = vect__173.659_831 * vect__178.660_844 + _298;
  vect__38.665_873 = vect__173.659_831 * vect__178.660_844 + vect__174.663_871;
  _874 = VEC_PERM_EXPR <vect__38.664_872, vect__38.665_873, { 0, 9, 2, 11, 4, 13, 6, 15 }>;

which is similar to the addsub cases we already handle.  combine sees

(insn 391 390 392 21 (set (reg:V8SF 845 [ vect__38.664 ])
        (fma:V8SF (reg:V8SF 440 [ vect__173.659 ])
            (reg:V8SF 445 [ vect__178.660 ])
            (neg:V8SF (reg:V8SF 457 [ vect__174.663 ])))) 1886 {*fma_fmsub_v8sf}
     (nil))
(insn 392 391 393 21 (set (reg:V8SF 846 [ vect__38.665 ])
        (fma:V8SF (reg:V8SF 440 [ vect__173.659 ])
            (reg:V8SF 445 [ vect__178.660 ])
            (reg:V8SF 457 [ vect__174.663 ]))) 1842 {*fma_fmadd_v8sf}
     (expr_list:REG_DEAD (reg:V8SF 457 [ vect__174.663 ])
        (expr_list:REG_DEAD (reg:V8SF 445 [ vect__178.660 ])
            (expr_list:REG_DEAD (reg:V8SF 440 [ vect__173.659 ])
                (nil)))))
(insn 393 392 394 21 (set (reg:V8SF 460 [ _874 ])
        (vec_merge:V8SF (reg:V8SF 846 [ vect__38.665 ])
            (reg:V8SF 845 [ vect__38.664 ])
            (const_int 170 [0xaa]))) 3885 {avx_blendps256}
     (expr_list:REG_DEAD (reg:V8SF 846 [ vect__38.665 ])
        (expr_list:REG_DEAD (reg:V8SF 845 [ vect__38.664 ])
            (nil))))
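
As a rough illustration of why the three insns above are equivalent to a single fmaddsub, here is a scalar sketch in C. It is not GCC code: the function names are made up for this example, and plain `a * b + c` stands in for the fused operation, so the single rounding of a real FMA is not modelled. It only checks lane selection: with mask 0xaa the blend takes the odd lanes from the fmadd result and the even lanes from the fmsub result, which is the lane layout of vfmaddsub*ps.

```c
#include <assert.h>

#define LANES 8

/* Models insns 391-393: fmsub and fmadd on every lane, then a blend in
   which bit i of the mask selects lane i from the fmadd result. */
static void fma_pair_then_blend(const float *a, const float *b,
                                const float *c, unsigned mask, float *out)
{
    for (int i = 0; i < LANES; i++) {
        float sub = a[i] * b[i] - c[i];  /* stands in for *fma_fmsub_v8sf */
        float add = a[i] * b[i] + c[i];  /* stands in for *fma_fmadd_v8sf */
        out[i] = ((mask >> i) & 1u) ? add : sub;
    }
}

/* Models one vfmaddsub*ps: subtract in even lanes, add in odd lanes. */
static void fmaddsub_model(const float *a, const float *b,
                           const float *c, float *out)
{
    for (int i = 0; i < LANES; i++)
        out[i] = (i & 1) ? a[i] * b[i] + c[i] : a[i] * b[i] - c[i];
}

/* Asserts that both models agree on all lanes for mask 0xaa (170). */
static void check_blend_matches_fmaddsub(void)
{
    const float a[LANES] = {1, 2, 3, 4, 5, 6, 7, 8};
    const float b[LANES] = {2, 2, 2, 2, 2, 2, 2, 2};
    const float c[LANES] = {1, 1, 1, 1, 1, 1, 1, 1};
    float blended[LANES], fused[LANES];

    fma_pair_then_blend(a, b, c, 0xaau, blended);
    fmaddsub_model(a, b, c, fused);
    for (int i = 0; i < LANES; i++)
        assert(blended[i] == fused[i]);
}
```

Calling `check_blend_matches_fmaddsub()` from a `main` exercises the comparison; the inputs are small integers so every intermediate value is exact in float.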

I can find <avx512>_fmaddsub_<mode>_mask<round_name>, which looks like
a pattern for AVX512, but the AVX256 case seems to be missing?  The non-fma
patterns look like

(define_insn "avx_addsubv8sf3"
  [(set (match_operand:V8SF 0 "register_operand" "=x")
        (vec_merge:V8SF
          (minus:V8SF
            (match_operand:V8SF 1 "register_operand" "x")
            (match_operand:V8SF 2 "nonimmediate_operand" "xm"))
          (plus:V8SF (match_dup 1) (match_dup 2))
          (const_int 85)))]
  "TARGET_AVX"
  "vaddsubps\t{%2, %1, %0|%0, %1, %2}"


This occurs in polyhedron capacita in the hot loop in fourir.  If you
build with -Ofast -march=core-avx2 -fno-vect-cost-model you should see the above.
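
For readers without the polyhedron sources, a minimal C sketch of the kind of loop involved follows. This is a hypothetical stand-in, not the fourir loop from capacita: the function name `cmul` and the interleaved (re, im) float layout are assumptions made for the example, and it has not been verified that GCC emits exactly the sequence above for it. The point is only that the real part needs a multiply-subtract and the imaginary part a multiply-add, alternating across adjacent lanes.

```c
#include <stddef.h>

/* Element-wise complex multiplication on interleaved (re, im) pairs.
   n is the number of complex elements, so each array holds 2 * n floats. */
void cmul(float *restrict out, const float *restrict a,
          const float *restrict b, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        float ar = a[2 * i], ai = a[2 * i + 1];
        float br = b[2 * i], bi = b[2 * i + 1];

        out[2 * i]     = ar * br - ai * bi;  /* even lane: multiply-subtract */
        out[2 * i + 1] = ar * bi + ai * br;  /* odd lane: multiply-add */
    }
}
```

Compiling this with the flags quoted above and inspecting the assembly is one way to check whether a given compiler version emits the two FMAs plus a blend or a single vfmaddsub.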

Comment 1 Richard Biener 2018-02-13 12:53:52 UTC
Note the fma variants have addsub and subadd variants as well.
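
To make the two variants concrete, here is a small scalar model in C. The helper names are invented for this sketch and unfused arithmetic is used, so FMA rounding is not modelled. It assumes the usual lane convention: fmaddsub subtracts in even lanes and adds in odd lanes, while fmsubadd does the opposite. Multiplying by a complex conjugate is one case where the fmsubadd shape would be the natural fit.

```c
/* One lane of fmaddsub: even lanes compute a*b - c, odd lanes a*b + c. */
static float fmaddsub_lane(int lane, float a, float b, float c)
{
    return (lane & 1) ? a * b + c : a * b - c;
}

/* One lane of fmsubadd: even lanes compute a*b + c, odd lanes a*b - c. */
static float fmsubadd_lane(int lane, float a, float b, float c)
{
    return (lane & 1) ? a * b - c : a * b + c;
}

/* x * conj(y) for one interleaved (re, im) pair:
   re = xr*yr + xi*yi (lane 0), im = xi*yr - xr*yi (lane 1). */
static void cmul_conj_pair(const float x[2], const float y[2], float out[2])
{
    out[0] = fmsubadd_lane(0, x[0], y[0], x[1] * y[1]);
    out[1] = fmsubadd_lane(1, x[1], y[0], x[0] * y[1]);
}
```

With x = 1 + 2i and y = 3 + 4i, `cmul_conj_pair` yields 11 + 2i, which matches (1 + 2i)(3 - 4i) worked out by hand.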

Comment 2 Marc Glisse 2018-02-17 19:47:49 UTC
Related to one part of bug 81904.

Comment 3 Richard Biener 2023-07-21 12:31:42 UTC
Really a duplicate of PR81904 which tracks a little bit more.

*** This bug has been marked as a duplicate of bug 81904 ***