Bug 84361 - Fails to use vfmaddsub* for complex multiplication
Status: UNCONFIRMED
Alias: None
Product: gcc
Classification: Unclassified
Component: target
Version: 8.0
Importance: P3 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords: missed-optimization
Depends on:
Blocks: vectorizer
Reported: 2018-02-13 12:29 UTC by Richard Biener
Modified: 2018-02-17 19:47 UTC (History)

See Also:
Host:
Target: x86_64-*-*, i?86-*-*
Build:
Known to work:
Known to fail:
Last reconfirmed:


Description Richard Biener 2018-02-13 12:29:51 UTC
I see

        vfmadd132ps     %ymm12, %ymm8, %ymm2
        vfmsub132ps     %ymm12, %ymm8, %ymm7
        vblendps        $170, %ymm2, %ymm7, %ymm7

generated from

  _298 = -vect__174.663_871;
  vect__38.664_872 = vect__173.659_831 * vect__178.660_844 + _298;
  vect__38.665_873 = vect__173.659_831 * vect__178.660_844 + vect__174.663_871;
  _874 = VEC_PERM_EXPR <vect__38.664_872, vect__38.665_873, { 0, 9, 2, 11, 4, 13, 6, 15 }>;

which is similar to the addsub cases we already handle.  The combine pass sees

(insn 391 390 392 21 (set (reg:V8SF 845 [ vect__38.664 ])
        (fma:V8SF (reg:V8SF 440 [ vect__173.659 ])
            (reg:V8SF 445 [ vect__178.660 ])
            (neg:V8SF (reg:V8SF 457 [ vect__174.663 ])))) 1886 {*fma_fmsub_v8sf}
     (nil))
(insn 392 391 393 21 (set (reg:V8SF 846 [ vect__38.665 ])
        (fma:V8SF (reg:V8SF 440 [ vect__173.659 ])
            (reg:V8SF 445 [ vect__178.660 ])
            (reg:V8SF 457 [ vect__174.663 ]))) 1842 {*fma_fmadd_v8sf}
     (expr_list:REG_DEAD (reg:V8SF 457 [ vect__174.663 ])
        (expr_list:REG_DEAD (reg:V8SF 445 [ vect__178.660 ])
            (expr_list:REG_DEAD (reg:V8SF 440 [ vect__173.659 ])
                (nil)))))
(insn 393 392 394 21 (set (reg:V8SF 460 [ _874 ])
        (vec_merge:V8SF (reg:V8SF 846 [ vect__38.665 ])
            (reg:V8SF 845 [ vect__38.664 ])
            (const_int 170 [0xaa]))) 3885 {avx_blendps256}
     (expr_list:REG_DEAD (reg:V8SF 846 [ vect__38.665 ])
        (expr_list:REG_DEAD (reg:V8SF 845 [ vect__38.664 ])
            (nil))))
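
To make the lane selection concrete, here is a minimal scalar sketch in C of what the three insns compute, compared against the lane semantics of a single vfmaddsub*ps (subtract in even lanes, add in odd lanes). The helper names vec_merge_model, fmaddsub_model and blend_matches_fmaddsub are hypothetical and exist only for this illustration; nothing here is GCC code.

```c
#include <stddef.h>

#define LANES 8

/* Model of RTL vec_merge: a set bit i in mask picks lane i from `a`,
   a clear bit picks it from `b`. */
static void vec_merge_model(const float *a, const float *b,
                            unsigned mask, float *out)
{
    for (size_t i = 0; i < LANES; i++)
        out[i] = ((mask >> i) & 1u) ? a[i] : b[i];
}

/* Lane semantics of vfmaddsub*ps: subtract in even lanes, add in odd lanes. */
static void fmaddsub_model(const float *x, const float *y,
                           const float *z, float *out)
{
    for (size_t i = 0; i < LANES; i++)
        out[i] = (i & 1) ? x[i] * y[i] + z[i] : x[i] * y[i] - z[i];
}

/* Returns 1 if fmsub + fmadd + blend(0xaa) equals the single-op model. */
int blend_matches_fmaddsub(void)
{
    /* Small integers keep every product and sum exact in float. */
    const float x[LANES] = {1, 2, 3, 4, 5, 6, 7, 8};
    const float y[LANES] = {2, 3, 4, 5, 6, 7, 8, 9};
    const float z[LANES] = {1, 1, 2, 2, 3, 3, 4, 4};
    float fmsub[LANES], fmadd[LANES], blended[LANES], fused[LANES];

    for (size_t i = 0; i < LANES; i++) {
        fmsub[i] = x[i] * y[i] - z[i];   /* models insn 391 */
        fmadd[i] = x[i] * y[i] + z[i];   /* models insn 392 */
    }
    vec_merge_model(fmadd, fmsub, 0xaa, blended);  /* models insn 393 */
    fmaddsub_model(x, y, z, fused);

    for (size_t i = 0; i < LANES; i++)
        if (blended[i] != fused[i])
            return 0;
    return 1;
}
```

Mask 0xaa has the odd bits set, so the odd lanes come from the fmadd result and the even lanes from the fmsub result, which is the same lane split that avx_addsubv8sf3 expresses with the operands swapped and mask 85.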

I can find <avx512>_fmaddsub_<mode>_mask<round_name>, which looks like
a pattern for AVX512, but I cannot find one for the 256-bit AVX case.  The non-FMA
patterns look like

(define_insn "avx_addsubv8sf3"
  [(set (match_operand:V8SF 0 "register_operand" "=x")
        (vec_merge:V8SF
          (minus:V8SF
            (match_operand:V8SF 1 "register_operand" "x")
            (match_operand:V8SF 2 "nonimmediate_operand" "xm"))
          (plus:V8SF (match_dup 1) (match_dup 2))
          (const_int 85)))]
  "TARGET_AVX"
  "vaddsubps\t{%2, %1, %0|%0, %1, %2}"


This occurs in the hot loop of fourir in the Polyhedron capacita benchmark.  If you
build with -Ofast -march=core-avx2 -fno-vect-cost-model you should see the above.
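
For readers without the Polyhedron sources, a hypothetical reduced kernel of the same shape is an element-wise complex multiply over interleaved (re, im) floats. This is only a sketch: the function name cmul and the data layout are assumptions, it is not taken from capacita, and it has not been checked that GCC emits the exact sequence above for it.

```c
#include <stddef.h>

/* Hypothetical reduced kernel (not from capacita): out = a * b for n
   complex numbers stored as interleaved (re, im) float pairs. */
void cmul(float *restrict out, const float *restrict a,
          const float *restrict b, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        float ar = a[2 * i], ai = a[2 * i + 1];
        float br = b[2 * i], bi = b[2 * i + 1];
        out[2 * i]     = ar * br - ai * bi;  /* even lane: multiply, then subtract */
        out[2 * i + 1] = ar * bi + ai * br;  /* odd lane: multiply, then add */
    }
}
```

Once vectorized, the even lanes need a multiply-subtract and the odd lanes a multiply-add with the same multiplicands, which is the shape a single vfmaddsub*ps could cover.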
Comment 1 Richard Biener 2018-02-13 12:53:52 UTC
Note the fma variants have addsub and subadd variants as well.
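
As a scalar illustration of the two variants mentioned here, the sketch below models the lane semantics of vfmaddsub*ps (subtract in even lanes, add in odd lanes) and vfmsubadd*ps (add in even lanes, subtract in odd lanes), and checks that the second equals the first with the addend negated. All function names are hypothetical and serve only this example.

```c
#include <stddef.h>

#define NLANES 8

/* vfmaddsub*ps lane model: even lanes x*y - z, odd lanes x*y + z. */
void fmaddsub_lanes(const float *x, const float *y,
                    const float *z, float *out)
{
    for (size_t i = 0; i < NLANES; i++)
        out[i] = (i & 1) ? x[i] * y[i] + z[i] : x[i] * y[i] - z[i];
}

/* vfmsubadd*ps lane model: even lanes x*y + z, odd lanes x*y - z. */
void fmsubadd_lanes(const float *x, const float *y,
                    const float *z, float *out)
{
    for (size_t i = 0; i < NLANES; i++)
        out[i] = (i & 1) ? x[i] * y[i] - z[i] : x[i] * y[i] + z[i];
}

/* Returns 1 if fmsubadd(x, y, z) == fmaddsub(x, y, -z) in every lane. */
int subadd_is_negated_addsub(void)
{
    /* Small integers keep every product and sum exact in float. */
    const float x[NLANES] = {1, 2, 3, 4, 5, 6, 7, 8};
    const float y[NLANES] = {2, 3, 4, 5, 6, 7, 8, 9};
    const float z[NLANES] = {1, 1, 2, 2, 3, 3, 4, 4};
    float neg_z[NLANES], subadd[NLANES], addsub_neg[NLANES];

    for (size_t i = 0; i < NLANES; i++)
        neg_z[i] = -z[i];

    fmsubadd_lanes(x, y, z, subadd);
    fmaddsub_lanes(x, y, neg_z, addsub_neg);

    for (size_t i = 0; i < NLANES; i++)
        if (subadd[i] != addsub_neg[i])
            return 0;
    return 1;
}
```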
Comment 2 Marc Glisse 2018-02-17 19:47:49 UTC
Related to one part of bug 81904.