This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.
[Bug target/84361] New: Fails to use vfmaddsub* for complex multiplication
- From: "rguenth at gcc dot gnu.org" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: Tue, 13 Feb 2018 12:29:51 +0000
- Subject: [Bug target/84361] New: Fails to use vfmaddsub* for complex multiplication
- Auto-submitted: auto-generated
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84361
Bug ID: 84361
Summary: Fails to use vfmaddsub* for complex multiplication
Product: gcc
Version: 8.0
Status: UNCONFIRMED
Keywords: missed-optimization
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: rguenth at gcc dot gnu.org
Blocks: 53947
Target Milestone: ---
Target: x86_64-*-*, i?86-*-*
I see
vfmadd132ps %ymm12, %ymm8, %ymm2
vfmsub132ps %ymm12, %ymm8, %ymm7
vblendps $170, %ymm2, %ymm7, %ymm7
generated from
_298 = -vect__174.663_871;
vect__38.664_872 = vect__173.659_831 * vect__178.660_844 + _298;
vect__38.665_873 = vect__173.659_831 * vect__178.660_844 + vect__174.663_871;
_874 = VEC_PERM_EXPR <vect__38.664_872, vect__38.665_873, { 0, 9, 2, 11, 4, 13, 6, 15 }>;
which is similar to the addsub cases we already handle. combine sees
(insn 391 390 392 21 (set (reg:V8SF 845 [ vect__38.664 ])
(fma:V8SF (reg:V8SF 440 [ vect__173.659 ])
(reg:V8SF 445 [ vect__178.660 ])
(neg:V8SF (reg:V8SF 457 [ vect__174.663 ])))) 1886
{*fma_fmsub_v8sf}
(nil))
(insn 392 391 393 21 (set (reg:V8SF 846 [ vect__38.665 ])
(fma:V8SF (reg:V8SF 440 [ vect__173.659 ])
(reg:V8SF 445 [ vect__178.660 ])
(reg:V8SF 457 [ vect__174.663 ]))) 1842 {*fma_fmadd_v8sf}
(expr_list:REG_DEAD (reg:V8SF 457 [ vect__174.663 ])
(expr_list:REG_DEAD (reg:V8SF 445 [ vect__178.660 ])
(expr_list:REG_DEAD (reg:V8SF 440 [ vect__173.659 ])
(nil)))))
(insn 393 392 394 21 (set (reg:V8SF 460 [ _874 ])
(vec_merge:V8SF (reg:V8SF 846 [ vect__38.665 ])
(reg:V8SF 845 [ vect__38.664 ])
(const_int 170 [0xaa]))) 3885 {avx_blendps256}
(expr_list:REG_DEAD (reg:V8SF 846 [ vect__38.665 ])
(expr_list:REG_DEAD (reg:V8SF 845 [ vect__38.664 ])
(nil))))
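To make the lane assignment explicit, here is a minimal scalar sketch in C
of what the three insns compute and of what a single vfmaddsub would
compute. The helper names (vec_merge8, fmaddsub8,
check_fmaddsub_equivalence) are made up for illustration and are not GCC
functions. The model assumes the usual vec_merge reading (lane i comes from
the first operand when bit i of the mask is set) and ignores the single
rounding of a real FMA.

```c
#include <assert.h>

#define LANES 8 /* V8SF: eight single-precision lanes */

/* Model of RTL (vec_merge v1 v2 mask): lane i is taken from v1 when
   bit i of mask is set, otherwise from v2. */
void vec_merge8(const float *v1, const float *v2, unsigned mask, float *out)
{
    for (int i = 0; i < LANES; i++)
        out[i] = ((mask >> i) & 1u) ? v1[i] : v2[i];
}

/* Model of one vfmaddsub: a*b - c in even lanes, a*b + c in odd lanes.
   The single rounding of a real FMA is not modelled here. */
void fmaddsub8(const float *a, const float *b, const float *c, float *out)
{
    for (int i = 0; i < LANES; i++)
        out[i] = (i & 1) ? a[i] * b[i] + c[i] : a[i] * b[i] - c[i];
}

/* Checks that fmsub + fmadd + blend matches the single-instruction model. */
void check_fmaddsub_equivalence(void)
{
    float a[LANES], b[LANES], c[LANES];
    float sub[LANES], add[LANES], blend[LANES], swapped[LANES], want[LANES];

    for (int i = 0; i < LANES; i++) {
        a[i] = (float)(i + 1);        /* small integers keep results exact */
        b[i] = 2.0f;
        c[i] = 3.0f;
        sub[i] = a[i] * b[i] - c[i];  /* insn 391, fmsub */
        add[i] = a[i] * b[i] + c[i];  /* insn 392, fmadd */
    }
    vec_merge8(add, sub, 0xaa, blend);   /* insn 393, mask 170 */
    vec_merge8(sub, add, 0x55, swapped); /* operands swapped, mask 85 */
    fmaddsub8(a, b, c, want);

    for (int i = 0; i < LANES; i++) {
        assert(blend[i] == want[i]);
        assert(swapped[i] == want[i]);
    }
}
```

The second vec_merge8 call shows that swapping the two operands and
complementing the mask (170 to 85) selects the same lanes, so a combined
pattern would presumably have to cope with either form.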
I can find <avx512>_fmaddsub_<mode>_mask<round_name>, which looks like
a pattern for AVX512, but I cannot find one for the 256-bit AVX case.
The non-FMA patterns look like
(define_insn "avx_addsubv8sf3"
[(set (match_operand:V8SF 0 "register_operand" "=x")
(vec_merge:V8SF
(minus:V8SF
(match_operand:V8SF 1 "register_operand" "x")
(match_operand:V8SF 2 "nonimmediate_operand" "xm"))
(plus:V8SF (match_dup 1) (match_dup 2))
(const_int 85)))]
"TARGET_AVX"
"vaddsubps\t{%2, %1, %0|%0, %1, %2}"
This occurs in the Polyhedron benchmark capacita, in the hot loop of
fourir. If you build with -Ofast -march=core-avx2 -fno-vect-cost-model,
you should see the code above.
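For readers without the benchmark at hand, the shape of the computation can
be sketched in C as a complex multiplication over interleaved (re, im)
pairs. This is a hypothetical stand-in, not the Fortran loop from fourir,
and it has not been checked that it produces the same code with the flags
above. The function names cmul_interleaved and check_cmul are invented for
this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Multiply n complex numbers stored as interleaved (re, im) float pairs.
   Hypothetical kernel: even output lanes subtract, odd output lanes add,
   which is the lane pattern a vfmaddsub instruction provides. */
void cmul_interleaved(const float *restrict a, const float *restrict b,
                      float *restrict out, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        float ar = a[2 * i], ai = a[2 * i + 1];
        float br = b[2 * i], bi = b[2 * i + 1];
        out[2 * i]     = ar * br - ai * bi; /* real part: multiply, subtract */
        out[2 * i + 1] = ar * bi + ai * br; /* imag part: multiply, add */
    }
}

/* Checks (1 + 2i) * (3 + 4i) = -5 + 10i. */
void check_cmul(void)
{
    const float a[2] = {1.0f, 2.0f};
    const float b[2] = {3.0f, 4.0f};
    float out[2];

    cmul_interleaved(a, b, out, 1);
    assert(out[0] == -5.0f);
    assert(out[1] == 10.0f);
}
```

In vector form, the products ai * bi and ai * br play the role of the
addend that is subtracted in even lanes and added in odd lanes, matching
the two GIMPLE statements that differ only in the sign of the third operand.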
Referenced Bugs:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947
[Bug 53947] [meta-bug] vectorizer missed-optimizations