This is the mail archive of the
gcc-bugs@gcc.gnu.org
mailing list for the GCC project.
[Bug target/55295] [SH] Add support for fipr instruction
- From: "olegendo at gcc dot gnu.org" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: Tue, 09 Dec 2014 22:37:07 +0000
- Subject: [Bug target/55295] [SH] Add support for fipr instruction
- Auto-submitted: auto-generated
- References: <bug-55295-4 at http dot gcc dot gnu dot org/bugzilla/>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55295
--- Comment #10 from Oleg Endo <olegendo at gcc dot gnu.org> ---
(In reply to Oleg Endo from comment #9)
> Created attachment 34213 [details]
> Combine patterns for matching fipr
>
> An updated patch for trunk. As for the redundant fp moves and/or ferries
> through fpul, those seem to be caused by the lack of various vec_* patterns.
> See also PR 13423.
An alternative pattern for the core fipr insn could be:
(define_insn "fipr_compact"
  [(set (match_operand:V4SF 0 "fp_arith_reg_operand" "=f")
        (vec_concat:V4SF
          (vec_concat:V2SF
            (vec_select:SF (match_operand:V4SF 1 "fp_arith_reg_operand" "%0")
                           (parallel [(const_int 0)]))
            (vec_select:SF (match_dup 1) (parallel [(const_int 1)])))
          (vec_concat:V2SF
            (vec_select:SF (match_dup 1) (parallel [(const_int 2)]))
            (plus:SF
              (plus:SF (vec_select:SF (mult:V4SF (match_dup 1)
                                                 (match_operand:V4SF 2
                                                   "fp_arith_reg_operand" "f"))
                                      (parallel [(const_int 0)]))
                       (vec_select:SF (mult:V4SF (match_dup 1) (match_dup 2))
                                      (parallel [(const_int 1)])))
              (plus:SF (vec_select:SF (mult:V4SF (match_dup 1) (match_dup 2))
                                      (parallel [(const_int 2)]))
                       (vec_select:SF (mult:V4SF (match_dup 1) (match_dup 2))
                                      (parallel [(const_int 3)])))))))
   (clobber (reg:SI FPSCR_STAT_REG))
   (use (reg:SI FPSCR_MODES_REG))]
  "TARGET_SH4"
  "fipr %2,%0"
  [(set_attr "type" "fp")
   (set_attr "fp_mode" "single")])
However, I'm not sure whether the register allocator understands this properly.
Matching the fipr insn during combine has other issues, such as constructing a
V4SF register from individual SF values. Before investigating the issue at the
combine level, playing along with the vectorizer seems more promising. For
that, vector load/store patterns need to be added first (PR 13423).
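
For readers less used to RTL, the value that the fipr_compact pattern above describes can be modeled in plain C. This is only a sketch under assumptions: the names v4sf_model, fipr_model and dot4 are hypothetical and are not part of GCC or of any SH header, and the model uses ordinary float arithmetic, so it ignores any rounding differences of the real instruction. It shows that elements 0..2 of the destination vector are kept and element 3 receives the inner product, and it gives the kind of scalar loop the vectorizer would have to recognize.

```c
/* Hypothetical model of the value computed by the fipr_compact pattern.
   The names below are illustrative only, not GCC or SH APIs.  */

typedef struct { float e[4]; } v4sf_model;

/* Models "fipr %2,%0": elements 0..2 of the destination are kept,
   element 3 is replaced by the inner product of both vectors.
   The sum is grouped like the two nested plus:SF in the pattern.  */
static v4sf_model fipr_model (v4sf_model dst, v4sf_model src)
{
  v4sf_model r = dst;
  r.e[3] = (dst.e[0] * src.e[0] + dst.e[1] * src.e[1])
           + (dst.e[2] * src.e[2] + dst.e[3] * src.e[3]);
  return r;
}

/* The kind of scalar source a vectorizer-based approach would start from.  */
static float dot4 (const float *a, const float *b)
{
  float sum = 0.0f;
  for (int i = 0; i < 4; i++)
    sum += a[i] * b[i];
  return sum;
}
```

Note that dot4 accumulates left to right while the pattern adds two pairs, so without relaxed floating-point semantics the two forms are not guaranteed to give bit-identical results for arbitrary inputs.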