This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH] Improve other 13 define_insns
- From: Kirill Yukhin <kirill dot yukhin at gmail dot com>
- To: Jakub Jelinek <jakub at redhat dot com>
- Cc: Uros Bizjak <ubizjak at gmail dot com>, gcc-patches at gcc dot gnu dot org
- Date: Fri, 13 May 2016 14:46:36 +0300
- Subject: Re: [PATCH] Improve other 13 define_insns
- Authentication-results: sourceware.org; auth=none
- References: <20160504194302 dot GO26501 at tucnak dot zalov dot cz> <20160512141959 dot GC28945 at msticlxl57 dot ims dot intel dot com> <20160512184208 dot GF28550 at tucnak dot redhat dot com>
Hello,
On 12 May 20:42, Jakub Jelinek wrote:
> On Thu, May 12, 2016 at 05:20:02PM +0300, Kirill Yukhin wrote:
> > > 2016-05-04 Jakub Jelinek <jakub@redhat.com>
> > >
> > > * config/i386/sse.md (sse_shufps_<mode>, sse_storehps, sse_loadhps,
> > > sse_storelps, sse_movss, avx2_vec_dup<mode>, avx2_vec_dupv8sf_1,
> > > sse2_shufpd_<mode>, sse2_storehpd, sse2_storelpd, sse2_loadhpd,
> > > sse2_loadlpd, sse2_movsd): Use v instead of x in vex or maybe_vex
> > > alternatives, use maybe_evex instead of vex in prefix.
> > >
> > > ;; Avoid combining registers from different units in a single alternative,
> > > ;; see comment above inline_secondary_memory_needed function in i386.c
> > > (define_insn "sse2_storehpd"
> > > - [(set (match_operand:DF 0 "nonimmediate_operand" "=m,x,x,x,*f,r")
> > > + [(set (match_operand:DF 0 "nonimmediate_operand" "=m,x,v,x,*f,r")
> > > (vec_select:DF
> > > - (match_operand:V2DF 1 "nonimmediate_operand" " x,0,x,o,o,o")
> > > + (match_operand:V2DF 1 "nonimmediate_operand" " v,0,v,o,o,o")
> > Same (as [1]) here.
> > Testing this fix:
> > @@ -8426,7 +8426,7 @@
> > ;; Avoid combining registers from different units in a single alternative,
> > ;; see comment above inline_secondary_memory_needed function in i386.c
> > (define_insn "sse2_storehpd"
> > - [(set (match_operand:DF 0 "nonimmediate_operand" "=m,x,v,x,*f,r")
> > + [(set (match_operand:DF 0 "nonimmediate_operand" "=m,x,Yv,x,*f,r")
> > (vec_select:DF
> > (match_operand:V2DF 1 "nonimmediate_operand" " v,0,v,o,o,o")
> >
> >
>
> Sorry for that, yes, this is needed.
NP. Below is the patch which fixes both issues.
It also revealed, that for "*and" pattern 32 byte long
internal buffer is not enough.
I've extended bunch of such buffers to 128 bytes.
Probably we might want to re-factor all static char arrays w/
std::string or at least check how many bytes snprintf actually prints
and ICE if overflow.
Bootstrapped & regtested on x86, 32/64b. Will check into main trunk.
gcc/
* gcc/config/i386/sse.md (define_insn "*andnot<mode>3"): Extend static
array to 128 chars.
(define_insn "*andnottf3"): Ditto.
(define_insn "*<code><mode>3"/any_logic): Ditto.
(define_insn "*<code>tf3"/any_logic): Ditto.
(define_insn "*vec_concatv2sf_sse4_1"): Use Yv constraint for scalar
operand to block AVX-512VL insn variant emit when it is not enabled.
(define_insn "sse2_storehpd"): Ditto.
--
Thanks, K
>
> Jakub
commit fa17bc4026602e7bc79fd818ab97c991864c2f7c
Author: Kirill Yukhin <kirill.yukhin@intel.com>
Date: Fri May 13 12:38:50 2016 +0300
AVX-512. Fix constraints for scalar patterns. Extend 32b static buffers.
diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index d77227a..1ece86f 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -3027,7 +3027,7 @@
(match_operand:MODEF 2 "register_operand" "x,x,v,v")))]
"SSE_FLOAT_MODE_P (<MODE>mode)"
{
- static char buf[32];
+ static char buf[128];
const char *ops;
const char *suffix
= (get_attr_mode (insn) == MODE_V4SF) ? "ps" : "<ssevecmodesuffix>";
@@ -3094,7 +3094,7 @@
(match_operand:TF 2 "vector_operand" "xBm,xm,vm,v")))]
"TARGET_SSE"
{
- static char buf[32];
+ static char buf[128];
const char *ops;
const char *tmp
= (which_alternative >= 2 ? "pandnq"
@@ -3150,7 +3150,7 @@
(match_operand:MODEF 2 "register_operand" "x,x,v,v")))]
"SSE_FLOAT_MODE_P (<MODE>mode)"
{
- static char buf[32];
+ static char buf[128];
const char *ops;
const char *suffix
= (get_attr_mode (insn) == MODE_V4SF) ? "ps" : "<ssevecmodesuffix>";
@@ -3225,7 +3225,7 @@
"TARGET_SSE
&& ix86_binary_operator_ok (<CODE>, TFmode, operands)"
{
- static char buf[32];
+ static char buf[128];
const char *ops;
const char *tmp
= (which_alternative >= 2 ? "p<logic>q"
@@ -6546,12 +6546,12 @@
;; unpcklps with register source since it is shorter.
(define_insn "*vec_concatv2sf_sse4_1"
[(set (match_operand:V2SF 0 "register_operand"
- "=Yr,*x,v,Yr,*x,v,v,*y ,*y")
+ "=Yr,*x, v,Yr,*x,v,v,*y ,*y")
(vec_concat:V2SF
(match_operand:SF 1 "nonimmediate_operand"
- " 0, 0,v, 0,0, v,m, 0 , m")
+ " 0, 0,Yv, 0,0, v,m, 0 , m")
(match_operand:SF 2 "vector_move_operand"
- " Yr,*x,v, m,m, m,C,*ym, C")))]
+ " Yr,*x,Yv, m,m, m,C,*ym, C")))]
"TARGET_SSE4_1 && !(MEM_P (operands[1]) && MEM_P (operands[2]))"
"@
unpcklps\t{%2, %0|%0, %2}
@@ -8426,9 +8426,9 @@
;; Avoid combining registers from different units in a single alternative,
;; see comment above inline_secondary_memory_needed function in i386.c
(define_insn "sse2_storehpd"
- [(set (match_operand:DF 0 "nonimmediate_operand" "=m,x,v,x,*f,r")
+ [(set (match_operand:DF 0 "nonimmediate_operand" "=m,x,Yv,x,*f,r")
(vec_select:DF
- (match_operand:V2DF 1 "nonimmediate_operand" " v,0,v,o,o,o")
+ (match_operand:V2DF 1 "nonimmediate_operand" " v,0, v,o,o,o")
(parallel [(const_int 1)])))]
"TARGET_SSE2 && !(MEM_P (operands[0]) && MEM_P (operands[1]))"
"@