This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
[PATCH] i386: Generate standard floating point scalar operation patterns
- From: "H.J. Lu" <hjl dot tools at gmail dot com>
- To: GCC Patches <gcc-patches at gcc dot gnu dot org>, Uros Bizjak <ubizjak at gmail dot com>, Marc Glisse <marc dot glisse at inria dot fr>, Richard Sandiford <richard dot sandiford at arm dot com>
- Date: Tue, 21 May 2019 08:54:18 -0700
- Subject: [PATCH] i386: Generate standard floating point scalar operation patterns
- References: <20190207174942.20825-1-hjl.tools@gmail.com> <CAMe9rOpukFLJ1qGv8kxoqaVf0U2Ktn-XOMj8aDrd9JORXa-w6Q@mail.gmail.com> <mpt5zqbe7a6.fsf@arm.com>
On Wed, May 15, 2019 at 2:29 PM Richard Sandiford
<richard.sandiford@arm.com> wrote:
>
> "H.J. Lu" <hjl.tools@gmail.com> writes:
> > On Thu, Feb 7, 2019 at 9:49 AM H.J. Lu <hjl.tools@gmail.com> wrote:
> >>
> >> Standard scalar operation patterns which preserve the rest of the vector
> >> look like
> >>
> > >> (vec_merge:V2DF
> > >>   (vec_duplicate:V2DF
> > >>     (op:DF (vec_select:DF (reg/v:V2DF 85 [ x ])
> > >>                           (parallel [ (const_int 0 [0])]))
> > >>            (reg:DF 87)))
> > >>   (reg/v:V2DF 85 [ x ])
> > >>   (const_int 1 [0x1]))
> > >>
> > >> Add such patterns to the i386 backend and convert VEC_CONCAT patterns
> > >> to standard scalar operation patterns.
>
> It looks like there's some variety in the patterns used, e.g.:
>
> (define_insn "<sse>_vm<code><mode>3<mask_scalar_name><round_saeonly_scalar_name>"
> [(set (match_operand:VF_128 0 "register_operand" "=x,v")
> (vec_merge:VF_128
> (smaxmin:VF_128
> (match_operand:VF_128 1 "register_operand" "0,v")
> (match_operand:VF_128 2 "vector_operand" "xBm,<round_saeonly_scalar_constraint>"))
> (match_dup 1)
> (const_int 1)))]
> "TARGET_SSE"
> "@
> <maxmin_float><ssescalarmodesuffix>\t{%2, %0|%0, %<iptr>2}
> v<maxmin_float><ssescalarmodesuffix>\t{<round_saeonly_scalar_mask_op3>%2, %1, %0<mask_scalar_operand3>|%0<mask_scalar_operand3>, %1, %<iptr>2<round_saeonly_scalar_mask_op3>}"
> [(set_attr "isa" "noavx,avx")
> (set_attr "type" "sse")
> (set_attr "btver2_sse_attr" "maxmin")
> (set_attr "prefix" "<round_saeonly_scalar_prefix>")
> (set_attr "mode" "<ssescalarmode>")])
>
> makes the operand a full vector operation, which seems simpler.
This pattern is used to implement scalar smaxmin intrinsics.
> The above would then be:
>
> (vec_merge:V2DF
>   (op:V2DF
>     (reg:V2DF 85)
>     (vec_duplicate:V2DF (reg:DF 87)))
>   (reg/v:V2DF 85 [ x ])
>   (const_int 1 [0x1]))
>
> I guess technically the two have different faulting behaviour though,
> since the smaxmin gets applied to all elements, not just element 0.
This is the issue. We don't use the correct mode for scalar instructions:
---
#include <immintrin.h>

__m128d
foo1 (__m128d x, double *p)
{
  __m128d y = _mm_load_sd (p);
  return _mm_max_pd (x, y);
}
---
	movq	(%rdi), %xmm1
	maxpd	%xmm1, %xmm0
	ret
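
To make the difference between the two RTL shapes discussed above
concrete, here is a rough C model. It is not GCC code: vec2,
vec_merge_model, scalar_form, vector_form and the op_calls counter are
names made up for illustration, and counting calls only stands in for
"the operation is evaluated on this element", which is where a floating
point exception could be raised.

```c
#include <assert.h>

/* Hypothetical stand-in for a V2DF value.  */
typedef struct { double e[2]; } vec2;

/* Counts how many elements the operation was evaluated on.  */
static int op_calls;

static double
counted_max (double a, double b)
{
  op_calls++;
  return a > b ? a : b;
}

/* Model of vec_merge: element i comes from A when bit i of MASK is
   set, otherwise from B.  */
static vec2
vec_merge_model (vec2 a, vec2 b, unsigned mask)
{
  vec2 r;
  for (int i = 0; i < 2; i++)
    r.e[i] = (mask & (1u << i)) ? a.e[i] : b.e[i];
  return r;
}

/* Scalar op on element 0, then vec_duplicate, then vec_merge with
   mask 1.  The operation touches one element only.  */
static vec2
scalar_form (vec2 x, double s)
{
  double r = counted_max (x.e[0], s);
  vec2 dup = { { r, r } };
  return vec_merge_model (dup, x, 1u);
}

/* Full-vector op against a duplicate of S, then vec_merge with mask 1.
   The operation is evaluated on every element, although only the
   result for element 0 is kept.  */
static vec2
vector_form (vec2 x, double s)
{
  vec2 r;
  for (int i = 0; i < 2; i++)
    r.e[i] = counted_max (x.e[i], s);
  return vec_merge_model (r, x, 1u);
}
```

Both forms return the same vector, but vector_form evaluates the
operation on element 1 as well, which is the faulting-behaviour concern
raised above.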
Here is the updated patch to add standard floating point scalar
operation patterns to i386 backend. Then we can do
---
#include <immintrin.h>

extern __inline __m128d __attribute__((__gnu_inline__,
__always_inline__, __artificial__))
_new_mm_max_pd (__m128d __A, __m128d __B)
{
  __A[0] = __A[0] > __B[0] ? __A[0] : __B[0];
  return __A;
}

__m128d
foo2 (__m128d x, double *p)
{
  __m128d y = _mm_load_sd (p);
  return _new_mm_max_pd (x, y);
}
---
	maxsd	(%rdi), %xmm0
	ret
We should use generic vector operations to implement i386 intrinsics
as much as we can.
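
As a small runnable sketch of what "generic vector operations" means
here, the following uses the GNU C vector extension (accepted by GCC and
Clang) to update element 0 only. The helper names scalar_max_low,
scalar_sub_low and check_low_element_helpers are hypothetical and are
not part of the patch or of any intrinsics header. The sketch only
checks the values; whether a single scalar instruction is emitted
depends on the compiler and options.

```c
#include <assert.h>

/* GNU C vector extension: two doubles in a 16-byte vector.  */
typedef double v2df __attribute__ ((vector_size (16)));

/* Hypothetical helper: element 0 becomes the larger of a[0] and b[0],
   element 1 of A is preserved.  */
static v2df
scalar_max_low (v2df a, v2df b)
{
  a[0] = a[0] > b[0] ? a[0] : b[0];
  return a;
}

/* Hypothetical helper: subtract B from element 0 only.  */
static v2df
scalar_sub_low (v2df a, double b)
{
  a[0] -= b;
  return a;
}

/* Check that only the low element changes.  */
static void
check_low_element_helpers (void)
{
  v2df x = { 1.0, 10.0 };
  v2df y = { 4.0, 99.0 };

  v2df m = scalar_max_low (x, y);
  assert (m[0] == 4.0 && m[1] == 10.0);

  v2df s = scalar_sub_low (x, 0.5);
  assert (s[0] == 0.5 && s[1] == 10.0);
}
```

These mirror the shape of the pr54855-4.c and pr54855-8.c tests in the
patch below, with a runtime check added for the untouched upper element.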
> The patch seems very specific. E.g. why just PLUS, MINUS, MULT and DIV?
This patch only adds +, -, *, /, > and <. We can add more if there
are testcases for them.
> Thanks,
> Richard
>
>
> >>
> >> gcc/
> >>
> >> PR target/54855
> > >> * simplify-rtx.c (simplify_binary_operation_1): Convert
> > >> VEC_CONCAT patterns to standard scalar operation
> > >> patterns.
> >> * config/i386/sse.md (*<sse>_vm<plusminus_insn><mode>3): New.
> >> (*<sse>_vm<multdiv_mnemonic><mode>3): Likewise.
> >>
> >> gcc/testsuite/
> >>
> >> PR target/54855
> >> * gcc.target/i386/pr54855-1.c: New test.
> >> * gcc.target/i386/pr54855-2.c: Likewise.
> >> * gcc.target/i386/pr54855-3.c: Likewise.
> >> * gcc.target/i386/pr54855-4.c: Likewise.
> >> * gcc.target/i386/pr54855-5.c: Likewise.
> >> * gcc.target/i386/pr54855-6.c: Likewise.
> >> * gcc.target/i386/pr54855-7.c: Likewise.
> >
> > PING:
> >
> > https://gcc.gnu.org/ml/gcc-patches/2019-02/msg00398.html
Thanks.
--
H.J.
From 5d91bf264c89541a79ca8f9121264416ce307420 Mon Sep 17 00:00:00 2001
From: "H.J. Lu" <hjl.tools@gmail.com>
Date: Sun, 3 Feb 2019 09:16:23 -0800
Subject: [PATCH] i386: Generate standard floating point scalar operation
patterns
Standard floating point scalar operation patterns for combiner, which
preserve the rest of the vector, look like
(vec_merge:V2DF
  (vec_duplicate:V2DF (reg:DF 87))
  (reg/v:V2DF 85 [ x ])
  (const_int 1 [0x1]))

and

(vec_merge:V2DF
  (vec_duplicate:V2DF
    (op:DF (vec_select:DF (reg/v:V2DF 85 [ x ])
                          (parallel [ (const_int 0 [0])]))
           (reg:DF 87)))
  (reg/v:V2DF 85 [ x ])
  (const_int 1 [0x1]))

This patch adds and generates such standard floating point scalar
operation patterns for +, -, *, /, > and <.
Tested on x86-64.
gcc/
PR target/54855
* config/i386/i386-expand.c (ix86_expand_vector_set): Generate
standard scalar operation pattern for V2DF.
* config/i386/sse.md (*<sse>_vm<plusminus_insn><mode>3): New.
(*<sse>_vm<multdiv_mnemonic><mode>3): Likewise.
(*ieee_<ieee_maxmin><mode>3): Likewise.
(vec_setv2df_0): Likewise.
gcc/testsuite/
PR target/54855
* gcc.target/i386/pr54855-1.c: New test.
* gcc.target/i386/pr54855-2.c: Likewise.
* gcc.target/i386/pr54855-3.c: Likewise.
* gcc.target/i386/pr54855-4.c: Likewise.
* gcc.target/i386/pr54855-5.c: Likewise.
* gcc.target/i386/pr54855-6.c: Likewise.
* gcc.target/i386/pr54855-7.c: Likewise.
* gcc.target/i386/pr54855-8.c: Likewise.
* gcc.target/i386/pr54855-9.c: Likewise.
* gcc.target/i386/pr54855-10.c: Likewise.
---
gcc/config/i386/i386-expand.c | 12 +++
gcc/config/i386/sse.md | 88 ++++++++++++++++++++++
gcc/testsuite/gcc.target/i386/pr54855-1.c | 16 ++++
gcc/testsuite/gcc.target/i386/pr54855-10.c | 13 ++++
gcc/testsuite/gcc.target/i386/pr54855-2.c | 15 ++++
gcc/testsuite/gcc.target/i386/pr54855-3.c | 14 ++++
gcc/testsuite/gcc.target/i386/pr54855-4.c | 14 ++++
gcc/testsuite/gcc.target/i386/pr54855-5.c | 16 ++++
gcc/testsuite/gcc.target/i386/pr54855-6.c | 14 ++++
gcc/testsuite/gcc.target/i386/pr54855-7.c | 14 ++++
gcc/testsuite/gcc.target/i386/pr54855-8.c | 14 ++++
gcc/testsuite/gcc.target/i386/pr54855-9.c | 14 ++++
12 files changed, 244 insertions(+)
create mode 100644 gcc/testsuite/gcc.target/i386/pr54855-1.c
create mode 100644 gcc/testsuite/gcc.target/i386/pr54855-10.c
create mode 100644 gcc/testsuite/gcc.target/i386/pr54855-2.c
create mode 100644 gcc/testsuite/gcc.target/i386/pr54855-3.c
create mode 100644 gcc/testsuite/gcc.target/i386/pr54855-4.c
create mode 100644 gcc/testsuite/gcc.target/i386/pr54855-5.c
create mode 100644 gcc/testsuite/gcc.target/i386/pr54855-6.c
create mode 100644 gcc/testsuite/gcc.target/i386/pr54855-7.c
create mode 100644 gcc/testsuite/gcc.target/i386/pr54855-8.c
create mode 100644 gcc/testsuite/gcc.target/i386/pr54855-9.c
diff --git a/gcc/config/i386/i386-expand.c b/gcc/config/i386/i386-expand.c
index 87e0973e1ca..18ab22cacb0 100644
--- a/gcc/config/i386/i386-expand.c
+++ b/gcc/config/i386/i386-expand.c
@@ -14092,6 +14092,17 @@ ix86_expand_vector_set (bool mmx_ok, rtx target, rtx val, int elt)
return;
case E_V2DFmode:
+ /* NB: For ELT == 0, use standard scalar operation patterns which
+ preserve the rest of the vector for combiner:
+
+ (vec_merge:V2DF
+ (vec_duplicate:V2DF (reg:DF))
+ (reg:V2DF)
+ (const_int 1))
+ */
+ if (elt == 0)
+ goto do_vec_merge;
+
{
rtx op0, op1;
@@ -14389,6 +14400,7 @@ quarter:
}
else if (use_vec_merge)
{
+do_vec_merge:
tmp = gen_rtx_VEC_DUPLICATE (mode, val);
tmp = gen_rtx_VEC_MERGE (mode, tmp, target,
GEN_INT (HOST_WIDE_INT_1U << elt));
diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index 677e7023eb2..f36537dfb3f 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -1802,6 +1802,28 @@
(set_attr "type" "sseadd")
(set_attr "mode" "<MODE>")])
+;; Standard scalar operation patterns which preserve the rest of the
+;; vector for combiner.
+(define_insn "*<sse>_vm<plusminus_insn><mode>3"
+ [(set (match_operand:VF_128 0 "register_operand" "=x,v")
+ (vec_merge:VF_128
+ (vec_duplicate:VF_128
+ (plusminus:<ssescalarmode>
+ (vec_select:<ssescalarmode>
+ (match_operand:VF_128 1 "register_operand" "0,v")
+ (parallel [(const_int 0)]))
+ (match_operand:<ssescalarmode> 2 "nonimmediate_operand" "xm,vm")))
+ (match_dup 1)
+ (const_int 1)))]
+ "TARGET_SSE"
+ "@
+ <plusminus_mnemonic><ssescalarmodesuffix>\t{%2, %0|%0, %<iptr>2}
+ v<plusminus_mnemonic><ssescalarmodesuffix>\t{%2, %1, %0|%0, %1, %<iptr>2}"
+ [(set_attr "isa" "noavx,avx")
+ (set_attr "type" "sseadd")
+ (set_attr "prefix" "orig,vex")
+ (set_attr "mode" "<ssescalarmode>")])
+
(define_insn "<sse>_vm<plusminus_insn><mode>3<mask_scalar_name><round_scalar_name>"
[(set (match_operand:VF_128 0 "register_operand" "=x,v")
(vec_merge:VF_128
@@ -1856,6 +1878,29 @@
(set_attr "type" "ssemul")
(set_attr "mode" "<MODE>")])
+;; Standard scalar operation patterns which preserve the rest of the
+;; vector for combiner.
+(define_insn "*<sse>_vm<multdiv_mnemonic><mode>3"
+ [(set (match_operand:VF_128 0 "register_operand" "=x,v")
+ (vec_merge:VF_128
+ (vec_duplicate:VF_128
+ (multdiv:<ssescalarmode>
+ (vec_select:<ssescalarmode>
+ (match_operand:VF_128 1 "register_operand" "0,v")
+ (parallel [(const_int 0)]))
+ (match_operand:<ssescalarmode> 2 "nonimmediate_operand" "xm,vm")))
+ (match_dup 1)
+ (const_int 1)))]
+ "TARGET_SSE"
+ "@
+ <multdiv_mnemonic><ssescalarmodesuffix>\t{%2, %0|%0, %<iptr>2}
+ v<multdiv_mnemonic><ssescalarmodesuffix>\t{%2, %1, %0|%0, %1, %<iptr>2}"
+ [(set_attr "isa" "noavx,avx")
+ (set_attr "type" "sse<multdiv_mnemonic>")
+ (set_attr "prefix" "orig,vex")
+ (set_attr "btver2_decode" "direct,double")
+ (set_attr "mode" "<ssescalarmode>")])
+
(define_insn "<sse>_vm<multdiv_mnemonic><mode>3<mask_scalar_name><round_scalar_name>"
[(set (match_operand:VF_128 0 "register_operand" "=x,v")
(vec_merge:VF_128
@@ -2205,6 +2250,30 @@
(set_attr "prefix" "<mask_prefix3>")
(set_attr "mode" "<MODE>")])
+;; Standard scalar operation patterns which preserve the rest of the
+;; vector for combiner.
+(define_insn "*ieee_<ieee_maxmin><mode>3"
+ [(set (match_operand:VF_128 0 "register_operand" "=x,v")
+ (vec_merge:VF_128
+ (vec_duplicate:VF_128
+ (unspec:<ssescalarmode>
+ [(vec_select:<ssescalarmode>
+ (match_operand:VF_128 1 "register_operand" "0,v")
+ (parallel [(const_int 0)]))
+ (match_operand:<ssescalarmode> 2 "nonimmediate_operand" "xm,vm")]
+ IEEE_MAXMIN))
+ (match_dup 1)
+ (const_int 1)))]
+ "TARGET_SSE"
+ "@
+ <ieee_maxmin><ssescalarmodesuffix>\t{%2, %0|%0, %2}
+ v<ieee_maxmin><ssescalarmodesuffix>\t{%2, %1, %0|%0, %1, %2}"
+ [(set_attr "isa" "noavx,avx")
+ (set_attr "type" "sseadd")
+ (set_attr "btver2_sse_attr" "maxmin")
+ (set_attr "prefix" "orig,vex")
+ (set_attr "mode" "<ssescalarmode>")])
+
(define_insn "<sse>_vm<code><mode>3<mask_scalar_name><round_saeonly_scalar_name>"
[(set (match_operand:VF_128 0 "register_operand" "=x,v")
(vec_merge:VF_128
@@ -7881,6 +7950,25 @@
[(set (match_dup 0) (match_dup 1))]
"operands[0] = adjust_address (operands[0], <ssescalarmode>mode, 0);")
+;; Standard scalar operation patterns which preserve the rest of the
+;; vector for combiner.
+(define_insn "vec_setv2df_0"
+ [(set (match_operand:V2DF 0 "register_operand" "=x,v,x,v")
+ (vec_merge:V2DF
+ (vec_duplicate:V2DF
+ (match_operand:DF 2 "nonimmediate_operand" " x,v,m,m"))
+ (match_operand:V2DF 1 "register_operand" " 0,v,0,v")
+ (const_int 1)))]
+ "TARGET_SSE2"
+ "@
+ movsd\t{%2, %0|%0, %2}
+ vmovsd\t{%2, %1, %0|%0, %1, %2}
+ movlpd\t{%2, %0|%0, %2}
+ vmovlpd\t{%2, %1, %0|%0, %1, %2}"
+ [(set_attr "isa" "noavx,avx,noavx,avx")
+ (set_attr "type" "ssemov")
+ (set_attr "mode" "DF")])
+
(define_expand "vec_set<mode>"
[(match_operand:V 0 "register_operand")
(match_operand:<ssescalarmode> 1 "register_operand")
diff --git a/gcc/testsuite/gcc.target/i386/pr54855-1.c b/gcc/testsuite/gcc.target/i386/pr54855-1.c
new file mode 100644
index 00000000000..693aafa09ab
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr54855-1.c
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -msse2 -mfpmath=sse" } */
+/* { dg-final { scan-assembler-times "addsd" 1 } } */
+/* { dg-final { scan-assembler-not "movapd" } } */
+/* { dg-final { scan-assembler-not "movsd" } } */
+
+typedef double __v2df __attribute__ ((__vector_size__ (16)));
+typedef double __m128d __attribute__ ((__vector_size__ (16), __may_alias__));
+
+__m128d
+_mm_add_sd (__m128d x, __m128d y)
+{
+ __m128d z = __extension__ (__m128d)(__v2df)
+ { (((__v2df) x)[0] + ((__v2df) y)[0]), ((__v2df) x)[1] };
+ return z;
+}
diff --git a/gcc/testsuite/gcc.target/i386/pr54855-10.c b/gcc/testsuite/gcc.target/i386/pr54855-10.c
new file mode 100644
index 00000000000..9e08a85723e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr54855-10.c
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -msse2 -mfpmath=sse" } */
+/* { dg-final { scan-assembler-times "movlpd" 1 } } */
+/* { dg-final { scan-assembler-not "movsd" } } */
+
+typedef double vec __attribute__((vector_size(16)));
+
+vec
+foo (vec x, double *a)
+{
+ x[0] = *a;
+ return x;
+}
diff --git a/gcc/testsuite/gcc.target/i386/pr54855-2.c b/gcc/testsuite/gcc.target/i386/pr54855-2.c
new file mode 100644
index 00000000000..20c6f8eb529
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr54855-2.c
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -msse2 -mfpmath=sse" } */
+/* { dg-final { scan-assembler-times "mulsd" 1 } } */
+/* { dg-final { scan-assembler-not "movapd" } } */
+/* { dg-final { scan-assembler-not "movsd" } } */
+
+typedef double __v2df __attribute__ ((__vector_size__ (16)));
+
+__v2df
+_mm_mul_sd (__v2df x, __v2df y)
+{
+ __v2df z = x;
+ z[0] = x[0] * y[0];
+ return z;
+}
diff --git a/gcc/testsuite/gcc.target/i386/pr54855-3.c b/gcc/testsuite/gcc.target/i386/pr54855-3.c
new file mode 100644
index 00000000000..3c15dfc93d1
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr54855-3.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -msse2 -mfpmath=sse" } */
+/* { dg-final { scan-assembler-times "subsd" 1 } } */
+/* { dg-final { scan-assembler-not "movapd" } } */
+/* { dg-final { scan-assembler-not "movsd" } } */
+
+typedef double vec __attribute__((vector_size(16)));
+
+vec
+foo (vec x)
+{
+ x[0] -= 1.;
+ return x;
+}
diff --git a/gcc/testsuite/gcc.target/i386/pr54855-4.c b/gcc/testsuite/gcc.target/i386/pr54855-4.c
new file mode 100644
index 00000000000..32eb28e852a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr54855-4.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -msse2 -mfpmath=sse" } */
+/* { dg-final { scan-assembler-times "subsd" 1 } } */
+/* { dg-final { scan-assembler-not "movapd" } } */
+/* { dg-final { scan-assembler-not "movsd" } } */
+
+typedef double vec __attribute__((vector_size(16)));
+
+vec
+foo (vec x, double a)
+{
+ x[0] -= a;
+ return x;
+}
diff --git a/gcc/testsuite/gcc.target/i386/pr54855-5.c b/gcc/testsuite/gcc.target/i386/pr54855-5.c
new file mode 100644
index 00000000000..e06999074e0
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr54855-5.c
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -msse2 -mfpmath=sse" } */
+/* { dg-final { scan-assembler-times "subsd" 1 } } */
+/* { dg-final { scan-assembler-times "mulpd" 1 } } */
+/* { dg-final { scan-assembler-not "movapd" } } */
+/* { dg-final { scan-assembler-not "movsd" } } */
+
+typedef double __v2df __attribute__ ((__vector_size__ (16)));
+
+__v2df
+foo (__v2df x, __v2df y)
+{
+ x[0] -= y[0];
+ x *= y;
+ return x;
+}
diff --git a/gcc/testsuite/gcc.target/i386/pr54855-6.c b/gcc/testsuite/gcc.target/i386/pr54855-6.c
new file mode 100644
index 00000000000..8f44d17b6d8
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr54855-6.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -msse -mfpmath=sse" } */
+/* { dg-final { scan-assembler-times "divss" 1 } } */
+/* { dg-final { scan-assembler-not "movaps" } } */
+/* { dg-final { scan-assembler-not "movss" } } */
+
+typedef float vec __attribute__((vector_size(16)));
+
+vec
+foo (vec x, float f)
+{
+ x[0] /= f;
+ return x;
+}
diff --git a/gcc/testsuite/gcc.target/i386/pr54855-7.c b/gcc/testsuite/gcc.target/i386/pr54855-7.c
new file mode 100644
index 00000000000..a551bd5c92f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr54855-7.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -msse -mfpmath=sse" } */
+/* { dg-final { scan-assembler-times "divss" 1 } } */
+/* { dg-final { scan-assembler-not "movaps" } } */
+/* { dg-final { scan-assembler-not "movss" } } */
+
+typedef float vec __attribute__((vector_size(16)));
+
+vec
+foo (vec x)
+{
+ x[0] /= 2.1f;
+ return x;
+}
diff --git a/gcc/testsuite/gcc.target/i386/pr54855-8.c b/gcc/testsuite/gcc.target/i386/pr54855-8.c
new file mode 100644
index 00000000000..7602dc293a7
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr54855-8.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -msse2 -mfpmath=sse" } */
+/* { dg-final { scan-assembler-times "maxsd" 1 } } */
+/* { dg-final { scan-assembler-not "movapd" } } */
+/* { dg-final { scan-assembler-not "movsd" } } */
+
+typedef double vec __attribute__((vector_size(16)));
+
+vec
+foo (vec x, double a)
+{
+ x[0] = x[0] > a ? x[0] : a;
+ return x;
+}
diff --git a/gcc/testsuite/gcc.target/i386/pr54855-9.c b/gcc/testsuite/gcc.target/i386/pr54855-9.c
new file mode 100644
index 00000000000..40add5f6763
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr54855-9.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -msse2 -mfpmath=sse" } */
+/* { dg-final { scan-assembler-times "minss" 1 } } */
+/* { dg-final { scan-assembler-not "movaps" } } */
+/* { dg-final { scan-assembler-not "movss" } } */
+
+typedef float vec __attribute__((vector_size(16)));
+
+vec
+foo (vec x, float a)
+{
+ x[0] = x[0] < a ? x[0] : a;
+ return x;
+}
--
2.20.1