Bug 77621 - [6 Regression] Internal compiler error for mtune=atom + msse2
Summary: [6 Regression] Internal compiler error for mtune=atom + msse2
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: tree-optimization
Version: 6.1.1
Importance: P3 normal
Target Milestone: 6.3
Assignee: Richard Biener
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2016-09-16 16:07 UTC by Marcin Bajor
Modified: 2016-09-25 17:09 UTC
CC List: 3 users

See Also:
Host:
Target: x86_64-*-*, i?86-*-*
Build:
Known to work: 7.0
Known to fail: 6.2.0
Last reconfirmed: 2016-09-19 00:00:00


Attachments
Preprocessed rawimage.cc (394.30 KB, application/x-7z-compressed)
2016-09-16 16:07 UTC, Marcin Bajor
Build log with Q flag (183.41 KB, application/x-7z-compressed)
2016-09-16 16:07 UTC, Marcin Bajor
Build log (20.10 KB, text/plain)
2016-09-16 16:08 UTC, Marcin Bajor
Target-dependent patch that disables DFmode vectorization via vector costs (820 bytes, patch)
2016-09-20 08:56 UTC, Uroš Bizjak
patch for the ICE (722 bytes, patch)
2016-09-20 11:20 UTC, Richard Biener

Description Marcin Bajor 2016-09-16 16:07:00 UTC
Created attachment 39634 [details]
Preprocessed rawimage.cc

Hi,
I'm a maintainer of the Rawtherapee packages. I'm using the Open Build Service (https://build.opensuse.org/package/show/home:rawtherapee/rawtherapee).
I have a problem building the project for Fedora 24 (i686 + msse2) (gcc 6.1.1) with the standard set of CXX flags recommended for building packages on OBS:

-O2 -g -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -m32 -march=i686 -mtune=atom -fasynchronous-unwind-tables

+ flag added by me: -msse2

The build finished with "internal compiler error: Segmentation fault".

There is no problem with the build for Fedora 23 (gcc 5.1.1), where the same flags are used.

I've also noticed that there is no problem if I remove -mtune=atom.

To build the project locally, use the Open Build Service commander (osc):
$ osc co home:rawtherapee rawtherapee
$ cd home\:rawtherapee/rawtherapee/
(optional) modify the rawtherapee.spec file
$ osc build  --no-verify Fedora_24 i586 rawtherapee.spec --clean

Regards
Marcin Bajor
Comment 1 Marcin Bajor 2016-09-16 16:07:54 UTC
Created attachment 39635 [details]
Build log with Q flag
Comment 2 Marcin Bajor 2016-09-16 16:08:20 UTC
Created attachment 39636 [details]
Build log
Comment 3 Uroš Bizjak 2016-09-18 18:30:54 UTC
It compiles OK with vanilla gcc-6.

However, you are using the compiler provided by Red Hat, so please report the failure to the Red Hat Bugzilla, as the compiler instructed you when it crashed.
Comment 4 Jakub Jelinek 2016-09-19 07:32:32 UTC
I can actually reproduce it even with the latest vanilla gcc-6-branch. It doesn't compile with the trunk due to some intrinsic header changes, so I'm reducing it first.
Comment 5 Jakub Jelinek 2016-09-19 10:21:11 UTC
Reduced testcase:
/* { dg-do compile } */
/* { dg-options "-O3" } */
/* { dg-additional-options "-march=i686 -mtune=atom -msse2" { target ia32 } } */

void
foo (double *x, int *y)
{
  int i;
  for (i = 0; i < 8; i++)
    x[i] -= y[i] * x[i + 1];
}

Started to ICE with r230310.
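
For anyone re-testing a fixed compiler, the reduced testcase can be turned into a small runtime check. This is only a sketch: the testcase above is compile-only, and `check_foo` together with its input values is a hypothetical addition that is not part of this report.

```c
/* The loop from the reduced testcase, unchanged.  It reads x[0..8]
   and writes x[0..7], so x must hold at least 9 elements.  */
void
foo (double *x, int *y)
{
  int i;
  for (i = 0; i < 8; i++)
    x[i] -= y[i] * x[i + 1];
}

/* Hypothetical helper (not part of the bug report): runs foo on small
   integer-valued inputs and compares the result against values
   computed from an untouched copy.  Returns the number of
   mismatching elements, so 0 means the loop behaved as expected.  */
int
check_foo (void)
{
  double x[9], orig[9];
  int y[8];
  int i, mismatches = 0;

  for (i = 0; i < 9; i++)
    x[i] = orig[i] = (double) (i + 1);
  for (i = 0; i < 8; i++)
    y[i] = i - 3;

  foo (x, y);

  /* Iteration i reads x[i + 1] before iteration i + 1 overwrites it,
     so each result depends only on the original values.  */
  for (i = 0; i < 8; i++)
    if (x[i] != orig[i] - y[i] * orig[i + 1])
      mismatches++;

  /* x[8] is only read, never written.  */
  if (x[8] != orig[8])
    mismatches++;

  return mismatches;
}
```

Built with the options from the testcase (for example `-O3 -mtune=atom -msse2`), a call to `check_foo ()` from a `main` should return 0. The inputs are small integers so that the floating-point comparisons are exact.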
Comment 6 Uroš Bizjak 2016-09-19 10:53:26 UTC
(In reply to Jakub Jelinek from comment #5)
> Started to ICE with r230310.

Confirmed.

The reduced testcase fails with -O3 -msse2 -mtune=atom for both 32-bit and 64-bit x86 targets in:

Program received signal SIGSEGV, Segmentation fault.
0x00000000006de13e in tree_class_check (__t=0x0, __class=tcc_type, __f=0x215b580 "../../git/gcc/gcc/tree-vect-data-refs.c", __l=761, 
    __g=0x215d260 <vect_compute_data_ref_alignment(data_reference*)::__FUNCTION__> "vect_compute_data_ref_alignment")
    at ../../git/gcc/gcc/tree.h:3148
3148      if (TREE_CODE_CLASS (TREE_CODE (__t)) != __class)
(gdb) bt
#0  0x00000000006de13e in tree_class_check (__t=0x0, __class=tcc_type, __f=0x215b580 "../../git/gcc/gcc/tree-vect-data-refs.c", __l=761, 
    __g=0x215d260 <vect_compute_data_ref_alignment(data_reference*)::__FUNCTION__> "vect_compute_data_ref_alignment")
    at ../../git/gcc/gcc/tree.h:3148
#1  0x000000000199f40d in vect_compute_data_ref_alignment (dr=0x29688e0) at ../../git/gcc/gcc/tree-vect-data-refs.c:759

(gdb) f 1
#1  0x000000000199f40d in vect_compute_data_ref_alignment (dr=0x29688e0) at ../../git/gcc/gcc/tree-vect-data-refs.c:759
759           if (tree_fits_shwi_p (step)
(gdb) list
754       else
755         {
756           tree step = DR_STEP (dr);
757           unsigned vf = loop ? LOOP_VINFO_VECT_FACTOR (loop_vinfo) : 1;
758
759           if (tree_fits_shwi_p (step)
760               && ((tree_to_shwi (step) * vf)
761                   % GET_MODE_SIZE (TYPE_MODE (vectype)) != 0))
762             {
763               if (dump_enabled_p ())
(gdb) p vectype
$1 = (tree) 0x0

CC author.
Comment 7 Richard Biener 2016-09-19 14:14:50 UTC
Mine.  The issue is that we have a non-vectorizable load as part of an interleaving group (that stmt is later not used in the SLP).

But the odd part of this testcase is that we have

t.c:7:1: note: not vectorized: no vectype for stmt: _51 = *x_18(D);
 scalar_type: double
t.c:7:1: note: got vectype for stmt: _32 = *_33;
vector(4) int
t.c:7:1: note: got vectype for stmt: _25 = *_28;
vector(2) double

This seems to be a backend issue with targetm.vectorize.preferred_simd_mode (DFmode), which seems to return SImode (!?). But once we have fixed the vector size
by looking for a SImode vector mode (V4SImode), mode_for_vector happily
returns V2DFmode for us to use.

So it seems V2DFmode is available but discouraged via the above hook when
tuning for atom.

Indeed:

static machine_mode
ix86_preferred_simd_mode (machine_mode mode)
{
...
    case DFmode:
      if (!TARGET_VECTORIZE_DOUBLE)
        return word_mode;

but targetm.vector_mode_supported_p happily returns true for V2DFmode.

This means the above is _not_ a good way to achieve what it tries to
(make the vectorizer not use V2DFmode).  A more proper way would be to
handle this in ix86_add_stmt_cost, increasing the cost for double type
vectorization.

Nevertheless the vectorizer shouldn't ICE on this inconsistency, I'll see
what it takes to "fix" it on the vectorizer side.
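
The inconsistency described in this comment can be illustrated with a toy model. Everything here is an assumption made for illustration: the `toy_*` names and the enum are hypothetical stand-ins rather than GCC's `machine_mode` or its real hooks, word_mode is modelled as a 4-byte SImode (as on ia32), and the vectype lookup is reduced to "no vector size chosen yet: ask the preferred-mode hook; otherwise: look up a vector mode of that size".

```c
#include <assert.h>

/* Toy stand-ins for machine modes (hypothetical, not GCC's enum).  */
enum toy_mode { TOY_VOID, TOY_SI, TOY_DF, TOY_V4SI, TOY_V2DF };

/* Size in bytes of each toy mode.  */
static unsigned
toy_mode_size (enum toy_mode m)
{
  switch (m)
    {
    case TOY_SI:   return 4;
    case TOY_DF:   return 8;
    case TOY_V4SI: return 16;
    case TOY_V2DF: return 16;
    default:       return 0;
    }
}

/* Models the hook before the fix: when double vectorization is
   discouraged it answers word_mode (SI here) for DF.  */
enum toy_mode
toy_preferred_simd_mode (enum toy_mode inner, int vectorize_double)
{
  if (inner == TOY_SI)
    return TOY_V4SI;
  if (inner == TOY_DF)
    return vectorize_double ? TOY_V2DF : TOY_SI;
  return TOY_VOID;
}

/* Models a lookup by element mode and count only; it knows nothing
   about tuning preferences.  */
enum toy_mode
toy_mode_for_vector (enum toy_mode inner, unsigned nunits)
{
  if (inner == TOY_SI && nunits == 4)
    return TOY_V4SI;
  if (inner == TOY_DF && nunits == 2)
    return TOY_V2DF;
  return TOY_VOID;
}

/* size == 0 means no vector size has been fixed yet.  Returns
   TOY_VOID for "no vectype for stmt".  */
enum toy_mode
toy_get_vectype (enum toy_mode inner, unsigned size, int vectorize_double)
{
  unsigned nbytes = toy_mode_size (inner);
  enum toy_mode simd;
  unsigned nunits;

  assert (nbytes != 0);
  if (size == 0)
    simd = toy_preferred_simd_mode (inner, vectorize_double);
  else
    simd = toy_mode_for_vector (inner, size / nbytes);

  nunits = toy_mode_size (simd) / nbytes;
  if (nunits <= 1)
    return TOY_VOID;
  return toy_mode_for_vector (inner, nunits);
}
```

In this model, `toy_get_vectype (TOY_DF, 0, 0)` yields no vectype, `toy_get_vectype (TOY_SI, 0, 0)` yields the 16-byte integer vector, and `toy_get_vectype (TOY_DF, 16, 0)` then yields the two-element double vector. That mirrors the three dump lines quoted above: the same scalar type is rejected or accepted depending on whether a vector size was already fixed.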
Comment 8 Uroš Bizjak 2016-09-19 15:25:44 UTC
(In reply to Richard Biener from comment #7)

> So it seems V2DFmode is available but discouraged via the above hook when
> tuning for atom.
> 
> Indeed:
> 
> static machine_mode
> ix86_preferred_simd_mode (machine_mode mode)
> {
> ...
>     case DFmode:
>       if (!TARGET_VECTORIZE_DOUBLE)
>         return word_mode;

This part should be OK, the hook documentation says:

 -- Target Hook: machine_mode TARGET_VECTORIZE_PREFERRED_SIMD_MODE
          (machine_mode MODE)
     This hook should return the preferred mode for vectorizing scalar
     mode MODE.  The default is equal to 'word_mode', because the
     vectorizer can do some transformations even in absence of
     specialized SIMD hardware.

IIUC, word_mode should be returned for unsupported scalar modes.

> but targetm.vector_mode_supported_p happily returns true for V2DFmode.

Yes, also following the documentation:

 -- Target Hook: bool TARGET_VECTOR_MODE_SUPPORTED_P (machine_mode MODE)
     Define this to return nonzero if the port is prepared to handle
     insns involving vector mode MODE.  At the very least, it must have
     move patterns for this mode.

We *do* have V2DF move patterns.

> This means the above is _not_ a good way to achieve what it tries to
> (make the vectorizer not use V2DFmode).  A more proper way would be to
> handle this in ix86_add_stmt_cost, increasing the cost for double type
> vectorization.
> 
> Nevertheless the vectorizer shouldn't ICE on this inconsistency, I'll see
> what it takes to "fix" it on the vectorizer side.

Please note that we have a similar situation with TARGET_PREFER_AVX128; the difference is that we still vectorize with a narrower vector mode, whereas V2DFmode falls back to word_mode.
Comment 9 rguenther@suse.de 2016-09-20 07:37:23 UTC
On Mon, 19 Sep 2016, ubizjak at gmail dot com wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77621
> 
> --- Comment #8 from Uroš Bizjak <ubizjak at gmail dot com> ---
> (In reply to Richard Biener from comment #7)
> 
> > So it seems V2DFmode is available but discouraged via the above hook when
> > tuning for atom.
> > 
> > Indeed:
> > 
> > static machine_mode
> > ix86_preferred_simd_mode (machine_mode mode)
> > {
> > ...
> >     case DFmode:
> >       if (!TARGET_VECTORIZE_DOUBLE)
> >         return word_mode;
> 
> This part should be OK, the hook documentation says:
> 
>  -- Target Hook: machine_mode TARGET_VECTORIZE_PREFERRED_SIMD_MODE
>           (machine_mode MODE)
>      This hook should return the preferred mode for vectorizing scalar
>      mode MODE.  The default is equal to 'word_mode', because the
>      vectorizer can do some transformations even in absence of
>      specialized SIMD hardware.
> 
> IIUC, word_mode should be returned for unsupported scalar modes.

It says "preferred" -- it only implicitly suggests that maybe
returning word_mode disables vectorization (it does not).

> > but targetm.vector_mode_supported_p happily returns true for V2DFmode.
> 
> Yes, also following the documentation:
> 
>  -- Target Hook: bool TARGET_VECTOR_MODE_SUPPORTED_P (machine_mode MODE)
>      Define this to return nonzero if the port is prepared to handle
>      insns involving vector mode MODE.  At the very least, it must have
>      move patterns for this mode.
> 
> We *do* have V2DF move patterns.

from the testcase it's clear you also have conversion patterns (ok, maybe
not -- eventually vectorization would fail if we didn't ICE -- will
check that).

I believe atom _does_ have full SSE2 support, no?  Using intrinsics
(even those expanding to GCC generic vector extension code) should
end up emitting SSE2 double instructions?

So what you want to tell the vectorizer is to not introduce vectorized
code using V2DFmode.  I still think a better way is to handle this
via costs (like a loop with mostly integer ops but a single FP double
op is probably still profitable to vectorize).

At least if you want the preferred_simd_mode to _disable_ vectorization
for V2DFmode that needs additional changes in the vectorizer code.  And
using word_mode from that hook when constructing the vector type
(using word_mode -- you're only lucky that word_mode is smaller than
DFmode that it fails) will end up with a vector type that has V2DF
mode anyway because stor-layout uses vector_mode_supported_p
to decide whether to use V2DFmode or BLKmode.

> > This means the above is _not_ a good way to achieve what it tries to
> > (make the vectorizer not use V2DFmode).  A more proper way would be to
> > handle this in ix86_add_stmt_cost, increasing the cost for double type
> > vectorization.
> > 
> > Nevertheless the vectorizer shouldn't ICE on this inconsistency, I'll see
> > what it takes to "fix" it on the vectorizer side.
> 
> Please note that we have a similar situation with TARGET_PREFER_AVX128; the
> difference is that we still vectorize with a narrower vector mode, whereas
> V2DFmode falls back to word_mode.

Yeah.

Note that preferred_simd_mode was introduced to choose a preferred vector
size for a given mode, not really to disable vectorization for selected
modes.  See what was originally its only use:

  /* If no size was supplied use the mode the target prefers.   Otherwise
     lookup a vector mode of the specified size.  */
  if (size == 0)
    simd_mode = targetm.vectorize.preferred_simd_mode (inner_mode);
  else
    simd_mode = mode_for_vector (inner_mode, size / nbytes);

it seems later there was at least one additional use added that follows
your idea:

int
estimate_move_cost (tree type, bool ARG_UNUSED (speed_p))
{
  HOST_WIDE_INT size;

  gcc_assert (!VOID_TYPE_P (type));

  if (TREE_CODE (type) == VECTOR_TYPE)
    {
      machine_mode inner = TYPE_MODE (TREE_TYPE (type));
      machine_mode simd
        = targetm.vectorize.preferred_simd_mode (inner);
      int simd_mode_size = GET_MODE_SIZE (simd);
      return ((GET_MODE_SIZE (TYPE_MODE (type)) + simd_mode_size - 1)
              / simd_mode_size);
    }

not sure why we override TYPE_MODE with preferred_simd_mode.  It's not
that the x86 backend will emit word_mode loads/stores for V2DFmode
loads/stores on i?86 with -mtune=atom?
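
To make the effect on the quoted `estimate_move_cost` concrete, the rounding-up division it performs can be isolated. This is a sketch under assumptions: `toy_move_cost` is a hypothetical name, and the byte sizes in the note below (16 for V2DFmode, 4 or 8 for word_mode) are plugged in by hand rather than queried from a compiler.

```c
#include <assert.h>

/* Same ceiling division as in the quoted snippet: how many pieces of
   the preferred SIMD mode's size are needed to cover the type's
   mode.  Both sizes are in bytes.  */
unsigned
toy_move_cost (unsigned type_mode_size, unsigned simd_mode_size)
{
  assert (simd_mode_size != 0);
  return (type_mode_size + simd_mode_size - 1) / simd_mode_size;
}
```

With the hook answering word_mode for DFmode, a 16-byte V2DF value is costed as `toy_move_cost (16, 4)`, i.e. 4 moves with a 4-byte word_mode, or `toy_move_cost (16, 8)`, i.e. 2 moves with an 8-byte word_mode. If the hook answered V2DFmode the estimate would be `toy_move_cost (16, 16)`, i.e. a single move.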
Comment 10 Uroš Bizjak 2016-09-20 08:56:35 UTC
Created attachment 39653 [details]
Target-dependent patch that disables DFmode vectorization via vector costs

Proposed target-dependent patch.
Comment 11 Uroš Bizjak 2016-09-20 09:03:05 UTC
(In reply to rguenther@suse.de from comment #9)

> I believe atom _does_ have full SSE2 support, no?  Using intrinsics
> (even those expanding to GCC generic vector extension code) should
> end up emitting SSE2 double instructions?

True.

> So what you want to tell the vectorizer is to not introduce vectorized
> code using V2DFmode.  I still think a better way is to handle this
> via costs (like a loop with mostly integer ops but a single FP double
> op is probably still profitable to vectorize).

The patch attached in the previous message implements the above suggestion and also fixes the testcase with -mtune=atom. However, I have no performance data to base cost values on, so the patch artificially raises the cost of DFmode vector insns by 20:

+  /* FIXME: The value here is arbitrary
+     and could potentially be improved with analysis.  */
+  if (vectype && GET_MODE_INNER (TYPE_MODE (vectype)) == DFmode
+      && !TARGET_VECTORIZE_DOUBLE)
+    cost += 20;

[...]

> not sure why we override TYPE_MODE with preferred_simd_mode.  It's not
> that the x86 backend will emit word_mode loads/stores for V2DFmode
> loads/stores on i?86 with -mtune=atom?

Oh... no. We *do* have V2DFmode, but we want to avoid it as much as possible.
Comment 12 Richard Biener 2016-09-20 11:03:55 UTC
(In reply to Uroš Bizjak from comment #11)
> (In reply to rguenther@suse.de from comment #9)
> 
> > I believe atom _does_ have full SSE2 support, no?  Using intrinsics
> > (even those expanding to GCC generic vector extension code) should
> > end up emitting SSE2 double instructions?
> 
> True.
> 
> > So what you want to tell the vectorizer is to not introduce vectorized
> > code using V2DFmode.  I still think a better way is to handle this
> > via costs (like a loop with mostly integer ops but a single FP double
> > op is probably still profitable to vectorize).
> 
> The patch attached in the previous message implements the above suggestion
> and also fixes the testcase with -mtune=atom. However, I have no performance
> data to base cost values on, so the patch artificially raises the cost of
> DFmode vector insns by 20:
> 
> +  /* FIXME: The value here is arbitrary
> +     and could potentially be improved with analysis.  */
> +  if (vectype && GET_MODE_INNER (TYPE_MODE (vectype)) == DFmode
> +      && !TARGET_VECTORIZE_DOUBLE)
> +    cost += 20;
> 
> [...]

If V2DFmode moves are fine(?) then maybe not do this for the load/store
kinds - this means only handling vector_stmt this way (and maybe vect_promote_demote?) - at least make sure to not handle scalar_*
(not sure if vectype is always NULL for those -- docs say only
memory ops may depend on vectype).
Instead of += 20 I'd have done *= <factor> to
make it more independent of the absolute value of the cost numbers.

If you'd do the cost adjustment in ix86_add_stmt_cost you have more control
over the details (there's also similar offsetting for silvermont)

> > not sure why we override TYPE_MODE with preferred_simd_mode.  It's not
> > that the x86 backend will emit word_mode loads/stores for V2DFmode
> > loads/stores on i?86 with -mtune=atom?
> 
> Oh... no. We *do* have V2DFmode, but we want to avoid it as much as possible.

That's what I thought.
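
The argument for `*= <factor>` over `+= 20` can be checked with a little arithmetic. The sketch below uses made-up base costs (1 and 4) purely to show the effect of rescaling a cost table; the function names are hypothetical and nothing here is taken from the i386 cost tables.

```c
#include <assert.h>

/* Penalized cost as a percentage of the unpenalized cost when a fixed
   amount is added.  */
int
relative_add_pct (int cost, int addend)
{
  assert (cost > 0);
  return 100 * (cost + addend) / cost;
}

/* Penalized cost as a percentage of the unpenalized cost when the
   cost is multiplied by a factor.  */
int
relative_mul_pct (int cost, int factor)
{
  assert (cost > 0);
  return 100 * (cost * factor) / cost;
}
```

With a base cost of 1, adding 20 makes the statement 21 times as expensive (2100%); if the same table were rescaled so that the base cost is 4, the same addend only makes it 6 times as expensive (600%). A factor of 5 gives 500% in both cases, which is the independence from absolute cost numbers asked for above.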
Comment 13 Richard Biener 2016-09-20 11:20:10 UTC
Created attachment 39656 [details]
patch for the ICE
Comment 14 Uroš Bizjak 2016-09-20 11:24:41 UTC
(In reply to Richard Biener from comment #13)
> Created attachment 39656 [details]
> patch for the ICE

+/* { dg-additional-options "-march=i686 -mtune=atom -msse2" { target ia32 } } */

You can use

/* { dg-additional-options "-mtune=atom -msse2" { target i?86-*-* x86_64-*-* } } */

and the test will also break on x86_64.
Comment 15 Uroš Bizjak 2016-09-20 12:06:15 UTC
(In reply to Richard Biener from comment #12)

> If V2DFmode moves are fine(?) then maybe not do this for the load/store
> kinds - this means only handling vector_stmt this way (and maybe
> vect_promote_demote?) - at least make sure to not handle scalar_*
> (not sure if vectype is always NULL for those -- docs say only
> memory ops may depend on vectype).

Moves are fine, but V2DFmode vector arithmetic insns (addpd, subpd, mulpd) have much higher latencies (e.g. 6 for addpd, 9 for mulpd) compared to their {SF,DF}mode (or V4SFmode) versions (1 for addps, 2 for mulps).

> Instead of += 20 I'd have done *= <factor> to
> make it more independent of the absolute value of the cost numbers.

IMO, having no other data at hand than Agner Fog's instruction tables, it looks like penalizing vector_stmt cost by a factor of 5 should be OK for a start.

> If you'd do the cost adjustment in ix86_add_stmt_cost you have more control
> over the details (there's also similar offsetting for silvermont)

ix86_builtin_vectorization_cost is also called from there. OTOH, ix86_add_stmt_cost uses some other arguments (e.g. location), which I think are irrelevant to the insn type cost adjustment.

Let me play a bit with the patch.
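
As a rough sanity check of the proposed factor, the latencies quoted in this comment can be turned into ratios. This is a sketch only: the numbers are copied from the comment above, not measured, and the struct and function names are hypothetical.

```c
#include <stddef.h>

struct latency_pair
{
  const char *insn;   /* operation name, for readability only */
  int packed_double;  /* latency quoted for the V2DFmode form */
  int packed_single;  /* latency quoted for the comparison form */
};

/* Values as quoted in the comment above.  */
static const struct latency_pair quoted_latencies[] = {
  { "add", 6, 1 },  /* addpd vs. addps */
  { "mul", 9, 2 },  /* mulpd vs. mulps */
};

/* Slowdown of the packed-double form relative to the comparison form.  */
double
latency_ratio (const struct latency_pair *p)
{
  return (double) p->packed_double / p->packed_single;
}

/* Returns 1 if FACTOR lies between the smallest and largest quoted
   ratio, 0 otherwise.  */
int
factor_within_quoted_ratios (double factor)
{
  size_t n = sizeof quoted_latencies / sizeof quoted_latencies[0];
  double lo = latency_ratio (&quoted_latencies[0]);
  double hi = lo;
  size_t i;

  for (i = 1; i < n; i++)
    {
      double r = latency_ratio (&quoted_latencies[i]);
      if (r < lo)
        lo = r;
      if (r > hi)
        hi = r;
    }
  return factor >= lo && factor <= hi;
}
```

The two ratios come out as 6.0 for the add and 4.5 for the multiply, so a factor of 5 sits between them, whereas a much larger penalty would not.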
Comment 16 rguenther@suse.de 2016-09-20 12:12:49 UTC
On Tue, 20 Sep 2016, ubizjak at gmail dot com wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77621
> 
> --- Comment #15 from Uroš Bizjak <ubizjak at gmail dot com> ---
> (In reply to Richard Biener from comment #12)
> 
> > If V2DFmode moves are fine(?) then maybe not do this for the load/store
> > kinds - this means only handling vector_stmt this way (and maybe
> > vect_promote_demote?) - at least make sure to not handle scalar_*
> > (not sure if vectype is always NULL for those -- docs say only
> > memory ops may depend on vectype).
> 
> Moves are fine, but V2DFmode vector arithmetic insns (addpd, subpd, mulpd) have
> much higher latencies (e.g. 6 for addpd, 9 for mulpd) compared to their
> {SF,DF}mode (or V4SFmode) versions (1 for addps, 2 for mulps).
> 
> > Instead of += 20 I'd have done *= <factor> to
> > make it more independent of the absolute value of the cost numbers.
> 
> IMO, having no other data at hand than Agner Fog's instruction tables, it looks
> like penalizing vector_stmt cost by a factor of 5 should be OK for a start.
> 
> > If you'd do the cost adjustment in ix86_add_stmt_cost you have more control
> > over the details (there's also similar offsetting for silvermont)
> 
> ix86_builtin_vectorization_cost is also called from there. OTOH,
> ix86_add_stmt_cost uses some other arguments (e.g. location), which I think are
> irrelevant to the insn type cost adjustment.

At least you won't get called for the scalar loop copy and you have
definite access to vectype.
Comment 17 Uroš Bizjak 2016-09-20 12:33:48 UTC
(In reply to rguenther@suse.de from comment #16)

> At least you won't get called for the scalar loop copy and you have
> definite access to vectype.

Thanks for the hint, the following patch is effective as well:

--cut here--
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 60b81bb..9d72681 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -49554,9 +49554,7 @@ ix86_preferred_simd_mode (machine_mode mode)
        return V4SFmode;
 
     case DFmode:
-      if (!TARGET_VECTORIZE_DOUBLE)
-       return word_mode;
-      else if (TARGET_AVX512F)
+      if (TARGET_AVX512F)
        return V8DFmode;
       else if (TARGET_AVX && !TARGET_PREFER_AVX128)
        return V4DFmode;
@@ -49647,6 +49645,11 @@ ix86_add_stmt_cost (void *data, int count, enum vect_cost_for_stmt kind,
   tree vectype = stmt_info ? stmt_vectype (stmt_info) : NULL_TREE;
   int stmt_cost = ix86_builtin_vectorization_cost (kind, vectype, misalign);
 
+  /* Penalize DFmode vector operations for !TARGET_VECTORIZE_DOUBLE.  */
+  if (kind == vector_stmt && !TARGET_VECTORIZE_DOUBLE
+      && vectype && GET_MODE_INNER (TYPE_MODE (vectype)) == DFmode)
+    stmt_cost *= 5;  /* FIXME: The value here is arbitrary.  */
+
   /* Statements in an inner loop relative to the loop being
      vectorized are weighted more heavily.  The value here is
       arbitrary and could potentially be improved with analysis.  */
--cut here--
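
The condition added to ix86_add_stmt_cost in the patch above can be exercised in isolation with a toy model. The enums and `toy_stmt_cost` are hypothetical simplifications: the real function takes a vectype tree and more arguments, and `TOY_INNER_NONE` stands in for a NULL vectype. Only the shape of the condition and the factor of 5 are taken from the patch.

```c
#include <assert.h>

/* Toy subset of statement kinds (hypothetical, not GCC's enum).  */
enum toy_cost_kind
{
  TOY_SCALAR_STMT,
  TOY_VECTOR_STMT,
  TOY_VECTOR_LOAD,
  TOY_VECTOR_STORE
};

/* Element mode of the vector type; TOY_INNER_NONE models "no vectype".  */
enum toy_inner_mode
{
  TOY_INNER_NONE,
  TOY_INNER_SI,
  TOY_INNER_SF,
  TOY_INNER_DF
};

/* Applies the penalty only to vector arithmetic on double elements
   when double vectorization is discouraged; loads, stores and scalar
   statements keep their base cost.  */
int
toy_stmt_cost (enum toy_cost_kind kind, enum toy_inner_mode inner,
               int base_cost, int vectorize_double)
{
  int stmt_cost = base_cost;

  assert (base_cost >= 0);
  if (kind == TOY_VECTOR_STMT && !vectorize_double
      && inner == TOY_INNER_DF)
    stmt_cost *= 5;  /* same arbitrary factor as in the patch */

  return stmt_cost;
}
```

In this model a double vector_stmt with base cost 1 becomes 5 when double vectorization is discouraged, while a double vector load, a float vector_stmt, or any statement when double vectorization is enabled stays at 1. That matches the intent discussed earlier in the thread: V2DFmode moves are fine and only the arithmetic is penalized.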
Comment 18 rguenther@suse.de 2016-09-20 12:42:05 UTC
On Tue, 20 Sep 2016, ubizjak at gmail dot com wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77621
> 
> --- Comment #17 from Uroš Bizjak <ubizjak at gmail dot com> ---
> (In reply to rguenther@suse.de from comment #16)
> 
> > At least you won't get called for the scalar loop copy and you have
> > definite access to vectype.
> 
> Thanks for the hint, the following patch is effective as well:

Looks good to me.

> --cut here--
> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
> index 60b81bb..9d72681 100644
> --- a/gcc/config/i386/i386.c
> +++ b/gcc/config/i386/i386.c
> @@ -49554,9 +49554,7 @@ ix86_preferred_simd_mode (machine_mode mode)
>         return V4SFmode;
> 
>      case DFmode:
> -      if (!TARGET_VECTORIZE_DOUBLE)
> -       return word_mode;
> -      else if (TARGET_AVX512F)
> +      if (TARGET_AVX512F)
>         return V8DFmode;
>        else if (TARGET_AVX && !TARGET_PREFER_AVX128)
>         return V4DFmode;
> @@ -49647,6 +49645,11 @@ ix86_add_stmt_cost (void *data, int count, enum
> vect_cost_for_stmt kind,
>    tree vectype = stmt_info ? stmt_vectype (stmt_info) : NULL_TREE;
>    int stmt_cost = ix86_builtin_vectorization_cost (kind, vectype, misalign);
> 
> +  /* Penalize DFmode vector operations for !TARGET_VECTORIZE_DOUBLE.  */
> +  if (kind == vector_stmt && !TARGET_VECTORIZE_DOUBLE
> +      && vectype && GET_MODE_INNER (TYPE_MODE (vectype)) == DFmode)
> +    stmt_cost *= 5;  /* FIXME: The value here is arbitrary.  */
> +
>    /* Statements in an inner loop relative to the loop being
>       vectorized are weighted more heavily.  The value here is
>        arbitrary and could potentially be improved with analysis.  */
> --cut here--
Comment 19 uros 2016-09-20 17:36:35 UTC
Author: uros
Date: Tue Sep 20 17:36:03 2016
New Revision: 240277

URL: https://gcc.gnu.org/viewcvs?rev=240277&root=gcc&view=rev
Log:
	PR target/77621
	* config/i386/i386.c (ix86_preferred_simd_mode) <case DFmode>:
	Don't return word_mode for !TARGET_VECTORIZE_DOUBLE.
	(ix86_add_stmt_cost): Penalize DFmode vector operations
	for !TARGET_VECTORIZE_DOUBLE.

testsuite/ChangeLog:

	PR target/77621
	* gcc.target/i386/pr77621.c: New test.
	* gcc.target/i386/vect-double-2.c: Update scan-tree-dump-times
	pattern, loop should vectorize with -mtune=atom.


Added:
    trunk/gcc/testsuite/gcc.target/i386/pr77621.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/i386/i386.c
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/testsuite/gcc.target/i386/vect-double-2.c
Comment 20 Richard Biener 2016-09-21 07:37:50 UTC
Author: rguenth
Date: Wed Sep 21 07:37:18 2016
New Revision: 240302

URL: https://gcc.gnu.org/viewcvs?rev=240302&root=gcc&view=rev
Log:
2016-09-21  Richard Biener  <rguenther@suse.de>
	Jakub Jelinek  <jakub@redhat.com>

	PR tree-optimization/77621
	* tree-vect-data-refs.c (vect_analyze_data_ref_accesses): Split
	group at non-vectorizable stmts.

	* gcc.dg/pr77621.c: New testcase.

Added:
    trunk/gcc/testsuite/gcc.dg/pr77621.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/tree-vect-data-refs.c
Comment 21 Richard Biener 2016-09-21 07:55:56 UTC
Fixed on trunk so far.
Comment 22 uros 2016-09-25 17:08:09 UTC
Author: uros
Date: Sun Sep 25 17:07:37 2016
New Revision: 240475

URL: https://gcc.gnu.org/viewcvs?rev=240475&root=gcc&view=rev
Log:
	Backport from mainline
	2016-09-21  Richard Biener  <rguenther@suse.de>
		    Jakub Jelinek  <jakub@redhat.com>

	PR tree-optimization/77621
	* tree-vect-data-refs.c (vect_analyze_data_ref_accesses): Split
	group at non-vectorizable stmts.

	Backport from mainline
	2016-09-20  Uros Bizjak  <ubizjak@gmail.com>

	PR target/77621
	* config/i386/i386.c (ix86_preferred_simd_mode) <case DFmode>:
	Don't return word_mode for !TARGET_VECTORIZE_DOUBLE.
	(ix86_add_stmt_cost): Penalize DFmode vector operations
	for !TARGET_VECTORIZE_DOUBLE.

testsuite/ChangeLog:

	Backport from mainline
	2016-09-21  Richard Biener  <rguenther@suse.de>
		    Jakub Jelinek  <jakub@redhat.com>

	PR tree-optimization/77621
	* gcc.dg/pr77621.c: New testcase.

	Backport from mainline
	2016-09-20  Uros Bizjak  <ubizjak@gmail.com>

	PR target/77621
	* gcc.target/i386/pr77621.c: New test.
	* gcc.target/i386/vect-double-2.c: Update scan-tree-dump-times
	pattern, loop should vectorize with -mtune=atom.


Added:
    branches/gcc-6-branch/gcc/testsuite/gcc.dg/pr77621.c
    branches/gcc-6-branch/gcc/testsuite/gcc.target/i386/pr77621.c
Modified:
    branches/gcc-6-branch/gcc/ChangeLog
    branches/gcc-6-branch/gcc/config/i386/i386.c
    branches/gcc-6-branch/gcc/testsuite/ChangeLog
    branches/gcc-6-branch/gcc/testsuite/gcc.target/i386/vect-double-2.c
    branches/gcc-6-branch/gcc/tree-vect-data-refs.c
Comment 23 Uroš Bizjak 2016-09-25 17:09:39 UTC
Fixed for gcc-6.3+