Bug 31699 - [4.3 Regression] -march=opteron -ftree-vectorize generates wrong code
Summary: [4.3 Regression] -march=opteron -ftree-vectorize generates wrong code
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: middle-end
Version: 4.3.0
Importance: P3 normal
Target Milestone: 4.3.0
Assignee: Not yet assigned to anyone
URL:
Keywords: wrong-code
Duplicates: 31697
Depends on:
Blocks:
 
Reported: 2007-04-25 13:27 UTC by Tobias Burnus
Modified: 2007-05-03 14:18 UTC

See Also:
Host:
Target: x86_64-unknown-linux-gnu
Build:
Known to work:
Known to fail:
Last reconfirmed: 2007-04-26 10:36:01


Attachments
patch (2.14 KB, patch)
2007-04-26 19:34 UTC, Dorit Naishlos

Description Tobias Burnus 2007-04-25 13:27:19 UTC
This is with the Polyhedron rnflow.f90 test case
(http://www.polyhedron.co.uk/pb05/polyhedron_benchmark_suite.html)

It was working on 2007-04-23 (should be r124055) and has been failing since
2007-04-24 (should be r124093). (Same with PR 31697.)

Crash after 1.5 seconds with
  gfortran -m32 -march=opteron -ftree-vectorize -O2 rnflow.f90

Same crash with -m64.

no crash:
- without: -march=opteron
- without: -ftree-vectorize
- with: -O1

Program received signal SIGSEGV, Segmentation fault.
#0  0x00000000004058a9 in invima (__result=<value optimized out>, a=0x2ac0ff48d010, j=@0x7fffac442b18, k=<value optimized out>,
    m=@0x7fffac442b1c) at rnflow.f90:2904
#1  0x0000000000407446 in evlrnf_ (ptrs0t=<value optimized out>, nclsm=<value optimized out>, prnf0t=0xdaf380) at rnflow.f90:2771

  2899        elseif (n > 1) then
  2900           allocate (da (1:n,1:n))
  2901           lw = n * m
  2902           allocate (dw (1:lw))
  2903           allocate (ipivt (1:n))
  2904           da (1:n, 1:n) = - a (j:k-1, j:k-1)
  2905           do i = 1, n
  2906              da (i, i) = da (i, i) + 1.0d0
  2907           enddo
Comment 1 Janne Blomqvist 2007-04-26 10:36:01 UTC
Confirmed. It occurs also on i686-pc-linux-gnu.

My observations:

- -march specific optimizations do not seem to have any effect. What does have an effect is that on i686-pc-linux-gnu I need either -march= or -msse2 or else the vectorizer is disabled.

- As in your case, the crash occurs with -O2 and -O3, but not with -O1.

I.e. the lowest optimization level that triggers the bug for me is "-O2 -msse2 -ftree-vectorize".
Comment 2 Uroš Bizjak 2007-04-26 11:20:57 UTC
This bug is due to my commit:

http://gcc.gnu.org/ml/gcc-cvs/2007-04/msg00657.html

This patch introduces "vec_unpacks_hi_v4sf" and "vec_unpacks_lo_v4sf" to sse.md, and these patterns trigger a generic vectorizer problem in alignment handling (related to multiple data types in the loop having VEC_UNPACK_HI/LO, and not related to conversions at all).

Polyhedron crashes in:

Dump of assembler code from 0x804def2 to 0x804dff2:
>>  0x0804def2 <invima+594>:        movapd %xmm1,(%eax)
    0x0804def6 <invima+598>:        cvtps2pd 0xffffff48(%ebp),%xmm0
    0x0804defd <invima+605>:        movapd %xmm0,0x10(%eax)
    0x0804df02 <invima+610>:        add    $0x20,%eax
    0x0804df05 <invima+613>:        cmp    %edi,%ebx

Dorit has confirmed the problem and she is already testing a fix.

The C testcase that crashes (you need -O2 -msse2 -ftree-vectorize [-m32] to trigger the problem, as "z" needs to be aligned to 8 bytes):

--cut here--
#include <stdlib.h>  /* malloc */

float x[256];

void foo(void)
{
 double *z = malloc (sizeof(double) * 256);

 int i;
 for (i=0; i<256; ++i)
   z[i] = x[i] + 1.0f;
}


int main()
{
 int i;

 for (i = 0; i < 256; i++)
   x[i] = (float) i;

 foo();

 return 0;
}
--cut here--

If an urgent fix is needed, then simply rename "vec_unpacks_hi_v4sf" to "*vec_unpacks_hi_v4sf".
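
For readers following the assembler dump above: the faulting instruction is movapd, whose memory operand must be 16-byte aligned, so a store through a pointer that is only 8-byte aligned traps. A minimal sketch of how one might check a pointer against that requirement is given here; the helper name `misalignment` is hypothetical and is not taken from the testcase or from GCC.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Number of bytes by which p lies past the previous `align` boundary
   (0 means p is aligned). `align` must be a power of two.
   Hypothetical helper, for illustration only. */
static size_t misalignment(const void *p, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (size_t)((uintptr_t)p & (align - 1));
}
```

Calling misalignment(z, 16) on the malloc result in foo() would show whether a given run hits the 8-byte-aligned case that the vectorized store cannot tolerate.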
Comment 3 Dorit Naishlos 2007-04-26 19:34:32 UTC
Created attachment 13450
patch
Comment 4 Dorit Naishlos 2007-04-26 19:37:12 UTC
I'm testing the attached patch. The problem is that we don't compute the peel factor correctly (when peeling to align a store) when we have multiple data-types in the loop: the computation assumes that VF is the number of elements in a vector, but that doesn't hold for all the datarefs in the loop if their types are of different sizes.
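
The arithmetic behind this can be sketched as follows. This is an illustration under stated assumptions, not the GCC source: the helpers `peel_iterations` and `misalign_after_peel` are hypothetical, the vector size is taken to be 16 bytes (SSE2), and the loop from comment #2 is assumed to have VF = 4 (set by the float loads) while a vector of double holds only 2 elements. The sketch derives the element size from the assumed element count, which is where using VF for a double store goes wrong.

```c
#include <assert.h>
#include <stddef.h>

/* Iterations to peel before a store becomes vector-aligned, derived from
   an assumed number of elements per vector (hypothetical helper). */
static size_t peel_iterations(size_t misalign_bytes, size_t vector_bytes,
                              size_t elems_per_vector)
{
    /* The mask below only works for power-of-two element counts. */
    assert(elems_per_vector != 0 &&
           (elems_per_vector & (elems_per_vector - 1)) == 0);
    assert(vector_bytes >= elems_per_vector &&
           vector_bytes % elems_per_vector == 0);
    assert(misalign_bytes < vector_bytes);

    size_t elem_size = vector_bytes / elems_per_vector;
    size_t mis_elems = misalign_bytes / elem_size;
    return (elems_per_vector - mis_elems) & (elems_per_vector - 1);
}

/* Misalignment (in bytes) that remains after peeling `peeled` scalar
   stores whose real element size is real_elem_size bytes. */
static size_t misalign_after_peel(size_t misalign_bytes, size_t real_elem_size,
                                  size_t vector_bytes, size_t peeled)
{
    return (misalign_bytes + peeled * real_elem_size) % vector_bytes;
}
```

For a double store that starts 8 bytes past a 16-byte boundary, peel_iterations(8, 16, 4) (using VF) yields 2, and misalign_after_peel(8, 8, 16, 2) is still 8, so the aligned store faults. With the per-type element count, peel_iterations(8, 16, 2) yields 1 and misalign_after_peel(8, 8, 16, 1) is 0.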
Comment 5 Uroš Bizjak 2007-04-27 11:35:02 UTC
(In reply to comment #3)
> Created an attachment (id=13450)

This patch fixes the testcase from comment #2 and the Polyhedron rnflow failure in both 32-bit (-m32 -msse2) and 64-bit modes. The loop is still vectorized and cvtps2pd insns are still present in the asm dump.
Comment 6 Dorit Naishlos 2007-05-02 20:38:18 UTC
patch: http://gcc.gnu.org/ml/gcc-patches/2007-05/msg00111.html
Comment 7 dorit 2007-05-03 13:55:03 UTC
Subject: Bug 31699

Author: dorit
Date: Thu May  3 12:54:45 2007
New Revision: 124375

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=124375
Log:
        PR tree-optimization/31699
        * tree-vect-analyze.c (vect_update_misalignment_for_peel): Remove wrong
        code.
        (vect_enhance_data_refs_alignment): Compute peel amount using
        TYPE_VECTOR_SUBPARTS instead of vf.
        * tree-vect-transform.c (vect_gen_niters_for_prolog_loop): Likewise.


Added:
    trunk/gcc/testsuite/gcc.dg/vect/pr31699.c
    trunk/gcc/testsuite/gcc.dg/vect/vect-multitypes-11.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/testsuite/gcc.dg/vect/vect-floatint-conversion-1.c
    trunk/gcc/testsuite/gcc.dg/vect/vect-intfloat-conversion-1.c
    trunk/gcc/testsuite/gcc.dg/vect/vect-iv-4.c
    trunk/gcc/testsuite/gcc.dg/vect/vect-multitypes-1.c
    trunk/gcc/testsuite/gcc.dg/vect/vect-multitypes-4.c
    trunk/gcc/testsuite/lib/target-supports.exp
    trunk/gcc/tree-vect-analyze.c
    trunk/gcc/tree-vect-transform.c

Comment 8 Tobias Burnus 2007-05-03 14:16:40 UTC
*** Bug 31697 has been marked as a duplicate of this bug. ***
Comment 9 Tobias Burnus 2007-05-03 14:18:41 UTC
As the fix has been checked in and it works (at least here ;-), mark as FIXED.
Comment 10 irar 2007-06-08 06:31:52 UTC
Subject: Bug 31699

Author: irar
Date: Fri Jun  8 06:31:39 2007
New Revision: 125560

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=125560
Log:
	Backport from mainline:

	2007-05-03  Dorit Nuzman  <dorit@il.ibm.com>

	PR tree-optimization/31699
	* tree-vect-analyze.c (vect_update_misalignment_for_peel): Remove wrong
	code.
	(vect_enhance_data_refs_alignment): Compute peel amount using
	TYPE_VECTOR_SUBPARTS instead of vf.
	* tree-vect-transform.c (vect_gen_niters_for_prolog_loop): Likewise.


Added:
    branches/autovect-branch/gcc/testsuite/gcc.dg/vect/pr31699.c
    branches/autovect-branch/gcc/testsuite/gcc.dg/vect/vect-multitypes-11.c
Modified:
    branches/autovect-branch/gcc/ChangeLog.autovect
    branches/autovect-branch/gcc/testsuite/ChangeLog.autovect
    branches/autovect-branch/gcc/testsuite/gcc.dg/vect/vect-floatint-conversion-1.c
    branches/autovect-branch/gcc/testsuite/gcc.dg/vect/vect-intfloat-conversion-1.c
    branches/autovect-branch/gcc/testsuite/gcc.dg/vect/vect-iv-4.c
    branches/autovect-branch/gcc/testsuite/gcc.dg/vect/vect-multitypes-1.c
    branches/autovect-branch/gcc/testsuite/gcc.dg/vect/vect-multitypes-4.c
    branches/autovect-branch/gcc/testsuite/lib/target-supports.exp
    branches/autovect-branch/gcc/tree-vect-analyze.c
    branches/autovect-branch/gcc/tree-vect-transform.c