This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

[RPC] [patch] fix PR32893 - forcing alignment >= STACK_BOUNDARY (zlib segfault)


Hi,

There's a function in zlib with a simple loop initializing a local array.
The vectorizer asks the compiler to force 128 bit alignment to this array
(i.e. sets DECL_ALIGN to 128 bit), but sometimes, on some targets (e.g.
i686-linux-gnu), the compiler doesn't respect that, and so we end up
accessing an unaligned address using an aligned memory instruction (and
segfault). It has been reported that several applications that link with
zlib - "firefox/mozilla/thunderbird/seamonkey/xulrunner, rpm (notably
rpm2cpio), openoffice" - suffer from this problem.
This is probably because the STACK_BOUNDARY is not guaranteed to be 128bit
aligned on these systems. The vectorizer however is checking the
PREFERRED_STACK_BOUNDARY instead, and hopes for the best (i.e. we knowingly
do something that may be wrong, but is ok most of the time). This, by the
way, is also what would happen if we ask to align this array in the source
code (e.g. using:
__attribute__ ((__aligned__(16))),
i.e. no alignment tweaks from within the vectorizer. I believe this is what
PR16660 is about?).

There are two possible solutions:

1) just do the conservative right thing in the vectorizer, and check the
STACK_BOUNDARY instead of the PREFERRED_STACK_BOUNDARY. The downside of
course is that we will now never be able to force alignment of local arrays
on most x86 systems, and will have to work with unaligned accesses (using
unaligned-moves, or peeling when possible, or versioning, or not vectorize
at all), which can be *much* less efficient. This amounts to the following
small change:

Index: gcc/tree-vectorizer.c
===================================================================
*** gcc/tree-vectorizer.c       (revision 128976)
--- gcc/tree-vectorizer.c       (working copy)
*************** vect_can_force_dr_alignment_p (const_tre
*** 1606,1617 ****
    if (TREE_STATIC (decl))
      return (alignment <= MAX_OFILE_ALIGNMENT);
    else
!     /* This is not 100% correct.  The absolute correct stack alignment
!        is STACK_BOUNDARY.  We're supposed to hope, but not assume, that
!        PREFERRED_STACK_BOUNDARY is honored by all translation units.
!        However, until someone implements forced stack alignment, SSE
!        isn't really usable without this.  */
!     return (alignment <= PREFERRED_STACK_BOUNDARY);
  }


--- 1606,1612 ----
    if (TREE_STATIC (decl))
      return (alignment <= MAX_OFILE_ALIGNMENT);
    else
!     return (alignment <= STACK_BOUNDARY);
  }


The above patch has been reported to work well and solve the problem:

"FWIW, i've been running GCC-4.2 svn with the patch at
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25413#c17 for a couple months
now
and have built a sizable chunk of our package repository with
-ftree-vectorize
enabled several times over and have yet to run into any trouble whatsoever.
"

It passed bootstrap with vectorization enabled on i386-linux. However, the
other problem with this patch is that it involves making a lot of changes
in the vectorizer testsuite, cause a lot of tests that used to get
vectorized before, now either don't get vectorized on x86, or get
vectorized differently. I started to go over the testsuite and make the
required changes, but it's a pretty tedious work, so I'm attaching what I
have so far as an RFC, cause hopefully there is a different, probably
better, solution that we can consider instead:

2) implement forced stack alignment, which I believe is what Pinksy's patch
supposed to do: http://gcc.gnu.org/ml/gcc-patches/2007-09/msg01409.html ?
(by the way, looks like this patch has been waiting for approval for quite
a while now). Unfortunately, there's something in his patch that still
needs some tweaking cause it doesn't solve the PR as is:

"
Of course after my patch for PR 16660, the patch here should be
changed to just return true always.

Thanks,
Andrew Pinski
----------------------
> I also tested using Andrew's patch from bug #16660 and always returning
true in
> vect_can_force_dr_alignment_p but it does not fix this error.

Andrew, makes sense to you?
----------------------
I think my patch only checks PREFERRED_STACK_BOUNDARY and not
STACK_BOUNDARY
which is why it does not work but I have not looked into it at all.
"

So I think the main question is - Andrew, do you think your patch can be
made to solve this problem?

If not, we'd have to resort to the patch attached below.

The patch is not ready to be committed as is, cause we'd still have a bunch
of failures on i*86-linux:

FAIL: gcc.dg/vect/vect-31.c scan-tree-dump-times Alignment of access forced
using peeling 2
FAIL: gcc.dg/vect/vect-34.c scan-tree-dump-times Vectorizing an unaligned
access 0
FAIL: gcc.dg/vect/vect-36.c scan-tree-dump-times Vectorizing an unaligned
access 0
FAIL: gcc.dg/vect/vect-36.c scan-tree-dump-times Alignment of access forced
using peeling 0
FAIL: gcc.dg/vect/vect-64.c scan-tree-dump-times Alignment of access forced
using peeling 2
FAIL: gcc.dg/vect/vect-65.c scan-tree-dump-times Vectorizing an unaligned
access 0
FAIL: gcc.dg/vect/vect-66.c scan-tree-dump-times Alignment of access forced
using peeling 1
FAIL: gcc.dg/vect/vect-68.c scan-tree-dump-times Alignment of access forced
using peeling 2
FAIL: gcc.dg/vect/vect-72.c scan-tree-dump-times Alignment of access forced
using peeling 0
FAIL: gcc.dg/vect/vect-73.c scan-tree-dump-times Vectorizing an unaligned
access 0
FAIL: gcc.dg/vect/vect-76.c scan-tree-dump-times Vectorizing an unaligned
access 2
FAIL: gcc.dg/vect/vect-77.c scan-tree-dump-times Alignment of access forced
using peeling 0
FAIL: gcc.dg/vect/vect-78.c scan-tree-dump-times Alignment of access forced
using peeling 0
FAIL: gcc.dg/vect/vect-86.c scan-tree-dump-times Alignment of access forced
using peeling 0
FAIL: gcc.dg/vect/vect-all.c scan-tree-dump-times Vectorizing an unaligned
access 0
FAIL: gcc.dg/vect/vect-all.c scan-tree-dump-times Alignment of access
forced using peeling 0
FAIL: gcc.dg/vect/slp-25.c scan-tree-dump-times Alignment of access forced
using peeling 2
FAIL: gcc.dg/vect/wrapv-vect-7.c scan-tree-dump-times Vectorizing an
unaligned access 0
FAIL: gcc.dg/vect/no-scevccp-outer-6.c scan-tree-dump-times OUTER LOOP
VECTORIZED. 1

I am considering to maybe throw away all these alignment checks all
together, cause it's starting to be a pain to maintain them.

thanks,
dorit

ChangeLog:

      * tree-vectorizer.c (vect_can_force_dr_alignment_p): Check
STACK_BOUNDARY
      instead of PREFERRED_STACK_BOUNDARY.

      * testsuite/lib/target-supports.exp (
check_effective_target_unaligned_stack): New.
      * testsuite/gcc.dg/vect/vect-2.c: xfail for unaligned_stack targets.
      * testsuite/gcc.dg/vect/vect-3.c: Likewise.
      * testsuite/gcc.dg/vect/vect-4.c: Likewise.
      * testsuite/gcc.dg/vect/vect-5.c: Likewise.
      * testsuite/gcc.dg/vect/vect-6.c: Likewise.
      * testsuite/gcc.dg/vect/vect-7.c: Likewise.
      * testsuite/gcc.dg/vect/vect-13.c: Likewise.
      * testsuite/gcc.dg/vect/vect-17.c: Likewise.
      * testsuite/gcc.dg/vect/vect-18.c: Likewise.
      * testsuite/gcc.dg/vect/vect-19.c: Likewise.
      * testsuite/gcc.dg/vect/vect-20.c: Likewise.
      * testsuite/gcc.dg/vect/vect-21.c: Likewise.
      * testsuite/gcc.dg/vect/vect-22.c: Likewise.
      * testsuite/gcc.dg/vect/vect-27.c: Likewise.
      * testsuite/gcc.dg/vect/vect-29.c: Likewise.

(See attached file: stackboundary.txt)

Attachment: stackboundary.txt
Description: Text document


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]