This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

ping [RFC] [patch] fix PR32893 - forcing alignment >= STACK_BOUNDARY


So I'm not sure how to proceed with PR32893 (see details and proposed patch
below).

I didn't yet get a response from Pinsky on whether his fix for PR16660
would also fix this PR,
and recently H.J. marked this PR as dependent on the newly opened PR33721.
It seems like there's a more general issue here that needs to be solved,
and the simple fix I suggested below in the vectorizer is only a workaround
(that would also involve a lot of testcase changes). I believe (hope...)
that a fix to either PR16660 or PR33721 would also fix this PR, but I'm not
sure what's going on with these PRs (what are the prospects of either of
these PRs to get resolved in the near future?). In the meantime we are
generating wrong code in the vectorizer...

thanks,
dorit

----- Forwarded by Dorit Nuzman/Haifa/IBM on 21/10/2007 21:44 -----

>
> Hi,
>
> There's a function in zlib with a simple loop initializing a local array.
> The vectorizer asks the compiler to force 128 bit alignment to this array
> (i.e. sets DECL_ALIGN to 128 bit), but sometimes, on some targets (e.g.
> i686-linux-gnu), the compiler doesn't respect that, and so we end up
> accessing an unaligned address using an aligned memory instruction (and
> segfault). It has been reported that several applications that link with
> zlib - "firefox/mozilla/thunderbird/seamonkey/xulrunner, rpm (notably
> rpm2cpio), openoffice" - suffer from this problem.
> This is probably because the STACK_BOUNDARY is not guaranteed to be
128bit
> aligned on these systems. The vectorizer however is checking the
> PREFERRED_STACK_BOUNDARY instead, and hopes for the best (i.e. we
knowingly
> do something that may be wrong, but is ok most of the time). This, by the
> way, is also what would happen if we ask to align this array in the
source
> code (e.g. using:
> __attribute__ ((__aligned__(16))),
> i.e. no alignment tweaks from within the vectorizer. I believe this is
what
> PR16660 is about?).
>
> There are two possible solutions:
>
> 1) just do the conservative right thing in the vectorizer, and check the
> STACK_BOUNDARY instead of the PREFERRED_STACK_BOUNDARY. The downside of
> course is that we will now never be able to force alignment of local
arrays
> on most x86 systems, and will have to work with unaligned accesses (using
> unaligned-moves, or peeling when possible, or versioning, or not
vectorize
> at all), which can be *much* less efficient. This amounts to the
following
> small change:
>
> Index: gcc/tree-vectorizer.c
> ===================================================================
> *** gcc/tree-vectorizer.c       (revision 128976)
> --- gcc/tree-vectorizer.c       (working copy)
> *************** vect_can_force_dr_alignment_p (const_tre
> *** 1606,1617 ****
>     if (TREE_STATIC (decl))
>       return (alignment <= MAX_OFILE_ALIGNMENT);
>     else
> !     /* This is not 100% correct.  The absolute correct stack alignment
> !        is STACK_BOUNDARY.  We're supposed to hope, but not assume, that
> !        PREFERRED_STACK_BOUNDARY is honored by all translation units.
> !        However, until someone implements forced stack alignment, SSE
> !        isn't really usable without this.  */
> !     return (alignment <= PREFERRED_STACK_BOUNDARY);
>   }
>
>
> --- 1606,1612 ----
>     if (TREE_STATIC (decl))
>       return (alignment <= MAX_OFILE_ALIGNMENT);
>     else
> !     return (alignment <= STACK_BOUNDARY);
>   }
>
>
> The above patch has been reported to work well and solve the problem:
>
> "FWIW, i've been running GCC-4.2 svn with the patch at
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25413#c17 for a couple months
> now
> and have built a sizable chunk of our package repository with
> -ftree-vectorize
> enabled several times over and have yet to run into any trouble
whatsoever.
> "
>
> It passed bootstrap with vectorization enabled on i386-linux. However,
the
> other problem with this patch is that it involves making a lot of changes
> in the vectorizer testsuite, cause a lot of tests that used to get
> vectorized before, now either don't get vectorized on x86, or get
> vectorized differently. I started to go over the testsuite and make the
> required changes, but it's a pretty tedious work, so I'm attaching what I
> have so far as an RFC, cause hopefully there is a different, probably
> better, solution that we can consider instead:
>
> 2) implement forced stack alignment, which I believe is what Pinksy's
patch
> supposed to do: http://gcc.gnu.org/ml/gcc-patches/2007-09/msg01409.html ?
> (by the way, looks like this patch has been waiting for approval for
quite
> a while now). Unfortunately, there's something in his patch that still
> needs some tweaking cause it doesn't solve the PR as is:
>
> "
> Of course after my patch for PR 16660, the patch here should be
> changed to just return true always.
>
> Thanks,
> Andrew Pinski
> ----------------------
> > I also tested using Andrew's patch from bug #16660 and always returning
> true in
> > vect_can_force_dr_alignment_p but it does not fix this error.
>
> Andrew, makes sense to you?
> ----------------------
> I think my patch only checks PREFERRED_STACK_BOUNDARY and not
> STACK_BOUNDARY
> which is why it does not work but I have not looked into it at all.
> "
>
> So I think the main question is - Andrew, do you think your patch can be
> made to solve this problem?
>
> If not, we'd have to resort to the patch attached below.
>
> The patch is not ready to be committed as is, cause we'd still have a
bunch
> of failures on i*86-linux:
>
> FAIL: gcc.dg/vect/vect-31.c scan-tree-dump-times Alignment of access
forced
> using peeling 2
> FAIL: gcc.dg/vect/vect-34.c scan-tree-dump-times Vectorizing an unaligned
> access 0
> FAIL: gcc.dg/vect/vect-36.c scan-tree-dump-times Vectorizing an unaligned
> access 0
> FAIL: gcc.dg/vect/vect-36.c scan-tree-dump-times Alignment of access
forced
> using peeling 0
> FAIL: gcc.dg/vect/vect-64.c scan-tree-dump-times Alignment of access
forced
> using peeling 2
> FAIL: gcc.dg/vect/vect-65.c scan-tree-dump-times Vectorizing an unaligned
> access 0
> FAIL: gcc.dg/vect/vect-66.c scan-tree-dump-times Alignment of access
forced
> using peeling 1
> FAIL: gcc.dg/vect/vect-68.c scan-tree-dump-times Alignment of access
forced
> using peeling 2
> FAIL: gcc.dg/vect/vect-72.c scan-tree-dump-times Alignment of access
forced
> using peeling 0
> FAIL: gcc.dg/vect/vect-73.c scan-tree-dump-times Vectorizing an unaligned
> access 0
> FAIL: gcc.dg/vect/vect-76.c scan-tree-dump-times Vectorizing an unaligned
> access 2
> FAIL: gcc.dg/vect/vect-77.c scan-tree-dump-times Alignment of access
forced
> using peeling 0
> FAIL: gcc.dg/vect/vect-78.c scan-tree-dump-times Alignment of access
forced
> using peeling 0
> FAIL: gcc.dg/vect/vect-86.c scan-tree-dump-times Alignment of access
forced
> using peeling 0
> FAIL: gcc.dg/vect/vect-all.c scan-tree-dump-times Vectorizing an
unaligned
> access 0
> FAIL: gcc.dg/vect/vect-all.c scan-tree-dump-times Alignment of access
> forced using peeling 0
> FAIL: gcc.dg/vect/slp-25.c scan-tree-dump-times Alignment of access
forced
> using peeling 2
> FAIL: gcc.dg/vect/wrapv-vect-7.c scan-tree-dump-times Vectorizing an
> unaligned access 0
> FAIL: gcc.dg/vect/no-scevccp-outer-6.c scan-tree-dump-times OUTER LOOP
> VECTORIZED. 1
>
> I am considering to maybe throw away all these alignment checks all
> together, cause it's starting to be a pain to maintain them.
>
> thanks,
> dorit
>
> ChangeLog:
>
>       * tree-vectorizer.c (vect_can_force_dr_alignment_p): Check
> STACK_BOUNDARY
>       instead of PREFERRED_STACK_BOUNDARY.
>
>       * testsuite/lib/target-supports.exp (
> check_effective_target_unaligned_stack): New.
>       * testsuite/gcc.dg/vect/vect-2.c: xfail for unaligned_stack
targets.
>       * testsuite/gcc.dg/vect/vect-3.c: Likewise.
>       * testsuite/gcc.dg/vect/vect-4.c: Likewise.
>       * testsuite/gcc.dg/vect/vect-5.c: Likewise.
>       * testsuite/gcc.dg/vect/vect-6.c: Likewise.
>       * testsuite/gcc.dg/vect/vect-7.c: Likewise.
>       * testsuite/gcc.dg/vect/vect-13.c: Likewise.
>       * testsuite/gcc.dg/vect/vect-17.c: Likewise.
>       * testsuite/gcc.dg/vect/vect-18.c: Likewise.
>       * testsuite/gcc.dg/vect/vect-19.c: Likewise.
>       * testsuite/gcc.dg/vect/vect-20.c: Likewise.
>       * testsuite/gcc.dg/vect/vect-21.c: Likewise.
>       * testsuite/gcc.dg/vect/vect-22.c: Likewise.
>       * testsuite/gcc.dg/vect/vect-27.c: Likewise.
>       * testsuite/gcc.dg/vect/vect-29.c: Likewise.
>
> (See attached file: stackboundary.txt)[attachment "stackboundary.
> txt" deleted by Dorit Nuzman/Haifa/IBM]


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]