This is the mail archive of the mailing list for the GCC project.


[PATCH, tree-optimization]: Fix PR 40550


The problem here is in the generic vectorizer, which produces vector operations in a wider mode than the original mode. In this particular example, where generic v2sf mode addition is disabled due to problems with [f]emms, type_for_widest_vector_mode () returns v4sf mode. Unfortunately, this is not in line with the comment that says:

/* For very wide vectors, try using a smaller vector mode. */

This is violated here, since v4sf mode is obviously wider than v2sf mode. Operating in a wider mode leads to all sorts of problems with uninitialized values (NaNs, subnormals, etc.) and overwritten stack slots.

The solution is to reject all modes that are wider than the original mode.
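
To illustrate the size check described above, here is a minimal sketch in plain C. It is not the code from the patch and does not use GCC's internal types: struct vec_mode, vec_mode_size () and pick_compute_mode () are hypothetical names made up for this example, and a vector mode is modeled simply as an element count and an element size.

```c
#include <stddef.h>

/* Hypothetical stand-in for a vector mode (not a GCC type):
   an element count and the size of one element in bytes.  */
struct vec_mode
{
  const char *name;
  size_t nunits;
  size_t unit_bytes;
};

/* Total size of the vector mode in bytes.  */
static size_t
vec_mode_size (const struct vec_mode *m)
{
  return m->nunits * m->unit_bytes;
}

/* Return CANDIDATE only if it is not wider than ORIG.  A NULL result
   means "do not compute in the candidate mode"; the caller would then
   fall back to some other lowering of the operation.  */
static const struct vec_mode *
pick_compute_mode (const struct vec_mode *orig,
                   const struct vec_mode *candidate)
{
  if (candidate != NULL
      && vec_mode_size (candidate) <= vec_mode_size (orig))
    return candidate;
  return NULL;
}
```

With v2sf modeled as { "v2sf", 2, 4 } and v4sf as { "v4sf", 4, 4 }, pick_compute_mode () rejects v4sf as the compute mode for a v2sf operation, while a v8sf operation may still be computed in v4sf.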

2009-06-28 Uros Bizjak <>

    PR tree-optimization/40550
    * tree-vect-generic.c (expand_vector_operations_1): Compute in
    vector_compute_type only when the size of vector_compute_type is
    less than or equal to the size of type.


2009-06-28 Uros Bizjak <>

    PR tree-optimization/40550
    * gcc.dg/pr40550.c: New test.

The patch was bootstrapped and regression tested on the 4.4 branch, where the testcase breaks for unpatched gcc on i686 and x86_64 targets. The same problem is present on 4.3 and earlier branches. It is also present on mainline, but is masked there by a different stack layout.

I have also checked that v8sf mode still gets decomposed to v4sf mode on SSE targets (FWIW, when mmx_addv2sf3 is renamed to addv2sf3, gcc generates the expected pfadd with -m3dnow).

OK for 4.3, 4.4 and mainline?


Attachment: p.diff.txt
Description: Text document
