[PATCH, tree-optimization]: Fix PR 40550
Richard Guenther
richard.guenther@gmail.com
Sun Jun 28 16:37:00 GMT 2009
2009/6/28 Uros Bizjak <ubizjak@gmail.com>:
> Hello!
>
> The problem here is with the generic vectorizer, which produces vector
> operations in a wider mode than the original mode. In this particular
> example, where generic v2sf mode addition is disabled due to problems with
> [f]emms, type_for_widest_vector_mode () returns v4sf mode. Unfortunately,
> this is not in line with the comment that says:
>
> /* For very wide vectors, try using a smaller vector mode. */
>
> - since v4sf mode is obviously wider than v2sf mode. Operating in a wider
> mode leads to all sorts of problems with uninitialized values (NaNs,
> subnormals, etc.) and overwritten stack slots.
>
> The solution is to reject all modes wider than the original mode.
> +  if (vector_compute_type != NULL_TREE
> +      && (int_size_in_bytes (vector_compute_type)
> +          <= int_size_in_bytes (compute_type)))
It should work to simply compare TYPE_VECTOR_SUBPARTS here.
Ok with that change.
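
For illustration only (not part of the patch): a minimal, self-contained
sketch of why the two checks agree. It assumes that both vector types share
the same element type, which is what makes comparing element counts
equivalent to comparing byte sizes. The struct and the helper names below
are hypothetical stand-ins, not GCC's tree API; only the ideas behind
int_size_in_bytes and TYPE_VECTOR_SUBPARTS are modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a vector type; not GCC's real tree node. */
struct vec_type {
    size_t elem_bytes;   /* size of one element, e.g. 4 for a float */
    unsigned subparts;   /* element count, modelled on TYPE_VECTOR_SUBPARTS */
};

/* Total size in bytes, modelled on int_size_in_bytes. */
static size_t vec_size_in_bytes(struct vec_type t)
{
    return t.elem_bytes * t.subparts;
}

/* Check as posted: the candidate must not be wider in bytes. */
static bool ok_by_size(struct vec_type candidate, struct vec_type original)
{
    return vec_size_in_bytes(candidate) <= vec_size_in_bytes(original);
}

/* Suggested check: compare element counts only.  This relies on the
   assumption that both types have the same element size.  */
static bool ok_by_subparts(struct vec_type candidate, struct vec_type original)
{
    assert(candidate.elem_bytes == original.elem_bytes);
    return candidate.subparts <= original.subparts;
}
```

With v2sf modelled as {4, 2}, v4sf as {4, 4} and v8sf as {4, 8}, both
helpers reject v4sf as a compute type for v2sf and accept it for v8sf.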
Thanks,
Richard.
> 2009-06-28 Uros Bizjak <ubizjak@gmail.com>
>
> PR tree-optimization/40550
> * tree-vect-generic.c (expand_vector_operations_1): Compute in
> vector_compute_type only when the size of vector_compute_type is
> less or equal to the size of type.
>
> testsuite/ChangeLog:
>
> 2009-06-28 Uros Bizjak <ubizjak@gmail.com>
>
> PR tree-optimization/40550
> * gcc.dg/pr40550.c: New test.
>
> The patch was bootstrapped and regression tested on the 4.4 branch, where
> the testcase breaks for unpatched gcc on i686 and x86_64 targets. The same
> problem is present on 4.3 and earlier branches. The problem is also
> present on mainline, but masked by a different stack layout.
>
> I have also checked that v8sf mode still gets decomposed to v4sf mode on
> SSE targets (FWIW, when mmx_addv2sf3 is renamed to addv2sf3, gcc generates
> the expected pfadd with -m3dnow).
>
> OK for 4.3, 4.4 and mainline?
>
> Uros.
>