[Bug tree-optimization/92819] [10 Regression] Worse code generated on avx2 due to simplify_vector_constructor
rguenth at gcc dot gnu.org
gcc-bugzilla@gcc.gnu.org
Thu Dec 5 11:46:00 GMT 2019
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92819
Richard Biener <rguenth at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rsandifo at gcc dot gnu.org
--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
Richard - when we have
  _7 = { _2, _2 };
  VEC_PERM <x_3(D), _7, { 0, 3 }>
then we somehow run into
  /* See if the permutation is performing a single element
     insert from a CONSTRUCTOR or constant and use a BIT_INSERT_EXPR
     in that case.  But only if the vector mode is supported,
     otherwise this is invalid GIMPLE.  */
  if (TYPE_MODE (type) != BLKmode
      && (TREE_CODE (cop0) == VECTOR_CST
          || TREE_CODE (cop0) == CONSTRUCTOR
          || TREE_CODE (cop1) == VECTOR_CST
          || TREE_CODE (cop1) == CONSTRUCTOR))
    {
      if (sel.series_p (1, 1, nelts + 1, 1))
        {
          /* After canonicalizing the first elt to come from the
             first vector we only can insert the first elt from
             the first vector.  */
          at = 0;
          if ((ins = fold_read_from_vector (cop0, sel[0])))
            op0 = op1;
but of course cop0 (x_3(D) here) isn't something we can simplify. So - why can
we only insert the first elt from the first vector? Is series_p
falsely triggering because both vectors are "series_p"? That said,
we can insert both ways but only one way succeeds in the end.
The code also looks like it would fail for V1m vectors since then
the base element number is out of range. That is, does the above
even make sense for nelts <= 2?
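To make the nelts == 2 concern concrete, here is a small model of the two pieces involved, written as a sketch rather than the GCC code. `model_vec_perm` and `model_series_p` are hypothetical names; the series check only covers fixed-length, fully encoded selectors with `out_base < nelts` and does not attempt the extrapolation that the real `vec_perm_indices::series_p` performs on its encoding.

```c
#include <stdbool.h>
#include <stddef.h>

/* Model of VEC_PERM <op0, op1, sel> for fixed-length vectors of doubles:
   selector values below nelts index op0, the remaining ones index op1.  */
void
model_vec_perm (const double *op0, const double *op1,
                const unsigned *sel, size_t nelts, double *out)
{
  for (size_t i = 0; i < nelts; ++i)
    out[i] = sel[i] < nelts ? op0[sel[i]] : op1[sel[i] - nelts];
}

/* Hypothetical stand-in for series_p: true if
   sel[out_base + k * out_step] == in_base + k * in_step for every such
   index below nelts.  Assumption: an out-of-range out_base is rejected
   here, whereas the real implementation extrapolates from the encoding.  */
bool
model_series_p (const unsigned *sel, size_t nelts,
                size_t out_base, size_t out_step,
                unsigned in_base, unsigned in_step)
{
  if (out_base >= nelts || out_step == 0)
    return false;
  unsigned expected = in_base;
  for (size_t i = out_base; i < nelts; i += out_step)
    {
      if (sel[i] != expected)
        return false;
      expected += in_step;
    }
  return true;
}
```

With nelts == 2 and the selector { 0, 3 }, the query (1, 1, nelts + 1, 1) inspects only element 1, so in this model it holds even though the same selector can equally be read as inserting the last element of the second vector into the first.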
This is
v2df
qux (v2df x, double *p)
{
  return (v2df) { x[0], *p };
}
It works for
v2df
qux (v2df x, double *p)
{
  return (v2df) { *p, x[1] };
}
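For anyone wanting to reproduce this, both variants can go into one self-contained file. The typedef below is an assumption (the usual GNU vector-extension definition of a two-double vector; the comment does not show it), and the names `qux_last` and `qux_first` are made up here so that both functions can coexist.

```c
/* Assumed definition: 16-byte vector of two doubles (GNU vector extension).  */
typedef double v2df __attribute__ ((vector_size (16)));

/* Replaces the last element with *p: the variant reported as regressing.  */
v2df
qux_last (v2df x, double *p)
{
  return (v2df) { x[0], *p };
}

/* Replaces the first element with *p: the variant reported as working.  */
v2df
qux_first (v2df x, double *p)
{
  return (v2df) { *p, x[1] };
}
```

Compiling with something like `-O2 -mavx2 -S` and comparing the code for the two functions should show whether the single-element insert is recognized in both directions.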
When guarding the special case with
  TREE_CODE (cop0) == VECTOR_CST || TREE_CODE (cop0) == CONSTRUCTOR
the later code runs into
bool
vec_perm_indices::series_p (unsigned int out_base, unsigned int out_step,
                            element_type in_base, element_type in_step) const
{
  /* Check the base value.  */
  if (maybe_ne (clamp (m_encoding.elt (out_base)), clamp (in_base)))
    return false;
and thus doesn't handle insertion at the very last element? I can
"fix" that by doing
@@ -6047,9 +6049,11 @@ (define_operator_list COND_TERNARY
           for (at = 0; at < encoded_nelts; ++at)
             if (maybe_ne (sel[at], at))
               break;
-          if (at < encoded_nelts && sel.series_p (at + 1, 1, at + 1, 1))
+          if (at < encoded_nelts
+              && (known_eq (at + 1, nelts)
+                  || sel.series_p (at + 1, 1, at + 1, 1)))
             {
Maybe the earlier series_p query needs to be adjusted similarly? Or do you
think that
@@ -6032,7 +6032,9 @@ (define_operator_list COND_TERNARY
               || TREE_CODE (cop1) == VECTOR_CST
               || TREE_CODE (cop1) == CONSTRUCTOR))
         {
-          if (sel.series_p (1, 1, nelts + 1, 1))
+          if (sel.series_p (1, 1, nelts + 1, 1)
+              && (TREE_CODE (cop0) == VECTOR_CST
+                  || TREE_CODE (cop0) == CONSTRUCTOR))
             {
               /* After canonicalizing the first elt to come from the
                  first vector we only can insert the first elt from
is fine?
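As a reference point for what the general path is meant to accept, here is a sketch of the single-element-insert test over a plain selector array. `model_insert_position` is a hypothetical helper, not GCC code, and it assumes a fixed-length, fully encoded selector.

```c
#include <stddef.h>

/* Return the index of the single element that differs from the identity
   permutation, or -1 if SEL is the identity or differs in more than one
   place.  Assumption: SEL holds all nelts selector values explicitly.  */
long
model_insert_position (const unsigned *sel, size_t nelts)
{
  size_t at = 0;

  /* Skip the leading identity part, as the "for (at = 0; ...)" loop does.  */
  while (at < nelts && sel[at] == at)
    ++at;
  if (at == nelts)
    return -1;

  /* Everything after the insert position must be identity again.  When
     at is the last element this loop checks nothing, which corresponds
     to the known_eq (at + 1, nelts) shortcut in the first patch.  */
  for (size_t i = at + 1; i < nelts; ++i)
    if (sel[i] != i)
      return -1;

  return (long) at;
}
```

In this model the selector { 0, 3 } with nelts == 2 yields position 1, i.e. an insert at the very last element, which is the case the unpatched series_p query is said to miss.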