This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.


[Bug tree-optimization/18438] vectorizer failed for vector matrix multiplication

From: "pinskia at gcc dot gnu.org" <gcc-bugzilla at gcc dot gnu dot org>
To: gcc-bugs at gcc dot gnu dot org
Date: Sat, 28 Jan 2017 08:36:31 +0000
Subject: [Bug tree-optimization/18438] vectorizer failed for vector matrix multiplication
Auto-submitted: auto-generated
References: <bug-18438-4@http.gcc.gnu.org/bugzilla/>

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=18438

--- Comment #14 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to Maxim Kuvyrkov from comment #12) 
> You are making an orthogonal point to this bug report: whether or not to
> vectorize such a loop.  But if loop is vectorized, then on any
> microarchitecture it is better to have "st2" vs "umov; st1; str".

Yes, but thinking about the problem some more, I do think there are some vector
cost model issues in the aarch64 backend, where we don't model int vs. floating
point cost differences.  For example, ^ on a scalar int might be one cycle while
the vector form is 4 cycles, whereas floating point scalar addition is 4 cycles
and the floating point vector addition is also just 4 cycles.  The cost table
has only one statement cost each for scalar and vector, regardless of type:

struct cpu_vector_cost
{
  const int scalar_stmt_cost;            /* Cost of any scalar operation,
                                            excluding load and store.  */
...

  const int vec_stmt_cost;               /* Cost of any vector operation,
                                            excluding load, store, permute,
                                            vector-to-scalar and
                                            scalar-to-vector operation.  */
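
To make the int vs. floating point point concrete, here is a minimal sketch of what a split cost table could look like. It is not GCC code: struct split_vector_cost, example_cost and vector_stmt_is_cheaper are hypothetical names made up for illustration, and the numbers are just the example cycle counts from the paragraph above, not measurements for any real core. The comparison is a simplification that ignores loads, stores and permutes.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical cost table, NOT GCC's struct cpu_vector_cost.  It splits
   the single scalar/vector statement cost into int and FP variants.  */
struct split_vector_cost
{
  int scalar_int_stmt_cost;   /* e.g. scalar int ^ */
  int scalar_fp_stmt_cost;    /* e.g. scalar FP add */
  int vec_int_stmt_cost;      /* e.g. vector int ^ */
  int vec_fp_stmt_cost;       /* e.g. vector FP add */
};

/* Illustrative values only, taken from the cycle counts quoted above.  */
const struct split_vector_cost example_cost = { 1, 4, 4, 4 };

/* Return true when one vector statement is cheaper than the VF scalar
   statements it replaces (VF = number of elements per vector).  */
bool
vector_stmt_is_cheaper (const struct split_vector_cost *c, bool is_fp, int vf)
{
  assert (vf > 0);
  int scalar = is_fp ? c->scalar_fp_stmt_cost : c->scalar_int_stmt_cost;
  int vector = is_fp ? c->vec_fp_stmt_cost : c->vec_int_stmt_cost;
  return vector < scalar * vf;
}
```

With these example numbers and VF = 2, the FP case comes out as profitable (4 < 8) while the int case does not (4 is not less than 2). A table with a single scalar_stmt_cost and a single vec_stmt_cost cannot express that difference.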


Anyways I filed PR 79262 for the regression.
