Bug 116762 - gcc.dg/vect/pr52252-ld.c shows store permutation runs into three vector limit
Status: UNCONFIRMED
Alias: None
Product: gcc
Classification: Unclassified
Component: tree-optimization
Version: 15.0
Importance: P3 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
Blocks: vectorizer
Reported: 2024-09-18 12:56 UTC by Richard Biener
Modified: 2024-09-18 12:57 UTC (History)

Description Richard Biener 2024-09-18 12:56:49 UTC
When non-SLP vectorization is made to fail, gcc.dg/vect/pr52252-ld.c
ends up using single-lane SLP.  That's way better than what GCC 14 does, which
is hybrid SLP, but it might be possible to use a better strategy for lowering

  node 0x4b25cf0 (max_nunits=1, refcnt=1) vector(16) unsigned char
      op: VEC_PERM_EXPR
      { }
      lane permutation { 0[0] 0[1] 0[2] 1[0] }
      children 0x4b25750 0x4b25990

that merges the three-lane and single-lane values.
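
To illustrate why this merge can be awkward, here is a minimal sketch, not
GCC's implementation.  It assumes each child's lanes are laid out
contiguously per scalar iteration, that vectors hold 16 elements as in the
dump, and that the four-lane result is filled in lane order.  The helper
name inputs_needed and the constants are hypothetical.  It counts how many
distinct input vectors feed one output vector, keeping in mind that a single
VEC_PERM_EXPR selects from two input vectors.

```c
#include <assert.h>
#include <string.h>

/* Assumed sizes: 16 elements per vector, 4 result lanes, 2 children.  */
enum { NUNITS = 16, GROUP = 4, NCHILDREN = 2, MAX_VECS = 8 };

/* lane permutation { 0[0] 0[1] 0[2] 1[0] } as (child, lane) pairs.  */
static const int perm_child[GROUP] = { 0, 0, 0, 1 };
static const int perm_lane[GROUP] = { 0, 1, 2, 0 };

/* Child 0 has three lanes, child 1 has a single lane.  */
static const int child_lanes[NCHILDREN] = { 3, 1 };

/* Count the distinct input vectors that feed output vector OUT_VEC.  */
int
inputs_needed (int out_vec)
{
  int seen[NCHILDREN][MAX_VECS];
  int count = 0;

  memset (seen, 0, sizeof seen);
  for (int e = 0; e < NUNITS; e++)
    {
      int idx = out_vec * NUNITS + e;   /* flat output element index */
      int iter = idx / GROUP;           /* scalar iteration */
      int lane = idx % GROUP;           /* lane within the result group */
      int c = perm_child[lane];         /* which child supplies the lane */
      /* Flat element index within child C, then its vector number.  */
      int src = iter * child_lanes[c] + perm_lane[lane];
      int v = src / NUNITS;

      assert (v < MAX_VECS);
      if (!seen[c][v])
        {
          seen[c][v] = 1;
          count++;
        }
    }
  return count;
}
```

Under these assumptions, output vectors 0 and 3 draw from two input vectors
each, while output vectors 1 and 2 each draw from three (two vectors of the
three-lane child plus one of the single-lane child).  That would be
consistent with the three-vector limit named in the bug title.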