Bug 96208 - non-power-of-2 group size can be vectorized for 2-element vectors case
Summary: non-power-of-2 group size can be vectorized for 2-element vectors case
Alias: None
Product: gcc
Classification: Unclassified
Component: tree-optimization
Version: 11.0
Importance: P3 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
Keywords: missed-optimization
Depends on:
Blocks: vectorizer
Reported: 2020-07-15 15:01 UTC by Dmitrij Pochepko
Modified: 2020-07-17 06:44 UTC (History)

See Also:
Known to work:
Known to fail:
Last reconfirmed:

initial implementation (2.97 KB, patch)
2020-07-15 15:01 UTC, Dmitrij Pochepko

Description Dmitrij Pochepko 2020-07-15 15:01:36 UTC
Created attachment 48879
initial implementation

The current loop vectorizer only vectorizes loops whose group size is a power of 2 or equal to 3, due to the specifics of the vector permutation generation algorithm.
However, for 2-element vectors a simple permutation scheme can support any group size: insert each vector element into its required position. With only two elements per vector, this takes a reasonable number of operations.
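
To make the idea concrete, here is a minimal scalar sketch of such an insertion-based de-interleave for a group of arbitrary size. It is not taken from the attached patch and does not use GCC internals: the `vec2` type and the `deinterleave_by_insert` function are hypothetical names invented for this illustration, and a 2-lane vector is simply modelled as a struct of two doubles. The input is assumed to hold two consecutive loop iterations of the group, stored contiguously.

```c
#include <assert.h>
#include <stddef.h>

#define LANES 2  /* model of a 2-element vector, e.g. two doubles */

/* Hypothetical stand-in for a 2-lane vector register. */
typedef struct { double lane[LANES]; } vec2;

/* `in` holds `group` vectors, i.e. LANES consecutive iterations of a group
   with `group` members, laid out contiguously in memory order.
   `out[member]` receives that member's value from each iteration.
   Every output lane is filled by one extract and one insert, so the total
   cost is LANES * group element moves, for any `group` (not only powers
   of 2 or 3).  */
static void deinterleave_by_insert(const vec2 *in, vec2 *out, size_t group)
{
    assert(group > 0);
    for (size_t member = 0; member < group; member++) {
        for (size_t iter = 0; iter < LANES; iter++) {
            /* Position of this member in the flattened input. */
            size_t flat = iter * group + member;
            out[member].lane[iter] = in[flat / LANES].lane[flat % LANES];
        }
    }
}
```

For a group of size 5, the input covers flat elements 0 through 9 and output vector `m` ends up holding flat elements `m` and `5 + m`. With wider vectors the number of element moves per output vector grows with the lane count, which is presumably why the report restricts the scheme to the 2-element case.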

An initial version is attached.
Comment 1 Richard Biener 2020-07-17 06:44:14 UTC
Note that the code path you are changing will go away, and "improving" it puts a burden on the replacement implementation ...

The testcase suggests the issue is missing SLP support for the non-grouped
load of *k, something I've been looking at recently.