This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.



[Bug tree-optimization/56829] Feature request: "generic" builtin to support control flow in vectorized code ("movemask", "vec_any/all_*")


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56829

Peter Cordes <peter at cordes dot ca> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |peter at cordes dot ca

--- Comment #3 from Peter Cordes <peter at cordes dot ca> ---
x86 has packed-compare and movemask instructions, but it also has a PTEST
instruction (SSE4.1) that sets flags directly from the result of a vector op.
In some cases it's more efficient than movemask + test/jcc (esp. if you can make
use of the AND / ANDN ops it does, instead of just testing a vector against
itself).
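
To illustrate the AND / ANDN point, here is a minimal sketch using the SSE4.1
intrinsics for PTEST.  _mm_testz_si128 returns ZF (set when a AND b is all
zero) and _mm_testc_si128 returns CF (set when (NOT a) AND b is all zero).  The
function names are hypothetical, chosen for this example only, and the
target("sse4.1") attribute is a GCC/clang way to avoid needing -msse4.1.

```c
#include <smmintrin.h>  /* SSE4.1: _mm_testz_si128, _mm_testc_si128 */

/* Hypothetical helper: 1 if no bit selected by mask is set in v.
 * PTEST does the AND itself: ZF = ((v & mask) == 0). */
__attribute__((target("sse4.1")))
int none_set_under_mask(const void *v, const void *mask)
{
    __m128i x = _mm_loadu_si128((const __m128i *)v);
    __m128i m = _mm_loadu_si128((const __m128i *)mask);
    return _mm_testz_si128(x, m);
}

/* Hypothetical helper: 1 if every bit selected by mask is set in v.
 * PTEST does the ANDN itself: CF = ((~v & mask) == 0). */
__attribute__((target("sse4.1")))
int all_set_under_mask(const void *v, const void *mask)
{
    __m128i x = _mm_loadu_si128((const __m128i *)v);
    __m128i m = _mm_loadu_si128((const __m128i *)mask);
    return _mm_testc_si128(x, m);
}
```

In both helpers the mask is applied by PTEST, so no separate PAND / PANDN is
needed before the branch.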

I recently wrote an answer on stackoverflow comparing PTEST vs. PCMPEQB /
PMOVMSKB for comparing two vectors for equality.  PTEST gave lower latency, but
equal or fewer uops only in this case, which was ideal for PTEST.

http://stackoverflow.com/a/31198132/224132
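
For readers who don't follow the link, the two approaches look roughly like
this with intrinsics.  This is a sketch written for this comment, not the code
from that answer: the function names are hypothetical, and the PTEST version
assumes the usual PXOR + PTEST idiom (a vector tested against itself is zero
only if the inputs were equal).

```c
#include <emmintrin.h>  /* SSE2: _mm_cmpeq_epi8, _mm_movemask_epi8 */
#include <smmintrin.h>  /* SSE4.1: _mm_testz_si128 */

/* Hypothetical helper: PCMPEQB + PMOVMSKB, then an integer compare.
 * All 16 bytes are equal iff all 16 mask bits are set. */
int equal16_movemask(const void *p, const void *q)
{
    __m128i a = _mm_loadu_si128((const __m128i *)p);
    __m128i b = _mm_loadu_si128((const __m128i *)q);
    __m128i eq = _mm_cmpeq_epi8(a, b);
    return _mm_movemask_epi8(eq) == 0xFFFF;
}

/* Hypothetical helper: PXOR + PTEST.  The XOR is all-zero iff a == b,
 * and PTEST sets ZF directly from (diff & diff) == 0. */
__attribute__((target("sse4.1")))
int equal16_ptest(const void *p, const void *q)
{
    __m128i a = _mm_loadu_si128((const __m128i *)p);
    __m128i b = _mm_loadu_si128((const __m128i *)q);
    __m128i diff = _mm_xor_si128(a, b);
    return _mm_testz_si128(diff, diff);
}
```

Either result can feed a branch; the PTEST version branches on flags set by the
vector instruction itself, with no mask moved into an integer register.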

Just something to keep in mind when designing gcc's arch-agnostic vector
support: at least x86 can branch on a vector PTEST, without needing any
compare / movemask.  Requiring things to be written in terms of a movemask
wouldn't be horrible for x86, though.

