This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



Re: Optimizing bit extract


On Fri, Feb 14, 2014 at 2:23 PM, Allan Sandfeld Jensen
<linux@carewolf.com> wrote:
> Hello gcc
>
> I have been looking at optimizations of pixel-format conversion recently and
> have noticed that gcc does not take advantage of the SSE4a extrq, BMI1 bextr,
> TBM bextri, or BMI2 pext instructions when they could be useful.
>
> As far as I can tell it should not be that hard. A bextr expression can
> typically be recognized as ((x >> s) & mask) or ((x << s1) >> s2). But I am
> unsure where to do such matching, since the mask needs to have a specific
> form to be valid for bextr, so it seems it needs to be done before
> instruction selection.
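
As a rough illustration of the "specific form" constraint mentioned above: a
shift-and-mask expression only maps onto a single contiguous-field extract
when the mask is a run of ones starting at bit 0. The sketch below is a
portable C model, not the instruction itself and not GCC code. The names
is_low_ones_mask, extract_bits and extract_bits_shifts are hypothetical,
chosen for this example, and a 32-bit operand is assumed.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: nonzero if mask is a run of ones starting at bit 0,
   i.e. mask == (1 << len) - 1 for some len in 0..32. */
static int is_low_ones_mask(uint32_t mask)
{
    return (mask & (mask + 1u)) == 0;
}

/* Software model of a contiguous field extract in the (x >> s) & mask form.
   Assumes start < 32 and 1 <= len <= 32 - start. */
static uint32_t extract_bits(uint32_t x, unsigned start, unsigned len)
{
    assert(start < 32 && len >= 1 && len <= 32 - start);
    uint32_t mask = (len == 32) ? 0xffffffffu : ((1u << len) - 1u);
    return (x >> start) & mask;
}

/* The same field in the (x << s1) >> s2 form, with
   s1 = 32 - start - len and s2 = 32 - len. Same assumptions as above. */
static uint32_t extract_bits_shifts(uint32_t x, unsigned start, unsigned len)
{
    assert(start < 32 && len >= 1 && len <= 32 - start);
    unsigned s1 = 32u - start - len;
    unsigned s2 = 32u - len;
    return (x << s1) >> s2;
}
```

Under these assumptions both forms read the same len bits, so a matcher
could treat them as one pattern once the mask check has passed.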
>
> Secondly, the bextr instruction in itself only replaces two already fast
> instructions, so the gain is very minor (unless extracting variable
> bit-fields, which are harder to recognize).  The real optimization comes
> from being able to use pext (parallel bit extract), which can implement
> several bextr expressions in parallel.
>
> So, where would be the right place to implement such instructions? Would it
> make sense to recognize bextr early, before we get to i386 code, or would it
> be better to recognize it late? And where do I put such instruction
> selection optimizations?
>
> Motivating example:
>
> unsigned rgb32_to_rgb16(unsigned rgb32) {
>         unsigned char red = (rgb32 >> 19) & 0x1f;
>         unsigned char green = (rgb32 >> 10) & 0x3f;
>         unsigned char blue = rgb32  & 0x1f;
>         return (red << 11) | (green << 5) | blue;
> }
>
> can be implemented as pext(rgb32, 0x00f8fc1f)
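
To make the pext claim checkable without BMI2 hardware, here is a minimal
sketch using a portable software model of parallel bit extract. The three
expressions in the example read bits 19-23, 10-15 and 0-4 of the input, so
the mask that selects exactly those bits is 0x00f8fc1f. The names pext_soft
and rgb32_to_rgb16_pext are hypothetical, and a 32-bit unsigned is assumed.

```c
#include <stdint.h>

/* Portable model of parallel bit extract: the bits of src selected by mask
   are packed, in order, into the low bits of the result. */
static uint32_t pext_soft(uint32_t src, uint32_t mask)
{
    uint32_t result = 0;
    unsigned out = 0;                     /* next free result bit */
    for (unsigned bit = 0; bit < 32; bit++) {
        if (mask & (1u << bit)) {
            result |= ((src >> bit) & 1u) << out;
            out++;
        }
    }
    return result;
}

/* The conversion from the message, with unchanged behavior. */
static unsigned rgb32_to_rgb16(unsigned rgb32)
{
    unsigned char red = (rgb32 >> 19) & 0x1f;
    unsigned char green = (rgb32 >> 10) & 0x3f;
    unsigned char blue = rgb32 & 0x1f;
    return (red << 11) | (green << 5) | blue;
}

/* Single-extract version: mask covers bits 19-23, 10-15 and 0-4. */
static unsigned rgb32_to_rgb16_pext(unsigned rgb32)
{
    return pext_soft(rgb32, 0x00f8fc1fu);
}
```

The packing order of pext (lowest selected bit first) is what makes the
blue, green and red fields land at bits 0, 5 and 11 without further shifts.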

We have a special pass that already deals with similar patterns:
the "bswap" pass in tree-ssa-math-opts.c.  It does symbolic
execution to produce the composition of a value.  It currently
handles byte shifts only, I think (not shifts by 19 or 10), but
this is certainly the way I'd recognize pext() (and other generic
shuffles supported by vector ISAs).

You'd have to extend the representation it uses to handle
these more arbitrary shifts/masks of course.
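
As a hedged sketch of what a bit-granular version of that idea could look
like: each result bit records which input bit it comes from, shifts and
masks update that record, and a final check decides whether the whole value
equals pext(input, mask). This is an illustration only and not the
representation the bswap pass uses. struct symbits and every sym_* function
are hypothetical names, a 32-bit value with a single input is assumed, and
the narrowing to unsigned char in the example is not modeled (it drops no
bits there).

```c
#include <assert.h>
#include <stdint.h>

#define NBITS 32
#define ZERO (-1)               /* result bit is known to be 0 */

/* src[i] is the index of the input bit found in result bit i, or ZERO. */
struct symbits { int src[NBITS]; };

static struct symbits sym_input(void)
{
    struct symbits r;
    for (int i = 0; i < NBITS; i++)
        r.src[i] = i;
    return r;
}

static struct symbits sym_shr(struct symbits a, int n)
{
    assert(n >= 0 && n < NBITS);
    struct symbits r;
    for (int i = 0; i < NBITS; i++)
        r.src[i] = (i + n < NBITS) ? a.src[i + n] : ZERO;
    return r;
}

static struct symbits sym_shl(struct symbits a, int n)
{
    assert(n >= 0 && n < NBITS);
    struct symbits r;
    for (int i = 0; i < NBITS; i++)
        r.src[i] = (i >= n) ? a.src[i - n] : ZERO;
    return r;
}

static struct symbits sym_and_const(struct symbits a, uint32_t mask)
{
    for (int i = 0; i < NBITS; i++)
        if (!((mask >> i) & 1u))
            a.src[i] = ZERO;
    return a;
}

/* OR of two values. Returns 0 on success, -1 if both operands may be
   nonzero at the same bit position (not representable in this model). */
static int sym_or(struct symbits a, struct symbits b, struct symbits *out)
{
    for (int i = 0; i < NBITS; i++) {
        if (a.src[i] != ZERO && b.src[i] != ZERO)
            return -1;
        out->src[i] = (a.src[i] != ZERO) ? a.src[i] : b.src[i];
    }
    return 0;
}

/* Returns 1 and stores the mask if the value equals pext(input, mask):
   known bits must be packed from bit 0 with strictly increasing sources. */
static int sym_as_pext(const struct symbits *v, uint32_t *mask)
{
    uint32_t m = 0;
    int prev = -1, ended = 0;
    for (int i = 0; i < NBITS; i++) {
        int s = v->src[i];
        if (s == ZERO) { ended = 1; continue; }
        if (ended || s <= prev)
            return 0;
        m |= 1u << s;
        prev = s;
    }
    *mask = m;
    return 1;
}
```

Feeding the rgb32_to_rgb16 expression through these helpers yields a packed,
increasing bit sequence, which is the property a pext match would rely on.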

Richard.

> Best regards
> Allan Sandfeld

