This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH], Generate XXBR{H,W,D} for bswap{16,32,64} on PowerPC ISA 3.0 (power9)


On Fri, Nov 10, 2017 at 11:45:23AM -0600, Segher Boessenkool wrote:
> Hi Mike,
> 
> On Wed, Nov 08, 2017 at 03:02:30PM -0500, Michael Meissner wrote:
> > > Should this somehow be joined with p9_xxbrd_<mode>?  Or maybe you should
> > > generate that, instead.
> > 
> > No, since p9_xxbrd_<mode> doesn't include DImode.  We could add yet another
> > mode iterator to include DI + V2DI/V2DF modes, but that seems to be overkill.
> 
> Having separate DI and V2DI patterns for the same instruction isn't great
> either.  If it is only this insn it is fine of course, but I suspect we'll
> get many more later.  Well I guess we can solve it later ;-)
> 
> > I simplified it to only change the one insn that would normally match the
> > register to register case to skip if ISA 3.0.  I put a test on bswapdi2_reg as
> > well.
> 
> Thanks.
> 
> > 
> > > > @@ -2507,6 +2513,8 @@ (define_expand "bswapdi2"
> > > >  	emit_insn (gen_bswapdi2_load (dest, src));
> > > >        else if (MEM_P (dest))
> > > >  	emit_insn (gen_bswapdi2_store (dest, src));
> > > > +      else if (TARGET_P9_VECTOR)
> > > > +	emit_insn (gen_bswapdi2_xxbrd (dest, src));
> > > >        else
> > > >  	emit_insn (gen_bswapdi2_reg (dest, src));
> > > >        DONE;
> > > 
> > > Pity that you cannot easily do similar for HI and SI.
> > 
> > Not really.  Bswap64 generates 9 separate insns, while bswap32 and bswap16 only
> > generate 3 insns.  So, having to add the move direct to/from the vector
> > registers would mean it would be slower than the normal case that is currently
> > generated.  But if the value happens to be in a VSX register, then we can do it
> > in one instruction.
> 
> I meant, have just two lines as above :-)
> 
> > After a burn-in period, I plan to backport a reduced version of the patch
> > (XXBRD only) to gcc 7, unless you think it shouldn't go into gcc 7.
> 
> Well, why do we want it on 7?

Advanced customers and the kernel team using pre-production power9 servers have
asked for it.  Since GCC 8 likely won't be out for several months, I figured to
ask for it to go into GCC 7.

> > +(define_insn "bswapdi2_xxbrd"
> > +  [(set (match_operand:DI 0 "gpc_reg_operand" "=wo")
> > +	(bswap:DI (match_operand:DI 1 "gpc_reg_operand" "wo")))]
> > +  "TARGET_POWERPC64 && TARGET_P9_VECTOR"
> > +  "xxbrd %x0,%x1"
> > +  [(set_attr "type" "vecperm")])
> 
> This doesn't need TARGET_POWERPC64 I think.
> 
> Please look at that.  The patch is okay for trunk.  Thanks!

I removed the test for POWERPC64.  It isn't needed, but I don't have any way of
testing whether the resulting code is slower or faster.  I'll attach the patch
of the changes I checked in.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.vnet.ibm.com, phone: +1 (978) 899-4797

Attachment: gcc-power9.patch301c
Description: Text document


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]