This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] PowerPC PR target/84154, fix floating point to small integer conversion regression


On Mon, Feb 05, 2018 at 08:01:32AM -0600, Segher Boessenkool wrote:
> On Mon, Feb 05, 2018 at 07:54:58AM -0500, Michael Meissner wrote:
> > On Mon, Feb 05, 2018 at 05:57:25AM -0600, Segher Boessenkool wrote:
> > > On Thu, Feb 01, 2018 at 02:31:17PM -0500, Michael Meissner wrote:
> > > > This patch fixes the optimization regression that occurred on GCC 7 where
> > > > conversions from the various floating point types to small integers would at
> > > > times generate a store and a load.
> > > 
> > > [ snip big explanation; thanks for that! ]
> > > 
> > > Could you merge the signed and unsigned patterns, using any_fix?  Or is
> > > there a reason that cannot work (other than that <su> unsigned_fix seems
> > > buggy, it should say "u")?
> > 
> > Well that's the way the instructions are.  For traditional FPR instructions we
> > have FCTIWZ vs. FCTIWUZ, while on the VSX side we have XSCVDPSXWS
> > vs. XSCVDPUXWS.  If you mean the name of the insn, I can change it if desired,
> > but originally it was based on the FPR insn name.
> 
> We have
> 
> (define_code_attr su [(sign_extend      "s")
>                       (zero_extend      "u")
>                       (fix              "s")
>                       (unsigned_fix     "s")
>                       (float            "s")
>                       (unsigned_float   "u")])
> 
> and "s" for unsigned_fix seems like it should be "u".  Very surprising
> otherwise (if this is needed in some case, it should just write "s" there
> instead of "<su>").
> 
> > > Okay for trunk even without that (but please try).  Also okay for 7 after
> > > looking for fallout.
> > 
> > In the past, I have found that combining code iterators with two mode iterators
> > has a bug where it would use the wrong code iterator, so I just avoided doing
> > that.  I'll see if it is still a bug.
> 
> Hrm.  If you have multiple iterators you often need to use ":" syntax,
> and you might want that anyway because the precedence rules are non-obvious;
> but you are hitting something else?  Please open a PR if so :-)

I already do use the : syntax.

I found as long as I avoid putting the <su> or <u> in the output template
(i.e. use an output statement instead of a template) it works.  It only
seems to fail in the IEEE128 case, and not in the SFDF case.  I will submit a
bug report  on it after this gets checked in, as it will be simpler to provide
a patch that people can test.

This patch cleans up all of the {,unsigned_}fix patterns that convert to QImode
or HImode.  It adds the memory combiner to include SImode to the mix.  While I
was at it, I combined the IEEE insns to use any_fix.

I have checked this patch on both little endian and big endian power8 systems,
and I have have hand checked the code for power7 and power9 systems.  There
were no regressions in the test suite, the new tests were verified to run, and
I did bootstrap builds on both systems.  The big endian power8 system also
32-bit tests as well as 64-bit tests.  Can I apply this patch to the trunk and
after a waiting period, apply the patch to the GCC 7 branch?

[gcc]
2018-02-05  Michael Meissner  <meissner@linux.vnet.ibm.com>

	PR target/84154
	* config/rs6000/rs6000.md (fix_trunc<SFDF:mode><QHI:mode>2):
	Convert from define_expand to be define_insn_and_split.  Rework
	float/double/_Float128 conversions to QI/HI/SImode to work with
	both ISA 2.07 (power8) or ISA 3.0 (power9).  Fix regression where
	conversions to QI/HImode types did a store and then a load to
	truncate the value.  For conversions to VSX registers, don't split
	the insn, instead emit the code directly.  Use the code iterator
	any_fix to combine signed and unsigned conversions.
	(fix<uns>_trunc<SFDF:mode>si2_p8): Likewise.
	(fixuns_trunc<SFDF:mode><QHI:mode>2): Likewise.
	(fix_trunc<IEEE128:mode><QHI:mode>2): Likewise.
	(fix<uns>_trunc<SFDF:mode><QHI:mode>2): Likewise.
	(fix_<mode>di2_hw): Likewise.
	(fixuns_<mode>di2_hw): Likewise.
	(fix_<mode>si2_hw): Likewise.
	(fixuns_<mode>si2_hw): Likewise.
	(fix<uns>_<IEEE128:mode><SDI:mode>2_hw): Likewise.
	(fix<uns>_trunc<IEEE128:mode><QHI:mode>2): Likewise.
	(fctiw<u>z_<mode>_smallint): Rename fctiw<u>z_<mode>_smallint to
	fix<uns>_trunc<SFDF:mode>si2_p8.
	(fix_trunc<SFDF:mode><QHI:mode>2_internal): Delete, no longer
	used.
	(fixuns_trunc<SFDF:mode><QHI:mode>2_internal): Likewise.
	(fix<uns>_<mode>_mem): Likewise.
	(fctiw<u>z_<mode>_mem): Likewise.
	(fix<uns>_<mode>_mem): Likewise.
	(fix<uns>_trunc<SFDF:mode><QHSI:mode>2_mem): On ISA 3.0, prevent
	the register allocator from doing a direct move to the GPRs to do
	a store, and instead use the ISA 3.0 store byte/half-word from
	vector register instruction.  For IEEE 128-bit floating point,
	also optimize stores of 32-bit ints.
	(fix<uns>_trunc<IEEE128:mode><QHSI:mode>2_mem): Likewise.

[gcc/testsuite]
2018-02-05  Michael Meissner  <meissner@linux.vnet.ibm.com>

	PR target/84154
	* gcc.target/powerpc/pr84154-1.c: New tests.
	* gcc.target/powerpc/pr84154-2.c: Likewise.
	* gcc.target/powerpc/pr84154-3.c: Likewise.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.vnet.ibm.com, phone: +1 (978) 899-4797

Attachment: pr84154.patch02b
Description: Text document


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]