Bug 71276 - frndint generation should not depend on flag_unsafe_math_optimizations
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: target
Version: 6.0
Importance: P3 enhancement
Target Milestone: 7.0
Assignee: Joseph S. Myers
URL:
Keywords: missed-optimization
Depends on:
Blocks:
 
Reported: 2016-05-25 14:02 UTC by Joseph S. Myers
Modified: 2016-07-03 21:03 UTC
CC List: 0 users

See Also:
Host:
Target: i?86-*-* x86_64-*-*
Build:
Known to work:
Known to fail:
Last reconfirmed: 2016-05-25 00:00:00


Description Joseph S. Myers 2016-05-25 14:02:38 UTC
When using x87 floating point, the x86 back end supports generating inline code sequences using the frndint instruction for the rint / ceil / floor / trunc built-in functions (for SFmode, DFmode and XFmode).  But those are conditioned on flag_unsafe_math_optimizations.

There is no need for them to be conditioned on flag_unsafe_math_optimizations.  For rint, frndint is fully correct.  For the others, the only issue is that it raises "inexact" for non-integer operands, whereas TS 18661-1 specifies that these functions should not raise "inexact".  But:

(a) We don't have any options to select TS 18661-1 requirements, and C99 and C11 leave it unspecified whether "inexact" is raised, so raising it is OK for currently supported standards.

(b) Even with TS 18661-1 requirements, it would be OK to use this instruction if !flag_trapping_math.

(c) The documentation of the .md file patterns for ceil / floor / trunc says nothing about whether "inexact" is raised or not.

So these inlines should be enabled whenever x87 floating point is in use (maybe subject to code size considerations; you'd need to check how long the sequences setting / restoring the rounding mode are compared to a call).  And future TS 18661-1 support could disable those for ceil / floor / trunc if flag_ts_18661_1 && flag_trapping_math.
Comment 1 Joseph S. Myers 2016-05-25 21:24:36 UTC
Testing a patch.
Comment 2 Joseph S. Myers 2016-06-03 15:49:36 UTC
Author: jsm28
Date: Fri Jun  3 15:49:04 2016
New Revision: 237074

URL: https://gcc.gnu.org/viewcvs?rev=237074&root=gcc&view=rev
Log:
Add option for whether ceil etc. can raise "inexact", adjust x86 conditions.

In ISO C99/C11, the ceil, floor, round and trunc functions may or may
not raise the "inexact" exception for noninteger arguments.  Under TS
18661-1:2014, the C bindings for IEEE 754-2008, these functions are
prohibited from raising "inexact", in line with the general rule that
"inexact" is only raised when the mathematical infinite precision
result of a function differs from the result after rounding to the
target type.

GCC has no option to select TS 18661 requirements for not raising
"inexact" when expanding built-in versions of these functions inline.
Furthermore, even given such requirements, the conditions on the x86
insn patterns for these functions are unnecessarily restrictive.  I'd
like to make the out-of-line glibc versions follow the TS 18661
requirements; in the cases where this slows them down (the cases using
x87 floating point), that makes it more important for inline versions
to be used when the user does not care about "inexact".

This patch fixes these issues.  A new option
-fno-fp-int-builtin-inexact is added to request TS 18661 rules for
these functions; the default -ffp-int-builtin-inexact reflects that
such exceptions are allowed by C99 and C11.  (The intention is that if
C2x incorporates TS 18661-1, then the default would change in C2x
mode.)

The x86 built-ins for rint (x87, SSE2 and SSE4.1) are made
unconditionally available (no longer depending on
-funsafe-math-optimizations or -fno-trapping-math); "inexact" is
correct for noninteger arguments to rint.  For floor, ceil and trunc,
the x87 and SSE2 built-ins are OK if -ffp-int-builtin-inexact or
-fno-trapping-math (they may raise "inexact" for noninteger
arguments); the SSE4.1 built-ins are made to use ROUND_NO_EXC so that
they do not raise "inexact" and so are OK unconditionally.
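
The conditions in the paragraph above can be summarized as a small predicate.  This is an illustrative model only, not code from i386.md: the enum names and may_expand_inline are hypothetical, and the real patterns also depend on other target conditions that are ignored here.

```c
#include <stdbool.h>

/* Hypothetical labels for the three x86 code-generation variants and
   the four functions discussed in the commit message.  */
enum fp_unit { UNIT_X87, UNIT_SSE2, UNIT_SSE4_1 };
enum fp_func { FUNC_RINT, FUNC_FLOOR, FUNC_CEIL, FUNC_TRUNC };

/* Model of when an inline expansion is acceptable with respect to
   "inexact".  fp_int_builtin_inexact mirrors -ffp-int-builtin-inexact
   and trapping_math mirrors -ftrapping-math.  */
bool
may_expand_inline (enum fp_func func, enum fp_unit unit,
		   bool fp_int_builtin_inexact, bool trapping_math)
{
  /* "inexact" is correct for noninteger arguments to rint.  */
  if (func == FUNC_RINT)
    return true;

  /* SSE4.1 rounding with ROUND_NO_EXC does not raise "inexact".  */
  if (unit == UNIT_SSE4_1)
    return true;

  /* x87 and SSE2 sequences may raise "inexact", so they need either
     permission to raise it or no concern about exception flags.  */
  return fp_int_builtin_inexact || !trapping_math;
}
```

For example, may_expand_inline (FUNC_FLOOR, UNIT_X87, false, true) models -fno-fp-int-builtin-inexact with the default -ftrapping-math and yields false, while the same query for FUNC_RINT yields true.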

Now, while there was no semantic reason for depending on
-funsafe-math-optimizations, the insn patterns had such a dependence
because of use of gen_truncxf<mode>2_i387_noop to truncate back to
SFmode or DFmode after using frndint in XFmode.  In this case a no-op
truncation is safe because rounding to integer always produces an
exactly representable value (the same reason why IEEE semantics say it
shouldn't produce "inexact") - but of course that insn pattern isn't
safe because it would also match cases where the truncation is not in
fact a no-op.  To allow frndint to be used for SFmode and DFmode
without that unsafe pattern, the relevant frndint patterns are
extended to SFmode and DFmode or new SFmode and DFmode patterns added,
so that the frndint operation can be represented in RTL as an
operation acting directly on SFmode or DFmode without the extension
and the problematic truncation.

A generic test of the new option is added, as well as x86-specific
tests, both execution tests including the generic test with different
x86 options and scan-assembler tests verifying that functions that
should be inlined with different options are indeed inlined.
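
As a rough idea of what a scan-assembler test of this kind looks like, here is a hypothetical sketch.  It is not the content of any of the tests added by this commit; the option set and the function name inline_rint are assumptions chosen for illustration.

```c
/* Hypothetical sketch of an x87 inlining test; not one of the tests
   added by this commit.  */
/* { dg-do compile } */
/* { dg-options "-O2 -mfancy-math-387 -mfpmath=387" } */

double
inline_rint (double x)
{
  /* With x87 math selected, rint is expected to expand to frndint
     instead of a library call.  */
  return __builtin_rint (x);
}

/* { dg-final { scan-assembler "frndint" } } */
```

Outside the testsuite the dg- comments are inert, so the file also compiles as ordinary C; in that case rint may be a library call and -lm is needed.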

I think other architectures are OK for TS 18661-1 semantics already.
Considering those defining "ceil" patterns: aarch64, arm, rs6000, s390
use instructions that do not raise "inexact"; nvptx does not support
floating-point exceptions.  (This does mean the -f option in fact only
affects one architecture, but I think it should still be a -f option;
it's logically architecture-independent and is expected to be affected
by future -std options, so is similar to e.g. -fexcess-precision=,
which also does nothing on most architectures but is implied by -std
options.)

Bootstrapped with no regressions on x86_64-pc-linux-gnu.  OK to
commit?

	PR target/71276
	PR target/71277
gcc:
	* common.opt (ffp-int-builtin-inexact): New option.
	* doc/invoke.texi (-fno-fp-int-builtin-inexact): Document.
	* doc/md.texi (floor@var{m}2, btrunc@var{m}2, round@var{m}2)
	(ceil@var{m}2): Document dependence on this option.
	* ipa-inline-transform.c (inline_call): Handle
	flag_fp_int_builtin_inexact.
	* ipa-inline.c (can_inline_edge_p): Likewise.
	* config/i386/i386.md (rintxf2): Do not test
	flag_unsafe_math_optimizations.
	(rint<mode>2_frndint): New define_insn.
	(rint<mode>2): Do not test flag_unsafe_math_optimizations for 387
	or !flag_trapping_math for SSE.  Just use gen_rint<mode>2_frndint
	for 387 instead of extending and truncating.
	(frndintxf2_<rounding>): Test flag_fp_int_builtin_inexact ||
	!flag_trapping_math instead of flag_unsafe_math_optimizations.
	Change to frndint<mode>2_<rounding>.
	(frndintxf2_<rounding>_i387): Likewise.  Change to
	frndint<mode>2_<rounding>_i387.
	(<rounding_insn>xf2): Likewise.
	(<rounding_insn><mode>2): Test flag_fp_int_builtin_inexact ||
	!flag_trapping_math instead of flag_unsafe_math_optimizations for
	x87.  Test TARGET_ROUND || !flag_trapping_math ||
	flag_fp_int_builtin_inexact instead of !flag_trapping_math for
	SSE.  Use ROUND_NO_EXC in constant operand of
	gen_sse4_1_round<mode>2.  Just use gen_frndint<mode>2_<rounding>
	for 387 instead of extending and truncating.

gcc/testsuite:
	* gcc.dg/torture/builtin-fp-int-inexact.c,
	gcc.target/i386/387-builtin-fp-int-inexact.c,
	gcc.target/i386/387-rint-inline-1.c,
	gcc.target/i386/387-rint-inline-2.c,
	gcc.target/i386/sse2-builtin-fp-int-inexact.c,
	gcc.target/i386/sse2-rint-inline-1.c,
	gcc.target/i386/sse2-rint-inline-2.c,
	gcc.target/i386/sse4_1-builtin-fp-int-inexact.c,
	gcc.target/i386/sse4_1-rint-inline.c: New tests.

Added:
    trunk/gcc/testsuite/gcc.dg/torture/builtin-fp-int-inexact.c
    trunk/gcc/testsuite/gcc.target/i386/387-builtin-fp-int-inexact.c
    trunk/gcc/testsuite/gcc.target/i386/387-rint-inline-1.c
    trunk/gcc/testsuite/gcc.target/i386/387-rint-inline-2.c
    trunk/gcc/testsuite/gcc.target/i386/sse2-builtin-fp-int-inexact.c
    trunk/gcc/testsuite/gcc.target/i386/sse2-rint-inline-1.c
    trunk/gcc/testsuite/gcc.target/i386/sse2-rint-inline-2.c
    trunk/gcc/testsuite/gcc.target/i386/sse4_1-builtin-fp-int-inexact.c
    trunk/gcc/testsuite/gcc.target/i386/sse4_1-rint-inline.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/common.opt
    trunk/gcc/config/i386/i386.md
    trunk/gcc/doc/invoke.texi
    trunk/gcc/doc/md.texi
    trunk/gcc/ipa-inline-transform.c
    trunk/gcc/ipa-inline.c
    trunk/gcc/testsuite/ChangeLog
Comment 3 Joseph S. Myers 2016-06-03 15:50:51 UTC
Fixed for GCC 7.