This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: Patch #1 to add SSE5 support to the x86 GCC compiler
- From: "Uros Bizjak" <ubizjak at gmail dot com>
- To: "GCC Patches" <gcc-patches at gcc dot gnu dot org>
- Cc: "Michael Meissner" <michael dot meissner at amd dot com>, "Jan Hubicka" <jh at suse dot cz>
- Date: Thu, 6 Sep 2007 08:33:00 +0200
- Subject: Re: Patch #1 to add SSE5 support to the x86 GCC compiler
- References: <46D96E87.1090907@gmail.com> <20070905233431.GA8247@mmeissner-gold.amd.com>
On 9/6/07, Michael Meissner <michael.meissner@amd.com> wrote:
> Good Catch. Originally, SSE5 had separate round instructions, and when SSE 4.1
> was announced, we decided to use the same opcode. I missed the precision bit
> in the documentation. I think for rint, we want the precision bit cleared if
> we have trapping math, but for nearbyint we want it set (and make rint the same
> as nearbyint if no trapping math).
[...]
> In looking at the documentation for the instructions and at the C99 standard, I
> think it is safe to remove the tests for trapping math, etc. for TARGET_ROUND
> (obviously they need to be on for the non-TARGET_ROUND case). I also added
> round<mode>2 support which I missed in the first go around.
The problem with round insn is, that it generates only subset of
required FP exceptions - only precision exception, but it doesn't
generate i.e. exception for denormal operands. Due to this, it is
valid only for !flag_trapping_math.
Following the expansion of rint and nearbyint through middle-end, we
find in builtins.c that:
--cut here--
CASE_FLT_FN (BUILT_IN_NEARBYINT):
builtin_optab = nearbyint_optab;
if (flag_trapping_math)
break;
/* Else fallthrough and expand as rint. */
CASE_FLT_FN (BUILT_IN_RINT):
--cut here--
So, considering that for !flag_trapping_math we expand nearbyint as
rint, we don't need to change nearbyint expander (as round insn is
valid only for !flag_trapping_math). So, IMO the best way is to mask P
exception for all round instructions and use them only for
!flag_trapping_math.
(The above code is indeed a bit suspicious - nearbyint() is the
function that _doesn't_ generate inexact exception, so probably rint
should be expanded as nearbyint for !flag_trapping_math).
Regarding round<mode>2 support:
In your patch, roundsX insn is used with SSE_4_1_ROUND_NEAREST_EVEN.
However round() manpage says that:
--cut here--
These functions round x to the nearest integer, but round halfway cases away
from zero (regardless of the current rounding direction), instead of
to the nearest
even integer like rint().
--cut here--
I can't check for new insns, but (similar) x87's frndint can't be used
to directly implement round() function. Are you sure that your round()
expansion is correct?
> I'm enclosing the new patch. Is this acceptable?
I'd suggest for this moment to leave bodies of rounding insns as they
are (please add TARGET_ROUND into insn constraint, but don't introduce
SSE4_1_ROUND constants yet); we will revisit minor open issues later.
For new functionality, I think that Jan is the right person to review,
but it would be nice if you reformat new patterns to fit 80 columns.
BTW: Luckily, options infrastructure was fixed some time ago, so we
have plenty of bits to accomodate new options ;)
Thanks,
Uros.