This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.


Re: SSE and x86 FP ABI (again)

To: Tim Prince <tprince at computer dot org>
Subject: Re: SSE and x86 FP ABI (again)
From: Drew Hess <dhess at ilm dot com>
Date: Fri, 2 Mar 2001 17:08:45 -0800 (PST)
cc: <gcc at gcc dot gnu dot org>

On Fri, 2 Mar 2001, Tim Prince wrote:

> I thought that linux people had long since accepted the idea of
> supporting alignments; 16-byte alignment support has been needed since
> the Pentium Pro, and 32-byte alignments are needed for performance on
> the current generation architectures.  The compilers I've used didn't
> pass arguments in SSE registers, so there is no problem inter-mixing SSE
> and non-SSE objects.  Are you talking about changing that?  I doubt that
> people running gcc/g77 on Windows, as I do, will be standing in the way
> of SSE, although the alignment problems in gcc-2.95 g++ seem severe.

The only discussion I've seen on the list re: breaking the ABI was about
the loss of precision in SSE versus double-extended x87.

> Now that the major bugs in -ffast-math seem to be out, I'm much happier
> with it.  AFAIK the major implication is the unpredictable behavior of
> comparisons involving NaN.  That hasn't been such a problem for me as it
> was on notorious NaN-generating architectures like R8000.

Even without -ffast-math, x87 FP with gcc has caused us problems with
very small numbers that should round to zero in single precision but are
representable in double-extended.  Depending on the optimization level,
these values may or may not get spilled to memory (and therefore may or
may not get rounded to single precision), so the same code behaves
differently from one build to the next.  Before the SSE modes, we were
forced to use -ffloat-store to get predictable behavior across
optimization levels in these cases.
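
A minimal C sketch of the effect described above.  It does not depend on
the optimizer's spill decisions; instead it makes the two outcomes
explicit, with a long double standing in for a value kept in an x87
register and a volatile float standing in for a spill slot.  The
function names are hypothetical, and the sketch assumes long double is
at least as wide as double.

```c
/* Keep the product in a wide type, as a value held in an x87
 * register would be (assumption: long double is at least as wide
 * as double). */
static long double product_wide(float a, float b)
{
    return (long double)a * (long double)b;
}

/* Round the same product through a single-precision memory slot.
 * The volatile store imitates a register spill, or what
 * -ffloat-store forces for every float variable. */
static float product_spilled(float a, float b)
{
    volatile float slot = (float)product_wide(a, b);
    return slot;
}

/* Returns 1 if the product compares equal to zero while still wide. */
int is_zero_wide(float a, float b)
{
    return product_wide(a, b) == 0.0L;
}

/* Returns 1 if the product compares equal to zero after the spill. */
int is_zero_spilled(float a, float b)
{
    return product_spilled(a, b) == 0.0f;
}
```

With a = b = 1e-30f the product is about 1e-60, which is far below the
smallest single-precision subnormal but easily representable in the
wider type, so is_zero_wide() reports nonzero while is_zero_spilled()
reports zero.  A test such as "if (x * y == 0)" can therefore take
either branch depending on whether the compiler happened to spill.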

> The major Windows compilers are happy with setting 53-bit precision
> mode; then, the behavior must resemble some rs6k implementations; and
> the performance of sqrt() and divide increases.  I myself prefer to see
> it left to default, only to change when the programmer requests it.

Yeah, I think that's fine.

> > The SSE modes give us
> > the best of both worlds.
> >
> It's got a long way to go; the loss of out-of-order with SSE can kill
> the performance when it is applied with conditional branching.

Am I misunderstanding you, or are you saying that SSE instructions are not
executed out-of-order or speculatively (across branches) on Pentium 3
and/or Pentium 4?  That's hard to believe.  At the very least, I can't
find any mention of that behavior in the Pentium 4 optimization guide.

> > I'm just
> > hoping to maintain the status quo.
>
> In which way?  Do you mean to support high performance without
> necessarily using -ffast-math, or continuing not to support more than
> 4-byte alignments in Windows?

I mean that I would like to see the -msse and -msse2 flags implemented as
separate options and not part of -ffast-math.  That's how they're
currently implemented, and I'm hoping it doesn't change.

-dwh-
