This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Re: PATCH for loop.c, SIGFPE with bad integer operands on host (linux-)ix86

To: egcs-patches at egcs dot cygnus dot com
Subject: Re: PATCH for loop.c, SIGFPE with bad integer operands on host (linux-)ix86
From: Geoff Keating <geoffk at ozemail dot com dot au>
Date: Sun, 8 Aug 1999 18:51:42 +1000
References: <Pine.LNX.4.10.9908070909380.1330-100000@penguin.transmeta.com>

Linus Torvalds <torvalds@transmeta.com> writes:

>    For example, if you really _are_ concerned about divide performance,
>    you would sure as hell not want to trap anyway. Either you know that
>    it's not going to trap (so you don't have to slow your code down by
>    testing), or you're concerned about it trapping so you might as well
>    add the two cycles in order to avoid the 1000+ cycles for the trap
>    overhead.

This is the wrong comparison.

The actual average trap overhead is 2^-32 * 1000+ cycles, or about
2^-20 cycles, assuming evenly distributed values.  So the two cycles
of comparison are about a million times more expensive than the trap.

[Actually, now that I think about it, it's possibly more like 2^-64 *
1000+, which is even worse.]
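
To put rough numbers on this, here is a minimal sketch in C of the
arithmetic.  The function names are made up for illustration, and the
cycle counts are the figures quoted above rather than measurements.
It assumes the trapping operand combination turns up with probability
2^-bits, i.e. evenly distributed values.

```c
#include <assert.h>

/* Average cost, in cycles per divide, of the trap itself, assuming the
   trapping operands occur with probability 2^-operand_bits. */
double expected_trap_cycles(double trap_cycles, int operand_bits)
{
    double probability = 1.0;
    int i;

    assert(operand_bits >= 0);
    for (i = 0; i < operand_bits; i++)
        probability /= 2.0;             /* builds 2^-operand_bits */
    return trap_cycles * probability;
}

/* How many times more the always-paid check costs than the trap does
   on average. */
double check_to_trap_ratio(double check_cycles, double trap_cycles,
                           int operand_bits)
{
    return check_cycles / expected_trap_cycles(trap_cycles, operand_bits);
}
```

With a trap of exactly 1000 cycles, expected_trap_cycles(1000.0, 32)
is about 2.3e-7 cycles (roughly 2^-22), and
check_to_trap_ratio(2.0, 1000.0, 32) is about 8.6 million.  A trap of
4096 cycles gives exactly 2^-20 cycles and a ratio of 2^21.  Either
way the check comes out millions of times more expensive.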

>    For the patch to gcc in question, I don't think the patch makes _any_
>    performance difference what-so-ever, and the patch will make gcc use
>    standards-defined behaviour. In short, it's certainly the right thing
>    to do to change gcc rather than the kernel.

Well, I don't oppose having gcc produce an error when a division overflows.
This probably indicates a bug in the original program anyway.

>  - It can look horribly bad on benchmarks
> 
>    Let's say that you were somebody trying to concoct a benchmark showing
>    how bad the competition was. It's been done before, with software
>    fixups on IEEE behaviour, for example.
> 
>    So you choose a problem set that traps all the time on one
>    architecture, and the other architecture just does a simple test and
>    does the divide in 2 cycles.

Sure.  Now, suppose someone had done that to you, and you wanted to
produce a benchmark that showed the opposite.  Well, you just take any
_real_ data set, which spends most of its time doing divides but
doesn't spend much of its time dividing 0x80000000 by -1, and run
that, and show the 10% performance increase.  Then you say "look, you
can't trust the competition's benchmarks, we optimise _our_ system for
_real_ data sets, and _real_ applications".  Then you say "but, even
if you wanted to run their data set, all you have to do is add this one
line of code, and you get exactly the same performance as they do."
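
The "one line of code" isn't spelled out here.  As a sketch only, a
guard of that kind might look like the following in C.  The function
name is hypothetical, and returning the wrapped value for
0x80000000 / -1 is an assumption made for the example, not something
any standard requires.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: signed 32-bit divide that never performs the
   overflowing 0x80000000 / -1 division.
   Precondition: d != 0 (division by zero is a separate error). */
int32_t div_no_overflow(int32_t n, int32_t d)
{
    assert(d != 0);
    if (d == -1)        /* the extra compare */
        /* Assumed policy: wrap, so INT32_MIN / -1 gives INT32_MIN. */
        return n == INT32_MIN ? INT32_MIN : -n;
    return n / d;
}
```

Code that already knows its divisor is never -1 can keep using the
plain divide and pay nothing.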

The benchmarks you quoted before were, I believe, comparing different
architectures.  No-one would ever think of, for instance, compiling FP
code on an Alpha with explicit tests around every FP operation to
check for denormal values and emulate them.

>  - it's hard as hell.

The remaining comments are more-or-less saying that writing
instruction emulation handlers is hard, which is generally true, and
that therefore it shouldn't be done, which doesn't follow.  x86
processors have lots of special problems with their trap handling,
and working around them is just one of the things that makes writing
operating systems so much fun :-).

> This is why I really think that if you feel strongly about divide overflow
> not trapping, you should do it in user mode with a signal handler. In user
> mode you (a) do not have any of the security implications (trivially
> proven: the signal handler does not have any special privileges) and (b)
> user mode CAN validly know about what mode it was executing in, so a pure
> gcc-compiled binary doesn't have to even consider the non-flat modes.

I would suggest that it's probably a better idea to simply report it as
an error.

One good reason for this is that for 0x80000000 / -1, I get the
following results:

PowerPC 601:	0x80000000
PowerPC 750:	0xffffffff
Ultrasparc:	0x7fffffff

and although my opinion is that 0xffffffff was probably not a good
choice, either of the other two is perfectly reasonable.  It would
be annoying to be cross-compiling between two PowerPC machines and get
different executables.
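
For the compiler side, reporting the error could be sketched roughly
like this.  This is not the code in loop.c or anywhere else in gcc;
the enum and function names are invented for the example.  The point
is only that the folder checks the operands before dividing, so it
neither traps on the host nor bakes a host-specific answer into the
output.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical status codes for a constant folder. */
enum fold_status { FOLD_OK, FOLD_DIV_BY_ZERO, FOLD_OVERFLOW };

/* Fold n / d at compile time only when the result is well defined.
   On failure *result is left untouched and the caller can issue a
   diagnostic instead of picking one machine's answer. */
enum fold_status fold_sdiv32(int32_t n, int32_t d, int32_t *result)
{
    assert(result != NULL);
    if (d == 0)
        return FOLD_DIV_BY_ZERO;
    if (n == INT32_MIN && d == -1)
        return FOLD_OVERFLOW;       /* the 0x80000000 / -1 case */
    *result = n / d;                /* safe: cannot trap here */
    return FOLD_OK;
}
```

A caller would fold only on FOLD_OK and otherwise report the
expression, which gives the same result on every host.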

-- 
Geoffrey Keating <geoffk@cygnus.com>

References:
- Re: PATCH for loop.c, SIGFPE with bad integer operands on host (linux-)ix86
  - From: Linus Torvalds
