This is the mail archive of the
gcc@gcc.gnu.org
mailing list for the GCC project.
Re: Miscompilation of remainder expressions
Gabriel Paubert wrote:
On Mon, Jan 15, 2007 at 10:34:23PM +0200, Michael Veksler wrote:
Once the kernel sees the FP trap (whatever its i386 name is),
it decodes the machine code and finds:
idivl (%ecx).
As far as I remember, this puts the result in two registers,
one for div_res and one for mod_res.
Since MIN_INT/-1 is undefined, the kernel may put MIN_INT
in div_res, and mod_res=1. Then return to the following instruction.
Should I open a request for the kernel?
No, because the instruction has actually two result values:
- the remainder, which you could safely set to zero (not 1!)
My typo, right.
- the quotient, which is affected by the overflow, and there may be
compilers and languages that rely on the exception being generated.
Right, there are languages besides C/C++ and compilers other than GCC,
not to speak of assembly code.
Solution: let GCC generate a redundant prefix, such as ecs. If
ecs eds idivl (%ecx).
or
ecs ess idivl (%ecx)
or
ecs ecs idivl (%ecx)
etc.
generates a trap, only then will the kernel set the remainder to 0
and the quotient to MIN_INT. (Starting with i686, decoding redundant
prefixes costs zero cycles, am I right?)
The kernel cannot know whether you are going to use
the quotient or not by simply decoding the instruction.
It does not matter if quotient is used because in C/C++ its value is
undefined.
With -fwrapv, -MIN_INT/-1 == -MIN_INT*-1 == -MIN_INT.
With -ftrapv, yes, we want the signal (otherwise more instructions are
required to trap on the overflow).
Actually I believe that a%b and a%(-b) always return the same value
if we follow the C99 specification. So if you are only interested
in the remainder, you can simply use a%abs(b) instead of a%b. The
overhead of abs() is really small.
Compared to the slow idivl, the cost of abs could be negligible, right.
However, abs does introduce a new data dependence, which might add a
noticeable cost. My x86 assembly days ended twelve years ago, so I can't
remember: is there an abs instruction in the i386 instruction set?
If not, then adding a conditional branch for: a=a<0?-a:a is not that
cheap. It uses up branch-prediction resources, and if it misses predictions
then branching can be quite slow (depending on e.g. pipeline depth).
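As far as I know, i386 has no dedicated integer abs instruction, but abs does not need a branch either: compilers commonly use a sign-mask sequence (sign-extend, xor, subtract). A minimal C sketch of that idea; branchless_abs is a hypothetical name, and the code assumes that right-shifting a negative int is an arithmetic shift, which is implementation-defined in C but is what GCC does:

```c
#include <limits.h>

/* Hypothetical helper: absolute value without a conditional branch.
   mask is 0 for a >= 0 and -1 (all ones) for a < 0, assuming an
   arithmetic right shift. (a ^ mask) - mask then negates a only
   when it is negative.
   Like abs(), the result for INT_MIN is undefined. */
int branchless_abs(int a)
{
    int mask = a >> (sizeof(int) * CHAR_BIT - 1);
    return (a ^ mask) - mask;
}
```

This removes the branch-prediction concern, but the data dependence on the result remains, so the benchmarking point still applies.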
Such a change has to be benchmarked. People are not going to be
happy if this makes their code regress by 5%.
--
Michael Veksler
http://tx.technion.ac.il/~mveksler