28634 – rounding problem with -fdelayed-branch on hppa/mips

Summary: rounding problem with -fdelayed-branch on hppa/mips

Status:	RESOLVED FIXED

Alias:	None

Product:	gcc
Classification:	Unclassified
Component:	rtl-optimization
Version:	4.1.1

Importance:	P3 normal
Target Milestone:	4.1.2
Assignee:	Richard Sandiford

URL:
Keywords:	wrong-code

Depends on:
Blocks:

Reported:	2006-08-07 12:52 UTC by Martin Michlmayr
Modified:	2006-09-09 11:01 UTC
CC List:	5 users

See Also:
Host:
Target:	hppa-linux-gnu, mips-linux-gnu
Build:
Known to work:	4.0.3 4.1.2 4.2.0
Known to fail:	4.1.1
Last reconfirmed:	2006-08-13 08:34:50

Description Martin Michlmayr 2006-08-07 12:52:33 UTC

[ forwarded from http://bugs.debian.org/381710 ]

The following report has been submitted by Kurt Roeckx:

I've been looking at the perl testsuite failure on hppa.  See
http://bugs.debian.org/374396

This code:
        while (cdouble < 0.0)
                cdouble += adouble;

When compiled by gcc-4.1 with -O2 and -fdelayed-branch, it gives:
        fadd,dbl %fr13,%fr22,%fr13
 .L1447:
        fcmp,dbl,!< %fr13,%fr0
        ftest
        b .L1447
        fadd,dbl %fr13,%fr22,%fr13
        fsub,dbl %fr13,%fr22,%fr13

With -O2 and -fno-delayed-branch:
.L1239:
        fadd,dbl %fr13,%fr22,%fr13
        fcmp,dbl,!< %fr13,%fr0
        ftest
        b,n .L1239

As you can see, with delayed branches it always executes an
fadd at the start and an fsub at the end, which it doesn't do
without delayed branches.

This causes unwanted rounding problems, since the mantissa
doesn't have enough bits to keep the required information.
I think that, at least in this case, it's not a good idea to do
this optimization with floating point numbers.

The same code on gcc-4.0 with -fdelayed-branch seems to generate
this code:
.L661:
        fadd,dbl %fr12,%fr22,%fr12
        fcmp,dbl,!< %fr12,%fr0
        ftest
        b .L661
        ldo -256(%r30),%r20

With -fno-delayed-branch:
.L643:
        fadd,dbl %fr12,%fr22,%fr12
        fcmp,dbl,!< %fr12,%fr0
        ftest
        b,n .L643

So gcc-4.0 looks good.

gcc-snapshot 20060721-1 gives with -fdelayed-branch:
        fadd,dbl %fr12,%fr22,%fr12
.L1449:
        fcmp,dbl,!< %fr12,%fr0
        ftest
        b .L1449
        fadd,dbl %fr12,%fr22,%fr12
        fsub,dbl %fr12,%fr22,%fr12

So that has the same problem.

For those not familiar with hppa assembler, a branch normally
executes the instruction following it too, before branching.
The ",n" in "b,n" prevents the next instruction from being
executed, so it has the same effect as following the branch
with a nop instruction.


The following test program shows the same problem:

#include <stdio.h>
double cdouble = -1;
int main(void)
{
        double adouble;

        adouble = 9007199254740992.0; /* 2^53 */
        while (cdouble < 0.0)
                cdouble += adouble;
        printf("%lf\n", cdouble);
        return 0;
}

With delayed branches it prints:
9007199254740992.000000
without:
9007199254740991.000000

Comment 1 Andrew Pinski 2006-08-08 00:29:26 UTC

This sounds like two target problems rather than generic ones.

Comment 2 Richard Sandiford 2006-08-13 08:34:50 UTC

Re comment #1: it's a generic bug in reorg.c (fill_slots_from_thread).

I'm testing a patch.

Comment 3 Richard Sandiford 2006-08-14 11:56:04 UTC

Subject: Bug 28634

Author: rsandifo
Date: Mon Aug 14 12:55:52 2006
New Revision: 116124

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=116124
Log:
gcc/
	PR rtl-optimization/28634
	* reorg.c (fill_slots_from_thread): Do not assume A + X - X == A
	for floating-point modes unless flag_unsafe_math_optimizations.

gcc/testsuite/
	PR rtl-optimization/28634
	* gcc.c-torture/execute/ieee/pr28634.c: New test.

Added:
    trunk/gcc/testsuite/gcc.c-torture/execute/ieee/pr28634.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/reorg.c
    trunk/gcc/testsuite/ChangeLog

Comment 4 Richard Sandiford 2006-08-14 11:58:10 UTC

Patch applied to mainline.  It has been approved for 4.1,
so I'll apply it there after testing.

Comment 5 Richard Sandiford 2006-09-09 10:56:40 UTC

Subject: Bug 28634

Author: rsandifo
Date: Sat Sep  9 10:56:31 2006
New Revision: 116796

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=116796
Log:
gcc/
	PR rtl-optimization/28634
	* reorg.c (fill_slots_from_thread): Do not assume A + X - X == A
	for floating-point modes unless flag_unsafe_math_optimizations.

gcc/testsuite/
	PR rtl-optimization/28634
	* gcc.c-torture/execute/ieee/pr28634.c: New test.

Added:
    branches/gcc-4_1-branch/gcc/testsuite/gcc.c-torture/execute/ieee/pr28634.c
Modified:
    branches/gcc-4_1-branch/gcc/ChangeLog
    branches/gcc-4_1-branch/gcc/reorg.c
    branches/gcc-4_1-branch/gcc/testsuite/ChangeLog

Comment 6 Richard Sandiford 2006-09-09 11:01:21 UTC

Applied to 4.1 after testing on mipsisa64-elf and mips64-linux-gnu.
Although the bug has been around for a long time, it isn't known to
be a regression in 4.0 relative to some earlier release, so it
doesn't qualify for a 4.0 backport.  I'll therefore close this PR
as fixed.