User account creation filtered due to spam.

Bug 19506 - [4.0 Regression] PovRay produces wrong pictures with -mfpmath=sse -ffast-math.
Summary: [4.0 Regression] PovRay produces wrong pictures with -mfpmath=sse -ffast-math.
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: target (show other bugs)
Version: 4.0.0
: P2 normal
Target Milestone: 4.0.0
Assignee: Richard Henderson
URL:
Keywords: ssemmx, wrong-code
Depends on:
Blocks:
 
Reported: 2005-01-18 15:44 UTC by Uroš Bizjak
Modified: 2005-01-22 23:23 UTC (History)
3 users (show)

See Also:
Host: i686-pc-linux-gnu
Target: i686-pc-linux-gnu
Build: i686-pc-linux-gnu
Known to work:
Known to fail:
Last reconfirmed: 2005-01-21 00:26:57


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Uroš Bizjak 2005-01-18 15:44:07 UTC
I think it is a good idea to open a PR for this.

There is the problem description:
http://gcc.gnu.org/ml/gcc-patches/2005-01/msg00797.html

A kind of "fix" that shows that problem is related to FP compares:
http://gcc.gnu.org/ml/gcc-patches/2005-01/msg01027.html

A comment from Richard Henderson:
http://gcc.gnu.org/ml/gcc-patches/2005-01/msg01087.html

The proposed NaN trap didn't trigger, so there is no NaNs compared:
http://gcc.gnu.org/ml/gcc-patches/2005-01/msg01096.html

This PR is not related to similar PR19496.
Comment 1 Andrew Pinski 2005-01-18 15:48:06 UTC
I should note that -ffast-math enables -ffinite-math-only also.
Comment 2 Uroš Bizjak 2005-01-18 15:58:10 UTC
This patch unfortunatelly didn't help:
http://gcc.gnu.org/ml/gcc-cvs/2005-01/msg00651.html

With the patch, tracing abyss.pov generates these warnings:
Parsing.............File: ../scenes/advanced/abyss.pov  Line: 434
Warning: Degenerate CSG bounding box (not used!).

......File: ../scenes/advanced/abyss.pov  Line: 642
Warning: Degenerate CSG bounding box (not used!).

...............................................File:
../scenes/advanced/abyss.pov  Line: 728
Warning: Degenerate CSG bounding box (not used!).

Trying to get a smaller testcase, I have found that this problem can be somehow
fixed by following change in ix86_fp_comparison_codes():

Index: i386.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/i386/i386.c,v
retrieving revision 1.780
diff -u -p -r1.780 i386.c
--- i386.c      18 Jan 2005 11:08:33 -0000      1.780
+++ i386.c      18 Jan 2005 15:51:18 -0000
@@ -8446,6 +8446,8 @@ ix86_fp_comparison_codes (enum rtx_code 
     case LT:                   /* LTU - CF=1 - fails on unordered */
       *first_code = UNLT;
       *bypass_code = UNORDERED;
+      return;
+
       break;
     case LE:                   /* LEU - CF=1 | ZF=1 - fails on unordered */
       *first_code = UNLE;

This change protects *bypass_code from being cleared back to UNKNOWN by
following code at the end of function:
  ...
  if (!TARGET_IEEE_FP)
    {
      *second_code = UNKNOWN;
      *bypass_code = UNKNOWN;
    }
}

It looks that something is wrong with LT compares.
Comment 3 Uroš Bizjak 2005-01-18 16:05:20 UTC
(In reply to comment #1)
> I should note that -ffast-math enables -ffinite-math-only also.

The same (bad) results could be achieved with -funsafe-math-optimizations.
Comment 4 Uroš Bizjak 2005-01-19 08:17:22 UTC
This functionality is missing after FP compares rewrite:

==i386_old.md==
...

;; We can't represent the LT test directly.  Do this by swapping the operands.

(define_split
  [(set (match_operand:SF 0 "fp_register_operand" "")
	(if_then_else:SF (lt (match_operand:SF 1 "register_operand" "")
			     (match_operand:SF 2 "register_operand" ""))
			 (match_operand:SF 3 "register_operand" "")
			 (match_operand:SF 4 "register_operand" "")))
   (clobber (reg:CC 17))]
  "reload_completed
   && ((operands_match_p (operands[1], operands[3])
	&& operands_match_p (operands[2], operands[4]))
       || (operands_match_p (operands[1], operands[4])
	   && operands_match_p (operands[2], operands[3])))"
  [(set (reg:CCFP 17)
	(compare:CCFP (match_dup 2)
		      (match_dup 1)))
   (set (match_dup 0)
	(if_then_else:SF (ge (reg:CCFP 17) (const_int 0))
			 (match_dup 1)
			 (match_dup 2)))])
...
Comment 5 Uroš Bizjak 2005-01-20 13:25:09 UTC
Adding ssemmx keyword, because this problem is triggered by -mfpmath=sse.
Comment 6 Richard Henderson 2005-01-20 19:12:22 UTC
(In reply to comment #4)
> This functionality is missing after FP compares rewrite...

No, it got moved to ix86_expand_fp_movcc.
Comment 7 Uroš Bizjak 2005-01-21 06:48:05 UTC
I hope this analysis is of some help:

- the patch in comment #2 fixes the problem
- if bypass_code is cleared for (code == LT) in ix86_expand_branch() _and_
ix86_fp_comparison_fcomi_cost() the problem shows again. It should be cleared in
both places, other places doesn't affect the problem.
- TARGET_SSE_MATH part in ix86_expand_fp_movcc() can be disabled without
affecting the problem.

(following is a patch that disables failure according to comment #2, but enables
it again according to this comment):

Index: i386.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/i386/i386.c,v
retrieving revision 1.787
diff -u -p -r1.787 i386.c
--- i386.c	20 Jan 2005 18:34:11 -0000	1.787
+++ i386.c	21 Jan 2005 06:40:22 -0000
@@ -8470,6 +8470,8 @@ ix86_fp_comparison_codes (enum rtx_code 
     case LT:			/* LTU - CF=1 - fails on unordered */
       *first_code = UNLT;
       *bypass_code = UNORDERED;
+      return;
+
       break;
     case LE:			/* LEU - CF=1 | ZF=1 - fails on unordered */
       *first_code = UNLE;
@@ -8549,6 +8551,11 @@ ix86_fp_comparison_fcomi_cost (enum rtx_
   if (!TARGET_CMOVE)
     return 1024;
   ix86_fp_comparison_codes (code, &bypass_code, &first_code, &second_code);
+
+  // ENABLE FAILURE
+  if ((code == LT) && !TARGET_IEEE_FP)
+    bypass_code = UNKNOWN;
+
   return (bypass_code != UNKNOWN || second_code != UNKNOWN) + 2;
 }
 
@@ -8605,6 +8612,8 @@ ix86_expand_fp_compare (enum rtx_code co
     *bypass_test = NULL_RTX;
 
   ix86_fp_comparison_codes (code, &bypass_code, &first_code, &second_code);
+  //  if ((code == LT) && !TARGET_IEEE_FP)
+  //    bypass_code = UNKNOWN;
 
   /* Do fcomi/sahf based test when profitable.  */
   if ((bypass_code == UNKNOWN || bypass_test)
@@ -8801,6 +8810,9 @@ ix86_fp_jump_nontrivial_p (enum rtx_code
   if (!TARGET_CMOVE)
     return true;
   ix86_fp_comparison_codes (code, &bypass_code, &first_code, &second_code);
+  //  if ((code == LT) && !TARGET_IEEE_FP)
+  //    bypass_code = UNKNOWN;
+
   return bypass_code != UNKNOWN || second_code != UNKNOWN;
 }
 
@@ -8835,6 +8847,10 @@ ix86_expand_branch (enum rtx_code code, 
 
 	ix86_fp_comparison_codes (code, &bypass_code, &first_code, &second_code);
 
+	//ENABLE FAILURE
+	if ((code == LT) && !TARGET_IEEE_FP)
+	  bypass_code = UNKNOWN;
+
 	/* Check whether we will use the natural sequence with one jump.  If
 	   so, we can expand jump early.  Otherwise delay expansion by
 	   creating compound insn to not confuse optimizers.  */
@@ -9767,7 +9783,7 @@ ix86_expand_fp_movcc (rtx operands[])
   enum rtx_code code = GET_CODE (operands[1]);
   rtx tmp, compare_op, second_test, bypass_test;
 
-  if (TARGET_SSE_MATH && SSE_FLOAT_MODE_P (mode))
+  if (TARGET_SSE_MATH && SSE_FLOAT_MODE_P (mode) && 0)
     {
       rtx cmp_op0, cmp_op1, if_true, if_false;
       rtx clob;
Comment 8 Uroš Bizjak 2005-01-21 12:51:00 UTC
Similar problems as described in comment #2 happens for -mfpmath=sse -mno-80387.
However, a patch at http://gcc.gnu.org/ml/gcc-patches/2005-01/msg01464.html is
needed, otherwise compilation fails.
Comment 9 Richard Henderson 2005-01-21 22:03:56 UTC
I've found the problem, via binary search on the povray source and then 
visual inspection of the file found to be miscompiled.  The mov[sd]fcc_1_sse
patterns are missing an earlyclobber.  The fix isn't quite as simple as just
adding '&' to the pattern, but I should have it done today.
Comment 10 Uroš Bizjak 2005-01-22 11:34:03 UTC
Should patch to sse_comparison_operator
(http://gcc.gnu.org/ml/gcc-patches/2005-01/msg01087.html) be reverted in  this
case? It didn't fix the problem, and according to comment #9 the problem is not
 in this area.
Comment 11 CVS Commits 2005-01-22 23:08:10 UTC
Subject: Bug 19506

CVSROOT:	/cvs/gcc
Module name:	gcc
Changes by:	rth@gcc.gnu.org	2005-01-22 23:07:57

Modified files:
	gcc            : ChangeLog 
	gcc/config/i386: i386.c i386.md 

Log message:
	PR target/19506
	* config/i386/i386.md (movsfcc_1_sse_max): Use nonimmediate_operand
	in both compare operands.
	(movdfcc_1_sse_max): Likewise.
	(movsfcc_1_sse): Likewise.  Add earlyclobber for scratch.
	(movdfcc_1_sse): Likewise.
	* config/i386/i386.c (ix86_split_sse_movcc): Emit copies into the
	scratch register as needed.

Patches:
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/ChangeLog.diff?cvsroot=gcc&r1=2.7238&r2=2.7239
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/config/i386/i386.c.diff?cvsroot=gcc&r1=1.788&r2=1.789
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/config/i386/i386.md.diff?cvsroot=gcc&r1=1.609&r2=1.610

Comment 12 Richard Henderson 2005-01-22 23:23:12 UTC
(In reply to comment #10)
> Should patch to sse_comparison_operator [...] be reverted in this case?

Nah.  As I said in that message, allowing these operators in this manner
doesn't actually give the register allocator any more freedom, and therefore
doesn't do what was hoped.  See e.g. alpha's movsicc_internal for an example
of how the comparisons would have to be reworked in order to give that freedom.

Anyway, fixed.