Bug 19252 - sub optimal use of fpu comparisons in SSE code
Summary: sub optimal use of fpu comparisons in SSE code
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: target
Version: 4.0.0
Importance: P2 normal
Target Milestone: 4.0.0
Assignee: Richard Henderson
URL:
Keywords: missed-optimization, ssemmx
Depends on:
Blocks:
 
Reported: 2005-01-04 12:38 UTC by tbp
Modified: 2005-01-14 00:56 UTC
CC List: 3 users

See Also:
Host: cygwin
Target: pentium3-*-*
Build: pentium3-*-*
Known to work:
Known to fail:
Last reconfirmed: 2005-01-12 02:48:49


Attachments
All hell broke lose (856 bytes, text/plain)
2005-01-04 12:39 UTC, tbp

Description tbp 2005-01-04 12:38:33 UTC
Somehow x87 FPU comparisons are emitted in SSE-heavy code even with -mfpmath=sse.

Case in point (not from the testcase):
 401447:       fldz
[snip lots of SSE only operations]
 401535:       movss  %xmm2,0xc(%esp)
 40153b:       flds   0xc(%esp)
 40153f:       fcomip %st(1),%st
 401541:       jbe    401650

I couldn't reduce the attached testcase any further, but it's really obvious
something's wrong with the generated code for fp_compare vs pristine_intersection.

Switches: -O3 -march=k8 -mfpmath=sse -ffast-math -fomit-frame-pointer
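
For illustration only, here is a minimal sketch of the kind of scalar float comparison at issue. It is a hypothetical reduction, not the attached testcase, and the function name hit_in_range is made up. With -mfpmath=sse one would expect comparisons like these to use the SSE compare instructions (comiss/ucomiss) on %xmm registers instead of going through fldz/flds/fcomip; whether a given compiler build actually does so has to be checked in the generated assembly.

```c
/* Hypothetical example, not taken from the attached testcase.
 * Returns 1 when t lies strictly between 0 and t_max, else 0.
 * Both comparisons operate on floats, so with -mfpmath=sse they are
 * candidates for comiss/ucomiss rather than x87 fcomip. */
int hit_in_range(float t, float t_max)
{
    return t > 0.0f && t < t_max;
}
```

Compiling this with the switches above plus -S and searching the output for fcomip is one way to check for the problem.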
Comment 1 tbp 2005-01-04 12:39:51 UTC
Created attachment 7870
All hell broke lose
Comment 2 Uroš Bizjak 2005-01-04 13:00:55 UTC
Some discussion:
http://gcc.gnu.org/ml/gcc/2004-12/msg01027.html

This PR could be related to:

PR 19009: Loading of FP constants into FP reg via SSE reg
PR 19250: minss/maxss SSE insn not generated for -mfpmath=sse
Comment 3 Uroš Bizjak 2005-01-04 14:05:40 UTC
With -mno-80387 added to the flags, fcomip is no longer generated.

But when I try to compile povray with '-O3 -march=pentium4 -mno-80387
-mfpmath=sse -ffast-math' I get:

bbox.cpp: In function 'void build_area_table(BBOX_TREE**, long int, long int,
double*)':
bbox.cpp:1753: error: insn does not satisfy its constraints:
(insn 216 200 217 5 (set (reg:CCFP 17 flags)
        (compare:CCFP (reg:DF 10 st(2))
            (reg/v:DF 9 st(1) [orig:68 bmin$1 ] [68]))) 25 {*cmpfp_i_sse_only} (nil)
    (nil))
bbox.cpp:1753: internal compiler error: in copyprop_hardreg_forward_1, at
regrename.c:1567
Please submit a full bug report,
Comment 4 Wolfgang Bangerth 2005-01-06 14:40:21 UTC
Some more discussion: 
  http://gcc.gnu.org/ml/gcc-patches/2005-01/msg00176.html 
 
Comment 5 Uroš Bizjak 2005-01-07 11:32:24 UTC
Also some discussion here:
  http://gcc.gnu.org/ml/gcc-patches/2005-01/msg00394.html
Comment 6 CVS Commits 2005-01-14 00:34:02 UTC
Subject: Bug 19252

CVSROOT:	/cvs/gcc
Module name:	gcc
Changes by:	rth@gcc.gnu.org	2005-01-14 00:33:51

Modified files:
	gcc            : ChangeLog 
	gcc/config/i386: i386-protos.h i386.c i386.md 

Log message:
	PR target/19099
	PR target/19250
	PR target/19252
	* config/i386/i386.md (cmpdf, cmpsf, bunordered, bordered, buneq,
	bunge, bungt, bunle, bunlt, bltgt): Enable for TARGET_SSE_MATH,
	not just TARGET_SSE.
	(cmpfp_i_387): Rename from cmpfp_i.  Move after sse patterns.
	(cmpfp_i_mixed): Rename from cmpfp_i_sse; use for TARGET_MIX_SSE_I387.
	(cmpfp_i_sse): Rename from cmpfp_i_sse_only; use for TARGET_SSE_MATH.
	(cmpfp_iu_mixed, cmpfp_iu_sse, cmpfp_iu_387): Similarly.
	(fp_jcc_1_mixed, fp_jcc_1_sse, fp_jcc_1_387): Similarly.
	(fp_jcc_2_mixed, fp_jcc_2_sse, fp_jcc_2_387): Similarly.
	(fp_jcc_3_387, fp_jcc_4_387, fp_jcc_5_387, fp_jcc_6_387,
	fp_jcc_7_387, fp_jcc_8_387): Rename from fp_jcc_N.
	(movdicc_c_rex64): Rename with '*'.
	(movsfcc, movdfcc): Add checks for 387 and sse math to condition.
	(movsfcc_1_sse_min, movsfcc_1_sse_max, movsfcc_1_sse): New.
	(movsfcc_1_387): Rename from movsfcc_1.
	(movdfcc_1_sse_min, movdfcc_1_sse_max, movdfcc_1_sse): New.
	(movdfcc_1, movdfcc_1_rex64): Add check for 387.
	(sminsf3, smaxsf3, smindf3, smaxdf3): New.
	(minsf3, minsf, minsf_nonieee, minsf_sse, mindf3, mindf,
	mindf_nonieee, mindf_sse, maxsf3, maxsf, maxsf_nonieee, maxsf_sse,
	maxdf3, maxdf, maxdf_nonieee, maxdf_sse, sse_movsfcc, sse_movsfcc_eq,
	sse_movdfcc, sse_movdfcc_eq, sse_movsfcc_const0_1,
	sse_movsfcc_const0_2, sse_movsfcc_const0_3, sse_movsfcc_const0_4,
	sse_movdfcc_const0_1, sse_movdfcc_const0_2, sse_movdfcc_const0_3,
	sse_movdfcc_const0_4): Remove.
	* config/i386/i386.c (ix86_expand_fp_movcc): For TARGET_SSE_MATH,
	recognize min/max early.  Update for changed sse cmove patterns.
	(ix86_split_sse_movcc): New.
	* config/i386/i386-protos.h: Update.

Patches:
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/ChangeLog.diff?cvsroot=gcc&r1=2.7114&r2=2.7115
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/config/i386/i386-protos.h.diff?cvsroot=gcc&r1=1.125&r2=1.126
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/config/i386/i386.c.diff?cvsroot=gcc&r1=1.776&r2=1.777
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/config/i386/i386.md.diff?cvsroot=gcc&r1=1.605&r2=1.606
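
For illustration only, the log message above says min/max are now recognized early for TARGET_SSE_MATH and adds the sminsf3/smaxsf3/smindf3/smaxdf3 patterns. A minimal sketch of the source-level idiom such recognition typically targets follows. The helper names min_f, max_f and clamp_f are hypothetical and do not come from the patch, and whether a particular build emits minss/maxss for them (for example with -mfpmath=sse -ffast-math) is an assumption to verify in the assembly output.

```c
/* Hypothetical helpers, not part of the patch.
 * Conditional selects of this shape are the usual candidates for
 * being turned into SSE minss/maxss under SSE math. */
static float min_f(float a, float b)
{
    return a < b ? a : b;
}

static float max_f(float a, float b)
{
    return a > b ? a : b;
}

/* Clamp x into [lo, hi] using only the two selects above. */
float clamp_f(float x, float lo, float hi)
{
    return min_f(max_f(x, lo), hi);
}
```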

Comment 7 Richard Henderson 2005-01-14 00:55:45 UTC
Fixed.