Bug 40983 - The scheduler incorrectly swaps MMX and floating point instructions
Summary: The scheduler incorrectly swaps MMX and floating point instructions
Status: NEW
Alias: None
Product: gcc
Classification: Unclassified
Component: target (show other bugs)
Version: 4.4.1
: P3 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2009-08-06 04:11 UTC by mikulas
Modified: 2009-11-11 23:06 UTC (History)
2 users (show)

See Also:
Host: i686-pc-linux-gnu
Target: i686-pc-linux-gnu
Build: i686-pc-linux-gnu
Known to work:
Known to fail:
Last reconfirmed: 2009-08-06 10:47:08


Attachments
A failing example (312 bytes, text/plain)
2009-08-06 04:12 UTC, mikulas
Details

Note You need to log in before you can comment on or make changes to this bug.
Description mikulas 2009-08-06 04:11:23 UTC
Hi

This example fails, because in function "f", the scheduler incorrectly swapped floating point store to "c" and load of mmx registers.

Compile with -O2 -march=pentium-mmx
Comment 1 mikulas 2009-08-06 04:12:01 UTC
Created attachment 18310 [details]
A failing example
Comment 2 mikulas 2009-08-06 04:15:19 UTC
Assembler output:

f:
        pushl   %ebp
        movl    %esp, %ebp
        subl    $16, %esp
        movq    %mm0, -8(%ebp)
        movq    %mm1, -16(%ebp)
        emms
        fldl    a
        faddl   b
        movq    -8(%ebp), %mm0
        paddd   -16(%ebp), %mm0
        fstpl   c
        leave
        ret
Comment 3 Uroš Bizjak 2009-08-06 10:47:07 UTC
Hm, we can fix this by teaching scheduler that every access to %mm registers clobbers FP state. Since scheduler already depend all x87 instructions on Top-Of-Stack (TOS) register, we can perhaps extend this requirement for all instructions that use %mm reg.

Something like the proof-of-concept patch bellow:

--cut here--
Index: sched-deps.c
===================================================================
--- sched-deps.c	(revision 150503)
+++ sched-deps.c	(working copy)
@@ -1898,7 +1898,12 @@
 	    sched_analyze_reg (deps, FIRST_STACK_REG, mode, code, insn);
 	  sched_analyze_reg (deps, FIRST_STACK_REG, mode, USE, insn);
 	}
+
+#ifdef FIRST_MMX_REG
+      if (regno >= FIRST_MMX_REG && regno <= LAST_MMX_REG)
+	sched_analyze_reg (deps, FIRST_STACK_REG, VOIDmode, CLOBBER, insn);
 #endif
+#endif
     }
   else if (MEM_P (dest))
     {
@@ -2044,7 +2049,12 @@
 	    sched_analyze_reg (deps, FIRST_STACK_REG, mode, USE, insn);
 	  sched_analyze_reg (deps, FIRST_STACK_REG, mode, SET, insn);
 	}
+
+#ifdef FIRST_MMX_REG
+      if (regno >= FIRST_MMX_REG && regno <= LAST_MMX_REG)
+	sched_analyze_reg (deps, FIRST_STACK_REG, VOIDmode, CLOBBER, insn);
 #endif
+#endif
 
 	if (cslr_p && sched_deps_info->finish_rhs)
 	  sched_deps_info->finish_rhs ();
--cut here--

Otherwise, you can just add "asm volatile ("")" as a scheduling barrier.
Comment 4 mikulas 2009-08-23 19:18:56 UTC
The patch works fine.
Comment 5 Ozkan Sezer 2009-10-16 19:45:50 UTC
Any progress on this? 
Comment 6 mikulas 2009-11-11 23:06:00 UTC
I think you can commit it to gcc. I don't see a reason why it shouldn't be committed.