Bug 9240 - weird scheduling on v850 unless -fno-sched-spec specified
Summary: weird scheduling on v850 unless -fno-sched-spec specified
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: rtl-optimization (show other bugs)
Version: 3.4.0
: P3 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords: missed-optimization
Depends on:
Blocks:
 
Reported: 2003-01-09 01:06 UTC by miles
Modified: 2010-02-24 21:20 UTC (History)
3 users (show)

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2003-06-02 04:29:26


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description miles 2003-01-09 01:06:00 UTC
When given the source file below (`gcc-3.x-sched-bug.c'), gcc cross-compiling for the v850 generates very oddly scheduled assembly.

Old versions of gcc (e.g., 2.9x) or using the `-fno-sched-spec' option both cause much better code to be generated.

Here's the (slightly edited) assembly output generated from
`v850e-elf-gcc -S -O5 gcc-3.x-sched-bug.c':

   _bogus:
	   /* compute r10 */
	   mov r10,r12
	   subr r0,r12
	   movea lo(-125),r0,r5
	   cmp r5,r10
	   bge .L3
   .L2:
	   jmp [r31]
   .L3:
	   mov hilo(_errno),r5
	   st.w r12,0[r5]
	   mov -1,r10
	   br .L2

Note that:

 (1) It's moved much of the calculation for the 
     error case before the test, which I can see
     no reason for (especially since the v850 is
     such a simple processor -- no confusing
     speculative scheduling here!), and which
     pessimizes the normal path.

 (2) Even if there were a good reason for such
     scheduling for the error branch, use of
     __builtin_expect as shown arguably should
     have suppressed it, since it adversely
     effects the normal path.

Here's the output when using the -fno-sched-spec option (other args the same), which is much, much, better:

   _bogus:
	   /* compute r10 */
	   movea lo(-125),r0,r5
	   cmp r5,r10
	   bge .L3
   .L2:
	   jmp [r31]
   .L3:
	   subr r0,r10
	   mov hilo(_errno),r5
	   st.w r10,0[r5]
	   mov -1,r10
	   br .L2

Release:
gcc (GCC) 3.4 20030108 (experimental)

Environment:
gcc compiled on debian unstable i686
cross-compiling for v850e-elf

configure args are: 

configure --prefix=... --target=v850e-elf --host=i386-pc-linux-gnu --with-newlib --enable-languages=c --configure-cache=../config.cache

How-To-Repeat:
Here's the source (`gcc-3.x-sched-bug.c'):

   extern int errno;
   int bogus ()
   {
     register int val asm ("r10");

     asm volatile ("/* compute r10 */" : "=r" (val));

     if (__builtin_expect (val >= -125, 0))
       {
	 errno = -val;
	 val = -1;
       }

     return val;
   }

Compile it with:  v850e-elf-gcc -S -O5 gcc-3.x-sched-bug.c
Comment 1 miles 2003-01-09 01:06:00 UTC
Fix:
This particular problem can be avoided by using the -fno-sched-spec option; however I do no know if doing so adversely affects other code.
Comment 2 Dara Hazeghi 2003-05-31 07:00:32 UTC
Hello,

with gcc 3.3 branch I get the same code as reported. With gcc mainline I get the following asm 
with -O3 -fno-sched-spec
        .file   "lib.c"
        .section .text
        .align 1
        .global _bogus
        .type   _bogus, @function
_bogus:
#APP
        /* compute r10 */
#NO_APP
        movea lo(-125),r0,r5
        cmp r5,r10
        bge .L4
        jmp [r31]
.L4:
        subr r0,r10
        movhi hi(_errno),r0,r5
        st.w r10,lo(_errno)[r5]
        mov -1,r10
        jmp [r31]
        .size   _bogus, .-_bogus
        .ident  "GCC: (GNU) 3.4 20030530 (experimental)"


Without -fno-sched-spec, I get:
        .file   "lib.c"
        .section .text
        .align 1
        .global _bogus
        .type   _bogus, @function
_bogus:
#APP
        /* compute r10 */
#NO_APP
        mov r10,r12
        subr r0,r12
        movea lo(-125),r0,r5
        cmp r5,r10
        bge .L4
        jmp [r31]
.L4:
        movhi hi(_errno),r0,r5
        st.w r12,lo(_errno)[r5]
        mov -1,r10
        jmp [r31]
        .size   _bogus, .-_bogus
        .ident  "GCC: (GNU) 3.4 20030530 (experimental)"


Can you confirm that the -fno-sched-spec code is still better than the regular code in this case? 
Thanks,

Dara
Comment 3 miles 2003-05-31 08:08:05 UTC
Subject: Re:  wierd scheduling on v850 unless -fno-sched-spec specified

On Sat, May 31, 2003 at 07:00:33AM -0000, dhazeghi@yahoo.com wrote:
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=9240
>
> with gcc 3.3 branch I get the same code as reported. With gcc mainline I
> get the following asm
...
> Can you confirm that the -fno-sched-spec code is still better than the
> regular code in this case?

Yes, -fno-sched-spec is still better.

[I'm surprised, actually, that it _ever_ moves code back past an `unlikely'
(as declared by __builtin_expect) branch, regardless of the architecture...]

Thanks,

-Miles
Comment 4 Andrew Pinski 2003-06-02 04:29:26 UTC
Acording to the submitter the problem is still there and looks like -fno-sched-spec fixes 
the problem.
Comment 5 Vladimir Makarov 2004-01-15 17:22:24 UTC
Some analysis:

1. Scheduler uses own algorithm of probablity calculation (3 basic
   blocks of the test have probability 100/50/100%).

2. It should use common infrastructure bb frequency and edge
   probablility.

3. Even if it used the common infrastructure, bb frequnecies would be
   still wrong (10000, 9001, 10000 or 100/90/100%).  So builtin-expect
   is ignored in the infrastructure too.  I saw broken builtin-expect
   many times during my 6 years work on gcc.

4. Becuase the probability of the 2nd bb is 50%, it includes 2
   insns from it to schedule into the 1st bb.

So fixing the problem could be:

  1. Taking builtin_expect into account in scheduler algorithm of
     probablity calculation.

  2. Start using the common infrastructure bb frequency and edge
     probablility and fixing there builtin_expect.

  Neither the 1st solution nor the 2nd one is easy. And they include
the risk of new bugs.  So I think it is not reasonable to fix it for
3.4.  It is a just generation of non-optimal code (scheduler works by
heuristics.  there will be always some unoptimal schedules.  we are
just oriented to guarantee that scheduler improves code in overall on
credible benchmarks like Spec).  The problem was pressent in 3.3.

  I'll try to fix the problem in 3-4 months.

Vlad
Comment 6 Andrew Pinski 2005-09-26 15:21:34 UTC
I think the patch at
http://gcc.gnu.org/ml/gcc-patches/2005-09/msg00370.html
will fix this.
Comment 7 Pete Steinmetz 2005-09-26 22:07:34 UTC
Unless the common infrastructure bb frequency and edge probabilities have been 
updated to reflect builtin_expect, http://gcc.gnu.org/ml/gcc-patches/2005-
09/msg00370.html probably won't fix this problem.
Comment 8 Jeffrey A. Law 2010-02-24 21:20:13 UTC
This was fixed at some point; I don't know if it was the patch referenced in comment #6, or some other patch or some combination thereof.

You can see that probabilities are being propagated down through to the scheduler by looking at the .sched dump:

(insn 9 7 10 2 j.c:8 (set (cc0)
        (compare (reg:SI 39 [ prephitmp___4 ])
            (reg:SI 42))) 8 {*cmpsi} (expr_list:REG_DEAD (reg:SI 42)
        (expr_list:REG_EQUAL (compare (reg:SI 39 [ prephitmp___4 ])
                (const_int -125 [0xffffffffffffff83]))
            (nil))))

(jump_insn 10 9 11 2 j.c:8 (set (pc)
        (if_then_else (lt (cc0)
                (const_int 0 [0x0]))
            (label_ref 15)
            (pc))) 41 {*branch_normal} (expr_list:REG_BR_PROB (const_int 9996 [0x270c])
        (nil))
 -> 15)
;; End of basic block 2 -> ( 3 4)
;; lr  out       3 [sp] 29 [r29] 32 [.fp] 33 [.ap] 39
;; live  out     3 [sp] 29 [r29] 32 [.fp] 33 [.ap] 39


;; Succ edge  3 [0.0%]  (fallthru)
;; Succ edge  4 [100.0%]

Note how the probability of the edge 2->3 is now zero, reflecting the unlikely attribute while the edge 2->4 is 100%.  With the correct probabilities the scheduler naturally does the right thing.