User account creation filtered due to spam.

Bug 56451 - [5/6/7/8 regression] Wrong code for gcc.c-torture/execute/941015-1.c on SH
Summary: [5/6/7/8 regression] Wrong code for gcc.c-torture/execute/941015-1.c on SH
Status: NEW
Alias: None
Product: gcc
Classification: Unclassified
Component: rtl-optimization (show other bugs)
Version: 4.8.0
: P4 normal
Target Milestone: 5.5
Assignee: Not yet assigned to anyone
URL:
Keywords: wrong-code
Depends on:
Blocks:
 
Reported: 2013-02-26 00:15 UTC by Kazumoto Kojima
Modified: 2016-08-03 10:59 UTC (History)
3 users (show)

See Also:
Host:
Target: sh*-*-*
Build:
Known to work:
Known to fail:
Last reconfirmed: 2013-11-25 00:00:00


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Kazumoto Kojima 2013-02-26 00:15:05 UTC
On sh4-linux, the function foo1 of gcc.c-torture/execute/941015-1.c

int
foo1 (value)
     long long value;
{
  register const long long constant = 0xc000000080000000LL;

  if (value < constant)
    return 1;
  else
    return 2;
}

is assembled to a wrong code with -O1

foo1:
        mov.l   .L7,r1
        cmp/gt  r1,r5
        bt/s    .L10
        mov     #2,r0
        cmp/ge  r1,r5
        bf/s    .L10
        mov     #1,r0
.L10:
        rts     
        nop
.L9:
        .align 2
.L7:
        .long   -1073741824

It looks delayed branch optimization causes this.
Comment 1 Kazumoto Kojima 2013-02-26 00:32:07 UTC
Before dbr_schedule, the insns look like:

	r1 := 0xc0000000
(A)	if (r5 > r1) goto L0
(B)	if (r5 < r1) goto L1
	r1 := 0x80000000
(C)	if (r4 >= r1) goto L0
L1:	r0 := 1
	goto L2
L0:	r0 := 2
L2:

On SH, insns (A),(B),(C) can have delayed slot. dbr_schedule fills
a slot of (A) first:

	r1 := 0xc0000000
(A)	[if (r5 > r1) goto L2 | r0 := 2]
(B)	if (r5 < r1) goto L1
	r1 := 0x80000000
(C)	if (r4 >= r1) goto L0
L1:	r0 := 1
	goto L2
L0:	r0 := 2
L2:

Curiously, in the problematic case, dbr_schedule tries (C) next and
redundant_insn checks that r0 := 2 is whether redundant or not with
scanning insns backward from (C).  It finds the insn in the slot of
(A) and deletes the insn r0 := 2 at L0 as an unnecessary one.

	r1 := 0xc0000000
(A)	[if (r5 > r1) goto L2 | r0 := 2]
(B)	if (r5 < r1) goto L1
	r1 := 0x80000000
(C)	if (r4 >= r1) goto L0
L1:	r0 := 1
	goto L2
L0:
L2:

Then dbr_schedule fills the slot of (B):

	r1 := 0xc0000000
(A)	[if (r5 > r1) goto L2 | r0 := 2]
(B)	[if (r5 < r1) goto L2 | r0 := 1]
	r1 := 0x80000000
(C)	if (r4 >= r1) goto L0
L1:	r0 := 1
	goto L2
L0:
L2:

Now (C) and r0 := 1 at L1 are useless because r0 is 1 before (C).
It seems to me that the above is an unexpected scenario to redundant_insn,
i.e. dbr_schedule uses it unsafely somewhere, though I gave up to find
where problematic use occurs.  An easy patch below might be a bit
overkill but it fixes the problem for me.

--- ORIG/trunk/gcc/reorg.c	2013-01-11 11:35:58.000000000 +0900
+++ trunk/gcc/reorg.c	2013-02-25 17:17:34.000000000 +0900
@@ -1514,7 +1514,7 @@ redundant_insn (rtx insn, rtx target, rt
        trial && insns_to_search > 0;
        trial = PREV_INSN (trial))
     {
-      if (LABEL_P (trial))
+      if (LABEL_P (trial) || JUMP_P (trial))
 	return 0;
 
       if (!INSN_P (trial))
Comment 2 Jakub Jelinek 2013-03-22 14:44:01 UTC
GCC 4.8.0 is being released, adjusting target milestone.
Comment 3 Oleg Endo 2013-05-12 10:02:26 UTC
I'd like to add Steven to the CC list, as he recently brought up a DBR topic on the mailing list: http://gcc.gnu.org/ml/gcc/2013-04/msg00163.html

I didn't follow it, but it looks related.
Comment 4 Jakub Jelinek 2013-05-31 10:58:17 UTC
GCC 4.8.1 has been released.
Comment 5 Jakub Jelinek 2013-10-16 09:48:53 UTC
GCC 4.8.2 has been released.
Comment 6 Oleg Endo 2013-11-25 21:34:17 UTC
Hm, I'm just poking around here, sorry if it's just useless noise ...

In dbr_schedule, doing:

      fill_simple_delay_slots (1);
      fill_simple_delay_slots (0);
      fill_eager_delay_slots ();
      relax_delay_slots (first);

or
//      fill_simple_delay_slots (1);
      fill_simple_delay_slots (0);
      fill_eager_delay_slots ();
      relax_delay_slots (first);

-> produces bad code.



      fill_simple_delay_slots (1);
      fill_simple_delay_slots (0);
//      fill_eager_delay_slots ();
      relax_delay_slots (first);

-> doesn't do anything to the code.

//      fill_simple_delay_slots (1);
//      fill_simple_delay_slots (0);
      fill_eager_delay_slots ();
      relax_delay_slots (first);

or
      fill_simple_delay_slots (1);
//      fill_simple_delay_slots (0);
      fill_eager_delay_slots ();
      relax_delay_slots (first);


-> results in the following code (looks OK):

        mov.l   .L8,r1
        cmp/ge  r1,r5
        bf/s    .L11
        mov     #1,r0

        cmp/gt  r1,r5
        bt/s    .L6
        mov     #2,r0

        mov.l   .L9,r1
        cmp/hs  r1,r4
        bt      .L6
        mov     #1,r0
.L6:
        .align 1
.L11:
        rts	
        nop
.L10:
Comment 7 Richard Biener 2014-05-22 09:03:19 UTC
GCC 4.8.3 is being released, adjusting target milestone.
Comment 8 Jakub Jelinek 2014-12-19 13:32:34 UTC
GCC 4.8.4 has been released.
Comment 9 Richard Biener 2015-06-23 08:14:54 UTC
The gcc-4_8-branch is being closed, re-targeting regressions to 4.9.3.
Comment 10 Jakub Jelinek 2015-06-26 19:56:58 UTC
GCC 4.9.3 has been released.
Comment 11 Richard Biener 2016-08-03 10:58:40 UTC
GCC 4.9 branch is being closed