Bug 33755 - Gcc 4.2.2 broken for mips linux kernel builds
Summary: Gcc 4.2.2 broken for mips linux kernel builds
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: target (show other bugs)
Version: 4.2.2
: P3 normal
Target Milestone: 4.2.3
Assignee: Richard Sandiford
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2007-10-12 18:32 UTC by David Daney
Modified: 2007-10-24 17:55 UTC (History)
4 users (show)

See Also:
Host:
Target: mipsel-linux
Build:
Known to work:
Known to fail:
Last reconfirmed: 2007-10-13 10:47:26


Attachments
preprocessed source (66.78 KB, application/octet-stream)
2007-10-12 19:19 UTC, Martin Michlmayr
Details
4.2 version of the proposed patch (3.50 KB, patch)
2007-10-22 20:47 UTC, Richard Sandiford
Details | Diff
Mainline version of proposed patch (3.67 KB, patch)
2007-10-22 20:47 UTC, Richard Sandiford
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description David Daney 2007-10-12 18:32:16 UTC
As noted in:

http://www.linux-mips.org/cgi-bin/mesg.cgi?a=linux-mips&i=20071012172254.GA10835%40linux-mips.org

GCC + binutils-2.18 cannot build the mips linux kernel.

There is the possibility of this being a binutils bug.
Comment 1 Martin Michlmayr 2007-10-12 19:19:27 UTC
Created attachment 14350 [details]
preprocessed source

(sid)160:tbm@swarm: ~] gcc-4.2 -c -O2 -march=mips32r2 33755.i
(sid)161:tbm@swarm: ~] ld -m elf32ltsmip -r 33755.o
ld: 33755.o: Can't find matching LO16 reloc against `$LC6' for R_MIPS_GOT16 at 0x521490 in section `.text'
(sid)162:tbm@swarm: ~]
Comment 2 Martin Michlmayr 2007-10-12 19:30:22 UTC
Reducing.
Comment 3 Martin Michlmayr 2007-10-12 20:32:35 UTC
/* Testcase by Martin Michlmayr <tbm@cyrius.com> */

struct mtd_blktrans_ops
{
  int (*readsect) (void);
  int exiting;
};
static void do_blktrans_request (struct mtd_blktrans_ops *tr, long flags)
{
  switch (flags & 1)
    {
    case 0:
      if (tr->readsect ())
        return;
    case 1:
      return;
    default:
      printk ("foo\n");
    }
}
mtd_blktrans_thread (struct mtd_blktrans_ops *tr)
{
  long flags;
  while (!tr->exiting)
      do_blktrans_request (tr, flags);
}
Comment 4 David Daney 2007-10-12 20:44:20 UTC
Not that I have looked into the problem, but this sounds similar to this problem:

http://sourceware.org/ml/binutils/2006-11/msg00059.html
Comment 5 Andrew Pinski 2007-10-12 20:56:16 UTC
This does sound like an gas/ld issue.
Comment 6 David Daney 2007-10-12 21:06:18 UTC
From the reduced testcase could you post the result of:

objdump -r -j .text 33755.o

Thanks.
Comment 7 David Daney 2007-10-12 22:08:38 UTC
The reduced testcase is different than the reported problem.  At a minimum the relocation types are different.  Also the kernel is compiled with -G 0 -mno-abicalls

If we compile the reduced testcase thusly:
(sid)ddaney@swarm:~/junk$ gcc-4.2 -c -O2 -G0 -mno-abicalls -march=mips32r2 j.c
(sid)ddaney@swarm:~/junk$ objdump -r  j.o

j.o:     file format elf32-tradlittlemips

RELOCATION RECORDS FOR [.text]:
OFFSET   TYPE              VALUE 
00000068 R_MIPS_26         printk
00000024 R_MIPS_HI16       $LC0
0000006c R_MIPS_LO16       $LC0
0000007c R_MIPS_26         .text

(sid)ddaney@swarm:~/junk$ ld -m elf32ltsmip -r j.o

No errors.

Comment 8 David Daney 2007-10-12 22:35:06 UTC
It appears to be a GCC bug not binutils:

If you look at the assembly output for gcc-4.2, we emit:

    lui     $2,%hi($LC6)

But there is no corresponding %lo($LC6) to be found.  No amount of relocation sorting could overcome this problem.

The corresponding printk calls are also missing.  Very strange.

This is fixed in 4.3.

(sid)ddaney@swarm:~/junk$ gcc -c -O2 -G0 -mno-abicalls -march=mips32r2 33755.i 
(sid)ddaney@swarm:~/junk$ ld -m elf32ltsmip -r 33755.o
(sid)ddaney@swarm:~/junk$ gcc --version
gcc (GCC) 4.3.0 20071011 (experimental) [trunk revision 129239]
Comment 9 Richard Sandiford 2007-10-13 10:47:25 UTC
The problem comes from dbr_schedule, although it's not really a bug there.
We have:

	bne	$5,$0,L1	# A
	...stuff...
L1:
	bne	$5,$0,L2	# B
	...printk call...
L2:

and nothing before dbr_schedule has managed to thread A to L2.
dbr_schedule first fills B's delay slot with an lui from the printk
block, then steal_delay_list_from_target realises that A can steal B's
delay slot and branch directly to L2.  There is no other path to L1,
so the rest of the printk call is now dead.

For most targets, this is at worst a missed optimisation; we should have
threaded A to L2 much earlier than dbr_schedule, and deleted the whole
printk block as dead.  I don't think the MIPS port can rely on that
happening for correctness.  So (alas!) I think the upshot is simply
that we need to add some special code to mips_reorg to delete high-part
relocations that have no matching lows.

I'll have a poke.
Comment 10 ddaney 2007-10-14 02:02:49 UTC
Subject: Re:  Gcc 4.2.2 broken for mips linux kernel builds

rsandifo at gcc dot gnu dot org wrote:
> ------- Comment #9 from rsandifo at gcc dot gnu dot org  2007-10-13 10:47 -------
> The problem comes from dbr_schedule, although it's not really a bug there.
> We have:
>
>         bne     $5,$0,L1        # A
>         ...stuff...
> L1:
>         bne     $5,$0,L2        # B
>         ...printk call...
> L2:
>
> and nothing before dbr_schedule has managed to thread A to L2.
> dbr_schedule first fills B's delay slot with an lui from the printk
> block, then steal_delay_list_from_target realises that A can steal B's
> delay slot and branch directly to L2.  There is no other path to L1,
> so the rest of the printk call is now dead.
>
> For most targets, this is at worst a missed optimisation; we should have
> threaded A to L2 much earlier than dbr_schedule, and deleted the whole
> printk block as dead.  I don't think the MIPS port can rely on that
> happening for correctness.  So (alas!) I think the upshot is simply
> that we need to add some special code to mips_reorg to delete high-part
> relocations that have no matching lows.
>
> I'll have a poke.
>
>   
That makes sense, however it is a bit strange because... IIRC when I 
compiled the .i file with a fairly recent 4.3 build, the printk  in 
question was not optimized away.  So if 4.2.2 can validly optimize away 
the printk, then we have an optimization regression in 4.3

David Daney

Comment 11 rsandifo@nildram.co.uk 2007-10-14 09:41:15 UTC
Subject: Re:  Gcc 4.2.2 broken for mips linux kernel builds

"ddaney at avtrex dot com" <gcc-bugzilla@gcc.gnu.org> writes:
> rsandifo at gcc dot gnu dot org wrote:
>> ------- Comment #9 from rsandifo at gcc dot gnu dot org  2007-10-13 10:47 -------
>> The problem comes from dbr_schedule, although it's not really a bug there.
>> We have:
>>
>>         bne     $5,$0,L1        # A
>>         ...stuff...
>> L1:
>>         bne     $5,$0,L2        # B
>>         ...printk call...
>> L2:
>>
>> and nothing before dbr_schedule has managed to thread A to L2.
>> dbr_schedule first fills B's delay slot with an lui from the printk
>> block, then steal_delay_list_from_target realises that A can steal B's
>> delay slot and branch directly to L2.  There is no other path to L1,
>> so the rest of the printk call is now dead.
>>
>> For most targets, this is at worst a missed optimisation; we should have
>> threaded A to L2 much earlier than dbr_schedule, and deleted the whole
>> printk block as dead.  I don't think the MIPS port can rely on that
>> happening for correctness.  So (alas!) I think the upshot is simply
>> that we need to add some special code to mips_reorg to delete high-part
>> relocations that have no matching lows.
>>
>> I'll have a poke.
>>
>>   
> That makes sense, however it is a bit strange because... IIRC when I 
> compiled the .i file with a fairly recent 4.3 build, the printk  in 
> question was not optimized away.  So if 4.2.2 can validly optimize away 
> the printk, then we have an optimization regression in 4.3

Well, in one sense, yeah.  But it's not a regression that indicates
a new bug.  The point is that dbr_schedule isn't how we're meant to
optimise this case anyway; the optimisation in dbr_schedule is really
there for other situations.  Both 4.2 and 4.3 are missing optimisations
further up the chain, and 4.3 isn't lacking them more than 4.2.

Both versions expand the switch statement to the following rtl:

   tmp = val & 1
   if (tmp == 0) goto ...
   tmp2 = 1
   if (tmp == tmp2) goto ...
   ...printk() code...

4.2 keeps the block in essentially this form up until combine,
which use nonzero_bits checks to optimise the second branch into
"if (tmp != 0) goto ...".  4.3's loop-invariant motion can hoist
the setting of tmp2, thus stopping combine from doing the optimisation.
But what 4.3 is doing is perfectly reasonable if we need to keep the
branch, and if we don't need to keep the branch, we should have
optimised it away before lim.

We probably need some "nonzero bits" optimisations in tree vrp, if we
don't already.  (And if we don't already, I doubt I'll have been the
first person to say that.)

Richard
Comment 12 Richard Sandiford 2007-10-15 16:23:02 UTC
As a status update: I've got a patch I'm reasonably happy with,
tested against 4.2 and trunk.  The trunk version depends on some
uncommitted patches, which in turn depend on a system.h patch,
so I'll hold off applying the patches for this bug until the
earlier ones are in.
Comment 13 Richard Sandiford 2007-10-22 20:47:09 UTC
Created attachment 14391 [details]
4.2 version of the proposed patch
Comment 14 Richard Sandiford 2007-10-22 20:47:45 UTC
Created attachment 14392 [details]
Mainline version of proposed patch
Comment 15 Richard Sandiford 2007-10-24 17:52:28 UTC
Subject: Bug 33755

Author: rsandifo
Date: Wed Oct 24 17:52:16 2007
New Revision: 129606

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=129606
Log:
gcc/
	PR target/33755
	* config/mips/mips.c (mips_lo_sum_offset): New structure.
	(mips_hash_base, mips_lo_sum_offset_hash, mips_lo_sum_offset_eq)
	(mips_lo_sum_offset_lookup, mips_record_lo_sum)
	(mips_orphaned_high_part_p: New functions.
	(mips_avoid_hazard): Don't check INSN_P here.
	(mips_avoid_hazards): Rename to...
	(mips_reorg_process_insns): ...this.  Cope with
	!TARGET_EXPLICIT_RELOCS.  Delete orphaned high-part relocations,
	or turn them into nops.
	(mips_reorg): Remove TARGET_EXPLICIT_RELOCS check from calls to
	dbr_schedule and mips_avoid_hazards/mips_reorg_process_insns.
	(mips_set_mips16_mode): Don't set flag_delayed_branch here.
	(mips_override_options): Set flag_delayed_branch to 0.

gcc/testsuite/
	PR target/33755
	* gcc.target/mips/pr33755.c: New test.

Added:
    trunk/gcc/testsuite/gcc.target/mips/pr33755.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/mips/mips.c
    trunk/gcc/testsuite/ChangeLog

Comment 16 Richard Sandiford 2007-10-24 17:54:49 UTC
Subject: Bug 33755

Author: rsandifo
Date: Wed Oct 24 17:54:40 2007
New Revision: 129607

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=129607
Log:
gcc/
	PR target/33755
	* config/mips/mips.c (override_options): Always move
	flag_delayed_branch to mips_flag_delayed_branch.
	(mips_lo_sum_offset): New structure.
	(mips_hash_base, mips_lo_sum_offset_hash, mips_lo_sum_offset_eq)
	(mips_lo_sum_offset_lookup, mips_record_lo_sum)
	(mips_orphaned_high_part_p: New functions.
	(mips_avoid_hazard): Don't check INSN_P here.
	(mips_avoid_hazards): Rename to...
	(mips_reorg_process_insns): ...this.  Cope with
	!TARGET_EXPLICIT_RELOCS.  Delete orphaned high-part relocations,
	or turn them into nops.
	(mips_reorg): Remove TARGET_EXPLICIT_RELOCS check from calls to
	dbr_schedule and mips_avoid_hazards/mips_reorg_process_insns.

gcc/testsuite/
	PR target/33755
	* gcc.target/mips/pr33755.c: New test.

Added:
    branches/gcc-4_2-branch/gcc/testsuite/gcc.target/mips/pr33755.c
Modified:
    branches/gcc-4_2-branch/gcc/ChangeLog
    branches/gcc-4_2-branch/gcc/config/mips/mips.c
    branches/gcc-4_2-branch/gcc/testsuite/ChangeLog

Comment 17 Richard Sandiford 2007-10-24 17:55:50 UTC
Patches applied to mainline and 4.2.