Bug 34999 - Incorrect FDE entries with hot/cold code section splitting (partition_hot_cold_basic_blocks)
Summary: Incorrect FDE entries with hot/cold code section splitting (partition_hot_col...
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: rtl-optimization (show other bugs)
Version: 4.3.0
: P3 normal
Target Milestone: 4.5.0
Assignee: Jakub Jelinek
URL:
Keywords: wrong-code
: 37058 (view as bug list)
Depends on:
Blocks: 41501
  Show dependency treegraph
 
Reported: 2008-01-28 15:21 UTC by revital eres
Modified: 2018-08-19 14:13 UTC (History)
9 users (show)

See Also:
Host:
Target:
Build:
Known to work: 4.5.0
Known to fail: 4.4.3
Last reconfirmed: 2009-11-22 17:25:53


Attachments
the testcase (309 bytes, text/plain)
2008-01-28 15:22 UTC, revital eres
Details
A patch I am currently testing (474 bytes, patch)
2008-01-29 16:07 UTC, revital eres
Details | Diff
asm that crashes linker (1.02 KB, text/plain)
2008-02-28 19:12 UTC, Uroš Bizjak
Details

Note You need to log in before you can comment on or make changes to this bug.
Description revital eres 2008-01-28 15:21:22 UTC
I run built-in-setjmp.c testcase (taken from gcc.c-torture/execute dir);
first with -fprofile-generate -freorder-blocks-and-partition
and then with -fprofile-use -freorder-blocks-and-partition
under x86_64 and ppc64-linux with r131883 mainline and I get the ICE:

vn/build/build/spu/gcc/gcc/testsuite/gcc/built-in-setjmp.c -fprofile-use -freorder-blocks-and-partition
../gcc/gcc/testsuite/gcc.c-torture/execute/built-in-setjmp.c: In function âmainâ:
../gcc/gcc/testsuite/gcc.c-torture/execute/built-in-setjmp.c:39: internal compiler error: in add_labels_and_missing_jumps, at bb-reorder.c:1304
Please submit a full bug report,
with preprocessed source if appropriate.
See <http://gcc.gnu.org/bugs.html> for instructions.

The code for fixing up fallthru edges that cross between hot and cold sections
is in add_labels_and_missing_jumps () and fix_up_fall_thru_edges () functions.
The first function deals with fallthru edges with single successor and the later
deals with fallthru edges with more than one successor.

It seems that the cause to this fail is because both of the functions do not take into account the fact that a basic-block with a crossing edge can end with a call insn.  I am working to fix that.
Comment 1 revital eres 2008-01-28 15:22:45 UTC
Created attachment 15037 [details]
the testcase
Comment 2 revital eres 2008-01-29 16:07:26 UTC
Created attachment 15049 [details]
A patch I am currently testing
Comment 3 revitale 2008-02-27 13:28:43 UTC
Subject: Bug 34999

Author: revitale
Date: Wed Feb 27 13:27:56 2008
New Revision: 132711

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=132711
Log:
Fix PR rtl-optimization/34999

Added:
    trunk/gcc/testsuite/gcc.dg/tree-prof/pr34999.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/bb-reorder.c
    trunk/gcc/testsuite/ChangeLog

Comment 4 Uroš Bizjak 2008-02-28 09:47:54 UTC
Test case fails on x86_64:

Running target unix
FAIL: gcc.dg/tree-prof/pr34999.c compilation,  -fprofile-use -D_PROFILE_USE
UNRESOLVED: gcc.dg/tree-prof/pr34999.c execution,    -fprofile-use -D_PROFILE_USE

http://gcc.gnu.org/ml/gcc-testresults/2008-02/msg01856.html
Comment 5 revital eres 2008-02-28 13:00:53 UTC
I can not reproduce this fail on my local x86_64-unknown-linux-gnu machine:

PASS: gcc.dg/tree-prof/pr34999.c compilation,  -fprofile-use -D_PROFILE_USE
PASS: gcc.dg/tree-prof/pr34999.c execution,    -fprofile-use -D_PROFILE_USE

I appreciate any info on this ICE so I could try to fix it.
Comment 6 Uroš Bizjak 2008-02-28 13:26:41 UTC
(In reply to comment #5)
> I can not reproduce this fail on my local x86_64-unknown-linux-gnu machine:
> 
> PASS: gcc.dg/tree-prof/pr34999.c compilation,  -fprofile-use -D_PROFILE_USE
> PASS: gcc.dg/tree-prof/pr34999.c execution,    -fprofile-use -D_PROFILE_USE
> 
> I appreciate any info on this ICE so I could try to fix it.

Sometimes --enable-checking=release triggers bugs that are hidden by other --enable-checking settings.

Comment 7 revital eres 2008-02-28 13:49:14 UTC
(In reply to comment #6)
> (In reply to comment #5)
> > I appreciate any info on this ICE so I could try to fix it.
> Sometimes --enable-checking=release triggers bugs that are hidden by other
> --enable-checking settings.

Thanks, unfortunately I still can not reproduce the fail.


Comment 8 Uroš Bizjak 2008-02-28 18:34:22 UTC
(In reply to comment #7)

> Thanks, unfortunately I still can not reproduce the fail.

Probably you need newer binutils:

GNU ld (GNU Binutils) 2.18

Executing on host: /home/uros/gcc-build/gcc/xgcc -B/home/uros/gcc-build/gcc/ /home/uros/gcc-svn/trunk/gcc/testsuite/gcc.dg/tree-prof/pr34999.c   -O2 -freorder-blocks-and-partition -fprofile-use -D_PROFILE_USE -fno-show-column  -lm   -o /home/uros/gcc-build/gcc/testsuite/gcc/pr34999.x02    (timeout = 300)
/usr/local/x86_64-unknown-linux-gnu/bin/ld: error in /tmp/cceUz9XT.o(.eh_frame); no .eh_frame_hdr table will be created.^M
output is:
/usr/local/x86_64-unknown-linux-gnu/bin/ld: error in /tmp/cceUz9XT.o(.eh_frame); no .eh_frame_hdr table will be created.^M

FAIL: gcc.dg/tree-prof/pr34999.c compilation,  -fprofile-use -D_PROFILE_USE
UNRESOLVED: gcc.dg/tree-prof/pr34999.c execution,    -fprofile-use -D_PROFILE_USE

HJ, is this a binutils bug or gcc bug?
Comment 9 Uroš Bizjak 2008-02-28 19:12:55 UTC
Created attachment 15241 [details]
asm that crashes linker

gcc pr34999.s 
/usr/local/lib/gcc/x86_64-unknown-linux-gnu/4.4.0/../../../../x86_64-unknown-linux-gnu/bin/ld: error in /tmp/ccuXalsV.o(.eh_frame); no .eh_frame_hdr table will be created.
Comment 10 revital eres 2008-02-29 05:23:13 UTC
(In reply to comment #8)
> (In reply to comment #7)
> > Thanks, unfortunately I still can not reproduce the fail.
> Probably you need newer binutils:
> GNU ld (GNU Binutils) 2.18

Yes, using a newer binutils the fail is reproducible. thanks
Comment 11 H.J. Lu 2008-02-29 05:54:08 UTC
I don't think it is a linker bug. This bug looks very similar
to PR 34249. Uros, you fixed PR 34249. Maybe there is another
similar problem elsewhere in dwarf2.out.c.
Comment 12 Uroš Bizjak 2008-02-29 06:07:31 UTC
(In reply to comment #11)
> I don't think it is a linker bug. This bug looks very similar
> to PR 34249. Uros, you fixed PR 34249. Maybe there is another
> similar problem elsewhere in dwarf2.out.c.

The problem is in this part of the frame:

.LASFDE3:
	.long	.LASFDE3-.Lframe1
	.long	.LHOTB2
	.long	.LHOTE2-.LHOTB2
	.long	.LCOLDB2
>>>	.long	.LCOLDE2-.LCOLDB2
	.uleb128 0x0
	.byte	0x4
	.long	.LCFI2-.LFB3
	.byte	0xe
	.uleb128 0x10
	.byte	0x86
	.uleb128 0x2
	.byte	0x4
	.long	.LCFI3-.LCFI2
	.byte	0xd
	.uleb128 0x6
	.align 8
.LEFDE3:

If the marked line is changed to .long	.LCOLDB2-.LCOLDB2 (or ..E2 - ..E2) the compilation succeeds. However, looking through asm, it looks correct to me.
Comment 13 Uroš Bizjak 2008-02-29 06:32:37 UTC
(In reply to comment #12)

> If the marked line is changed to .long  .LCOLDB2-.LCOLDB2 (or ..E2 - ..E2) the
> compilation succeeds. However, looking through asm, it looks correct to me.

That is, the frame and label positions look correct to me. The problem in PR 34249 was malformed frame, which is not the case here.


Comment 14 Uroš Bizjak 2008-02-29 06:47:55 UTC
CC Jakub, perhaps he can give us a clue what goes wrong here.
Comment 15 Jakub Jelinek 2008-02-29 08:56:15 UTC
.LSFDE3:
        .long   .LEFDE3-.LASFDE3        # FDE Length
.LASFDE3:
        .long   .LASFDE3-.Lframe1       # FDE CIE offset
        .long   .LHOTB2 # FDE initial location
        .long   .LHOTE2-.LHOTB2 # FDE address range
        .long   .LCOLDB2        # FDE initial location
        .long   .LCOLDE2-.LCOLDB2       # FDE address range
        .uleb128 0x0    # Augmentation size
        .byte   0x4     # DW_CFA_advance_loc4
        .long   .LCFI2-.LFB3
        .byte   0xe     # DW_CFA_def_cfa_offset
        .uleb128 0x10
        .byte   0x86    # DW_CFA_offset, column 0x6
        .uleb128 0x2
        .byte   0x4     # DW_CFA_advance_loc4
        .long   .LCFI3-.LCFI2
        .byte   0xd     # DW_CFA_def_cfa_register
        .uleb128 0x6
        .align 8
.LEFDE3:

is not a valid .eh_frame FDE, the
2005-03-31  Caroline Tice  <ctice@apple.com>

        (output_call_frame_info): Add test to see if function switches text
        sections in the middle; if so, use appropriate extra hot and cold
        section labels to compute size deltas for the hot and cold sections.

change was totally broken.  There is no such thing as a two ranges FDE in Dwarf2/3.  FDE starts with FDE length, CIE offset, FDE initial location, FDE address range, depending on CIE optionally with augmentation size and augmentation and then CFA instructions.  So, to describe .text/.text.unlikely split function, you need two FDEs, one describing the .text section, the other describing .text.unlikely section.  When the code jumps in between the sections, you need to insert CFA instructions that reflect what really changed.
The easiest would be to use .cfi_* assembler directives that recentish gas supports and emitting them inline in the code, rather than creating separate .eh_frame.  But I believe we need some more help from gas - particularly more powerful .cfi_escape and some easy way to just save current state resp. restore it.  That way you could save the state at the jump location and restore it at the jump target and let gas compute what CFA instructions are needed.
Comment 16 Ulrich Weigand 2008-03-04 14:51:02 UTC
Hi Jakub,

we need the same changes in both .eh_frame and .dwarf_frame;
does the gas .cfi_ support both sections?

I'm wondering how "save & restore" should work across two
different FDEs -- in the new FDE, we'd have to emit the full
set of CFA instructions to get to the "base-line" state ...
Comment 17 Jakub Jelinek 2008-03-04 15:10:05 UTC
.cfi_* doesn't create .dwarf_frame, but perhaps it could be taught to do that
optionally (some flag on .cfi_startproc that would switch on additional creation of .dwarf_frame).
The plan with save/restore was that you could do say
.text
.cfi_startproc
.cfi_personality 3, __gxx_personality_v0
.cfi_lsda 3, .LLSDA2
some code plus .cfi_* instructions for it
jmp 1f
.cfi_save .L0123
.section .text.unlikely
.cfi_startproc
.cfi_personality 3, __gxx_personality_v0
.cfi_lsda .LLSDA2
.cfi_restore .L0123
1:
some code plus .cfi_* instructions for it
.cfi_endproc
.previous
.cfi_endproc

where that .cfi_save would add a list of .cfi_* instructions from .cfi_startproc
till that .cfi_save into .L0123 queue, then .cfi_restore would just add those
queued .cfi directives all at the same location (with some optimizations, e.g.
.cfi_* instructions that clearly cancel themselves would be optimized out, any location advances of course taken out, etc.).
You can of course do that in dwarf2out.c, but we eventually want to use inline
.cfi_* instructions for other reasons - e.g. so that inline asm code can add its  unwind info.
Comment 18 John David Anglin 2008-03-16 16:27:59 UTC
Executing on host: /test/gnu/gcc/objdir/gcc/xgcc -B/test/gnu/gcc/objdir/gcc/ /te
st/gnu/gcc/gcc/gcc/testsuite/gcc.dg/tree-prof/pr34999.c   -O2 -freorder-blocks-a
nd-partition -fprofile-use -D_PROFILE_USE -fno-show-column  -lm   -o /test/gnu/g
cc/objdir/gcc/testsuite/gcc/pr34999.x02    (timeout = 300)
ld: An unknown relocation type 8 was encountered when applying the relocation fo
r symbol ".text.unlikely" at offset 0x15c in section index 1 of file /var/tmp//c
cFqB2Ph.o
1 errors.
collect2: ld returned 1 exit status
compiler exited with status 1
output is:
ld: An unknown relocation type 8 was encountered when applying the relocation fo
r symbol ".text.unlikely" at offset 0x15c in section index 1 of file /var/tmp//c
cFqB2Ph.o
1 errors.
collect2: ld returned 1 exit status

FAIL: gcc.dg/tree-prof/pr34999.c compilation,  -fprofile-use -D_PROFILE_USE

I believe the linker is complaining about the R_PARISC_PCREL12F relocation
used in this branch.

 15c:   9f f3 20 00     cmpb,*= r19,r31,164 <main+0x134>

Thought this hot and cold stuff was disabled on hppa as it breaks a lot
of stuff.
Comment 19 revital eres 2008-04-08 11:07:10 UTC
> The easiest would be to use .cfi_* assembler directives that recentish gas
> supports and emitting them inline in the code, rather than creating separate
> .eh_frame.  

I apologize ahead if I am totally wrong about it but 
on targets with limited local store when an overlay technique is used it might be prefered to place the unwind info in a separate section that will always be in the memory rather then inlining it using .cfi_* assembler.  
Comment 20 Jakub Jelinek 2008-04-08 11:13:24 UTC
Please read the documentation about .cfi_* directives.  They construct an .eh_frame section for you.
Comment 21 Andrew Pinski 2008-12-29 02:35:59 UTC
*** Bug 37058 has been marked as a duplicate of this bug. ***
Comment 22 Kaveh Ghazi 2009-03-27 16:37:36 UTC
I still see failures in bb-reorg.c and pr34999.c on x86_64-unknown-linux-gnu:
http://gcc.gnu.org/ml/gcc-testresults/2009-03/msg02744.html

The error message is slightly different than what others have posted here.  I'm using "GNU assembler 2.17 Debian GNU/Linux" on the GCC compile farm:

FAIL: gcc.dg/tree-prof/bb-reorg.c compilation,  -fprofile-use -D_PROFILE_USE
/tmp/cc87gzwS.s: Assembler messages:
/tmp/cc87gzwS.s:116: Error: can't resolve `.text.unlikely' {.text.unlikely section} - `L0^A' {.text section}
compiler exited with status 1

FAIL: gcc.dg/tree-prof/pr34999.c compilation,  -fprofile-use -D_PROFILE_USE
/tmp/ccDOht2P.s: Assembler messages:
/tmp/ccDOht2P.s:109: Error: can't resolve `.text.unlikely' {.text.unlikely section} - `L0^A' {.text section}
compiler exited with status 1

These are the last C testsuite failures I get on this box.  Perhaps we can xfail them when we branch if it won't be fixed in time for 4.4?

Comment 23 Steven Bosscher 2009-03-27 17:27:03 UTC
The checkin of the broken patch mentioned in comment #15 is here:
http://gcc.gnu.org/ml/gcc-cvs/2005-03/msg01521.html

Poor ctice also didn't manage to properly check in a ChangeLog entry, but oh well.
Comment 24 Jakub Jelinek 2009-07-24 15:37:04 UTC
I have a patch.
Comment 25 Jakub Jelinek 2009-07-24 23:30:54 UTC
Subject: Bug 34999

Author: jakub
Date: Fri Jul 24 23:30:39 2009
New Revision: 150069

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=150069
Log:
	PR rtl-optimization/34999
	* dwarf2out.c (struct dw_fde_struct): Add dw_fde_switch_cfi
	and dw_fde_switched_cold_to_hot fields.
	(output_cfi_p): New function.
	(output_call_frame_info): If fde->dw_fde_switched_sections,
	output 2 FDEs instead of one with corrupted header.
	(dwarf2out_do_cfi_startproc): New function.
	(dwarf2out_begin_prologue): Use it.  Initialize fde->dw_fde_switch_cfi
	and fde->dw_fde_switched_cold_to_hot.
	(dwarf2out_switch_text_section): Compute
	fde->dw_fde_switched_cold_to_hot.  Switch to new text section here.
	If dwarf2out_do_cfi_asm, emit .cfi_endproc before it and call
	dwarf2out_do_cfi_startproc plus emit again currently active CFI insns.
	Otherwise, compute fde->dw_fde_switch_cfi.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/dwarf2out.c

Comment 26 Kaveh Ghazi 2009-08-06 19:54:19 UTC
The patch fixed the bb-reorg.c and pr34999.c testsuite failures on my x86_64 box on mainline.  However I still see the failures on the 4.4 branch.

Jakub - Is your patch suitable for 4.4?
If so, will you please backport it?

Thanks.
Comment 27 Kaveh Ghazi 2009-09-07 01:04:31 UTC
(In reply to comment #25)
> Subject: Bug 34999
> Author: jakub
> Date: Fri Jul 24 23:30:39 2009
> New Revision: 150069
> URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=150069
> Log:
>         PR rtl-optimization/34999
>         * dwarf2out.c (struct dw_fde_struct): Add dw_fde_switch_cfi
>         and dw_fde_switched_cold_to_hot fields.
>         (output_cfi_p): New function.
>         (output_call_frame_info): If fde->dw_fde_switched_sections,
>         output 2 FDEs instead of one with corrupted header.
>         (dwarf2out_do_cfi_startproc): New function.
>         (dwarf2out_begin_prologue): Use it.  Initialize fde->dw_fde_switch_cfi
>         and fde->dw_fde_switched_cold_to_hot.
>         (dwarf2out_switch_text_section): Compute
>         fde->dw_fde_switched_cold_to_hot.  Switch to new text section here.
>         If dwarf2out_do_cfi_asm, emit .cfi_endproc before it and call
>         dwarf2out_do_cfi_startproc plus emit again currently active CFI insns.
>         Otherwise, compute fde->dw_fde_switch_cfi.
> Modified:
>     trunk/gcc/ChangeLog
>     trunk/gcc/dwarf2out.c

Is this patch suitable for 4.4.x?

Comment 28 Kaveh Ghazi 2009-12-01 15:14:31 UTC
Backport to 4.4?  Or close this and PR41501?
Comment 29 Richard Biener 2009-12-01 15:16:31 UTC
The CFI stuff isn't on the 4.4 branch, so WONTFIX there.