Bug 42240 - [4.3/4.4 Regression] wrong epilogue on naked function
Summary: [4.3/4.4 Regression] wrong epilogue on naked function
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: target (show other bugs)
Version: 4.3.2
: P3 major
Target Milestone: 4.5.3
Assignee: Not yet assigned to anyone
URL:
Keywords: wrong-code
Depends on:
Blocks:
 
Reported: 2009-12-01 16:18 UTC by Stefan Dreyer
Modified: 2012-01-26 22:36 UTC (History)
6 users (show)

See Also:
Host:
Target: avr
Build:
Known to work: 4.2.2, 4.5.3, 4.6.0
Known to fail: 4.3.2, 4.3.4, 4.4.3, 4.5.0
Last reconfirmed: 2010-07-26 17:37:41


Attachments
Compileable testfile (414 bytes, text/plain)
2009-12-01 16:19 UTC, Stefan Dreyer
Details
Compileable testfile (362 bytes, text/plain)
2009-12-01 16:44 UTC, Stefan Dreyer
Details
generatet assambly file (418 bytes, text/plain)
2009-12-01 16:44 UTC, Stefan Dreyer
Details
Initial patch to fix the bug (608 bytes, patch)
2010-11-09 10:24 UTC, Anitha Boyapati
Details | Diff
PR target/42240 (699 bytes, text/plain)
2010-11-09 17:16 UTC, Georg-Johann Lay
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Stefan Dreyer 2009-12-01 16:18:11 UTC
Hello!

Following code is Producing an infinit loop, so main() is never reached

extern int main(void);
int main(void){
	while(1);
}
#include <avr\io.h>
/*Seed aus SRAM Zellen*/
static uint16_t seedram __attribute__ ((section (".noinit")));
void __init_seed(void) __attribute__ ((naked, section (".init8")));
// Bestimme seed aus zufälligem RAM-Inhalt
// !!! FUNKTION NICHT AUFRUFEN       !!!
// !!! FUNKTION WIRD AUTOMATISCH     !!!
// !!! IN .init8 vor main AUSGEFÜHRT !!!
void __init_seed (void)
{
	uint16_t s = 0;
	uint16_t *p = (uint16_t*) (RAMEND+1);
	extern uint16_t __noinit_start;

	while (p >= &__noinit_start + 1)
		s ^= *(--p);

	seedram = s;
}


is going to 

	while (p >= &__noinit_start + 1)
  8e:	81 e0       	ldi	r24, 0x01	; 1
  90:	e2 30       	cpi	r30, 0x02	; 2
  92:	f8 07       	cpc	r31, r24
  94:	c0 f7       	brcc	.-16     	; 0x86 <__init_seed+0x12>
		s ^= *(--p);

	seedram = s;
  96:	30 93 01 01 	sts	0x0101, r19
  9a:	20 93 00 01 	sts	0x0100, r18
}
  9e:	20 e0       	ldi	r18, 0x00	; 0
  a0:	30 e0       	ldi	r19, 0x00	; 0
  a2:	f9 cf       	rjmp	.-14     	; 0x96 <__init_seed+0x22>

at line a2: an infinit loop starts and never leaves this function.

the Problem is only an Optimation levels O2 and O3

Orginal thread in german Forum:
http://www.mikrocontroller.net/topic/158516

compiler call:

avr-gcc -c -mmcu=atmega168 -I. -gdwarf-2 -DF_CPU=8000000UL -O2 -funsigned-char -funsigned-bitfields -fpack-struct -fshort-enums -Wall -Wstrict-prototypes -Wa,-adhlns=./main.lst  -std=c99 -MMD -MP -MF .dep/main.o.d main.c -o main.o

linker call:

avr-gcc -mmcu=atmega168 -I. -gdwarf-2 -DF_CPU=8000000UL -O2 -funsigned-char -funsigned-bitfields -fpack-struct -fshort-enums -Wall -Wstrict-prototypes -Wa,-adhlns=main.o  -std=c99 -MMD -MP -MF .dep/main.elf.d main.o --output main.elf -Wl,-Map=main.map,--cref     -lm

Compiler version:

D:\Daten\Projekte\Eclipse Workspace\test init>avr-gcc -v
Using built-in specs.
Target: avr
Configured with: ../gcc-4.3.2/configure --enable-win32-registry=WinAVR-20090313
--with-gmp=/usr/local --with-mpfr=/usr/local --prefix=/c/WinAVR --target=avr --enable-languages=c,c++,objc --with-dwarf2 --enable-doc --disable-shared --disable-libada --disable-libssp --disable-nls --with-pkgversion='WinAVR20090313' --with-bugurl='URL:http://sourceforge.net/tracker/?atid=520074&group_id=68108&func=browse'Thread model: single
gcc version 4.3.2 (WinAVR 20090313)

Greetings Stefan
Comment 1 Stefan Dreyer 2009-12-01 16:19:35 UTC
Created attachment 19199 [details]
Compileable testfile
Comment 2 Stefan Dreyer 2009-12-01 16:44:03 UTC
Created attachment 19200 [details]
Compileable testfile
Comment 3 Stefan Dreyer 2009-12-01 16:44:57 UTC
Created attachment 19201 [details]
generatet assambly file
Comment 4 Stefan Dreyer 2009-12-01 16:46:26 UTC
Comment on attachment 19201 [details]
generatet assambly file

in 
/* epilogue start */

rjmp .L6

infinite, never leaves function
Comment 5 Joerg Wunsch 2009-12-01 16:58:35 UTC
My first analysis shows that this is likely to be related to the
introduction of RTL prologues/epilogues in GCC 4.3.  GCC 4.2.2
has:

  if (avr_naked_function_p (current_function_decl))
    {
      fputs ("/* epilogue: naked */\n", file);
      goto out;
    }
...
 out:
  fprintf (file, "/* epilogue end (size=%d) */\n", epilogue_size);
  fprintf (file, "/* function %s size %d (%d) */\n", current_function_name (),
           prologue_size + function_size + epilogue_size, function_size);
  commands_in_file += prologue_size + function_size + epilogue_size;
  commands_in_prologues += prologue_size;
  commands_in_epilogues += epilogue_size;
}

GCC 4.3.4 has:

  /* epilogue: naked  */
  if (cfun->machine->is_naked)
    {
      emit_jump_insn (gen_return ());
      return;
    }

So apparently, emit_jump_insn (gen_return ()); is responsible for creating
the infinite loop.
Comment 6 Andreas Kaiser 2009-12-01 19:21:02 UTC
Same for ARM target, when compiled for Thumb instruction set (-fthumb).
Comment 7 Andreas Kaiser 2009-12-01 19:37:58 UTC
Confirmed for 4.4.2.
Comment 8 Andreas Kaiser 2009-12-01 22:50:11 UTC
For the ARM target, falling off the end of the function does not make sense anyway, since the last byte usually is followed by the constant pool. So the AVR target finally caught up with the ARM target and doesn't work that way either.
Comment 9 Anitha Boyapati 2010-07-22 11:26:18 UTC
I think the bug is not due to 'naked' function. When the naked attribute is removed for __init_seed(void), the following diff can be observed in the assembly file.

====================================================
--- test_without_naked.s	2010-07-22 15:53:13.178956400 +0530
+++ test_with_naked.s		2010-07-22 15:52:52.474373900 +0530
@@ -14,10 +14,11 @@
 .L2:
 	rjmp .L2
 	.size	main, .-main
+	.section	.init8,"ax",@progbits
 .global	__init_seed
 	.type	__init_seed, @function
 __init_seed:
-/* prologue: function */
+/* prologue: naked */
 /* frame size = 0 */
 	ldi r24,lo8(__noinit_start+2)
 	ldi r25,hi8(__noinit_start+2)
@@ -41,7 +42,6 @@
 	sts (seedram)+1,r19
 	sts seedram,r18
 /* epilogue start */
-	ret
 .L11:
 	ldi r18,lo8(0)
 	ldi r19,hi8(0)

=======================================================

It can be seen that in case of naked function, 'ret' is not generated. Hence the execution goes into infinite loop. The code generated for epilogue is wrong in few cases for either for naked/normal functions with -O2 enabled. However incase of normal functions, execution is not affected as 'ret' is executed before entering into block labelled .L11 (of main.s).

.L11:
	ldi r18,lo8(0)
	ldi r19,hi8(0)
	rjmp .L6
Comment 10 Anitha Boyapati 2010-07-26 06:49:08 UTC
Bug can be confirmed with 4.4.3 version also (With -O2)
Comment 11 Eric Weddington 2010-07-26 17:37:41 UTC
Anitha, can you please check 4.5.x and HEAD?
Comment 12 Anitha Boyapati 2010-07-27 11:35:45 UTC
confirmed for 4.5.0. Yet to verify for HEAD.
Comment 13 Georg-Johann Lay 2010-11-04 11:14:15 UTC
Implement TARGET_CANNOT_MODIFY_JUMPS_P and respect epilogue_completed and cfun->machine->is_naked. This will stop BB reorder and similar post-epilogue passes from moving the non-code-producing return-insn up.
Comment 14 Anitha Boyapati 2010-11-09 10:24:11 UTC
Created attachment 22339 [details]
Initial patch to fix the bug

BB reordering pass is suppressed for naked functions. Also suppressed when the end of epilogue is reached.
Comment 15 Anitha Boyapati 2010-11-09 10:26:04 UTC
(In reply to comment #13)
> Implement TARGET_CANNOT_MODIFY_JUMPS_P and respect epilogue_completed and
> cfun->machine->is_naked. This will stop BB reorder and similar post-epilogue
> passes from moving the non-code-producing return-insn up.

Yes, BB reordering pass is shifting jump blocks. Attached patch solves the issue. The output is: 

.brsh .L11        |     brlo .L5
                  >     ldi r24, lo8(0)
                  >     ldi r25, hi8(0)
                  >     rjmp .L6
                  >     L5:
.L11:             <
 ldi r24, lo8(0)  <
 ldi r25, hi8(0)  <
 rjmp .L6         <
Comment 16 Georg-Johann Lay 2010-11-09 17:16:02 UTC
Created attachment 22349 [details]
PR target/42240
Comment 17 Georg-Johann Lay 2010-11-09 17:20:59 UTC
(In reply to comment #14)
> Created attachment 22339 [details]
> Initial patch to fix the bug
> 
> BB reordering pass is suppressed for naked functions. Also suppressed when the
> end of epilogue is reached.

BBreordering (or other passes doing similar things) need not to be disabled. What do you think? Moreover, there is a gap between reload_completed and epilogue_completed because the naked return insn is allowed since after reload.

Georg
Comment 18 Anitha Boyapati 2010-11-10 10:38:33 UTC
(In reply to comment #17)
> (In reply to comment #14)
> > Created attachment 22339 [details] [details]
> > Initial patch to fix the bug
> > 
> > BB reordering pass is suppressed for naked functions. Also suppressed when the
> > end of epilogue is reached.
> 
> BBreordering (or other passes doing similar things) need not to be disabled.
> What do you think? 

I think BBreordering after the end of epilogue in case of normal functions is little confusing. For e.g., the same testcase with naked removed generates the following code without the patch.

.L6:
        sts (seedram)+1,r25
        sts seedram,r24
/* epilogue start */
        ret
.L11:
        ldi r24,lo8(0)
        ldi r25,hi8(0)
        rjmp .L6


I am not sure what additional advantage we get by allowing this to happen after epilogue. Hence I tried to disable this. Ofcourse, I still need to run entire regression to see if this causes any side affect.

> Moreover, there is a gap between reload_completed and
> epilogue_completed 

Agreed. Thanks for catching that.


> because the naked return insn is allowed since after reload.

I dindn't quite get this. Can you give some example?
Comment 19 Georg-Johann Lay 2010-11-10 13:00:47 UTC
(In reply to comment #18)

> I think BBreordering after the end of epilogue in case of normal functions is
> little confusing.

Yes, optimized code is often confusing.

> I am not sure what additional advantage we get by allowing this to happen after 
> epilogue. Hence I tried to disable this. 

There is no bug for functions not attributed naked. I do not see why that should be changed. Note that the backend is lying about branch costs and sets them to zero -- both in space and time. Therefore, .bbro has fun and jumps around. Tweaking that is of no concern here.

> > Moreover, there is a gap between reload_completed and
> > epilogue_completed 
> 
> Agreed. Thanks for catching that.
> 
> 
> > because the naked return insn is allowed since after reload.
> 
> I dindn't quite get this. Can you give some example?

avr.h allows that insn after reload:

(define_insn "return_from_naked_epilogue"
  [(return)]
  "(reload_completed 
    && cfun->machine 
    && cfun->machine->is_naked)"
  ...

so it might be emitted after reload. Because it is non-simple, it will occur just once, but it must be in the right place (ensured by .pro_and_epilogue) and must not be moved around (ensured by new target hook). Observe a simple test case like lined out below, and how the compiler transform with/without naked, optimize size/speed, branch cost 0/!=0

char c;

void foo (char i) 
{
  c = 17;
  switch (i)
    {
    case 0 : c = 1; break;
    case 3 : c = 2; break;
    case 7 : c = 12; break;
    case 9 : c = 32; break;
    }
}
Comment 20 denisc 2011-02-27 08:36:58 UTC
Author: denisc
Date: Sun Feb 27 08:36:55 2011
New Revision: 170534

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=170534
Log:
2011-02-22  Georg-Johann Lay  <avr@gjlay.de>

	PR target/42240
	* config/avr/avr.c (avr_cannot_modify_jumps_p): New function.
	(TARGET_CANNOT_MODIFY_JUMPS_P): Define.
	

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/avr/avr.c
Comment 21 denisc 2011-03-03 16:58:34 UTC
Author: denisc
Date: Thu Mar  3 16:58:26 2011
New Revision: 170657

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=170657
Log:
	Backport from mainline
	2011-02-22  Georg-Johann Lay  <avr@gjlay.de>

	PR target/42240
	* config/avr/avr.c (avr_cannot_modify_jumps_p): New function.
	(TARGET_CANNOT_MODIFY_JUMPS_P): Define.


Modified:
    branches/gcc-4_5-branch/gcc/ChangeLog
    branches/gcc-4_5-branch/gcc/config/avr/avr.c
Comment 22 Georg-Johann Lay 2011-04-14 15:37:51 UTC
Closed as resolved+fixed in 4.5.3, 4.6.0