Bug 21990 - Wrong code for 4.0 and head: Reload clobbers the frame pointer by using it as spill register without recognizing the clobbering
Summary: Wrong code for 4.0 and head: Reload clobbers the frame pointer by using it as...
Status: RESOLVED WORKSFORME
Alias: None
Product: gcc
Classification: Unclassified
Component: middle-end (show other bugs)
Version: 4.0.0
: P2 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords: wrong-code
Depends on:
Blocks:
 
Reported: 2005-06-09 22:28 UTC by Björn Haase
Modified: 2007-08-23 20:39 UTC (History)
4 users (show)

See Also:
Host:
Target: avr-*-*
Build:
Known to work:
Known to fail:
Last reconfirmed: 2005-06-15 03:22:24


Attachments
Modified test case. (715 bytes, text/plain)
2005-06-10 16:43 UTC, Björn Haase
Details
Patch for avr_hard_regno_mode_ok (700 bytes, patch)
2005-09-04 10:32 UTC, Björn Haase
Details | Diff
Corrected typo. (704 bytes, patch)
2005-09-04 11:27 UTC, Björn Haase
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Björn Haase 2005-06-09 22:28:07 UTC
Hi, 
 
I have observed a wrong code bug that I judge to be so serious that IMHO one 
should discourage use of the avr port for 4.x.x until it is resolved. 
 
Unfortunately the bug showed up in a deeply embedded code segment of *really* 
confidential code owned by my employer. I unfortunately cannot pass the test 
case I am having to the public. I am working hard for reproducing the bug in a 
non-confidential stripped file. Meanwhile, I have the following diagnosis: ( I, 
know that this will be difficult, but maybe someone could never the less give 
me an indication already on this base.) 
 
 
1.) Background info for those not too familiar with the AVR port: 
AVR uses the hard reg 28 (also called Y-Register) as frame pointer,  
when necessary. If a pseudo happens to end up on a stack slot and needs to be 
loaded into a register of class POINTER_REG (one of the hard regs 26 (reg X), 
28 (reg Y) or 30(reg Z)), reload sometimes generates code where the frame 
pointer 28 (reg Y) itself is used as spill register. GCC, however, still uses 
the Y register as frame pointer and does not recognize that it no longer points 
to the stack.  
 
2.) Description of the bug: 
 
Excerpt from the debugging dump originating from the reload pass 
for my testcase (test.c.23.greg): 
 
Reload 0: reload_in (HI) = (reg/v/f:HI 65 [ pointer_to_next_C ]) 
        reload_out (HI) = (reg/v/f:HI 65 [ pointer_to_next_C ]) 
        POINTER_REGS, RELOAD_OTHER (opnum = 1) 
        reload_in_reg: (reg/v/f:HI 65 [ pointer_to_next_C ]) 
        reload_out_reg: (reg/v/f:HI 65 [ pointer_to_next_C ]) 
        reload_reg_rtx: (reg:HI 28 r28) 
 
The corresponding asm output reads (Problem case plus part of the prologue for 
illustration. descriptive comments added by me): 
 
        ;; Section of the prologue allocating space on the stack for a HImode 
        ;; variable. 
         
        in r28,__SP_L__ 
        in r29,__SP_H__ 
        sbiw r28,2 
        in __tmp_reg__,__SREG__ 
        cli 
        out __SP_H__,r29 
        out __SREG__,__tmp_reg__ 
        out __SP_L__,r28 
        ;; Stack pointer and frame pointer r28:r29 are now increased by 2 for 
        ;; making place for a pointer variable. Y+1 and Y+2 point to the 
        ;; lsb and the msb of this pointer variable. 
  
        ... some complicated code ... 
 
        ;; Now the variable on the stack slot is needed in a spill register 
        ;; of class "pointer registers" (i.e. r26-r31). 
        ;; GCC chooses to use the Frame pointer r28 as spill register.  
        ;; r28 is overwritten by the value of the variable on the stack 
        ;; (GCC uses special pattern for loading Y from an address Y is  
        ;; pointing to by use of the tmp reg. 
 
        ldd __tmp_reg__,Y+1 
        ldd r29,Y+2 
        mov r28,__tmp_reg__ 
        ; Now the frame pointer no longer points to the stack! 
 
        ; Use Y for accessing the memory the pointer variable on the stack 
        ; points to. 
        ld r18,Y+ 
        ld r19,Y+ 
        ld r20,Y+ 
        ld r21,Y+ 
 
        ;; GCC now assumes that it could write Y back to the stack 
        ;; by the following two commands. BUG!!! Also all of the subsequent 
        ;; Accesses to the stack by the frame pointer will fail. 
        std Y+2,r29 
        std Y+1,r28 
 
The bug exists already in January 05 and in today's head 4.1. It was present in 
all of the intermediate snapshots I have analyzed so far. I was able to 
reproduce the problem identically on a cygwin-woe200 host, an i386-linux host 
and an amd64-linux host. 
 
I am working hard to reduce the test case so that it will be possible for me to 
post it here. Meanwhile, I'd appreciate very much any advice on where to look 
for the origin. 
 
Yours, 
 
Björn
Comment 1 Andrew Pinski 2005-06-09 22:33:37 UTC
Waiting for a testcase, it could be anything really, a target bug or even a reload one.
Comment 2 Bjoern Haase 2005-06-10 12:10:41 UTC
Hi,
here is the promised test case. I unfortunately had to use a number of asm 
statements in order to reproduce the register contraint combination exposing 
the bug. The test case compiles only with -morder1  (== best configuration for
AVR when using many SImode variables) but ICEs otherwise.


Start of testcase:

// Testcase to be compiled with avr-gcc -Os -morder1
#include <inttypes.h>

typedef struct 
{
  uint32_t HLFG;
} special_struct;

void
foo3 (special_struct *a, const special_struct b)
{ 
    asm (
	 ";"
	 : "+r" (a->HLFG) : "r" (b.HLFG) );	 
}

uint16_t inline static 
foo (const uint32_t i, const special_struct f)
{ uint16_t ui16_resultat;
  asm volatile (
      ";"
      : "=&r" (ui16_resultat)
      : "r"   (i),
        "r"   (f.HLFG)
      );  
 return ui16_resultat;
}

special_struct inline static
foo2 (const special_struct m1, const uint16_t m2)
{ special_struct resultat;

  asm volatile (
      ";"
      : "=&r" (resultat.HLFG)
      : "r"   (m1.HLFG),
        "r"   (m2)
      );  

  if (m1.HLFG & 0x00020000)
   {
     resultat.HLFG = ~resultat.HLFG;
   }  

  return resultat;
}


special_struct inline static
foo7 (uint16_t t)
{ special_struct resultat;
  asm volatile (";" : "=r" (resultat.HLFG):);
  
  if (t & 0x020)
   t = ~(t);
    
  if (t & 0x0100)
   { 
     t = ~t;

     asm volatile (";" : "+r" (resultat.HLFG):);
   }
  
  asm volatile (";" : "+d" (t):);
      
  
   if (t > 9300)
    { 
      t = 12743 - ( (int16_t) t);
      uint16_t ui16_r = 7480;
      
      asm volatile (
              ";"
           : "+&r" (resultat.HLFG)
	   : "r" (t),
	     "r" (ui16_r)
	   );

    } else if (t > 9211)
    {    
      uint16_t ui16_r = 1108;
      asm volatile (
		 ";"
           : "+&r" (resultat.HLFG)
	   : "r" (t),
	     "r" (ui16_r)
	   );
    } else
    { 
      uint16_t r = 440;
      asm volatile ( 
		     ";" 
		    : "+&d" (resultat.HLFG),
		      "+&r" (r)
		    : "r"   (t) );
    }
  
  return resultat;
}

typedef struct
   { uint16_t q;
     uint16_t a;
   } lp_t;

typedef struct
   { 

     lp_t l[19];

     uint32_t sf;

   } vmt;

vmt m;

special_struct sp_C[19];

special_struct
problem (uint32_t s)
{  
   special_struct *p_C;
   special_struct ga;
   ga.HLFG = 0;

   p_C = &sp_C[0];
   uint8_t lz;
   
   for (lz=0; lz < 23; lz ++)
   { 
     special_struct ac;     
     {
       special_struct C;
       
       C = *p_C;
       p_C ++;
       
       uint16_t t;
       t = m.l [lz].q;
     
       ac = foo7 ( foo (s, C) - t);
     }
     
     { uint16_t e;
       e = m.l [lz].a;
       foo3 (&ga, foo2 (ac, e));
     }     
   }
   
   return ga;
}
Comment 3 Björn Haase 2005-06-10 16:43:02 UTC
Created attachment 9061 [details]
Modified test case.

Hi, 
 
advancing further, I recognized that all of the asm statements in the test case

could be replaced by standard C expression except for a single one. Look at the
attached file.
 
It seems that the immediate wrong code issue is triggered by constraint 
 
asm volatile (	
		";"  
	      : "+&d" (resultat.HLFG), 
		"+&r" (r) 
	      : "r"   (t) );  
 
Reload seems to have difficulties with this "two combined input/output 
operands" expression. I would consider it to be perfectly acceptable if reload
refuses to handle this case (i.e. triggers an ICE). 
 
However, what I am observing is that when using above asm statement for
generatating  stress for reload, reload generates code that alters 
the frame pointer! This is, IMO, extremely suspicious. I have serious concerns
regarding the correctness of the code as soon as frame pointers are used.

(There were other reports stating that applications used to work with the 3.4.x
series and failed with 4.0.0 as soon as optimizations were switched on.)

Yours,
 
Björn
Comment 4 Björn Haase 2005-07-29 20:06:32 UTC
Hi, 
 
a couple of weeks after identifying this issue, I have continued working with 
4.0.0. I never observed a similar failure in our production code. The issue 
seems to be strongly linked to the constraint requiring simultaneously two 
operands with the "+&" modifier. The back-end, itself never generates such a 
constraint pattern right now. 
 
It seems to me that the real problem has to do with register constraint 
matching. I personally do not have sufficient knowledge so that I would 
consider to start thinking about daring to eventually touch reload.c and 
comrades. On the other hand, I still feel uneasy about this bug.  
 
For this reason I am thinking about a workaround solely within the back-end. 
While refraining from requiring that code that contains the mentioned difficult 
constraits compiles, I presently think that one might already be better of with 
a situation where it triggers an ICE instead of generating wrong code. 
 
IMO, a possible workaround might be to simply disable within the machine 
description the move patterns that meet the two following two conditions  
1.) the move uses a memory reference stored in Y 
2.) the move target is the Y register itself 
 
My hope is that in absence of this move pattern, reload will no longer attempt 
to use the frame pointer register Y as spill register. 
 
This workaround certainly would be far from ideal, but better than nothing. I'd 
appreciate comments on this issue. 
 
Yours, 
 
Björn 
Comment 5 Björn Haase 2005-08-20 13:49:50 UTC
I have re-investigated this bug today and observed that it no longer appears in 
mainline. It is also absent in today's cvs state of the 4.0 branch. 
 
Dunno whether the original problem has been fixed or whether something else has 
changed such that the bug is no longer exposed: Since the register allocator 
now seems to be smarter on both branches than it used to be, the function 
exposing the bug no longer needs the frame pointer. 
 
Yours, 
 
Björn 
Comment 6 Björn Haase 2005-09-03 14:25:39 UTC
Hi, 
 
I now think that the true origin of this PR is related to the fact that for AVR 
we need *two* registers for holding the frame pointer: 
 
Recently, I have played around with modifying avr-gcc by choosing different 
registers for the frame pointer and the stack pointer. Namely I have 
experimented with the calling convention used by the IAR system (use two 
seperate stacks for the return addresses and data). 
It turned out that the stack pointer and frame pointer register number during 
my experiments that was used by gcc was not 28 but 29! 
 
Possibly one would have to re-investigate avr_hard_regno_mode_ok(REGNO, MODE) 
function in avr.c that implements the HARD_REGNO_MODE_OK target macro. IIUC, 
there have been a couple of problems in the past with similar results: 
clobbered frame pointer. Maybe the fact that this bug does not show up 
frequently has to do with that the constraints for the Y register implemented 
in avr_hard_regno_mode_ok are fairly restrictive right now, so that the 
register allocator and reload almost never are tempted to use it. 
 
Important information would be wether avr_hard_regno_mode_ok is evaluated 
dynamically for each function. In this case one could probably prevent 
clobbering of the frame pointer by denying any mode in regs 28 and 29 in case 
that a frame pointer is needed. This way one possibly could also lift the 
present restriction that only pointers (Pmode objects) may be held in the y 
register. 
 
So I think that we are right now having a hidden problem in avr-gcc that shows 
up only in very rare occasions. Most probably it's also present in the 3.4.4 
series. 
 
Yours, 
 
Björn 
Comment 7 Björn Haase 2005-09-04 10:32:12 UTC
Created attachment 9662 [details]
Patch for avr_hard_regno_mode_ok
Comment 8 Björn Haase 2005-09-04 10:35:36 UTC
Indeed IMHO the problem stems from the patch:   
   
+2005-01-22  Roger Sayle  <roger@eyesopen.com>   
+   
+	PR middle-end/19378   
+	* config/avr/avr.c (avr_hard_regno_mode_ok): Rewrite.   
+   
   
IMO the problem stems from the change   
   
-  /* Reload can use r28:r29 for reload register and for frame pointer   
-   in one insn. It's wrong. We must disable it.  */   
-  if (mode != Pmode && reload_in_progress && frame_pointer_required_p ()   
-      && regno <= REG_Y && (regno + GET_MODE_SIZE (mode)) >= (REG_Y + 1))   
+  /* Otherwise disallow all regno/mode combinations that span r28:r29.  */   
+  if (regno <= (REG_Y + 1) && (regno + GET_MODE_SIZE (mode)) >= (REG_Y + 1))   
     return 0;   
   
Denis Chertykov's older version using "frame_pointer_required_p ()" has now   
vanished. A problem during register allocation has been removed by Roger  
Sayle's patch. However, the old frame pointer bug has been re-introduced.   
   
My suggestion is to use the following implementation. (according patch to   
today's head attached) This patch has passed regression tests on head without   
new regressions on head on the c-testsuite for compilation and simulator  
excecution for the atmega128.   
I also verified on an old snapshot that had exposed PR21990 that this patch  
resolves PR21990.  
  
I'll post this message also on gcc-patches.  
  
Yours,  
  
Bjoern  
Comment 9 Giovanni Bajo 2005-09-04 11:19:51 UTC
Roger, want to have a look at this?
Comment 10 Björn Haase 2005-09-04 11:27:01 UTC
Created attachment 9663 [details]
Corrected typo.

replaces previous patch.
Comment 11 Björn Haase 2005-09-04 11:28:33 UTC
I just realized that the patch posted here had just posted had a <= where  
should have been a < for two comparisons. The patch on gcc-patches@gcc.gnu.org 
is already correct. 
Comment 12 Eric Weddington 2006-09-19 19:41:20 UTC
(In reply to comment #11)
> I just realized that the patch posted here had just posted had a <= where  
> should have been a < for two comparisons. The patch on gcc-patches@gcc.gnu.org 
> is already correct. 

Bjoern,

1. Is this patch already applied?
2. If yes, what versions is it applied to?
3. If no, can you post the link to the patch on gcc-patches? Or can you attach the corrected patch to this bug report?

Thanks
Comment 13 Björn Haase 2006-09-19 20:16:36 UTC
Hello Eric,

IIRC, the bug never was really resolved. The true place to fix the issue was, IMO, the most dreaded source file of the entire GCC source tree: reload. 

My now quite old patch tried to fix the immediate problem without touching reload and it used a fairly crude method by denying gcc to make use of Y in many situations.

My test case generated code that was very difficult to handle by the register allocator. This code triggered the bug. Later on some change in some of the passes just before the 4.0.0 release removed the immediate problem. With 4.0 and afterwards I no longer succeeded to reproduce the bug:

The reason might be, that the original reload bug was fixed.
The reason might as well be that the reload bug is still there, but that it's no longer exposed due to some other modification in the compiler.

The message is: There stand good chances that this bug is resolved, but I cannot prove it.

In any case, I'd like to suggest *NOT* to make use of my old crude patch. IMO it's better to just hope that the problem is fixed.

Until now, I never again stepped over this bug, so I have good confidence.

HTH,

Bjoern
Comment 14 Eric Weddington 2006-09-19 20:25:30 UTC
Thanks, Bjoern, for responding in detail.

If this bug cannot be reproduced, can we go ahead and close this bug report?

Eric
Comment 15 Eric Weddington 2007-08-23 20:39:40 UTC
Closing bug as WORKSFORME based on Bjoern's observations in comment #13.