This is the mail archive of the
gcc@gcc.gnu.org
mailing list for the GCC project.
Re: new register allocator and HARD_REGNO_CALL_PART_CLOBBERED
- From: hermantenbrugge at home dot nl (Herman ten Brugge)
- To: dalej at apple dot com (Dale Johannesen)
- Cc: gcc at gcc dot gnu dot org, dberlin at dberlin dot org
- Date: Thu, 1 May 2003 21:21:10 +0200 (CEST)
- Subject: Re: new register allocator and HARD_REGNO_CALL_PART_CLOBBERED
Dale Johannesen wrote:
>
> On Wednesday, April 30, 2003, at 12:31 PM, Herman ten Brugge wrote:
>
> > Hello,
> >
> > I found a problem with the -fnew-ra option of the compiler. I analysed
> > the problem and found that the macro HARD_REGNO_CALL_PART_CLOBBERED
> > is not used by the new register allocator.
> > I think this is also a bit difficult to implement because we now
> > have to check the mode for every register that is used during a
> > CALL_INSN.
>
> You can try the following (mail may have screwed up spacing). It may
> result in even worse code than your patch, but maybe it's better than
> trying to work around this new-ra problem in individual targets.
> (Untested, I do not have a c4x.)
I do not think the problem is in df.c. The code in df.c walks through the
RTL and records which registers are clobbered or used. At that point we do
not yet know what mode the hard registers will get (reg_raw_mode cannot be
used here). The mode is assigned in the new-ra pass.

df.c runs before any hard registers have been assigned; only pseudo
registers exist at that point. HARD_REGNO_CALL_PART_CLOBBERED works only on
hard registers, so how should this work? See how
HARD_REGNO_CALL_PART_CLOBBERED is used in local_alloc.c and global.c.
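
To make the point above concrete, here is a minimal toy model in C. It is not GCC source: every name in it (toy_mode, TOY_FIRST_PSEUDO, toy_call_part_clobbered, toy_query) is hypothetical, and the choice of register 4 as "saved across calls as an integer only" is an assumption taken from the f4 example later in this message. It only illustrates that a call-part-clobbered question needs both a hard register number and a mode, so it has no answer for a pseudo register.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy model, not GCC source. */
enum toy_mode { TOY_QImode, TOY_QFmode };

/* Assumption for this sketch: register numbers below this are hard
   registers, everything else is a pseudo register. */
#define TOY_FIRST_PSEUDO 32

/* Assumption for this sketch: hard register 4 is preserved across a
   call only in integer mode, so a QFmode value in it is (partly)
   clobbered by the call. */
static bool toy_call_part_clobbered(int hard_regno, enum toy_mode mode)
{
    return hard_regno == 4 && mode == TOY_QFmode;
}

/* Returns 1 if the register is partly clobbered by a call, 0 if it is
   not, and -1 if the question cannot be answered because the register
   is still a pseudo and has no hard register number yet. */
static int toy_query(int regno, enum toy_mode mode)
{
    if (regno >= TOY_FIRST_PSEUDO)
        return -1;
    return toy_call_part_clobbered(regno, mode) ? 1 : 0;
}

/* Small self-check showing the three cases. */
void toy_query_demo(void)
{
    assert(toy_query(4, TOY_QImode) == 0);    /* integer value survives */
    assert(toy_query(4, TOY_QFmode) == 1);    /* float value does not */
    assert(toy_query(100, TOY_QFmode) == -1); /* pseudo: unknown */
}
```

In this model the answer for a pseudo only becomes available once an allocator has picked a hard register for it, which is why the check fits more naturally in the allocation pass than in a pass that runs on pseudos.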
When the code below is compiled with '-O3 -fnew-ra' I see (with your patch
applied):
	laj	_ran
	ldfu	f2,f4
	stik	0,*+ar3(1)
	push	r11
laj is a call insn with 3 delay slots. The first delay slot insn is
ldfu f2,f4. This suggests that f2 is kept in f4 across the call, but that
is not possible because f4 is not saved as a floating point register. It
is only saved as an integer register. (f means a floating point register
in QFmode, r means an integer register in QImode. The registers are 40
bits wide: the 32 msb's are used for floating point values and the 32
lsb's are used for integer values. I did not design this!)
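
The register layout described above can be sketched in C as follows. This is a toy model under stated assumptions, not a description of the real hardware: the 40-bit register is held in the low bits of a uint64_t, all function names are hypothetical, and the sketch simply zeroes the top 8 bits after an integer-mode save and restore (what the hardware actually leaves there is not modelled). It shows why a floating point value does not survive in a register that is only saved in integer mode.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model: a 40-bit register kept in the low 40 bits of a uint64_t. */

/* Floating point view: the 32 most significant bits (bits 39..8). */
static uint32_t toy_float_bits(uint64_t reg)
{
    return (uint32_t)((reg >> 8) & 0xFFFFFFFFu);
}

/* Integer view: the 32 least significant bits (bits 31..0). */
static uint32_t toy_int_bits(uint64_t reg)
{
    return (uint32_t)(reg & 0xFFFFFFFFu);
}

/* Save and restore in integer mode: only the 32 lsb's make the round
   trip. Assumption: the top 8 bits come back as zero. */
static uint64_t toy_save_restore_as_int(uint64_t reg)
{
    return (uint64_t)toy_int_bits(reg);
}

/* Small self-check: the integer view survives, the float view does not. */
void toy_register_demo(void)
{
    uint64_t reg = UINT64_C(0xAB12345678);
    uint64_t after = toy_save_restore_as_int(reg);

    assert(toy_int_bits(after) == toy_int_bits(reg));
    assert(toy_float_bits(after) != toy_float_bits(reg));
}
```

The two views overlap in 24 bits, so an integer-mode save keeps most of the float's bits but loses the 8 bits that exist only in the floating point view.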
You can generate a c4x compiler with:
configure --target=c4x --enable-languages=c
This gets you at least the cc1 compiler. Then things start failing because
you probably don't have the include files and libraries for the c4x.
Herman.
---------------------------------------------------
double ran(int *idum);
void maxbol(double vp , double *vx , double *vy , double *vz);

main ()
{
  double vp = 0.0048;
  double vx;
  double vy;
  double vz;

  maxbol(vp, &vx , &vy , &vz );
  if (vx < 0.0013165056 || vx > 0.0013165057)
    abort();
  if (vy < 0.002731491 || vy > 0.002731492)
    abort();
  if (vz < 0.001561453 || vz > 0.001561455)
    abort();
  exit(0);
}

void maxbol(double vp , double *vx , double *vy , double *vz)
{
  int idum=0;
  int i;
  double temp;

  *vx=vp*ran( &idum );
  *vy=vp*ran( &idum );
  *vz=vp*ran( &idum );
}

double ran(int *idum)
{
  static long ix1,ix2,ix3;
  static double r[97];
  double temp;
  static int iff=0;
  int j;

  if(*idum<0 || iff==0){
    iff=1;
    ix1=(54773-(*idum))%259200;
    ix1=(7141*ix1+54773)%259200;
    ix2=ix1 %134456;
    ix1=(7141*ix1+54773)%259200;
    ix3=ix1 %243000;
    for(j=0; j<97; j++){
      ix1=(7141*ix1+54773)%259200;
      ix2=(8121*ix2+28411)%134456;
      r[j]=(ix1+ix2*(1.0/134456))*(1.0/259200);
    }
    *idum=1;
  }
  ix1=(7141*ix1+54773)%259200;
  ix2=(8121*ix2+28411)%134456;
  ix3=(4561*ix3+51349)%243000;
  j=((97*ix3)/243000);
  if(j >= 97 && j < 0)
    abort();
  temp=r[j];
  r[j]=(ix1+ix2*(1.0/134456))*(1.0/259200);
  return temp;
}