This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.


[Bug target/62173] [5.0 regression] [AArch64] Performance regression due to r213488


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62173

--- Comment #9 from Jiong Wang <jiwang at gcc dot gnu.org> ---
To summarize, given the following testcases:

case A.c
===
void bar(int i)
{
  char A[10];
  g(A);
  f(A[i]);
}

case B.c
===
void bar(int i)
{
  char A[10];
  char B[10];
  char C[10];
  g(A);
  g(B);
  g(C);
  f(A[i]);
  f(B[i]);
  f(C[i]);
  return;
}

With the current code base, GCC:

  * generates sub-optimal code for case A.
  * generates optimal code for case B, because the frame addresses are
rematerialized.

I verified that *arm/mips also generate the same sub-optimal code layout for
case A*, and I believe the same holds for Sebastian's testcase.

r213488 puts AArch64 on the correct road, and we then run into a common issue
that also exists on other targets.

For any target where FRAME_GROWS_DOWNWARD is 1, the same sub-optimal code
layout will be generated, because the base address of the first stack variable
is eliminated into frame + some_minus_value in a later stage of LRA, which
means it can't be folded with other constants.
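
The arithmetic behind this can be sketched in plain C. This is only a toy
model, not GCC code: the offset -16, the enum A_OFFSET and both function names
are assumptions made up for illustration. It shows that the address of A[i] can
be formed either by materializing frame + some_minus_value first, or by leaving
the constant to be applied last, where it could in principle be folded with
other constants.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: the frame grows downward and A sits 16 bytes
   below the frame base. The value -16 is illustrative only. */
enum { A_OFFSET = -16 };

/* The base of A is materialized first as frame + A_OFFSET, then the
   index is added. The constant is consumed before anything else can
   be combined with it. */
static intptr_t addr_base_then_index(intptr_t frame, intptr_t i)
{
    intptr_t base = frame + A_OFFSET;   /* frame + some_minus_value */
    return base + i;
}

/* The index is added to the frame first, so the constant offset is
   still a separate term at the end of the computation. */
static intptr_t addr_index_then_const(intptr_t frame, intptr_t i)
{
    intptr_t t = frame + i;
    return t + A_OFFSET;
}
```

Both functions return the same address for every frame and i; they differ
only in where the constant appears in the computation.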

After my experimental hack on LEGITIMIZE_ADDRESS to associate
virtual_stack_vars_rtx with the constant offset, GCC instead:

  * generates optimal code for case A.
  * generates sub-optimal code for case B, because the frame addresses are *not
rematerialized*.

I will investigate this further, especially the frame address
rematerialization after my patch.