This is the mail archive of the
gcc@gcc.gnu.org
mailing list for the GCC project.
Short Displacement Problems.
- From: "Naveen Sharma, Noida" <naveens at noida dot hcltech dot com>
- To: law at redhat dot com, Joern Rennecke <joern dot rennecke at superh dot com>, Alexandre Oliva <aoliva at redhat dot com>, bernds at redhat dot com, gniibe at m17n dot org, Richard Henderson <rth at redhat dot com>
- Cc: gcc at gcc dot gnu dot org
- Date: Wed, 29 May 2002 17:57:04 +0530
- Subject: Short Displacement Problems.
Hi Everyone,
This is in continuation of my earlier postings
1. http://gcc.gnu.org/ml/gcc/2002-05/msg02662.html
2. http://gcc.gnu.org/ml/gcc/2002-04/msg00379.html
Please have a look at the problem and comment on
1. Whether solving it would bring a significant gain on
architectures that have short (4-bit or 6-bit) displacements.
2. Any obvious issues you see in the solutions (described below)
that I am considering.
I have studied the problem for the SH architecture, but other architectures
(mips16, hppa, etc.) have similar problems.
Your comments will help me take the right direction.
Now let me describe the problem and solution in detail.
For sample code like this:
    void func(void)
    {
        float fla[16];
        int l,m,n;
        putval(&l,&m,&n);
        l=m+n;
        func1(l,m,n);
    }
GCC produces this code (sh-elf) for the statement "l=m+n":
    add     #72,r6
    mov     r14,r1      ! moving frame pointer r14 --> r1
    add     #68,r1      ! reaching "m"
    mov.l   @r1+,r5     ! r5 <-- m and reaching "n"
    mov.l   @r1,r6      ! r6 <-- n
    mov     r5,r4       ! r4 <-- m
    add     r6,r4       ! l <-- m+n
    add     #-8,r1
The code looks like this because
1) SH has a 64-byte limit on displacements, and
2) the stack layout is "fla,l,m,n".
Ideally, if the stack were laid out as "l,m,n,fla" instead,
we would get code something like
    mov.l   @(4,r14),r5
    mov.l   @(8,r14),r6
    mov     r5,r4
    add     r6,r4
which has two advantages:
1. Reduced code size.
2. Register r1 is free. In larger programs, a register being
free at register allocation time means fewer spills and
better code overall.
As I understand it, we need to do two things:
1. Reorder the stack slots by increasing size.
2. For variables of equal size, base their layout on the stack
on locality of access.
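
A minimal sketch of the first point, to make the arithmetic concrete. It is not GCC code: `struct slot`, `assign_offsets` and `layout_by_increasing_size` are hypothetical names, alignment and padding are ignored, and the reachable range 0..60 is an assumption based on the SH `mov.l @(disp,Rn)` form (4-bit displacement scaled by 4).

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical description of one stack slot; not a GCC data structure. */
struct slot {
    const char *name;
    unsigned size;    /* size in bytes */
    unsigned offset;  /* byte offset from the frame pointer, filled in below */
};

/* Assumption: mov.l @(disp,Rn) reaches byte offsets 0..60 on SH. */
enum { MAX_SHORT_DISP = 60 };

/* qsort comparator: smaller slots first. */
static int by_size(const void *a, const void *b)
{
    const struct slot *x = a, *y = b;
    return (x->size > y->size) - (x->size < y->size);
}

/* Assign offsets in array order. Returns how many slots START within
   the short-displacement range. Alignment is ignored in this sketch. */
unsigned assign_offsets(struct slot *s, size_t n)
{
    unsigned off = 0, reachable = 0;
    assert(s != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        s[i].offset = off;
        if (off <= MAX_SHORT_DISP)
            reachable++;
        off += s[i].size;
    }
    return reachable;
}

/* Reorder the slots by increasing size, then assign offsets.
   qsort is not stable, so the order among equal-sized slots is unspecified. */
unsigned layout_by_increasing_size(struct slot *s, size_t n)
{
    assert(s != NULL || n == 0);
    qsort(s, n, sizeof *s, by_size);
    return assign_offsets(s, n);
}
```

For the example above, with slots given in the order fla (64 bytes), l, m, n (4 bytes each), `assign_offsets` puts l, m and n at offsets 64, 68 and 72, so only fla starts within range. After `layout_by_increasing_size` the three ints occupy offsets 0, 4 and 8 and fla starts at 12, so all four slots start within range.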
POSSIBLE SOLUTIONS.
1. We introduce a pass for achieving objectives #1 and #2.
For addressing the locality issue we do the following:
a. Create an access sequence of data items in the insn stream.
This gives the usage and reference frequency of each variable.
E.g., for code like
    c=a+d;
    f=d+e;
the sequence in which the variables are accessed is "a d c d e f".
b. Then we construct an access graph recording the number of times
two (or more) variables are accessed adjacently (or nearby).
E.g., the access graph constructed from the above sequence
has an edge <a, d> weighted by how often a and d occur adjacent.
Ideally all adjacent references should be at a SHORT displacement.
c. From this information, we can determine a placement of variables on the
stack that minimizes large displacements. (We would be spanning this graph
to maximize the accesses that occur nearby.)
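
Steps a to c can be sketched roughly as follows. This is only an illustration under stated assumptions: each variable is a single lowercase letter, the access sequence is written without separators ("adcdef"), only strictly adjacent accesses are counted, and the placement is a simple greedy chain that is not claimed to be optimal. The names `access_graph`, `build_access_graph` and `place_greedy` are hypothetical.

```c
#include <assert.h>
#include <string.h>

enum { MAX_VARS = 26 };  /* one variable per lowercase letter */

/* weight[i][j]: number of times variables i and j are accessed adjacently. */
typedef struct {
    unsigned weight[MAX_VARS][MAX_VARS];
} access_graph;

/* Step b: build the access graph from an access sequence such as "adcdef". */
void build_access_graph(access_graph *g, const char *seq)
{
    memset(g, 0, sizeof *g);
    for (size_t i = 0; seq[i] != '\0' && seq[i + 1] != '\0'; i++) {
        int u = seq[i] - 'a', v = seq[i + 1] - 'a';
        assert(u >= 0 && u < MAX_VARS && v >= 0 && v < MAX_VARS);
        if (u != v) {
            g->weight[u][v]++;
            g->weight[v][u]++;
        }
    }
}

/* Step c (greedy heuristic): start with vars[0], then repeatedly append the
   unplaced variable with the heaviest edge to the most recently placed one.
   Ties go to the variable listed first in vars. vars must hold distinct
   letters; order must have room for strlen(vars) + 1 chars. */
size_t place_greedy(const access_graph *g, const char *vars, char *order)
{
    size_t n = strlen(vars);
    int placed[MAX_VARS] = {0};

    assert(n <= MAX_VARS);
    if (n == 0) {
        order[0] = '\0';
        return 0;
    }
    order[0] = vars[0];
    placed[vars[0] - 'a'] = 1;
    for (size_t k = 1; k < n; k++) {
        int last = order[k - 1] - 'a';
        int best = -1;
        for (size_t i = 0; i < n; i++) {
            int v = vars[i] - 'a';
            if (placed[v])
                continue;
            if (best < 0 || g->weight[last][v] > g->weight[last][best])
                best = v;
        }
        assert(best >= 0);  /* fails only if vars contains duplicates */
        order[k] = (char)('a' + best);
        placed[best] = 1;
    }
    order[n] = '\0';
    return n;
}
```

For the sequence "adcdef" the graph has edges a-d (1), c-d (2), d-e (1) and e-f (1). Calling `place_greedy` with the variables listed as "adcef" yields the stack order a, d, c, e, f, which keeps four of the five adjacent accesses in neighbouring slots.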
2. A second option is to view this in a spirit similar
to register allocation. The problem is to allocate M fast-access slots
(within an N-bit displacement window with respect to a base register)
among all references made with respect to that base register.
Variables live at the same time must be allocated to different
locations.
If it is possible to allocate a variable in the fast-access window, we do so;
otherwise the variable has to be allocated in the slow-access window
(technically, spilled to the slow-access window), and a spill means
adding extra code to access the desired variable.
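
A rough sketch of this second option, in the style of a first-fit scan over live ranges. It assumes equal-sized (word) slots, live ranges given as inclusive instruction-index intervals already sorted by start, and ignores spill cost entirely. `struct live_var`, `allocate_fast_slots` and `SLOW_WINDOW` are hypothetical names, not GCC interfaces.

```c
#include <assert.h>
#include <stddef.h>

enum { SLOW_WINDOW = -1, MAX_FAST_SLOTS = 16 };

/* Hypothetical live-range record for one stack variable. */
struct live_var {
    const char *name;
    int start, end;  /* inclusive live range, in instruction indices >= 0 */
    int slot;        /* fast slot index, or SLOW_WINDOW if "spilled" */
};

/* Give each variable the first fast slot that is free over its whole live
   range; variables that do not fit go to the slow window. The array must
   be sorted by start. Returns the number of variables sent to the slow
   window. */
size_t allocate_fast_slots(struct live_var *v, size_t n, int m)
{
    int free_at[MAX_FAST_SLOTS];  /* first index at which each slot is free */
    size_t spilled = 0;

    assert(m >= 0 && m <= MAX_FAST_SLOTS);
    for (int s = 0; s < m; s++)
        free_at[s] = 0;

    for (size_t i = 0; i < n; i++) {
        assert(v[i].start >= 0 && v[i].end >= v[i].start);
        assert(i == 0 || v[i - 1].start <= v[i].start);  /* sorted input */
        v[i].slot = SLOW_WINDOW;
        for (int s = 0; s < m; s++) {
            if (free_at[s] <= v[i].start) {
                v[i].slot = s;
                free_at[s] = v[i].end + 1;
                break;
            }
        }
        if (v[i].slot == SLOW_WINDOW)
            spilled++;
    }
    return spilled;
}
```

With two fast slots and live ranges a = [0,3], b = [1,5], c = [2,4], d = [4,6], variables a and b take slots 0 and 1, c overlaps both and goes to the slow window, and d reuses slot 0 once a is dead. A real pass would presumably pick the spill candidate by access frequency rather than by arrival order.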
An obvious problem (as I mentioned in my previous mail)
is that most stack allocations are made from reload.
While stack offset assignment would be most beneficial before register
allocation, the picture of the stack isn't clear until reload.
I want register allocation to benefit from offset assignment.
If I do stack offset assignment after register allocation,
I might get a reduction in code size, but that would not be as good,
since it won't reduce register pressure during register allocation/reload.
Thoughts and ideas?
Regards,
Naveen Sharma.