This is the mail archive of the
mailing list for the GCC project.
X86 and register classes
- From: "David S. Miller" <davem at redhat dot com>
- To: gcc at gcc dot gnu dot org
- Date: Tue, 21 May 2002 21:49:00 -0700 (PDT)
- Subject: X86 and register classes
After lots of oprofile runs and some work taming rtx_cost() and other
functions that recursively walk RTL, two functions remain top of the
list on x86, one of which is record_reg_classes.
Clearly this is because x86 has _TWENTY FIVE_ register classes. This
explodes the cost tables and the complexity of the alternative
scanning per-class loops record_reg_classes has to do. It also
contributes to reload complexity, but I'll leave that for another time.
Let's look at what N_REG_CLASSES is for some other targets:
The only thing I can say for some of those larger cases, including
x86, is "Yikes!"
It's a shame we compute costs considering MMX, SSE, float regs, etc.
when we are looking at an add of two SI mode values, for example. Sure,
in some weird case we can transform it into an MMX add or something
like that, but most of the time it is wasted work in regclass.
Next note, in trying to consider ways to improve this situation:
record_reg_classes traverses two different cost tables, each of size
MAX_MACHINE_MODE X N_REG_CLASSES X N_REG_CLASSES.
Sometimes the 'class' loop index applies to the second index
and sometimes to the third. That has to really stink for cache
usage, in fact it's probably close to perfectly suboptimal (each index
replaces the L1 cache line used by the previous iteration).
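To make the cache concern concrete, here is a minimal sketch of the distance in memory between consecutive loop iterations, depending on which index the 'class' loop variable drives. The table name cost_table, the helper byte_offset, and the sizes N_MODES and N_CLASSES are hypothetical stand-ins, not the actual regclass.c declarations.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative sizes only (assumptions); the real values are target-specific. */
#define N_MODES   4    /* stand-in for MAX_MACHINE_MODE */
#define N_CLASSES 25   /* stand-in for N_REG_CLASSES on x86 */

/* Hypothetical table with the same shape as the cost tables discussed. */
static int cost_table[N_MODES][N_CLASSES][N_CLASSES];

/* Byte offset of cost_table[mode][a][b] from the start of the table. */
static size_t byte_offset(int mode, int a, int b)
{
  assert(mode >= 0 && mode < N_MODES);
  assert(a >= 0 && a < N_CLASSES);
  assert(b >= 0 && b < N_CLASSES);
  return (size_t) ((const char *) &cost_table[mode][a][b]
                   - (const char *) cost_table);
}

/* Stepping the second index jumps a whole row of N_CLASSES ints. */
static size_t stride_second_index(void)
{
  return byte_offset(0, 1, 0) - byte_offset(0, 0, 0);
}

/* Stepping the third (rightmost) index moves by a single int. */
static size_t stride_third_index(void)
{
  return byte_offset(0, 0, 1) - byte_offset(0, 0, 0);
}
```

With 25 classes, each step of the second index skips 25 ints, so consecutive iterations tend to touch different cache lines, whereas stepping the third index stays within the same line for several iterations.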
Some day we'd like GCC to be able to optimize this such that the
tables will be walked more linearly in memory. However, today such an
optimization does not exist in GCC so we have to do this by hand. The
best way to go about this is to make the linear indexing occur with
the rightmost index.
I tried this with may_move_in_cost but this didn't seem to help
things much, if at all. Ho hum...
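The rearrangement described above can be sketched roughly as follows: keep a transposed copy of the table so that the loop over classes always runs along the rightmost index. This is only an illustration under assumed names (in_cost, in_cost_t, fill_example, and the sizes are hypothetical); it is not the change that was actually tried on may_move_in_cost.

```c
#include <assert.h>

/* Illustrative sizes only (assumptions); the real values are target-specific. */
#define N_MODES   4    /* stand-in for MAX_MACHINE_MODE */
#define N_CLASSES 25   /* stand-in for N_REG_CLASSES on x86 */

static int in_cost[N_MODES][N_CLASSES][N_CLASSES];    /* original layout */
static int in_cost_t[N_MODES][N_CLASSES][N_CLASSES];  /* last two indices swapped */

/* Fill the original table with arbitrary but distinct example values. */
static void fill_example(void)
{
  int m, a, b;
  for (m = 0; m < N_MODES; m++)
    for (a = 0; a < N_CLASSES; a++)
      for (b = 0; b < N_CLASSES; b++)
        in_cost[m][a][b] = m * 1000 + a * 31 + b;
}

/* Build the transposed copy once, up front. */
static void build_transposed(void)
{
  int m, a, b;
  for (m = 0; m < N_MODES; m++)
    for (a = 0; a < N_CLASSES; a++)
      for (b = 0; b < N_CLASSES; b++)
        in_cost_t[m][b][a] = in_cost[m][a][b];
}

/* 'class' drives the second index: each step jumps a full row. */
static int sum_strided(int mode, int fixed)
{
  int cls, sum = 0;
  assert(mode >= 0 && mode < N_MODES && fixed >= 0 && fixed < N_CLASSES);
  for (cls = 0; cls < N_CLASSES; cls++)
    sum += in_cost[mode][cls][fixed];
  return sum;
}

/* Same values, but 'class' now drives the rightmost index. */
static int sum_linear(int mode, int fixed)
{
  int cls, sum = 0;
  assert(mode >= 0 && mode < N_MODES && fixed >= 0 && fixed < N_CLASSES);
  for (cls = 0; cls < N_CLASSES; cls++)
    sum += in_cost_t[mode][fixed][cls];
  return sum;
}
```

Both functions read the same values; only the memory access order differs. Whether the linear walk pays off in practice depends on the table sizes and the rest of the loop body, which is consistent with the lack of measurable gain reported here.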
An area of exploration for improvements for someone so inclined...