This is the mail archive of the mailing list for the GCC project.


[0/9] Record number of hard registers in a REG

While looking at a profile of gcc, I noticed one thing fairly high
up the list was a loop iterating over all the registers in a REG,
apparently due to the delay in computing the index for hard_regno_nregs
and then loading the value (which would often be an L1 cache miss).

When we were adding CONST_WIDE_INT, the general opinion seemed to be
that we should lay out rtxes for LP64 hosts rather than try to have two
alternative layouts, one optimised for ILP32 and one for LP64.  We therefore
unconditionally filled the 32-bit hole (on LP64) between the rtx header and
the main union with extra data.  That area is already used by REGs to store
ORIGINAL_REGNO, but on LP64 hosts there's another hole in the REGNO
field itself.  This series takes that idea a step further and uses the
hole to store the number of registers in a REG.
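To make the layout argument concrete, here is a minimal sketch of the idea, not GCC's actual definitions. Every name in it (toy_rtx, toy_reg_info, toy_nregs_table and the helper functions) is hypothetical, and the two-mode table is invented for illustration. It shows a REG payload that caches the register count in the otherwise unused 32 bits next to the register number, so the end register can be computed without a table lookup.

```c
#include <stdint.h>

enum { TOY_NUM_HARD_REGS = 4, TOY_NUM_MODES = 2 };

/* Toy per-(register, mode) table standing in for hard_regno_nregs:
   mode 0 occupies one hard register, mode 1 occupies two.  */
static const unsigned char toy_nregs_table[TOY_NUM_HARD_REGS][TOY_NUM_MODES] = {
  { 1, 2 }, { 1, 2 }, { 1, 2 }, { 1, 2 }
};

/* Hypothetical REG payload: the 32-bit register number is followed by
   32 bits that would otherwise be padding in a 64-bit union slot.  */
struct toy_reg_info {
  uint32_t regno;      /* register number */
  uint8_t nregs;       /* cached number of hard registers */
  uint8_t unused[3];   /* the 24 bits that remain spare */
};

/* Hypothetical rtx: a 32-bit header, a 32-bit slot (used for the
   original register number in a REG), then the 64-bit main union.  */
struct toy_rtx {
  uint16_t code;
  uint8_t mode;
  uint8_t flags;
  uint32_t original_regno;
  union {
    struct toy_reg_info reg;
    int64_t wide;      /* forces a 64-bit slot, as a pointer does on LP64 */
  } u;
};

/* Fill in a REG, doing the table lookup once at creation time.  */
static inline void
toy_init_reg (struct toy_rtx *x, uint8_t mode, uint32_t regno)
{
  x->code = 0;
  x->mode = mode;
  x->flags = 0;
  x->original_regno = regno;
  x->u.reg.regno = regno;
  x->u.reg.nregs = toy_nregs_table[regno][mode];
}

/* Old style: index a two-dimensional table on every query.  */
static inline uint32_t
toy_end_regno_lookup (const struct toy_rtx *x)
{
  return x->u.reg.regno + toy_nregs_table[x->u.reg.regno][x->mode];
}

/* New style: the count sits next to the register number.  */
static inline uint32_t
toy_end_regno_cached (const struct toy_rtx *x)
{
  return x->u.reg.regno + x->u.reg.nregs;
}
```

In this sketch the cached count adds nothing to the size of the payload, since toy_reg_info still fits in the 64-bit union slot.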

This still leaves 24 redundant bits that could be used for other things
in future.  That's actually enough to store a SUBREG of a REG (8 bits
for the inner mode, 16 for the offset), but having a single rtx for that
would probably cause too many problems.
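The bit arithmetic in that aside can be checked with a short sketch. The helper names below are hypothetical and nothing like this is part of the series; it only illustrates that an 8-bit inner mode and a 16-bit offset fit exactly in 24 bits.

```c
#include <stdint.h>

/* Pack an 8-bit inner mode and a 16-bit offset into the low 24 bits
   of a 32-bit word.  Purely illustrative.  */
static inline uint32_t
toy_pack_subreg (uint8_t inner_mode, uint16_t offset)
{
  return ((uint32_t) inner_mode << 16) | offset;
}

/* Recover the inner mode from bits 16..23.  */
static inline uint8_t
toy_unpack_inner_mode (uint32_t bits)
{
  return (uint8_t) ((bits >> 16) & 0xff);
}

/* Recover the offset from bits 0..15.  */
static inline uint16_t
toy_unpack_offset (uint32_t bits)
{
  return (uint16_t) (bits & 0xffff);
}
```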

The series sped up an --enable-checking=release gcc by just over 0.5%
for various tests on my box.  Not a big saving, but hopefully the
patches also count as a clean-up.

As a follow-on, I'd like to add a FOR_EACH_* macro that iterates over
all the registers in a REG.  These loops always execute at least once,
and rarely more than once, and it would be good to model that in the
iterator so that all use sites benefit.
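One possible shape for such an iterator is sketched below. This is only a guess at the idea, not the macro I intend to submit: TOY_FOR_EACH_HARD_REGNO and toy_regno_mask are hypothetical names, and the sketch assumes C99 so that the loop variables can be declared in the for statement. The first iteration is unconditional, so the compiler sees that the body runs at least once. A side effect of that choice is that a count of zero would still run the body once.

```c
/* Iterate REGNO over [FIRST, FIRST + NREGS).  The REGNO##_first flag
   makes the first trip through the body unconditional; the bounds
   check only applies to later iterations.  */
#define TOY_FOR_EACH_HARD_REGNO(REGNO, FIRST, NREGS)                 \
  for (unsigned int REGNO = (FIRST),                                 \
                    REGNO##_end = REGNO + (NREGS),                   \
                    REGNO##_first = 1;                               \
       REGNO##_first || REGNO < REGNO##_end;                         \
       REGNO##_first = 0, ++REGNO)

/* Example use: build a bitmask of the registers that were visited.  */
static inline unsigned int
toy_regno_mask (unsigned int first, unsigned int nregs)
{
  unsigned int mask = 0;
  TOY_FOR_EACH_HARD_REGNO (regno, first, nregs)
    mask |= 1u << regno;
  return mask;
}
```

With this shape, toy_regno_mask (3, 1) visits only register 3 and toy_regno_mask (2, 2) visits registers 2 and 3.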

Each patch in the series was individually bootstrapped & regression-tested
on x86_64-linux-gnu.

