This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Index Nav: | [Date Index] [Subject Index] [Author Index] [Thread Index] | |
---|---|---|
Message Nav: | [Date Prev] [Date Next] | [Thread Prev] [Thread Next] |
Other format: | [Raw text] |
On Tue, 7 Dec 2010, Vladimir Makarov wrote:Nice.
On 12/07/2010 10:24 AM, Richard Guenther wrote:When looking at PR18687 (again) I noticed that invalid_mode_change_p is pretty high in the profiles, so this patch tries to rework what it does based on the few existing callers.
This patch does two things, first, it changes the interface of invalid_mode_change_p (only used by ira-cost.c) to not pass in a source mode but assume the mode of the pseudo. Second, it caches and computes the outcome of invalid_mode_change_p at record_subregs_of_mode time, in an array indexed by register class.
As there are usually less register classes than machine modes this should save memory for the flags. It also avoids the loop over machine modes in invalid_mode_change_p, making the predicate cheaper in favor of doing the equivalent of N_REG_CLASSES checks for each subreg we encounter (we can optimize this with some extra memory cost at record_subregs_of_mode time to avoid the loop for mode pairs we've already seen).
The patch also caches the hashtable lookup (something which can also be done independently).
A even more reworked interface would for example have a sbitmap invalid_mode_change_classes (regno) interface returning a bitmap indexed by reg_class, so the users can use TEST_BIT on it directly.
The patch is a hack, just to make the following measurements possible, I will rework it to be nice but expect some direction from your IRA maintainer folks (maybe you've already done a similar thing).
I like the idea, Richard. So if you want to do it and have time, I really appreciate this.
ira-improv branch has a code which decreases # of calls of invalid_mode_change. So the effect of reworking of the function will be smaller when/if the branch is merged into the trunk. Still implementation of your idea will give considerable compilation time improvement (at least for this test case) and is really worth to do.Ok. I have split the patch into two pieces, one obvious changing the interface of invalid_mode_change_p (with no performance impact) and one with reginfo.c local changes only.
Micha suggested to use a single bitmap for all information, this exploits the sparseness of the information and with the current queries which are iterate over all regnos should have O(1) bitmap operations.
I use a temporary bitmap to cache whether we have seen a mode change of the same kind already to avoid the compute loop. Even though CANNOT_CHANGE_MODE_CLASS itself should be cheap, caching the case of a positive outcome with another bitmap test seems to be worthwhile for the testcase.
Re-benchmarking the patch on the testcase shows a 2% improvement and we also see slightly less number of minor pagefaults.
I'll bootstrap and regtest this now.Not really. I never investigated the problem. So I think you know much better.
Do you have any suggestions for further performance testing?
I don't know such public targets.Are there any targets where we'd have a larger number of register classes than modes?
Ok for me, Richard. Thanks for finding and working on the problem.For optimized compiles I can't measure an off-noise effect of the patches.
Ok if the bootstrap + regtest passes?
Index Nav: | [Date Index] [Subject Index] [Author Index] [Thread Index] | |
---|---|---|
Message Nav: | [Date Prev] [Date Next] | [Thread Prev] [Thread Next] |