This is the mail archive of the mailing list for the GCC project.

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [Bug tree-optimization/15524] [4.0 Regression] jump threadingon trees is slow with switch statements with large # of cases

We've known for a little while that the prediction code compile time
behavior for code like pr15524 is, err, poor at best.

The first problem is the prediction code, much like the code
to record loop bounds, was walking all the blocks in the function and
performing a member test on them.

To give you an idea of how expensive this style of coding is, here's
the timevar info for pr15524 before my patch:

branch prediction     :  44.17 (35%) usr   0.01 ( 1%) sys  44.28 (32%)

OUCH.  Over 1/3 of our total compilation time is spent in the branch
prediction code.

I changed the code to record the blocks to visit in a bitmap and
we use EXECUTE_IF_SET_IN_BITMAP.  After that change we get:

branch prediction     :  11.23 (12%) usr   0.00 ( 0%) sys  11.25 (12%)

Which is a nice improvement.  However, there's clearly an algorithmic
problem in this code.  If I'm reading the code correctly it has a
fundamental problem in that the nodes in the inner loops are processed
multiple times (once per loop nest).  It would seem to me we could
get a nice speedup by aggregating and recording the information from
inner loops, then not walking them again and instead using the
cached information.

Anyway, it's a pretty straightforward change and gives us a nice
speedup on the 15524 code.

Bootstrapped and regression tested on i686-pc-linux-gnu.

Attachment: ZZZ
Description: Text document

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]