This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: postreload-gcse.c: Obvious fix applied: don't emit jumps when we can't


Mostafa Hagog wrote:



A more appropriate fix is to disable the gcse only for those cases where you
need an edge split; add the targetm.cannot_modify_jumps_p () condition to
eliminate_partially_redundant_load.  I had a patch to do a similar thing
when we don't have profiling information; in such a case we cannot be sure
that the split is worth it (even when we remove redundant loads).  I have
updated the patch to also catch cases where the machine doesn't support
adding jumps after reload.  Here is the modified patch; I haven't tested it yet.
Let me know what you think.

If this indeed covers all the cases where edge splits might be caused, then this is a preferable
solution to the one I put in earlier. A test build of sh64-elf succeeded with your patch, but
it will be maybe another hour until the regression check finishes. Strangely enough, one of the
execution failures seems to have been 'cured'.


For the profiling information problem, it certainly would be better if we could generate some
edge counts that are consistent with the assumed branch probabilities. For a single function,
you have basically a system of linear equations, and you can assign an arbitrarily large number
for the entry edge to get a set of values. This is algorithmically simple, although it can become
computationally quite expensive for complex flowgraphs. If you consider multiple functions,
however, it gets worse, because the edge counts should be consistent across functions so that
inlining decisions make sense. To address function call/return and mutual recursion of direct
calls, you could mash the entire translation unit into a big linear equation system, but this
looks like it's pretty much guaranteed to be too expensive to solve for most translation units.
And it doesn't even begin to address vtables or indirect function calls in general.
So, ironically, here we have a situation where global functions can be optimized more efficiently
than static (i.e. ! TREE_PUBLIC) ones. For a global function, if we don't have profiling
information, we may assume that most of the calls come from outside of the module, and
assign an arbitrarily large execution count to the entry edge. For most real-world flowgraphs,
the linear equation system can probably be solved in reasonable time. We should probably
skip the cases where a cheap solution cannot be found with whatever algorithms we build
into the compiler.
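
To make the single-function case concrete, here is a minimal sketch in Python (not GCC code) of deriving edge counts from assumed branch probabilities. It uses simple fixed-point iteration instead of a direct linear solve, and it gives up when no cheap solution is found, as suggested above. The function name, the dictionary-based flowgraph representation, and the tolerance and iteration limits are all assumptions made for illustration.

```python
def estimate_edge_counts(edges, entry, entry_count=10000.0,
                         tol=1e-9, max_iter=10000):
    """Estimate edge execution counts from branch probabilities.

    edges maps (src, dst) -> assumed probability of taking that edge
    when leaving src.  Returns a dict (src, dst) -> estimated count,
    or None if the iteration does not converge within max_iter steps.
    """
    blocks = {entry}
    for src, dst in edges:
        blocks.add(src)
        blocks.add(dst)

    # Start with the arbitrary entry count and nothing anywhere else.
    count = {b: 0.0 for b in blocks}
    count[entry] = entry_count

    for _ in range(max_iter):
        # Block count = external entries + flow arriving on incoming edges.
        new = {b: (entry_count if b == entry else 0.0) for b in blocks}
        for (src, dst), prob in edges.items():
            new[dst] += count[src] * prob

        change = max(abs(new[b] - count[b]) for b in blocks)
        count = new
        if change <= tol * entry_count:
            return {(s, d): count[s] * p for (s, d), p in edges.items()}

    # No cheap solution found (e.g. a loop that never exits): skip it.
    return None


if __name__ == "__main__":
    # A loop whose back edge is assumed to be taken 90% of the time.
    cfg = {
        ("entry", "loop"): 1.0,
        ("loop", "loop"): 0.9,
        ("loop", "exit"): 0.1,
    }
    print(estimate_edge_counts(cfg, "entry", entry_count=1000.0))
```

Returning None for the non-converging case corresponds to falling back to the "no execution count information" path, which is where a test like ! profile_info would still be needed.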


Thus, while it appears desirable and feasible to have estimated edge execution count information
for some functions, it doesn't appear feasible to have it for all functions. Hence, we need a
mechanism to deal with the case of not having execution count information.
Therefore, I think your patch makes sense. If/when we get estimated edge execution count
information for some functions, we can refine the ! profile_info test to take this into account.



