This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: Shrink struct basic_block, get rid of NOTE_INSN_DISABLE_SCHED_OF_BLOCK



On Aug 24, 2004, at 3:56 PM, Zack Weinberg wrote:


Caroline Tice <ctice@apple.com> writes:

+ ??? If two basic blocks could otherwise be merged (which implies
+ that the jump between the two is unconditional), and one is in a
+ hot section and the other is in a cold section, surely that means
+ that one of the section choices is wrong. */

Why would that mean one is in the wrong section? For those architectures for which unconditional branches can reach all of memory this is exactly the situation that I would expect: jumps between sections (in that case) would use unconditional branches.

It isn't that the jump is unconditional, but that the jump is the only edge leaving A and the only edge entering B. (Being unconditional is a consequence of that.) If basic blocks A and B are candidates for merging, they will be executed exactly the same number of times in all possible runs of the program, and therefore it is nonsensical for them to be in different partitions.



I think perhaps you don't understand how this situation arises with the hot/cold partitioning optimization.

To deal with architectures that have short conditional branches
(which cannot span all of memory), I take any conditional jump that
crosses a section boundary and add a level of indirection: it becomes
a conditional jump to a new basic block in the same section, and that new
basic block contains an unconditional jump to the original target in the
other section. That is how these pairs of blocks arise. They should not be
merged, because the source and target blocks really do belong in distinct
sections. I think that answers the first part of your question, unless I
misunderstand what you are saying.
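
A minimal sketch of the fix-up described above, using a toy CFG model rather
than GCC's actual data structures. Every name here (struct block,
fix_crossing_cond_jump, and so on) is hypothetical and chosen only for
illustration; the real pass works on RTL and is considerably more involved.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy model only; these are NOT GCC's real types. */
enum section { SEC_HOT, SEC_COLD };

struct block {
  int id;
  enum section sec;
  struct block *cond_target;  /* taken target of a conditional jump, or NULL */
  struct block *jump_target;  /* target of an unconditional jump, or NULL */
};

static int next_id = 100;

/* Allocate an empty block in section SEC. */
static struct block *new_block(enum section sec) {
  struct block *b = calloc(1, sizeof *b);
  if (b == NULL)
    abort();
  b->id = next_id++;
  b->sec = sec;
  return b;
}

/* If SRC ends in a conditional jump whose target lies in another section,
   redirect that jump to a new block placed in SRC's own section. The new
   block holds only an unconditional jump to the original target.
   Returns the new block, or NULL if no fix-up was needed. */
static struct block *fix_crossing_cond_jump(struct block *src) {
  assert(src != NULL);
  struct block *dest = src->cond_target;
  if (dest == NULL || dest->sec == src->sec)
    return NULL;                 /* no conditional jump, or it stays local */

  struct block *tramp = new_block(src->sec);
  tramp->jump_target = dest;     /* the unconditional jump crosses sections */
  src->cond_target = tramp;      /* the conditional jump now stays local */
  return tramp;
}
```

After the fix-up, the new block has a single successor (the original target)
and, in this toy model, is the only block that jumps to it from that path,
which is exactly the shape that a block-merging cleanup would look for.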


Now, for those architectures whose unconditional branches also cannot
reach all of memory, I then convert those unconditional jumps into indirect
jumps through a register. This is what causes a lot of the mess you were
complaining about in your other email. Originally I had set up the hot/cold
partitioning to happen as part of the basic-block reordering phase, when the
appropriate flag was set. Unfortunately (as you also noted) that phase runs quite
late in the compilation; too late for me to grab the registers I need to create the
indirect jumps. The only solution I could come up with was therefore to move
the phase that decides on the partitioning and fixes up the branches to an
earlier part of the compiler, where I could still grab the necessary registers. But
then, as you noted, I have to prevent many cfg cleanup optimizations from
undoing my fix-ups for branches that cross section boundaries. I hope this clarifies
things for you a little.
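
To illustrate the kind of guard a cleanup pass needs, here is a small sketch
in the same toy style. It is not the code in my patch; struct bb, the
partition field, and both function names are assumptions made up for this
example. It separates the structural merge condition (A's only outgoing edge
is B's only incoming edge) from the extra partition check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only; these are NOT GCC's real types. */
enum partition { PART_HOT, PART_COLD };

struct bb {
  enum partition part;
  int n_succs;            /* number of outgoing edges */
  int n_preds;            /* number of incoming edges */
  struct bb *only_succ;   /* meaningful only when n_succs == 1 */
};

/* Structural condition: the edge A->B is the only edge leaving A and
   the only edge entering B. */
static bool structurally_mergeable(const struct bb *a, const struct bb *b) {
  assert(a != NULL && b != NULL);
  return a != b
         && a->n_succs == 1 && a->only_succ == b
         && b->n_preds == 1;
}

/* Merge guard: refuse to merge across a partition boundary, so that a
   block created to carry a section-crossing jump is left alone. */
static bool can_merge(const struct bb *a, const struct bb *b) {
  assert(a != NULL && b != NULL);
  if (a->part != b->part)
    return false;
  return structurally_mergeable(a, b);
}
```

With a hot block whose single successor is a cold block with a single
predecessor, structurally_mergeable returns true while can_merge returns
false; that difference is the whole point of the guard.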


If you have a suggestion for a better/cleaner way of handling this I would be happy
to hear it.


In the meantime, I am working on a patch to address all the remaining concerns that people
have raised with me regarding the hot/cold partitioning optimization. If other people want to
add to the list of things for me to address (with regard to this optimization), now is the time to
mention them.


-- Caroline Tice
ctice@apple.com

