This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



Re: VLIW scheduling and delayed branch


Hi Thomas,
Thanks for your reply. A couple of questions below.

Thomas Sailer wrote:
Has anyone faced a similar problem before? Are there targets for which both VLIW and DBR are enabled? Perhaps ia64?

I did something similar a few months ago.

What was your target? Is the target code available in GCC mainline? If not, could you pass your code to me?



The problem is that the Haifa scheduler and the delayed branch scheduling pass don't really fit together. Delayed branch scheduling happily undoes all the Haifa decisions.

The question is how much you gain by delayed branch scheduling. I don't
have numbers, but it wasn't much in my case. And since your company name
is picochip, you certainly value size more than speed?!

Yeah. We do. But, in our architecture, a branch has to have a delay slot instruction anyway. In the absence of one, we put a "nop" in there. If GCC manages to move a single-instruction VLIW bundle into the delay slot, we would benefit in both size and speed; otherwise, there is simply no impact on either.



I pursued two approaches. The first one was to insert "stop bit" pseudo insns into the RTL stream in machdep reorg, so I didn't have to rely on TImode insn flags during output. But then delayed branch scheduling just took one insn out of an insn group and put it into the delay slot, meaning there was usually no cycle gain at all, just larger code size (due to insn duplication).

This seems fairly straightforward to implement.
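
To make the first approach concrete, here is a minimal sketch in plain C. It is a toy model and not GCC's RTL or the actual machdep reorg code: the struct, the function name insert_stop_bits, and the ";;" marker text are all made up for illustration. It assumes only that the scheduler leaves a per-insn "starts a new cycle" mark, which is the role the TImode flag plays.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for a scheduled insn (hypothetical, not GCC's rtx).
   starts_cycle plays the role of the scheduler's TImode mark. */
struct toy_insn {
    const char *text;
    int starts_cycle;
};

#define STOP_BIT ";;"   /* hypothetical stop-bit marker text */

/* Copy IN to OUT, emitting an explicit stop marker before every insn
   that starts a new cycle (except the very first insn).  After this,
   output no longer needs the per-insn flag.  Returns the number of
   entries written, or 0 if OUT is too small. */
size_t insert_stop_bits(const struct toy_insn *in, size_t n,
                        const char **out, size_t cap)
{
    size_t k = 0;

    assert(in != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (i > 0 && in[i].starts_cycle) {
            if (k == cap)
                return 0;
            out[k++] = STOP_BIT;
        }
        if (k == cap)
            return 0;
        out[k++] = in[i].text;
    }
    return k;
}
```

Note that the group boundaries here are just markers in a flat stream. A later pass that moves individual insns, as delayed branch scheduling does, can still pull one insn out of the middle of a group, which is the problem described above.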



The second approach was having lots of parallel insns (using match_parallel and a custom predicate). machdep reorg then converts insn bundles into a single parallel insn. Delayed branch scheduling then does the right thing. This approach works fairly well for me, but there are a few complications. My output code is pretty hackish, as I didn't want to duplicate the code for outputting a single insn and for outputting the same insn as a component of a parallel insn group.

When do you un-parallel those instructions? And, how?
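
The idea behind the second approach can be sketched the same way. Again this is a toy model in plain C, not GCC code: the types, MAX_BUNDLE, and both function names are hypothetical, and data dependencies are ignored entirely. It only shows why treating a bundle as one indivisible unit changes what a delay-slot filler can do: a multi-insn bundle can no longer be split, and only a single-insn bundle is a candidate for the slot.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_BUNDLE 4   /* hypothetical issue width */

/* Toy stand-ins (hypothetical, not GCC's rtx). */
struct sched_insn {
    const char *text;
    int starts_cycle;          /* role of the scheduler's TImode mark */
};

struct bundle {
    const char *insns[MAX_BUNDLE];
    size_t count;
};

/* Group flagged insns into bundles, one bundle per cycle.
   Returns the number of bundles, or 0 if OUT is too small. */
size_t group_bundles(const struct sched_insn *in, size_t n,
                     struct bundle *out, size_t cap)
{
    size_t nb = 0;

    for (size_t i = 0; i < n; i++) {
        if (i == 0 || in[i].starts_cycle) {
            if (nb == cap)
                return 0;
            out[nb].count = 0;
            nb++;
        }
        struct bundle *b = &out[nb - 1];
        assert(b->count < MAX_BUNDLE);
        b->insns[b->count++] = in[i].text;
    }
    return nb;
}

/* Search backwards from the branch for the nearest bundle holding
   exactly one insn.  Returns its index, or -1 if there is none (the
   slot would then get a nop).  Dependency checks are omitted. */
int find_delay_slot_candidate(const struct bundle *b, size_t branch_idx)
{
    for (size_t i = branch_idx; i > 0; i--)
        if (b[i - 1].count == 1)
            return (int)(i - 1);
    return -1;
}
```

For the stream add | mul, ld, br (with add and mul issued together), this yields three bundles, and the only candidate for the branch's delay slot is the bundle containing ld alone.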


Regards
Hari


Tom



