This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



Re: [RL78] Questions about code-generation


On 10/03/14 22:37, DJ Delorie wrote:
I've managed to build GCC myself so that I could experiment a bit
but as this is my first foray into compiler internals, I'm
struggling to work out how things fit together and what affects
what.

The key thing to know about the RL78 backend is that it has two
"targets" it uses.  For the first part of the compilation, up until
after reload, the model uses 16 virtual registers (R8 through R23) and
a virtual machine to give gcc an orthogonal model that it can generate
code for.  After reload, there's a "devirtualization" pass in the RL78
backend that maps the virtual model to the real model (R0 through R7),
which means copying values in and out of the real registers according
to which addressing modes are needed.  Then GCC continues optimizing,
which gets rid of most of the unneeded instructions.
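
To make the two-model idea concrete, here is a toy sketch in C. It is not
the backend's code: the array, the single accumulator, and the function
names are all hypothetical, chosen only to show how one orthogonal
"virtual" operation might turn into copies through a real register.

```c
#include <stdint.h>

/* Toy model only; every name here is hypothetical. */
static uint8_t vreg[16];   /* stands in for the 16 virtual registers */
static uint8_t real_a;     /* stands in for one real accumulator */
static unsigned copies;    /* counts copies in and out of real_a */

/* Virtual-model view: an orthogonal three-operand add where any
   register can be a source or the destination. */
static void virtual_add(int dst, int src1, int src2)
{
    vreg[dst] = (uint8_t)(vreg[src1] + vreg[src2]);
}

/* Devirtualized view: the same add, but the arithmetic may only
   happen in the real accumulator, so values are copied in and out. */
static void devirtualized_add(int dst, int src1, int src2)
{
    real_a = vreg[src1];                      /* copy in */
    copies++;
    real_a = (uint8_t)(real_a + vreg[src2]);  /* add in the real register */
    vreg[dst] = real_a;                       /* copy out */
    copies++;
}
```

Both functions compute the same result; the second just makes the extra
copies visible. Copies like these are what the later optimization passes
are expected to clean up where they turn out to be redundant.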

The problem you're probably running into is that deciding which real
registers to use for each virtual one is a very tricky task, and the
post-reload optimizers aren't expecting the code to look the way it
does.

What causes that code to be generated when using a variable instead
of a fixed memory address?

The use of "volatile" disables many of GCC's optimizations.  I
consider this a bug in GCC, but at the moment it needs to be "fixed"
in the backends on a case-by-case basis.

Ah, that certainly explains a lot. How exactly would the fixing be done? Is there an example I could look at for one of the other processors?

It's certainly unfortunate, since an awful lot of bit-twiddling goes on with the memory-mapped hardware registers (which obviously generally need to be declared volatile).
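
For reference, the kind of bit-twiddling I mean looks roughly like the sketch below. The names are made up for illustration, and fake_port is an ordinary variable standing in for a hardware register that would really live at a fixed address. Because the object is volatile, each access has to be performed as written, and as discussed above the read-modify-write may not be combined into a single bit-manipulation instruction.

```c
#include <stdint.h>

/* Hypothetical stand-in for a memory-mapped port register. */
static volatile uint8_t fake_port;

/* Set one bit: a volatile read, an OR, and a volatile write. */
static void set_bit(volatile uint8_t *reg, unsigned bit)
{
    *reg |= (uint8_t)(1u << bit);
}

/* Clear one bit: a volatile read, an AND, and a volatile write. */
static void clear_bit(volatile uint8_t *reg, unsigned bit)
{
    *reg &= (uint8_t)~(1u << bit);
}

/* Test one bit: a single volatile read. */
static int test_bit(const volatile uint8_t *reg, unsigned bit)
{
    return (*reg >> bit) & 1u;
}
```

Comparing the generated assembly for these helpers with and without the volatile qualifier should show the kind of difference I'm describing.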

Just to get a feel for the potential gains, I've removed the volatile keyword from all the declarations and rebuilt the project. That change alone reduces the code size by 3.7%. I wouldn't want to risk running that code but the gain is certainly significant.

I calculated a week or two ago that we could make a code-saving of around 8% by using near or relative branches and near calls instead of always generating far calls. I changed rl78-real.md to use near addressing and got about 5%. That's probably about right. I tried to generate relative branches too but I'm guessing that the 'length' attribute needs to be set for all instructions to get that working properly.

Obviously near/far addressing would need to be controlled by an external switch to allow for processors with more than 64KB code-flash.

A few small gains can be had elsewhere (using 'clrb a' in zero_extendqihi2_real, possibly optimizing addsi3_internal_real to avoid addw ax,#0 etc.). These don't save much space in our project (about 30-40 bytes perhaps) but it'll obviously vary from project to project.

Regards,

Richard

