[PATCH v4] SH FDPIC backend support

Rich Felker dalias@libc.org
Wed Nov 11 14:56:00 GMT 2015


On Wed, Nov 11, 2015 at 11:36:26PM +0900, Oleg Endo wrote:
> On Tue, 2015-11-10 at 15:07 -0500, Rich Felker wrote:
> 
> > > The way libcalls are now emitted is a bit unhandy.  If more special-ABI
> > > libcalls are to be added in the future, they all have to do the jsr vs.
> > > bsrf handling (some potential candidates for new libcalls are optimized
> > > soft FP routines).  Then we still have PR 65374 and PR 54019.  In the
> > > future maybe we should come up with something that allows emitting
> > > libcalls in a more transparent way...
> > 
> > I'd like to look into improving this at some point in the near future.
> > On further reading of the changes made, I think there's a lot of code
> > we could reduce or simplify.
> > 
> > In all the places where new RTL patterns were added for *call*_fdpic,
> > the main constraint change vs the non-fdpic version is using REG_PIC.
> > Is it possible to make a REG_GOT_ARG macro or similar that's defined
> > as something like TARGET_FDPIC ? REG_PIC : nonexistent_or_dummy?
> 
> I'm not sure I understand what you mean by that.  Do you have a small
> code snippet example?

Sorry, I don't really understand RTL well enough to make a code
snippet. What I want to express is that an insn "uses" (in the (use
...) sense) a register (r12) conditionally depending on a runtime
option (TARGET_FDPIC).
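
The nearest I can offer is a plain C model of the idea, which is not RTL
and not actual gcc code. REG_NONE, target_fdpic and call_uses are made-up
names for illustration only; the point is just that one shared pattern
would list r12 as used only when FDPIC is enabled:

```c
#include <assert.h>
#include <stdbool.h>

/* REG_PIC is r12 as discussed; REG_NONE is a hypothetical "no register". */
enum { REG_PIC = 12, REG_NONE = -1 };

/* Stand-in for the TARGET_FDPIC runtime option (assumption). */
static bool target_fdpic = false;

/* The proposed macro: r12 under FDPIC, otherwise nothing. */
#define REG_GOT_ARG (target_fdpic ? REG_PIC : REG_NONE)

/* Fill `uses` with the registers a single shared call pattern would
   mark as used, and return how many were written.  `fn_reg` is the
   register holding the callee address. */
static int call_uses(int fn_reg, int uses[2])
{
    int n = 0;
    assert(fn_reg >= 0 && fn_reg < 16);
    uses[n++] = fn_reg;
    if (REG_GOT_ARG != REG_NONE)
        uses[n++] = REG_GOT_ARG;   /* only present when FDPIC is on */
    return n;
}
```

With target_fdpic false this reports one used register; with it true,
r12 is added as a second one, without needing a separate *_fdpic variant.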

> > As for the call site stuff, I wonder why the existing call site stuff
> > used by "call_pcrel" can't be used for SFUNC_STATIC. 
> 
> "call_pcrel" is a real call insn.  The libcalls are not expanded as
> real call insns to avoid the regular register saves/restores etc. that
> are needed to do a normal function call.

Yes, I see that. What I was really wondering though is why new call
site generation code and a new constraint were added when the
call_pcrel code already has mechanisms for this, rather than just
reusing the internals that call_pcrel uses. It seems like we're doing
things in a gratuitously different way here.

> I guess the generic fix for this issue would be some mechanism to
> specify which regs are clobbered/preserved and then provide the right
> settings for the libcall functions.

Is this possible in the sh backend or does it need changes to
higher-level gcc code? (i.e. is it presently possible to make an insn
that conditionally clobbers different things rather than having to
make tons of different insns for each possible set of clobbers?)

> > I'm actually
> > trying to prepare a simpler FDPIC patch for other gcc versions we're
> > interested in that's not so invasive, and for now I'm just having
> > function_symbol replace SFUNC_STATIC with SFUNC_GOT on TARGET_FDPIC to
> > avoid needing all the label stuff, but it would be nice to find a way
> > to reuse the existing framework.
> 
> Do you know how this affects code size (and inherently performance)?

I suspect it makes very little difference, but to compare I'd need to
do the same hack on 5.2.0 or trunk. The only difference should be one
additional load per call, and one additional GOT slot per function
called this way (but just once per executable/library).
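
As a back-of-the-envelope sketch of that estimate (not a measurement):
the sizes below are assumptions, namely a 2-byte load instruction and a
4-byte GOT slot, and sfunc_got_overhead is a made-up helper name.

```c
#include <stddef.h>

/* Assumed sizes, for illustration only. */
enum { LOAD_INSN_BYTES = 2, GOT_SLOT_BYTES = 4 };

/* Extra bytes from using SFUNC_GOT instead of SFUNC_STATIC:
   one more load at every call site, plus one GOT slot per distinct
   function, counted once per executable/library. */
static size_t sfunc_got_overhead(size_t call_sites, size_t distinct_funcs)
{
    return call_sites * LOAD_INSN_BYTES + distinct_funcs * GOT_SLOT_BYTES;
}
```

Under those assumptions, 10 call sites to 3 distinct libcalls would cost
about 32 extra bytes.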

Another issue I've started looking at is how r12 is put in fixed_regs,
which is conceptually wrong. Preliminary tests show that removing it
from fixed_regs doesn't break anything and produces much better code -- r12
gets used as a temp register in functions that don't need it, and in
one function that made multiple calls, the saving of initial r12 to a
call-saved register even happened in the delay slot of the call. I've
been discussing it with Alexander Monakov on IRC (#musl) and based on
my understanding so far of how gcc works (which admittedly may be
wrong) the current FDPIC code looks like it's written not to depend on
r12 being 'fixed'. Also I think I'm pretty close to understanding how
we could make the same improvements for non-FDPIC PIC codegen: instead
of loading r12 in the prologue, load a pseudo, then use that pseudo
for GOT access and force it into r12 the same way FDPIC call code does
for PLT calls. Does this sound correct?

Rich
