[PATCH] SH FDPIC backend support

Rich Felker dalias@libc.org
Mon Oct 5 02:16:00 GMT 2015


On Sun, Oct 04, 2015 at 02:10:42PM +0900, Oleg Endo wrote:
> On Sat, 2015-10-03 at 18:34 -0400, Rich Felker wrote:
> > > 
> > > I found and fixed the problem, but I have a new concern: calls to the
> > > new shift instructions are using the following address forms:
> > > 
> > > -mno-fdpic -fPIC:
> > > 	.long   __ashlsi3_r0@GOTOFF
> > > 
> > > -mfdpic:
> > > 	.long   __ashlsi3_r0-(.LPCS1+2)
> > > 
> > > Neither of these seems valid. Both assume __ashlsi3_r0 will be defined
> > > in the same DSO, which is not true in general; shared libgcc_s.so
> > > might be in use. In this case the call would need to go through the
> > > PLT, which (for PIC or FDPIC) requires r12 to be loaded with the GOT
> > > address. In the non-FDPIC case, r12 _happens_ to contain the GOT
> > > address just because it was used as an addend to get the function
> > > address from the @GOTOFF address, but this does not seem
> > > safe/reliable. In the FDPIC case there's nothing to cause r12 to
> > > contain the GOT address, and in fact if the function has already made
> > > another function call (which uses and clobbers r12), no code is
> > > generated to save and restore r12 for the libgcc call.
> 
> I might be missing something, but usually R12 is preserved across
> function calls.

This is FDPIC-specific. Because there is fundamentally no way for a
function to find its own GOT (it has one GOT for each process using
the code containing the function), its GOT address has to be a
(hidden) argument to the function which arrives in r12.

For calls via the PLT, r12 contains the PLT entry's (i.e. the calling
module's) GOT pointer at the time of the call, and the PLT thunk
replaces it with the callee's GOT pointer (loaded from the function
descriptor) before jumping to the callee code. There is fundamentally
nowhere the PLT thunk could store the old value of r12 and arrange for
it to be restored at return time, so using a PLT forces r12 to be
call-clobbered.

(Note that in the special case where the PLT is bypassed because the
callee is defined in the same module and bound at link-time, the GOT
value loaded by the caller is the right GOT value for the callee
automatically.)
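
To make the r12 handoff concrete, here is a rough toy model in C, not real SH code. A global variable stands in for r12, and every name in it (got_t, funcdesc_t, call_via_plt and so on) is hypothetical and chosen only for illustration. It assumes a function descriptor is just a pair of code address and GOT pointer, as described above.

```c
#include <assert.h>

/* Toy model only: a global stands in for the r12 register, and all
   names here are hypothetical, not taken from GCC or any ABI header. */
typedef struct { int id; } got_t;

static const got_t *r12;              /* stand-in for the GOT register */

typedef struct {
    int (*entry)(int);                /* code address of the callee */
    const got_t *got;                 /* GOT of the callee's module */
} funcdesc_t;

static const got_t caller_got = { 1 };
static const got_t callee_got = { 2 };

/* The callee can only find its GOT through the hidden r12 argument. */
static int callee(int x)
{
    return x + r12->id;
}

/* Models the PLT thunk: load the callee's GOT from the descriptor into
   r12 and jump.  The thunk has nowhere to keep the old value, so the
   caller sees r12 clobbered on return. */
static int call_via_plt(const funcdesc_t *fd, int x)
{
    r12 = fd->got;
    return fd->entry(x);
}

/* The caller has to spill its own GOT pointer and reload it afterwards. */
static int caller(int x)
{
    static const funcdesc_t fd = { callee, &callee_got };
    const got_t *saved = r12;         /* spill before the call */
    int result = call_via_plt(&fd, x);
    int clobbered = (r12 != saved);   /* 1 if the call changed r12 */

    r12 = saved;                      /* reload after the call */
    assert(r12 == saved);
    return result * 10 + clobbered;
}
```

If the callee lived in the same module as the caller, the descriptor's GOT would equal the caller's, and the spill and reload would be unnecessary; that is the case the note above describes.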

If we didn't care about being able to do PLT calls, there's no
fundamental reason r12 has to be call-clobbered, but it still makes a
lot more sense. Getting back the value of r12 you passed when making a
function call is rarely useful except in the case where the caller
knows the function is defined in the same module (so it can keep using
r12 as its own GOT pointer after the call).

BTW the reason I'm spending time explaining this now is that it's
something we should optimize after the FDPIC patch goes in: I think
the r12-related spills/reloads could be made a lot more efficient.

> The special functions in libgcc tell the compiler
> exactly which things they clobber and which not.  R12 is not clobbered
> by the shift functions.

For FDPIC, that implies an assumption that the definition is local to
the calling module (i.e. static-linked) but I think that assumption
already existed for non-FDPIC since r12 was not explicitly set for the
call.

> > > Calls to other functions in libgcc (e.g. division) seem to work fine
> > > and either go through the PLT or bypass it and load from the GOT
> > > directly. It's only these new special-calling-convention ones that are
> > > broken, and I can't figure out why...
> 
> Sorry, I wasn't paying attention to dynamic linking or *PIC when
> changing the shift patterns back then, so maybe I've screwed up
> something there.
> To me it looks like they do the same thing as expanders for division or
> the SH1 multiplication ("mulsi3" pattern).  Each of the libgcc support
> functions have a different "ABI", so "__ashlsi3_r0" or "__lshrsi3_r0"
> doesn't introduce a new special ABI, it already is as per definition.
> These function calls are not expanded like regular function calls, via
> e.g. (define_expand "call" ... ).  The function call is hidden from the
> regular function call machinery and everything thinks it's a regular
> instruction that just has some special register constraints and
> clobbers.
> 
> I've just tried compiling the following with -m2 -ml -fPIC
> 
> unsigned int test_2 (unsigned int x, unsigned int y)
> {
>   return x << y;
> }
> 
> unsigned int test_3 (unsigned int x, unsigned int y)
> {
>   return x / y;
> }
> 
> And the compiled code is basically identical for both.  For the labels
> I get:
> 
> .L4:	.long	_GLOBAL_OFFSET_TABLE_
> .L5:	.long	___ashlsi3_r0@GOTOFF
> 
> and
> 
> .L10:	.long	_GLOBAL_OFFSET_TABLE_
> .L11:	.long	___udivsi3@GOTOFF
> 
> So the shifts do not work, but the divisions do work that way?

It's not that one works and the other doesn't. I was just concerned
about the behavior and how it seems to be unsafe for shared libgcc;
it's equally unsafe for either. But as I found later:

> > Hmm, according to sh-protos.h:
> > 
> >   /* A special function that should be linked statically.  These are typically
> >      smaller or not much larger than a PLT entry.
> >      Some also have a non-standard ABI which precludes dynamic linking.  */
> >   SFUNC_STATIC
> > 
> > So apparently the strange behavior I observed is intended. Presumably
> > there is some mechanism to ensure that these functions are always
> > static-linked? But I don't see it. The libgcc spec I see is:
> > 
> > *libgcc:
> > %{static|static-libgcc:-lgcc
> > -lgcc_eh}%{!static:%{!static-libgcc:%{!shared-libgcc:-lgcc --as-needed
> > -lgcc_s --no-as-needed}%{shared-libgcc:-lgcc_s%{!shared: -lgcc}}}}
> > 
> > This explicitly omits -lgcc when -shared-libgcc is used with -shared.
> > Thankfully __ashlsi3_r0 is not exported from libgcc.so.1 (as far as I
> > can tell), so this will just be a link error rather than horribly
> > wrong behavior, but it still seems like there's a bug here unless I'm
> > misunderstanding something. I think the final %{!shared: -lgcc} in the
> > spec is an error and should be replaced by simply -lgcc if there are
> > targets where libgcc.a contains necessary symbols that are not/cannot
> > be defined in libgcc_s.so.1.
> 
> Hm, maybe, but I don't know enough about this, sorry.  Kaz, maybe you
> have a comment on that?

I think this is all intentional; otherwise SFUNC_STATIC should not
even exist. I'm just mildly worried that -shared-libgcc -shared is
broken; I should try to set up a test case for it.
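
A possible starting point for that test case, reusing the functions from the quoted example above. The file name, the output name and the toolchain prefix in the comment are placeholders and assumptions; only the options themselves (-m2, -fPIC, -shared, -shared-libgcc) come from this thread. I have not run this against an SH toolchain.

```c
/* shift.c: candidate input for a -shared -shared-libgcc link test.
 *
 * Hypothetical build line (the toolchain prefix is a placeholder):
 *   <sh-target>-gcc -m2 -fPIC -shared -shared-libgcc shift.c -o libshift.so
 *
 * With -m2, the variable shift below is expected to become a call to the
 * libgcc helper __ashlsi3_r0, per the quoted output earlier in the thread.
 * If that helper is not available from the libraries on the link line,
 * the expected symptom is an unresolved-symbol error at link time.
 */
unsigned int test_2(unsigned int x, unsigned int y)
{
    return x << y;   /* variable shift count: needs a helper on SH2 */
}

unsigned int test_3(unsigned int x, unsigned int y)
{
    return x / y;    /* division: also routed to a libgcc helper */
}
```

Comparing the result with the same link done without -shared-libgcc should show whether the spec is really dropping a needed -lgcc.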

Rich


