This is the mail archive of the
java-patches@gcc.gnu.org
mailing list for the Java project.
[PATCH] Fix powerpc libffi
- From: Jakub Jelinek <jakub at redhat dot com>
- To: Andreas Tobler <toa at pop dot agri dot ch>
- Cc: David Edelsohn <dje at watson dot ibm dot com>, Java Patches <java-patches at gcc dot gnu dot org>, GCC Patches <gcc-patches at gcc dot gnu dot org>
- Date: Tue, 17 Jan 2006 08:38:41 -0500
- Subject: [PATCH] Fix powerpc libffi
- References: <42D8FE32.30403@pop.agri.ch> <200507191846.j6JIkXd30598@makai.watson.ibm.com> <42DD6E52.9010405@pop.agri.ch>
- Reply-to: Jakub Jelinek <jakub at redhat dot com>
On Tue, Jul 19, 2005 at 11:19:14PM +0200, Andreas Tobler wrote:
>
> >Okay with those changes.
> >
> >Thanks, David
>
> I committed the attached.
> Thanks David, for review.
2005-07-19 Andreas Tobler <a.tobler@schweiz.ch>
* Makefile.am (nodist_libffi_la_SOURCES): Add POWERPC_FREEBSD.
* Makefile.in: Regenerate.
* include/Makefile.in: Likewise.
* testsuite/Makefile.in: Likewise.
* configure.ac: Add POWERPC_FREEBSD rules.
* configure: Regenerate.
* src/powerpc/ffitarget.h: Add POWERPC_FREEBSD rules.
(FFI_SYSV_TYPE_SMALL_STRUCT): Define.
* src/powerpc/ffi.c: Add flags to handle small structure returns
in ffi_call_SYSV.
(ffi_prep_cif_machdep): Handle small structures for SYSV 4 ABI.
Aka FFI_SYSV.
(ffi_closure_helper_SYSV): Likewise.
* src/powerpc/ppc_closure.S: Add return types for small structures.
* src/powerpc/sysv.S: Add bits to handle small structures for
final SYSV 4 ABI.
This patch broke libffi on powerpc*-linux, because the
> + lwz %r3,0(%r6)
> + lwz %r4,4(%r6)
> + bl __lshrdi3 # libgcc function to shift r3/r4, shift value in r5.
> + b .Lfinish
and
> + rlwinm %r5,%r31,5+23,32-5,31 /* Extract the value to shift. */
> + bl __ashldi3 /* libgcc function to shift r3/r4,
> + shift value in r5. */
calls aren't PLT calls even in libffi.so and libffi_convenience.a.
This in turn makes libgcj.so.7 a DT_TEXTREL library, which causes two problems:
a) in largish Java programs the relocation can easily fail, because
libgcc_s.so.1 can be mapped too far from libgcj.so.7 for a direct branch
to reach
b) SELinux policy might by default refuse textrel libraries, unless
whitelisted
Simply adding @plt is not an option, because that would still be broken
on -msecure-plt builds.
So, either we'd need to add configury stuff to detect -msecure-plt support
in the assembler and conditionally add the:
bcl 20,31,.LCF0
.LCF0:
stw 30,8(1)
mflr 30
...
addis 30,30,_GLOBAL_OFFSET_TABLE_-.LCF0@ha
addi 30,30,_GLOBAL_OFFSET_TABLE_-.LCF0@l
etc. stuff and make the 2 calls @plt, or we can inline the calls.
The latter is done by the attached patch.
The inlined replacement for __ashldi3 can be simplified, because we know %r5
must be 0, 8, 16 or 24, so it will also be faster than __ashldi3, even when
not counting the overhead of a PLT call.
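As a rough illustration of the inlined left shift, here is a minimal C sketch. It is not part of the patch: the `regpair` type and the `shl_pair` name are made up for this example, with `hi` standing in for %r3 and `lo` for %r4, and the shift count is assumed to be one of 0, 8, 16 or 24.

```c
#include <stdint.h>

/* Hypothetical model of a 64-bit value held in two 32-bit registers:
   hi stands in for %r3, lo for %r4.  */
typedef struct { uint32_t hi, lo; } regpair;

/* Shift the register pair left by n bits, assuming n is 0, 8, 16 or 24.  */
static regpair shl_pair(regpair v, unsigned n)
{
    regpair r;

    /* n == 0 is the 8 byte struct case: nothing to shift.  Handling it
       separately also avoids a C shift by 32, which is undefined.  */
    if (n == 0)
        return v;

    r.hi = (v.hi << n) | (v.lo >> (32 - n)); /* top bits of lo move into hi */
    r.lo = v.lo << n;
    return r;
}
```

For counts in that range the result should match what a plain 64-bit shift of `((uint64_t) hi << 32) | lo` would give.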
For the closure, the small struct register passing patch added, IMHO
completely unnecessarily, 7 instructions to the fast path. Returning small
structs is pretty rare, and those insns would waste time for all return types.
For 1, 2, 3, 4 and 8 byte structs we can easily fit the insn sequences
into the 4 insn slots and don't need to run through lots of nops.
For 5, 6 and 7 byte structs 4 insns is not enough, but in that case
we know the shift count is only 8, 16 or 24, so the __lshrdi3 operation can
be simplified even further and inlined.
I don't have access to powerpc*-*-freebsd*, so I have only briefly tested
the sysv.S change by cross compiling from powerpc64-*-linux to
powerpc*-*-freebsd* and writing my own testcase that called the routines
built by the FreeBSD cross compiler. I haven't really tested the
ppc_closure.S bits (I ran make check on powerpc64-*-linux* -m32, but that
doesn't test those bits).
Can anyone please test this on powerpc*-*-freebsd*?
Ok for trunk/4.1 if that testing succeeds?
2006-01-17 Jakub Jelinek <jakub@redhat.com>
* src/powerpc/sysv.S (smst_two_register): Don't call __ashldi3, instead
do the shifting inline.
* src/powerpc/ppc_closure.S (ffi_closure_SYSV): Don't compute %r5
shift count unconditionally. Simplify load sequences for 1, 2, 3, 4
and 8 byte structs, for the remaining struct sizes don't call __lshrdi3,
instead do the shifting inline.
--- libffi/src/powerpc/sysv.S.jj 2005-11-12 18:08:36.000000000 +0100
+++ libffi/src/powerpc/sysv.S 2006-01-17 13:36:16.000000000 +0100
@@ -140,8 +140,14 @@ L(smst_one_register):
b L(done_return_value)
L(smst_two_register):
rlwinm %r5,%r31,5+23,32-5,31 /* Extract the value to shift. */
- bl __ashldi3 /* libgcc function to shift r3/r4,
- shift value in r5. */
+ cmpwi %r5,0
+ subfic %r9,%r5,32
+ slw %r29,%r3,%r5
+ srw %r9,%r4,%r9
+ beq- L(smst_8byte)
+ or %r3,%r9,%r29
+ slw %r4,%r4,%r5
+L(smst_8byte):
stw %r3,0(%r30)
stw %r4,4(%r30)
b L(done_return_value)
--- libffi/src/powerpc/ppc_closure.S.jj 2005-11-12 18:08:36.000000000 +0100
+++ libffi/src/powerpc/ppc_closure.S 2006-01-17 13:59:59.000000000 +0100
@@ -63,19 +63,6 @@ ENTRY(ffi_closure_SYSV)
# so use it to look up in a table
# so we know how to deal with each type
- # Extract the size of the return type for small structures.
- # Then calculate (4 - size) and multiply the result by 8.
- # This gives the value needed for the shift operation below.
- # This part is only needed for FFI_SYSV and small structures.
- addi %r5,%r3,-(FFI_SYSV_TYPE_SMALL_STRUCT)
- cmpwi cr0,%r5,4
- ble cr0,.Lnext
- addi %r5,%r5,-4
-.Lnext:
- addi %r5,%r5,-4
- neg %r5,%r5
- slwi %r5,%r5,3
-
# look up the proper starting point in table
# by using return type as offset
addi %r6,%r1,112 # get pointer to results area
@@ -207,66 +194,66 @@ ENTRY(ffi_closure_SYSV)
# case FFI_SYSV_TYPE_SMALL_STRUCT + 1. One byte struct.
.Lret_type15:
# fall through.
- nop
- nop
+ lbz %r3,0(%r6)
+ b .Lfinish
nop
nop
# case FFI_SYSV_TYPE_SMALL_STRUCT + 2. Two byte struct.
.Lret_type16:
# fall through.
- nop
- nop
+ lhz %r3,0(%r6)
+ b .Lfinish
nop
nop
# case FFI_SYSV_TYPE_SMALL_STRUCT + 3. Three byte struct.
.Lret_type17:
# fall through.
- nop
- nop
- nop
+ lwz %r3,0(%r6)
+ srwi %r3,%r3,8
+ b .Lfinish
nop
# case FFI_SYSV_TYPE_SMALL_STRUCT + 4. Four byte struct.
.Lret_type18:
# this one handles the structs from above too.
lwz %r3,0(%r6)
- srw %r3,%r3,%r5
b .Lfinish
nop
+ nop
# case FFI_SYSV_TYPE_SMALL_STRUCT + 5. Five byte struct.
.Lret_type19:
# fall through.
- nop
- nop
- nop
- nop
+ lwz %r3,0(%r6)
+ lwz %r4,4(%r6)
+ li %r5,24
+ b .Lstruct567
# case FFI_SYSV_TYPE_SMALL_STRUCT + 6. Six byte struct.
.Lret_type20:
# fall through.
- nop
- nop
- nop
- nop
+ lwz %r3,0(%r6)
+ lwz %r4,4(%r6)
+ li %r5,16
+ b .Lstruct567
# case FFI_SYSV_TYPE_SMALL_STRUCT + 7. Seven byte struct.
.Lret_type21:
# fall through.
- nop
- nop
- nop
- nop
+ lwz %r3,0(%r6)
+ lwz %r4,4(%r6)
+ li %r5,8
+ b .Lstruct567
# case FFI_SYSV_TYPE_SMALL_STRUCT + 8. Eight byte struct.
.Lret_type22:
# this one handles the above unhandled structs.
lwz %r3,0(%r6)
lwz %r4,4(%r6)
- bl __lshrdi3 # libgcc function to shift r3/r4, shift value in r5.
b .Lfinish
+ nop
# case done
.Lfinish:
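@@ -275,6 +262,14 @@ ENTRY(ffi_closure_SYSV)
mtlr %r0
addi %r1,%r1,144
blr
+
+.Lstruct567:
+ subfic %r6,%r5,32 # %r6 = 32 - shift count; results pointer no longer needed
+ srw %r4,%r4,%r5
+ slw %r6,%r3,%r6 # bits of %r3 that move down into %r4
+ srw %r3,%r3,%r5
+ or %r4,%r6,%r4
+ b .Lfinish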
END(ffi_closure_SYSV)
.section ".eh_frame",EH_FRAME_FLAGS,@progbits
Jakub