This is the mail archive of the java-patches@gcc.gnu.org mailing list for the Java project.



[PATCH] Fix powerpc libffi


On Tue, Jul 19, 2005 at 11:19:14PM +0200, Andreas Tobler wrote:
> 
> >Okay with those changes.
> >
> >Thanks, David
> 
> I committed the attached.
> Thanks David, for review.

2005-07-19  Andreas Tobler  <a.tobler@schweiz.ch>

        * Makefile.am (nodist_libffi_la_SOURCES): Add POWERPC_FREEBSD.
        * Makefile.in: Regenerate.
        * include/Makefile.in: Likewise.
        * testsuite/Makefile.in: Likewise.
        * configure.ac: Add POWERPC_FREEBSD rules.
        * configure: Regenerate.
        * src/powerpc/ffitarget.h: Add POWERPC_FREEBSD rules.
        (FFI_SYSV_TYPE_SMALL_STRUCT): Define.
        * src/powerpc/ffi.c: Add flags to handle small structure returns
        in ffi_call_SYSV.
        (ffi_prep_cif_machdep): Handle small structures for SYSV 4 ABI.
        Aka FFI_SYSV.
        (ffi_closure_helper_SYSV): Likewise.
        * src/powerpc/ppc_closure.S: Add return types for small structures.
        * src/powerpc/sysv.S: Add bits to handle small structures for
        final SYSV 4 ABI.

This patch broke libffi on powerpc*-linux, because the

> +	lwz %r3,0(%r6)
> +	lwz %r4,4(%r6)
> +	bl __lshrdi3	# libgcc function to shift r3/r4, shift value in r5.
> +	b .Lfinish

and

> +	rlwinm  %r5,%r31,5+23,32-5,31 /* Extract the value to shift.  */
> +	bl	__ashldi3  /* libgcc function to shift r3/r4,
> +			      shift value in r5.  */

calls aren't PLT calls, even in libffi.so and libffi_convenience.a.
This in turn makes libgcj.so.7 a DT_TEXTREL library, and
a) in largish Java programs the relocation can easily fail, because
   libgcc_s.so.1 can be mapped too far from libgcj.so.7
b) the SELinux policy might refuse textrel libraries by default, unless
   they are whitelisted
Simply adding @plt is not an option, because that would still be broken
on -msecure-plt builds.
So either we would need to add configury stuff to detect -msecure-plt
support in the assembler and conditionally add the:
        bcl 20,31,.LCF0
.LCF0:
        stw 30,8(1)
        mflr 30
...
        addis 30,30,_GLOBAL_OFFSET_TABLE_-.LCF0@ha
        addi 30,30,_GLOBAL_OFFSET_TABLE_-.LCF0@l
etc. stuff and make the 2 calls @plt, or we can inline the calls.
The attached patch does the latter.
The inlined __ashldi3 can be simplified, because we know %r5 must be
0, 8, 16 or 24, so it will also be faster than __ashldi3, even without
counting the overhead of a PLT call.
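
As a rough model of what the inlined sequence at smst_two_register computes,
here is a C sketch of a 64-bit left shift done on two 32-bit halves.  The
function names are mine, and the ppc_slw/ppc_srw helpers only mimic the fact
that slw and srw produce 0 for shift counts of 32 to 63; this is an
illustration, not the libffi source.

```c
#include <assert.h>
#include <stdint.h>

/* slw/srw look at the low 6 bits of the count; 32..63 give 0.  */
static uint32_t ppc_slw(uint32_t v, uint32_t n)
{
    n &= 63;
    return n < 32 ? v << n : 0;
}

static uint32_t ppc_srw(uint32_t v, uint32_t n)
{
    n &= 63;
    return n < 32 ? v >> n : 0;
}

/* Shift the pair hi:lo (think %r3:%r4) left by n, n in {0, 8, 16, 24}.  */
static void shl64_inline(uint32_t *hi, uint32_t *lo, uint32_t n)
{
    uint32_t top = ppc_slw(*hi, n);          /* slw %r29,%r3,%r5 */
    uint32_t carry = ppc_srw(*lo, 32 - n);   /* srw %r9,%r4,%r9 */

    if (n != 0) {                            /* beq- L(smst_8byte) */
        *hi = carry | top;                   /* or  %r3,%r9,%r29 */
        *lo = ppc_slw(*lo, n);               /* slw %r4,%r4,%r5 */
    }
}

static void shl64_example(void)
{
    uint32_t hi = 0x01020304u, lo = 0x05060708u;

    shl64_inline(&hi, &lo, 8);
    assert(hi == 0x02030405u && lo == 0x06070800u);
}
```

Because the count never reaches 32, none of the general-case handling in
__ashldi3 is needed.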
For the closure, the small struct register passing patch added 7
instructions to the fast path, IMHO completely unnecessarily: returning
small structs is pretty rare, and those insns waste time for all return types.
For 1, 2, 3, 4 and 8 byte structs we can easily fit the insn sequences
into the 4 insn slots and don't need to run through lots of nops.
For 5, 6 and 7 byte structs 4 insns are not enough, but in that case
we know the shift count is only 24, 16 or 8 respectively, so the __lshrdi3
operation can be simplified even further and inlined.
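
For reference, the operation that the __lshrdi3 call performed for 5, 6 and
7 byte structs can be sketched in C as below.  The shift count is
(8 - size) * 8, which matches the removed "(4 - size) * 8" computation once
4 has been subtracted from the size.  The function names and sample values
are assumptions for illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Shift count for a struct of 5, 6 or 7 bytes: 24, 16 or 8.  */
static uint32_t small_struct_shift(unsigned size)
{
    assert(size >= 5 && size <= 7);
    return (8 - size) * 8;
}

/* Logical right shift of the pair hi:lo (think %r3:%r4) by n,
   n in {8, 16, 24}.  */
static void shr64_inline(uint32_t *hi, uint32_t *lo, uint32_t n)
{
    assert(n == 8 || n == 16 || n == 24);
    uint32_t carry = *hi << (32 - n);   /* bits moving from hi into lo */
    *lo = (*lo >> n) | carry;
    *hi = *hi >> n;
}

static void shr64_example(void)
{
    /* Five bytes 11 22 33 44 55 loaded big-endian from the results area;
       the low three bytes of lo are padding.  */
    uint32_t hi = 0x11223344u, lo = 0x55000000u;

    shr64_inline(&hi, &lo, small_struct_shift(5));
    assert(hi == 0x00000011u && lo == 0x22334455u);
}
```

The struct ends up right-justified in the register pair, which is what the
library call produced before.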

I don't have access to powerpc*-*-freebsd*, so I have only briefly tested
the sysv.S change, by cross-compiling from powerpc64-*-linux to
powerpc*-*-freebsd* and writing my own testcase that called the routines
compiled by the FreeBSD cross compiler.  I haven't really tested the
ppc_closure.S bits (I ran make check on powerpc64-*-linux* -m32, but that
doesn't test those bits).

Can anyone please test this on powerpc*-*-freebsd*?

Ok for trunk/4.1 if that testing succeeds?

2006-01-17  Jakub Jelinek  <jakub@redhat.com>

	* src/powerpc/sysv.S (smst_two_register): Don't call __ashldi3, instead
	do the shifting inline.
	* src/powerpc/ppc_closure.S (ffi_closure_SYSV): Don't compute %r5
	shift count unconditionally.  Simplify load sequences for 1, 2, 3, 4
	and 8 byte structs, for the remaining struct sizes don't call __lshrdi3,
	instead do the shifting inline.

--- libffi/src/powerpc/sysv.S.jj	2005-11-12 18:08:36.000000000 +0100
+++ libffi/src/powerpc/sysv.S	2006-01-17 13:36:16.000000000 +0100
@@ -140,8 +140,14 @@ L(smst_one_register):
 	b	L(done_return_value)
 L(smst_two_register):
 	rlwinm  %r5,%r31,5+23,32-5,31 /* Extract the value to shift.  */
-	bl	__ashldi3  /* libgcc function to shift r3/r4,
-			      shift value in r5.  */
+	cmpwi	%r5,0
+	subfic	%r9,%r5,32
+	slw	%r29,%r3,%r5
+	srw	%r9,%r4,%r9
+	beq-	L(smst_8byte)
+	or	%r3,%r9,%r29
+	slw	%r4,%r4,%r5
+L(smst_8byte):
 	stw	%r3,0(%r30)
 	stw	%r4,4(%r30)
 	b	L(done_return_value)
--- libffi/src/powerpc/ppc_closure.S.jj	2005-11-12 18:08:36.000000000 +0100
+++ libffi/src/powerpc/ppc_closure.S	2006-01-17 13:59:59.000000000 +0100
@@ -63,19 +63,6 @@ ENTRY(ffi_closure_SYSV)
 	# so use it to look up in a table
 	# so we know how to deal with each type
 
-	# Extract the size of the return type for small structures.
-	# Then calculate (4 - size) and multiply the result by 8.
-	# This gives the value needed for the shift operation below.
-	# This part is only needed for FFI_SYSV and small structures.
-	addi	%r5,%r3,-(FFI_SYSV_TYPE_SMALL_STRUCT)
-	cmpwi	cr0,%r5,4
-	ble	cr0,.Lnext
-	addi	%r5,%r5,-4
-.Lnext:
-	addi	%r5,%r5,-4
-	neg	%r5,%r5
-	slwi	%r5,%r5,3
-
 	# look up the proper starting point in table
 	# by using return type as offset
 	addi %r6,%r1,112   # get pointer to results area
@@ -207,66 +194,66 @@ ENTRY(ffi_closure_SYSV)
 # case FFI_SYSV_TYPE_SMALL_STRUCT + 1. One byte struct.
 .Lret_type15:
 # fall through.
-	nop
-	nop
+	lbz %r3,0(%r6)
+	b .Lfinish
 	nop
 	nop
 
 # case FFI_SYSV_TYPE_SMALL_STRUCT + 2. Two byte struct.
 .Lret_type16:
 # fall through.
-	nop
-	nop
+	lhz %r3,0(%r6)
+	b .Lfinish
 	nop
 	nop
 
 # case FFI_SYSV_TYPE_SMALL_STRUCT + 3. Three byte struct.
 .Lret_type17:
 # fall through.
-	nop
-	nop
-	nop
+	lwz %r3,0(%r6)
+	srwi %r3,%r3,8
+	b .Lfinish
 	nop
 
 # case FFI_SYSV_TYPE_SMALL_STRUCT + 4. Four byte struct.
 .Lret_type18:
 # this one handles the structs from above too.
 	lwz %r3,0(%r6)
-	srw %r3,%r3,%r5
 	b .Lfinish
 	nop
+	nop
 
 # case FFI_SYSV_TYPE_SMALL_STRUCT + 5. Five byte struct.
 .Lret_type19:
 # fall through.
-	nop
-	nop
-	nop
-	nop
+	lwz %r3,0(%r6)
+	lwz %r4,4(%r6)
+	li %r5,24
+	b .Lstruct567
 
 # case FFI_SYSV_TYPE_SMALL_STRUCT + 6. Six byte struct.
 .Lret_type20:
 # fall through.
-	nop
-	nop
-	nop
-	nop
+	lwz %r3,0(%r6)
+	lwz %r4,4(%r6)
+	li %r5,16
+	b .Lstruct567
 
 # case FFI_SYSV_TYPE_SMALL_STRUCT + 7. Seven byte struct.
 .Lret_type21:
 # fall through.
-	nop
-	nop
-	nop
-	nop
+	lwz %r3,0(%r6)
+	lwz %r4,4(%r6)
+	li %r5,8
+	b .Lstruct567
 
 # case FFI_SYSV_TYPE_SMALL_STRUCT + 8. Eight byte struct.
 .Lret_type22:
 # this one handles the above unhandled structs.
 	lwz %r3,0(%r6)
 	lwz %r4,4(%r6)
-	bl __lshrdi3	# libgcc function to shift r3/r4, shift value in r5.
 	b .Lfinish
+	nop
 
 # case done
 .Lfinish:
@@ -275,6 +262,14 @@ ENTRY(ffi_closure_SYSV)
 	mtlr %r0
 	addi %r1,%r1,144
 	blr
+
+.Lstruct567:
+	subfic %r0,%r5,32	# %r0 = 32 - shift count
+	slw %r0,%r3,%r0		# bits of %r3 that move down into %r4
+	srw %r4,%r4,%r5
+	srw %r3,%r3,%r5
+	or %r4,%r0,%r4
+	b .Lfinish
 END(ffi_closure_SYSV)
 
 	.section	".eh_frame",EH_FRAME_FLAGS,@progbits


	Jakub

