This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: [PATCH]Code size optimization for the fmul/fdiv and dmul/ddiv function in libgcc


On 04/06/14 07:56, Tony Wang wrote:
> Hi there,
> 
> In libgcc, the files ieee754-sf.S and ieee754-df.S contain function
> pairs that are bundled into one .o file and share the same .text
> section. For example, the libgcc makefile builds fmul and fdiv into
> one .o file, which is archived into libgcc.a. So when a user calls
> only the single-precision floating-point multiply function, fdiv is
> linked in as well, and because fmul and fdiv share the same .text
> section, the linker option --gc-sections or -flto can't remove the
> dead code.
> 
> So the optimization separates each function pair (fmul/fdiv and
> dmul/ddiv) into different sections, following the naming pattern of
> -ffunction-sections (.text.__functionname), so that the unused
> fdiv/ddiv sections can be eliminated by --gc-sections when users
> only use fmul/dmul. The solution is to add a conditional statement to
> the macro FUNC_START, which conditionally changes the section of a
> function from .text to .text.__\name when compiling with the
> L_arm_muldivsf3 or L_arm_muldivdf3 macro defined.
> 
> There are 3 parts: mul, div and common. This patch puts mul and common
> together, so that a user's multiply won't pull in div, but a user's div
> will still pull in mul. This is reasonable because the size of mul is
> far smaller than the size of div.
> 
> ChangeLog changes are:
> 
> ***gcc/libgcc/ChangeLog***
> 
> 2014-05-28  Tony Wang  <tony.wang@arm.com>
> 
>         * config/arm/lib1funcs.S (FUNC_START): Add conditional section
>         redefinition for macros L_arm_muldivsf3 and L_arm_muldivdf3.
> 
> Bootstrapped on x86_64-linux-gnu with no regressions found in the
> testsuite. The patch is attached.
> The code size reduction for Thumb-2 on Cortex-M3 is:
> 1. When the user only uses single-precision floating-point multiply:
> fmul+fdiv => fmul gives a code size reduction of 318 bytes.
> 2. When the user only uses double-precision floating-point multiply:
> dmul+ddiv => dmul gives a code size reduction of 474 bytes.
> 
> BR,
> Tony
> 
> 
> libgcc_mul_div_code_size_reduction.diff
> 
> 
> diff --git a/libgcc/config/arm/lib1funcs.S b/libgcc/config/arm/lib1funcs.S
> index b617137..0454bc8 100644
> --- a/libgcc/config/arm/lib1funcs.S
> +++ b/libgcc/config/arm/lib1funcs.S
> @@ -419,7 +419,11 @@ SYM (\name):
>  #endif
>  
>  .macro FUNC_START name
> +#if defined (L_arm_muldivsf3) || defined (L_arm_muldivdf3)
> +	.section	.text.__\name,"ax",%progbits
> +#else
>  	.text
> +#endif
>  	.globl SYM (__\name)
>  	TYPE (__\name)
>  	.align 0
> @@ -468,7 +472,11 @@ _L__\name:
>  #define EQUIV .thumb_set
>  #else
>  .macro	ARM_FUNC_START name
> +#if defined (L_arm_muldivsf3) || defined (L_arm_muldivdf3)
> +	.section	.text.__\name,"ax",%progbits
> +#else
>  	.text
> +#endif
>  	.globl SYM (__\name)
>  	TYPE (__\name)
>  	.align 0
> 

I've two concerns about this:

1) the hacky approach to selecting when to use a separate section
2) the possibility that this will create out-of-range branches between
code fragments that cannot be veneered because the labels are untyped.
This is potentially exacerbated by the way GNU LD orders sections with
different names.

Fixing 1) is relatively straightforward.  Extend the FUNC_START macro
to take an optional argument that controls whether a special section
name is used.
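
As a rough sketch of what that could look like (the parameter name
sp_section and the keyword function_section are invented here purely
for illustration, and the tail of the macro is elided rather than
reproduced):

```asm
.macro FUNC_START name sp_section=
  .ifc \sp_section, function_section
	@ Caller opted in: give this function its own section.
	.section	.text.__\name,"ax",%progbits
  .else
	@ Default (argument omitted): stay in the shared .text section.
	.text
  .endif
	.globl SYM (__\name)
	TYPE (__\name)
	.align 0
	@ ... remainder of FUNC_START unchanged ...
.endm
```

A call site would then opt in explicitly, e.g. something like
"FUNC_START divsf3 function_section", while every existing user that
passes only the name keeps its current behaviour.  ARM_FUNC_START would
need the same optional argument.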

Fixing 2) is harder.  First, you must mark all symbols that cross
fragment boundaries as function symbols (they don't have to be global).
Second, you must ensure that r12 (IP) is not live at such points.
This might involve substantial restructuring of the code (I haven't
checked).
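
For the first part, typing a fragment-crossing label might look roughly
like the sketch below.  The label name is hypothetical, and it is
assumed that the label must not be a ".L" assembler-local name, so that
the symbol reaches the object's symbol table where the linker can see
it:

```asm
	@ __example_mul_common is a made-up name for a label that is
	@ branched to from code in another section.  There is no .globl,
	@ so the symbol stays local to the object file.
	.type	__example_mul_common, %function
#ifdef __thumb__
	.thumb_func		@ mark the next label as a Thumb entry point
#endif
__example_mul_common:
	@ ... shared code; r12 (IP) must not hold a live value on entry,
	@ since a linker-inserted veneer may use it as scratch ...
```

This only covers the symbol typing; auditing each such branch for a
live r12 still has to be done by hand.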

R.

