New parameters to control stringop expansion libcall strategy
Xinliang David Li
davidxl@google.com
Wed Aug 7 17:06:00 GMT 2013
Fixed the do while formatting. Ok for trunk with this version?
thanks,
David
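
For readers skimming the thread: the string accepted by -mmemcpy-strategy= and -mmemset-strategy= is a comma-separated list of alg:max_size:[align|noalign] triplets, where each range starts at the previous max_size + 1 and the last max_size must be -1. Below is a minimal standalone sketch of that parsing in plain C. It is not the code from the patch: parse_strategy, struct range, and MAX_RANGES are hypothetical names used only for illustration, and the algorithm name is kept as a string rather than mapped to stringop_alg.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_RANGES 4    /* hypothetical stand-in for MAX_STRINGOP_ALGS */

struct range
{
  int min;
  int max;              /* -1 means "no upper bound" */
  char alg[32];         /* algorithm name, not validated in this sketch */
  int noalign;
};

/* Parse STR into OUT, which must have room for MAX_RANGES entries.
   Return the number of ranges parsed, or -1 on any error.
   STR is modified in place (commas are overwritten with NULs).  */
static int
parse_strategy (char *str, struct range *out)
{
  int n = 0;
  char *curr = str;

  assert (str && out);

  while (curr)
    {
      char align[16];
      char *next = strchr (curr, ',');
      if (next)
        *next++ = '\0';

      /* Check the bound before writing to out[n].  */
      if (n == MAX_RANGES)
        return -1;

      /* Field widths keep sscanf within the destination buffers.  */
      if (sscanf (curr, "%31[^:]:%d:%15s", out[n].alg, &out[n].max, align) != 3)
        return -1;

      /* Only the last range may be open-ended.  */
      if (n > 0 && out[n - 1].max == -1)
        return -1;

      out[n].min = n > 0 ? out[n - 1].max + 1 : 0;
      if (out[n].max != -1 && out[n].max < out[n].min)
        return -1;      /* sizes must be increasing */

      if (!strcmp (align, "align"))
        out[n].noalign = 0;
      else if (!strcmp (align, "noalign"))
        out[n].noalign = 1;
      else
        return -1;

      n++;
      curr = next;
    }

  /* The final range must cover everything above the previous bound.  */
  if (out[n - 1].max != -1)
    return -1;
  return n;
}
```

Called on a writable copy of "rep_8byte:16:noalign,vector_loop:2048:align,libcall:-1:noalign", this sketch yields three ranges: [0, 16], [17, 2048], and [2049, -1]. Note that it checks the range count before storing each entry, so the fixed-size array is never written past its end.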
On Tue, Aug 6, 2013 at 2:42 AM, Jan Hubicka <hubicka@ucw.cz> wrote:
>> >>> 2013-08-02 Xinliang David Li <davidxl@google.com>
>> >>>
>> >>> * config/i386/stringop.def: New file.
>> >>> * config/i386/stringop.opt: New file.
>> >>> * config/i386/i386-opts.h: Include stringop.def.
>> >>> * config/i386/i386.opt: Include stringop.opt.
>> >>> * config/i386/i386.c (ix86_option_override_internal):
>> >>> Override default size based stringop inline strategies
>> >>> with options.
>> >>> * config/i386/i386.c (ix86_parse_stringop_strategy_string):
>> >>> New function.
>> >>>
>> >>> 2013-08-04 Xinliang David Li <davidxl@google.com>
>> >>>
>> >>> * testsuite/gcc.target/i386/memcpy-strategy-1.c: New test.
>> >>> * testsuite/gcc.target/i386/memcpy-strategy-2.c: Ditto.
>> >>> * testsuite/gcc.target/i386/memset-strategy-1.c: Ditto.
>> >>> * testsuite/gcc.target/i386/memcpy-strategy-3.c: Ditto.
>
> The patch looks reasonable to me in general. I wonder why we need to make
> all the cost tables non-const instead of just having writable storage for
> the "current strategy" like we do with other flags anyway.
>
> Your strings are definitely more readable than the in-memory representation
> I came up with. Perhaps we can even turn the cost tables into strings
> for easier maintenance? I guess they are a bit confusing for people
> not familiar with the code.
>
> Honza
>> >>>
>> >>>
>> >>>
>> >>>
>> >>> On Fri, Aug 2, 2013 at 9:21 PM, Xinliang David Li <davidxl@google.com> wrote:
>> >>> > On x86_64, when the expected size of memcpy/memset is known (e.g., with
>> >>> > FDO), the libcall strategy is used when the size is > 8192. This value is
>> >>> > hard coded, which makes it hard to do performance tuning. This patch
>> >>> > adds two new parameters to do that. Potential usage includes
>> >>> > per-application libcall strategy min-size tuning based on summary data
>> >>> > with FDO (e.g., instruction workset size).
>> >>> >
>> >>> > Bootstrapped and tested on x86_64/linux. Ok for trunk?
>> >>> >
>> >>> > thanks,
>> >>> >
>> >>> > David
>> >>> >
>> >>> >
>> >>> > 2013-08-02 Xinliang David Li <davidxl@google.com>
>> >>> >
>> >>> > * params.def: New parameters.
>> >>> > * config/i386/i386.c (ix86_option_override_internal):
>> >>> > Override default libcall size limit with parameters.
>> >>
>> >>> Index: config/i386/stringop.def
>> >>> ===================================================================
>> >>> --- config/i386/stringop.def (revision 0)
>> >>> +++ config/i386/stringop.def (revision 0)
>> >>> @@ -0,0 +1,42 @@
>> >>> +/* Definitions for option handling for IA-32.
>> >>> + Copyright (C) 2013 Free Software Foundation, Inc.
>> >>> +
>> >>> +This file is part of GCC.
>> >>> +
>> >>> +GCC is free software; you can redistribute it and/or modify
>> >>> +it under the terms of the GNU General Public License as published by
>> >>> +the Free Software Foundation; either version 3, or (at your option)
>> >>> +any later version.
>> >>> +
>> >>> +GCC is distributed in the hope that it will be useful,
>> >>> +but WITHOUT ANY WARRANTY; without even the implied warranty of
>> >>> +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
>> >>> +GNU General Public License for more details.
>> >>> +
>> >>> +Under Section 7 of GPL version 3, you are granted additional
>> >>> +permissions described in the GCC Runtime Library Exception, version
>> >>> +3.1, as published by the Free Software Foundation.
>> >>> +
>> >>> +You should have received a copy of the GNU General Public License and
>> >>> +a copy of the GCC Runtime Library Exception along with this program;
>> >>> +see the files COPYING3 and COPYING.RUNTIME respectively. If not, see
>> >>> +<http://www.gnu.org/licenses/>. */
>> >>> +
>> >>> +DEF_ENUM
>> >>> +DEF_ALG (no_stringop, no_stringop)
>> >>> +DEF_ENUM
>> >>> +DEF_ALG (libcall, libcall)
>> >>> +DEF_ENUM
>> >>> +DEF_ALG (rep_prefix_1_byte, rep_byte)
>> >>> +DEF_ENUM
>> >>> +DEF_ALG (rep_prefix_4_byte, rep_4byte)
>> >>> +DEF_ENUM
>> >>> +DEF_ALG (rep_prefix_8_byte, rep_8byte)
>> >>> +DEF_ENUM
>> >>> +DEF_ALG (loop_1_byte, byte_loop)
>> >>> +DEF_ENUM
>> >>> +DEF_ALG (loop, loop)
>> >>> +DEF_ENUM
>> >>> +DEF_ALG (unrolled_loop, unrolled_loop)
>> >>> +DEF_ENUM
>> >>> +DEF_ALG (vector_loop, vector_loop)
>> >>> Index: config/i386/i386.opt
>> >>> ===================================================================
>> >>> --- config/i386/i386.opt (revision 201458)
>> >>> +++ config/i386/i386.opt (working copy)
>> >>> @@ -316,6 +316,14 @@ mstack-arg-probe
>> >>> Target Report Mask(STACK_PROBE) Save
>> >>> Enable stack probing
>> >>>
>> >>> +mmemcpy-strategy=
>> >>> +Target RejectNegative Joined Var(ix86_tune_memcpy_strategy)
>> >>> +Specify memcpy expansion strategy when expected size is known
>> >>> +
>> >>> +mmemset-strategy=
>> >>> +Target RejectNegative Joined Var(ix86_tune_memset_strategy)
>> >>> +Specify memset expansion strategy when expected size is known
>> >>> +
>> >>> mstringop-strategy=
>> >>> Target RejectNegative Joined Enum(stringop_alg) Var(ix86_stringop_alg) Init(no_stringop)
>> >>> Chose strategy to generate stringop using
>> >>> Index: config/i386/stringop.opt
>> >>> ===================================================================
>> >>> --- config/i386/stringop.opt (revision 0)
>> >>> +++ config/i386/stringop.opt (revision 0)
>> >>> @@ -0,0 +1,36 @@
>> >>> +/* Definitions for option handling for IA-32.
>> >>> + Copyright (C) 2013 Free Software Foundation, Inc.
>> >>> +
>> >>> +This file is part of GCC.
>> >>> +
>> >>> +GCC is free software; you can redistribute it and/or modify
>> >>> +it under the terms of the GNU General Public License as published by
>> >>> +the Free Software Foundation; either version 3, or (at your option)
>> >>> +any later version.
>> >>> +
>> >>> +GCC is distributed in the hope that it will be useful,
>> >>> +but WITHOUT ANY WARRANTY; without even the implied warranty of
>> >>> +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
>> >>> +GNU General Public License for more details.
>> >>> +
>> >>> +Under Section 7 of GPL version 3, you are granted additional
>> >>> +permissions described in the GCC Runtime Library Exception, version
>> >>> +3.1, as published by the Free Software Foundation.
>> >>> +
>> >>> +You should have received a copy of the GNU General Public License and
>> >>> +a copy of the GCC Runtime Library Exception along with this program;
>> >>> +see the files COPYING3 and COPYING.RUNTIME respectively. If not, see
>> >>> +<http://www.gnu.org/licenses/>. */
>> >>> +
>> >>> +Enum(stringop_alg) String(rep_byte) Value(rep_prefix_1_byte)
>> >>> +
>> >>> +#undef DEF_ENUM
>> >>> +#define DEF_ENUM EnumValue
>> >>> +
>> >>> +#undef DEF_ALG
>> >>> +#define DEF_ALG(alg, name) Enum(stringop_alg) String(name) Value(alg)
>> >>> +
>> >>> +#include "stringop.def"
>> >>> +
>> >>> +#undef DEF_ENUM
>> >>> +#undef DEF_ALG
>> >>> Index: config/i386/i386.c
>> >>> ===================================================================
>> >>> --- config/i386/i386.c (revision 201458)
>> >>> +++ config/i386/i386.c (working copy)
>> >>> @@ -156,7 +156,7 @@ struct processor_costs ix86_size_cost =
>> >>> };
>> >>>
>> >>> /* Processor costs (relative to an add) */
>> >>> -static const
>> >>> +static
>> >>> struct processor_costs i386_cost = { /* 386 specific costs */
>> >>> COSTS_N_INSNS (1), /* cost of an add instruction */
>> >>> COSTS_N_INSNS (1), /* cost of a lea instruction */
>> >>> @@ -226,7 +226,7 @@ struct processor_costs i386_cost = { /*
>> >>> 1, /* cond_not_taken_branch_cost. */
>> >>> };
>> >>>
>> >>> -static const
>> >>> +static
>> >>> struct processor_costs i486_cost = { /* 486 specific costs */
>> >>> COSTS_N_INSNS (1), /* cost of an add instruction */
>> >>> COSTS_N_INSNS (1), /* cost of a lea instruction */
>> >>> @@ -298,7 +298,7 @@ struct processor_costs i486_cost = { /*
>> >>> 1, /* cond_not_taken_branch_cost. */
>> >>> };
>> >>>
>> >>> -static const
>> >>> +static
>> >>> struct processor_costs pentium_cost = {
>> >>> COSTS_N_INSNS (1), /* cost of an add instruction */
>> >>> COSTS_N_INSNS (1), /* cost of a lea instruction */
>> >>> @@ -368,7 +368,7 @@ struct processor_costs pentium_cost = {
>> >>> 1, /* cond_not_taken_branch_cost. */
>> >>> };
>> >>>
>> >>> -static const
>> >>> +static
>> >>> struct processor_costs pentiumpro_cost = {
>> >>> COSTS_N_INSNS (1), /* cost of an add instruction */
>> >>> COSTS_N_INSNS (1), /* cost of a lea instruction */
>> >>> @@ -447,7 +447,7 @@ struct processor_costs pentiumpro_cost =
>> >>> 1, /* cond_not_taken_branch_cost. */
>> >>> };
>> >>>
>> >>> -static const
>> >>> +static
>> >>> struct processor_costs geode_cost = {
>> >>> COSTS_N_INSNS (1), /* cost of an add instruction */
>> >>> COSTS_N_INSNS (1), /* cost of a lea instruction */
>> >>> @@ -518,7 +518,7 @@ struct processor_costs geode_cost = {
>> >>> 1, /* cond_not_taken_branch_cost. */
>> >>> };
>> >>>
>> >>> -static const
>> >>> +static
>> >>> struct processor_costs k6_cost = {
>> >>> COSTS_N_INSNS (1), /* cost of an add instruction */
>> >>> COSTS_N_INSNS (2), /* cost of a lea instruction */
>> >>> @@ -591,7 +591,7 @@ struct processor_costs k6_cost = {
>> >>> 1, /* cond_not_taken_branch_cost. */
>> >>> };
>> >>>
>> >>> -static const
>> >>> +static
>> >>> struct processor_costs athlon_cost = {
>> >>> COSTS_N_INSNS (1), /* cost of an add instruction */
>> >>> COSTS_N_INSNS (2), /* cost of a lea instruction */
>> >>> @@ -664,7 +664,7 @@ struct processor_costs athlon_cost = {
>> >>> 1, /* cond_not_taken_branch_cost. */
>> >>> };
>> >>>
>> >>> -static const
>> >>> +static
>> >>> struct processor_costs k8_cost = {
>> >>> COSTS_N_INSNS (1), /* cost of an add instruction */
>> >>> COSTS_N_INSNS (2), /* cost of a lea instruction */
>> >>> @@ -1265,7 +1265,7 @@ struct processor_costs btver2_cost = {
>> >>> 1, /* cond_not_taken_branch_cost. */
>> >>> };
>> >>>
>> >>> -static const
>> >>> +static
>> >>> struct processor_costs pentium4_cost = {
>> >>> COSTS_N_INSNS (1), /* cost of an add instruction */
>> >>> COSTS_N_INSNS (3), /* cost of a lea instruction */
>> >>> @@ -1336,7 +1336,7 @@ struct processor_costs pentium4_cost = {
>> >>> 1, /* cond_not_taken_branch_cost. */
>> >>> };
>> >>>
>> >>> -static const
>> >>> +static
>> >>> struct processor_costs nocona_cost = {
>> >>> COSTS_N_INSNS (1), /* cost of an add instruction */
>> >>> COSTS_N_INSNS (1), /* cost of a lea instruction */
>> >>> @@ -1409,7 +1409,7 @@ struct processor_costs nocona_cost = {
>> >>> 1, /* cond_not_taken_branch_cost. */
>> >>> };
>> >>>
>> >>> -static const
>> >>> +static
>> >>> struct processor_costs atom_cost = {
>> >>> COSTS_N_INSNS (1), /* cost of an add instruction */
>> >>> COSTS_N_INSNS (1) + 1, /* cost of a lea instruction */
>> >>> @@ -1556,7 +1556,7 @@ struct processor_costs slm_cost = {
>> >>> };
>> >>>
>> >>> /* Generic64 should produce code tuned for Nocona and K8. */
>> >>> -static const
>> >>> +static
>> >>> struct processor_costs generic64_cost = {
>> >>> COSTS_N_INSNS (1), /* cost of an add instruction */
>> >>> /* On all chips taken into consideration lea is 2 cycles and more. With
>> >>> @@ -1635,7 +1635,7 @@ struct processor_costs generic64_cost =
>> >>> };
>> >>>
>> >>> /* core_cost should produce code tuned for Core familly of CPUs. */
>> >>> -static const
>> >>> +static
>> >>> struct processor_costs core_cost = {
>> >>> COSTS_N_INSNS (1), /* cost of an add instruction */
>> >>> /* On all chips taken into consideration lea is 2 cycles and more. With
>> >>> @@ -1717,7 +1717,7 @@ struct processor_costs core_cost = {
>> >>>
>> >>> /* Generic32 should produce code tuned for PPro, Pentium4, Nocona,
>> >>> Athlon and K8. */
>> >>> -static const
>> >>> +static
>> >>> struct processor_costs generic32_cost = {
>> >>> COSTS_N_INSNS (1), /* cost of an add instruction */
>> >>> COSTS_N_INSNS (1) + 1, /* cost of a lea instruction */
>> >>> @@ -2900,6 +2900,150 @@ ix86_debug_options (void)
>> >>>
>> >>> return;
>> >>> }
>> >>> +
>> >>> +static const char *stringop_alg_names[] = {
>> >>> +#define DEF_ENUM
>> >>> +#define DEF_ALG(alg, name) #name,
>> >>> +#include "stringop.def"
>> >>> +#undef DEF_ENUM
>> >>> +#undef DEF_ALG
>> >>> +};
>> >>> +
>> >>> +/* Parse parameter string passed to -mmemcpy-strategy= or -mmemset-strategy=.
>> >>> + The string is of the following form (or comma separated list of it):
>> >>> +
>> >>> + strategy_alg:max_size:[align|noalign]
>> >>> +
>> >>> + where the full size range for the strategy is either [0, max_size] or
>> >>> + [min_size, max_size], in which min_size is the max_size + 1 of the
>> >>> + preceding range. The last size range must have max_size == -1.
>> >>> +
>> >>> + Examples:
>> >>> +
>> >>> + 1.
>> >>> + -mmemcpy-strategy=libcall:-1:noalign
>> >>> +
>> >>> + this is equivalent to (for known size memcpy) -mstringop-strategy=libcall
>> >>> +
>> >>> +
>> >>> + 2.
>> >>> + -mmemset-strategy=rep_8byte:16:noalign,vector_loop:2048:align,libcall:-1:noalign
>> >>> +
>> >>> + This is to tell the compiler to use the following strategy for memset
>> >>> + 1) when the expected size is between [1, 16], use rep_8byte strategy;
>> >>> + 2) when the size is between [17, 2048], use vector_loop;
>> >>> + 3) when the size is > 2048, use libcall.
>> >>> +
>> >>> +*/
>> >>> +
>> >>> +struct stringop_size_range
>> >>> +{
>> >>> + int min;
>> >>> + int max;
>> >>> + stringop_alg alg;
>> >>> + bool noalign;
>> >>> +};
>> >>> +
>> >>> +static void
>> >>> +ix86_parse_stringop_strategy_string (char *strategy_str, bool is_memset)
>> >>> +{
>> >>> + const struct stringop_algs *default_algs;
>> >>> + stringop_size_range input_ranges[MAX_STRINGOP_ALGS];
>> >>> + char *curr_range_str, *next_range_str;
>> >>> + int i = 0, n = 0;
>> >>> +
>> >>> + if (is_memset)
>> >>> + default_algs = &ix86_cost->memset[TARGET_64BIT != 0];
>> >>> + else
>> >>> + default_algs = &ix86_cost->memcpy[TARGET_64BIT != 0];
>> >>> +
>> >>> + curr_range_str = strategy_str;
>> >>> +
>> >>> + do {
>> >>> +
>> >>> + int mins = 0, maxs;
>> >>> + stringop_alg alg;
>> >>> + char alg_name[128];
>> >>> + char align[16];
>> >>> +
>> >>> + next_range_str = strchr (curr_range_str, ',');
>> >>> + if (next_range_str)
>> >>> + *next_range_str++ = '\0';
>> >>> +
>> >>> + if (3 != sscanf (curr_range_str, "%127[^:]:%d:%15s", alg_name, &maxs, align))
>> >>> + {
>> >>> + warning (0, "Wrong arg %s to option %s", curr_range_str,
>> >>> + is_memset ? "-mmemset-strategy=" : "-mmemcpy-strategy=");
>> >>> + return;
>> >>> + }
>> >>> +
>> >>> + if (n > 0 && (maxs < (mins = input_ranges[n - 1].max + 1) && maxs != -1))
>> >>> + {
>> >>> + warning (0, "Size ranges of option %s should be increasing",
>> >>> + is_memset ? "-mmemset-strategy=" : "-mmemcpy-strategy=");
>> >>> + return;
>> >>> + }
>> >>> +
>> >>> + for (i = 0; i < last_alg; i++)
>> >>> + {
>> >>> + if (!strcmp (alg_name, stringop_alg_names[i]))
>> >>> + {
>> >>> + alg = (stringop_alg) i;
>> >>> + break;
>> >>> + }
>> >>> + }
>> >>> +
>> >>> + if (i == last_alg)
>> >>> + {
>> >>> + warning (0, "Wrong stringop strategy name %s specified for option %s",
>> >>> + alg_name,
>> >>> + is_memset ? "-mmemset-strategy=" : "-mmemcpy-strategy=");
>> >>> + return;
>> >>> + }
>> >>> +
>> >>> + input_ranges[n].min = mins;
>> >>> + input_ranges[n].max = maxs;
>> >>> + input_ranges[n].alg = alg;
>> >>> + if (!strcmp (align, "align"))
>> >>> + input_ranges[n].noalign = false;
>> >>> + else if (!strcmp (align, "noalign"))
>> >>> + input_ranges[n].noalign = true;
>> >>> + else
>> >>> + {
>> >>> + warning (0, "Unknown alignment %s specified for option %s",
>> >>> + align, is_memset ? "-mmemset-strategy=" : "-mmemcpy-strategy=");
>> >>> + return;
>> >>> + }
>> >>> + n++;
>> >>> + curr_range_str = next_range_str;
>> >>> + } while (curr_range_str);
>> >>> +
>> >>> + if (input_ranges[n - 1].max != -1)
>> >>> + {
>> >>> + warning (0, "The max value for the last size range should be -1"
>> >>> + " for option %s",
>> >>> + is_memset ? "-mmemset-strategy=" : "-mmemcpy-strategy=");
>> >>> + return;
>> >>> + }
>> >>> +
>> >>> + if (n > MAX_STRINGOP_ALGS)
>> >>> + {
>> >>> + warning (0, "Too many size ranges specified in option %s",
>> >>> + is_memset ? "-mmemset-strategy=" : "-mmemcpy-strategy=");
>> >>> + return;
>> >>> + }
>> >>> +
>> >>> + /* Now override the default algs array */
>> >>> + for (i = 0; i < n; i++)
>> >>> + {
>> >>> + *const_cast<int *>(&default_algs->size[i].max) = input_ranges[i].max;
>> >>> + *const_cast<stringop_alg *>(&default_algs->size[i].alg)
>> >>> + = input_ranges[i].alg;
>> >>> + *const_cast<int *>(&default_algs->size[i].noalign)
>> >>> + = input_ranges[i].noalign;
>> >>> + }
>> >>> +}
>> >>> +
>> >>>
>> >>> /* Override various settings based on options. If MAIN_ARGS_P, the
>> >>> options are from the command line, otherwise they are from
>> >>> @@ -4021,6 +4165,21 @@ ix86_option_override_internal (bool main
>> >>> /* Handle stack protector */
>> >>> if (!global_options_set.x_ix86_stack_protector_guard)
>> >>> ix86_stack_protector_guard = TARGET_HAS_BIONIC ? SSP_GLOBAL : SSP_TLS;
>> >>> +
>> >>> + /* Handle -mmemcpy-strategy= and -mmemset-strategy= */
>> >>> + if (ix86_tune_memcpy_strategy)
>> >>> + {
>> >>> + char *str = xstrdup (ix86_tune_memcpy_strategy);
>> >>> + ix86_parse_stringop_strategy_string (str, false);
>> >>> + free (str);
>> >>> + }
>> >>> +
>> >>> + if (ix86_tune_memset_strategy)
>> >>> + {
>> >>> + char *str = xstrdup (ix86_tune_memset_strategy);
>> >>> + ix86_parse_stringop_strategy_string (str, true);
>> >>> + free (str);
>> >>> + }
>> >>> }
>> >>>
>> >>> /* Implement the TARGET_OPTION_OVERRIDE hook. */
>> >>> @@ -22903,6 +23062,7 @@ ix86_expand_movmem (rtx dst, rtx src, rt
>> >>> {
>> >>> case libcall:
>> >>> case no_stringop:
>> >>> + case last_alg:
>> >>> gcc_unreachable ();
>> >>> case loop_1_byte:
>> >>> need_zero_guard = true;
>> >>> @@ -23093,6 +23253,7 @@ ix86_expand_movmem (rtx dst, rtx src, rt
>> >>> {
>> >>> case libcall:
>> >>> case no_stringop:
>> >>> + case last_alg:
>> >>> gcc_unreachable ();
>> >>> case loop_1_byte:
>> >>> case loop:
>> >>> @@ -23304,6 +23465,7 @@ ix86_expand_setmem (rtx dst, rtx count_e
>> >>> {
>> >>> case libcall:
>> >>> case no_stringop:
>> >>> + case last_alg:
>> >>> gcc_unreachable ();
>> >>> case loop:
>> >>> need_zero_guard = true;
>> >>> @@ -23481,6 +23643,7 @@ ix86_expand_setmem (rtx dst, rtx count_e
>> >>> {
>> >>> case libcall:
>> >>> case no_stringop:
>> >>> + case last_alg:
>> >>> gcc_unreachable ();
>> >>> case loop_1_byte:
>> >>> case loop:
>> >>> Index: config/i386/i386-opts.h
>> >>> ===================================================================
>> >>> --- config/i386/i386-opts.h (revision 201458)
>> >>> +++ config/i386/i386-opts.h (working copy)
>> >>> @@ -28,15 +28,17 @@ see the files COPYING3 and COPYING.RUNTI
>> >>> /* Algorithm to expand string function with. */
>> >>> enum stringop_alg
>> >>> {
>> >>> - no_stringop,
>> >>> - libcall,
>> >>> - rep_prefix_1_byte,
>> >>> - rep_prefix_4_byte,
>> >>> - rep_prefix_8_byte,
>> >>> - loop_1_byte,
>> >>> - loop,
>> >>> - unrolled_loop,
>> >>> - vector_loop
>> >>> +#undef DEF_ENUM
>> >>> +#define DEF_ENUM
>> >>> +
>> >>> +#undef DEF_ALG
>> >>> +#define DEF_ALG(alg, name) alg,
>> >>> +
>> >>> +#include "stringop.def"
>> >>> +last_alg
>> >>> +
>> >>> +#undef DEF_ENUM
>> >>> +#undef DEF_ALG
>> >>> };
>> >>>
>> >>> /* Available call abi. */
>> >>> Index: doc/invoke.texi
>> >>> ===================================================================
>> >>> --- doc/invoke.texi (revision 201458)
>> >>> +++ doc/invoke.texi (working copy)
>> >>> @@ -649,6 +649,7 @@ Objective-C and Objective-C++ Dialects}.
>> >>> -mbmi2 -mrtm -mlwp -mthreads @gol
>> >>> -mno-align-stringops -minline-all-stringops @gol
>> >>> -minline-stringops-dynamically -mstringop-strategy=@var{alg} @gol
>> >>> +-mmemcpy-strategy=@var{strategy} -mmemset-strategy=@var{strategy} @gol
>> >>> -mpush-args -maccumulate-outgoing-args -m128bit-long-double @gol
>> >>> -m96bit-long-double -mlong-double-64 -mlong-double-80 @gol
>> >>> -mregparm=@var{num} -msseregparm @gol
>> >>> @@ -14598,6 +14599,24 @@ Expand into an inline loop.
>> >>> Always use a library call.
>> >>> @end table
>> >>>
>> >>> +@item -mmemcpy-strategy=@var{strategy}
>> >>> +@opindex mmemcpy-strategy=@var{strategy}
>> >>> +Override the internal decision heuristic to decide if @code{__builtin_memcpy}
>> >>> +should be inlined and what inline algorithm to use when the expected size
>> >>> +of the copy operation is known. @var{strategy}
>> >>> +is a comma-separated list of @var{alg}:@var{max_size}:@var{dest_align} triplets.
>> >>> +@var{alg} is specified in @option{-mstringop-strategy}, @var{max_size} specifies
>> >>> +the max byte size with which inline algorithm @var{alg} is allowed. For the last
>> >>> +triplet, the @var{max_size} must be @code{-1}. The @var{max_size} of the triplets
>> >>> +in the list must be specified in increasing order. The minimal byte size for
>> >>> +@var{alg} is @code{0} for the first triplet and @code{@var{max_size} + 1} of the
>> >>> +preceding range.
>> >>> +
>> >>> +@item -mmemset-strategy=@var{strategy}
>> >>> +@opindex mmemset-strategy=@var{strategy}
>> >>> +The option is similar to @option{-mmemcpy-strategy=} except that it is to control
>> >>> +@code{__builtin_memset} expansion.
>> >>> +
>> >>> @item -momit-leaf-frame-pointer
>> >>> @opindex momit-leaf-frame-pointer
>> >>> Don't keep the frame pointer in a register for leaf functions. This
>> >>> Index: testsuite/gcc.target/i386/memcpy-strategy-1.c
>> >>> ===================================================================
>> >>> --- testsuite/gcc.target/i386/memcpy-strategy-1.c (revision 0)
>> >>> +++ testsuite/gcc.target/i386/memcpy-strategy-1.c (revision 0)
>> >>> @@ -0,0 +1,12 @@
>> >>> +/* { dg-do compile } */
>> >>> +/* { dg-options "-O2 -march=atom -mmemcpy-strategy=vector_loop:-1:align" } */
>> >>> +/* { dg-final { scan-assembler-times "movdqa" 8 { target { ! { ia32 } } } } } */
>> >>> +/* { dg-final { scan-assembler-times "movdqa" 4 { target { ia32 } } } } */
>> >>> +
>> >>> +char a[2048];
>> >>> +char b[2048];
>> >>> +void t (void)
>> >>> +{
>> >>> + __builtin_memcpy (a, b, 2048);
>> >>> +}
>> >>> +
>> >>> Index: testsuite/gcc.target/i386/memcpy-strategy-2.c
>> >>> ===================================================================
>> >>> --- testsuite/gcc.target/i386/memcpy-strategy-2.c (revision 0)
>> >>> +++ testsuite/gcc.target/i386/memcpy-strategy-2.c (revision 0)
>> >>> @@ -0,0 +1,12 @@
>> >>> +/* { dg-do compile } */
>> >>> +/* { dg-options "-O2 -march=atom -mmemcpy-strategy=vector_loop:3000:align,libcall:-1:align" } */
>> >>> +/* { dg-final { scan-assembler-times "movdqa" 8 { target { ! { ia32 } } } } } */
>> >>> +/* { dg-final { scan-assembler-times "movdqa" 4 { target { ia32 } } } } */
>> >>> +
>> >>> +char a[2048];
>> >>> +char b[2048];
>> >>> +void t (void)
>> >>> +{
>> >>> + __builtin_memcpy (a, b, 2048);
>> >>> +}
>> >>> +
>> >>> Index: testsuite/gcc.target/i386/memset-strategy-1.c
>> >>> ===================================================================
>> >>> --- testsuite/gcc.target/i386/memset-strategy-1.c (revision 0)
>> >>> +++ testsuite/gcc.target/i386/memset-strategy-1.c (revision 0)
>> >>> @@ -0,0 +1,10 @@
>> >>> +/* { dg-do compile } */
>> >>> +/* { dg-options "-O2 -march=atom -mmemset-strategy=libcall:-1:align" } */
>> >>> +/* { dg-final { scan-assembler-times "memset" 2 } } */
>> >>> +
>> >>> +char a[2048];
>> >>> +void t (void)
>> >>> +{
>> >>> + __builtin_memset (a, 1, 2048);
>> >>> +}
>> >>> +
>> >>> Index: testsuite/gcc.target/i386/memcpy-strategy-3.c
>> >>> ===================================================================
>> >>> --- testsuite/gcc.target/i386/memcpy-strategy-3.c (revision 0)
>> >>> +++ testsuite/gcc.target/i386/memcpy-strategy-3.c (revision 0)
>> >>> @@ -0,0 +1,11 @@
>> >>> +/* { dg-do compile } */
>> >>> +/* { dg-options "-O2 -march=atom -mmemcpy-strategy=vector_loop:2000:align,libcall:-1:align" } */
>> >>> +/* { dg-final { scan-assembler-times "memcpy" 2 } } */
>> >>> +
>> >>> +char a[2048];
>> >>> +char b[2048];
>> >>> +void t (void)
>> >>> +{
>> >>> + __builtin_memcpy (a, b, 2048);
>> >>> +}
>> >>> +
>> >>
>>
>>
>>
>> --
>> ---
>> Best regards,
>> Michael V. Zolotukhin,
>> Software Engineer
>> Intel Corporation.
-------------- next part --------------
Index: testsuite/gcc.target/i386/memcpy-strategy-3.c
===================================================================
--- testsuite/gcc.target/i386/memcpy-strategy-3.c (revision 0)
+++ testsuite/gcc.target/i386/memcpy-strategy-3.c (revision 0)
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=atom -mmemcpy-strategy=vector_loop:2000:align,libcall:-1:align" } */
+/* { dg-final { scan-assembler-times "memcpy" 2 } } */
+
+char a[2048];
+char b[2048];
+void t (void)
+{
+ __builtin_memcpy (a, b, 2048);
+}
+
Index: testsuite/gcc.target/i386/memcpy-strategy-1.c
===================================================================
--- testsuite/gcc.target/i386/memcpy-strategy-1.c (revision 0)
+++ testsuite/gcc.target/i386/memcpy-strategy-1.c (revision 0)
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=atom -mmemcpy-strategy=vector_loop:-1:align" } */
+/* { dg-final { scan-assembler-times "movdqa" 8 { target { ! { ia32 } } } } } */
+/* { dg-final { scan-assembler-times "movdqa" 4 { target { ia32 } } } } */
+
+char a[2048];
+char b[2048];
+void t (void)
+{
+ __builtin_memcpy (a, b, 2048);
+}
+
Index: testsuite/gcc.target/i386/memcpy-strategy-2.c
===================================================================
--- testsuite/gcc.target/i386/memcpy-strategy-2.c (revision 0)
+++ testsuite/gcc.target/i386/memcpy-strategy-2.c (revision 0)
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=atom -mmemcpy-strategy=vector_loop:3000:align,libcall:-1:align" } */
+/* { dg-final { scan-assembler-times "movdqa" 8 { target { ! { ia32 } } } } } */
+/* { dg-final { scan-assembler-times "movdqa" 4 { target { ia32 } } } } */
+
+char a[2048];
+char b[2048];
+void t (void)
+{
+ __builtin_memcpy (a, b, 2048);
+}
+
Index: testsuite/gcc.target/i386/memset-strategy-1.c
===================================================================
--- testsuite/gcc.target/i386/memset-strategy-1.c (revision 0)
+++ testsuite/gcc.target/i386/memset-strategy-1.c (revision 0)
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=atom -mmemset-strategy=libcall:-1:align" } */
+/* { dg-final { scan-assembler-times "memset" 2 } } */
+
+char a[2048];
+void t (void)
+{
+ __builtin_memset (a, 1, 2048);
+}
+
Index: doc/invoke.texi
===================================================================
--- doc/invoke.texi (revision 201540)
+++ doc/invoke.texi (working copy)
@@ -649,6 +649,7 @@ Objective-C and Objective-C++ Dialects}.
-mbmi2 -mrtm -mlwp -mthreads @gol
-mno-align-stringops -minline-all-stringops @gol
-minline-stringops-dynamically -mstringop-strategy=@var{alg} @gol
+-mmemcpy-strategy=@var{strategy} -mmemset-strategy=@var{strategy} @gol
-mpush-args -maccumulate-outgoing-args -m128bit-long-double @gol
-m96bit-long-double -mlong-double-64 -mlong-double-80 @gol
-mregparm=@var{num} -msseregparm @gol
@@ -14598,6 +14599,24 @@ Expand into an inline loop.
Always use a library call.
@end table
+@item -mmemcpy-strategy=@var{strategy}
+@opindex mmemcpy-strategy=@var{strategy}
+Override the internal decision heuristic to decide if @code{__builtin_memcpy}
+should be inlined and what inline algorithm to use when the expected size
+of the copy operation is known. @var{strategy}
+is a comma-separated list of @var{alg}:@var{max_size}:@var{dest_align} triplets.
+@var{alg} is specified in @option{-mstringop-strategy}, @var{max_size} specifies
+the max byte size with which inline algorithm @var{alg} is allowed. For the last
+triplet, the @var{max_size} must be @code{-1}. The @var{max_size} of the triplets
+in the list must be specified in increasing order. The minimal byte size for
+@var{alg} is @code{0} for the first triplet and @code{@var{max_size} + 1} of the
+preceding range.
+
+@item -mmemset-strategy=@var{strategy}
+@opindex mmemset-strategy=@var{strategy}
+The option is similar to @option{-mmemcpy-strategy=} except that it is to control
+@code{__builtin_memset} expansion.
+
@item -momit-leaf-frame-pointer
@opindex momit-leaf-frame-pointer
Don't keep the frame pointer in a register for leaf functions. This
Index: config/i386/stringop.def
===================================================================
--- config/i386/stringop.def (revision 0)
+++ config/i386/stringop.def (revision 0)
@@ -0,0 +1,42 @@
+/* Definitions for option handling for IA-32.
+ Copyright (C) 2013 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+GNU General Public License for more details.
+
+Under Section 7 of GPL version 3, you are granted additional
+permissions described in the GCC Runtime Library Exception, version
+3.1, as published by the Free Software Foundation.
+
+You should have received a copy of the GNU General Public License and
+a copy of the GCC Runtime Library Exception along with this program;
+see the files COPYING3 and COPYING.RUNTIME respectively. If not, see
+<http://www.gnu.org/licenses/>. */
+
+DEF_ENUM
+DEF_ALG (no_stringop, no_stringop)
+DEF_ENUM
+DEF_ALG (libcall, libcall)
+DEF_ENUM
+DEF_ALG (rep_prefix_1_byte, rep_byte)
+DEF_ENUM
+DEF_ALG (rep_prefix_4_byte, rep_4byte)
+DEF_ENUM
+DEF_ALG (rep_prefix_8_byte, rep_8byte)
+DEF_ENUM
+DEF_ALG (loop_1_byte, byte_loop)
+DEF_ENUM
+DEF_ALG (loop, loop)
+DEF_ENUM
+DEF_ALG (unrolled_loop, unrolled_loop)
+DEF_ENUM
+DEF_ALG (vector_loop, vector_loop)
Index: config/i386/i386.c
===================================================================
--- config/i386/i386.c (revision 201540)
+++ config/i386/i386.c (working copy)
@@ -158,7 +158,7 @@ struct processor_costs ix86_size_cost =
};
/* Processor costs (relative to an add) */
-static const
+static
struct processor_costs i386_cost = { /* 386 specific costs */
COSTS_N_INSNS (1), /* cost of an add instruction */
COSTS_N_INSNS (1), /* cost of a lea instruction */
@@ -228,7 +228,7 @@ struct processor_costs i386_cost = { /*
1, /* cond_not_taken_branch_cost. */
};
-static const
+static
struct processor_costs i486_cost = { /* 486 specific costs */
COSTS_N_INSNS (1), /* cost of an add instruction */
COSTS_N_INSNS (1), /* cost of a lea instruction */
@@ -300,7 +300,7 @@ struct processor_costs i486_cost = { /*
1, /* cond_not_taken_branch_cost. */
};
-static const
+static
struct processor_costs pentium_cost = {
COSTS_N_INSNS (1), /* cost of an add instruction */
COSTS_N_INSNS (1), /* cost of a lea instruction */
@@ -370,7 +370,7 @@ struct processor_costs pentium_cost = {
1, /* cond_not_taken_branch_cost. */
};
-static const
+static
struct processor_costs pentiumpro_cost = {
COSTS_N_INSNS (1), /* cost of an add instruction */
COSTS_N_INSNS (1), /* cost of a lea instruction */
@@ -449,7 +449,7 @@ struct processor_costs pentiumpro_cost =
1, /* cond_not_taken_branch_cost. */
};
-static const
+static
struct processor_costs geode_cost = {
COSTS_N_INSNS (1), /* cost of an add instruction */
COSTS_N_INSNS (1), /* cost of a lea instruction */
@@ -520,7 +520,7 @@ struct processor_costs geode_cost = {
1, /* cond_not_taken_branch_cost. */
};
-static const
+static
struct processor_costs k6_cost = {
COSTS_N_INSNS (1), /* cost of an add instruction */
COSTS_N_INSNS (2), /* cost of a lea instruction */
@@ -593,7 +593,7 @@ struct processor_costs k6_cost = {
1, /* cond_not_taken_branch_cost. */
};
-static const
+static
struct processor_costs athlon_cost = {
COSTS_N_INSNS (1), /* cost of an add instruction */
COSTS_N_INSNS (2), /* cost of a lea instruction */
@@ -666,7 +666,7 @@ struct processor_costs athlon_cost = {
1, /* cond_not_taken_branch_cost. */
};
-static const
+static
struct processor_costs k8_cost = {
COSTS_N_INSNS (1), /* cost of an add instruction */
COSTS_N_INSNS (2), /* cost of a lea instruction */
@@ -1267,7 +1267,7 @@ struct processor_costs btver2_cost = {
1, /* cond_not_taken_branch_cost. */
};
-static const
+static
struct processor_costs pentium4_cost = {
COSTS_N_INSNS (1), /* cost of an add instruction */
COSTS_N_INSNS (3), /* cost of a lea instruction */
@@ -1338,7 +1338,7 @@ struct processor_costs pentium4_cost = {
1, /* cond_not_taken_branch_cost. */
};
-static const
+static
struct processor_costs nocona_cost = {
COSTS_N_INSNS (1), /* cost of an add instruction */
COSTS_N_INSNS (1), /* cost of a lea instruction */
@@ -1411,7 +1411,7 @@ struct processor_costs nocona_cost = {
1, /* cond_not_taken_branch_cost. */
};
-static const
+static
struct processor_costs atom_cost = {
COSTS_N_INSNS (1), /* cost of an add instruction */
COSTS_N_INSNS (1) + 1, /* cost of a lea instruction */
@@ -1558,7 +1558,7 @@ struct processor_costs slm_cost = {
};
/* Generic64 should produce code tuned for Nocona and K8. */
-static const
+static
struct processor_costs generic64_cost = {
COSTS_N_INSNS (1), /* cost of an add instruction */
/* On all chips taken into consideration lea is 2 cycles and more. With
@@ -1637,7 +1637,7 @@ struct processor_costs generic64_cost =
};
/* core_cost should produce code tuned for Core familly of CPUs. */
-static const
+static
struct processor_costs core_cost = {
COSTS_N_INSNS (1), /* cost of an add instruction */
/* On all chips taken into consideration lea is 2 cycles and more. With
@@ -1719,7 +1719,7 @@ struct processor_costs core_cost = {
/* Generic32 should produce code tuned for PPro, Pentium4, Nocona,
Athlon and K8. */
-static const
+static
struct processor_costs generic32_cost = {
COSTS_N_INSNS (1), /* cost of an add instruction */
COSTS_N_INSNS (1) + 1, /* cost of a lea instruction */
@@ -2919,6 +2919,149 @@ ix86_debug_options (void)
return;
}
+
+static const char *stringop_alg_names[] = {
+#define DEF_ENUM
+#define DEF_ALG(alg, name) #name,
+#include "stringop.def"
+#undef DEF_ENUM
+#undef DEF_ALG
+};
+
+/* Parse parameter string passed to -mmemcpy-strategy= or -mmemset-strategy=.
+ The string is a comma-separated list of one or more entries of the form:
+
+ strategy_alg:max_size:[align|noalign]
+
+ where the full size range for the strategy is either [0, max_size] or
+ [min_size, max_size], in which min_size is the max_size + 1 of the
+ preceding range. The last size range must have max_size == -1.
+
+ Examples:
+
+ 1.
+ -mmemcpy-strategy=libcall:-1:noalign
+
+ This is equivalent to -mstringop-strategy=libcall for memcpy of known size.
+
+
+ 2.
+ -mmemset-strategy=rep_8byte:16:noalign,vector_loop:2048:align,libcall:-1:noalign
+
+ This tells the compiler to use the following strategy for memset:
+ 1) when the expected size is in [0, 16], use rep_8byte;
+ 2) when the size is in [17, 2048], use vector_loop;
+ 3) when the size is > 2048, use libcall. */
+
+struct stringop_size_range
+{
+ int min;
+ int max;
+ stringop_alg alg;
+ bool noalign;
+};
+
+static void
+ix86_parse_stringop_strategy_string (char *strategy_str, bool is_memset)
+{
+ const struct stringop_algs *default_algs;
+ stringop_size_range input_ranges[MAX_STRINGOP_ALGS];
+ char *curr_range_str, *next_range_str;
+ int i = 0, n = 0;
+
+ if (is_memset)
+ default_algs = &ix86_cost->memset[TARGET_64BIT != 0];
+ else
+ default_algs = &ix86_cost->memcpy[TARGET_64BIT != 0];
+
+ curr_range_str = strategy_str;
+
+ do
+ {
+ int mins = 0, maxs;
+ stringop_alg alg;
+ char alg_name[128];
+ char align[16];
+
+ next_range_str = strchr (curr_range_str, ',');
+ if (next_range_str)
+ *next_range_str++ = '\0';
+
+ if (3 != sscanf (curr_range_str, "%127[^:]:%d:%15s", alg_name, &maxs, align))
+ {
+ warning (0, "Wrong arg %s to option %s", curr_range_str,
+ is_memset ? "-mmemset-strategy=" : "-mmemcpy-strategy=");
+ return;
+ }
+
+ if (n > 0 && (maxs < (mins = input_ranges[n - 1].max + 1) && maxs != -1))
+ {
+ warning (0, "Size ranges of option %s should be increasing",
+ is_memset ? "-mmemset-strategy=" : "-mmemcpy-strategy=");
+ return;
+ }
+
+ for (i = 0; i < last_alg; i++)
+ {
+ if (!strcmp (alg_name, stringop_alg_names[i]))
+ {
+ alg = (stringop_alg) i;
+ break;
+ }
+ }
+
+ if (i == last_alg)
+ {
+ warning (0, "Wrong stringop strategy name %s specified for option %s",
+ alg_name,
+ is_memset ? "-mmemset-strategy=" : "-mmemcpy-strategy=");
+ return;
+ }
+
+ input_ranges[n].min = mins;
+ input_ranges[n].max = maxs;
+ input_ranges[n].alg = alg;
+ if (!strcmp (align, "align"))
+ input_ranges[n].noalign = false;
+ else if (!strcmp (align, "noalign"))
+ input_ranges[n].noalign = true;
+ else
+ {
+ warning (0, "Unknown alignment %s specified for option %s",
+ align, is_memset ? "-mmemset-strategy=" : "-mmemcpy-strategy=");
+ return;
+ }
+ n++;
+ curr_range_str = next_range_str;
+ /* Stop before another range would overflow input_ranges. */
+ if (curr_range_str && n >= MAX_STRINGOP_ALGS)
+ {
+ warning (0, "Too many size ranges specified in option %s",
+ is_memset ? "-mmemset-strategy=" : "-mmemcpy-strategy=");
+ return;
+ }
+ }
+ while (curr_range_str);
+
+ if (input_ranges[n - 1].max != -1)
+ {
+ warning (0, "The max value for the last size range should be -1"
+ " for option %s",
+ is_memset ? "-mmemset-strategy=" : "-mmemcpy-strategy=");
+ return;
+ }
+
+ /* Now override the default algs array. */
+ for (i = 0; i < n; i++)
+ {
+ *const_cast<int *>(&default_algs->size[i].max) = input_ranges[i].max;
+ *const_cast<stringop_alg *>(&default_algs->size[i].alg)
+ = input_ranges[i].alg;
+ *const_cast<int *>(&default_algs->size[i].noalign)
+ = input_ranges[i].noalign;
+ }
+}
+
/* Override various settings based on options. If MAIN_ARGS_P, the
options are from the command line, otherwise they are from
@@ -4040,6 +4183,21 @@ ix86_option_override_internal (bool main
/* Handle stack protector */
if (!global_options_set.x_ix86_stack_protector_guard)
ix86_stack_protector_guard = TARGET_HAS_BIONIC ? SSP_GLOBAL : SSP_TLS;
+
+ /* Handle -mmemcpy-strategy= and -mmemset-strategy= */
+ if (ix86_tune_memcpy_strategy)
+ {
+ char *str = xstrdup (ix86_tune_memcpy_strategy);
+ ix86_parse_stringop_strategy_string (str, false);
+ free (str);
+ }
+
+ if (ix86_tune_memset_strategy)
+ {
+ char *str = xstrdup (ix86_tune_memset_strategy);
+ ix86_parse_stringop_strategy_string (str, true);
+ free (str);
+ }
}
/* Implement the TARGET_OPTION_OVERRIDE hook. */
@@ -22923,6 +23081,7 @@ ix86_expand_movmem (rtx dst, rtx src, rt
{
case libcall:
case no_stringop:
+ case last_alg:
gcc_unreachable ();
case loop_1_byte:
need_zero_guard = true;
@@ -23113,6 +23272,7 @@ ix86_expand_movmem (rtx dst, rtx src, rt
{
case libcall:
case no_stringop:
+ case last_alg:
gcc_unreachable ();
case loop_1_byte:
case loop:
@@ -23324,6 +23484,7 @@ ix86_expand_setmem (rtx dst, rtx count_e
{
case libcall:
case no_stringop:
+ case last_alg:
gcc_unreachable ();
case loop:
need_zero_guard = true;
@@ -23501,6 +23662,7 @@ ix86_expand_setmem (rtx dst, rtx count_e
{
case libcall:
case no_stringop:
+ case last_alg:
gcc_unreachable ();
case loop_1_byte:
case loop:
Index: config/i386/i386-opts.h
===================================================================
--- config/i386/i386-opts.h (revision 201540)
+++ config/i386/i386-opts.h (working copy)
@@ -28,15 +28,17 @@ see the files COPYING3 and COPYING.RUNTI
/* Algorithm to expand string function with. */
enum stringop_alg
{
- no_stringop,
- libcall,
- rep_prefix_1_byte,
- rep_prefix_4_byte,
- rep_prefix_8_byte,
- loop_1_byte,
- loop,
- unrolled_loop,
- vector_loop
+#undef DEF_ENUM
+#define DEF_ENUM
+
+#undef DEF_ALG
+#define DEF_ALG(alg, name) alg,
+
+#include "stringop.def"
+last_alg
+
+#undef DEF_ENUM
+#undef DEF_ALG
};
/* Available call abi. */
Index: config/i386/stringop.opt
===================================================================
--- config/i386/stringop.opt (revision 0)
+++ config/i386/stringop.opt (revision 0)
@@ -0,0 +1,36 @@
+/* Definitions for option handling for IA-32.
+ Copyright (C) 2013 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+GNU General Public License for more details.
+
+Under Section 7 of GPL version 3, you are granted additional
+permissions described in the GCC Runtime Library Exception, version
+3.1, as published by the Free Software Foundation.
+
+You should have received a copy of the GNU General Public License and
+a copy of the GCC Runtime Library Exception along with this program;
+see the files COPYING3 and COPYING.RUNTIME respectively. If not, see
+<http://www.gnu.org/licenses/>. */
+
+Enum(stringop_alg) String(rep_byte) Value(rep_prefix_1_byte)
+
+#undef DEF_ENUM
+#define DEF_ENUM EnumValue
+
+#undef DEF_ALG
+#define DEF_ALG(alg, name) Enum(stringop_alg) String(name) Value(alg)
+
+#include "stringop.def"
+
+#undef DEF_ENUM
+#undef DEF_ALG
Index: config/i386/i386.opt
===================================================================
--- config/i386/i386.opt (revision 201540)
+++ config/i386/i386.opt (working copy)
@@ -316,6 +316,14 @@ mstack-arg-probe
Target Report Mask(STACK_PROBE) Save
Enable stack probing
+mmemcpy-strategy=
+Target RejectNegative Joined Var(ix86_tune_memcpy_strategy)
+Specify memcpy expansion strategy when expected size is known
+
+mmemset-strategy=
+Target RejectNegative Joined Var(ix86_tune_memset_strategy)
+Specify memset expansion strategy when expected size is known
+
mstringop-strategy=
Target RejectNegative Joined Enum(stringop_alg) Var(ix86_stringop_alg) Init(no_stringop)
Chose strategy to generate stringop using
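
For readers who want to experiment with the strategy string format outside of GCC, here is a minimal standalone sketch of the parsing rules described in the comment above ix86_parse_stringop_strategy_string. It is not the code from the patch: parse_strategy, pick_alg, struct size_range and MAX_RANGES are hypothetical names made up for illustration, algorithm names are kept as plain text instead of being mapped to stringop_alg, and the sketch is slightly stricter than the patch in that it rejects any range listed after a -1 entry.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical cap on the number of ranges; stands in for
   MAX_STRINGOP_ALGS, whose real value is not assumed here.  */
#define MAX_RANGES 4

struct size_range
{
  int max;        /* Inclusive upper bound; -1 means "no upper limit".  */
  char alg[32];   /* Algorithm name, kept as text in this sketch.  */
  int noalign;    /* 1 for "noalign", 0 for "align".  */
};

/* Parse SPEC ("alg:max:align|noalign", comma separated) into OUT.
   Return the number of ranges parsed, or -1 on any error.  */
int
parse_strategy (const char *spec, struct size_range out[MAX_RANGES])
{
  char *copy, *curr, *next;
  int n = 0;

  assert (spec != NULL && out != NULL);
  copy = malloc (strlen (spec) + 1);
  if (!copy)
    return -1;
  strcpy (copy, spec);

  for (curr = copy; curr; curr = next)
    {
      char align[16];
      int maxs;

      /* Check the bound before writing out[n].  */
      if (n == MAX_RANGES)
        goto fail;

      next = strchr (curr, ',');
      if (next)
        *next++ = '\0';

      /* Field widths keep sscanf from overrunning the buffers.  */
      if (sscanf (curr, "%31[^:]:%d:%15s", out[n].alg, &maxs, align) != 3)
        goto fail;

      /* Upper bounds must increase; nothing may follow a -1 range.  */
      if (n > 0
          && (out[n - 1].max == -1
              || (maxs != -1 && maxs <= out[n - 1].max)))
        goto fail;

      if (strcmp (align, "align") == 0)
        out[n].noalign = 0;
      else if (strcmp (align, "noalign") == 0)
        out[n].noalign = 1;
      else
        goto fail;

      out[n].max = maxs;
      n++;
    }

  /* The last range must be open-ended.  */
  if (out[n - 1].max != -1)
    goto fail;

  free (copy);
  return n;

fail:
  free (copy);
  return -1;
}

/* Return the algorithm name covering SIZE, or NULL if there is none.  */
const char *
pick_alg (const struct size_range *r, int n, int size)
{
  for (int i = 0; i < n; i++)
    if (r[i].max == -1 || size <= r[i].max)
      return r[i].alg;
  return NULL;
}
```

With the second example string from the comment, parse_strategy should yield three ranges, and pick_alg should select rep_8byte for sizes up to 16, vector_loop up to 2048, and libcall beyond that.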