This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [4/6] Optionally pick the cheapest loop_vec_info
- From: Richard Sandiford <richard dot sandiford at arm dot com>
- To: Richard Biener <richard dot guenther at gmail dot com>
- Cc: "gcc-patches@gcc.gnu.org" <gcc-patches at gcc dot gnu dot org>
- Date: Thu, 07 Nov 2019 17:15:27 +0000
- Subject: Re: [4/6] Optionally pick the cheapest loop_vec_info
- References: <mptbltqtn9b.fsf@arm.com> <mpttv7is8fp.fsf@arm.com> <CAFiYyc3eZvCx1Rw6mg3L=fhk7TEzVrC877diUPUc7Cfof-pHsw@mail.gmail.com> <mptpni5nlxn.fsf@arm.com> <CAFiYyc24cDs9qgCPJ4E3OcVFZ7n+T7H7r=gyUbB+02E=+dkCyw@mail.gmail.com>
Richard Biener <richard.guenther@gmail.com> writes:
> On Wed, Nov 6, 2019 at 3:01 PM Richard Sandiford
> <richard.sandiford@arm.com> wrote:
>>
>> Richard Biener <richard.guenther@gmail.com> writes:
>> > On Tue, Nov 5, 2019 at 3:29 PM Richard Sandiford
>> > <Richard.Sandiford@arm.com> wrote:
>> >>
>> >> This patch adds a mode in which the vectoriser tries each available
>> >> base vector mode and picks the one with the lowest cost. For now
>> >> the behaviour is behind a default-off --param, but a later patch
>> >> enables it by default for SVE.
>> >>
>> >> The patch keeps the current behaviour of preferring a VF of
>> >> loop->simdlen over any larger or smaller VF, regardless of costs
>> >> or target preferences.
>> >
>> > Can you avoid using a --param for this? Instead I'd suggest to
>> > amend the vectorize_modes target hook to return some
>> > flags like VECT_FIRST_MODE_WINS. We'd eventually want
>> > to make the target able to say do-not-vectorize-epilogues-of-MODE
>> > (I think we may not want to vectorize SSE vectorized loop
>> > epilogues with MMX-with-SSE or GPRs for example). I guess
>> > for the latter we'd use a new target hook.
>>
>> The reason for using a --param was that I wanted a way of turning
>> this on and off on the command line, so that users can experiment
>> with it if necessary. E.g. enabling the --param could be a viable
>> alternative to -mprefer-* in some cases. Disabling it would be
>> a way of working around a bad cost model decision without going
>> all the way to -fno-vect-cost-model.
>>
>> These kinds of --params can become useful workarounds until an
>> optimisation bug is fixed.
>
> I'm arguing that the default depends on the actual ISAs so there isn't
> a one-size-fits-all and given we have OMP SIMD and target cloning for
> multiple ISAs this looks like a wrong approach. For sure the
> target can use its own switches to override defaults here, or alternatively
> we might want to have a #pragma GCC simdlen mimicking OMP behavior
> here.

I agree there's no one-size-fits-all choice here, but that's true for
other --params too. The problem with using target switches is that we
have to explain them and to keep accepting them "forever" (or at least
with a long deprecation period). Whereas the --param was just something
that people could play with or perhaps use to work around problems
temporarily. It would come with no guarantees attached. And the
--param applied to any target that supports multiple modes,
regardless of what the target does by default.

All that said, here's a version that returns the bitmask you suggested.
I ended up making the flag select the new behaviour and 0 select the
current behaviour, rather than have a flag for "first mode wins".
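
As a rough illustration of the flag convention described above (a set bit selects the new behaviour, 0 keeps the current one), here is a minimal standalone sketch. It is not GCC code: `mode_list`, `example_autovectorize_vector_modes`, `legacy_autovectorize_vector_modes` and `use_lowest_cost_p` are hypothetical names, and the mode names are only string labels. Only `VECT_COMPARE_COSTS` and its value are taken from the patch.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Same name and value as the constant the patch adds to target.h.
const unsigned int VECT_COMPARE_COSTS = 1U << 0;

// Hypothetical stand-in for GCC's vector_modes; modes are plain labels here.
typedef std::vector<std::string> mode_list;

// Hypothetical hook for a target that wants all modes costed.
unsigned int
example_autovectorize_vector_modes (mode_list *modes, bool)
{
  modes->push_back ("VNx16QI");
  modes->push_back ("V16QI");
  return VECT_COMPARE_COSTS;
}

// Hypothetical hook for a target that keeps "first working mode wins".
unsigned int
legacy_autovectorize_vector_modes (mode_list *modes, bool)
{
  modes->push_back ("V16QI");
  return 0;
}

// Assumed caller-side test, modelled on pick_lowest_cost_p in the patch:
// the flag only has an effect when the cost model is not unlimited.
bool
use_lowest_cost_p (unsigned int flags, bool unlimited_cost_model)
{
  return (flags & VECT_COMPARE_COSTS) != 0 && !unlimited_cost_model;
}
```

Returning a bitmask rather than a bool leaves room for further flags later without another change to the hook signature.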
Tested as before.

Thanks,
Richard

2019-11-07 Richard Sandiford <richard.sandiford@arm.com>

gcc/
* target.h (VECT_COMPARE_COSTS): New constant.
* target.def (autovectorize_vector_modes): Return a bitmask of flags.
* doc/tm.texi: Regenerate.
* targhooks.h (default_autovectorize_vector_modes): Update accordingly.
* targhooks.c (default_autovectorize_vector_modes): Likewise.
* config/aarch64/aarch64.c (aarch64_autovectorize_vector_modes):
Likewise.
* config/arc/arc.c (arc_autovectorize_vector_modes): Likewise.
* config/arm/arm.c (arm_autovectorize_vector_modes): Likewise.
* config/i386/i386.c (ix86_autovectorize_vector_modes): Likewise.
* config/mips/mips.c (mips_autovectorize_vector_modes): Likewise.
* tree-vectorizer.h (_loop_vec_info::vec_outside_cost)
(_loop_vec_info::vec_inside_cost): New member variables.
* tree-vect-loop.c (_loop_vec_info::_loop_vec_info): Initialize them.
(vect_better_loop_vinfo_p, vect_joust_loop_vinfos): New functions.
(vect_analyze_loop): When autovectorize_vector_modes returns
VECT_COMPARE_COSTS, try vectorizing the loop with each available
vector mode and pick the one with the lowest cost.
(vect_estimate_min_profitable_iters): Record the computed costs
in the loop_vec_info.
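
The comparison that vect_better_loop_vinfo_p performs in the patch below can be sketched with plain integers. This is a simplified model under stated assumptions, not the GCC implementation: VFs are compile-time constants rather than poly_ints (so the maybe_lt/known_lt distinction disappears), the simdlen preference is left out, and `candidate` and `better_candidate_p` are hypothetical names. It shows the cross-multiplication used to compare inside_cost / vf without fractions, the clamp to the estimated iteration count, and the tie-break on the prologue/epilogue cost.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified stand-in for the fields of loop_vec_info
// that the comparison reads.
struct candidate
{
  int64_t vf;        // vectorization factor (a constant in this sketch)
  int inside_cost;   // cost of the vector loop body
  int outside_cost;  // cost of the vector prologue and epilogue
};

// Return true if NEW_C looks better than OLD_C.  OLD_C wins unless
// something specifically favours NEW_C.  Pass -1 for
// ESTIMATED_MAX_NITER when no estimate is available.
bool
better_candidate_p (const candidate &new_c, const candidate &old_c,
                    int64_t estimated_max_niter)
{
  int64_t new_vf = new_c.vf;
  int64_t old_vf = old_c.vf;

  // A VF larger than the likely iteration count gives no extra benefit.
  if (estimated_max_niter != -1)
    {
      if (estimated_max_niter <= new_vf)
        new_vf = estimated_max_niter;
      if (estimated_max_niter <= old_vf)
        old_vf = estimated_max_niter;
    }

  // Compare new_inside / new_vf with old_inside / old_vf by
  // cross-multiplying, which avoids fractional arithmetic.
  int64_t rel_new = new_c.inside_cost * old_vf;
  int64_t rel_old = old_c.inside_cost * new_vf;
  if (rel_old < rel_new)
    return false;
  if (rel_new < rel_old)
    return true;

  // Equal body cost per scalar iteration: fall back to the outside cost.
  if (new_c.outside_cost != old_c.outside_cost)
    return new_c.outside_cost < old_c.outside_cost;

  return false;
}
```

For example, a body cost of 12 at VF 8 (1.5 per scalar iteration) beats a body cost of 10 at VF 4 (2.5 per scalar iteration), because 12 * 4 = 48 is less than 10 * 8 = 80.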
Index: gcc/target.h
===================================================================
--- gcc/target.h 2019-11-07 15:11:15.831017985 +0000
+++ gcc/target.h 2019-11-07 16:52:30.037198353 +0000
@@ -218,6 +218,14 @@ enum omp_device_kind_arch_isa {
omp_device_isa
};
+/* Flags returned by TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_MODES:
+
+ VECT_COMPARE_COSTS
+ Tells the loop vectorizer to try all the provided modes and
+ pick the one with the lowest cost. By default the vectorizer
+ will choose the first mode that works. */
+const unsigned int VECT_COMPARE_COSTS = 1U << 0;
+
/* The target structure. This holds all the backend hooks. */
#define DEFHOOKPOD(NAME, DOC, TYPE, INIT) TYPE NAME;
#define DEFHOOK(NAME, DOC, TYPE, PARAMS, INIT) TYPE (* NAME) PARAMS;
Index: gcc/target.def
===================================================================
--- gcc/target.def 2019-11-07 15:11:15.819018071 +0000
+++ gcc/target.def 2019-11-07 16:52:30.037198353 +0000
@@ -1925,10 +1925,20 @@ element mode.\n\
If @var{all} is true, add suitable vector modes even when they are generally\n\
not expected to be worthwhile.\n\
\n\
+The hook returns a bitmask of flags that control how the modes in\n\
+@var{modes} are used. The flags are:\n\
+@table @code\n\
+@item VECT_COMPARE_COSTS\n\
+Tells the loop vectorizer to try all the provided modes and pick the one\n\
+with the lowest cost. By default the vectorizer will choose the first\n\
+mode that works.\n\
+@end table\n\
+\n\
The hook does not need to do anything if the vector returned by\n\
@code{TARGET_VECTORIZE_PREFERRED_SIMD_MODE} is the only one relevant\n\
-for autovectorization. The default implementation does nothing.",
- void,
+for autovectorization. The default implementation adds no modes and\n\
+returns 0.",
+ unsigned int,
(vector_modes *modes, bool all),
default_autovectorize_vector_modes)
Index: gcc/doc/tm.texi
===================================================================
--- gcc/doc/tm.texi 2019-11-07 15:11:15.779018354 +0000
+++ gcc/doc/tm.texi 2019-11-07 16:52:30.037198353 +0000
@@ -6016,7 +6016,7 @@ against lower halves of vectors recursiv
reached. The default is @var{mode} which means no splitting.
@end deftypefn
-@deftypefn {Target Hook} void TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_MODES (vector_modes *@var{modes}, bool @var{all})
+@deftypefn {Target Hook} {unsigned int} TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_MODES (vector_modes *@var{modes}, bool @var{all})
If using the mode returned by @code{TARGET_VECTORIZE_PREFERRED_SIMD_MODE}
is not the only approach worth considering, this hook should add one mode to
@var{modes} for each useful alternative approach. These modes are then
@@ -6032,9 +6032,19 @@ element mode.
If @var{all} is true, add suitable vector modes even when they are generally
not expected to be worthwhile.
+The hook returns a bitmask of flags that control how the modes in
+@var{modes} are used. The flags are:
+@table @code
+@item VECT_COMPARE_COSTS
+Tells the loop vectorizer to try all the provided modes and pick the one
+with the lowest cost. By default the vectorizer will choose the first
+mode that works.
+@end table
+
The hook does not need to do anything if the vector returned by
@code{TARGET_VECTORIZE_PREFERRED_SIMD_MODE} is the only one relevant
-for autovectorization. The default implementation does nothing.
+for autovectorization. The default implementation adds no modes and
+returns 0.
@end deftypefn
@deftypefn {Target Hook} opt_machine_mode TARGET_VECTORIZE_RELATED_MODE (machine_mode @var{vector_mode}, scalar_mode @var{element_mode}, poly_uint64 @var{nunits})
Index: gcc/targhooks.h
===================================================================
--- gcc/targhooks.h 2019-11-07 15:11:15.831017985 +0000
+++ gcc/targhooks.h 2019-11-07 16:52:30.041198324 +0000
@@ -113,7 +113,7 @@ default_builtin_support_vector_misalignm
int, bool);
extern machine_mode default_preferred_simd_mode (scalar_mode mode);
extern machine_mode default_split_reduction (machine_mode);
-extern void default_autovectorize_vector_modes (vector_modes *, bool);
+extern unsigned int default_autovectorize_vector_modes (vector_modes *, bool);
extern opt_machine_mode default_vectorize_related_mode (machine_mode,
scalar_mode,
poly_uint64);
Index: gcc/targhooks.c
===================================================================
--- gcc/targhooks.c 2019-11-07 15:11:15.831017985 +0000
+++ gcc/targhooks.c 2019-11-07 16:52:30.041198324 +0000
@@ -1301,9 +1301,10 @@ default_split_reduction (machine_mode mo
/* By default only the preferred vector mode is tried. */
-void
+unsigned int
default_autovectorize_vector_modes (vector_modes *, bool)
{
+ return 0;
}
/* The default implementation of TARGET_VECTORIZE_RELATED_MODE. */
Index: gcc/config/aarch64/aarch64.c
===================================================================
--- gcc/config/aarch64/aarch64.c 2019-11-07 15:11:19.442992405 +0000
+++ gcc/config/aarch64/aarch64.c 2019-11-07 16:52:30.021198461 +0000
@@ -15949,7 +15949,7 @@ aarch64_preferred_simd_mode (scalar_mode
/* Return a list of possible vector sizes for the vectorizer
to iterate over. */
-static void
+static unsigned int
aarch64_autovectorize_vector_modes (vector_modes *modes, bool)
{
if (TARGET_SVE)
@@ -15975,6 +15975,8 @@ aarch64_autovectorize_vector_modes (vect
TODO: We could similarly support limited forms of V2QImode and V2HImode
for this case. */
modes->safe_push (V2SImode);
+
+ return 0;
}
/* Implement TARGET_MANGLE_TYPE. */
Index: gcc/config/arc/arc.c
===================================================================
--- gcc/config/arc/arc.c 2019-11-07 15:11:15.599019628 +0000
+++ gcc/config/arc/arc.c 2019-11-07 16:52:30.021198461 +0000
@@ -609,7 +609,7 @@ arc_preferred_simd_mode (scalar_mode mod
/* Implements target hook
TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_MODES. */
-static void
+static unsigned int
arc_autovectorize_vector_modes (vector_modes *modes, bool)
{
if (TARGET_PLUS_QMACW)
@@ -617,6 +617,7 @@ arc_autovectorize_vector_modes (vector_m
modes->quick_push (V4HImode);
modes->quick_push (V2HImode);
}
+ return 0;
}
Index: gcc/config/arm/arm.c
===================================================================
--- gcc/config/arm/arm.c 2019-11-07 15:11:15.683019033 +0000
+++ gcc/config/arm/arm.c 2019-11-07 16:52:30.029198406 +0000
@@ -289,7 +289,7 @@ static bool arm_builtin_support_vector_m
static void arm_conditional_register_usage (void);
static enum flt_eval_method arm_excess_precision (enum excess_precision_type);
static reg_class_t arm_preferred_rename_class (reg_class_t rclass);
-static void arm_autovectorize_vector_modes (vector_modes *, bool);
+static unsigned int arm_autovectorize_vector_modes (vector_modes *, bool);
static int arm_default_branch_cost (bool, bool);
static int arm_cortex_a5_branch_cost (bool, bool);
static int arm_cortex_m_branch_cost (bool, bool);
@@ -29015,7 +29015,7 @@ arm_vector_alignment (const_tree type)
return align;
}
-static void
+static unsigned int
arm_autovectorize_vector_modes (vector_modes *modes, bool)
{
if (!TARGET_NEON_VECTORIZE_DOUBLE)
@@ -29023,6 +29023,7 @@ arm_autovectorize_vector_modes (vector_m
modes->safe_push (V16QImode);
modes->safe_push (V8QImode);
}
+ return 0;
}
static bool
Index: gcc/config/i386/i386.c
===================================================================
--- gcc/config/i386/i386.c 2019-11-07 15:11:15.715018807 +0000
+++ gcc/config/i386/i386.c 2019-11-07 16:52:30.033198382 +0000
@@ -21385,7 +21385,7 @@ ix86_preferred_simd_mode (scalar_mode mo
vectors. If AVX512F is enabled then try vectorizing with 512bit,
256bit and 128bit vectors. */
-static void
+static unsigned int
ix86_autovectorize_vector_modes (vector_modes *modes, bool all)
{
if (TARGET_AVX512F && !TARGET_PREFER_AVX256)
@@ -21415,6 +21415,8 @@ ix86_autovectorize_vector_modes (vector_
if (TARGET_MMX_WITH_SSE)
modes->safe_push (V8QImode);
+
+ return 0;
}
/* Implemenation of targetm.vectorize.get_mask_mode. */
Index: gcc/config/mips/mips.c
===================================================================
--- gcc/config/mips/mips.c 2019-11-07 15:11:15.755018524 +0000
+++ gcc/config/mips/mips.c 2019-11-07 16:52:30.037198353 +0000
@@ -13455,11 +13455,12 @@ mips_preferred_simd_mode (scalar_mode mo
/* Implement TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_MODES. */
-static void
+static unsigned int
mips_autovectorize_vector_modes (vector_modes *modes, bool)
{
if (ISA_HAS_MSA)
modes->safe_push (V16QImode);
+ return 0;
}
/* Implement TARGET_INIT_LIBFUNCS. */
Index: gcc/tree-vectorizer.h
===================================================================
--- gcc/tree-vectorizer.h 2019-11-07 16:52:25.000000000 +0000
+++ gcc/tree-vectorizer.h 2019-11-07 16:52:30.041198324 +0000
@@ -601,6 +601,13 @@ typedef class _loop_vec_info : public ve
/* Cost of a single scalar iteration. */
int single_scalar_iteration_cost;
+ /* The cost of the vector prologue and epilogue, including peeled
+ iterations and set-up code. */
+ int vec_outside_cost;
+
+ /* The cost of the vector loop body. */
+ int vec_inside_cost;
+
/* Is the loop vectorizable? */
bool vectorizable;
Index: gcc/tree-vect-loop.c
===================================================================
--- gcc/tree-vect-loop.c 2019-11-07 16:52:25.000000000 +0000
+++ gcc/tree-vect-loop.c 2019-11-07 16:52:30.041198324 +0000
@@ -830,6 +830,8 @@ _loop_vec_info::_loop_vec_info (class lo
scan_map (NULL),
slp_unrolling_factor (1),
single_scalar_iteration_cost (0),
+ vec_outside_cost (0),
+ vec_inside_cost (0),
vectorizable (false),
can_fully_mask_p (true),
fully_masked_p (false),
@@ -2375,6 +2377,80 @@ vect_analyze_loop_2 (loop_vec_info loop_
goto start_over;
}
+/* Return true if vectorizing a loop using NEW_LOOP_VINFO appears
+ to be better than vectorizing it using OLD_LOOP_VINFO. Assume that
+ OLD_LOOP_VINFO is better unless something specifically indicates
+ otherwise.
+
+ Note that this deliberately isn't a partial order. */
+
+static bool
+vect_better_loop_vinfo_p (loop_vec_info new_loop_vinfo,
+ loop_vec_info old_loop_vinfo)
+{
+ struct loop *loop = LOOP_VINFO_LOOP (new_loop_vinfo);
+ gcc_assert (LOOP_VINFO_LOOP (old_loop_vinfo) == loop);
+
+ poly_int64 new_vf = LOOP_VINFO_VECT_FACTOR (new_loop_vinfo);
+ poly_int64 old_vf = LOOP_VINFO_VECT_FACTOR (old_loop_vinfo);
+
+ /* Always prefer a VF of loop->simdlen over any other VF. */
+ if (loop->simdlen)
+ {
+ bool new_simdlen_p = known_eq (new_vf, loop->simdlen);
+ bool old_simdlen_p = known_eq (old_vf, loop->simdlen);
+ if (new_simdlen_p != old_simdlen_p)
+ return new_simdlen_p;
+ }
+
+ /* Limit the VFs to what is likely to be the maximum number of iterations,
+ to handle cases in which at least one loop_vinfo is fully-masked. */
+ HOST_WIDE_INT estimated_max_niter = likely_max_stmt_executions_int (loop);
+ if (estimated_max_niter != -1)
+ {
+ if (known_le (estimated_max_niter, new_vf))
+ new_vf = estimated_max_niter;
+ if (known_le (estimated_max_niter, old_vf))
+ old_vf = estimated_max_niter;
+ }
+
+ /* Check whether the (fractional) cost per scalar iteration is lower
+ or higher: new_inside_cost / new_vf vs. old_inside_cost / old_vf. */
+ poly_widest_int rel_new = (new_loop_vinfo->vec_inside_cost
+ * poly_widest_int (old_vf));
+ poly_widest_int rel_old = (old_loop_vinfo->vec_inside_cost
+ * poly_widest_int (new_vf));
+ if (maybe_lt (rel_old, rel_new))
+ return false;
+ if (known_lt (rel_new, rel_old))
+ return true;
+
+ /* If there's nothing to choose between the loop bodies, see whether
+ there's a difference in the prologue and epilogue costs. */
+ if (new_loop_vinfo->vec_outside_cost != old_loop_vinfo->vec_outside_cost)
+ return new_loop_vinfo->vec_outside_cost < old_loop_vinfo->vec_outside_cost;
+
+ return false;
+}
+
+/* Decide whether to replace OLD_LOOP_VINFO with NEW_LOOP_VINFO. Return
+ true if we should. */
+
+static bool
+vect_joust_loop_vinfos (loop_vec_info new_loop_vinfo,
+ loop_vec_info old_loop_vinfo)
+{
+ if (!vect_better_loop_vinfo_p (new_loop_vinfo, old_loop_vinfo))
+ return false;
+
+ if (dump_enabled_p ())
+ dump_printf_loc (MSG_NOTE, vect_location,
+ "***** Preferring vector mode %s to vector mode %s\n",
+ GET_MODE_NAME (new_loop_vinfo->vector_mode),
+ GET_MODE_NAME (old_loop_vinfo->vector_mode));
+ return true;
+}
+
/* Function vect_analyze_loop.
Apply a set of analyses on LOOP, and create a loop_vec_info struct
@@ -2386,8 +2462,9 @@ vect_analyze_loop (class loop *loop, vec
auto_vector_modes vector_modes;
/* Autodetect first vector size we try. */
- targetm.vectorize.autovectorize_vector_modes (&vector_modes,
- loop->simdlen != 0);
+ unsigned int autovec_flags
+ = targetm.vectorize.autovectorize_vector_modes (&vector_modes,
+ loop->simdlen != 0);
unsigned int mode_i = 0;
DUMP_VECT_SCOPE ("analyze_loop_nest");
@@ -2410,6 +2487,8 @@ vect_analyze_loop (class loop *loop, vec
machine_mode next_vector_mode = VOIDmode;
poly_uint64 lowest_th = 0;
unsigned vectorized_loops = 0;
+ bool pick_lowest_cost_p = ((autovec_flags & VECT_COMPARE_COSTS)
+ && !unlimited_cost_model (loop));
bool vect_epilogues = false;
opt_result res = opt_result::success ();
@@ -2430,6 +2509,34 @@ vect_analyze_loop (class loop *loop, vec
bool fatal = false;
+ /* When pick_lowest_cost_p is true, we should in principle iterate
+ over all the loop_vec_infos that LOOP_VINFO could replace and
+ try to vectorize LOOP_VINFO under the same conditions.
+ E.g. when trying to replace an epilogue loop, we should vectorize
+ LOOP_VINFO as an epilogue loop with the same VF limit. When trying
+ to replace the main loop, we should vectorize LOOP_VINFO as a main
+ loop too.
+
+ However, autovectorize_vector_modes is usually sorted as follows:
+
+ - Modes that naturally produce lower VFs usually follow modes that
+ naturally produce higher VFs.
+
+ - When modes naturally produce the same VF, maskable modes
+ usually follow unmaskable ones, so that the maskable mode
+ can be used to vectorize the epilogue of the unmaskable mode.
+
+ This order is preferred because it leads to the maximum
+ epilogue vectorization opportunities. Targets should only use
+ a different order if they want to make wide modes available while
+ disparaging them relative to earlier, smaller modes. The assumption
+ in that case is that the wider modes are more expensive in some
+ way that isn't reflected directly in the costs.
+
+ There should therefore be few interesting cases in which
+ LOOP_VINFO fails when treated as an epilogue loop, succeeds when
+ treated as a standalone loop, and ends up being genuinely cheaper
+ than FIRST_LOOP_VINFO. */
if (vect_epilogues)
LOOP_VINFO_ORIG_LOOP_INFO (loop_vinfo) = first_loop_vinfo;
@@ -2477,13 +2584,34 @@ vect_analyze_loop (class loop *loop, vec
LOOP_VINFO_ORIG_LOOP_INFO (loop_vinfo) = NULL;
simdlen = 0;
}
+ else if (pick_lowest_cost_p && first_loop_vinfo)
+ {
+ /* Keep trying to roll back vectorization attempts while the
+ loop_vec_infos they produced were worse than this one. */
+ vec<loop_vec_info> &vinfos = first_loop_vinfo->epilogue_vinfos;
+ while (!vinfos.is_empty ()
+ && vect_joust_loop_vinfos (loop_vinfo, vinfos.last ()))
+ {
+ gcc_assert (vect_epilogues);
+ delete vinfos.pop ();
+ }
+ if (vinfos.is_empty ()
+ && vect_joust_loop_vinfos (loop_vinfo, first_loop_vinfo))
+ {
+ delete first_loop_vinfo;
+ first_loop_vinfo = opt_loop_vec_info::success (NULL);
+ LOOP_VINFO_ORIG_LOOP_INFO (loop_vinfo) = NULL;
+ }
+ }
if (first_loop_vinfo == NULL)
{
first_loop_vinfo = loop_vinfo;
lowest_th = LOOP_VINFO_VERSIONING_THRESHOLD (first_loop_vinfo);
}
- else if (vect_epilogues)
+ else if (vect_epilogues
+ /* For now only allow one epilogue loop. */
+ && first_loop_vinfo->epilogue_vinfos.is_empty ())
{
first_loop_vinfo->epilogue_vinfos.safe_push (loop_vinfo);
poly_uint64 th = LOOP_VINFO_VERSIONING_THRESHOLD (loop_vinfo);
@@ -2503,12 +2631,14 @@ vect_analyze_loop (class loop *loop, vec
&& loop->inner == NULL
&& PARAM_VALUE (PARAM_VECT_EPILOGUES_NOMASK)
&& LOOP_VINFO_PEELING_FOR_NITER (first_loop_vinfo)
- /* For now only allow one epilogue loop. */
- && first_loop_vinfo->epilogue_vinfos.is_empty ());
+ /* For now only allow one epilogue loop, but allow
+ pick_lowest_cost_p to replace it. */
+ && (first_loop_vinfo->epilogue_vinfos.is_empty ()
+ || pick_lowest_cost_p));
/* Commit to first_loop_vinfo if we have no reason to try
alternatives. */
- if (!simdlen && !vect_epilogues)
+ if (!simdlen && !vect_epilogues && !pick_lowest_cost_p)
break;
}
else
@@ -3467,7 +3597,11 @@ vect_estimate_min_profitable_iters (loop
&vec_inside_cost, &vec_epilogue_cost);
vec_outside_cost = (int)(vec_prologue_cost + vec_epilogue_cost);
-
+
+ /* Stash the costs so that we can compare two loop_vec_infos. */
+ loop_vinfo->vec_inside_cost = vec_inside_cost;
+ loop_vinfo->vec_outside_cost = vec_outside_cost;
+
if (dump_enabled_p ())
{
dump_printf_loc (MSG_NOTE, vect_location, "Cost model analysis: \n");
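
The roll-back logic added to vect_analyze_loop above can be modelled in isolation. The sketch below is only an approximation under several assumptions: `candidate`, `selection`, `better_p` and `consider` are hypothetical names, the mode labels are arbitrary, `better_p` compares body cost per scalar iteration only, and nothing is re-analysed when a candidate changes role from epilogue to main loop. It shows a new candidate displacing a worse epilogue and then a worse main loop, with at most one epilogue kept.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for a loop_vec_info.
struct candidate
{
  std::string mode;  // label only
  long vf;           // constant vectorization factor
  int inside_cost;   // cost of the vector loop body
};

// Simplified stand-in for vect_better_loop_vinfo_p: a strictly lower
// body cost per scalar iteration wins, ties keep the old candidate.
bool
better_p (const candidate &new_c, const candidate &old_c)
{
  return new_c.inside_cost * old_c.vf < old_c.inside_cost * new_c.vf;
}

struct selection
{
  bool has_main = false;
  candidate main_loop;
  std::vector<candidate> epilogues;
};

// Offer candidate C to SEL, rolling back earlier choices that C beats.
void
consider (selection &sel, const candidate &c, bool allow_epilogue)
{
  if (sel.has_main)
    {
      // Drop epilogue candidates that are worse than C.
      while (!sel.epilogues.empty () && better_p (c, sel.epilogues.back ()))
        sel.epilogues.pop_back ();
      // With no epilogues left, C may also displace the main loop.
      if (sel.epilogues.empty () && better_p (c, sel.main_loop))
        sel.has_main = false;
    }

  if (!sel.has_main)
    {
      sel.main_loop = c;
      sel.has_main = true;
    }
  else if (allow_epilogue && sel.epilogues.empty ())
    // Only one epilogue loop is kept; otherwise C is discarded.
    sel.epilogues.push_back (c);
}
```

Offering candidates with (vf, inside_cost) of (16, 32), (8, 24) and (4, 4) in that order first records the second as an epilogue of the first, then lets the third replace both, since it costs 1 per scalar iteration against 3 and 2.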