This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH] Move profiling to SSA
- From: Jan Hubicka <hubicka at ucw dot cz>
- To: Richard Guenther <rguenther at suse dot de>
- Cc: gcc-patches at gcc dot gnu dot org, Jan Hubicka <jh at suse dot de>
- Date: Fri, 24 Sep 2010 00:24:29 +0200
- Subject: Re: [PATCH] Move profiling to SSA
- References: <alpine.LNX.2.00.1009231740330.8982@zhemvz.fhfr.qr>
>
> This moves profile (and coverage) instrumentation to SSA, removing
> the need for a separate early inliner and thus eventually killing
> the last non-SSA inlining path (well, not quite yet I guess).
>
> The only issue is non-local gotos which need some new attribute
> for the profile functions, Honza had some patch for this that needs
> updating.
Hi,
thanks for working on this (it has been on my TODO list for quite a long time,
but IPA/LTO fixes still take priority).
This patch adds the leaf attribute. It is a broken-up and updated version of
http://gcc.gnu.org/ml/gcc-patches/2010-06/msg01329.html
It does not modify the non-C frontends to handle the attribute, nor does it
annotate the builtin definitions; Richard asked me to do that incrementally.
In addition to updating it to the current tree, I also added the
stmt_can_make_abnormal_goto code so Richard can actually use it to solve the
problem with profiling builtins.
While making the new testcase I noticed that ipa-reference also needs updating in
ipa_reference_get_not_read_global and ipa_reference_get_not_written_global.
I did not update RTL land to avoid construction of abnormal goto edges on leaf
functions. This is something that can be done incrementally using a REG_NOTE
annotating the call statement. It is probably not really important, however.
Bootstrapped/regtested x86_64-linux, OK?
Honza
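
As an illustration of what the attribute is meant to allow, here is a minimal
sketch (not part of the patch). The names example_library_tick,
store_then_leaf_call, sort_and_count and the EXAMPLE_LEAF macro are
hypothetical, and the "library" routine is defined in the same file only so
that the sketch links; per the documentation in the patch, the attribute has no
effect on functions defined in the current unit.

```c
#include <stddef.h>
#include <stdlib.h>

/* Expands to the attribute only on GNU-compatible compilers.  */
#if defined (__GNUC__)
# define EXAMPLE_LEAF __attribute__ ((leaf))
#else
# define EXAMPLE_LEAF
#endif

/* Unit-local state whose address never escapes this file.  */
static int local_static;
static int compare_calls;

/* Hypothetical library routine.  In real use it would live in another
   compilation unit; it is defined here only so the sketch links.  */
void example_library_tick (void) EXAMPLE_LEAF;
void example_library_tick (void) { }

/* A leaf call cannot read or write local_static, so a compiler that
   honours the attribute may fold the final load to the constant 9.  */
int
store_then_leaf_call (void)
{
  local_static = 9;
  example_library_tick ();
  return local_static;
}

/* Callback handed to qsort; it modifies unit-local state.  */
static int
compare_ints (const void *a, const void *b)
{
  int x = *(const int *) a, y = *(const int *) b;
  compare_calls++;
  return (x > y) - (x < y);
}

/* qsort calls back into this unit, so it must not be treated as leaf:
   compare_calls has to be reloaded after the call.  */
int
sort_and_count (int *v, size_t n)
{
  compare_calls = 0;
  qsort (v, n, sizeof *v, compare_ints);
  return compare_calls;
}
```

The first function mirrors the new testcase; the second shows why a function
taking a callback, such as qsort, cannot be marked leaf.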
* doc/extend.texi (attribute leaf): Document.
* tree.c (local_define_builtin): Handle ECF_LEAF.
(build_common_builtin_nodes): Set ECF_LEAF where needed.
* tree.h (ECF_LEAF): New.
* ipa-reference.c (propagate_bits): For leaf calls propagate even overwritable
and unavailable functions.
(ipa_init): Put all_module_statics into optimization_summary_obstack.
(copy_global_bitmap): Do not copy all_module_statics.
(read_write_all_from_decl): Use cgraph_node argument; handle ECF_LEAF.
(propagate): Handle overwritable and unavailable leaf functions;
initialize global info for overwritable and unavailable leaf functions;
do not free all module statics.
(ipa_reference_get_not_read_global, ipa_reference_get_not_written_global):
Leaf calls don't clobber local statics.
* calls.c (flags_from_decl_or_type): Handle leaf.
* tree-cfg.c (stmt_can_make_abnormal_goto): Leaf functions can't do
abnormal gotos.
* c-common.c (handle_leaf_attribute): New function.
(struct attribute_spec c_common_att): Add leaf.
* gcc.dg/tree-ssa/leaf.c: New testcase.
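
The decision made in read_write_all_from_decl after this change can be modelled
roughly as follows. This is a standalone sketch, not GCC code: the MODEL_*
names, the simplified availability enum and the model_effects helper are
assumptions made for illustration, and only the value of the leaf bit (1 << 10)
is taken from the patch.

```c
#include <stdbool.h>

/* Simplified stand-in for GCC's availability levels (assumption).  */
enum model_availability
{
  MODEL_AVAIL_NOT_AVAILABLE,
  MODEL_AVAIL_OVERWRITABLE,
  MODEL_AVAIL_AVAILABLE
};

/* Illustrative flag bits; only the leaf bit matches the patch.  */
#define MODEL_ECF_CONST (1 << 0)
#define MODEL_ECF_PURE  (1 << 1)
#define MODEL_ECF_LEAF  (1 << 10)

/* Result bits returned by model_effects.  */
#define MODEL_READ_ALL  1
#define MODEL_WRITE_ALL 2

/* Return which "assume everything" bits a call to a function with the
   given flags forces on the caller's summary.  */
int
model_effects (int flags, enum model_availability avail, bool cannot_return)
{
  /* A leaf function whose body we cannot see still cannot touch the
     unit's non-escaping statics, so nothing is forced.  */
  if ((flags & MODEL_ECF_LEAF) && avail <= MODEL_AVAIL_OVERWRITABLE)
    return 0;
  /* Const functions neither read nor write memory.  */
  if (flags & MODEL_ECF_CONST)
    return 0;
  /* Pure functions, and functions that never return, only force reads.  */
  if ((flags & MODEL_ECF_PURE) || cannot_return)
    return MODEL_READ_ALL;
  /* Otherwise assume the callee may read and write everything.  */
  return MODEL_READ_ALL | MODEL_WRITE_ALL;
}
```

For example, model_effects (MODEL_ECF_LEAF, MODEL_AVAIL_NOT_AVAILABLE, false)
yields 0, whereas the same call without the leaf bit yields both bits.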
Index: doc/extend.texi
===================================================================
*** doc/extend.texi (revision 164477)
--- doc/extend.texi (working copy)
*************** SRAM. The function will be put into a sp
*** 2671,2676 ****
--- 2671,2701 ----
@code{.l1.text}. With @option{-mfdpic}, callers of such functions will use
an inlined PLT.
+ @item leaf
+ @cindex @code{leaf} function attribute
+ Calls to external functions with this attribute must return to the current
+ compilation unit only by return or by exception handling. In particular, leaf
+ functions are not allowed to call callback functions passed to them from the
+ current compilation unit, directly call functions exported by the unit, or longjmp
+ into the unit. Leaf functions might still call functions from other compilation
+ units and thus they are not necessarily leaf in the sense that they contain no
+ function calls at all.
+
+ The attribute is intended for library functions to improve dataflow analysis.
+ The compiler takes the hint that any data not escaping the current compilation
+ unit cannot be used or modified by the leaf function. For example, function
+ @code{sin} is leaf, function @code{qsort} is not.
+
+ Note that leaf functions might invoke signals, and signal handlers might be
+ defined in the current compilation unit and use static variables. The only
+ compliant way to write such a signal handler is to declare such variables
+ @code{volatile}.
+
+ The attribute has no effect on functions defined within the current compilation
+ unit. This is to allow easy merging of multiple compilation units into one,
+ for example, by using link-time optimization. For this reason the
+ attribute is not allowed on types to annotate indirect calls.
+
@item long_call/short_call
@cindex indirect calls on ARM
This attribute specifies how a particular function is called on
Index: c-family/c-common.c
===================================================================
*** c-family/c-common.c (revision 164477)
--- c-family/c-common.c (working copy)
*************** static tree handle_hot_attribute (tree *
*** 308,313 ****
--- 308,314 ----
static tree handle_cold_attribute (tree *, tree, tree, int, bool *);
static tree handle_noinline_attribute (tree *, tree, tree, int, bool *);
static tree handle_noclone_attribute (tree *, tree, tree, int, bool *);
+ static tree handle_leaf_attribute (tree *, tree, tree, int, bool *);
static tree handle_always_inline_attribute (tree *, tree, tree, int,
bool *);
static tree handle_gnu_inline_attribute (tree *, tree, tree, int, bool *);
*************** const struct attribute_spec c_common_att
*** 570,575 ****
--- 571,578 ----
handle_noinline_attribute },
{ "noclone", 0, 0, true, false, false,
handle_noclone_attribute },
+ { "leaf", 0, 0, true, false, false,
+ handle_leaf_attribute },
{ "always_inline", 0, 0, true, false, false,
handle_always_inline_attribute },
{ "gnu_inline", 0, 0, true, false, false,
*************** handle_gnu_inline_attribute (tree *node,
*** 5870,5875 ****
--- 5873,5900 ----
*no_add_attrs = true;
}
+ return NULL_TREE;
+ }
+
+ /* Handle a "leaf" attribute; arguments as in
+ struct attribute_spec.handler. */
+
+ static tree
+ handle_leaf_attribute (tree *node, tree name,
+ tree ARG_UNUSED (args),
+ int ARG_UNUSED (flags), bool *no_add_attrs)
+ {
+ if (TREE_CODE (*node) != FUNCTION_DECL)
+ {
+ warning (OPT_Wattributes, "%qE attribute ignored", name);
+ *no_add_attrs = true;
+ }
+ if (!TREE_PUBLIC (*node))
+ {
+ warning (OPT_Wattributes, "%qE attribute has no effect on unit local functions", name);
+ *no_add_attrs = true;
+ }
+
return NULL_TREE;
}
Index: tree.c
===================================================================
*** tree.c (revision 164477)
--- tree.c (working copy)
*************** local_define_builtin (const char *name,
*** 9224,9229 ****
--- 9224,9232 ----
TREE_NOTHROW (decl) = 1;
if (ecf_flags & ECF_MALLOC)
DECL_IS_MALLOC (decl) = 1;
+ if (ecf_flags & ECF_LEAF)
+ DECL_ATTRIBUTES (decl) = tree_cons (get_identifier ("leaf"),
+ NULL, DECL_ATTRIBUTES (decl));
built_in_decls[code] = decl;
implicit_built_in_decls[code] = decl;
*************** build_common_builtin_nodes (void)
*** 9247,9256 ****
if (built_in_decls[BUILT_IN_MEMCPY] == NULL)
local_define_builtin ("__builtin_memcpy", ftype, BUILT_IN_MEMCPY,
! "memcpy", ECF_NOTHROW);
if (built_in_decls[BUILT_IN_MEMMOVE] == NULL)
local_define_builtin ("__builtin_memmove", ftype, BUILT_IN_MEMMOVE,
! "memmove", ECF_NOTHROW);
}
if (built_in_decls[BUILT_IN_MEMCMP] == NULL)
--- 9250,9259 ----
if (built_in_decls[BUILT_IN_MEMCPY] == NULL)
local_define_builtin ("__builtin_memcpy", ftype, BUILT_IN_MEMCPY,
! "memcpy", ECF_NOTHROW | ECF_LEAF);
if (built_in_decls[BUILT_IN_MEMMOVE] == NULL)
local_define_builtin ("__builtin_memmove", ftype, BUILT_IN_MEMMOVE,
! "memmove", ECF_NOTHROW | ECF_LEAF);
}
if (built_in_decls[BUILT_IN_MEMCMP] == NULL)
*************** build_common_builtin_nodes (void)
*** 9259,9265 ****
const_ptr_type_node, size_type_node,
NULL_TREE);
local_define_builtin ("__builtin_memcmp", ftype, BUILT_IN_MEMCMP,
! "memcmp", ECF_PURE | ECF_NOTHROW);
}
if (built_in_decls[BUILT_IN_MEMSET] == NULL)
--- 9262,9268 ----
const_ptr_type_node, size_type_node,
NULL_TREE);
local_define_builtin ("__builtin_memcmp", ftype, BUILT_IN_MEMCMP,
! "memcmp", ECF_PURE | ECF_NOTHROW | ECF_LEAF);
}
if (built_in_decls[BUILT_IN_MEMSET] == NULL)
*************** build_common_builtin_nodes (void)
*** 9268,9274 ****
ptr_type_node, integer_type_node,
size_type_node, NULL_TREE);
local_define_builtin ("__builtin_memset", ftype, BUILT_IN_MEMSET,
! "memset", ECF_NOTHROW);
}
if (built_in_decls[BUILT_IN_ALLOCA] == NULL)
--- 9271,9277 ----
ptr_type_node, integer_type_node,
size_type_node, NULL_TREE);
local_define_builtin ("__builtin_memset", ftype, BUILT_IN_MEMSET,
! "memset", ECF_NOTHROW | ECF_LEAF);
}
if (built_in_decls[BUILT_IN_ALLOCA] == NULL)
*************** build_common_builtin_nodes (void)
*** 9276,9282 ****
ftype = build_function_type_list (ptr_type_node,
size_type_node, NULL_TREE);
local_define_builtin ("__builtin_alloca", ftype, BUILT_IN_ALLOCA,
! "alloca", ECF_MALLOC | ECF_NOTHROW);
}
/* If we're checking the stack, `alloca' can throw. */
--- 9279,9285 ----
ftype = build_function_type_list (ptr_type_node,
size_type_node, NULL_TREE);
local_define_builtin ("__builtin_alloca", ftype, BUILT_IN_ALLOCA,
! "alloca", ECF_MALLOC | ECF_NOTHROW | ECF_LEAF);
}
/* If we're checking the stack, `alloca' can throw. */
*************** build_common_builtin_nodes (void)
*** 9288,9294 ****
ptr_type_node, NULL_TREE);
local_define_builtin ("__builtin_init_trampoline", ftype,
BUILT_IN_INIT_TRAMPOLINE,
! "__builtin_init_trampoline", ECF_NOTHROW);
ftype = build_function_type_list (ptr_type_node, ptr_type_node, NULL_TREE);
local_define_builtin ("__builtin_adjust_trampoline", ftype,
--- 9291,9297 ----
ptr_type_node, NULL_TREE);
local_define_builtin ("__builtin_init_trampoline", ftype,
BUILT_IN_INIT_TRAMPOLINE,
! "__builtin_init_trampoline", ECF_NOTHROW | ECF_LEAF);
ftype = build_function_type_list (ptr_type_node, ptr_type_node, NULL_TREE);
local_define_builtin ("__builtin_adjust_trampoline", ftype,
*************** build_common_builtin_nodes (void)
*** 9322,9333 ****
ftype = build_function_type_list (ptr_type_node, NULL_TREE);
local_define_builtin ("__builtin_stack_save", ftype, BUILT_IN_STACK_SAVE,
! "__builtin_stack_save", ECF_NOTHROW);
ftype = build_function_type_list (void_type_node, ptr_type_node, NULL_TREE);
local_define_builtin ("__builtin_stack_restore", ftype,
BUILT_IN_STACK_RESTORE,
! "__builtin_stack_restore", ECF_NOTHROW);
ftype = build_function_type_list (void_type_node, NULL_TREE);
local_define_builtin ("__builtin_profile_func_enter", ftype,
--- 9325,9336 ----
ftype = build_function_type_list (ptr_type_node, NULL_TREE);
local_define_builtin ("__builtin_stack_save", ftype, BUILT_IN_STACK_SAVE,
! "__builtin_stack_save", ECF_NOTHROW | ECF_LEAF);
ftype = build_function_type_list (void_type_node, ptr_type_node, NULL_TREE);
local_define_builtin ("__builtin_stack_restore", ftype,
BUILT_IN_STACK_RESTORE,
! "__builtin_stack_restore", ECF_NOTHROW | ECF_LEAF);
ftype = build_function_type_list (void_type_node, NULL_TREE);
local_define_builtin ("__builtin_profile_func_enter", ftype,
*************** build_common_builtin_nodes (void)
*** 9342,9348 ****
ftype = build_function_type_list (void_type_node, NULL_TREE);
local_define_builtin ("__builtin_cxa_end_cleanup", ftype,
BUILT_IN_CXA_END_CLEANUP,
! "__cxa_end_cleanup", ECF_NORETURN);
}
ftype = build_function_type_list (void_type_node, ptr_type_node, NULL_TREE);
--- 9345,9351 ----
ftype = build_function_type_list (void_type_node, NULL_TREE);
local_define_builtin ("__builtin_cxa_end_cleanup", ftype,
BUILT_IN_CXA_END_CLEANUP,
! "__cxa_end_cleanup", ECF_NORETURN | ECF_LEAF);
}
ftype = build_function_type_list (void_type_node, ptr_type_node, NULL_TREE);
*************** build_common_builtin_nodes (void)
*** 9361,9372 ****
ftype = build_function_type_list (ptr_type_node,
integer_type_node, NULL_TREE);
local_define_builtin ("__builtin_eh_pointer", ftype, BUILT_IN_EH_POINTER,
! "__builtin_eh_pointer", ECF_PURE | ECF_NOTHROW);
tmp = lang_hooks.types.type_for_mode (targetm.eh_return_filter_mode (), 0);
ftype = build_function_type_list (tmp, integer_type_node, NULL_TREE);
local_define_builtin ("__builtin_eh_filter", ftype, BUILT_IN_EH_FILTER,
! "__builtin_eh_filter", ECF_PURE | ECF_NOTHROW);
ftype = build_function_type_list (void_type_node,
integer_type_node, integer_type_node,
--- 9364,9375 ----
ftype = build_function_type_list (ptr_type_node,
integer_type_node, NULL_TREE);
local_define_builtin ("__builtin_eh_pointer", ftype, BUILT_IN_EH_POINTER,
! "__builtin_eh_pointer", ECF_PURE | ECF_NOTHROW | ECF_LEAF);
tmp = lang_hooks.types.type_for_mode (targetm.eh_return_filter_mode (), 0);
ftype = build_function_type_list (tmp, integer_type_node, NULL_TREE);
local_define_builtin ("__builtin_eh_filter", ftype, BUILT_IN_EH_FILTER,
! "__builtin_eh_filter", ECF_PURE | ECF_NOTHROW | ECF_LEAF);
ftype = build_function_type_list (void_type_node,
integer_type_node, integer_type_node,
*************** build_common_builtin_nodes (void)
*** 9408,9418 ****
built_in_names[mcode] = concat ("__mul", mode_name_buf, "3", NULL);
local_define_builtin (built_in_names[mcode], ftype, mcode,
! built_in_names[mcode], ECF_CONST | ECF_NOTHROW);
built_in_names[dcode] = concat ("__div", mode_name_buf, "3", NULL);
local_define_builtin (built_in_names[dcode], ftype, dcode,
! built_in_names[dcode], ECF_CONST | ECF_NOTHROW);
}
}
}
--- 9411,9421 ----
built_in_names[mcode] = concat ("__mul", mode_name_buf, "3", NULL);
local_define_builtin (built_in_names[mcode], ftype, mcode,
! built_in_names[mcode], ECF_CONST | ECF_NOTHROW | ECF_LEAF);
built_in_names[dcode] = concat ("__div", mode_name_buf, "3", NULL);
local_define_builtin (built_in_names[dcode], ftype, dcode,
! built_in_names[dcode], ECF_CONST | ECF_NOTHROW | ECF_LEAF);
}
}
}
Index: tree.h
===================================================================
*** tree.h (revision 164477)
--- tree.h (working copy)
*************** extern tree build_duplicate_type (tree);
*** 5224,5229 ****
--- 5224,5231 ----
/* Function does not read or write memory (but may have side effects, so
it does not necessarily fit ECF_CONST). */
#define ECF_NOVOPS (1 << 9)
+ /* The function does not lead to calls within current function unit. */
+ #define ECF_LEAF (1 << 10)
extern int flags_from_decl_or_type (const_tree);
extern int call_expr_flags (const_tree);
Index: ipa-reference.c
===================================================================
*** ipa-reference.c (revision 164477)
--- ipa-reference.c (working copy)
*************** ipa_reference_get_not_read_global (struc
*** 200,205 ****
--- 200,207 ----
info = get_reference_optimization_summary (fn);
if (info)
return info->statics_not_read;
+ else if (flags_from_decl_or_type (fn->decl) & ECF_LEAF)
+ return all_module_statics;
else
return NULL;
}
*************** ipa_reference_get_not_written_global (st
*** 217,222 ****
--- 219,226 ----
info = get_reference_optimization_summary (fn);
if (info)
return info->statics_not_written;
+ else if (flags_from_decl_or_type (fn->decl) & ECF_LEAF)
+ return all_module_statics;
else
return NULL;
}
*************** propagate_bits (ipa_reference_global_var
*** 299,307 ****
for (e = x->callees; e; e = e->next_callee)
{
struct cgraph_node *y = e->callee;
/* Only look into nodes we can propagate something. */
! if (cgraph_function_body_availability (e->callee) > AVAIL_OVERWRITABLE)
{
int flags = flags_from_decl_or_type (e->callee->decl);
if (get_reference_vars_info (y))
--- 303,315 ----
for (e = x->callees; e; e = e->next_callee)
{
struct cgraph_node *y = e->callee;
+ enum availability avail;
+ avail = cgraph_function_body_availability (e->callee);
/* Only look into nodes we can propagate something. */
! if (avail > AVAIL_OVERWRITABLE
! || (avail == AVAIL_OVERWRITABLE
! && (flags_from_decl_or_type (e->callee->decl) & ECF_LEAF)))
{
int flags = flags_from_decl_or_type (e->callee->decl);
if (get_reference_vars_info (y))
*************** read_write_all_from_decl (struct cgraph_
*** 573,589 ****
{
tree decl = node->decl;
int flags = flags_from_decl_or_type (decl);
! if (flags & ECF_CONST)
;
else if ((flags & ECF_PURE)
|| cgraph_node_cannot_return (node))
! *read_all = true;
else
{
/* TODO: To be able to produce sane results, we should also handle
common builtins, in particular throw. */
*read_all = true;
*write_all = true;
}
}
--- 581,608 ----
{
tree decl = node->decl;
int flags = flags_from_decl_or_type (decl);
! if ((flags & ECF_LEAF)
! && cgraph_function_body_availability (node) <= AVAIL_OVERWRITABLE)
! ;
! else if (flags & ECF_CONST)
;
else if ((flags & ECF_PURE)
|| cgraph_node_cannot_return (node))
! {
! *read_all = true;
! if (dump_file && (dump_flags & TDF_DETAILS))
! fprintf (dump_file, " %s/%i -> read all\n",
! cgraph_node_name (node), node->uid);
! }
else
{
/* TODO: To be able to produce sane results, we should also handle
common builtins, in particular throw. */
*read_all = true;
*write_all = true;
+ if (dump_file && (dump_flags & TDF_DETAILS))
+ fprintf (dump_file, " %s/%i -> read all, write all\n",
+ cgraph_node_name (node), node->uid);
}
}
*************** propagate (void)
*** 629,634 ****
--- 648,658 ----
node_info = get_reference_vars_info (node);
gcc_assert (node_info);
+
+ if (dump_file && (dump_flags & TDF_DETAILS))
+ fprintf (dump_file, "Starting cycle with %s/%i\n",
+ cgraph_node_name (node), node->uid);
+
node_l = &node_info->local;
node_g = &node_info->global;
*************** propagate (void)
*** 647,655 ****
if (!(ie->indirect_info->ecf_flags & ECF_CONST))
{
read_all = true;
if (!cgraph_edge_cannot_lead_to_return (ie)
&& !(ie->indirect_info->ecf_flags & ECF_PURE))
! write_all = true;
}
--- 671,685 ----
if (!(ie->indirect_info->ecf_flags & ECF_CONST))
{
read_all = true;
+ if (dump_file && (dump_flags & TDF_DETAILS))
+ fprintf (dump_file, " indirect call -> read all\n");
if (!cgraph_edge_cannot_lead_to_return (ie)
&& !(ie->indirect_info->ecf_flags & ECF_PURE))
! {
! if (dump_file && (dump_flags & TDF_DETAILS))
! fprintf (dump_file, " indirect call -> write all\n");
! write_all = true;
! }
}
*************** propagate (void)
*** 659,664 ****
--- 689,697 ----
w = w_info->next_cycle;
while (w && (!read_all || !write_all))
{
+ if (dump_file && (dump_flags & TDF_DETAILS))
+ fprintf (dump_file, " Visiting %s/%i\n",
+ cgraph_node_name (w), w->uid);
/* When function is overwrittable, we can not assume anything. */
if (cgraph_function_body_availability (w) <= AVAIL_OVERWRITABLE)
read_write_all_from_decl (w, &read_all, &write_all);
*************** propagate (void)
*** 671,679 ****
if (!(ie->indirect_info->ecf_flags & ECF_CONST))
{
read_all = true;
if (!cgraph_edge_cannot_lead_to_return (ie)
&& !(ie->indirect_info->ecf_flags & ECF_PURE))
! write_all = true;
}
w_info = (struct ipa_dfs_info *) w->aux;
--- 704,718 ----
if (!(ie->indirect_info->ecf_flags & ECF_CONST))
{
read_all = true;
+ if (dump_file && (dump_flags & TDF_DETAILS))
+ fprintf (dump_file, " indirect call -> read all\n");
if (!cgraph_edge_cannot_lead_to_return (ie)
&& !(ie->indirect_info->ecf_flags & ECF_PURE))
! {
! write_all = true;
! if (dump_file && (dump_flags & TDF_DETAILS))
! fprintf (dump_file, " indirect call -> write all\n");
! }
}
w_info = (struct ipa_dfs_info *) w->aux;
*************** propagate (void)
*** 841,847 ****
continue;
node_info = get_reference_vars_info (node);
! if (cgraph_function_body_availability (node) > AVAIL_OVERWRITABLE)
{
node_g = &node_info->global;
--- 880,887 ----
continue;
node_info = get_reference_vars_info (node);
! if (cgraph_function_body_availability (node) > AVAIL_OVERWRITABLE
! || (flags_from_decl_or_type (node->decl) & ECF_LEAF))
{
node_g = &node_info->global;
Index: testsuite/gcc.dg/tree-ssa/leaf.c
===================================================================
*** testsuite/gcc.dg/tree-ssa/leaf.c (revision 0)
--- testsuite/gcc.dg/tree-ssa/leaf.c (revision 0)
***************
*** 0 ****
--- 1,20 ----
+ /* { dg-do compile } */
+ /* { dg-options "-O2 -fdump-tree-optimized" } */
+ static int local_static;
+ void __attribute__ ((leaf)) leaf_call (void);
+
+ int
+ clobber_it (void)
+ {
+ return local_static++;
+ }
+ int
+ test (void)
+ {
+ local_static = 9;
+ leaf_call ();
+ return local_static;
+ }
+ /* { dg-final { scan-tree-dump-times "return 9" 1 "optimized"} } */
+
+ /* { dg-final { cleanup-tree-dump "optimized" } } */
Index: calls.c
===================================================================
*** calls.c (revision 164477)
--- calls.c (working copy)
*************** flags_from_decl_or_type (const_tree exp)
*** 610,615 ****
--- 610,617 ----
if (DECL_IS_NOVOPS (exp))
flags |= ECF_NOVOPS;
+ if (lookup_attribute ("leaf", DECL_ATTRIBUTES (exp)))
+ flags |= ECF_LEAF;
if (TREE_NOTHROW (exp))
flags |= ECF_NOTHROW;
Index: tree-cfg.c
===================================================================
*** tree-cfg.c (revision 164477)
--- tree-cfg.c (working copy)
*************** stmt_can_make_abnormal_goto (gimple t)
*** 2314,2320 ****
if (computed_goto_p (t))
return true;
if (is_gimple_call (t))
! return gimple_has_side_effects (t) && cfun->has_nonlocal_label;
return false;
}
--- 2314,2321 ----
if (computed_goto_p (t))
return true;
if (is_gimple_call (t))
! return (gimple_has_side_effects (t) && cfun->has_nonlocal_label
! && !(gimple_call_flags (t) & ECF_LEAF));
return false;
}
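
For reference, the predicate that the tree-cfg.c hunk ends up computing for
call statements can be sketched in isolation like this. It is a model under
stated assumptions, not GCC code: model_call_can_make_abnormal_goto and
MODEL_ECF_LEAF are hypothetical names, and the three inputs stand in for
gimple_has_side_effects, cfun->has_nonlocal_label and gimple_call_flags.

```c
#include <stdbool.h>

/* Same bit position as ECF_LEAF in the tree.h hunk of the patch.  */
#define MODEL_ECF_LEAF (1 << 10)

/* A call can make an abnormal goto only if it has side effects, the
   containing function has a nonlocal label, and the callee is not
   marked leaf.  */
bool
model_call_can_make_abnormal_goto (bool has_side_effects,
                                   bool fn_has_nonlocal_label,
                                   int call_flags)
{
  return (has_side_effects && fn_has_nonlocal_label
          && !(call_flags & MODEL_ECF_LEAF));
}
```

With the leaf bit set the function returns false regardless of the other two
inputs, which is what lets leaf-marked profiling builtins avoid abnormal edges.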