This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: GCC options for kernel live-patching (Was: Add a new option to control inlining only on static functions)
- From: Martin Liška <mliska at suse dot cz>
- To: Jan Hubicka <hubicka at ucw dot cz>
- Cc: Qing Zhao <qing dot zhao at oracle dot com>, Martin Jambor <mjambor at suse dot cz>, live-patching at vger dot kernel dot org, gcc Patches <gcc-patches at gcc dot gnu dot org>
- Date: Wed, 7 Nov 2018 15:21:54 +0100
- Subject: Re: GCC options for kernel live-patching (Was: Add a new option to control inlining only on static functions)
- References: <ri6d0ssu4rt.fsf@suse.cz> <5CB6BDBE-3F49-4BFE-AF10-5E8181C49181@oracle.com> <1a023bdc-28a6-eb41-b449-4d096f12064f@suse.cz> <048D9997-B7AF-444A-BF7E-79944DE8F174@oracle.com> <add5773e-0cea-58c9-b458-bf256f81d057@suse.cz> <3E37D3A8-2D19-41C2-BA8A-8F0EFA1B4D5C@oracle.com> <10a54034-279b-a406-8466-55558effbf24@suse.cz> <20181003090457.GJ57692@kam.mff.cuni.cz> <54a75932-201b-671c-0a63-d1a5d8d7b562@suse.cz> <90c91045-cb9d-0bd2-fad3-d16426ceede6@suse.cz> <20181105095135.j3mnzox6rkktkoto@kam.mff.cuni.cz>
On 11/5/18 10:51 AM, Jan Hubicka wrote:
>> @honza: PING
>>
>> On 10/3/18 12:53 PM, Martin Liška wrote:
>>> On 10/3/18 11:04 AM, Jan Hubicka wrote:
>>>>>
>>>>> That was promised to be done by Honza Hubička. He's very skilled in IPA optimizations and he's aware
>>>>> of optimizations that cause trouble for live-patching.
>>>>
>>>> :) I am not sure how skilful I am, but here is what I arrived at.
>>>
>>> Heh! Thanks for the analysis.
>>>
>>>>
>>>> We have transformations that are modeled as cloning, which are
>>>> - inlining (can't be disabled completely because of always inline, but -fno-inline
>>>> does most of the work)
>>>> - cloning (disabled via -fno-ipa-cp)
>>>> - ipa-sra (-fno-ipa-sra)
>>>> - splitting (-fno-partial-inlining)
>>>> These should play well with Martin's tracking code
>>>
>>> I hope so!
>>>
>>>>
>>>> We propagate info about side effects of function:
>>>> - function attribute discovery (pure, const, nothrow, malloc)
>>>> Some of this can be disabled by -fno-ipa-pure-const, but not all
>>>> of it.
>>>
>>> Would it be possible to add an option for the remaining ones?
>
> Sure, I can prepare a patch unless you beat me to it :)
Are you sure there's a call to 'analyze_function' where the analysis is done
when one sets -fno-ipa-pure-const?
>>>
>>>> Nothrow does not have a flag but it is obviously not a concern
>>>> for C++
>>>
>>> s/C++/C?
>
> Yep for C
>>>
>>>> - ipa-pta (disabled by default, -fno-ipa-pta)
>>>> - ipa-reference (list of accessed/modified global vars), disabled by -fno-ipa-reference
>>>> - stack alignment requirements (no flag to disable)
>>>
>>> Would it be possible to add a flag for it? Can you please point to a location where
>>> the optimization happens?
>
> In expand_call
>
>   /* Figure out the amount to which the stack should be aligned.  */
>   preferred_stack_boundary = PREFERRED_STACK_BOUNDARY;
>   if (fndecl)
>     {
>       struct cgraph_rtl_info *i = cgraph_node::rtl_info (fndecl);
>       /* Without automatic stack alignment, we can't increase preferred
>          stack boundary.  With automatic stack alignment, it is
>          unnecessary since unless we can guarantee that all callers will
>          align the outgoing stack properly, callee has to align its
>          stack anyway.  */
>       if (i
>           && i->preferred_incoming_stack_boundary
>           && i->preferred_incoming_stack_boundary < preferred_stack_boundary)
>         preferred_stack_boundary = i->preferred_incoming_stack_boundary;
>     }
I'm attaching a patch candidate for that.
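
For illustration, the expand_call snippet quoted above boils down to taking
the smaller of the target default and the boundary recorded for the callee.
Below is a minimal stand-alone sketch of that selection. The names
callee_info and call_site_boundary are made up for this example and are not
GCC API, and plain integers stand in for the real RTL info. With
-fno-ipa-stack-alignment the recorded value is presumably never set, so the
model treats 0 as "nothing recorded" and falls back to the default.

```c
#include <stddef.h>

/* Hypothetical stand-in for the part of cgraph_rtl_info used here;
   0 means "nothing recorded for this callee".  */
struct callee_info
{
  unsigned int preferred_incoming_stack_boundary;
};

/* Return the stack boundary (in bits) to use at a call site.
   DEFAULT_BOUNDARY plays the role of PREFERRED_STACK_BOUNDARY and
   INFO is NULL when nothing is known about the callee.  */
unsigned int
call_site_boundary (unsigned int default_boundary,
                    const struct callee_info *info)
{
  unsigned int boundary = default_boundary;

  /* Only ever lower the boundary, never raise it.  */
  if (info
      && info->preferred_incoming_stack_boundary
      && info->preferred_incoming_stack_boundary < boundary)
    boundary = info->preferred_incoming_stack_boundary;

  return boundary;
}
```

The live-patching concern is that the caller's code then depends on the
body of the callee: if a patched callee needs a larger alignment than the
original one, already compiled callers would not provide it.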
>
>>>
>>>> - inter-procedural register allocation (-fno-ipa-ra)
>>>>
>>>> We perform discovery of functions/variables with no address taken and
>>>> optimizations that are not valid otherwise, such as duplicating them
>>>> or skipping them in alias analysis (no flag to disable)
>>>
>>> Can you please be more verbose here? What optimizations do you mean?
>
> See ipa_discover_readonly_nonaddressable_vars. If the addressable bit is
> cleared we start analyzing uses of the variable via ipa_reference or so.
> If the writeonly bit is set, we start removing writes to the variable and if
> the readonly bit is set we skip any analysis about whether the variable changed.
Likewise for this.
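
To make the write-only case concrete, here is a small made-up example (the
names debug_counter and process are hypothetical, not taken from the kernel
or from GCC). The static variable is stored to but never read, and its
address is never taken, so with optimization enabled GCC may drop the store
entirely. A live patch that later adds a reader of debug_counter would then
see a variable that was never updated.

```c
/* Only ever written, never read, address never taken: a candidate
   for the write-only discovery described above.  */
static int debug_counter;

int
process (int x)
{
  debug_counter = x;	/* This store may be optimized away.  */
  return x * 2;
}
```

The attached writeonly-2.c test checks the opposite direction: with
-fno-ipa-reference-addressable such stores have to stay in the optimized
dump.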
>>>
>>>>
>>>> Identical code folding merges function bodies that are semantically equivalent
>>>> and thus one can't patch one without patching another, -fno-ipa-icf
>>>
>>> Agree, I recommend disabling that.
>>>
>>>>
>>>> Unreachable code/variable removal may be a concern too (no flag to disable)
>>>
>>> For functions that should be fine and handled by my script.
>>> For variables it can be a problem when a variable becomes alive. But that
>>> should be extremely rare for live-patching.
>>>
>>>>
>>>> Write only global variable discovery (no flag to disable)
>>>
>>> Similarly.
>>>
>>>>
>>>> Visibility changes with -flto and/or -fwhole-program
>>>>
>>>> We also have profile propagation (discovery of functions used only in cold regions,
>>>> but that I guess is only a performance issue, not correctness)
>>>> No flag to disable
>>>
>>> Hope these 2 do not happen for the current Linux kernel.
>
> 2 will happen in the kernel. We will try to propagate cold code
> inter-procedurally based on what we think will be undefined effect at
> runtime. Still I guess it is not a big deal as it only affects
> size optimization.
Then let's ignore it.
Thoughts about the patches?
Martin
>
> Honza
>>>
>>> Martin
>>>
>>>>
>>>> Honza
>>>>
>>>>>
>>>>> Martin
>>>>>
>>>>>>
>>>>>> thanks.
>>>>>>
>>>>>> Qing
>>>>>>
>>>>>
>>>
>>
From ee912514f61ec2c4d126cf6d43b69d01a08886c8 Mon Sep 17 00:00:00 2001
From: marxin <mliska@suse.cz>
Date: Wed, 7 Nov 2018 13:47:40 +0100
Subject: [PATCH 2/2] Come up with the flag -fipa-stack-alignment.
gcc/ChangeLog:
2018-11-07 Martin Liska <mliska@suse.cz>
* common.opt: Add -fipa-stack-alignment flag.
* doc/invoke.texi: Document it.
* final.c (rest_of_clean_state): Guard stack
shrinking with flag.
gcc/testsuite/ChangeLog:
2018-11-07 Martin Liska <mliska@suse.cz>
* gcc.target/i386/ipa-stack-alignment.c: New test.
---
gcc/common.opt | 4 ++++
gcc/doc/invoke.texi | 7 ++++++-
gcc/final.c | 3 ++-
gcc/testsuite/gcc.target/i386/ipa-stack-alignment.c | 13 +++++++++++++
4 files changed, 25 insertions(+), 2 deletions(-)
create mode 100644 gcc/testsuite/gcc.target/i386/ipa-stack-alignment.c
diff --git a/gcc/common.opt b/gcc/common.opt
index 6a64b0e27d5..6ee48fbcfc4 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -1724,6 +1724,10 @@ fipa-reference-addressable
Common Report Var(flag_ipa_reference_addressable) Init(0) Optimization
Discover read-only and write-only addressable variables.
+fipa-stack-alignment
+Common Report Var(flag_ipa_stack_alignment) Init(1) Optimization
+Reduce stack alignment on call sites if possible.
+
fipa-matrix-reorg
Common Ignore
Does nothing. Preserved for backward compatibility.
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 82c6fa913e8..2332e643993 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -413,7 +413,7 @@ Objective-C and Objective-C++ Dialects}.
-finline-small-functions -fipa-cp -fipa-cp-clone @gol
-fipa-bit-cp -fipa-vrp @gol
-fipa-pta -fipa-profile -fipa-pure-const -fipa-reference -fipa-reference-addressable @gol
--fipa-icf -fira-algorithm=@var{algorithm} @gol
+-fipa-stack-alignment -fipa-icf -fira-algorithm=@var{algorithm} @gol
-fira-region=@var{region} -fira-hoist-pressure @gol
-fira-loop-pressure -fno-ira-share-save-slots @gol
-fno-ira-share-spill-slots @gol
@@ -8901,6 +8901,11 @@ Enabled by default at @option{-O} and higher.
Discover read-only and write-only addressable variables.
Enabled by default at @option{-O} and higher.
+@item -fipa-stack-alignment
+@opindex fipa-stack-alignment
+Reduce stack alignment on call sites if possible.
+Enabled by default.
+
@item -fipa-pta
@opindex fipa-pta
Perform interprocedural pointer analysis and interprocedural modification
diff --git a/gcc/final.c b/gcc/final.c
index 6e61f1e17a8..0c1ac625f37 100644
--- a/gcc/final.c
+++ b/gcc/final.c
@@ -4890,7 +4890,8 @@ rest_of_clean_state (void)
/* We can reduce stack alignment on call site only when we are sure that
the function body just produced will be actually used in the final
executable. */
- if (decl_binds_to_current_def_p (current_function_decl))
+ if (flag_ipa_stack_alignment
+ && decl_binds_to_current_def_p (current_function_decl))
{
unsigned int pref = crtl->preferred_stack_boundary;
if (crtl->stack_alignment_needed > crtl->preferred_stack_boundary)
diff --git a/gcc/testsuite/gcc.target/i386/ipa-stack-alignment.c b/gcc/testsuite/gcc.target/i386/ipa-stack-alignment.c
new file mode 100644
index 00000000000..1176b59aa5f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/ipa-stack-alignment.c
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-options "-fno-ipa-stack-alignment -O" } */
+
+typedef struct {
+ long a;
+ long b[];
+} c;
+
+c *d;
+void e() { d->b[0] = 5; }
+void f() { e(); }
+
+/* { dg-final { scan-assembler "sub.*%.sp" } } */
--
2.19.1
From 8691490a142228021ed65313a72d176d06966829 Mon Sep 17 00:00:00 2001
From: marxin <mliska@suse.cz>
Date: Wed, 7 Nov 2018 13:31:41 +0100
Subject: [PATCH 1/2] Come up with -fipa-reference-addressable flag.
gcc/ChangeLog:
2018-11-07 Martin Liska <mliska@suse.cz>
* cgraph.h (ipa_discover_readonly_nonaddressable_vars): Rename
to ...
(ipa_discover_nonaddressable_vars): ... this.
* common.opt: Come up with new flag -fipa-reference-addressable.
* doc/invoke.texi: Document it.
* ipa-reference.c (propagate): Call the renamed fn.
* ipa-visibility.c (whole_program_function_and_variable_visibility):
Likewise.
* ipa.c (ipa_discover_readonly_nonaddressable_vars): Renamed to
...
(ipa_discover_nonaddressable_vars): ... this. Discover
non-addressable variables only with the newly added flag.
* opts.c: Enable the newly added flag with -O1 and higher
optimization level.
gcc/testsuite/ChangeLog:
2018-11-07 Martin Liska <mliska@suse.cz>
* gcc.dg/tree-ssa/writeonly-2.c: New test.
---
gcc/cgraph.h | 2 +-
gcc/common.opt | 6 +++++-
gcc/doc/invoke.texi | 10 ++++++++--
gcc/ipa-reference.c | 2 +-
gcc/ipa-visibility.c | 2 +-
gcc/ipa.c | 11 +++++++----
gcc/opts.c | 1 +
gcc/testsuite/gcc.dg/tree-ssa/writeonly-2.c | 20 ++++++++++++++++++++
8 files changed, 44 insertions(+), 10 deletions(-)
create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/writeonly-2.c
diff --git a/gcc/cgraph.h b/gcc/cgraph.h
index c13d79850fa..bf65d426cda 100644
--- a/gcc/cgraph.h
+++ b/gcc/cgraph.h
@@ -2403,7 +2403,7 @@ void record_references_in_initializer (tree, bool);
/* In ipa.c */
void cgraph_build_static_cdtor (char which, tree body, int priority);
-bool ipa_discover_readonly_nonaddressable_vars (void);
+bool ipa_discover_nonaddressable_vars (void);
/* In varpool.c */
tree ctor_for_folding (tree);
diff --git a/gcc/common.opt b/gcc/common.opt
index 2971dc21b1f..6a64b0e27d5 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -1718,7 +1718,11 @@ Perform Identical Code Folding for variables.
fipa-reference
Common Report Var(flag_ipa_reference) Init(0) Optimization
-Discover readonly and non addressable static variables.
+Discover read-only and non addressable static variables.
+
+fipa-reference-addressable
+Common Report Var(flag_ipa_reference_addressable) Init(0) Optimization
+Discover read-only and write-only addressable variables.
fipa-matrix-reorg
Common Ignore
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index ae260c6ac6d..82c6fa913e8 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -412,8 +412,8 @@ Objective-C and Objective-C++ Dialects}.
-finline-functions -finline-functions-called-once -finline-limit=@var{n} @gol
-finline-small-functions -fipa-cp -fipa-cp-clone @gol
-fipa-bit-cp -fipa-vrp @gol
--fipa-pta -fipa-profile -fipa-pure-const -fipa-reference -fipa-icf @gol
--fira-algorithm=@var{algorithm} @gol
+-fipa-pta -fipa-profile -fipa-pure-const -fipa-reference -fipa-reference-addressable @gol
+-fipa-icf -fira-algorithm=@var{algorithm} @gol
-fira-region=@var{region} -fira-hoist-pressure @gol
-fira-loop-pressure -fno-ira-share-save-slots @gol
-fno-ira-share-spill-slots @gol
@@ -7866,6 +7866,7 @@ compilation time.
-fipa-pure-const @gol
-fipa-profile @gol
-fipa-reference @gol
+-fipa-reference-addressable @gol
-fmerge-constants @gol
-fmove-loop-invariants @gol
-fomit-frame-pointer @gol
@@ -8895,6 +8896,11 @@ Discover which static variables do not escape the
compilation unit.
Enabled by default at @option{-O} and higher.
+@item -fipa-reference-addressable
+@opindex fipa-reference-addressable
+Discover read-only and write-only addressable variables.
+Enabled by default at @option{-O} and higher.
+
@item -fipa-pta
@opindex fipa-pta
Perform interprocedural pointer analysis and interprocedural modification
diff --git a/gcc/ipa-reference.c b/gcc/ipa-reference.c
index 43bbdae5d66..2cdce3cbfa6 100644
--- a/gcc/ipa-reference.c
+++ b/gcc/ipa-reference.c
@@ -705,7 +705,7 @@ propagate (void)
if (dump_file)
cgraph_node::dump_cgraph (dump_file);
- remove_p = ipa_discover_readonly_nonaddressable_vars ();
+ remove_p = ipa_discover_nonaddressable_vars ();
generate_summary ();
/* Propagate the local information through the call graph to produce
diff --git a/gcc/ipa-visibility.c b/gcc/ipa-visibility.c
index 000207fa31b..1da594111f8 100644
--- a/gcc/ipa-visibility.c
+++ b/gcc/ipa-visibility.c
@@ -911,7 +911,7 @@ whole_program_function_and_variable_visibility (void)
{
function_and_variable_visibility (flag_whole_program);
if (optimize || in_lto_p)
- ipa_discover_readonly_nonaddressable_vars ();
+ ipa_discover_nonaddressable_vars ();
return 0;
}
diff --git a/gcc/ipa.c b/gcc/ipa.c
index 3b6b5e5c8d4..eb53e7dcd06 100644
--- a/gcc/ipa.c
+++ b/gcc/ipa.c
@@ -752,10 +752,10 @@ clear_addressable_bit (varpool_node *vnode, void *data ATTRIBUTE_UNUSED)
return false;
}
-/* Discover variables that have no longer address taken or that are read only
- and update their flags.
+/* Discover variables that have no longer address taken, are read-only or
+ write-only and update their flags.
- Return true when unreachable symbol removan should be done.
+ Return true when unreachable symbol removal should be done.
FIXME: This can not be done in between gimplify and omp_expand since
readonly flag plays role on what is shared and what is not. Currently we do
@@ -764,8 +764,11 @@ clear_addressable_bit (varpool_node *vnode, void *data ATTRIBUTE_UNUSED)
make sense to do it before early optimizations. */
bool
-ipa_discover_readonly_nonaddressable_vars (void)
+ipa_discover_nonaddressable_vars (void)
{
+ if (!flag_ipa_reference_addressable)
+ return false;
+
bool remove_p = false;
varpool_node *vnode;
if (dump_file)
diff --git a/gcc/opts.c b/gcc/opts.c
index 34c283dd765..9b9018c6c48 100644
--- a/gcc/opts.c
+++ b/gcc/opts.c
@@ -451,6 +451,7 @@ static const struct default_options default_options_table[] =
{ OPT_LEVELS_1_PLUS_NOT_DEBUG, OPT_fif_conversion2, NULL, 1 },
{ OPT_LEVELS_1_PLUS, OPT_fipa_pure_const, NULL, 1 },
{ OPT_LEVELS_1_PLUS, OPT_fipa_reference, NULL, 1 },
+ { OPT_LEVELS_1_PLUS, OPT_fipa_reference_addressable, NULL, 1 },
{ OPT_LEVELS_1_PLUS, OPT_fipa_profile, NULL, 1 },
{ OPT_LEVELS_1_PLUS, OPT_fmerge_constants, NULL, 1 },
{ OPT_LEVELS_1_PLUS, OPT_freorder_blocks, NULL, 1 },
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/writeonly-2.c b/gcc/testsuite/gcc.dg/tree-ssa/writeonly-2.c
new file mode 100644
index 00000000000..2272d15b171
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/writeonly-2.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-options "-O1 -fdump-tree-optimized -fno-ipa-reference-addressable" } */
+static struct a {int magic1,b;} a;
+volatile int magic2;
+static struct b {int a,b,c,d,e,f;} magic3;
+
+struct b foo();
+
+void
+t()
+{
+ a.magic1 = 1;
+ magic2 = 1;
+ magic3 = foo();
+}
+/* { dg-final { scan-tree-dump "magic1" "optimized"} } */
+/* { dg-final { scan-tree-dump "magic3" "optimized"} } */
+/* { dg-final { scan-tree-dump "magic2" "optimized"} } */
+/* { dg-final { scan-tree-dump "foo" "optimized"} } */
+
--
2.19.1