Removal of Global State from GCC
================================


Removal of Global State from GCC
--------------------------------

A proposal for major internal changes to GCC.

This is just a summary.

See http://gcc.gnu.org/ml/gcc/2013-06/msg00215.html
for the extended version.

Why?
----

Support embedding GCC as a shared library

* thread-safe: the state of each GCC instance within the process is
  completely independent of every other GCC instance.

* Just-In-Time compilation (JIT)
** language runtimes (Python, Ruby, Java, etc)
** spam filters
** OpenGL shaders
** etc.

* Static code analysis

* Documentation generators

* etc


Non-plans
---------

* Outwardly-visible behavior changes

* Changing the license

* Changes to requirements of classic "monolithic binaries" use-case
** e.g. needing LTO
** e.g. needing TLS

* Changes to (measurable) performance of said use-case


What else would we need to support JIT-compilation?
---------------------------------------------------

The following are out of scope for my state-removal plan:

* Providing an API with an ABI that can have a useful stability guarantee
* Generating actual machine code rather than just assembler
  (e.g. embedding of binutils)
* Picking an appropriate subset of passes for JIT
* Providing an example for people to follow.


Scale of Problem
----------------

* 3500 global variables
* 100000 sites in the code directly using them


High-level Summary
------------------

* Multiple "parallel universes" of state within one GCC process

* Move all global variables and functions into classes
** these classes will be "singletons" in the normal build
** they will have multiple instances in a shared library build

* Minimal disturbance to existing code: "just add classes" (minimizing
  merge risks and preserving the ability to grok the project history)

* Various tricks to:
** maintain the performance of the standard "monolithic binaries" use case
** minimize the patching and backporting pain relative to older GCC source trees


"Universe" vs "context"
-----------------------

class universe
{
public:
  /* Instance of the garbage collector. */
  gc_heap *heap_;
  ...
  /* Instance of the callgraph. */
  callgraph *cgraph_;
  ...
  /* Pass management. */
  pipeline *passes_;
  ...
  /* Important objects. */
  struct gcc_options global_options_;
  frontend *frontend_;
  backend *backend_;
  FILE *dump_file_;
  int dump_flags_;
  // etc
  ...
  location_t input_location_;
  ...
  /* State shared by many passes. */
  struct df_d *df_;
  redirect_edge_var_state *edge_vars_;
  ...
  /* Passes that have special state-handling needs. */
  mudflap_state *mudflap_;
}; // class universe

Passes become C++ classes
-------------------------

static const pass_data pass_data_vrp =
{
  GIMPLE_PASS, /* type */
  "vrp", /* name */
  OPTGROUP_NONE, /* optinfo_flags */
  true, /* has_gate */
  true, /* has_execute */
  TV_TREE_VRP, /* tv_id */
  PROP_ssa, /* properties_required */
  0, /* properties_provided */
  0, /* properties_destroyed */
  0, /* todo_flags_start */
  TODO_cleanup_cfg | TODO_update_ssa
    | TODO_verify_ssa | TODO_verify_flow, /* todo_flags_finish */
};

Passes (2)
----------

class pass_vrp : public gimple_opt_pass
{
public:
  pass_vrp (universe &uni)
    : gimple_opt_pass (pass_data_vrp, uni)
  {}
  /* opt_pass methods: */
  opt_pass *clone () { return new pass_vrp (uni_); }
  bool gate () { return gate_vrp (); }
  unsigned int execute () { return execute_vrp (); }
}; // class pass_vrp

gimple_opt_pass *
make_pass_vrp (universe &uni)
{
  return new pass_vrp (uni);
}
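
A self-contained sketch of the pass-as-class shape shown above, assuming heavily simplified stand-ins for universe, pass_data and opt_pass (the real GCC declarations are much larger). The names pass_demo, optimize_level and run_pass are hypothetical and exist only for illustration:

```cpp
#include <memory>

// Simplified, hypothetical stand-ins for the real GCC types.
struct universe { int optimize_level = 2; };

struct pass_data { const char *name; bool has_gate; bool has_execute; };

class opt_pass
{
public:
  opt_pass (const pass_data &data, universe &uni) : data_ (data), uni_ (uni) {}
  virtual ~opt_pass () {}
  virtual opt_pass *clone () = 0;
  virtual bool gate () { return true; }
  virtual unsigned int execute () { return 0; }
  const char *name () const { return data_.name; }
  universe &get_universe () const { return uni_; }

protected:
  const pass_data &data_;
  universe &uni_;   // the "ref back to their universe"
};

static const pass_data pass_data_demo = { "demo", true, true };

class pass_demo : public opt_pass
{
public:
  pass_demo (universe &uni) : opt_pass (pass_data_demo, uni) {}
  // A clone stays in the same universe as the pass it was cloned from.
  opt_pass *clone () override { return new pass_demo (uni_); }
  // The gate consults per-universe state instead of a global variable.
  bool gate () override { return uni_.optimize_level > 0; }
  unsigned int execute () override { return 0; }
};

// Run PASS if its gate allows it; returns true when execute () was called.
inline bool
run_pass (opt_pass &pass)
{
  if (!pass.gate ())
    return false;
  pass.execute ();
  return true;
}
```

Two pass_demo instances built against different universes can then gate differently within one process, which is the property the state removal is after.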


Pass state
----------
There are various types of per-pass state, each of which can be moved:

* onto the stack
* inside the pass instance
* in a private object shared by all instances of a pass
* in a semi-private object "owned" by the universe
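
The four destinations listed above can be sketched side by side. This is only an illustration under assumed names: pass_example, vrp_shared_state and edge_var_state are hypothetical, and std::shared_ptr is just one possible way to share an object between the instances of a pass:

```cpp
#include <memory>

// Hypothetical state types, invented for this sketch.
struct vrp_shared_state { int runs = 0; };    // shared by all instances of one pass
struct edge_var_state { int pending = 0; };   // semi-private, owned by the universe

struct universe
{
  edge_var_state edge_vars_;                  // (4) "owned" by the universe
};

class pass_example
{
public:
  pass_example (universe &uni, std::shared_ptr<vrp_shared_state> shared)
    : uni_ (uni), shared_ (shared), invocations_ (0) {}

  // A clone shares the private object but gets fresh per-instance state.
  pass_example clone () const { return pass_example (uni_, shared_); }

  unsigned int execute ()
  {
    int scratch = 0;             // (1) on the stack
    ++scratch;
    ++invocations_;              // (2) inside the pass instance
    ++shared_->runs;             // (3) private object shared by all instances
    ++uni_.edge_vars_.pending;   // (4) semi-private object in the universe
    return scratch;
  }

  int invocations () const { return invocations_; }

private:
  universe &uni_;
  std::shared_ptr<vrp_shared_state> shared_;
  int invocations_;
};
```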


Which universe am I in?
-----------------------
* Passes become C++ classes, with a ref back to their universe (usable
  from the execute hook)

* a "universe *" is also available in thread-local storage, for use
  in macros:

#if SHARED_BUILD
extern __thread universe *uni_ptr;
#else
extern universe g;
#endif

/* Macro for getting a (universe &) */
#if SHARED_BUILD
/* Read a thread-local pointer: */
#define GET_UNIVERSE() (*uni_ptr)
#else
/* Access the global singleton: */
#define GET_UNIVERSE() (g)
#endif
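
A minimal sketch of the shared-build branch of the macro above, using the standard C++11 thread_local keyword in place of the __thread extension. The universe_binder helper and set_dump_flags function are assumptions made for this example and are not part of the proposal; in a real shared build each thread would perform the binding before entering the compiler:

```cpp
// Hypothetical, minimal universe; the real class would hold far more.
struct universe { int dump_flags_ = 0; };

// One pointer per thread, as in the SHARED_BUILD branch.
thread_local universe *uni_ptr = nullptr;
#define GET_UNIVERSE() (*uni_ptr)

// Hypothetical RAII helper: binds a universe to the current thread and
// restores the previous binding when it goes out of scope.
class universe_binder
{
public:
  explicit universe_binder (universe &uni) : saved_ (uni_ptr) { uni_ptr = &uni; }
  ~universe_binder () { uni_ptr = saved_; }

private:
  universe *saved_;
};

// Code that was never handed a universe explicitly can still find it.
inline void
set_dump_flags (int flags)
{
  GET_UNIVERSE ().dump_flags_ = flags;
}
```

The same call site, set_dump_flags, ends up touching whichever universe is currently bound, so code reached only through macros needs no signature changes.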


Minimizing merge pain vs "doing it properly"
--------------------------------------------

Consider:

#define timevar_push(TV) GET_UNIVERSE().timevars_->push (TV)
#define timevar_pop(TV) GET_UNIVERSE().timevars_->pop (TV)
#define timevar_start(TV) GET_UNIVERSE().timevars_->start (TV)
#define timevar_stop(TV) GET_UNIVERSE().timevars_->stop (TV)

vs a patch that touches all 200+ sites that use the timevar API:

 void
 jump_labels::
 rebuild_jump_labels_1 (rtx f, bool count_forced)
 {
   rtx insn;
-  timevar_push (TV_REBUILD_JUMP);
+  uni_.timevar_push (TV_REBUILD_JUMP);
   init_label_info (f);
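
To make the macro option concrete, here is a small sketch of the static-build case in which an existing call site compiles unchanged. The timevars class body, the enum values and rebuild_jump_labels_demo are assumptions for illustration, not GCC's actual timevar implementation:

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical subset of timevar ids.
enum timevar_id_t { TV_NONE, TV_REBUILD_JUMP };

// Toy stand-in for the per-universe timevar state.
class timevars
{
public:
  void push (timevar_id_t tv) { stack_.push_back (tv); }
  void pop (timevar_id_t tv)
  {
    if (!stack_.empty () && stack_.back () == tv)
      stack_.pop_back ();
  }
  std::size_t depth () const { return stack_.size (); }

private:
  std::vector<timevar_id_t> stack_;
};

struct universe
{
  universe () : timevars_ (new timevars) {}
  std::unique_ptr<timevars> timevars_;
};

// Static build: one global singleton.
universe g;
#define GET_UNIVERSE() (g)

// The old function-style API survives as macros over the universe.
#define timevar_push(TV) GET_UNIVERSE().timevars_->push (TV)
#define timevar_pop(TV) GET_UNIVERSE().timevars_->pop (TV)

// An existing call site, textually unchanged by the conversion.
inline std::size_t
rebuild_jump_labels_demo ()
{
  timevar_push (TV_REBUILD_JUMP);
  std::size_t depth_inside = GET_UNIVERSE ().timevars_->depth ();
  timevar_pop (TV_REBUILD_JUMP);
  return depth_inside;
}
```

The trade-off is the one named in the heading: the macros keep old patches applying cleanly, at the cost of hiding the universe lookup from the reader.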

The universe sits below GTY/GGC
-------------------------------

* Each universe gets its own GC heap
** Needs special-case handling as its own root (not a pointer).
** Gradually becomes the only root, as global GTY roots are removed.

Status:

* I have this working for GC
* Not yet working with PCH (but I think this is doable)

* Assumption: the universe instance is the single thing that:
** can own refs on GC objects AND
** isn't itself in the GC heap
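
The "universe as the only root" idea can be illustrated with a toy mark-and-sweep collector. This is a sketch under strong assumptions: gc_object, the collect method and the two root fields are invented here, and GCC's real GGC/GTY machinery works quite differently:

```cpp
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Toy GC object: a mark bit plus outgoing references.
struct gc_object
{
  bool marked = false;
  std::vector<gc_object *> refs;
};

class gc_heap
{
public:
  gc_object *alloc ()
  {
    objects_.emplace_back (new gc_object);
    return objects_.back ().get ();
  }

  std::size_t live () const { return objects_.size (); }

  // Mark everything reachable from ROOTS, then free the rest.
  void collect (const std::vector<gc_object *> &roots)
  {
    for (gc_object *root : roots)
      mark (root);
    std::vector<std::unique_ptr<gc_object> > survivors;
    for (auto &obj : objects_)
      if (obj->marked)
        {
          obj->marked = false;
          survivors.push_back (std::move (obj));
        }
    objects_.swap (survivors);   // unmarked objects are destroyed here
  }

private:
  static void mark (gc_object *obj)
  {
    if (!obj || obj->marked)
      return;
    obj->marked = true;
    for (gc_object *ref : obj->refs)
      mark (ref);
  }

  std::vector<std::unique_ptr<gc_object> > objects_;
};

// The universe lives outside the heap but owns references into it,
// so its fields act as the root set for that heap.
class universe
{
public:
  gc_heap heap_;
  gc_object *cgraph_ = nullptr;
  gc_object *frontend_ = nullptr;

  void collect () { heap_.collect ({ cgraph_, frontend_ }); }
};
```

Because each universe carries its own heap and its own roots, collecting in one universe cannot free or even see objects belonging to another.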


Performance
-----------

* I won't be adding fields to any major types, so memory usage shouldn't
  noticeably change.

* We know there'll be a hit of a few % for adding -fPIC/-fpic (so this will
  be a configure-time opt-in).

* We can't yet know what the impact of passing around context will
  be (register pressure etc).

* How expensive is TLS on various archs?


What should my benchmark suite look like?
-----------------------------------------

Benchmark 1: compile time of the Linux kernel

Benchmark 2: building Firefox with LTO

I have a systemtap script to watch all process invocations, gathering various
timings, so we can track per-TU timings "from outside".


Ways of avoiding performance hit
--------------------------------

* Configure-time opt-in to shared library

* Ways of eliminating context pointers


Eliminating context ptrs (1)
----------------------------

#if GLOBAL_STATE
/* When using global state, all methods and fields of state classes
   become "static", so that there is effectively a single global
   instance of the state, and there is no implicit "this->" being passed
   around. */
# define MAYBE_STATIC static
#else
/* When using on-stack state, all methods and fields of state classes
   lose the "static", so that there can be multiple instances of the
   state with an implicit "this->" everywhere the state is used. */
# define MAYBE_STATIC
#endif

Example of MAYBE_STATIC
-----------------------

cgraph.h

class GTY((user)) callgraph
{
public:
  callgraph (universe &uni);
  MAYBE_STATIC void dump (FILE *) const;
  MAYBE_STATIC void dump_cgraph_node (FILE *, struct cgraph_node *) const;
  MAYBE_STATIC void remove_edge (struct cgraph_edge *);
  MAYBE_STATIC void remove_node (struct cgraph_node *);
  MAYBE_STATIC struct cgraph_edge *
  create_edge (struct cgraph_node *,
               struct cgraph_node *,
               gimple, gcov_type, int);
  /* etc */
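
A compilable sketch of the MAYBE_STATIC trick on a toy class. The class node_counter and the MAYBE_CONST macro are hypothetical additions for this example. MAYBE_CONST is there because standard C++ does not allow a static member function to carry a const qualifier, so const methods such as dump above would presumably need their qualifier made conditional as well:

```cpp
// Flip to 0 to get the multiple-instance ("on-stack state") variant.
#define GLOBAL_STATE 1

#if GLOBAL_STATE
# define MAYBE_STATIC static
# define MAYBE_CONST              /* static member functions cannot be const */
#else
# define MAYBE_STATIC
# define MAYBE_CONST const
#endif

// Hypothetical stand-in for a state class such as callgraph.
class node_counter
{
public:
#if !GLOBAL_STATE
  node_counter () : n_nodes_ (0) {}
#endif
  MAYBE_STATIC void add_node () { ++n_nodes_; }
  MAYBE_STATIC void remove_node () { --n_nodes_; }
  MAYBE_STATIC int count () MAYBE_CONST { return n_nodes_; }

private:
  MAYBE_STATIC int n_nodes_;
};

#if GLOBAL_STATE
// The single global instance of the field; no "this" is passed around.
int node_counter::n_nodes_ = 0;
#endif
```

Call sites can use the instance syntax, for example counter.add_node (), in both configurations, since C++ permits calling a static member function through an object. With GLOBAL_STATE set to 1 every node_counter object shares one count; with 0 each object has its own.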


Eliminating context ptrs (2)
----------------------------

#if USING_IMPLICIT_STATIC
#define SINGLETON_IN_STATIC_BUILD __attribute__((force_static))
#else
#define SINGLETON_IN_STATIC_BUILD
#endif

class GTY((user)) SINGLETON_IN_STATIC_BUILD callgraph
{
public:
  callgraph (universe &uni);
  void dump (FILE *) const;
  void dump_cgraph_node (FILE *, struct cgraph_node *) const;
  void remove_edge (struct cgraph_edge *);
  void remove_node (struct cgraph_node *);
  struct cgraph_edge *
  create_edge (struct cgraph_node *,
               struct cgraph_node *,
               gimple, gcov_type, int);
  /* etc */


Eliminating context ptrs (3)
----------------------------

#if USING_SINGLETON_ATTRIBUTE
#define SINGLETON_IN_STATIC_BUILD(INSTANCE) \
  __attribute__((singleton(INSTANCE)))
#else
#define SINGLETON_IN_STATIC_BUILD(INSTANCE)
#endif

#if USING_SINGLETON_ATTRIBUTE
class callgraph the_cgraph;
#endif

class GTY((user)) SINGLETON_IN_STATIC_BUILD(the_cgraph) callgraph
{
public:
  callgraph (universe &uni);
  void dump (FILE *) const;
  void dump_cgraph_node (FILE *, struct cgraph_node *) const;
  void remove_edge (struct cgraph_edge *);
  void remove_node (struct cgraph_node *);
  struct cgraph_edge *
  create_edge (struct cgraph_node *,
               struct cgraph_node *,
               gimple, gcov_type, int);
  /* etc */


Branch management
-----------------
Given perf concerns, my thinking is:

* do it on a (git) branch, merging from trunk regularly
* measure performance relative to 4.8 and to trunk regularly
* tactical patches to trunk to minimize merge pain
* by when would the merge into trunk need to happen for 4.9/4.10?
* autogenerate burndown charts measuring # of globals and # of usage sites


What I'm hoping for from Cauldron
---------------------------------

* Consensus that this is desirable
* Consensus that my work could be mergeable
* Branch management plans
* Performance Guidelines


Discussion
----------
What nasty problems have I missed?