This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Tuples for GIMPLE


A few days ago we had chatted with Ian on IRC about the general idea of
using some tuple-like structure for GIMPLE instead of the current notion
of treating everything as a 'tree'.  We also chatted briefly with Zdenek
about this when he proposed turning compiler temporaries into non-decls.
 I think all these threads are converging and we should probably start
doing something about it.

Although I haven't thought too much about it, I know this has been on
people's minds for a long time.  I gathered some stats, which look
interesting.  At least they're skewed in the right direction.

I got these stats over a collection of C/C++ code from cc1-i-files,
SPEC2000, DLV, MICO, TRAMP3D and a couple of C++ bugzilla testcases.
The instrumentation is right after we go into SSA form, so no code has
been cleaned up yet (I wanted to maximize the variety of expressions).
This works to about 6.3M IL statements and 12.3M operands):

----------------------------------------------------------------------------
GIMPLE statement codes (6295370 statements)

modify_expr     =   4770176 ( 75%)
label_expr      =    860960 ( 13%)
cond_expr       =    393865 (  6%)
call_expr       =    210765 (  3%)
return_expr     =     31029 (  0%)
resx_expr       =     24399 (  0%)
switch_expr     =      3960 (  0%)
asm_expr        =       215 (  0%)
goto_expr       =         1 (  0%)


Number of operands per statement

2 operand(s):   4770176 ( 75%)
1 operand(s):    916389 ( 14%)
3 operand(s):    608590 (  9%)
4 operand(s):       215 (  0%)
5 operand(s):         0 (  0%)


Operands used (12283371 operands)

ssa_name        =   5520669 ( 44%)
label_decl      =    860960 (  7%)
nop_expr        =    808041 (  6%)
component_ref   =    787782 (  6%)
goto_expr       =    787730 (  6%)
addr_expr       =    525451 (  4%)
var_decl        =    478025 (  3%)
integer_cst     =    282051 (  2%)
plus_expr       =    222208 (  1%)
tree_list       =    202241 (  1%)
indirect_ref    =    201901 (  1%)
ne_expr         =    176549 (  1%)
call_expr       =    152942 (  1%)
array_ref       =    148189 (  1%)
eq_expr         =    126723 (  1%)
mult_expr       =    125067 (  1%)
convert_expr    =    114905 (  0%)
minus_expr      =    108870 (  0%)
filter_expr     =     48798 (  0%)
exc_ptr_expr    =     48798 (  0%)
bit_and_expr    =     42959 (  0%)
gt_expr         =     32576 (  0%)
le_expr         =     31018 (  0%)
lt_expr         =     22302 (  0%)
real_cst        =     18568 (  0%)
modify_expr     =     16644 (  0%)
rshift_expr     =     13704 (  0%)
lshift_expr     =     13583 (  0%)
trunc_div_expr  =     11816 (  0%)
truth_not_expr  =     11152 (  0%)
ge_expr         =     10991 (  0%)
constructor     =      9405 (  0%)
bit_field_ref   =      8874 (  0%)
float_expr      =      8584 (  0%)
rdiv_expr       =      8428 (  0%)
truth_and_expr  =      7400 (  0%)
exact_div_expr  =      7315 (  0%)
bit_ior_expr    =      7216 (  0%)
trunc_mod_expr  =      5408 (  0%)
negate_expr     =      4875 (  0%)
truth_or_expr   =      4855 (  0%)
bit_xor_expr    =      4050 (  0%)
tree_vec        =      3960 (  0%)
parm_decl       =      3260 (  0%)
result_decl     =      2015 (  0%)
abs_expr        =      1896 (  0%)
bit_not_expr    =      1530 (  0%)
obj_type_ref    =      1276 (  0%)
fix_trunc_expr  =       660 (  0%)
min_expr        =       454 (  0%)
rrotate_expr    =       439 (  0%)
max_expr        =       438 (  0%)
string_cst      =       220 (  0%)
truth_xor_expr  =        76 (  0%)
complex_cst     =        24 (  0%)
realpart_expr   =        22 (  0%)
imagpart_expr   =        22 (  0%)
unordered_expr  =         4 (  0%)
complex_expr    =         4 (  0%)
----------------------------------------------------------------------------

Not really surprising, but it is nice that the numbers are so skewed.
MODIFY_EXPR and SSA_NAMES should be "easy" to shrink.

If these codes were a simple struct that does not inherit from
'tree_common', perhaps we could save some memory.  So we could have
something like

struct gimple_stmt
{
  struct tree_common common;
  char subcode;
  gimple_op_t ops[1];
};

I'd keep gimple_stmt as a 'tree' because we'll need the usual debugging
information in tree_common and whatnot.

By far the most common operand is an SSA name.  Those need not be trees
either, they could even be an int field, as we keep them in a separate
table.  Zdenek and I have been talking about something related in the
'Not using VAR_DECLs for temporary variables' thread.  Perhaps we could
join these ideas.

The gimple_op_t structure would then be these handful of expressions.
This is roughly what I had in mind, but I have not put a lot of thought
on this.

The ugly part of the work will be fighting against the pervasive notion
that everything in the IL is a 'tree'.  Perhaps we will need to start
with unwrapping routines to interface with the tree helpers.  That
worries me a bit, as it could add unnecessary slowness.  I had a taste
of this when I converted the OMP_CLAUSE_* codes into subcodes of
OMP_CLAUSE.  ugh.

Thoughts?  Other ideas?  Anyone interested in getting a branch going or
working on an existing branch?  I think several of these ideas are worth
exploring.


Thanks.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]