Flattening Memory Reference Trees in the GIMPLE IL

The goal of this project is to canonicalize and flatten the representation of memory references in the GIMPLE IL.

A first shot at an implementation is on a branch in SVN, svn://gcc.gnu.org/svn/gcc/branches/mem-ref. This branch has been retired and not updated for the GIMPLE tuple changes.

The idea has been simplified and a re-implementation was done on the svn://gcc.gnu.org/svn/gcc/branches/mem-ref2 branch which has been merged to trunk as rev 161655.

Current Trunk IL State

Currently there are several different tree codes that can be nested to form a single memory reference. This includes all tree codes of the tcc_reference tree code class such as COMPONENT_REF, BIT_FIELD_REF, INDIRECT_REF, ALIGN_INDIRECT_REF, MISALIGNED_INDIRECT_REF, TARGET_MEM_REF, ARRAY_REF, ARRAY_RANGE_REF, VIEW_CONVERT_EXPR, REALPART_EXPR and IMAGPART_EXPR. These are used to for example create a reference like __real a.b[c].d[e].

Several of those reference trees are used on both memory and registers, where for operation on registers partial stores are not allowed.

Operations on Registers

In addition to operating on memory arguments, REALPART_EXPR and IMAGPART_EXPR are used to access the real or imaginary part of a complex register. Stores to a complex register require the complete complex value to be rebuilt using COMPLEX_EXPR which combines the real and imaginary part of a complex value.

BIT_FIELD_REF is used in a similar way to access parts of a vector register. CONSTRUCTOR is used as the counterpart to COMPLEX_EXPR for vectors. BIT_FIELD_REF is also used to extract arbitrary bit-parts from a register. The mem-ref proposal includes a new BIT_FIELD_EXPR which can be used to combine bit-field parts.

VIEW_CONVERT_EXPR is used to interpret the bit representation of a register as a register of a different type. It is unspecified if this is allowed to change the register size. If disallowed this case needs to go through an integer mode and an intermediate BIT_FIELD_REF.

Pointer Types

Pointer types carry important information for TBAA such as the TYPE_REF_CAN_ALIAS_ALL flag. This means that we have to preserve original pointer types for the INDIRECT_REF pointer operand. Which in turn means we cannot treat value-preserving conversions of pointer types as unnecessary. Which in turn means that pointer value-equivalency is harder to detect than necessary.

Alignment Information

There is no convenient way (apart from building new pointed-to types) to annotate accesses with alignment information. Not even mentioning mis-alignment information as possible with MISALIGNED_INDIRECT_REF.

New Incremental Approach

The following is the proposal as implemented on svn://gcc.gnu.org/svn/gcc/branches/mem-ref2.

There is an imminent need to fix PR42834 which requires a way to encode an alias-set zero access to plain decls or component references on decls. TARGET_MEM_REF can be used for this as shown in a prototype patch.

The new first-step proposal is to build a new memory access tree similar to an INDIRECT_REF but addressing several current issues.

MEM_REF

Only a single MEM_REF operation is introduced, handling both previous MEM_REF and INDIRECT_MEM_REF. The way to accomplish this is to allow both SSA names and ADDR_EXPRs as operand of the memory reference (thus MEM_REF is an indirect memory access). We restrict ourselves to function invariant addresses though, thus operands according to is_gimple_val.

MEM_REFs are still sub-settable using existing component references - they can serve as a trivial substitution for INDIRECT_REFs.

MEM_REF gets an additional type, a pointer type designating the base object and the lvalue and thus the alias-set of the access. This allows value-preserving pointer conversions to be useless as effectively each MEM_REF operation has an implicit conversion to the original pointer type built-in. It also allows to explicitely encode base object alias information by subtituting a pointer-type that does not agree with the type of the MEM_REF itself. Conveniently we can encode the pointer type as the type of a constant offset operand.

The type of the MEM_REF is the type of the memory access result.

Thus sofar MEM_REF <type, ptr, const-offset> can be interpreted as

Pointer SSA names will get annotated with alignment information in their SSA_NAME_PTR_INFO structure to handle MISALIGNED_INDIRECT_REF with MEM_REF.

Old First Shot

The following is the proposal as implemented on svn://gcc.gnu.org/svn/gcc/branches/mem-ref.

Proposal for Register Operations

Operation of REALPART_EXPR, IMAGPART_EXPR, BIT_FIELD_REF and VIEW_CONVERT_EXPR is disallowed for memory operands. A new BIT_FIELD_EXPR tree code is introduced with the following semantics:

Given a word OP0, a value OP1 and a bitfield position OP2 and size OP3 within the word, produce the value that results if replacing the specified parts of the word with value. Thus,

computes the value

where BS(word) is the bitsize of word. The value computed has the same type as that of the first argument, word.

This allows for BIT_FIELD_REF stores to be decomposed into a read-modify-write cycle, using BIT_FIELD_EXPR for the modify part. In turn this allows for BIT_FIELD_EXPR and BIT_FIELD_REF operations on registers to be combined and simplified where applicable.

Proposal for Memory Operations

Overview

MEM_REF and INDIRECT_MEM_REF operations are introduced. Both have a base as operand zero and an offset as operand one. In addition to these, the effective alias-set and the alignment of the memory access are stored and can be accessed via MEM_REF_ALIAS_SET and MEM_REF_ALIGN respectively.

INDIRECT_MEM_REF <ptr, offset> loads the object of the type of the INDIRECT_MEM_REF reference at the address ptr + offset.

MEM_REF <object, offset> loads the object of the type of the MEM_REF reference at the address &object + offset.

The offset is specified in bytes, the target addresses ptr + offset and &object + offset are aligned at least to MEM_REF_ALIGN of the reference and the access conflicts with all accesses whose alias-set conflicts with the alias-set MEM_REF_ALIAS_SET of the reference.

Alias and Alignment Information

The specification of the alias-set involved in the memory access allows to simplify the TBAA code in the middle-end by pushing all language-level knowledge to the frontends, respective the get_alias_set langhook and its shared generic code. It allows for maintaining the alias-sets as given in the source while exchanging the types the memory accesses work on, like what happens during auto-vectorization or lowering of aggregate copies.

Likewise maintaining alignment information in the reference trees allows for analysis results to be preserved which should ease for example versioning for alignment in the autovectorizer. This also gets rid of the need for ALIGNED_INDIRECT_REF and MISALIGNED_INDIRECT_REF variants.

Both of these additional informations are also present in the RTL mem expressions and such can be maintained across expansion. This also should remove the need to keep MEM_EXPR and some of the flags.

Lowering Process

The GENERIC memory reference trees (which are not changed, so frontends are unaffected by this project) are lowered by pass_lower_mem.

ARRAY_REF Lowering, IDX_EXPR

It is important to preserve index, stride and size information of (multi-dimensional) array accesses for data-dependency analysis and loop optimizations to be effective. At present this information is encoded in the recursive memory reference tree and the intermediate array types referenced.

The ARRAY_REF and ARRAY_RANGE_REF accesses are lowered as follows. The index operand is converted into a byte index appropriate for MEM_REF by encoding the multiplication with the element size and the chaining with other offsets in a new expression, IDX_EXPR. IDX_EXPR behaves as a multiply-add operation while preserving operand position to give them semantics.

IDX_EXPR <offset, index, size> computes offset + (T)index * (T)size

The type T of IDX_EXPR shall match that of a POINTER_PLUS_EXPR offset, likewise the offset argument. The index and size arguments are appropriately converted to T before the multiplication (FIXME?).

Operands of IDX_EXPR should not be re-associated with other expressions, nor should the index and size operands be swapped. Constant-folding of IDX_EXPR should be limited to reducing the whole IDX_EXPR to a constant (FIXME?).

Examples

  float A[8][8];
  ...
  return A[i][j];

is lowered as

  j.2 = (unsigned int) j.1;
  D.1666 = IDX <0 + j.2 * 4>;
  i.3 = (unsigned int) i.0;
  D.1668 = IDX <D.1666 + i.3 * 32>;
  D.1660 = MEM <&A + D.1668 {0}>;
  return D.1660;

struct {
  int a : 3;
  int b : 4;
  int c : 9;
} m;
...
  return m.b;

is lowered as

  MEML.0 = MEM <&m {0}>;
  D.1185 = BIT_FIELD_REF <MEML.0, 4, 3>;
  D.1184 = (int) D.1185;
  return D.1184;

  m.b = e;

is lowered as

  D.1185 = (<unnamed-signed:4>) e;
  MEML.0 = MEM <&m {0}>;
  MEML.0 = BIT_FIELD_EXPR <MEML.0, D.1185, 4, 3>;
  MEM <&m {0}> = MEML.0;

Proposal for Canonicalizing Adresses and Objects

ADDR_EXPR gets an extra offset operand. This allows to retain invariant addresses we allow now, like &a.x[2] with a single tree code.

  ADDR_EXPR <object, offset>  <-->  POINTER_PLUS_EXPR <ptr, offset>
            |               \     /         |
            |               /     \         |
  MEM_REF <object, offset>   <-->  INDIRECT_MEM_REF <ptr, offset>

The symmetry allows simple propagation of addresses and base objects across offset chains, as well as taking addresses or dereferencing references and objects.

Advantages of MEM_REF

More intermediate values that are computed get SSA_NAMEs assigned and thus are exposed to scalar optimizers. In particular this is true for multi-dimensional array accesses where some of these values can be CSEd and are loop-invariant.

Querying offset, size and alias-set of an access is cheaper.

SCCVN can easily handle punning through unions by value numbering type-agnostic and inserting VIEW_CONVERT_EXPRs at replacement time where needed.

Alignment information derived from VRP or by versioning can be kept associated to the memory accesses.

bit-field lowering makes manual bit-field implementation and structure component bit-fields canonical.

Storing the effective alias-set allows for more propagation of address expressions, simplifying the IL.

TARGET_MEM_REF can be simply represented as INDIRECT_MEM_REF <BASE, IDX <OFFSET, INDEX, STEP> > or MEM_REF <SYMBOL, IDX <OFFSET, INDEX, STEP> >.

Disadvantages of MEM_REF

  struct { int a[3]; int b; } m;

With MEM_REF we cannot easily say that m.a[i] does not alias m.b. gcc.dg/tree-ssa/alias-10.c is exactly this case. The proposal is to keep persistent value range information for i to handle this case. Zdeneks earlier proposal included the array size per array ref operand (http://gcc.gnu.org/ml/gcc/2007-04/msg00096.html).

Open Questions

We preserve IDX_EXPRs once they are not fully constant. So we even get things like

  D.1189_4 = D.1188_2 + i_3(D);
  D.1193_5 = IDX <0 + D.1189_4 * 1>;
  D.1190_6 = MEM <char {0}, &regs_ever_live + D.1193_5>;

where D.1193_5 is a simple (obfuscated) copy of D.1189_4. Still it makes it easier to recognize an array access this way. And it would be consistent with the IL for a constant but non-1 stride.

The whole issue about invariant addresses with offsets is still open. At the moment &a p+ CST is considered a GIMPLE invariant just like &a[CST] is in the current GIMPLE IL. But this is a nested tree (ADDR_EXPR inside POINTER_PLUS_EXPR) which is not too nice, a ADDR_EXPR with an offset operand or a ADDR_PLUS_EXPR would solve this (minor) issue.

Branch Status

The branch bootstraps ok for C, C++ and Fortran on x86_64 with multilibs on. The C execute.exp and compile.exp tortures are clean (the builtins.exp torture shows missed optimizations). At this point no regressions in the C testsuite should be checked for any patch.

Implementation TODO

The following still needs to be done ontop of the current patch:

The following basic optimizations are missing:

The following optimization passes need rewrite or adjustments:

None: MemRef (last edited 2010-07-01 09:07:27 by RichardGuenther)