[Design notes, RFC] Address-lowering prototype design (PR46556)

Fri Jun 10 15:47:00 GMT 2011

On Tue, 2011-06-07 at 16:49 +0200, Richard Guenther wrote:
> On Tue, Jun 7, 2011 at 4:14 PM, William J. Schmidt
> <wschmidt@linux.vnet.ibm.com> wrote:

<snip>

> >> > Loss of aliasing information
> >> > ============================
> >> > The most serious problem I've run into is degraded performance due to poorer
> >> > instruction scheduling choices.  I tracked this down to
> >> > alias.c:nonoverlapping_component_refs_p.
> >> >
> >> > This code proves that two memory accesses don't overlap by attempting to prove
> >> > that they access different fields of the same structure.  This is done using
> >> > the MEM_EXPRs of the two rtx's, which record the expression trees that were
> >> > translated into the rtx's during expand.  When address lowering is not
> >> > present, a simple COMPONENT_REF will appear in the MEM_EXPR:  x.a, for
> >> > example.  However, address lowering changes the simple COMPONENT_REF into a
> >> > [TARGET_]MEM_REF that is no longer necessarily identifiable as a field
> >> > reference.  Thus the aliasing machinery can no longer prove that two such
> >> > field references are disjoint.
> >> >
> >> > This has severe consequences for performance, and has to be dealt with if
> >> > address lowering is to be successful.
> >> >
> >> > I've worked around this with an admittedly fragile solution; I'll discuss the
> >> > drawbacks below.  The idea is to construct a mapping from replacement mem_refs
> >> > to the original expressions that they replaced.  When a MEM_EXPR is being set
> >> > during expand, we first look up the mem_ref in the mapping.  If present, the
> >> > MEM_EXPR is set to the original expression, rather than to the mem_ref.  This
> >> > essentially duplicates the behavior in the absence of address lowering.
> >>
> >> Ick.  We had this in the past via TMR_ORIGINAL which caused all sorts
> >> of problems.  Removing it didn't cause much degradation because we now
> >> preserve points-to information.
> >>
> >> Originally I played with lowering all memory accesses to MEM_REFs
> >> (see the old mem-ref branch), and the loss of type-based alias
> >> disambiguation was indeed an issue.
> >>
> >> But - I definitely do not like the idea of preserving something similar
> >> to TMR_ORIGINAL.  Instead we can try preserving some information
> >> we derive from it.  We keep the original access type that we can use
> >> for TBAA but do not retain knowledge on whether the type of the
> >> MEM_REF is valid for TBAA or if it is view-converted.
> >
> > Yes, I really don't like what I have at the moment, either.  I put it in
> > place as a stopgap to let me proceed to look for other performance
> > problems.
> >
> > The question is how we can infer useful information for TBAA from the
> > MEM_REFs and TMRs.  I poked at trying to identify types and offsets from
> > the MEM_EXPRs, but this ended up being useless; I had to constrain too
> > many cases to maintain correctness, and couldn't prove the type
> > information for the important cases in SPEC I was trying to address.
> >
> > Unfortunately, the whole design goes down the drain if we can't find a
> > way to solve the TBAA issue.  The performance degradations are too
> > costly.
> 
> If you look at what basic TBAA the alias oracle performs then it boils
> down to the fact that get_alias_set for a.b.c might end up using the
> alias-set of the type of c but for MEM[&a + 4] it will use the alias set
> of the type of a.  The tree alias-oracle extracts both alias sets, that
> of the outermost valid type and that of the innermost as both are
> equally useful.  But the MEM_REF (or TARGET_MEM_REF) tree
> only has storage for one such alias-set.  Thus my idea at some point
> was to store the other one as well in some form.  It will not be
> the full information (after all, the complete access path does provide
> some extra information - see aliasing_component_refs_p).

This is what concerns me.  TBAA information for the outer and inner
components doesn't seem sufficient to provide what
nonoverlapping_component_refs_p is currently able to prove.  The latter
searches for a common RECORD_TYPE somewhere along the two access paths,
and then disambiguates if the two associated referenced fields differ.
For a simple case like "struct x { int a; int b; };", a and b have the
same type and alias-set, so the alias-set information doesn't add
anything.  It isn't sufficient alone for the disambiguation of x1.a =
MEM_REF[&x1, 0] and x2.b = MEM_REF[&x2, 4].
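
To make the simple case concrete, here is a toy sketch in C.  It is not
GCC code: struct toy_ref, TOY_ALIAS_SET_INT and the toy_* functions are
hypothetical names invented for illustration, and the sketch assumes both
accesses are expressed against the same base address.  It only shows that
an equal alias-set says nothing about a and b, while base plus offset
does separate them when the base is shared.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct x { int a; int b; };

/* Hypothetical alias-set id shared by every 'int' access in this toy.  */
enum { TOY_ALIAS_SET_INT = 1 };

/* Hypothetical stand-in for a lowered reference: base + offset, plus
   the single alias-set that such a reference has room for.  */
struct toy_ref
{
  const void *base;
  size_t offset;
  size_t size;
  int alias_set;
};

/* Build the toy equivalent of MEM_REF[p, offsetof (struct x, a)].  */
struct toy_ref
toy_ref_x_a (const struct x *p)
{
  assert (p != NULL);
  struct toy_ref r = { p, offsetof (struct x, a), sizeof (int),
                       TOY_ALIAS_SET_INT };
  return r;
}

/* Build the toy equivalent of MEM_REF[p, offsetof (struct x, b)].  */
struct toy_ref
toy_ref_x_b (const struct x *p)
{
  assert (p != NULL);
  struct toy_ref r = { p, offsetof (struct x, b), sizeof (int),
                       TOY_ALIAS_SET_INT };
  return r;
}

/* Alias-set test alone: equal sets mean "may alias", so nothing is
   proven for two int fields.  */
bool
toy_alias_sets_conflict (const struct toy_ref *r1, const struct toy_ref *r2)
{
  return r1->alias_set == r2->alias_set;
}

/* Offset test: proves disjointness only when the bases are identical
   and the byte ranges do not intersect.  With different bases it
   proves nothing and returns false.  */
bool
toy_offsets_prove_disjoint (const struct toy_ref *r1,
                            const struct toy_ref *r2)
{
  if (r1->base != r2->base)
    return false;
  return r1->offset + r1->size <= r2->offset
         || r2->offset + r2->size <= r1->offset;
}
```

For one object, the references built by toy_ref_x_a and toy_ref_x_b
conflict on alias-set but are separated by the offset test; for two
different base pointers neither test proves anything.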

Obviously the offset is sufficient to disambiguate for this simple case
with a common base type, but when the shared record types aren't at the
outermost level, we can't tell whether the offset alone is sufficient.
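
A toy model of the access-path argument may help here.  Again this is
not the code in alias.c: the toy_step and toy_path types, the enum ids,
and toy_paths_prove_disjoint are all hypothetical, and the sketch
ignores unions, bitfields and every other corner case the real routine
has to handle.  It assumes an access such as p->in.a is recorded as a
list of (record type, field) steps, and reports disjointness when some
record type appears on both paths with different fields selected.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical ids for the record types and fields in this toy.  */
enum toy_record { TOY_OUTER1, TOY_OUTER2, TOY_INNER };
enum toy_field { TOY_FIELD_0, TOY_FIELD_1 };

/* One step of an access path: "field FIELD of record type RECORD".  */
struct toy_step
{
  enum toy_record record;
  enum toy_field field;
};

/* A whole access path, outermost step first.  A lowered reference
   that kept no path is modelled as len == 0.  */
struct toy_path
{
  const struct toy_step *steps;
  size_t len;
};

/* Return true if a common record type is found on both paths and the
   fields selected from it differ.  Return false when nothing can be
   proven, which includes the case where either path is empty.  */
bool
toy_paths_prove_disjoint (const struct toy_path *p1,
                          const struct toy_path *p2)
{
  assert (p1 != NULL && p2 != NULL);
  for (size_t i = 0; i < p1->len; i++)
    for (size_t j = 0; j < p2->len; j++)
      if (p1->steps[i].record == p2->steps[j].record
          && p1->steps[i].field != p2->steps[j].field)
        return true;
  return false;
}
```

With paths { (TOY_OUTER1, 1), (TOY_INNER, 0) } and
{ (TOY_OUTER2, 0), (TOY_INNER, 1) } the shared inner record yields a
proof even though the outer types and the byte offsets from each base
differ.  Once either path is dropped (len == 0), the function can prove
nothing, which is the degradation described above.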

At the moment I don't see how we can avoid degradation unless we keep
the full access path around somewhere, for [TARGET_]MEM_REFs built from
COMPONENT_REFs.  I hope I'm wrong.

> 
> Btw, I'm looking at lowering bitfield accesses to read-modify-write
> cycles and in that context also to lowering unaligned accesses
> for targets that do not support them.
> 
> Richard.
> 
> > <snip>
> >
> > Thanks,
> > Bill
> >
> >