This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



Aliasing rules for unannotated SYMBOL_REFs


TL;DR: if we have two bare SYMBOL_REFs X and Y, neither of which has an
associated source-level decl and neither of which is in an anchor block:

(Q1) can a valid byte access at X+C alias a valid byte access at Y+C?

(Q2) can a valid byte access at X+C1 alias a valid byte access at Y+C2,
     C1 != C2?

Also:

(Q3) If X has a source-level decl and Y doesn't, and neither of them is
     in an anchor block, can valid accesses based on X alias valid accesses
     based on Y?

(well, OK, that wasn't too short either...)

The reason for asking is that memrefs_conflict_p seems to have an
odd structure.  It first checks whether two addresses based on
SYMBOL_REFs refer to the same object, with a tristate result:

      int cmp = compare_base_symbol_refs (x,y);

AFAICT the return values mean:

  1: the SYMBOL_REFs are known to be equal
  0: in-range accesses based on X cannot alias in-range accesses based on Y
 -1: all other cases

If the addresses are known to be equal, we can use an offset-based check:

      /* If both decls are the same, decide by offsets.  */
      if (cmp == 1)
        return offset_overlap_p (c, xsize, ysize);

This part seems obvious enough.  But then, apart from the special case of
forced address alignment, we use an offset-based check even for cmp==-1:

      /* Assume a potential overlap for symbolic addresses that went
	 through alignment adjustments (i.e., that have negative
	 sizes), because we can't know how far they are from each
	 other.  */
      if (maybe_lt (xsize, 0) || maybe_lt (ysize, 0))
	return -1;
      /* If decls are different or we know by offsets that there is no overlap,
	 we win.  */
      if (!cmp || !offset_overlap_p (c, xsize, ysize))
	return 0;

So we seem to be taking cmp==-1 to mean that although we don't know
the relationship between the symbols, it must be the case that either
(a) the symbols are equal (e.g. via aliasing) or (b) the accesses are
to non-overlapping objects.  In other words, one of the situations
described by cmp==1 or cmp==0 must be true, but we don't know which
at compile time.

This means that in practice, the answer to (Q1) appears to be "yes"
but the answer to (Q2) appears to be "no".
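
To make that reading concrete, here is a small standalone model of the
control flow quoted above.  It is only a sketch: it uses plain longs
rather than poly_int64, model_offset_overlap_p is a simplified stand-in
for offset_overlap_p, and the final "return -1" is my assumption about
what happens when none of the quoted checks fires.  In this model, c is
the offset of the second access relative to the first.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Simplified stand-in for offset_overlap_p.  The first access covers
   [0, |xsize|) and the second covers [c, c + |ysize|), both measured
   from the same base address.  Taking the magnitude of a negative size
   is an assumption of this model.  */
static bool
model_offset_overlap_p (long c, long xsize, long ysize)
{
  if (c >= 0)
    return labs (xsize) > c;
  return labs (ysize) > -c;
}

/* Model of the quoted SYMBOL_REF handling.  CMP is the tristate result
   described above.  Returns 1 (conflict), 0 (no conflict) or
   -1 (may conflict).  */
static int
model_symbol_conflict (int cmp, long c, long xsize, long ysize)
{
  /* Symbols known to be equal: decide purely by offsets.  */
  if (cmp == 1)
    return model_offset_overlap_p (c, xsize, ysize) ? 1 : 0;

  /* Alignment-adjusted accesses (negative sizes): give up.  */
  if (xsize < 0 || ysize < 0)
    return -1;

  /* Symbols known to be distinct, or offsets that cannot overlap.
     Note that the offset test is applied even when cmp == -1.  */
  if (cmp == 0 || !model_offset_overlap_p (c, xsize, ysize))
    return 0;

  /* Assumed fall-through: unknown relationship, overlapping offsets.  */
  return -1;
}
```

With cmp == -1, two byte accesses at the same offset (c == 0) give -1,
which is the "yes" for (Q1), while byte accesses at different offsets
(say c == 4) give 0, which is the "no" for (Q2).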

This somewhat contradicts:

  /* In general we assume that memory locations pointed to by different labels
     may overlap in undefined ways.  */
  return -1;

at the end of compare_base_symbol_refs, which seems to be saying
that the answer to (Q2) ought to be "yes" instead.  Which is right?

In PR92294 we have a symbol X at ANCHOR+OFFSET that's preemptible.
Under the (Q1)==yes/(Q2)==no assumption, cmp==-1 means that either
(a) X = ANCHOR+OFFSET or (b) X and ANCHOR reference non-overlapping
objects.  So we should take the offset into account when doing:

      if (!cmp || !offset_overlap_p (c, xsize, ysize))
	return 0;

Let's call this FIX1.
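
As a rough illustration of what FIX1 means (not a patch), the sketch
below models only the cmp==-1 case under the "equal or non-overlapping"
assumption.  The function names, the parameter layout and the sign
convention for "distance" (X == ANCHOR + distance in case (a)) are all
hypothetical and chosen just for this example.

```c
#include <stdbool.h>

/* Return true if [a, a + asize) and [b, b + bsize) share a byte.
   Sizes are assumed to be positive.  */
static bool
ranges_overlap (long a, long asize, long b, long bsize)
{
  return a < b + bsize && b < a + asize;
}

/* Hypothetical model of FIX1 for cmp == -1.  The first access is
   XSIZE bytes at X + X_OFF, the second is YSIZE bytes at
   ANCHOR + ANCHOR_OFF.  In case (a), X == ANCHOR + DISTANCE.
   Returns 0 (no conflict) or -1 (may conflict).  */
static int
model_fix1 (long x_off, long xsize, long anchor_off, long ysize,
            long distance)
{
  /* Case (b), non-overlapping objects, can never conflict, so only
     case (a) matters.  Rewrite the X-based access in ANCHOR-relative
     terms before comparing offsets.  */
  long x_rel_anchor = distance + x_off;

  if (!ranges_overlap (x_rel_anchor, xsize, anchor_off, ysize))
    return 0;
  return -1;
}
```

For X at ANCHOR+8, a byte access at X+0 and a byte access at ANCHOR+8
give -1 here.  Passing distance == 0 instead (that is, ignoring the
anchor offset) gives 0 for the same pair, which is the kind of missed
conflict FIX1 is meant to avoid.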

But that then brings us to: why does memrefs_conflict_p return -1
when one symbol X has a decl and the other symbol Y doesn't, and neither
of them is a block symbol?  Is the answer to (Q3) that we allow equality
but not overlap here too?  E.g. a linker script could define Y to be
equal to X, but not to point into a region that contains X at a nonzero
offset?

If so, and if one symbol X is an anchor symbol and the other Y has no
information, we have to assume that the linker script could point Y at
any decl in X's block (even if it can't point Y at a constant offset
from those decls).  So we'd need to skip the offset-based check in that
case at least, unless perhaps the block has a single decl.  Let's call
this FIX2.

FIX2 seems like a strange special case though.

On the other hand, if the answer to (Q2) is supposed to be "yes",
I guess we should remove the cmp==-1 offset check altogether.
Let's call this FIX3.
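
For comparison, a sketch of what FIX3 would look like in a standalone
model.  As before, this uses plain longs instead of poly_int64, the
helper is a simplified stand-in for offset_overlap_p that assumes
positive sizes, and the function names are hypothetical.

```c
#include <stdbool.h>

/* Simplified stand-in for offset_overlap_p with positive sizes: the
   first access covers [0, xsize) and the second covers
   [c, c + ysize), measured from the same base address.  */
static bool
model_offset_overlap_p (long c, long xsize, long ysize)
{
  if (c >= 0)
    return xsize > c;
  return ysize > -c;
}

/* Hypothetical model of FIX3.  Returns 1 (conflict), 0 (no conflict)
   or -1 (may conflict).  */
static int
model_symbol_conflict_fix3 (int cmp, long c, long xsize, long ysize)
{
  /* Symbols known to be equal: offsets decide.  */
  if (cmp == 1)
    return model_offset_overlap_p (c, xsize, ysize) ? 1 : 0;

  /* Symbols known to refer to non-overlapping objects.  */
  if (cmp == 0)
    return 0;

  /* Unknown relationship: offsets tell us nothing, so never use them
     to disambiguate.  */
  return -1;
}
```

Here cmp == -1 gives -1 whether or not the constant offsets differ,
which matches answering "yes" to both (Q1) and (Q2).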

So it looks like there are several "sensible" possibilities:

  Q1  Q2  Q3  | Fixes       | Notes
  ------------+-------------+--------------------------
  yes no  yes | FIX1 + FIX2 | apparently the status quo
  yes yes yes | FIX3        |
  yes no  no  | FIX1        | (N1)
  yes yes no  | FIX3        | (N1)
  no  no  no  | other       | (N2)

(N1) the x_decl && !y_decl and !x_decl && y_decl cases in
     compare_base_symbol_refs are too conservative

(N2) several compare_base_symbol_refs cases are too conservative

Sorry for the overblown write-up.  I was just trying to capture all
the twisty corners I'd turned while working on this PR...

Thanks,
Richard

