This is the mail archive of the mailing list for the GCC project.

Re: RFC: [tree ssa] DSE extension

In message <>, Dale Johannesen writes:
 >In C++, it seems that an awful lot of unnecessary stack temp memory
 >accesses are making it through the optimizers.  For example (reduced
 >from eon in SPEC):
 >class YY { public:
 >   YY(const YY &v) { e[0] = v.e[0]; e[1] = v.e[1]; e[2] = v.e[2]; }
 >   double &y() { return e[1]; }
 >   double e[3];  };
 >class XX { public:
 >   YY direction() const { return v; }
 >   YY v;  };
 >int foo(XX& r) {
 >   if (r.direction().y() < 0.000001) return 0;
 >   return 1; }
 >The temporary produced by C++ FE for the result of r.direction() is
 >marked Addressable, as it's passed by address to r.direction().  After
 >inlining, it doesn't need to live in memory any more, but the code in
 >tree-ssa-alias.c that is supposed to figure this out can't deal with
 >the code from y(), which looks like:
 >   this.3<D1528>_11 = (double<D53> *)&<D1502>;
 >       <-- nameless <D1502> is the problem; address taken
 >   T.4<D1529>_12 = this.3<D1528>_11 + 8B;
 >   <D1527>_13 = (double<D53> &)T.4<D1529>_12;
 >   #   VUSE <<D1502>_21>;
 >   T.8<D1505>_16 = *<D1527>_13;
 >       <-- <D1502>.e[1]; doesn't really need to be in memory
 >So the dead stores survive into RTL.  In the example above the RTL
 >optimizers remove them, but that doesn't work in eon, and that's not
 >how to do it anyway.
 >One possibility is to extend DSE to do better on stack temps, by temporarily
 >inserting phony stores in the exit block for all (nonvolatile) automatics
 >(as suggested in Morgan 10.7.2).
 >That will at least get rid of the dead stores, although I don't think
 >it will move the element(s) that are used out of memory.  Am I missing
 >some reason this is a bad idea?  What's a good way to insert such stores?

Just a note: from what I can see, this really isn't a job for DSE.  Adding
a virtual store at the exit points will not result in any additional stores
being removed.

Let's look at the code in more detail:
  # BLOCK 0
  # PRED: ENTRY [100.0%]  (fallthru,exec)
  r.7_2 = (struct XX *)r_1;
  this_3 = (struct XX * const)r.7_2;
  v_4 = (struct YY &)this_3;
  #   VUSE <<D1595>_19>;
  T.0_7 = v_4->e[0];
  #   <D1595>_22 = VDEF <<D1595>_19>;
  <D1595>.e[0] = T.0_7;
  #   VUSE <<D1595>_22>;
  T.1_8 = v_4->e[1];
  #   <D1595>_21 = VDEF <<D1595>_22>;
  <D1595>.e[1] = T.1_8;
  #   VUSE <<D1595>_21>;
  T.2_9 = v_4->e[2];
  #   <D1595>_20 = VDEF <<D1595>_21>;
  <D1595>.e[2] = T.2_9;
  this.3_11 = (double *)&<D1595>;
  T.4_12 = this.3_11 + 8B;
  <D1616>_13 = (double &)T.4_12;
  #   VUSE <<D1595>_20>;
  T.8_16 = *<D1616>_13;
  if (T.8_16 < 9.99999999999999954748111825886258685613938723691e-7) goto <L2>; else goto <L3>;
  # SUCC: 2 [33.0%]  (false,exec) 1 [67.0%]  (true,exec)

Note very carefully the VUSE of D1595_20 -- also note that it follows the
last store into D1595.  Every store into D1595 is therefore still live when
that load executes, so placing a dummy store at the exit points is _not_
going to expose any redundant stores.
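
The point can be checked on a toy model.  What follows is only a sketch, not
GCC's DSE pass: a single backward scan over one straight-line block, where
stores name an element of a variable and loads (like the VUSE above) are
assumed to read the whole variable.  The names Stmt, dead_stores, foo_block,
no_load_block and the phony_exit_store flag are all made up for illustration,
and every variable is assumed to be a nonvolatile automatic.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical toy IR, not a GCC data structure.
struct Stmt {
    enum Kind { Store, Load } kind;
    std::string var;  // variable touched
    int field;        // element stored to; ignored for loads
};

// Returns the indices of stores that one backward scan finds dead.
std::vector<std::size_t> dead_stores(const std::vector<Stmt>& block,
                                     bool phony_exit_store) {
    std::set<std::string> killed_vars;                    // whole variable overwritten later
    std::set<std::pair<std::string, int>> killed_fields;  // single element overwritten later
    std::vector<std::size_t> dead;

    // Model the proposal: a phony store to every variable at the exit.
    if (phony_exit_store)
        for (const Stmt& s : block) killed_vars.insert(s.var);

    for (std::size_t i = block.size(); i-- > 0;) {
        const Stmt& s = block[i];
        if (s.kind == Stmt::Load) {
            // The load may read any element, so all earlier stores are live.
            killed_vars.erase(s.var);
            for (auto it = killed_fields.begin(); it != killed_fields.end();) {
                if (it->first == s.var) it = killed_fields.erase(it);
                else ++it;
            }
        } else {
            assert(s.field >= 0);
            const std::pair<std::string, int> key(s.var, s.field);
            if (killed_vars.count(s.var) || killed_fields.count(key))
                dead.push_back(i);          // overwritten before any load
            else
                killed_fields.insert(key);  // kills earlier stores to this element
        }
    }
    return dead;
}

// Shape of the block above: three element stores, then a load of D1595.
std::vector<Stmt> foo_block() {
    return {{Stmt::Store, "D1595", 0}, {Stmt::Store, "D1595", 1},
            {Stmt::Store, "D1595", 2}, {Stmt::Load, "D1595", -1}};
}

// Contrast case: the same kind of stores with no load after them.
std::vector<Stmt> no_load_block() {
    return {{Stmt::Store, "t", 0}, {Stmt::Store, "t", 1}};
}
```

In this model dead_stores(foo_block(), true) and dead_stores(foo_block(), false)
both come back empty, because the load sits after the last store.  The phony
exit store only pays off in a block like no_load_block(), where nothing reads
the variable after it is written.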

The most fundamental problem here is that we lowered array notation down
to pointer arithmetic.

That in and of itself wouldn't be a huge problem, except that we choose
"this" as the base for the pointer arithmetic.  "this" is not an array
type and thus all the nice code we have to reconstruct array notation from
pointer arithmetic fails miserably.

Our inability to turn the pointer arithmetic back into array notation means
that D1595 must be marked as addressable as we need its address for the
pointer arithmetic.

And as long as D1595 is marked as addressable we're going to be forced to 
actually allocate stack space for it and store values into memory.
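
To make the two access forms concrete, here is a source-level sketch.  It is
not what the front end emits; YY is cut down to a plain struct, and the
helper names y_array, y_pointer, sample and check_same are made up for
illustration.  The byte-offset read assumes e is the first member, which the
static_assert checks.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Cut-down stand-in for the YY class from the test case.
struct YY { double e[3]; };
static_assert(offsetof(YY, e) == 0, "sketch assumes e starts the object");

// Array notation: the access names element 1 of the object directly.
double y_array(const YY& t) { return t.e[1]; }

// Pointer arithmetic from the object's own address, in the spirit of
//   this.3 = (double *)&D1595;  T.4 = this.3 + 8B;  T.8 = *T.4;
// The object's address is needed here, which is what forces it into memory.
double y_pointer(const YY& t) {
    const unsigned char* base = reinterpret_cast<const unsigned char*>(&t);
    double out;
    std::memcpy(&out, base + sizeof(double), sizeof out);  // the "+ 8B" step
    return out;
}

// Hypothetical sample value for trying the two accessors.
YY sample() { return YY{{1.0, 2.5, 3.0}}; }

// Both forms read the same element; only the way it is named differs.
void check_same(const YY& t) { assert(y_array(t) == y_pointer(t)); }
```

Both functions return the same value for any YY.  The difference is only in
what the optimizers see: the first names an element, the second needs the
address of the whole object.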