This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: Revision 107218 changed addressing mode generation


On Sun, 29 Mar 2009, Richard Guenther wrote:

> On Sun, Mar 29, 2009 at 10:46 PM, Bernd Schmidt <bernds_cb1@t-online.de> wrote:
> > Richard Guenther wrote:
> >>
> >> I think CSE added/removed by either canonical form is less important
> >> than things like SCEV analysis being unaffected and IVOPTs still
> >> creating proper induction variables on the tree level.
> >>
> >> Of course a canonical form should minimize the number of operations,
> >> so folding i * 4 + j * 4 to (i + j) * 4 should still be done.  But
> >> folding i * 4 + j * 2 to (i * 2 + j) * 2 and not i * 4 + 4 to (i + 1) * 4
> >> looks strange.
> >>
> >> I have been resistant to change this canonicalization simply because
> >> the current one seems to work well with the tree loop optimizers and
> >> data-dependence analysis nowadays, thus also my hint at doing
> >> such change during stage1 instead of stage4 (after all this "problem"
> >> now exists since 4.1 ...)
> >>
> >> Implementing the change should be straight-forward (we should
> >> make sure the tree reassociation pass agrees with it).
> >
> > So, how do we proceed?  IMO the canonical form should be that addition of a
> > constant is always the outermost operation.  That seems to be what the
> > EXPAND_SUM machinery expects, and what extract_muldiv wants to do. It's also
> > IMO most likely to be helpful when generating addressing modes.
> >
> > The patch below should do that, but unfortunately:
> > +FAIL: gcc.dg/vect/vect-103.c scan-tree-dump-times vect "dependence distance
> > modulo vf == 0" 1
> 
> *((int *)p + x + i) = *((int *)p + x + i + 8);
> 
> > +FAIL: gcc.dg/vect/no-vfa-vect-102.c scan-tree-dump-times vect "possible
> > dependence between data-refs" 1
> 
> *((int *)p + x + i + 1) = *((int *)p + x + i);
> 
> these are exactly cases that I was concerned about.

Ok, the main issue is that with the canonicalization

 *((int *)p + (x + i + 8) * 4)

we can re-construct an array access from

  D.2535_23 = &p_5->a[0];
  D.2536_24 = (long unsigned int) x_17(D);
  D.2537_25 = (long unsigned int) i_2;
  D.2538_26 = D.2536_24 + D.2537_25;
  D.2541_27 = D.2538_26 + 8;
  D.2542_28 = D.2541_27 * 4;
  D.2543_29 = D.2535_23 + D.2542_28;
  D.2544_30 = *D.2543_29;

during forwprop (p_5->a[D.2541_27]) but with the canonicalization

 *((int *)p + (x + i) * 4 + 32)

this fails because in

  D.2535_23 = &p_5->a[0];
  D.2536_24 = (long unsigned int) x_17(D);
  D.2537_25 = (long unsigned int) i_2;
  D.2538_26 = D.2536_24 + D.2537_25;
  D.2539_27 = D.2538_26 * 4;
  D.2541_28 = D.2539_27 + 32;
  D.2542_29 = D.2535_23 + D.2541_28;
  D.2543_30 = *D.2542_29;

the offset addition to the base pointer is not multiplied by the
array element size.

Thus later data dependence analysis gets confused.
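
As a rough illustration of why the two forms differ only structurally, the
following C sketch computes the same address three ways.  The struct layout
(an int array member "a" of 64 elements) and the function names are
assumptions made up for this example; only the shape of the expressions
follows the dumps above.  With a 4-byte int, 8 * sizeof (int) is the
constant 32 from the second form.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout; the real testcase only tells us p_5->a is an int array. */
struct s { int a[64]; };

/* Form 1: base + (x + i + 8) * elemsize.  The whole offset is a multiple
   of the element size, so it maps directly onto an array index. */
uintptr_t addr_scaled_sum(const struct s *p, size_t x, size_t i)
{
    return (uintptr_t)&p->a[0] + (x + i + 8) * sizeof(int);
}

/* Form 2: base + (x + i) * elemsize + constant.  Same value, but the
   outermost offset operation is an addition, not a multiplication. */
uintptr_t addr_outer_const(const struct s *p, size_t x, size_t i)
{
    return (uintptr_t)&p->a[0] + (x + i) * sizeof(int) + 8 * sizeof(int);
}

/* The array access that can be reconstructed from form 1. */
uintptr_t addr_array(const struct s *p, size_t x, size_t i)
{
    return (uintptr_t)&p->a[x + i + 8];
}

/* All three must agree for any in-bounds index. */
void check_forms_agree(const struct s *p, size_t x, size_t i)
{
    assert(x + i + 8 < 64);
    assert(addr_scaled_sum(p, x, i) == addr_array(p, x, i));
    assert(addr_outer_const(p, x, i) == addr_array(p, x, i));
}
```

So the values are identical; what forwprop loses in the second form is only
the pattern "base + index * element size" at the top of the expression.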

To fix this in forwprop (and CCP for the invariant address case) we
would need to apply the re-association there, producing new
intermediate stmts, etc. - not necessarily a good idea and a non-trivial
task anyway.

The problem with data-dependence analysis is that for the different
canonicalization we have two DRs with completely different bases:

for the non-array form:

        base_address: (int *) p_5 + (long unsigned int) pretmp.24_57 * 4
        offset from base address: 0
        constant offset from base address: 32
        step: 4
        aligned to: 128
        base_object: *((int *) p_5 + (long unsigned int) pretmp.24_57 * 4)

and for the other access which is in proper array form:

        base_address: p_5
        offset from base address: (<unnamed-signed:64>) ((long unsigned int) pretmp.24_57 * 4)
        constant offset from base address: 0
        step: 4
        aligned to: 4
        base_object: p_5->a[0]

where at least the base_address of the non-array form looks bogus (the
offset part should be in offset really).

Thus, I'm trying to have a look at data-dependence analysis.

But - given the above - wouldn't it be easier to fix the MULT_EXPR
case in expand_expr_real_1 to do the un-canonicalization of
(X + CST1) * CST2 to X * CST2 + CST3?
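
For reference, the identity behind that un-canonicalization can be sketched
in plain C as below, with CST3 = CST1 * CST2.  This is not the expander
code, just a minimal check of the arithmetic; the function names are made
up for the example, and unsigned long is used so that wrap-around is well
defined and the two sides agree for all inputs.

```c
#include <assert.h>

/* (X + CST1) * CST2, the form with the constant folded into the index. */
unsigned long canonical_form(unsigned long x, unsigned long cst1,
                             unsigned long cst2)
{
    return (x + cst1) * cst2;
}

/* X * CST2 + CST3, with CST3 = CST1 * CST2 computed once up front,
   so the constant addition becomes the outermost operation. */
unsigned long expanded_form(unsigned long x, unsigned long cst1,
                            unsigned long cst2)
{
    unsigned long cst3 = cst1 * cst2;
    return x * cst2 + cst3;
}

/* Both forms agree modulo 2^N because unsigned arithmetic distributes. */
void check_distribution(unsigned long x, unsigned long cst1,
                        unsigned long cst2)
{
    assert(canonical_form(x, cst1, cst2) == expanded_form(x, cst1, cst2));
}
```

For the access above, X = x + i, CST1 = 8 and CST2 = 4 give CST3 = 32.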

Thanks,
Richard.
