This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] Operator Strength Reduction on trees


Hello,

> > All the operations can be implemented
> > by a few lines of code using the existing infrastructure.
> > find_or_create_biv_with_evolution just searches the loop of EV
> > for the biv with this evolution...
> 
> What is a BIV and how is it different from an IV?
>
> It is not sufficient to create a new IV with the specified evolution,
> because there are many ways to get the same evolution.  You have to
> make sure the new IV has updates in the right places, so the uses of
> the new IV will have the right values.

I think you may just put all the updates at the end of the loop (or just
before the loop exit).

> > > So perhaps
> > > the main advantage is that this algorithm is implemented now.
> > Ugh... I can implement the described scev-based algorithm in about half
> > an hour...
> > I think it is pretty clear from the metacode I provided; I can send
> > you a full implementation if you have a problem with that.
> 
> I don't have a problem, but I would like to see the full
> implementation. In part because of what I wrote above, I don't think
> it will be so trivial. If I am wrong I would like to learn, and the
> best way is probably to see your code.

OK, I will write down some prototype tomorrow.

> > > It hasn't failed any tests I've
> > > been able to find. What kinds of "conversions and ... tricky parts" do
> > > you expect would be a challenge for it?
> > 
> > See tree-chrec.c:convert_affine_scev, for example.
> 
> Everything there appears to be dealing with the possible wrapping of
> unsigned types. In the OSR implementation, we simply don't bother
> doing LFTR on unsigned types.

You will miss a lot of opportunities, that way.

> > > For you to say that it is not
> > > already useful seems kind of inflammatory, don't you think? Why do you
> > > think the analysis "would have to be improved to make it useful"?
> > It does not handle many of the cases that appear in the real-world
> > code.
> 
> I can't seem to elicit a failure with any code I've found or written,
> "real-world" or otherwise. If you have an example, that might help me.

I do not speak about failures, just about missed optimization
opportunities.

> > > ... Examples are trivial to find, though. Any loop
> > > that needs LFTR will be an example.
> > Well, hardly any, we do LFTR in many cases already.  Again, could
> > you please provide some of the concrete examples?
> 
> OK, I am appending some at the bottom of this mail. One example is
> that the preprocessed C source I've appended, the function
> s_should_lftr_loop gets LFTR if and only if OSR is enabled.

Thanks, I will have a look.

> > 2) Add a separate pass to do LFTR, and remove this ability from
> >    ivopts; i.e., something similar to your pass, but without the
> >    redundant induction variable analysis and strength reduction
> >    parts.
> 
> It is not at all obvious how to do this because LFTR is so
> conceptually bound together with IV analysis. I don't think a
> standalone LFTR pass makes sense because it will necessarily have to
> duplicate work done by other passes (strength reduction in
> particular).

I do not see how LFTR is "conceptually bound together with IV analysis" --
it uses the results of IV analysis, but that's it.

> > 3) As a part of rewrite of ivopts -- I imagine we would like to
> >    have ivopts rewritten as a series of passes working over some
> >    common data structures representing the actual state of the
> >    induction variables and their uses, and LFTR would be one of these
> >    passes.
> 
> The present OSR could potentially fit into such a scheme.

Maybe, however, we would either have to throw away everything we
have done so far and start building on the framework introduced in OSR,
or suffer the code duplication it introduces.  None of that appears too
appealing to me.

> As I've said before, I think the right answer is to integrate SR and
> LFTR with PRE,

As I said before, I will believe that when I see it done :-)

Zdenek

> # 1 "time_lftr.c"
> # 1 "<built-in>"
> # 1 "<command-line>"
> # 1 "time_lftr.c"
> # 1 "time_lftr.h" 1
> 
> 
> 
> int s_should_lftr_loop(signed int m, signed int n);
> int u_should_lftr_loop(unsigned int m, unsigned int n);
> int d_should_lftr_loop(unsigned int m, unsigned int n);
> int hand_lftr_loop(unsigned int m, unsigned int n);
> int hand_lftr_combine_loop(unsigned int m, unsigned int n);
> # 2 "time_lftr.c" 2
> 
> 
> int a[(1024 * 1024)], b, c;
> 
> extern int f(int x, int y);
> 
> int
> u_should_lftr_loop(unsigned int m,
>                    unsigned int n) {
>   unsigned int i, j;
> 
>   for (j = 0; j < m; j += 1) {
>     for (i = 0; i < n; i += 1) {
>       if (f(a[j * n + i], b)) {
>         return a[j * n + i] + b;
>       }
>     }
>   }
>   return 0;
> }
> 
> int
> s_should_lftr_loop(signed int m,
>                    signed int n) {
>   signed int i, j;
> 
>   for (j = 0; j < m; j += 1) {
>     for (i = 0; i < n; i += 1) {
>       if (f(a[j * n + i], b)) {
>         return a[j * n + i] + b;
>       }
>     }
>   }
>   return 0;
> }
> 
> 
> extern int g(double q, int y);
> 
> int
> d_should_lftr_loop(unsigned int m,
>                  unsigned int n) {
>   signed int j;
>   double i;
> 
>   for (j = 0; j < m; j += 1) {
>     for (i = 0.0; i < (double) n; i += 1.0) {
>       g(i * 4.0, b);
>       if (f(a[j * n + (int) i], b)) {
>         return a[j * n + (int) i] + b + g(i, b);
>       }
>     }
>   }
>   return 0;
> }
> 
> int
> hand_lftr_loop(unsigned int m,
>                unsigned int n) {
>   int *pa;
> 
>   for (pa = a; pa != a + (m * n); ) {
>     int *pa_end = pa + n;
>     for ( ; pa != pa_end; pa += 1) {
>       if (f(*pa, b)) {
>         return *pa + b;
>       }
>     }
>   }
>   return 0;
> }
> 
> 
> int
> hand_lftr_combine_loop(unsigned int m,
>                        unsigned int n) {
>   int *pa = a;
> 
>   for (pa = a; pa != a + (m * n); pa += 1) {
>     if (f(*pa, b)) {
>       return *pa + b;
>     }
>   }
>   return 0;
> }


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]