This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [4.5] Find more autoinc addressing for induction variables


On Fri, May 15, 2009 at 1:08 PM, Bernd Schmidt <bernds_cb1@t-online.de> wrote:
> Steven Bosscher wrote:
>
>> But in any case I think the patch should not go onto the trunk in its
>> current form for other technical reasons.
>
> And these would be...?

Some examples:

(1) You don't update the pass description at the head of the file, so
you make the pass do something without explaining that in the pass
description.  This is a minor issue, but it is important that the pass
description and implementation match.

(2) Some old-school-isms in the patch (s/BLOCK_NUM/BLOCK_FOR_INSN/ for example).

(3) I still don't understand what happens with the reg_next_* arrays
if you do transformations across basic block boundaries.  To me it
looks like you may end up filling them with insns from different basic
blocks.  I don't know if that is a problem (it probably works thanks
to the "lazy cleanup"), but it is certainly not what was intended in
the code as it is now. Have you looked at this?

(4) You explicitly only allow FORM_POST_INC. Why? (Well, probably
because it is the only interesting case for bfin, but you know what I
mean ;-)

(5) Have you looked at compile time of the painful test cases from the
PRs about the cost of use-def and def-use chains? Comparing just the
test suite and compiler itself is apparently not enough, or we would
not have had those PRs in the first place (for fwprop, Paulo and I
looked at the testsuite, bootstrap times, nillrock, spec2k, etc. and
we never saw the problem). See http://gcc.gnu.org/PR33928 as an
example. From the time report:

 df reaching defs      :  73.40 (21%) usr   6.51 (24%) sys  79.91
(21%) wall       0 kB ( 0%) ggc
 df use-def / def-use chains:  37.99 (11%) usr   9.94 (37%) sys  47.92
(12%) wall       0 kB ( 0%) ggc

(And note that this only looks bad instead of horrible because "tree
CFG cleanup" and "alias stmt walking" together take 37% of the compile
time, too. Without that, RD and UD/DU would be 32% for the RD problem
and 17% for the use-def / def-use problem, or ~50% of the total
compile time of this test case.)


> ?So far you've only given alternatives which aren't
> equivalent and wouldn't work.

And you have not proven that your proposed patch is correct and
beneficial, and that there is no easier way to implement the checks
you are doing.

I am just trying to think about how this can be implemented with
standard cfg analysis instead of your CFG walks.  If you don't want me
to try and help, just say so and I'll just shut up.  But I hope no-one
will approve your patch as-is. And in the mean time, I'll just try to
think of a way to implement this idea of yours in a way that is easier
to understand and hopefully cheaper too.

I still think we could use reg-def and reg-use chains instead of
def-use and use-def chains, for example.


>> FWIW I do understand the important of this optimization and I know
>> auto-inc/dec support in GCC should be improved. But there has to be a
>> cheaper and easier, more generic way
>
> Feel free to suggest one if one occurs to you that actually works. ?As long
> as you don't have a realistic suggestion, "there has to be" isn't patch
> review, it's wishful thinking and obstructionism. ?I'm trying to solve a
> long-standing problem with gcc here.

I'm not trying to obstruct anything. You're acting as if I would have
the powers to obstruct this patch, where in fact I can't approve or
reject the patch. If someone wants to approve your patch in this form,
then so be it.


> We could look for increment/memory reference pairs in the other order, but
> given how most loops are structured, that seems unlikely to trigger very
> often - bivs tend to be incremented at the end.

OK, so perhaps it is not worth it (would deserve a comment, still).
And adding FORM_POST_ADD should not be very difficult

>>> x = &y[0];
>>> for (i = 0; i < n; i++) {
>>> ?for (j = 0; j < m; j++) {
>>> ? foo (*x);
>>> ? if (bar ())
>>> ? ? break;
>>> ? x++;
>>> ?}
>>> }
>>>
>>> If the outer loop isn't there, or the initialization is moved into it, we
>>> can use autoincrement, but here we need to avoid doing so.
>>
>> Would things simplify if you can make sure that B and C are in the
>> same loop? ?This is easy to test (they must have the same
>> loop_father).
>
> That's a necessary condition, and I've experimented with using it to bail
> out early if it isn't true. ?However, it didn't seem to have any effect on
> run time.

OK, that's not really unexpected (the biggest additional costs of your
patch should be in the calculation of the RD and UD/DU problems). But
it brings me back (sorry ;-) to my post-dominator question.

For the example of your patch:

     A
     |
     B <---\
     |     |
     ...   |
     |     |
     C ----/
     |
     ...

AFAICS it would be sufficient to know that B and C are in the same
loop, and C post-dominates B.

For the other example you gave:

     A
     |
     B <-----\
     |       |
     D <---\ |
     |     | |
     C ----/ |
     |       |
     F ------/
     |

The post_inc would not happen because B and C would not be in the same loop.

And for the code example:

x = &y[0];
for (i = 0; i < n; i++) {
 for (j = 0; j < m; j++) {
  foo (*x);
  if (bar ())
    break;
  x++;
 }
}

==> something like this:

     A
     |
     D <-----\
     |       |
     B <---\ |
     |     | |
  /--G     | |
  |  |     | |
  |  C ----/ |
  |  |       |
  \->F ------/
     |


The post_inc would not happen because C does not post-dominate B.

Looking forward to your next counter-example. I was never good at
picturing all the possible situations :-)


> I'll send another one out eventually; I'll probably try for another
> improvement in the heuristic first. ?For testing on the Blackfin, you can
> use either one of the versions I already posted.

OK, thanks.  I want to play with that patch a bit, to try for myself
whether my suggestions actually work. I should have some time for that
this weekend.

Ciao!
Steven


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]