This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: making the new if-converter not mangle IR that is already vectorizer-friendly


[Richard wrote:]
Yes, the GIMPLE if-converter should strictly be a vectorization enabler.
If the "new" if-converter produces code that the vectorizer does not
handle (on the target targeted) then it has done a bad job.

Understood [or at least I _think_ I did ;-)] and agreed.

OTOH, there _are_ things that apparently the vectorizer
_could_ do better, esp. handling scatter/gather.
That improvement might be pending new vector
instructions in the target ISA[s], though...

... YMMV.  ;-)


[Richard wrote:]
I think I fixed this bug.  It was because of an inverted test.
2015-07-10  Richard Biener  <rguenther@suse.de>
         PR tree-optimization/66823
         * tree-if-conv.c (memrefs_read_or_written_unconditionally):
           Fix inverted predicate.

I can confirm that as of that just after that commit, the bug is gone.

If I understand correctly, the bug was in the analysis leading to
the decision to convert or not, such that the fix makes the old
[pre-scratchpad] converter decide "don`t convert".  Is that correct?

Would you like me to include the DejaGNUified form of the related new test case
in the patch even though this bug no longer exists in the "old" converter?
Maybe it`s nice to keep the test for catching possible future regressions.


[Richard wrote:]
It was known to produce store data-races with
> -ftree-loop-if-convert-stores.
That is something the scratchpad use avoids
(at the expense of a lot(?) of vectorization).

TTBOMK, that "expense" is temporary, pending fixes/improvement
to the new converter.  IOW, I don`t think any of the relevant
regressions are as a result of "essential complexity".


[Richard wrote:]
First of all to fix regressions on the load if-conversion side
> you should stop using a scratchpad for loads (do you have one?
> I seem to remember you do from the design description).

Yes, we do have one.

I respectfully disagree with the suggestion to remove all scratchpad use for loads.
That use is what allows us to if-convert -- and therefor vectorize -- expressions that
may result in invalid pointer dereferences for "condition is false" iterations/columns.
AFAIK the old converter -- due to lack of a scratchpad -- must give up and not
convert such cases, thus resulting in code of that kind not getting vectorized.

As of now, the relevant "improvement" in/due-to the new converter is not always
effective, i.e. it does not always result in a vectorization, due to the
current implementation of the scratchpad mechanism being too generic and
adding indirection that confuses the vectorizer.  We intend to fix this.


[Richard wrote:]
Then you should IMHO keep the old store path
> for --param allow-store-data-races=1.

I have discussed this with Sebastian;
he and I formed a plan to allow the following modes:

  * no if conversion [note: %%]
  * if-conversion of loads            only, the old way
  * if-conversion of           stores only, the old way
  * if-conversion of loads and stores,      the old way
  * if-conversion of loads            only, with scratchpads
  * if-conversion of           stores only, with scratchpads
  * if-conversion of loads and stores,      with scratchpads

"%%": to retain as the default in trunk GCC until we have
      enough improvement, e.g. a cost model to decide whether
      or not it`s good to if-convert a particular loop


Once all the known regressions in the new converter have been
fixed and the numbers [e.g. from SPEC] of loops vectorized
looks good, I expect/hope that it will be possible to
remove the old if converter in favor of the new one.

I expect/hope that we will {enable {if conversion} by default
when autovectorization is enabled} once we have a cost model that
prevents {if conversion that is inadvisable due to it promoting
from conditional to unconditional the evaluation of [pure]
expression[s] that is/are too expensive, thereby making the
conversion a "pessimization" instead of a true optimization}.


[Richard wrote:]
I'd like to see a comparison in the number of vectorized loops for SPEC CPU 2006 FP
> with -Ofast -ftree-loop-if-convert-stores with / without the new
> if-converter (and if you like also without -ftree-loop-if-convert-stores).
> You can use -fopt-info-vec and grep for the number of 'loop vectorized' cases.

Please consider it added to my "TO DO" list.
I`ll report back when I have results.

Regards,

Abe


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]