This is the mail archive of the mailing list for the GCC project.

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: making the new if-converter not mangle IR that is already vectorizer-friendly

[Abe wrote:]
After finishing fixing the known regressions, I intend/plan to reg-test for AArch64;
after that, I think I`m going to need some community help to reg-test for other ISAs.

[Alan wrote:]
OK, I'm confused. When you write "making the new if-converter not mangle IR"...
> does "the new if-converter" mean your scratchpad fix to PR46029

Basically, yes.  The code I currently have is, IMO, sufficiently-{changed/rewritten/replaced}
relative to {the code currently in the file at that path in trunk} that I am referring to
the pre-patch code in "tree-if-conv.c" as "the old if converter" and referring to the code
in the version of the file at that path in my Git GCC checkout as "the new if converter".

[Alan wrote:]
or is there some other new if-conversion phase that you are still working on and haven't posted yet?

No; as of now, Sebastian and I are not working on adding any more passes or phases: we are just trying to
replace the old, potentially-unsafe, gives-up-more-easily converter with the new one that uses the scratchpad
idea both to produce safer conversions and to be able to convert a greater percentage of the time.

The only significant difficulty with the preceding at this time, AFAIK, is that the work-in-progress of the
new converter produces {too many levels deep} of indirections, so the vectorizer gives up trying to vectorize
the result[s] of the conversion, which makes the whole thing a big "fail" [not in the DejaGNU sense,
although pretty close].  That`s why I would love to have some help in fixing the regressions.

[Alan wrote:]
> I haven't yet understood what you mean about "vectorizer-friendly" IR being mangled; is the problem that your
> new [â] transforms IR that can currently be if-converted by the existing phase, into IR that can't? (Example?)

Not exactly, but close.

"TLDR" for the following [from "vvv" to "^^^"]: "too much unnecessary indirection is being added".

--- vvv --- TLDR`d above --- vvv ---

Let`s pretend the programmer of the code being compiled knows all about if-conversion and vectorization and does the
conversion _manually_.  In other words, the C/C++/etc. code passed in to the compiler _already_ looks something like:

  temp0 = compute_condition();
  temp1 = foo();
  temp2 = bar();
  result = temp0 ? temp1 : temp2;

IOW, essentially IR/assembler written in C.  Since the C code is lowered to
GIMPLE and the '?' operator is lowered [too early IMO *] to blocks and "goto"s,
the if-conversion pass no longer "sees" "result = temp0 ? temp1 : temp2;" --
instead, it sees something like:

  if temp0
    goto <bb 4>
    goto <bb 5>

  /* --- bb 4 --- */
  result = temp1;
  goto <bb 6>

  /* --- bb 4 --- */
  result = temp2;

  /* --- bb 6 --- */

â which the old if converter handles in such a way as to
keep the vectorizer happy, but the new one does not yet.

'*': another issue to discuss and something that IMO should be
     fixed/improved in GCC but outside of "tree-if-conv.c"

The new if converter is doing something at least similar to "ooh!  Ooh!
The programmer wrote a valid-conversion-candidate ''if'' that I know how to convert!"
and messes it up; IIRC the way in which this gets messed up is something like:

  _ifcvt_temp_0 = temp0;
  _ifcvt_temp_1 = temp1;
  _ifcvt_temp_2 = temp2;
  result = _ifcvt_temp_0 ? _ifcvt_temp_1 : _ifcvt_temp_2;

â at which the vectorizer turns up its nose, says "too much indirection" and/or "too much gather",
and doesn`t vectorize --  for a {source code, target architecture} pair for which the code can
and should be vectorized by GCC.  In particular, the old vectorizer handles it well IIRC,
probably b/c the old vectorizer has less indirection overhead, which it can
easily "afford" since it never adds an indirection through a scratchpad.

I think what is needed here is basically to reduce the indirection overhead in the new converter
by making the new converter realize "oh, <foo> is already pure and thread-local, so I don`t need
to copy <foo> into a temporary before using it".  At least, that seems to me like _one_ way to
fix this category of regressions in the new converter.  Another way is to make GCC overall not
convert e.g. "x ? y : z" into basic blocks etc. so early all of the time; my impressions is
that GCC is not doing both of {inspecting the purity, analyzing the cost} of the expressions
and only converting into basic blocks etc. when e.g. 'y'/'z'/both is/are impure and/or
at least one of {'y', 'z'} is an "expensive" thing to compute.  For low-cost pure expressions,
e.g. "x ? y : z" should be retained as-as IMO -- i.e. not lowered into separate BBs --
for as long as possible.  Ideally, it is encoded as a "COND_EXPR" and stays that way for
as long as possible when there is no purity/high-computational-cost problem with either the
second or the third param.  In the worst case, this could involve fixing/improving several
front ends, so I want to push this off into a separate subproject from the if conversion itself.

--- ^^^ --- TLDR`d above --- ^^^ ---

I hope the above is helpful, but since it`s from memory I don`t
guarantee that it`s both 100% accurate and 100% complete.  ;-)

[Alan wrote:]
Then I might (only "might", sorry!) be able to help...

Great!  Thanks.  :-)

Ideally, I/we fix the above problem -- and the rest of the regressions in the new if converter --
without any significant changes to core GCC files outside of "tree-if-conv.c".  IOW, I`d like
to minimize the invasiveness of this patch and get rid of {as many regressions as possible}
the fixes for which lie entirely inside "tree-if-conv.c" before proceeding to fix/improve the
"lowered too early" problem that I perceive current GCC trunk has in its lowering of "x ? y : z".
I`m almost 100% certain that fixing/improving the lowering will require significant alterations
to code in files other than "tree-if-conv.c", so even though that fix/improvement would likely
fix at least one regression in the new if converter, I`d rather do it separately/later/both.

In particular, I remember that "result = condition ? array1[index] : array2[maybe the same index, maybe not]"
is being converted too early IMO.  IOW, somewhere in GCC an array dereference is being considered
as either impure, too-expensive, or both.  "array[index]" in C [not in C++!: operator overloading]
AFAIK is always pure and is always low-cost whenever the index expression is pure and low-cost.



Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]