This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [gsoc 2014] moving fold-const patterns to gimple


On Fri, Mar 14, 2014 at 4:31 PM, Prathamesh Kulkarni
<bilbotheelffriend@gmail.com> wrote:
> On Thu, Mar 13, 2014 at 4:44 PM, Richard Biener
> <richard.guenther@gmail.com> wrote:
>> On Tue, Mar 11, 2014 at 12:20 PM, Richard Biener
>> <richard.guenther@gmail.com> wrote:
>>> On Mon, Mar 10, 2014 at 7:29 PM, Prathamesh Kulkarni
>>> <bilbotheelffriend@gmail.com> wrote:
>>>> Hi Richard,
>>>> Sorry for the late reply. I would like to have few clarifications
>>>> regarding the following points:
>>>>
>>>> a) Pattern matching: Currently, gimple_match_and_simplify() matches
>>>> patterns one-by-one. Could we use a decision tree to do the matching
>>>> instead (similar to insn-recog.c) ?
>>>> For the moment, let's consider pattern matching on only unary
>>>> expressions without valueize and predicates:
>>>> pattern 1: (negate (negate @0))
>>>> pattern 2: (negate (bit_not @0))
>>>>
>>>> from the two AST's corresponding to patterns (match::op), we can build
>>>> a decision tree:
>>>> Some-thing similar to:
>>>>                        NEGATE_EXPR
>>>>         NEGATE_EXPR            BIT_NOT_EXPR
>>>>
>>>> and then generate code corresponding to this decision tree in gimple-match.c
>>>> so the generated code should look something similar to:
>>>>
>>>> tree
>>>> gimple_match_and_simplify (enum tree_code code, tree type, tree op0,
>>>> gimple_seq *seq, tree (*valueize)(tree))
>>>> {
>>>>   if (code == NEGATE_EXPR)
>>>>     {
>>>>       tree captures[4] = {};
>>>>       if (TREE_CODE (op0) != SSA_NAME)
>>>>         return NULL_TREE;
>>>>       gimple def_stmt = SSA_NAM_DEF_STMT (op0);
>>>>       if (!is_gimple_assign (def_stmt))
>>>>         return NULL_TREE;
>>>>       tree op = gimple_assign_rhs1 (def_stmt);
>>>>       if (gimple_assign_rhs_code (op) == NEGATE_EXPR)
>>>>         {
>>>>            /* pattern (negate (negate @0)) matched */
>>>>         }
>>>>       else if (gimple_assign_rhs_code (op) == BIT_NOT_EXPR)
>>>>         {
>>>>            /* pattern (negate (bit_not_expr @0)) matched */
>>>>         }
>>>>       else
>>>>            return NULL_TREE;
>>>>      }
>>>>   else
>>>>      return NULL_TREE;
>>>> }
>>>>
>>>> For commutative ops, the pattern can be duplicated by walking the
>>>> children of the node in reverse order.
>>>> (I am not exactly clear so far about representing binary operators in a decision
>>>> tree) Is this the right way to go ? I shall try to shortly post a patch that
>>>> implements this.
>>>
>>> Yes, that's the way to go (well, I'd even use a switch ()).
>>>
>>>> b) Targeting GENERIC, separating AST from gimple/generic:
>>>> For generating a GENERIC pattern should there be another pattern
>>>> something like match_and_simplify_generic ?
>>>
>>> Yes, there is an existing API in GCC for this that operates on GENERIC.
>>> It's fold_unary_loc, fold_binary_loc, fold_ternary_loc.  The interface
>>> the GENERIC match_and_simplify variant provides should match
>>> that one.
>>>
>>>> Currently, the AST data structures (operand, expr, etc.)
>>>> are tied to gimple (gen_gimple_match, gen_gimple_transform).
>>>> We could also have similar functions: gen_generic_match,
>>>> gen_generic_transform for generating GENERIC ?
>>>
>>> Yeah, but I'm not sure if keeping the (virtual) methods for generating
>>> code makes much sense with a rewritten code generator.
>>>
>>>> Instead will it be better if we separate the AST
>>>> from target IR (generic/gimple) and make simplify a visitor on AST
>>>> (make simplify
>>>> abstract class, with simplify_generic and simplify_gimple visitor
>>>> classes that generate corresponding IR code) ?
>>>
>>> Yes.  Keep in mind the current state of genmatch.c is "quick hack
>>> to make playing with the API side and with patterns possible" ;)
>>>
>>>> c) Shall it be a good idea in define_match <name> <pattern>, for
>>>> name to act as a substitute for pattern (similar to flex pattern
>>>> definitions), so the name can be used in bigger patterns ?
>>>
>>> Maybe, I suppose we'll see when adding more patterns.
>>>
>>>> d) This is silly, but maybe use constants to denote corresponding tree nodes ?
>>>> for example instead of { build_int_cst (integer_type_node, 0); }, one
>>>> could directly write 0, to denote a INTEGER_CST node with value 0.
>>>
>>> Yes, that might be possible - though it will require more knowledge
>>> in the pattern matcher (you also want to match '0'?) and the code
>>> generator.
>>>
>>>> e) There was a mention on the thread, regarding testing of patterns
>>>> integrated into DSL. I wasn't able to understand that clearly. Could
>>>> you explain that briefly ?
>>>
>>> DSL?  Currently I'd say it would be nice to make sure each pattern
>>> is triggered by at least one GCC testcase - this requires looking
>>> at a particular pass dump (that of forwprop or ccp are probably most suitable
>>> as they are run very early).  I mentioned the possibility to do offline
>>> (thus not with C testcases) testing but that would require some tool
>>> to do that and it would be correctness testing (some automatic proof
>>> generation tool - ISTR academics have this kind of stuff).  But that was
>>> just an idea.
>>>
>>>> Regarding gsoc proposal, I would like to align it on the following points:
>>>> a) Pattern matching using decision tree
>>>
>>> good.
>>>
>>>> b) Generate GIMPLE folding patterns (tree-ssa-forwprop,
>>>> tree-ssa-sccvn, gimple-fold)
>>>
>>> I'd narrow it down a bit, you can optionally do more if time permits.
>>> I'd say
>>>   0) add basic arithmetic identities (x + 0, x  * 1, x / 1, etc., correctly
>>> for all types - you can look into fold-const.c which handles all of them)
>>>   1) target as much as possible of the existing transforms in forwprop
>>>   2) pieces of fold-const.c: most interesting user is
>>> gimple_fold_stmt_to_constant_1 (used by CCP and SCCVN and maybe
>>> others)
>>>   3) fold-const.c's fold_comparison (and fold_binary parts handling
>>> comparison codes), this is also necessary for quite important parts of
>>> forwprop
>>>
>>>> c) Generate GENERIC folding patterns (fold-const)
>>>> Is that fine ? I would like to hear about any changes that you feel should be
>>>> made.
>>>
>>> This isn't easily distinguishable from b), instead we might want to
>>>
>>>   c) Remove parts of fold-const.c that can be subsumed by the GENERIC
>>> variant of the folding patterns
>>>
>>>> I have been mostly experimenting with the patch, by adding few patterns
>>>> from tree-ssa-forwprop, and shall try to do pattern matching by a decision tree.
>>>> Is there anything else you would like to suggest that
>>>> can help me get comfortable with the code-base and the project-idea and
>>>> could possibly help me strengthen my proposal ?
>>>
>>> Nothing off my head right now - there is always bugzilla to look for
>>> missed optimization bugs.
>>
>> There are two meta-bugs (that link specific ones) for this area:
>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=15459 and
>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19986
>>
> Hi Richard,
> Thanks for pointing out the links. I will try and solve the mentioned bugs
>
> I had a look at PR 14753
> (http://gcc.gnu.org/bugzilla/show_bug.cgi?id=14753) from the first
> link. I have tried to implement those transforms (attached patch,
> stage-1 compiled).
> I have written the transforms to operate on GENERIC.
> Is that correct ?

If that means you wrote a pattern for genmatch then yes.  The transforms
should then in the end operate both on GENERIC and GIMPLE.

> The patterns mentioned in the links were:
> a) (X >> CST1) >= CST2 -> X >= CST2 << CST1
> however, an expression Y >= CST gets folded to Y > CST - 1
> so the transform I wrote:
> (X >> CST1) > CST2 -> X > CST2 << CST1
>
> b) (X & ~CST) == 0 -> X <= CST
> However ~CST gets folded.
>
> so the transform I wrote:
> (X & CST) == 0 -> X <= ~CST (~CST is folded by calling fold_unary)

Good.

> After applying the patch, I got this output for the test-case in the
> PR (169t.optimized):
>
> ;; Function foo (foo, funcdef_no=0, decl_uid=1745, symbol_order=0)
> foo (unsigned int a)
> {
>   <bb 2>:
>   if (a_1(D) > 8)
>     goto <bb 3>;
>   else
>     goto <bb 4>;
>
>   <bb 3>:
>   bar ();
>
>   <bb 4>:
>   return;
>
> }
>
> ;; Function baz (baz, funcdef_no=1, decl_uid=1748, symbol_order=1)
>
> baz (unsigned int a)
> {
>   <bb 2>:
>   if (a_1(D) <= 7)
>     goto <bb 3>;
>   else
>     goto <bb 4>;
>
>   <bb 3>:
>   bar ();
>
>   <bb 4>:
>   return;
>
> }

Note that to test whether the transform applies when we are still
in GENERIC (thus before gimplification) the testcase needs
to see the whole expression in a single statemtent (like the
testcases in the bugreport are written).  To verify it also applies
on GIMPLE while making sure it isn't simplified earlier on GENERIC
you should separate the to simplify pattern to separate statements like
for example

void
foo (unsigned int a)
{
  /* This one is equivalent to a >= (3 << 2).  */
  int tem = a >> 2;
  if (tem >= 3)
    bar ();
}

then fold-const.c will not be able to simplify it but a later GIMPLE
pass like forwprop should be able to do that.

> I have two very fundamental doubts:
>
> a) It appears to me that folding is performed at all levels: GENERIC
> (fold-const), GIMPLE (gimple-fold) and RTL (simplify-rtx). Is that
> because, lowering IR may introduce opportunities for folding?

It's more because each optimization pass may introduce opportunities
for folding and we have optimization passes on GIMPLE and RTL.
And the frontends that operate on GENERIC (C and C++ for example)
need fold-const.c to fold constant initializers like for 'const int i = a - a'.

> So it is needed
> to be performed at each level ?

Yes.  The goal with writing patterns for genmatch is that we need to
write them only once for GENERIC and GIMPLE (targeting RTL at
the same time is much harder).

> Also since RTL expansion is semi-automatic
> (by that I mean custom C code used in .md files to construct RTL), the
> RTL insns may require folding ?

The .md files are actually only used for instruction selection and RTL
expansion.  Folding opportunities arise due to weaknesses in RTL
expansion and RTL optimization passes exposing new opportunities.
RTL actually has a similar pass like GIMPLE forwprop - it's 'combine'
that combines and simplifies up to four RTL instructions to try matching
more complex instructions from the .md descriptions.

> b) Why do we want to target GENERIC too ?

To not regress and to avoid duplicating code when implementing
transforms that fold-const.c currently does on the GIMPLE level.

> If I understand correctly, currently gimple folding is done by
> genericizing (fold_stmt) ?

Indeed most of the folding we do on GIMPLE happens by building
GENERIC, feeding that to fold-const.c and re-gimplifying the result.
That's super-ugly (and not exactly cheap either).  In tree-ssa-forwprop
there are the beginnings of a real "GIMPLE folder" but pattern matching
manually on GIMPLE is awkward and we'd have to duplicate the actual
simplifications on GENERIC and GIMPLE - the goal with genmatch
is to be able to write each transform only once and get the GENERIC
and GIMPLE folder created automatically.

> Why not move off folding entirely to GIMPLE?
> This shall probably be an issue, when front-ends (C for example)
> require to perform constant folding
> (for example expression in case statement), so the required expression
> needs to be gimplified for evaluation.
> So by generating both GENERIC and GIMPLE from a meta-description,
> we avoid the cost of genericizing (as it's currently done),
> and gimplifying an expression for evaluation (if all folding is moved
> to gimple) ? Or have I completely missed the point here ?

No, that's entirely correct.

> Regarding the proposal, I have the following tentative timeline
> in my mind:
>
> Present - May 18: Get familiar with folding patterns (in fold-const,
> tree-ssa-forwprop),
> solve bugs mentioned in the links, start moving simple patterns to match.pd
>
> May 19 - June 27:
> a) Implement pattern matching using decision tree.
> b) Support generating gimple and generic patterns
> (gimple_match_and_simplify, generic_match_and_simplify)
> c) Basic identities (for example the ones in fold_binary_loc:case
> PLUS_EXPR and similar) for all types.
> d) fold_comparison patterns and fold_binary handling of comparison codes.
>
> June 28 - July 16:
> e) Patterns from tree-ssa-forwprop
>
> July 16 - August 11:
> f) Patterns from fold-const.c
>
> August 12 - August 18:
> Fixing bugs, documentation, testing.
> Also I would like to put somewhere "extending meta-description",
> if required for writing patterns.

Yeah, I'd leave some spare time for that in the May 19 - June 27
time frame - I guess that d) may run into some limitations.

> I would like to put "extending the meta-description", if it would be
> needed while writing patterns.
> I assume it shall take time to move most of patterns from
> tree-ssa-forwprop and fold-const so I put them as single tasks in
> respective time periods. If time permits, I would do more. I am ready
> to commit 60-65 hours per week for the project.

Please also add some time to amend the testsuite - most GENERIC
folding in fold-const.c should have a testcase already (ok, that's a bit
optimistic), testcases for folding happening on GIMPLE are rare.

Ideally there should be testcases for the GENERIC and GIMPLE part for
each pattern you add (testcases that the pattern applies).

> I shall get a draft of
> the proposal ready within a couple of days.

Thanks,
Richard.

> Thanks and Regards,
> Prathamesh
>> Richard.
>>
>>> I've created svn//gcc.gnu.org/svn/gcc/branches/match-and-simplify now
>>> and committed my current patch state.  I've used gcc/ChangeLog.mas
>>> to log changes to the branch.
>>>
>>> Thanks,
>>> Richard.
>>>
>>>> Thanks and Regards,
>>>> Prathamesh
>>>>>
>>>>> Richard.
>>>>>
>>>>>>> Thanks,
>>>>>>> Richard.
>>>>>>>
>>>>>>>> I have a fluent grasp on C and working knowledge of flex, bison, C++,
>>>>>>>> POSIX api, binutils and shell scripting (bash),
>>>>>>>> I have been through most of dragon book, built an interpreter
>>>>>>>> for a "c-like" language and a C-code generator for a toy language
>>>>>>>> similar to python.
>>>>>>>>
>>>>>>>> As far as gcc goes, I had attended IIT Bombay gcc workshop 2013:
>>>>>>>> http://www.cse.iitb.ac.in/grc/gcc-workshop-13/
>>>>>>>> and have been through the online docs.
>>>>>>>>
>>>>>>>> I have a couple of one-liner patches committed:
>>>>>>>> http://gcc.gnu.org/ml/gcc-patches/2014-02/msg00490.html
>>>>>>>> http://gcc.gnu.org/ml/gcc-patches/2014-02/msg01144.html
>>>>>>>>
>>>>>>>> A few pending patches:
>>>>>>>> http://gcc.gnu.org/ml/gcc-patches/2014-02/msg00143.html
>>>>>>>> http://gcc.gnu.org/ml/gcc-patches/2014-02/msg00143.html
>>>>>>>> http://gcc.gnu.org/ml/gcc-patches/2014-02/msg01220.html
>>>>>>>> http://gcc.gnu.org/ml/gcc/2014-01/msg00268.html
>>>>>>>>
>>>>>>>> and a couple of rejected ones:
>>>>>>>> http://gcc.gnu.org/ml/gcc-patches/2014-02/msg00957.html
>>>>>>>> http://gcc.gnu.org/ml/gcc-patches/2014-02/msg01366.html
>>>>>>>>
>>>>>>>> Please do let me know what you think.
>>>>>>>>
>>>>>>>> Thanks and Regards,
>>>>>>>> Prathamesh
>>>>>>>>
>>>>>>>>> Richard.
>>>>>>>>>
>>>>>>>>>> On the following test-case:
>>>>>>>>>> int main()
>>>>>>>>>> {
>>>>>>>>>>   int a, b, c;
>>>>>>>>>>   a = ~~~~b;
>>>>>>>>>>   c = ~-a;
>>>>>>>>>>   return 0;
>>>>>>>>>> }
>>>>>>>>>>
>>>>>>>>>> The following GIMPLE is generated:
>>>>>>>>>> main ()
>>>>>>>>>> gimple_bind <
>>>>>>>>>>   int D.1748;
>>>>>>>>>>   int D.1749;
>>>>>>>>>>   int D.1750;
>>>>>>>>>>   int D.1751;
>>>>>>>>>>   int D.1752;
>>>>>>>>>>   int a;
>>>>>>>>>>   int b;
>>>>>>>>>>   int c;
>>>>>>>>>>
>>>>>>>>>>   gimple_assign <var_decl, D.1749, b, NULL, NULL>
>>>>>>>>>>   gimple_assign <var_decl, a, D.1749, NULL, NULL>
>>>>>>>>>>   gimple_assign <plus_expr, c, a, -1, NULL>
>>>>>>>>>>   gimple_assign <integer_cst, D.1752, 0, NULL, NULL>
>>>>>>>>>>   gimple_return <D.1752>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> The patch generates two tuples for a = ~~~~b,
>>>>>>>>>> where only one is needed, and extra temporaries, which
>>>>>>>>>> are not removed after the folding. How should I go about
>>>>>>>>>> removing that (should I worry about that since subsequent passes,
>>>>>>>>>> shall remove those ?)
>>>>>>>>>>
>>>>>>>>>> b) Some front-ends, C, for example, requires constant folding in certain places,
>>>>>>>>>> like case statement. If constant folding is completely moved off to gimple,
>>>>>>>>>> how shall this be handled ? Shall we gimplify the expression immediately if
>>>>>>>>>> it's required to be evaluated ?
>>>>>>>>>>
>>>>>>>>>> Thanks and Regards,
>>>>>>>>>> Prathamesh


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]