This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [GSoC] decision tree first steps


On 6/11/14, Prathamesh Kulkarni <bilbotheelffriend@gmail.com> wrote:
> On 6/11/14, Richard Biener <richard.guenther@gmail.com> wrote:
>> On Wed, Jun 11, 2014 at 12:53 PM, Richard Biener
>> <richard.guenther@gmail.com> wrote:
>>> On Wed, Jun 11, 2014 at 10:51 AM, Richard Biener
>>> <richard.guenther@gmail.com> wrote:
>>>> On Tue, Jun 10, 2014 at 1:57 PM, Richard Biener
>>>> <richard.guenther@gmail.com> wrote:
>>>>> On Tue, Jun 10, 2014 at 1:06 PM, Prathamesh Kulkarni
>>>>> <bilbotheelffriend@gmail.com> wrote:
>>>>>> On Fri, Jun 6, 2014 at 7:18 PM, Richard Biener
>>>>>> <richard.guenther@gmail.com> wrote:
>>>>>>> On Fri, Jun 6, 2014 at 12:02 PM, Richard Biener
>>>>>>> <richard.guenther@gmail.com> wrote:
>>>>>>>> On Fri, Jun 6, 2014 at 11:02 AM, Prathamesh Kulkarni
>>>>>>>> <bilbotheelffriend@gmail.com> wrote:
>>>>>>>>> On Mon, Jun 2, 2014 at 6:14 PM, Richard Biener
>>>>>>>>> <richard.guenther@gmail.com> wrote:
>>>>>>>>>> On Mon, Jun 2, 2014 at 1:16 PM, Prathamesh Kulkarni
>>>>>>>>>> <bilbotheelffriend@gmail.com> wrote:
>>>>>>>>>>> I have few questions regarding genmatch:
>>>>>>>>>>>
>>>>>>>>>>> a) Why is 4 hard-coded here: ?
>>>>>>>>>>> in write_nary_simplifiers:
>>>>>>>>>>>  fprintf (f, "      tree captures[4] = {};\n");
>>>>>>>>>>
>>>>>>>>>> Magic number (this must be big enough for all cases ...).
>>>>>>>>>> Honestly
>>>>>>>>>> this should be improved (but requires another scan over the
>>>>>>>>>> matcher
>>>>>>>>>> IL
>>>>>>>>>> to figure out the max N used in @N).
>>>>>>>>>>
>>>>>>>>>>> b) Should we add syntax for a symbol to denote multiple
>>>>>>>>>>> operators
>>>>>>>>>>> ?
>>>>>>>>>>> For exampleim in simplify_rotate:
>>>>>>>>>>> (X << CNT1) OP (X >> CNT2) with OP being +, |, ^  (CNT1 + CNT2
>>>>>>>>>>> ==
>>>>>>>>>>> bitsize of type of X).
>>>>>>>>>>
>>>>>>>>>> Something to enhance the IL with, yes.  I'd say we support
>>>>>>>>>>
>>>>>>>>>> (define_op additive PLUS_EXPR MINUS_EXPR POINTER_PLUS_EXPR)
>>>>>>>>>>
>>>>>>>>>> thus,
>>>>>>>>>>
>>>>>>>>>> (define_op <operator-name> op...)
>>>>>>>>>>
>>>>>>>>>>> c) Remove for parsing capture in parse_expr since we reject
>>>>>>>>>>> outermost
>>>>>>>>>>> captured expressions ?
>>>>>>>>>>
>>>>>>>>>> but parse_expr is also used for inner expressions, no?
>>>>>>>>>>
>>>>>>>>>> (plus (minus@2 @0 @1) @3)
>>>>>>>>>>
>>>>>>>>>> should still work
>>>>>>>>>>
>>>>>>>>>>> d) I am not able to follow this comment in match.pd:
>>>>>>>>>>> /* Patterns required to avoid SCCVN testsuite regressions.  */
>>>>>>>>>>>
>>>>>>>>>>> /* (x >> 31) & 1 -> (x >> 31).  Folding in fold-const is more
>>>>>>>>>>>    complicated here, it does
>>>>>>>>>>>      Fold (X << C1) & C2 into (X << C1) & (C2 | ((1 << C1) - 1))
>>>>>>>>>>>      (X >> C1) & C2 into (X >> C1) & (C2 | ~((type) -1 >> C1))
>>>>>>>>>>>      if the new mask might be further optimized.  */
>>>>>>>>>>> (match_and_simplify
>>>>>>>>>>>   (bit_and (rshift@0 @1 INTEGER_CST_P@2) integer_onep)
>>>>>>>>>>>   if (compare_tree_int (@2, TYPE_PRECISION (TREE_TYPE (@1)) - 1)
>>>>>>>>>>> == 0)
>>>>>>>>>>>   @0)
>>>>>>>>>>
>>>>>>>>>> The comment is literally copied from the case I extracted the
>>>>>>>>>> (simplified) variant from fold-const.c.  See lines 11961-12056 in
>>>>>>>>>> fold-const.c.
>>>>>>>>>> It'll be a challenge to implement the equivalent in a pattern ;)
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Decision Tree.
>>>>>>>>>>>     I have tried to come up with a prototype for decision tree
>>>>>>>>>>> (patch attached).
>>>>>>>>>>> For simplicity, it handles patterns involving only unary
>>>>>>>>>>> operators
>>>>>>>>>>> and
>>>>>>>>>>> no predicates
>>>>>>>>>>> and returns false when the pattern fails to match (no goto to
>>>>>>>>>>> match
>>>>>>>>>>> another pattern).
>>>>>>>>>>> I meant to post it only for illustration, and I have not really
>>>>>>>>>>> paid
>>>>>>>>>>> attention to code quality (bad formatting, memory leaks, etc.).
>>>>>>>>>>>
>>>>>>>>>>> * Basic Idea
>>>>>>>>>>> A pattern consists of following parts: match, ifexpr and result.
>>>>>>>>>>> Let's call <ifexpr, result> as "simplification" operand.
>>>>>>>>>>> The common prefix between different match operands would be
>>>>>>>>>>> represented
>>>>>>>>>>> by same nodes in the decision tree.
>>>>>>>>>>>
>>>>>>>>>>> Example:
>>>>>>>>>>> (negate (bit_not @0))
>>>>>>>>>>> S1
>>>>>>>>>>>
>>>>>>>>>>> (negate (negate @0))
>>>>>>>>>>> S2
>>>>>>>>>>>
>>>>>>>>>>> S1, S2 denote simplifications for the above patterns
>>>>>>>>>>> respectively.
>>>>>>>>>>>
>>>>>>>>>>> The decision tree would look something like
>>>>>>>>>>> (that's the way it gets constructed with the patch):
>>>>>>>>>>>
>>>>>>>>>>>                 dummy/root
>>>>>>>>>>>                         |
>>>>>>>>>>>            NEGATE_EXPR
>>>>>>>>>>>              /                  \
>>>>>>>>>>>      BIT_NOT           NEGATE_EXPR
>>>>>>>>>>>            |                         |
>>>>>>>>>>>          @0                     @0
>>>>>>>>>>>            |                         |
>>>>>>>>>>>          S1                      S2
>>>>>>>>>>>
>>>>>>>>>>> a) The children of an internal node are number of decisions that
>>>>>>>>>>> can be taken at that node. In the above case it's 2 for outer
>>>>>>>>>>> NEGATE_EXPR.
>>>>>>>>>>> b) Simplification operand represents leaves of the decision tree
>>>>>>>>>>> c) Instead of having list of heads, I have added one dummy node,
>>>>>>>>>>> and heads become children of these dummy root node.
>>>>>>>>>>> d) Code-gen for non-simplification operands involves generating,
>>>>>>>>>>> "matching" code and for simplification operands involves
>>>>>>>>>>> generating
>>>>>>>>>>> "transform" code
>>>>>>>>>>>
>>>>>>>>>>> * Overall Flow:
>>>>>>>>>>> I guess we would build the decision tree from the AST.
>>>>>>>>>>> So the flow would be like:
>>>>>>>>>>> source -> struct simplify (match, ifexpr, result) -> decision
>>>>>>>>>>> tree
>>>>>>>>>>> -> c code.
>>>>>>>>>>>
>>>>>>>>>>> Something like (in main):
>>>>>>>>>>> decision_tree dt;
>>>>>>>>>>> while (there is another pattern)
>>>>>>>>>>> {
>>>>>>>>>>>   simplify *s = parse_match_and_simplify ();
>>>>>>>>>>>   insert s into decision tree;
>>>>>>>>>>> };
>>>>>>>>>>> So parsing routines are concerned with parsing and building the
>>>>>>>>>>> AST (operand),
>>>>>>>>>>> and not with the decision tree. Is that fine ?
>>>>>>>>>>
>>>>>>>>>> Yes, that's good.
>>>>>>>>>>
>>>>>>>>>>> * Representation of decision tree.
>>>>>>>>>>> A decision tree would need a way to represent language
>>>>>>>>>>> constructs
>>>>>>>>>>> like capture, predicate, etc. so in some ways it would be
>>>>>>>>>>> similar
>>>>>>>>>>> to AST.
>>>>>>>>>>> It
>>>>>>>>>>> In the patch, I have created the following heirarchy:
>>>>>>>>>>> dt_operand: represents a general base "operand" of in decision
>>>>>>>>>>> tree
>>>>>>>>>>> dt_expr: for representing expression. Expression contains
>>>>>>>>>>> operation
>>>>>>>>>>> to be performed (e_operation).
>>>>>>>>>>> dt_capture: analogous to capture.
>>>>>>>>>>> dt_simplify: for representing "simplification" operand.
>>>>>>>>>>> simplification consists of ifexpr and result
>>>>>>>>>>> dt_head: to represent "dummy" root. Maybe a separate class is
>>>>>>>>>>> not
>>>>>>>>>>> needed.
>>>>>>>>>>>
>>>>>>>>>>> * Constructing decision tree from AST
>>>>>>>>>>> The algorithm i have used is similar to inserting string in a
>>>>>>>>>>> trie
>>>>>>>>>>> outlined here: http://en.wikipedia.org/wiki/Trie
>>>>>>>>>>> The difference shall be to traverse AST depth-first rather than
>>>>>>>>>>> traversing the string.
>>>>>>>>>>> Apart from that I guess it would be the same (naturally find and
>>>>>>>>>>> compare operations would be different).
>>>>>>>>>>> I haven't given much thought about this. Currently, I construct
>>>>>>>>>>> decision tree for only patterns with unary operators in an
>>>>>>>>>>> ugly way by "flattening" the AST by walking it in pre-order and
>>>>>>>>>>> storing the nodes in vector in the order they are visited
>>>>>>>>>>>
>>>>>>>>>>> So to construct decision tree from the following AST:
>>>>>>>>>>>           negate
>>>>>>>>>>>               |
>>>>>>>>>>>           bit_not
>>>>>>>>>>>               |
>>>>>>>>>>>              @0
>>>>>>>>>>>
>>>>>>>>>>> AST is flattened by walking in preorder (by
>>>>>>>>>>> walk_operand_preorder)
>>>>>>>>>>> and stored
>>>>>>>>>>> as: [ expr, expr, capture ]
>>>>>>>>>>> and this vector is used to construct decision tree.
>>>>>>>>>>> I did it as a "quick and dirty" way, it has to be changed.
>>>>>>>>>>> And it won't work for patterns with n-ary operators.
>>>>>>>>>>>
>>>>>>>>>>> We should visit each node of AST in preorder, and add that node
>>>>>>>>>>> during traversal to decision tree. I am not yet clear on way of
>>>>>>>>>>> doing that.
>>>>>>>>>>>
>>>>>>>>>>> * Comparing operands
>>>>>>>>>>> How do we compare operands ? We shall need to do this while
>>>>>>>>>>> inserting
>>>>>>>>>>> in decision tree, since if the operand is already inserted we do
>>>>>>>>>>> not create
>>>>>>>>>>> a new node.
>>>>>>>>>>> In the patch (cmp_operand), I have used following rules:
>>>>>>>>>>> a) if types not equal, they are clearly not equal.
>>>>>>>>>>> b) if type is expr, compare operation->op->id
>>>>>>>>>>> c) if type is capture, compare the number.
>>>>>>>>>>> d) for predicate, compare on ident ?
>>>>>>>>>>>
>>>>>>>>>>> * Representing patterns with n-ary operators.
>>>>>>>>>>> Consider following two match operands:
>>>>>>>>>>> (plus (minus @0 @1) @1)
>>>>>>>>>>> (plus (negate @0) @1)
>>>>>>>>>>>
>>>>>>>>>>> In decision tree, it would be represented as:
>>>>>>>>>>>
>>>>>>>>>>>             *********plus*********
>>>>>>>>>>>                /    \          /        \
>>>>>>>>>>>           minus  @1  negate  @1
>>>>>>>>>>>             /     \           |
>>>>>>>>>>>         @0   @1       @0
>>>>>>>>>>>
>>>>>>>>>>> plus has got 4 children - (minus @0 @1), @1, (negate @0), @1.
>>>>>>>>>>> However (minus @0 @1) and @1 are not separate "decisions", but
>>>>>>>>>>> children of plus within the same expression.
>>>>>>>>>>>
>>>>>>>>>>> For calculating number of decisions at an expression node:
>>>>>>>>>>> number of decisions =  number of children / arity of operator.
>>>>>>>>>>> in the above case, number of decision = 4 / 2 = 2
>>>>>>>>>>>
>>>>>>>>>>> For other cases, for instance capture node (the children would
>>>>>>>>>>> be
>>>>>>>>>>> simplification operand),
>>>>>>>>>>> number of decisions = number of children.
>>>>>>>>>>
>>>>>>>>>> Hmm.  I think it should be only two children from the plus,
>>>>>>>>>> as pre-order for the first pattern is for example
>>>>>>>>>> [plus minus @0 @1 @1] so you'd have
>>>>>>>>>>
>>>>>>>>>>    plus
>>>>>>>>>>    /     \
>>>>>>>>>> minus  negate
>>>>>>>>>> |           |
>>>>>>>>>> @0     @0
>>>>>>>>>> |           |
>>>>>>>>>> @1     @1
>>>>>>>>>> |
>>>>>>>>>> @1
>>>>>>>>>>
>>>>>>>>>> instead?
>>>>>>>>>>
>>>>>>>>>> Note that you probably have to deal with non-matching capture
>>>>>>>>>> IDs by re-writing them on-the-fly.  That is, the two
>>>>>>>>>> pre-oder traversals [plus minus @1 ...] and [plus minus @0 ...]
>>>>>>>>>> should commonize with renaming the non-matching capture
>>>>>>>>>> somehow.
>>>>>>>>> Um, could you please elaborate on that (non-matching capture ?),
>>>>>>>>> I am not sure if I understood it fully, thanks.
>>>>>>>>
>>>>>>>> I'm worried about
>>>>>>>>
>>>>>>>> (plus (minus@1 @2 @3) @2)
>>>>>>>>
>>>>>>>> vs.
>>>>>>>>
>>>>>>>> (plus (minus @1 @2) @1)
>>>>>>>>
>>>>>>>> (just made-up examples, of course).  Here we'd like to
>>>>>>>> commonize the path checking for (plus (minus ...) ... but
>>>>>>>> of course we can't easily process the captures at that point
>>>>>>>> as the first has two captures (it captures the minus itself while
>>>>>>>> the 2nd doesn't) and the captures that are at the same position
>>>>>>>> use different indexes.  To finally match the minus we have to
>>>>>>>> verify that in one case @2 == @2 and in the other case that
>>>>>>>> @1 == @1.
>>>>>>>>
>>>>>>>> Re-numbering capture indexes in pre-order would improve things
>>>>>>>> for the case where capture positions are in the same places but
>>>>>>>> the indexes do not match.  It won't handle the above case
>>>>>>>> of course.
>>>>>> Thanks, I understand it now -:)
>>>>>
>>>>> I suppose that it would be best (in the end) to not insert the
>>>>> captures into the decision tree separately but leave them
>>>>> "combined" and then only after the decision tree is fully built
>>>>> decide on re-numbering.  And treat those that are required for
>>>>> the matching process differently (those that assert that two
>>>>> ops are equal).
>>>>>
>>>>> I think we can defer the issue to later.
>>>>>
>>>>>>>>
>>>>>>>>> I have attached patch for patterns with binary operators (in
>>>>>>>>> principle
>>>>>>>>> should work for n-ary operators,
>>>>>>>>> but have only tested upto 2), that creates decision tree for
>>>>>>>>> patterns (excluding predicates, and multiple matching). I don't
>>>>>>>>> intend
>>>>>>>>> to commit this (contains many hacks!).
>>>>>>>>>
>>>>>>>>> * Constructing decision tree from AST
>>>>>>>>> We can define our decision tree to store prefixes of preorder
>>>>>>>>> traversals of diffferent AST.
>>>>>>>>> Insertion happens as follows (decision_tree::insert)
>>>>>>>>> a) Get preorder traversal of AST into vector.
>>>>>>>>> b) Insert AST nodes from vector into decision tree.
>>>>>>>>> Currently, I obtain preorder traversal into a vector. We can later
>>>>>>>>> change it to compute
>>>>>>>>> next preorder successor lazily.
>>>>>>>>
>>>>>>>> Just make the code easy to understand, optimizing it should be
>>>>>>>> secondary.
>>>>>>>>
>>>>>>>>> * Code-gen
>>>>>>>>> For n-ary operators, the patch generates code as follows:
>>>>>>>>> if (code == <expr code>)
>>>>>>>>> {
>>>>>>>>>    match operand 0
>>>>>>>>>      match operand 1
>>>>>>>>>         ....
>>>>>>>>>           match operand n-1
>>>>>>>>>             transform
>>>>>>>>>             return true;
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> * Adding parent, level, pos fields to AST (operand).
>>>>>>>>> One change (hack), I have made to code-gen, is naming of operands
>>>>>>>>> that are
>>>>>>>>> assigned to gimple_assign_rhs in code-gen of expressions.
>>>>>>>>> I am not sure if this is really needed.
>>>>>>>>>
>>>>>>>>> in AST, we have following representation for expressions:
>>>>>>>>>                       expr-
>>>>>>>>>                      /       \
>>>>>>>>>                   left      right
>>>>>>>>>                operand  operand
>>>>>>>>>
>>>>>>>>> code-gen off AST (expr::gen_gimple_match) produces code like:
>>>>>>>>> {
>>>>>>>>>   tree op = gimple_assign_rhs1 (def_stmt);
>>>>>>>>>   // code-gen for left operand
>>>>>>>>> }
>>>>>>>>> {
>>>>>>>>>   tree op = gimple_assign_rhs2 (def_stmt);
>>>>>>>>>   // code-gen for right operand
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> We can do this since, in expr::gen_gimple_match, we call
>>>>>>>>> left_operand->gen_gimple_match,
>>>>>>>>> come back to expr::gen_gimple_match (after
>>>>>>>>> left_operand->gen_gimple_match() returns), and
>>>>>>>>> then call right_operand->gen_gimple_match().
>>>>>>>>>
>>>>>>>>> However in decision tree, the expression gets represented as
>>>>>>>>> follows:
>>>>>>>>>            expr
>>>>>>>>>              |
>>>>>>>>>            left-operand
>>>>>>>>>              |
>>>>>>>>>            right-operand
>>>>>>>>>
>>>>>>>>> dt_expr::gen_gimple calls left_operand->gen_gimple, which calls
>>>>>>>>> right_operand->gen_gimple,
>>>>>>>>> so we return to dt_expr::gen_gimple, after code is generated for
>>>>>>>>> the
>>>>>>>>> whole subtree of expr.
>>>>>>>>>
>>>>>>>>> So I assign operands at the start before generating code for the
>>>>>>>>> subtree:
>>>>>>>>> tree op00 = gimple_assign_rhs1 (def_stmt);
>>>>>>>>> tree op10 = gimple_assign_rhs2 (def_stmt);
>>>>>>>>> <code for left-operand>
>>>>>>>>>   <code for right-operand>
>>>>>>>>> each of the operands know their positions, so they know which
>>>>>>>>> operand
>>>>>>>>> to use (get_op_name()).
>>>>>>>>> the names are generated as:
>>>>>>>>> op<position><level>, position = index of operand in it's parent's
>>>>>>>>> expr's vector (vec<operand *> ops).
>>>>>>>>> level: level of AST, at which the operand is stored.
>>>>>>>>>
>>>>>>>>> For this I added three fields to operand (unsigned pos, and
>>>>>>>>> unsigned
>>>>>>>>> level, operand *parent), and accordingly
>>>>>>>>> made changes to expr::append_op. In a way this is abusing the AST.
>>>>>>>>> This probably works (works with test-cases i tried), but I don't
>>>>>>>>> like it much.
>>>>>>>>
>>>>>>>> The naming is ok, but indeed it doesn't feel very clean.  Somehow
>>>>>>>> this
>>>>>>>> belongs to the decision tree instead... (but then you need to
>>>>>>>> lookup
>>>>>>>> the decision tree operand info from code-gen which happens off the
>>>>>>>> AST...).
>>>>>>>>
>>>>>>>>> * Order of matching.
>>>>>>>>> Currently the order of matching operands is strictly 0, 1, .. n -
>>>>>>>>> 1
>>>>>>>>> for n-ary operator.
>>>>>>>>> However this might not be the best choice.
>>>>>>>>>
>>>>>>>>> For example, consider following patterns:
>>>>>>>>> (operator  op1  op) S1
>>>>>>>>> (operator  op2 op)  S2
>>>>>>>>>
>>>>>>>>> Both have common operator, and 1st common operand, they differ in
>>>>>>>>> 0th operand.
>>>>>>>>> With patch, the code generated is:
>>>>>>>>> if (code == <expr-code>)
>>>>>>>>> {
>>>>>>>>>   match op1
>>>>>>>>>      match op
>>>>>>>>>        S1
>>>>>>>>>
>>>>>>>>>   match op2
>>>>>>>>>     match op
>>>>>>>>>       S2
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> A better ordering would be to match the 1st operand and then
>>>>>>>>> respective 0th operands of both patterns:
>>>>>>>>>
>>>>>>>>> if (code == <expr-code>
>>>>>>>>> {
>>>>>>>>>   match op
>>>>>>>>>     match op1
>>>>>>>>>       S1
>>>>>>>>>     match op0
>>>>>>>>>       S2
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> This complicates insertion into decision tree.
>>>>>>>>
>>>>>>>> I'd say we leave "optimizing" the decision tree to a later stage -
>>>>>>>> it's a global
>>>>>>>> optimization problem after all (I believe you can't do optimally
>>>>>>>> with
>>>>>>>> just
>>>>>>>> looking at the present decision tree and the AST you want to
>>>>>>>> insert).
>>>>>>>>
>>>>>>>>> * hack: only one generated gimple_match_and_simplify function
>>>>>>>>> In the patch, I generate code for gimple_match_and_simplify with 3
>>>>>>>>> operands version (op0, op1, op2)
>>>>>>>>> and the other two (unary and binary versions), call it with
>>>>>>>>> NULL_TREE
>>>>>>>>> for extra operands.
>>>>>>>>> I will soon change that.
>>>>>>>>>
>>>>>>>>> * Testing patterns.
>>>>>>>>> I think I have written most of the test-cases wrongly in match-2.c
>>>>>>>>> For example this test-case doesn't match
>>>>>>>>> (i haven't checked in "fix testsuite fallout" patch, so this might
>>>>>>>>> not be true
>>>>>>>>> anymore)
>>>>>>>>>
>>>>>>>>> (match_and_simplify
>>>>>>>>>   plus (minus @0 @1) @1)
>>>>>>>>>   @0)
>>>>>>>>>
>>>>>>>>> for this test-case:
>>>>>>>>> /* (x - y) + y -> x */
>>>>>>>>> int f6(int x, int y)
>>>>>>>>> {
>>>>>>>>>   int t1 = x - y;
>>>>>>>>>   return t1 + y;
>>>>>>>>> }
>>>>>>>>> /* { dg-final { scan-tree-dump "gimple_match_and_simplified to
>>>>>>>>> \[^\n\r\]*= x_\\d\+\\(D\\)" "forwprop1" } } */
>>>>>>>>>
>>>>>>>>> I get following output (tree-forwprop-details):
>>>>>>>>>
>>>>>>>>> ;; Function f6 (f, funcdef_no=0, decl_uid=1744, symbol_order=0)
>>>>>>>>>
>>>>>>>>> gimple_match_and_simplified to _4 = y_2(D) + t1_3;
>>>>>>>>> f (int x, int y)
>>>>>>>>> {
>>>>>>>>>   int t1;
>>>>>>>>>   int _4;
>>>>>>>>>
>>>>>>>>>   <bb 2>:
>>>>>>>>>   t1_3 = x_1(D) - y_2(D);
>>>>>>>>>   _4 = x_1(D);
>>>>>>>>>   return _4;
>>>>>>>>>
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> Shouldn't it show: gimple_match_and_simplified to _4 = x_1(D)
>>>>>>>>> instead ?
>>>>>>>>
>>>>>>>> Yeah.  I suppose you hit the issue that we don't commutate
>>>>>>>> patterns,
>>>>>>>> so try with a commutated pattern inserted.
>>>>>>>>
>>>>>>>>> The issue is this test-case fails in isolation, however it passes
>>>>>>>>> when
>>>>>>>>> it is placed with other
>>>>>>>>> test cases that PASS and have same regex in scan-tree-dump as this
>>>>>>>>> test-case:
>>>>>>>>> scan-tree-dump "gimple_match_and_simplified to \[^\n\r\]*=
>>>>>>>>> x_\\d\+\\(D\\)"
>>>>>>>>>
>>>>>>>>> Summary:
>>>>>>>>> I think we need to (at-least) deal with following problems for
>>>>>>>>> decision tree:
>>>>>>>>
>>>>>>>> I have to leave for a while now, I'll come back later and have a
>>>>>>>> detailed
>>>>>>>> look at the patch to try answer the questions below.
>>>>>>>
>>>>>>> Now for the remainder.
>>>>>>>
>>>>>>>>> a) Representation and construction of decision tree from AST -
>>>>>>>>> We do this by storing prefixes of preorder traversals of AST in
>>>>>>>>> decision tree. Do we finalize on that ?
>>>>>>>
>>>>>>> Yes.
>>>>>>>
>>>>>>>>> b) Order of operand matching
>>>>>>>
>>>>>>> I think that's the same as a) - the visiting order of the AST
>>>>>>> determines
>>>>>>> the order of operand matching.
>>>>>>>
>>>>>>>>> c) Generating patterns for built-in functions (I guess this will
>>>>>>>>> be
>>>>>>>>> similar to expr-gen).
>>>>>>>
>>>>>>> It should really be transparent.
>>>>>>>
>>>>>>>>> d) Multiple matching patterns
>>>>>>>
>>>>>>> The decision tree phase leaves us with a sequence of
>>>>>>>
>>>>>>>   if (if-expr of pattern1)
>>>>>>>     transform-pattern1
>>>>>>>   else (if-expr of pattern2)
>>>>>>>     transform-pattern2
>>>>>>> ...
>>>>>>>
>>>>>>> etc.  thus that part should be basically unchanged (only the match
>>>>>>> part is driven by a decision tree).
>>>>>>>
>>>>>>>>> e) Commutative ops
>>>>>>>
>>>>>>> I'd do these as a pass over the AST of each pattern, duplicating
>>>>>>> patterns.
>>>>>>>
>>>>>>> Btw, I see you mirror the AST structure with
>>>>>>> dt_{node,operand,expr,...}.
>>>>>>> It seems to me that you can simply stick an AST operand * into
>>>>>>> dt_node and be done with just having dt_node?  That is, the AST
>>>>>>> operand refered to contains all necessary information already.
>>>>>>>
>>>>>>>>>
>>>>>>>>> Pattern Enhancements:
>>>>>>>>> Some of these were already discussed, I am jotting them here:
>>>>>>>>> a) Symbol for multiple operators
>>>>>>>>> b) grouping patterns by ifexpr
>>>>>>>>> c) should we have MATCH_FAILED or FAIL to explicitly denote
>>>>>>>>> failure
>>>>>>>>> in
>>>>>>>>> manual transform (c_expr) ?
>>>>>>>>> d) making operators to be commutative.
>>>>>>>
>>>>>>> I'd say d) would be really nice to have.  Either with a grammar
>>>>>>> extension to say which expression should have commutation applied
>>>>>>> to (for example: (plus! (minus @1 @2) @2) would say that the
>>>>>>> operands of plus should be commutated), or with a simple heuristic.
>>>>>>> I'd say explicitly specifying is better.
>>>>>>>
>>>>>>> a) will probably be needed to avoid pattern explosion.  But similar
>>>>>>> to
>>>>>>> d)
>>>>>>> I'd do it as a pre-pass on the AST, duplicating the patterns.  It's
>>>>>>> also a good question how to define its semantic.  The mode-iterators
>>>>>>> that the machine description supports would suggest we do instead
>>>>>>> sth like
>>>>>>>
>>>>>>> (for OP in (plus minus mult)
>>>>>>>   (match_and_simplify
>>>>>>>     (OP @1 integer_zerop)
>>>>>>>     ....))
>>>>>>>
>>>>>>> c) should be easy, I'm going to try sth like that - also setting
>>>>>>> temporaries
>>>>>>> from the ifexpr and re-using them in the result like
>>>>>>>
>>>>>>>  (match_and_simplify
>>>>>>>    (plus @1 @2)
>>>>>>>    if (INTEGRAL_TYPE_P (TREE_TYPE (@2))
>>>>>>>       && find_better_ops (&@3, &@4, @1, @2))
>>>>>>>    (plus @3 @4))
>>>>>>>
>>>>>>> which also requires us to scan the AST for the highest capture index
>>>>>>> (a good thing anyway, given my hack to statically set the size of
>>>>>>> the captures array ;))
>>>>>>>
>>>>>>> b) I'm not sure it's worth the trouble (apart from doing less typing
>>>>>>> in match.pd).  We're probably going to have GENERIC defined
>>>>>>> when creating the generic routines and GIMPLE when creating
>>>>>>> the gimple routines so we can easily disable patterns for one or
>>>>>>> the other by doing
>>>>>>>
>>>>>>> #ifdef GIMPLE
>>>>>>>  (match_and_simplify
>>>>>>>    ...)
>>>>>>> #endif
>>>>>>>
>>>>>>>>> Once the decision tree is done, I can try working on these.
>>>>>>>>> Till Sunday, I will attempt to have another prototype, that covers
>>>>>>>>> most of the patterns in match.pd
>>>>>>>>> (except for cond_expr, and those requiring GENERIC support), and
>>>>>>>>> accompanying correct test-case
>>>>>>>>> for each pattern. And next-week start working on a fair patch.
>>>>>>>
>>>>>>> Sounds good.
>>>>>> I have attached a patch, that fires most of the patterns from
>>>>>> match.pd
>>>>>> except for those that include
>>>>>> non-matching capture or require GENERIC support, and added a
>>>>>> test-case
>>>>>> match-decision-tree.c
>>>>>> (fires around 33 patterns).
>>>>>>
>>>>>> * Representation of decision tree:
>>>>>> I clubbed dt_capture, dt_predicate, dt_expr into dt_operand, and
>>>>>> added
>>>>>> new gen_gimple_match() function in AST
>>>>>> to generates matching code. dt_operand contains operand *op as
>>>>>> member,
>>>>>> and merely calls op->gen_gimple_match.
>>>>>> (that's the intent anyway, in the patch, i needed to put some hacks
>>>>>> to
>>>>>> get it working).
>>>>>> In the patch, i have kept: dt_node, dt_operand (dt_node), dt_simplify
>>>>>> (dt_node), to separate matching operand
>>>>>> from simplification (ifexpr, result).
>>>>>>
>>>>>> * Constructing decision tree from AST
>>>>>> Same as previous patch, stores prefixes of preorder traversals of
>>>>>> AST.
>>>>>> decision_tree::print() pretty-prints the decision tree.
>>>>>>
>>>>>> I hope we won't need to change much up-to this point in the patch.
>>>>>>
>>>>>> * Code-gen
>>>>>> code gen of (operator op0 op1 ..opN) simplify takes the following
>>>>>> form:
>>>>>> if (code == <expr code>)
>>>>>> {
>>>>>>   match op0
>>>>>>     match op1
>>>>>>        ...
>>>>>>          match opN
>>>>>>             simplify
>>>>>>             return true;
>>>>>> }
>>>>>>
>>>>>> a) Scope of temporaries (the ones that are assigned to
>>>>>> gimple_assign_rhsX):
>>>>>> for code-gen of expressions off the AST, we create a temporary:
>>>>>>
>>>>>> {
>>>>>> tree op = gimple_assign_rhsX ();
>>>>>> generate code for X'th operand
>>>>>> }
>>>>>>
>>>>>> In the decision tree, an expression can only directly access it's 1st
>>>>>> operand (since it's preorder traversal).
>>>>>> So I decided to first create all temporaries and then generate code
>>>>>> for operands:
>>>>>>
>>>>>> {
>>>>>> tree op0 = gimple_assign_rhs1 ();
>>>>>> tree op1 = gimple_assign_rhs2 ();
>>>>>> tree opX-1 = gimple_assign_rhsX ();
>>>>>>
>>>>>> code for matching 0th operand
>>>>>> code for matching 1st operand
>>>>>> ...
>>>>>> code for matching Nth operand
>>>>>> }
>>>>>>
>>>>>> Unfortunately, this creates a problem if the operand itself is an
>>>>>> expression,
>>>>>> since it's temporaries would also be named op0, op1 ... they would
>>>>>> shadow the outer op0, op1 .. temporaries.
>>>>>> So we need distinct names for each temporary (for all temps within
>>>>>> one
>>>>>> pattern).
>>>>>> The temporary name is created by:
>>>>>> op<position of child operand><level of tree>;
>>>>>> and each child of AST knows it's position and level, so it can
>>>>>> correctly access it's temp.
>>>>>> This requires adding members operand *parent, unsigned pos, unsigned
>>>>>> level to struct operand.
>>>>>> I have done it this way in the patch, however it doesn't feel clean,
>>>>>> I
>>>>>> will try to see if we can change this.
>>>>>>
>>>>>> b) Short-ranged gotos
>>>>>> consider: (negate @0) S1
>>>>>> The code is generated as (with the patch):
>>>>>>
>>>>>> if (code == PLUS)
>>>>>> {
>>>>>>   if (!captures [0])
>>>>>>    {
>>>>>>       captures[0] = op0;
>>>>>>       goto L0;
>>>>>>    }
>>>>>>   else if (captures[0] == op0)
>>>>>>    {
>>>>>> L0:
>>>>>>       // simplification code S1
>>>>>>       return true;
>>>>>>    }
>>>>>> }
>>>>>>
>>>>>> Instead of:
>>>>>> if (!captures[0])
>>>>>>   captures[0] = op0;
>>>>>> else if (captures[0] != op0)
>>>>>>   goto L0;
>>>>>>
>>>>>> // Simplification code S1
>>>>>> return true;
>>>>>>
>>>>>> L0:  // match next pattern
>>>>>>
>>>>>> Generating code the former way is more easier, since location of
>>>>>> label
>>>>>> immediately follows
>>>>>> capture code (if (!captures[0]) ...).  In the latter case, I probably
>>>>>> need to store label name in the sibling's node in
>>>>>> the decision tree and then retrieve that.
>>>>>> Similarly for valueize (expr::gen_valueize).
>>>>>
>>>>> But then why not generate
>>>>>
>>>>>   if (!captures[0]
>>>>>       || captures[0] == op0)
>>>>>     {
>>>>>       captures[0] = op0;
>>>>>       // simplification code S1
>>>>>       return true;
>>>>>     }
>>>>>
>>>>> and go without label and goto?  Or did I miss something?
>>>>>
>>>>>> c) Shared captures between all patterns:
>>>>>> In the patch all patterns share the capture array tree captures[4] =
>>>>>> {}
>>>>>> (since all match patterns get interleaved in generated code off
>>>>>> decision tree).
>>>>>> This requires extra code to be generated to make sure captures are
>>>>>> correctly restored for another pattern.
>>>>
>>>> Btw, I see that you back-track the decision tree.  That is, for
>>>>
>>>> root
>>>> |--operand: MINUS_EXPR
>>>> |----operand: PLUS_EXPR
>>>> |------operand: @0
>>>> |--------operand: @1
>>>> |----------operand: @0
>>>> |------------simplify
>>>> |----------operand: @1
>>>> |------------simplify
>>>>
>>>> you do
>>>>
>>>>   if (code == MINUS_EXPR)
>>>>     {
>>>>       tree captures[4] = {};
>>>>       if (TREE_CODE (op0) == SSA_NAME)
>>>>         {
>>>>           gimple def_stmt = SSA_NAME_DEF_STMT (op0);
>>>>           if (is_gimple_assign (def_stmt) && gimple_assign_rhs_code
>>>> (def_stmt) == PLUS_EXPR)
>>>>             {
>>>>               tree op01 = gimple_assign_rhs1 (def_stmt);
>>>>               if (valueize && TREE_CODE (op01) == SSA_NAME)
>>>>                 op01 = valueize (op01);
>>>>               else
>>>>                 goto L0;
>>>>               if (op01)
>>>>                 {
>>>> L0:
>>>>                   tree op11 = gimple_assign_rhs2 (def_stmt);
>>>>                   if (valueize && TREE_CODE (op11) == SSA_NAME)
>>>>                     op11 = valueize (op11);
>>>>                   else
>>>>                     goto L1;
>>>>                   if (op11)
>>>>                     {
>>>> L1:
>>>>                       tree captures_0 = captures[0];
>>>>                       if (!captures[0])
>>>>                         {
>>>>                           captures[0] = op01;
>>>>                           goto L2;
>>>>                         }
>>>>                       else if (captures[0] == op01)
>>>>                         {
>>>> L2:
>>>>                           tree captures_1 = captures[1];
>>>>                           if (!captures[1])
>>>>                             {
>>>>                               captures[1] = op11;
>>>>                               goto L3;
>>>>                             }
>>>>                           else if (captures[1] == op11)
>>>>                             {
>>>> L3:
>>>>                               tree captures_2 = captures[0];
>>>>                               if (!captures[0])
>>>>                                 {
>>>>                                   captures[0] = op1;
>>>>                                   goto L4;
>>>>                                 }
>>>>                               else if (captures[0] == op1)
>>>>                                 {
>>>> L4:
>>>>                                   res_ops[0] = captures[1];
>>>>                                   *res_code = TREE_CODE (res_ops[0]);
>>>>                                   return true;
>>>>                                 }
>>>>                               captures [0] = captures_2;
>>>>                               tree captures_3 = captures[1];
>>>>                               if (!captures[1])
>>>>                                 {
>>>>                                   captures[1] = op1;
>>>>                                   goto L5;
>>>>                                 }
>>>>                               else if (captures[1] == op1)
>>>>                                 {
>>>> L5:
>>>>                                   res_ops[0] = captures[0];
>>>>                                   *res_code = TREE_CODE (res_ops[0]);
>>>>                                   return true;
>>>>                                 }
>>>>                               captures [1] = captures_3;
>>>>                             }
>>>>                           captures [1] = captures_1;
>>>>                         }
>>>>                       captures [0] = captures_0;
>>>>                     }
>>>>                 }
>>>>             }
>>>>         }
>>>>
>>>> but if you processed all children of a decision tree node and didn't
>>>> hit one of the simplifies (which return true) then you know nothing
>>>> else will match and you can just return false.  That early out seems
>>>> to be missing completely and we fall through processing all siblings
>>>> of the parents.
>>>>
>>>> But maybe I am missing something?  I see that if I make capture
>>>> IDs not matching that we'd miss to detect things then (as a
>>>> "first" capture always "matches").
>>>>
>>>> root
>>>> |--operand: MINUS_EXPR
>>>> |----operand: PLUS_EXPR
>>>> |------operand: @0
>>>> |--------operand: @1
>>>> |----------operand: @0
>>>> |------------simplify
>>>> |------operand: @1
>>>> |--------operand: @0
>>>> |----------operand: @0
>>>> |------------simplify
>>>>
>>>> So I suppose we really have to distinguish "first" captures from
>>>> "matching" captures, basically not treating the "first" ones as
>>>> nodes of the decision tree and only populating the capture
>>>> array once we hit the code generation part of a pattern (well,
>>>> or the if-expr).
>>>
>>> It's probably a good idea to pre-parse the AST to classify
>>> captures and matches differently.  Given for example
>>>
>>> (plus @0 (minus@0 @1 @2))
>>>
>>> this would say (well, if we support this), that the plus has two
>>> "same" operands (on GIMPLE the "same" valueized SSA name)
>>> and that this operand is defined by a (minus @1 @2).  But with
>>> the current operator ordering in the decision tree we'd meet
>>> the no-op capture first and the match in an odd position.
>>>
>>> In the RTL machine description the matches are more explicit
>>> like
>>>
>>> (plus (match @0) (minus@0 @1 @2))
>>>
>>> so maybe we want to auto-detect those and rewrite the AST.
>>> Just distinguish between capture-def and capture-ref, where
>>> expression captures are always -def but leafs may be -refs as well.
>>>
>>> In the decision tree builder we'd "ignore" -defs and only make
>>> -refs explicit (and the -refs would need to "materialize" the -defs
>>> just like the ifexpr or transform stage).
>>>
>>> So I'd say add a flag to struct capture telling whether it's a ref
>>> and compute that via an AST traversal (that should then error
>>> for invalid refs).
>>>
>>> In the decision tree I'd keep a vector of operand * and record
>>> a name of a temporary the operand can be accessed via
>>> (in case any of the AST operands represented by the decision tree
>>> node has a capture associated).
>>>
>>> You'd still have "empty" leaf captures that would then only record
>>> the value into a temporary.
>>>
>>> Eventually you need a back-ref to the decision tree node for
>>> each capture when you insert a new AST traversal so you
>>> can decide whether a match (a capture-ref) refers to the same
>>> decision tree node or not.
>>>
>>> Richard.
>>
>> Ok, let me re-phrase after some more thinking and talking with
>> a collegue.
>>
>> When you do the AST traversal computing the array of AST operands *
>> you'd initialize book-keeping like the following
>>
>>  for operand in AST do
>>    if (operand is a capture)
>>     {
>>       if (this capture index was not yet seen)
>>         {
>>           record for this pattern (in its simplify node created at the
>>           decision tree leaf) that the capture index refers to the
>>           operand with the current index
>>           if (capture has a sub-expression)
>>             recurse
>>           else
>>             insert "true" op (matches everything)
>>         }
>>       else
>>         {
>>           generate a decision tree node that matches the decision
>>           tree index of the capture
>>           recurse
>>         }
>>     }
>>   else
>>     record op and recurse
>>
>> So when you generate code you implicitely capture all operands
>> in automatic variables named like 'captureN' with 'N' being the
>> current level of the decision tree (the index of the operand array
>> from the AST walk).  Once you hit a 'match-X' node you
>> simply match against the operand from the refered to decision
>> tree index.
>>
>> Once you reach the ifexpr point you populate the operands[] array
>> from the side-information you initialized in the AST traversal and
>> the automatic variables initialized from the decision tree walk.
>>
>> So
>>
>> (plus (minus@2 @1 @3) @2)
>>
>> would have the decision tree
>>
>> plus - minus - 'true' - 'true' - match(1) - simplify
>>
>> and the generated code would look like (simplified)
>>
>>   [can't capture outermost expr]
>>   if (code == plus)
>>     {
>>       op1 = gimple_assign_rhs1 (plus-stmt);
>>       if (TREE_CODE (op1) == minus)
>>         {
>>           op2 = gimple_assign_rhs1 (minus-stmt);
>>           if (true)
>>             {
>>               op3 = gimple_assign_rhs2 (minus-stmt);
>>               if (true)
>>                 {
>>                    op4 = gimple_assign_rhs2 (plus-stmt);
>>                    if (op4 == op1)
>>                      {
>>                         operands[0] = op2;
>>                         operands[1] = op1;
>>                         operands[2] = op3;
> Did you mean captures ?
> captures[0] = op2;
> captures[1] = op1;
> captures[2] = op3
>>                         <simplify>
>>
>> where the 'true' cases would be matched last (as "else") in case there
>> are multiplie choices here if you'd have also
>>
>> (plus @1 (minus @2 @3))
>>
>> the decision tree would become
>>
>> plus - minus - 'true' - 'true' - match(1) - simplify
>>         \
>>           'true' - minus - 'true' - 'true' - simplify
>>
> Thanks, I will get started on this. Today I added a patch that handles
> "matching-captures" but does not commonize, I am not not posting since
> it's of no
> use now. The above algorithm will also get rid of generating temporary
> as op<pos><level>.
>
> suppose these are patterns:
> (plus @0 @1) S1
> (plus @1 @0) S2
>
> then how would be decision tree represented as ?
>
> plus - true - true - S1
>                         \
>                           S2
>
> I suppose pattern-1 would always match (even with earlier code gen.
Maybe I am missing something, but won't this create a problem for
c_expr since in c_expr we replace @n verbatim with captures[n] ?
Or maybe we can find which opX was the capture assigned to, and
replace @n with opX, and thus get rid of captures array altogether ?

Example:
(negate (negate@0 @1)) ->
if (@0)
  @1

decision tree:
negate - negate - true - simplify

if (code == NEGATE)
{
  if (gimple_assign_rhs_code (def_stmt) == NEGATE)
  {
    tree op1 = gimple_assign_rhs1 (def_stmt);
    if (op1)
    {
      if (true)
      {
        if (cond (op0))
        {
           *res_code = TREE_CODE (op1);
           res_ops[0] = op1;
           return true;
        }
    }
}






>> Richard.
>>
>>>> There are also opportunities to optimize the generated code, but
>>>> basic correctness first I guess.
>>>>
>>>> I suppose we have to work a bit on the capture stuff.
>>>>
>>>> Btw, you can easily play with the code generator by doing inside
>>>> the gcc build directory
>>>>
>>>> obj/gcc> build/genmatch test.pd > test.c
>>>>
>>>> with a small test.pd. I used the following for the above examples:
>>>>
>>>> (match_and_simplify
>>>>   (MINUS_EXPR (PLUS_EXPR @0 @1) @0)
>>>>   @1)
>>>> (match_and_simplify
>>>>   (MINUS_EXPR (PLUS_EXPR @1 @0) @0)
>>>>   @1)
>>
>>>> Richard.
>>>>
>>>>>> I will change this to have capture per pattern
>>>>>> tree captures1[4] = {};  // for pattern-1
>>>>>> tree captures2[4] = {};
>>>>>
>>>>> Hmm, is this the matching captures issue I mentioned?  Btw, I see
>>>>> you do
>>>>>
>>>>> +void
>>>>> +walk_operand_preorder(vec<operand *>& operands, operand *op)
>>>>> +{
>>>>> +  if (op->type == operand::OP_CAPTURE || op->type ==
>>>>> operand::OP_PREDICATE || op->type == operand::OP_C_EXPR)
>>>>> +    {
>>>>> +      operands.safe_push (op);
>>>>> +      return;
>>>>> +    }
>>>>>
>>>>> but that leaves captured expressions as a single operand?
>>>>>
>>>>> (plus (minus@1 @2 @3) @2)
>>>>>
>>>>> would have a decision tree
>>>>>
>>>>>  plus -> minus -> @2
>>>>>
>>>>> correct?
>>>>>
>>>>>> d) Matching multiple patterns.
>>>>>> Code for patterns with same match, but different transforms is
>>>>>> generated as follows:
>>>>>> code for match operand.
>>>>>> if (if-expr of pattern-1)
>>>>>> {
>>>>>>   code for result of pattern-1
>>>>>>   return true;
>>>>>> }
>>>>>> if (if-expr of pattern-2)
>>>>>> {
>>>>>>   code for result of pattern-2
>>>>>>   return true;
>>>>>> }
>>>>>
>>>>> good.
>>>>>
>>>>>> ...
>>>>>> Should we emit a warning for patterns that have same match operand
>>>>>> but
>>>>>> no if-expr and no manual transform ?
>>>>>>
>>>>>> for eg:
>>>>>> (match_and_simplify
>>>>>>   (plus (minus @0 @1) @1)
>>>>>>   @0
>>>>>>
>>>>>> (match_and_simplify
>>>>>>   (plus (minus @0 @1) @1)
>>>>>>   @1  // just for illustration
>>>>>>
>>>>>> Since the matching is ambiguous (same match, no if-expr, no manual
>>>>>> transform).
>>>>>
>>>>> Yeah, I suppose we should.
>>>>>
>>>>>> we are left to choose between result of pattern-1 and result of
>>>>>> pattern-2.
>>>>>> We can emit warning and choose result of pattern-1 (first-match rule
>>>>>> as in flex).
>>>>>>
>>>>>> e) Non-matching captures:
>>>>>> Haven't thought about this yet.
>>>>>>
>>>>>> * Should we add "negative" predicates that match only if the
>>>>>> predicate
>>>>>> fails ?
>>>>>> for eg:
>>>>>> (match_and_simplify
>>>>>>   trunc_mod integer_zerop@0 !integer_zerop)
>>>>>>   @0)
>>>>>
>>>>> well, there could simply be a integer_not_zerop predicate.
>>>>>
>>>>>> * Testing
>>>>>> Sorry to bring this up again, but I am still not clear what regex to
>>>>>> write in scan-tree-dump.
>>>>>>
>>>>>> Suppose we have these two patterns in match.pd:
>>>>>> /* (x + y) - y -> x */
>>>>>> (match_and_simplify
>>>>>>   (minus (plus @0 @1) @1)
>>>>>>   @0)
>>>>>>
>>>>>> /* (x - y) + y -> x */
>>>>>> (match_and_simplify
>>>>>>   (plus (minus @0 @1) @1)
>>>>>>   @0)
>>>>>> scan-tree-dump "gimple_match_and_simplified to \[^\n\r\]*=
>>>>>> x_\\d\+\\(D\\)"
>>>>>>
>>>>>> I have following test-cases:
>>>>>> int f1(int x, int y)
>>>>>> {
>>>>>>   int t1 = x + y;
>>>>>>   return t1 - y;
>>>>>> }
>>>>>> scan-tree-dump "gimple_match_and_simplified to \[^\n\r\]*=
>>>>>> x_\\d\+\\(D\\)"
>>>>>>
>>>>>> int f2(int x, int y)
>>>>>> {
>>>>>>   int t1 = x - y;
>>>>>>   return t1 + y;
>>>>>> }
>>>>>> scan-tree-dump "gimple_match_and_simplified to \[^\n\r\]*=
>>>>>> x_\\d\+\\(D\\)"
>>>>>>
>>>>>> both the test-cases have same regex.
>>>>>> Tested in isolation f1 passes (first pattern if fired) and f2 fails
>>>>>> (second pattern doesn't fire, it does after
>>>>>> adding it's commutative variant, but that's irrelevant for this
>>>>>> case).
>>>>>>
>>>>>> However when both test-cases are put in one file both the test cases
>>>>>> PASS.
>>>>>> I think that's because both of them have same regex: \[^\n\r\]*=
>>>>>> x_\\d\+\\(D\\)
>>>>>> so in f1 and f2's regex both match the dump for f1 function in
>>>>>> forwprop dump file:
>>>>>> "gimple_match_and_simplified to \[^\n\r\]*= x_\\d\+\\(D\\)
>>>>>>
>>>>>> As a quick hack i rewrote f1 and f2 as:
>>>>>> int f1(int x, int y)
>>>>>> {
>>>>>>   int t1 = x + y;
>>>>>>   int f1_val = t1 - y;
>>>>>>   return f1_val;
>>>>>> }
>>>>>> scan-tree-dump "gimple_match_and_simplified to f1_val_\\d\+ =
>>>>>> x_\\d\+\\(D\\)"
>>>>>>
>>>>>> int f2(int x, int y)
>>>>>> {
>>>>>>   int t1 = x - y;
>>>>>>   int f2_val = t1 + y;
>>>>>>   return f2_val;
>>>>>> }
>>>>>> scan-tree-dump "gimple_match_and_simplified to f2_val_\\d\+ =
>>>>>> x_\\d\+\\(D\\)"
>>>>>> so both f1 and f2's scan-tree-dump have different regexes.
>>>>>> and f2's regex does not match dump of f1's function.
>>>>>> This matches all patterns in match-decision-tree.c however this is
>>>>>> not
>>>>>> ideal,
>>>>>> since it does not check for matching dump across newlines.
>>>>>> Could you suggest a better way ?
>>>>>
>>>>> There isn't a good better way (the others explicitely do _not_ match
>>>>> against
>>>>> a newline - see the ^ in the \[\] group).  Well, apart from splitting
>>>>> the testcase
>>>>> into multiple files of course.
>>>>>
>>>>> Richard.
>>>>>
>>>>>> Thanks and Regards,
>>>>>> Prathamesh
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Richard.
>>>>>>>
>>>>>>>>> Thanks and Regards,
>>>>>>>>> Prathamesh
>>>>>>>>>>
>>>>>>>>>> Richard.
>>>>>>>>>>
>>>>>>>>>>> * Code generation.
>>>>>>>>>>> Code shall be generated by walking the decision tree.
>>>>>>>>>>> The way it's constructed, there's no difference between code
>>>>>>>>>>> generation
>>>>>>>>>>> for "matching" and code generation for "transform". For
>>>>>>>>>>> non-simplificaton
>>>>>>>>>>> operands, "matching" code is generated, and for "simplification"
>>>>>>>>>>> operands,
>>>>>>>>>>> "transform" code is generated. The tree shall be walked twice,
>>>>>>>>>>> once to generate GIMPLE code and second time for GENERIC.
>>>>>>>>>>> For simplicity, I currently return false whenever there's a fail
>>>>>>>>>>> in match,
>>>>>>>>>>> instead of trying to match the next pattern.
>>>>>>>>>>>
>>>>>>>>>>> Code-gen for capture - same as capture::gen_gimple_match.
>>>>>>>>>>>
>>>>>>>>>>> Code-gen for predicate -  I haven't added support for predicate
>>>>>>>>>>> in
>>>>>>>>>>> decision tree yet, but I guess that would be the same as
>>>>>>>>>>> predicate::gen_gimple_match
>>>>>>>>>>>
>>>>>>>>>>> Code-gen for expr.
>>>>>>>>>>> There are two types of code-gen for expr.
>>>>>>>>>>> The patch generates following code:
>>>>>>>>>>> Type 1 - expr is child of root node.
>>>>>>>>>>> the only code that gets generated is (in
>>>>>>>>>>> decision_tree::gen_gimple):
>>>>>>>>>>> if (code == <expr code>)
>>>>>>>>>>> {
>>>>>>>>>>> tree captures[4] = {}
>>>>>>>>>>> <generated code for children>
>>>>>>>>>>> }
>>>>>>>>>>> This is similar to generating matching code in
>>>>>>>>>>> write_nary_simplifiers.
>>>>>>>>>>>
>>>>>>>>>>> Type 2 - expr_1 is a child of another expr_0 node.
>>>>>>>>>>> The code gets generated as follows (dt_expr::gen_gimple):
>>>>>>>>>>> {
>>>>>>>>>>> gimple def_stmt = SSA_NAME_DEF_STMT (op);
>>>>>>>>>>> if (is_gimple_assign (def_stmt)
>>>>>>>>>>>     && gimple_assign_rhs_code (def_stmt) == <expr_1-code>)
>>>>>>>>>>> {
>>>>>>>>>>> tree op = gimple_assign_rhs1 (def_stmt);
>>>>>>>>>>> if (valueize && TREE_CODE (op) == SSA_NAME)
>>>>>>>>>>> {
>>>>>>>>>>>   op = valueize (op);
>>>>>>>>>>>   if (!op) return false;
>>>>>>>>>>> }
>>>>>>>>>>> <code-gen for children of expr_1 node>
>>>>>>>>>>> }
>>>>>>>>>>>
>>>>>>>>>>> Example:
>>>>>>>>>>> (negate (negate @0))
>>>>>>>>>>> S1
>>>>>>>>>>>
>>>>>>>>>>> (negate (bit_not @0))
>>>>>>>>>>> S2
>>>>>>>>>>>
>>>>>>>>>>> decision tree:
>>>>>>>>>>>
>>>>>>>>>>>                 dummy/root
>>>>>>>>>>>                         |
>>>>>>>>>>>            NEGATE_EXPR
>>>>>>>>>>>              /                  \
>>>>>>>>>>>      BIT_NOT           NEGATE_EXPR
>>>>>>>>>>>            |                         |
>>>>>>>>>>>          @0                     @0
>>>>>>>>>>>            |                         |
>>>>>>>>>>>          S1                      S2
>>>>>>>>>>>
>>>>>>>>>>> // code-gen for NEGATE_EXPR (child of root):
>>>>>>>>>>> if (code == NEGATE_EXPR)
>>>>>>>>>>> {
>>>>>>>>>>> tree captures[4] = {};
>>>>>>>>>>> // code gen for BIT_NOT_EXPR
>>>>>>>>>>> {
>>>>>>>>>>> gimple def_stmt = SSA_NAME_DEF_STMT (op0);
>>>>>>>>>>> if (is_gimple_assign (def_stmt)
>>>>>>>>>>>     && gimple_assign_rhs_code (def_stmt) == BIT_NOT_EXPR)
>>>>>>>>>>> {
>>>>>>>>>>> tree op = gimple_assign_rhs1 (def_stmt);
>>>>>>>>>>> if (valueize && TREE_CODE (op) == SSA_NAME)
>>>>>>>>>>> {
>>>>>>>>>>>   op = valueize (op);
>>>>>>>>>>>   if (!op) return false;
>>>>>>>>>>> }
>>>>>>>>>>>
>>>>>>>>>>> // code-gen for @0, child of BIT_NOT_EXPR
>>>>>>>>>>> if (!captures[0])
>>>>>>>>>>>   captures[0] = op;
>>>>>>>>>>> else if (captures[0] != op)
>>>>>>>>>>>   return false;
>>>>>>>>>>>
>>>>>>>>>>> // code-gen for S1, child of @0
>>>>>>>>>>> < same as code generated by .gen_gimple_transform >
>>>>>>>>>>> return true;
>>>>>>>>>>> }
>>>>>>>>>>> // code gen for inner NEGATE_EXPR
>>>>>>>>>>> {
>>>>>>>>>>> gimple def_stmt = SSA_NAME_DEF_STMT (op0);
>>>>>>>>>>> if (is_gimple_assign (def_stmt)
>>>>>>>>>>>     && gimple_assign_rhs_code (def_stmt) == NEGATE_EXPR)
>>>>>>>>>>> <rest similar to the BIT_NOT case>
>>>>>>>>>>> }
>>>>>>>>>>>
>>>>>>>>>>> The following gets duplicated with the patch:
>>>>>>>>>>> {
>>>>>>>>>>> gimple_def_stmt = SSA_NAME_DEF_STMT (op0);
>>>>>>>>>>> if (TREE_CODE (op0) != SSA_NAME)
>>>>>>>>>>>   return false;
>>>>>>>>>>> if (is_gimple_assign (def_stmt)
>>>>>>>>>>>     && gimple_assign_rhs_code (def_stmt) == <expr-code>)
>>>>>>>>>>> ...
>>>>>>>>>>> }
>>>>>>>>>>>
>>>>>>>>>>> Improving code-gen for expr:
>>>>>>>>>>> "gimple def_stmt = ..." and "if (TREE_CODE (op0)" get
>>>>>>>>>>> duplicated,
>>>>>>>>>>> while they could be factored out, similar to this:
>>>>>>>>>>>
>>>>>>>>>>> {
>>>>>>>>>>> gimple def_stmt = SSA_NAME_DEF_STMT (op0);
>>>>>>>>>>> if (TREE_CODE (op0) != SSA_NAME)
>>>>>>>>>>>   return false;
>>>>>>>>>>> if (!is_gimple_assign (def_stmt))
>>>>>>>>>>>   return false;
>>>>>>>>>>> if (gimple_assign_rhs_code (def_stmt) == BIT_NOT_EXPR)
>>>>>>>>>>> {
>>>>>>>>>>>   // code-gen for BIT_NOT_EXPR subtree
>>>>>>>>>>> }
>>>>>>>>>>> else if (gimple_assign_rhs_code (def_stmt) == NEGATE_EXPR)
>>>>>>>>>>> {
>>>>>>>>>>>   // code-gen for NEGATE_EXPR subtree
>>>>>>>>>>> }
>>>>>>>>>>>
>>>>>>>>>>> For factoring "gimple def_stmt ..." and "if (TREE_CODE (op0) !=
>>>>>>>>>>> SSA_NAME"
>>>>>>>>>>> we could have that generated at the parent of expr's node rather
>>>>>>>>>>> than
>>>>>>>>>>> at expr. However that would be incorrect for cases where not all
>>>>>>>>>>> children
>>>>>>>>>>> of a node are expressions:
>>>>>>>>>>>
>>>>>>>>>>> Example:
>>>>>>>>>>> // patterns only for illustration
>>>>>>>>>>> (negate (bit_not @0))
>>>>>>>>>>> (negate @0)
>>>>>>>>>>>
>>>>>>>>>>>                    root
>>>>>>>>>>>                      |
>>>>>>>>>>>                   negate
>>>>>>>>>>>                     /       \
>>>>>>>>>>>                 bit_not   @0
>>>>>>>>>>>                     |
>>>>>>>>>>>                   @0
>>>>>>>>>>>
>>>>>>>>>>> we cannot have the above code generated at negate,
>>>>>>>>>>> since it's not applicable negate's 2nd child (@0).
>>>>>>>>>>>
>>>>>>>>>>> This can be done by grouping together children that are
>>>>>>>>>>> expressions.
>>>>>>>>>>> However the patch does not do that.
>>>>>>>>>>>
>>>>>>>>>>> * Code-gen for simplification operand
>>>>>>>>>>> This involves code-gen for ifexpr and result of pattern.
>>>>>>>>>>> Calls gen_gimple_transform of ifexpr and result
>>>>>>>>>>> (dt_simplify::gen_gimple)
>>>>>>>>>>> So this is really code-gen off AST
>>>>>>>>>>
>>>>>>>>>> Right (modulo replacing captures with their replacements).
>>>>>>>>>>
>>>>>>>>>>> * Matching multiple patterns
>>>>>>>>>>> A pattern has following parts: match, ifexpr and result.
>>>>>>>>>>> If pattern fails in match operand, I guess we can safely return
>>>>>>>>>>> false ?
>>>>>>>>>>> We "club" together patterns that have same match operand,
>>>>>>>>>>> and use goto, if one of them fails in their (ifexpr/result) and
>>>>>>>>>>> then goto the
>>>>>>>>>>> (ifexpr/result) of the next operand.
>>>>>>>>>>>
>>>>>>>>>>> Example:
>>>>>>>>>>> /* x & 0 -> 0 */
>>>>>>>>>>> (match_and_simplify
>>>>>>>>>>>   (bit_and @0 @1)
>>>>>>>>>>>   if (INTEGRAL_TYPE_P (TREE_TYPE (@0)) && (@1 ==
>>>>>>>>>>> integer_zero_node))
>>>>>>>>>>>   { integer_zero_node; })
>>>>>>>>>>>
>>>>>>>>>>> /* x & -1 -> x */
>>>>>>>>>>> (match_and_simplify
>>>>>>>>>>>   (bit_and @0 @1)
>>>>>>>>>>>   if (INTEGRAL_TYPE_P (TREE_TYPE (@0))
>>>>>>>>>>>      && (@1 == integer_minus_one_node)
>>>>>>>>>>>   @0)
>>>>>>>>>>>
>>>>>>>>>>> For both patterns match is same.
>>>>>>>>>>>
>>>>>>>>>>> Decision Tree:
>>>>>>>>>>>                 bit_and
>>>>>>>>>>>                 /        \
>>>>>>>>>>>              @0      @1
>>>>>>>>>>>                            |
>>>>>>>>>>>                         S1,  S2
>>>>>>>>>>
>>>>>>>>>> I think it's worth adding a diagnostic to genmach whenever
>>>>>>>>>> exactly
>>>>>>>>>> same matches appear.  But I'd say generally we'd support this
>>>>>>>>>> by testing the ifexpr of the next pattern.
>>>>>>>>>>
>>>>>>>>>>> S1 represents <ifexpr, result> of pattern-1, and S2 represents
>>>>>>>>>>> <ifexpr, result>
>>>>>>>>>>> of pattern-2 respectively.
>>>>>>>>>>> S1, S2 would be stored as children of @1 (the last operand of
>>>>>>>>>>> n-ary operator),
>>>>>>>>>>> in dt_operand::kids vector.
>>>>>>>>>>>
>>>>>>>>>>> The code would be generated as:
>>>>>>>>>>>
>>>>>>>>>>> matching code.
>>>>>>>>>>> if (! pattern-1 ifexpr condition)
>>>>>>>>>>>   goto simplify2;  // next pattern with the same "match"
>>>>>>>>>>> operand.
>>>>>>>>>>> transform code for pattern 1
>>>>>>>>>>>
>>>>>>>>>>> simplify2:
>>>>>>>>>>> if (! pattern-2 ifexpr condition)
>>>>>>>>>>>   return false;  // last pattern
>>>>>>>>>>> transform code for pattern 2.
>>>>>>>>>>>
>>>>>>>>>>> If matching itself fails, that is neither of the decisions get
>>>>>>>>>>> matched,
>>>>>>>>>>> I believe we can then return false as it cannot match any other
>>>>>>>>>>> pattern ?
>>>>>>>>>>>
>>>>>>>>>>> * patterns needing hacks like cond_expr or GENERIC support
>>>>>>>>>>> I haven't given thought to this yet. I suppose we can look to
>>>>>>>>>>> handle
>>>>>>>>>>> these after adding support for GENERIC. Instead of generating
>>>>>>>>>>> GENERIC
>>>>>>>>>>> matching in
>>>>>>>>>>> gimple_match_and_simplify, could we then call
>>>>>>>>>>> generic_match_and_simplify from
>>>>>>>>>>> within gimple_match_and_simplify ?
>>>>>>>>>>
>>>>>>>>>> Yes (that's what's currently done btw).
>>>>>>>>>>
>>>>>>>>>>> * Tests
>>>>>>>>>>> The patch transformed the following patterns:
>>>>>>>>>>>
>>>>>>>>>>> (match_and_simplify
>>>>>>>>>>>   (negate (bit_not @0))
>>>>>>>>>>>   if (INTEGRAL_TYPE_P (TREE_TYPE (@0)))
>>>>>>>>>>>   (plus @0 { build_int_cst (TREE_TYPE (@0)), 1); }))
>>>>>>>>>>>
>>>>>>>>>>> (match_and_simplify
>>>>>>>>>>>   (negate (negate @0))
>>>>>>>>>>>   @0)
>>>>>>>>>>>
>>>>>>>>>>> I have attached test-case I tried it with (negate.c)
>>>>>>>>>>>
>>>>>>>>>>> * Conclusion
>>>>>>>>>>> Does it sound reasonable ? I am going to be re-writing the
>>>>>>>>>>> decision tree from scratch, but is the basic idea fine ? Or
>>>>>>>>>>> should
>>>>>>>>>>> we
>>>>>>>>>>> take a different approach ?
>>>>>>>>>>>
>>>>>>>>>>> Thanks and Regards,
>>>>>>>>>>> Prathamesh
>>
>


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]