This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [GSoC] decision tree first steps


On Tue, Jun 10, 2014 at 1:06 PM, Prathamesh Kulkarni
<bilbotheelffriend@gmail.com> wrote:
> On Fri, Jun 6, 2014 at 7:18 PM, Richard Biener
> <richard.guenther@gmail.com> wrote:
>> On Fri, Jun 6, 2014 at 12:02 PM, Richard Biener
>> <richard.guenther@gmail.com> wrote:
>>> On Fri, Jun 6, 2014 at 11:02 AM, Prathamesh Kulkarni
>>> <bilbotheelffriend@gmail.com> wrote:
>>>> On Mon, Jun 2, 2014 at 6:14 PM, Richard Biener
>>>> <richard.guenther@gmail.com> wrote:
>>>>> On Mon, Jun 2, 2014 at 1:16 PM, Prathamesh Kulkarni
>>>>> <bilbotheelffriend@gmail.com> wrote:
>>>>>> I have few questions regarding genmatch:
>>>>>>
>>>>>> a) Why is 4 hard-coded here: ?
>>>>>> in write_nary_simplifiers:
>>>>>>  fprintf (f, "      tree captures[4] = {};\n");
>>>>>
>>>>> Magic number (this must be big enough for all cases ...).  Honestly
>>>>> this should be improved (but requires another scan over the matcher IL
>>>>> to figure out the max N used in @N).
>>>>>
>>>>>> b) Should we add syntax for a symbol to denote multiple operators ?
>>>>>> For exampleim in simplify_rotate:
>>>>>> (X << CNT1) OP (X >> CNT2) with OP being +, |, ^  (CNT1 + CNT2 ==
>>>>>> bitsize of type of X).
>>>>>
>>>>> Something to enhance the IL with, yes.  I'd say we support
>>>>>
>>>>> (define_op additive PLUS_EXPR MINUS_EXPR POINTER_PLUS_EXPR)
>>>>>
>>>>> thus,
>>>>>
>>>>> (define_op <operator-name> op...)
>>>>>
>>>>>> c) Remove for parsing capture in parse_expr since we reject outermost
>>>>>> captured expressions ?
>>>>>
>>>>> but parse_expr is also used for inner expressions, no?
>>>>>
>>>>> (plus (minus@2 @0 @1) @3)
>>>>>
>>>>> should still work
>>>>>
>>>>>> d) I am not able to follow this comment in match.pd:
>>>>>> /* Patterns required to avoid SCCVN testsuite regressions.  */
>>>>>>
>>>>>> /* (x >> 31) & 1 -> (x >> 31).  Folding in fold-const is more
>>>>>>    complicated here, it does
>>>>>>      Fold (X << C1) & C2 into (X << C1) & (C2 | ((1 << C1) - 1))
>>>>>>      (X >> C1) & C2 into (X >> C1) & (C2 | ~((type) -1 >> C1))
>>>>>>      if the new mask might be further optimized.  */
>>>>>> (match_and_simplify
>>>>>>   (bit_and (rshift@0 @1 INTEGER_CST_P@2) integer_onep)
>>>>>>   if (compare_tree_int (@2, TYPE_PRECISION (TREE_TYPE (@1)) - 1) == 0)
>>>>>>   @0)
>>>>>
>>>>> The comment is literally copied from the case I extracted the
>>>>> (simplified) variant from fold-const.c.  See lines 11961-12056 in fold-const.c.
>>>>> It'll be a challenge to implement the equivalent in a pattern ;)
>>>>>
>>>>>>
>>>>>> Decision Tree.
>>>>>>     I have tried to come up with a prototype for decision tree (patch attached).
>>>>>> For simplicity, it handles patterns involving only unary operators and
>>>>>> no predicates
>>>>>> and returns false when the pattern fails to match (no goto to match
>>>>>> another pattern).
>>>>>> I meant to post it only for illustration, and I have not really paid
>>>>>> attention to code quality (bad formatting, memory leaks, etc.).
>>>>>>
>>>>>> * Basic Idea
>>>>>> A pattern consists of following parts: match, ifexpr and result.
>>>>>> Let's call <ifexpr, result> as "simplification" operand.
>>>>>> The common prefix between different match operands would be represented
>>>>>> by same nodes in the decision tree.
>>>>>>
>>>>>> Example:
>>>>>> (negate (bit_not @0))
>>>>>> S1
>>>>>>
>>>>>> (negate (negate @0))
>>>>>> S2
>>>>>>
>>>>>> S1, S2 denote simplifications for the above patterns respectively.
>>>>>>
>>>>>> The decision tree would look something like
>>>>>> (that's the way it gets constructed with the patch):
>>>>>>
>>>>>>                 dummy/root
>>>>>>                         |
>>>>>>            NEGATE_EXPR
>>>>>>              /                  \
>>>>>>      BIT_NOT           NEGATE_EXPR
>>>>>>            |                         |
>>>>>>          @0                     @0
>>>>>>            |                         |
>>>>>>          S1                      S2
>>>>>>
>>>>>> a) The children of an internal node are number of decisions that
>>>>>> can be taken at that node. In the above case it's 2 for outer NEGATE_EXPR.
>>>>>> b) Simplification operand represents leaves of the decision tree
>>>>>> c) Instead of having list of heads, I have added one dummy node,
>>>>>> and heads become children of these dummy root node.
>>>>>> d) Code-gen for non-simplification operands involves generating,
>>>>>> "matching" code and for simplification operands involves generating
>>>>>> "transform" code
>>>>>>
>>>>>> * Overall Flow:
>>>>>> I guess we would build the decision tree from the AST.
>>>>>> So the flow would be like:
>>>>>> source -> struct simplify (match, ifexpr, result) -> decision tree -> c code.
>>>>>>
>>>>>> Something like (in main):
>>>>>> decision_tree dt;
>>>>>> while (there is another pattern)
>>>>>> {
>>>>>>   simplify *s = parse_match_and_simplify ();
>>>>>>   insert s into decision tree;
>>>>>> };
>>>>>> So parsing routines are concerned with parsing and building the AST (operand),
>>>>>> and not with the decision tree. Is that fine ?
>>>>>
>>>>> Yes, that's good.
>>>>>
>>>>>> * Representation of decision tree.
>>>>>> A decision tree would need a way to represent language constructs
>>>>>> like capture, predicate, etc. so in some ways it would be similar to AST.
>>>>>> It
>>>>>> In the patch, I have created the following heirarchy:
>>>>>> dt_operand: represents a general base "operand" of in decision tree
>>>>>> dt_expr: for representing expression. Expression contains operation
>>>>>> to be performed (e_operation).
>>>>>> dt_capture: analogous to capture.
>>>>>> dt_simplify: for representing "simplification" operand.
>>>>>> simplification consists of ifexpr and result
>>>>>> dt_head: to represent "dummy" root. Maybe a separate class is not needed.
>>>>>>
>>>>>> * Constructing decision tree from AST
>>>>>> The algorithm i have used is similar to inserting string in a trie
>>>>>> outlined here: http://en.wikipedia.org/wiki/Trie
>>>>>> The difference shall be to traverse AST depth-first rather than
>>>>>> traversing the string.
>>>>>> Apart from that I guess it would be the same (naturally find and
>>>>>> compare operations would be different).
>>>>>> I haven't given much thought about this. Currently, I construct
>>>>>> decision tree for only patterns with unary operators in an
>>>>>> ugly way by "flattening" the AST by walking it in pre-order and
>>>>>> storing the nodes in vector in the order they are visited
>>>>>>
>>>>>> So to construct decision tree from the following AST:
>>>>>>           negate
>>>>>>               |
>>>>>>           bit_not
>>>>>>               |
>>>>>>              @0
>>>>>>
>>>>>> AST is flattened by walking in preorder (by walk_operand_preorder) and stored
>>>>>> as: [ expr, expr, capture ]
>>>>>> and this vector is used to construct decision tree.
>>>>>> I did it as a "quick and dirty" way, it has to be changed.
>>>>>> And it won't work for patterns with n-ary operators.
>>>>>>
>>>>>> We should visit each node of AST in preorder, and add that node
>>>>>> during traversal to decision tree. I am not yet clear on way of doing that.
>>>>>>
>>>>>> * Comparing operands
>>>>>> How do we compare operands ? We shall need to do this while inserting
>>>>>> in decision tree, since if the operand is already inserted we do not create
>>>>>> a new node.
>>>>>> In the patch (cmp_operand), I have used following rules:
>>>>>> a) if types not equal, they are clearly not equal.
>>>>>> b) if type is expr, compare operation->op->id
>>>>>> c) if type is capture, compare the number.
>>>>>> d) for predicate, compare on ident ?
>>>>>>
>>>>>> * Representing patterns with n-ary operators.
>>>>>> Consider following two match operands:
>>>>>> (plus (minus @0 @1) @1)
>>>>>> (plus (negate @0) @1)
>>>>>>
>>>>>> In decision tree, it would be represented as:
>>>>>>
>>>>>>             *********plus*********
>>>>>>                /    \          /        \
>>>>>>           minus  @1  negate  @1
>>>>>>             /     \           |
>>>>>>         @0   @1       @0
>>>>>>
>>>>>> plus has got 4 children - (minus @0 @1), @1, (negate @0), @1.
>>>>>> However (minus @0 @1) and @1 are not separate "decisions", but
>>>>>> children of plus within the same expression.
>>>>>>
>>>>>> For calculating number of decisions at an expression node:
>>>>>> number of decisions =  number of children / arity of operator.
>>>>>> in the above case, number of decision = 4 / 2 = 2
>>>>>>
>>>>>> For other cases, for instance capture node (the children would be
>>>>>> simplification operand),
>>>>>> number of decisions = number of children.
>>>>>
>>>>> Hmm.  I think it should be only two children from the plus,
>>>>> as pre-order for the first pattern is for example
>>>>> [plus minus @0 @1 @1] so you'd have
>>>>>
>>>>>    plus
>>>>>    /     \
>>>>> minus  negate
>>>>> |           |
>>>>> @0     @0
>>>>> |           |
>>>>> @1     @1
>>>>> |
>>>>> @1
>>>>>
>>>>> instead?
>>>>>
>>>>> Note that you probably have to deal with non-matching capture
>>>>> IDs by re-writing them on-the-fly.  That is, the two
>>>>> pre-oder traversals [plus minus @1 ...] and [plus minus @0 ...]
>>>>> should commonize with renaming the non-matching capture
>>>>> somehow.
>>>> Um, could you please elaborate on that (non-matching capture ?),
>>>> I am not sure if I understood it fully, thanks.
>>>
>>> I'm worried about
>>>
>>> (plus (minus@1 @2 @3) @2)
>>>
>>> vs.
>>>
>>> (plus (minus @1 @2) @1)
>>>
>>> (just made-up examples, of course).  Here we'd like to
>>> commonize the path checking for (plus (minus ...) ... but
>>> of course we can't easily process the captures at that point
>>> as the first has two captures (it captures the minus itself while
>>> the 2nd doesn't) and the captures that are at the same position
>>> use different indexes.  To finally match the minus we have to
>>> verify that in one case @2 == @2 and in the other case that
>>> @1 == @1.
>>>
>>> Re-numbering capture indexes in pre-order would improve things
>>> for the case where capture positions are in the same places but
>>> the indexes do not match.  It won't handle the above case
>>> of course.
> Thanks, I understand it now -:)

I suppose that it would be best (in the end) to not insert the
captures into the decision tree separately but leave them
"combined" and then only after the decision tree is fully built
decide on re-numbering.  And treat those that are required for
the matching process differently (those that assert that two
ops are equal).

I think we can defer the issue to later.

>>>
>>>> I have attached patch for patterns with binary operators (in principle
>>>> should work for n-ary operators,
>>>> but have only tested upto 2), that creates decision tree for
>>>> patterns (excluding predicates, and multiple matching). I don't intend
>>>> to commit this (contains many hacks!).
>>>>
>>>> * Constructing decision tree from AST
>>>> We can define our decision tree to store prefixes of preorder
>>>> traversals of diffferent AST.
>>>> Insertion happens as follows (decision_tree::insert)
>>>> a) Get preorder traversal of AST into vector.
>>>> b) Insert AST nodes from vector into decision tree.
>>>> Currently, I obtain preorder traversal into a vector. We can later
>>>> change it to compute
>>>> next preorder successor lazily.
>>>
>>> Just make the code easy to understand, optimizing it should be
>>> secondary.
>>>
>>>> * Code-gen
>>>> For n-ary operators, the patch generates code as follows:
>>>> if (code == <expr code>)
>>>> {
>>>>    match operand 0
>>>>      match operand 1
>>>>         ....
>>>>           match operand n-1
>>>>             transform
>>>>             return true;
>>>> }
>>>>
>>>> * Adding parent, level, pos fields to AST (operand).
>>>> One change (hack), I have made to code-gen, is naming of operands that are
>>>> assigned to gimple_assign_rhs in code-gen of expressions.
>>>> I am not sure if this is really needed.
>>>>
>>>> in AST, we have following representation for expressions:
>>>>                       expr-
>>>>                      /       \
>>>>                   left      right
>>>>                operand  operand
>>>>
>>>> code-gen off AST (expr::gen_gimple_match) produces code like:
>>>> {
>>>>   tree op = gimple_assign_rhs1 (def_stmt);
>>>>   // code-gen for left operand
>>>> }
>>>> {
>>>>   tree op = gimple_assign_rhs2 (def_stmt);
>>>>   // code-gen for right operand
>>>> }
>>>>
>>>> We can do this since, in expr::gen_gimple_match, we call
>>>> left_operand->gen_gimple_match,
>>>> come back to expr::gen_gimple_match (after
>>>> left_operand->gen_gimple_match() returns), and
>>>> then call right_operand->gen_gimple_match().
>>>>
>>>> However in decision tree, the expression gets represented as follows:
>>>>            expr
>>>>              |
>>>>            left-operand
>>>>              |
>>>>            right-operand
>>>>
>>>> dt_expr::gen_gimple calls left_operand->gen_gimple, which calls
>>>> right_operand->gen_gimple,
>>>> so we return to dt_expr::gen_gimple, after code is generated for the
>>>> whole subtree of expr.
>>>>
>>>> So I assign operands at the start before generating code for the subtree:
>>>> tree op00 = gimple_assign_rhs1 (def_stmt);
>>>> tree op10 = gimple_assign_rhs2 (def_stmt);
>>>> <code for left-operand>
>>>>   <code for right-operand>
>>>> each of the operands know their positions, so they know which operand
>>>> to use (get_op_name()).
>>>> the names are generated as:
>>>> op<position><level>, position = index of operand in it's parent's
>>>> expr's vector (vec<operand *> ops).
>>>> level: level of AST, at which the operand is stored.
>>>>
>>>> For this I added three fields to operand (unsigned pos, and unsigned
>>>> level, operand *parent), and accordingly
>>>> made changes to expr::append_op. In a way this is abusing the AST.
>>>> This probably works (works with test-cases i tried), but I don't like it much.
>>>
>>> The naming is ok, but indeed it doesn't feel very clean.  Somehow this
>>> belongs to the decision tree instead... (but then you need to lookup
>>> the decision tree operand info from code-gen which happens off the AST...).
>>>
>>>> * Order of matching.
>>>> Currently the order of matching operands is strictly 0, 1, .. n - 1
>>>> for n-ary operator.
>>>> However this might not be the best choice.
>>>>
>>>> For example, consider following patterns:
>>>> (operator  op1  op) S1
>>>> (operator  op2 op)  S2
>>>>
>>>> Both have common operator, and 1st common operand, they differ in 0th operand.
>>>> With patch, the code generated is:
>>>> if (code == <expr-code>)
>>>> {
>>>>   match op1
>>>>      match op
>>>>        S1
>>>>
>>>>   match op2
>>>>     match op
>>>>       S2
>>>> }
>>>>
>>>> A better ordering would be to match the 1st operand and then
>>>> respective 0th operands of both patterns:
>>>>
>>>> if (code == <expr-code>
>>>> {
>>>>   match op
>>>>     match op1
>>>>       S1
>>>>     match op0
>>>>       S2
>>>> }
>>>>
>>>> This complicates insertion into decision tree.
>>>
>>> I'd say we leave "optimizing" the decision tree to a later stage - it's a global
>>> optimization problem after all (I believe you can't do optimally with just
>>> looking at the present decision tree and the AST you want to insert).
>>>
>>>> * hack: only one generated gimple_match_and_simplify function
>>>> In the patch, I generate code for gimple_match_and_simplify with 3
>>>> operands version (op0, op1, op2)
>>>> and the other two (unary and binary versions), call it with NULL_TREE
>>>> for extra operands.
>>>> I will soon change that.
>>>>
>>>> * Testing patterns.
>>>> I think I have written most of the test-cases wrongly in match-2.c
>>>> For example this test-case doesn't match
>>>> (i haven't checked in "fix testsuite fallout" patch, so this might not be true
>>>> anymore)
>>>>
>>>> (match_and_simplify
>>>>   plus (minus @0 @1) @1)
>>>>   @0)
>>>>
>>>> for this test-case:
>>>> /* (x - y) + y -> x */
>>>> int f6(int x, int y)
>>>> {
>>>>   int t1 = x - y;
>>>>   return t1 + y;
>>>> }
>>>> /* { dg-final { scan-tree-dump "gimple_match_and_simplified to
>>>> \[^\n\r\]*= x_\\d\+\\(D\\)" "forwprop1" } } */
>>>>
>>>> I get following output (tree-forwprop-details):
>>>>
>>>> ;; Function f6 (f, funcdef_no=0, decl_uid=1744, symbol_order=0)
>>>>
>>>> gimple_match_and_simplified to _4 = y_2(D) + t1_3;
>>>> f (int x, int y)
>>>> {
>>>>   int t1;
>>>>   int _4;
>>>>
>>>>   <bb 2>:
>>>>   t1_3 = x_1(D) - y_2(D);
>>>>   _4 = x_1(D);
>>>>   return _4;
>>>>
>>>> }
>>>>
>>>> Shouldn't it show: gimple_match_and_simplified to _4 = x_1(D) instead ?
>>>
>>> Yeah.  I suppose you hit the issue that we don't commutate patterns,
>>> so try with a commutated pattern inserted.
>>>
>>>> The issue is this test-case fails in isolation, however it passes when
>>>> it is placed with other
>>>> test cases that PASS and have same regex in scan-tree-dump as this test-case:
>>>> scan-tree-dump "gimple_match_and_simplified to \[^\n\r\]*= x_\\d\+\\(D\\)"
>>>>
>>>> Summary:
>>>> I think we need to (at-least) deal with following problems for decision tree:
>>>
>>> I have to leave for a while now, I'll come back later and have a detailed
>>> look at the patch to try answer the questions below.
>>
>> Now for the remainder.
>>
>>>> a) Representation and construction of decision tree from AST -
>>>> We do this by storing prefixes of preorder traversals of AST in
>>>> decision tree. Do we finalize on that ?
>>
>> Yes.
>>
>>>> b) Order of operand matching
>>
>> I think that's the same as a) - the visiting order of the AST determines
>> the order of operand matching.
>>
>>>> c) Generating patterns for built-in functions (I guess this will be
>>>> similar to expr-gen).
>>
>> It should really be transparent.
>>
>>>> d) Multiple matching patterns
>>
>> The decision tree phase leaves us with a sequence of
>>
>>   if (if-expr of pattern1)
>>     transform-pattern1
>>   else (if-expr of pattern2)
>>     transform-pattern2
>> ...
>>
>> etc.  thus that part should be basically unchanged (only the match
>> part is driven by a decision tree).
>>
>>>> e) Commutative ops
>>
>> I'd do these as a pass over the AST of each pattern, duplicating
>> patterns.
>>
>> Btw, I see you mirror the AST structure with dt_{node,operand,expr,...}.
>> It seems to me that you can simply stick an AST operand * into
>> dt_node and be done with just having dt_node?  That is, the AST
>> operand refered to contains all necessary information already.
>>
>>>>
>>>> Pattern Enhancements:
>>>> Some of these were already discussed, I am jotting them here:
>>>> a) Symbol for multiple operators
>>>> b) grouping patterns by ifexpr
>>>> c) should we have MATCH_FAILED or FAIL to explicitly denote failure in
>>>> manual transform (c_expr) ?
>>>> d) making operators to be commutative.
>>
>> I'd say d) would be really nice to have.  Either with a grammar
>> extension to say which expression should have commutation applied
>> to (for example: (plus! (minus @1 @2) @2) would say that the
>> operands of plus should be commutated), or with a simple heuristic.
>> I'd say explicitly specifying is better.
>>
>> a) will probably be needed to avoid pattern explosion.  But similar to d)
>> I'd do it as a pre-pass on the AST, duplicating the patterns.  It's
>> also a good question how to define its semantic.  The mode-iterators
>> that the machine description supports would suggest we do instead
>> sth like
>>
>> (for OP in (plus minus mult)
>>   (match_and_simplify
>>     (OP @1 integer_zerop)
>>     ....))
>>
>> c) should be easy, I'm going to try sth like that - also setting temporaries
>> from the ifexpr and re-using them in the result like
>>
>>  (match_and_simplify
>>    (plus @1 @2)
>>    if (INTEGRAL_TYPE_P (TREE_TYPE (@2))
>>       && find_better_ops (&@3, &@4, @1, @2))
>>    (plus @3 @4))
>>
>> which also requires us to scan the AST for the highest capture index
>> (a good thing anyway, given my hack to statically set the size of
>> the captures array ;))
>>
>> b) I'm not sure it's worth the trouble (apart from doing less typing
>> in match.pd).  We're probably going to have GENERIC defined
>> when creating the generic routines and GIMPLE when creating
>> the gimple routines so we can easily disable patterns for one or
>> the other by doing
>>
>> #ifdef GIMPLE
>>  (match_and_simplify
>>    ...)
>> #endif
>>
>>>> Once the decision tree is done, I can try working on these.
>>>> Till Sunday, I will attempt to have another prototype, that covers
>>>> most of the patterns in match.pd
>>>> (except for cond_expr, and those requiring GENERIC support), and
>>>> accompanying correct test-case
>>>> for each pattern. And next-week start working on a fair patch.
>>
>> Sounds good.
> I have attached a patch, that fires most of the patterns from match.pd
> except for those that include
> non-matching capture or require GENERIC support, and added a test-case
> match-decision-tree.c
> (fires around 33 patterns).
>
> * Representation of decision tree:
> I clubbed dt_capture, dt_predicate, dt_expr into dt_operand, and added
> new gen_gimple_match() function in AST
> to generates matching code. dt_operand contains operand *op as member,
> and merely calls op->gen_gimple_match.
> (that's the intent anyway, in the patch, i needed to put some hacks to
> get it working).
> In the patch, i have kept: dt_node, dt_operand (dt_node), dt_simplify
> (dt_node), to separate matching operand
> from simplification (ifexpr, result).
>
> * Constructing decision tree from AST
> Same as previous patch, stores prefixes of preorder traversals of AST.
> decision_tree::print() pretty-prints the decision tree.
>
> I hope we won't need to change much up-to this point in the patch.
>
> * Code-gen
> code gen of (operator op0 op1 ..opN) simplify takes the following form:
> if (code == <expr code>)
> {
>   match op0
>     match op1
>        ...
>          match opN
>             simplify
>             return true;
> }
>
> a) Scope of temporaries (the ones that are assigned to gimple_assign_rhsX):
> for code-gen of expressions off the AST, we create a temporary:
>
> {
> tree op = gimple_assign_rhsX ();
> generate code for X'th operand
> }
>
> In the decision tree, an expression can only directly access it's 1st
> operand (since it's preorder traversal).
> So I decided to first create all temporaries and then generate code
> for operands:
>
> {
> tree op0 = gimple_assign_rhs1 ();
> tree op1 = gimple_assign_rhs2 ();
> tree opX-1 = gimple_assign_rhsX ();
>
> code for matching 0th operand
> code for matching 1st operand
> ...
> code for matching Nth operand
> }
>
> Unfortunately, this creates a problem if the operand itself is an expression,
> since it's temporaries would also be named op0, op1 ... they would
> shadow the outer op0, op1 .. temporaries.
> So we need distinct names for each temporary (for all temps within one pattern).
> The temporary name is created by:
> op<position of child operand><level of tree>;
> and each child of AST knows it's position and level, so it can
> correctly access it's temp.
> This requires adding members operand *parent, unsigned pos, unsigned
> level to struct operand.
> I have done it this way in the patch, however it doesn't feel clean, I
> will try to see if we can change this.
>
> b) Short-ranged gotos
> consider: (negate @0) S1
> The code is generated as (with the patch):
>
> if (code == PLUS)
> {
>   if (!captures [0])
>    {
>       captures[0] = op0;
>       goto L0;
>    }
>   else if (captures[0] == op0)
>    {
> L0:
>       // simplification code S1
>       return true;
>    }
> }
>
> Instead of:
> if (!captures[0])
>   captures[0] = op0;
> else if (captures[0] != op0)
>   goto L0;
>
> // Simplification code S1
> return true;
>
> L0:  // match next pattern
>
> Generating code the former way is more easier, since location of label
> immediately follows
> capture code (if (!captures[0]) ...).  In the latter case, I probably
> need to store label name in the sibling's node in
> the decision tree and then retrieve that.
> Similarly for valueize (expr::gen_valueize).

But then why not generate

  if (!captures[0]
      || captures[0] == op0)
    {
      captures[0] = op0;
      // simplification code S1
      return true;
    }

and go without label and goto?  Or did I miss something?

> c) Shared captures between all patterns:
> In the patch all patterns share the capture array tree captures[4] = {}
> (since all match patterns get interleaved in generated code off decision tree).
> This requires extra code to be generated to make sure captures are
> correctly restored for another pattern.
> I will change this to have capture per pattern
> tree captures1[4] = {};  // for pattern-1
> tree captures2[4] = {};

Hmm, is this the matching captures issue I mentioned?  Btw, I see
you do

+void
+walk_operand_preorder(vec<operand *>& operands, operand *op)
+{
+  if (op->type == operand::OP_CAPTURE || op->type ==
operand::OP_PREDICATE || op->type == operand::OP_C_EXPR)
+    {
+      operands.safe_push (op);
+      return;
+    }

but that leaves captured expressions as a single operand?

(plus (minus@1 @2 @3) @2)

would have a decision tree

 plus -> minus -> @2

correct?

> d) Matching multiple patterns.
> Code for patterns with same match, but different transforms is
> generated as follows:
> code for match operand.
> if (if-expr of pattern-1)
> {
>   code for result of pattern-1
>   return true;
> }
> if (if-expr of pattern-2)
> {
>   code for result of pattern-2
>   return true;
> }

good.

> ...
> Should we emit a warning for patterns that have same match operand but
> no if-expr and no manual transform ?
>
> for eg:
> (match_and_simplify
>   (plus (minus @0 @1) @1)
>   @0
>
> (match_and_simplify
>   (plus (minus @0 @1) @1)
>   @1  // just for illustration
>
> Since the matching is ambiguous (same match, no if-expr, no manual transform).

Yeah, I suppose we should.

> we are left to choose between result of pattern-1 and result of pattern-2.
> We can emit warning and choose result of pattern-1 (first-match rule
> as in flex).
>
> e) Non-matching captures:
> Haven't thought about this yet.
>
> * Should we add "negative" predicates that match only if the predicate fails ?
> for eg:
> (match_and_simplify
>   trunc_mod integer_zerop@0 !integer_zerop)
>   @0)

well, there could simply be a integer_not_zerop predicate.

> * Testing
> Sorry to bring this up again, but I am still not clear what regex to
> write in scan-tree-dump.
>
> Suppose we have these two patterns in match.pd:
> /* (x + y) - y -> x */
> (match_and_simplify
>   (minus (plus @0 @1) @1)
>   @0)
>
> /* (x - y) + y -> x */
> (match_and_simplify
>   (plus (minus @0 @1) @1)
>   @0)
> scan-tree-dump "gimple_match_and_simplified to \[^\n\r\]*= x_\\d\+\\(D\\)"
>
> I have following test-cases:
> int f1(int x, int y)
> {
>   int t1 = x + y;
>   return t1 - y;
> }
> scan-tree-dump "gimple_match_and_simplified to \[^\n\r\]*= x_\\d\+\\(D\\)"
>
> int f2(int x, int y)
> {
>   int t1 = x - y;
>   return t1 + y;
> }
> scan-tree-dump "gimple_match_and_simplified to \[^\n\r\]*= x_\\d\+\\(D\\)"
>
> both the test-cases have same regex.
> Tested in isolation f1 passes (first pattern if fired) and f2 fails
> (second pattern doesn't fire, it does after
> adding it's commutative variant, but that's irrelevant for this case).
>
> However when both test-cases are put in one file both the test cases PASS.
> I think that's because both of them have same regex: \[^\n\r\]*= x_\\d\+\\(D\\)
> so in f1 and f2's regex both match the dump for f1 function in
> forwprop dump file:
> "gimple_match_and_simplified to \[^\n\r\]*= x_\\d\+\\(D\\)
>
> As a quick hack i rewrote f1 and f2 as:
> int f1(int x, int y)
> {
>   int t1 = x + y;
>   int f1_val = t1 - y;
>   return f1_val;
> }
> scan-tree-dump "gimple_match_and_simplified to f1_val_\\d\+ = x_\\d\+\\(D\\)"
>
> int f2(int x, int y)
> {
>   int t1 = x - y;
>   int f2_val = t1 + y;
>   return f2_val;
> }
> scan-tree-dump "gimple_match_and_simplified to f2_val_\\d\+ = x_\\d\+\\(D\\)"
> so both f1 and f2's scan-tree-dump have different regexes.
> and f2's regex does not match dump of f1's function.
> This matches all patterns in match-decision-tree.c however this is not ideal,
> since it does not check for matching dump across newlines.
> Could you suggest a better way ?

There isn't a good better way (the others explicitely do _not_ match against
a newline - see the ^ in the \[\] group).  Well, apart from splitting
the testcase
into multiple files of course.

Richard.

> Thanks and Regards,
> Prathamesh
>>
>> Thanks,
>> Richard.
>>
>>>> Thanks and Regards,
>>>> Prathamesh
>>>>>
>>>>> Richard.
>>>>>
>>>>>> * Code generation.
>>>>>> Code shall be generated by walking the decision tree.
>>>>>> The way it's constructed, there's no difference between code generation
>>>>>> for "matching" and code generation for "transform". For non-simplificaton
>>>>>> operands, "matching" code is generated, and for "simplification" operands,
>>>>>> "transform" code is generated. The tree shall be walked twice,
>>>>>> once to generate GIMPLE code and second time for GENERIC.
>>>>>> For simplicity, I currently return false whenever there's a fail in match,
>>>>>> instead of trying to match the next pattern.
>>>>>>
>>>>>> Code-gen for capture - same as capture::gen_gimple_match.
>>>>>>
>>>>>> Code-gen for predicate -  I haven't added support for predicate in
>>>>>> decision tree yet, but I guess that would be the same as
>>>>>> predicate::gen_gimple_match
>>>>>>
>>>>>> Code-gen for expr.
>>>>>> There are two types of code-gen for expr.
>>>>>> The patch generates following code:
>>>>>> Type 1 - expr is child of root node.
>>>>>> the only code that gets generated is (in decision_tree::gen_gimple):
>>>>>> if (code == <expr code>)
>>>>>> {
>>>>>> tree captures[4] = {}
>>>>>> <generated code for children>
>>>>>> }
>>>>>> This is similar to generating matching code in write_nary_simplifiers.
>>>>>>
>>>>>> Type 2 - expr_1 is a child of another expr_0 node.
>>>>>> The code gets generated as follows (dt_expr::gen_gimple):
>>>>>> {
>>>>>> gimple def_stmt = SSA_NAME_DEF_STMT (op);
>>>>>> if (is_gimple_assign (def_stmt)
>>>>>>     && gimple_assign_rhs_code (def_stmt) == <expr_1-code>)
>>>>>> {
>>>>>> tree op = gimple_assign_rhs1 (def_stmt);
>>>>>> if (valueize && TREE_CODE (op) == SSA_NAME)
>>>>>> {
>>>>>>   op = valueize (op);
>>>>>>   if (!op) return false;
>>>>>> }
>>>>>> <code-gen for children of expr_1 node>
>>>>>> }
>>>>>>
>>>>>> Example:
>>>>>> (negate (negate @0))
>>>>>> S1
>>>>>>
>>>>>> (negate (bit_not @0))
>>>>>> S2
>>>>>>
>>>>>> decision tree:
>>>>>>
>>>>>>                 dummy/root
>>>>>>                         |
>>>>>>            NEGATE_EXPR
>>>>>>              /                  \
>>>>>>      BIT_NOT           NEGATE_EXPR
>>>>>>            |                         |
>>>>>>          @0                     @0
>>>>>>            |                         |
>>>>>>          S1                      S2
>>>>>>
>>>>>> // code-gen for NEGATE_EXPR (child of root):
>>>>>> if (code == NEGATE_EXPR)
>>>>>> {
>>>>>> tree captures[4] = {};
>>>>>> // code gen for BIT_NOT_EXPR
>>>>>> {
>>>>>> gimple def_stmt = SSA_NAME_DEF_STMT (op0);
>>>>>> if (is_gimple_assign (def_stmt)
>>>>>>     && gimple_assign_rhs_code (def_stmt) == BIT_NOT_EXPR)
>>>>>> {
>>>>>> tree op = gimple_assign_rhs1 (def_stmt);
>>>>>> if (valueize && TREE_CODE (op) == SSA_NAME)
>>>>>> {
>>>>>>   op = valueize (op);
>>>>>>   if (!op) return false;
>>>>>> }
>>>>>>
>>>>>> // code-gen for @0, child of BIT_NOT_EXPR
>>>>>> if (!captures[0])
>>>>>>   captures[0] = op;
>>>>>> else if (captures[0] != op)
>>>>>>   return false;
>>>>>>
>>>>>> // code-gen for S1, child of @0
>>>>>> < same as code generated by .gen_gimple_transform >
>>>>>> return true;
>>>>>> }
>>>>>> // code gen for inner NEGATE_EXPR
>>>>>> {
>>>>>> gimple def_stmt = SSA_NAME_DEF_STMT (op0);
>>>>>> if (is_gimple_assign (def_stmt)
>>>>>>     && gimple_assign_rhs_code (def_stmt) == NEGATE_EXPR)
>>>>>> <rest similar to the BIT_NOT case>
>>>>>> }
>>>>>>
>>>>>> The following gets duplicated with the patch:
>>>>>> {
>>>>>> gimple_def_stmt = SSA_NAME_DEF_STMT (op0);
>>>>>> if (TREE_CODE (op0) != SSA_NAME)
>>>>>>   return false;
>>>>>> if (is_gimple_assign (def_stmt)
>>>>>>     && gimple_assign_rhs_code (def_stmt) == <expr-code>)
>>>>>> ...
>>>>>> }
>>>>>>
>>>>>> Improving code-gen for expr:
>>>>>> "gimple def_stmt = ..." and "if (TREE_CODE (op0)" get duplicated,
>>>>>> while they could be factored out, similar to this:
>>>>>>
>>>>>> {
>>>>>> gimple def_stmt = SSA_NAME_DEF_STMT (op0);
>>>>>> if (TREE_CODE (op0) != SSA_NAME)
>>>>>>   return false;
>>>>>> if (!is_gimple_assign (def_stmt))
>>>>>>   return false;
>>>>>> if (gimple_assign_rhs_code (def_stmt) == BIT_NOT_EXPR)
>>>>>> {
>>>>>>   // code-gen for BIT_NOT_EXPR subtree
>>>>>> }
>>>>>> else if (gimple_assign_rhs_code (def_stmt) == NEGATE_EXPR)
>>>>>> {
>>>>>>   // code-gen for NEGATE_EXPR subtree
>>>>>> }
>>>>>>
>>>>>> For factoring "gimple def_stmt ..." and "if (TREE_CODE (op0) != SSA_NAME"
>>>>>> we could have that generated at the parent of expr's node rather than
>>>>>> at expr. However that would be incorrect for cases where not all children
>>>>>> of a node are expressions:
>>>>>>
>>>>>> Example:
>>>>>> // patterns only for illustration
>>>>>> (negate (bit_not @0))
>>>>>> (negate @0)
>>>>>>
>>>>>>                    root
>>>>>>                      |
>>>>>>                   negate
>>>>>>                     /       \
>>>>>>                 bit_not   @0
>>>>>>                     |
>>>>>>                   @0
>>>>>>
>>>>>> we cannot have the above code generated at negate,
>>>>>> since it's not applicable negate's 2nd child (@0).
>>>>>>
>>>>>> This can be done by grouping together children that are expressions.
>>>>>> However the patch does not do that.
>>>>>>
>>>>>> * Code-gen for simplification operand
>>>>>> This involves code-gen for ifexpr and result of pattern.
>>>>>> Calls gen_gimple_transform of ifexpr and result (dt_simplify::gen_gimple)
>>>>>> So this is really code-gen off AST
>>>>>
>>>>> Right (modulo replacing captures with their replacements).
>>>>>
>>>>>> * Matching multiple patterns
>>>>>> A pattern has following parts: match, ifexpr and result.
>>>>>> If pattern fails in match operand, I guess we can safely return false ?
>>>>>> We "club" together patterns that have same match operand,
>>>>>> and use goto, if one of them fails in their (ifexpr/result) and then goto the
>>>>>> (ifexpr/result) of the next operand.
>>>>>>
>>>>>> Example:
>>>>>> /* x & 0 -> 0 */
>>>>>> (match_and_simplify
>>>>>>   (bit_and @0 @1)
>>>>>>   if (INTEGRAL_TYPE_P (TREE_TYPE (@0)) && (@1 == integer_zero_node))
>>>>>>   { integer_zero_node; })
>>>>>>
>>>>>> /* x & -1 -> x */
>>>>>> (match_and_simplify
>>>>>>   (bit_and @0 @1)
>>>>>>   if (INTEGRAL_TYPE_P (TREE_TYPE (@0))
>>>>>>      && (@1 == integer_minus_one_node)
>>>>>>   @0)
>>>>>>
>>>>>> For both patterns match is same.
>>>>>>
>>>>>> Decision Tree:
>>>>>>                 bit_and
>>>>>>                 /        \
>>>>>>              @0      @1
>>>>>>                            |
>>>>>>                         S1,  S2
>>>>>
>>>>> I think it's worth adding a diagnostic to genmach whenever exactly
>>>>> same matches appear.  But I'd say generally we'd support this
>>>>> by testing the ifexpr of the next pattern.
>>>>>
>>>>>> S1 represents <ifexpr, result> of pattern-1, and S2 represents <ifexpr, result>
>>>>>> of pattern-2 respectively.
>>>>>> S1, S2 would be stored as children of @1 (the last operand of n-ary operator),
>>>>>> in dt_operand::kids vector.
>>>>>>
>>>>>> The code would be generated as:
>>>>>>
>>>>>> matching code.
>>>>>> if (! pattern-1 ifexpr condition)
>>>>>>   goto simplify2;  // next pattern with the same "match" operand.
>>>>>> transform code for pattern 1
>>>>>>
>>>>>> simplify2:
>>>>>> if (! pattern-2 ifexpr condition)
>>>>>>   return false;  // last pattern
>>>>>> transform code for pattern 2.
>>>>>>
>>>>>> If matching itself fails, that is neither of the decisions get matched,
>>>>>> I believe we can then return false as it cannot match any other pattern ?
>>>>>>
>>>>>> * patterns needing hacks like cond_expr or GENERIC support
>>>>>> I haven't given thought to this yet. I suppose we can look to handle
>>>>>> these after adding support for GENERIC. Instead of generating GENERIC
>>>>>> matching in
>>>>>> gimple_match_and_simplify, could we then call generic_match_and_simplify from
>>>>>> within gimple_match_and_simplify ?
>>>>>
>>>>> Yes (that's what's currently done btw).
>>>>>
>>>>>> * Tests
>>>>>> The patch transformed the following patterns:
>>>>>>
>>>>>> (match_and_simplify
>>>>>>   (negate (bit_not @0))
>>>>>>   if (INTEGRAL_TYPE_P (TREE_TYPE (@0)))
>>>>>>   (plus @0 { build_int_cst (TREE_TYPE (@0)), 1); }))
>>>>>>
>>>>>> (match_and_simplify
>>>>>>   (negate (negate @0))
>>>>>>   @0)
>>>>>>
>>>>>> I have attached test-case I tried it with (negate.c)
>>>>>>
>>>>>> * Conclusion
>>>>>> Does it sound reasonable ? I am going to be re-writing the
>>>>>> decision tree from scratch, but is the basic idea fine ? Or should we
>>>>>> take a different approach ?
>>>>>>
>>>>>> Thanks and Regards,
>>>>>> Prathamesh


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]