This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [gsoc 2014] moving fold-const patterns to gimple


On Thu, Mar 13, 2014 at 4:44 PM, Richard Biener
<richard.guenther@gmail.com> wrote:
> On Tue, Mar 11, 2014 at 12:20 PM, Richard Biener
> <richard.guenther@gmail.com> wrote:
>> On Mon, Mar 10, 2014 at 7:29 PM, Prathamesh Kulkarni
>> <bilbotheelffriend@gmail.com> wrote:
>>> Hi Richard,
>>> Sorry for the late reply. I would like to have few clarifications
>>> regarding the following points:
>>>
>>> a) Pattern matching: Currently, gimple_match_and_simplify() matches
>>> patterns one-by-one. Could we use a decision tree to do the matching
>>> instead (similar to insn-recog.c) ?
>>> For the moment, let's consider pattern matching on only unary
>>> expressions without valueize and predicates:
>>> pattern 1: (negate (negate @0))
>>> pattern 2: (negate (bit_not @0))
>>>
>>> from the two AST's corresponding to patterns (match::op), we can build
>>> a decision tree:
>>> Some-thing similar to:
>>>                        NEGATE_EXPR
>>>         NEGATE_EXPR            BIT_NOT_EXPR
>>>
>>> and then generate code corresponding to this decision tree in gimple-match.c
>>> so the generated code should look something similar to:
>>>
>>> tree
>>> gimple_match_and_simplify (enum tree_code code, tree type, tree op0,
>>> gimple_seq *seq, tree (*valueize)(tree))
>>> {
>>>   if (code == NEGATE_EXPR)
>>>     {
>>>       tree captures[4] = {};
>>>       if (TREE_CODE (op0) != SSA_NAME)
>>>         return NULL_TREE;
>>>       gimple def_stmt = SSA_NAM_DEF_STMT (op0);
>>>       if (!is_gimple_assign (def_stmt))
>>>         return NULL_TREE;
>>>       tree op = gimple_assign_rhs1 (def_stmt);
>>>       if (gimple_assign_rhs_code (op) == NEGATE_EXPR)
>>>         {
>>>            /* pattern (negate (negate @0)) matched */
>>>         }
>>>       else if (gimple_assign_rhs_code (op) == BIT_NOT_EXPR)
>>>         {
>>>            /* pattern (negate (bit_not_expr @0)) matched */
>>>         }
>>>       else
>>>            return NULL_TREE;
>>>      }
>>>   else
>>>      return NULL_TREE;
>>> }
>>>
>>> For commutative ops, the pattern can be duplicated by walking the
>>> children of the node in reverse order.
>>> (I am not exactly clear so far about representing binary operators in a decision
>>> tree) Is this the right way to go ? I shall try to shortly post a patch that
>>> implements this.
>>
>> Yes, that's the way to go (well, I'd even use a switch ()).
>>
>>> b) Targeting GENERIC, separating AST from gimple/generic:
>>> For generating a GENERIC pattern should there be another pattern
>>> something like match_and_simplify_generic ?
>>
>> Yes, there is an existing API in GCC for this that operates on GENERIC.
>> It's fold_unary_loc, fold_binary_loc, fold_ternary_loc.  The interface
>> the GENERIC match_and_simplify variant provides should match
>> that one.
>>
>>> Currently, the AST data structures (operand, expr, etc.)
>>> are tied to gimple (gen_gimple_match, gen_gimple_transform).
>>> We could also have similar functions: gen_generic_match,
>>> gen_generic_transform for generating GENERIC ?
>>
>> Yeah, but I'm not sure if keeping the (virtual) methods for generating
>> code makes much sense with a rewritten code generator.
>>
>>> Instead will it be better if we separate the AST
>>> from target IR (generic/gimple) and make simplify a visitor on AST
>>> (make simplify
>>> abstract class, with simplify_generic and simplify_gimple visitor
>>> classes that generate corresponding IR code) ?
>>
>> Yes.  Keep in mind the current state of genmatch.c is "quick hack
>> to make playing with the API side and with patterns possible" ;)
>>
>>> c) Shall it be a good idea in define_match <name> <pattern>, for
>>> name to act as a substitute for pattern (similar to flex pattern
>>> definitions), so the name can be used in bigger patterns ?
>>
>> Maybe, I suppose we'll see when adding more patterns.
>>
>>> d) This is silly, but maybe use constants to denote corresponding tree nodes ?
>>> for example instead of { build_int_cst (integer_type_node, 0); }, one
>>> could directly write 0, to denote a INTEGER_CST node with value 0.
>>
>> Yes, that might be possible - though it will require more knowledge
>> in the pattern matcher (you also want to match '0'?) and the code
>> generator.
>>
>>> e) There was a mention on the thread, regarding testing of patterns
>>> integrated into DSL. I wasn't able to understand that clearly. Could
>>> you explain that briefly ?
>>
>> DSL?  Currently I'd say it would be nice to make sure each pattern
>> is triggered by at least one GCC testcase - this requires looking
>> at a particular pass dump (that of forwprop or ccp are probably most suitable
>> as they are run very early).  I mentioned the possibility to do offline
>> (thus not with C testcases) testing but that would require some tool
>> to do that and it would be correctness testing (some automatic proof
>> generation tool - ISTR academics have this kind of stuff).  But that was
>> just an idea.
>>
>>> Regarding gsoc proposal, I would like to align it on the following points:
>>> a) Pattern matching using decision tree
>>
>> good.
>>
>>> b) Generate GIMPLE folding patterns (tree-ssa-forwprop,
>>> tree-ssa-sccvn, gimple-fold)
>>
>> I'd narrow it down a bit, you can optionally do more if time permits.
>> I'd say
>>   0) add basic arithmetic identities (x + 0, x  * 1, x / 1, etc., correctly
>> for all types - you can look into fold-const.c which handles all of them)
>>   1) target as much as possible of the existing transforms in forwprop
>>   2) pieces of fold-const.c: most interesting user is
>> gimple_fold_stmt_to_constant_1 (used by CCP and SCCVN and maybe
>> others)
>>   3) fold-const.c's fold_comparison (and fold_binary parts handling
>> comparison codes), this is also necessary for quite important parts of
>> forwprop
>>
>>> c) Generate GENERIC folding patterns (fold-const)
>>> Is that fine ? I would like to hear about any changes that you feel should be
>>> made.
>>
>> This isn't easily distinguishable from b), instead we might want to
>>
>>   c) Remove parts of fold-const.c that can be subsumed by the GENERIC
>> variant of the folding patterns
>>
>>> I have been mostly experimenting with the patch, by adding few patterns
>>> from tree-ssa-forwprop, and shall try to do pattern matching by a decision tree.
>>> Is there anything else you would like to suggest that
>>> can help me get comfortable with the code-base and the project-idea and
>>> could possibly help me strengthen my proposal ?
>>
>> Nothing off my head right now - there is always bugzilla to look for
>> missed optimization bugs.
>
> There are two meta-bugs (that link specific ones) for this area:
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=15459 and
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19986
>
Hi Richard,
Thanks for pointing out the links. I will try and solve the mentioned bugs

I had a look at PR 14753
(http://gcc.gnu.org/bugzilla/show_bug.cgi?id=14753) from the first
link. I have tried to implement those transforms (attached patch,
stage-1 compiled).
I have written the transforms to operate on GENERIC.
Is that correct ?
The patterns mentioned in the links were:
a) (X >> CST1) >= CST2 -> X >= CST2 << CST1
however, an expression Y >= CST gets folded to Y > CST - 1
so the transform I wrote:
(X >> CST1) > CST2 -> X > CST2 << CST1

b) (X & ~CST) == 0 -> X <= CST
However ~CST gets folded.

so the transform I wrote:
(X & CST) == 0 -> X <= ~CST (~CST is folded by calling fold_unary)

After applying the patch, I got this output for the test-case in the
PR (169t.optimized):

;; Function foo (foo, funcdef_no=0, decl_uid=1745, symbol_order=0)
foo (unsigned int a)
{
  <bb 2>:
  if (a_1(D) > 8)
    goto <bb 3>;
  else
    goto <bb 4>;

  <bb 3>:
  bar ();

  <bb 4>:
  return;

}

;; Function baz (baz, funcdef_no=1, decl_uid=1748, symbol_order=1)

baz (unsigned int a)
{
  <bb 2>:
  if (a_1(D) <= 7)
    goto <bb 3>;
  else
    goto <bb 4>;

  <bb 3>:
  bar ();

  <bb 4>:
  return;

}

I have two very fundamental doubts:

a) It appears to me that folding is performed at all levels: GENERIC
(fold-const), GIMPLE (gimple-fold) and RTL (simplify-rtx). Is that
because, lowering IR may introduce opportunities for folding ? So it is needed
to be performed at each level ? Also since RTL expansion is semi-automatic
(by that I mean custom C code used in .md files to construct RTL), the
RTL insns may require folding ?

b) Why do we want to target GENERIC too ?
If I understand correctly, currently gimple folding is done by
genericizing (fold_stmt) ?
Why not move off folding entirely to GIMPLE ?
This shall probably be an issue, when front-ends (C for example)
require to perform constant folding
(for example expression in case statement), so the required expression
needs to be gimplified for evaluation.
So by generating both GENERIC and GIMPLE from a meta-description,
we avoid the cost of genericizing (as it's currently done),
and gimplifying an expression for evaluation (if all folding is moved
to gimple) ? Or have I completely missed the point here ?

Regarding the proposal, I have the following tentative timeline
in my mind:

Present - May 18: Get familiar with folding patterns (in fold-const,
tree-ssa-forwprop),
solve bugs mentioned in the links, start moving simple patterns to match.pd

May 19 - June 27:
a) Implement pattern matching using decision tree.
b) Support generating gimple and generic patterns
(gimple_match_and_simplify, generic_match_and_simplify)
c) Basic identities (for example the ones in fold_binary_loc:case
PLUS_EXPR and similar) for all types.
d) fold_comparison patterns and fold_binary handling of comparison codes.

June 28 - July 16:
e) Patterns from tree-ssa-forwprop

July 16 - August 11:
f) Patterns from fold-const.c

August 12 - August 18:
Fixing bugs, documentation, testing.
Also I would like to put somewhere "extending meta-description",
if required for writing patterns.

I would like to put "extending the meta-description", if it would be
needed while writing patterns.
I assume it shall take time to move most of patterns from
tree-ssa-forwprop and fold-const so I put them as single tasks in
respective time periods. If time permits, I would do more. I am ready
to commit 60-65 hours per week for the project. I shall get a draft of
the proposal ready within a couple of days.

Thanks and Regards,
Prathamesh
> Richard.
>
>> I've created svn//gcc.gnu.org/svn/gcc/branches/match-and-simplify now
>> and committed my current patch state.  I've used gcc/ChangeLog.mas
>> to log changes to the branch.
>>
>> Thanks,
>> Richard.
>>
>>> Thanks and Regards,
>>> Prathamesh
>>>>
>>>> Richard.
>>>>
>>>>>> Thanks,
>>>>>> Richard.
>>>>>>
>>>>>>> I have a fluent grasp on C and working knowledge of flex, bison, C++,
>>>>>>> POSIX api, binutils and shell scripting (bash),
>>>>>>> I have been through most of dragon book, built an interpreter
>>>>>>> for a "c-like" language and a C-code generator for a toy language
>>>>>>> similar to python.
>>>>>>>
>>>>>>> As far as gcc goes, I had attended IIT Bombay gcc workshop 2013:
>>>>>>> http://www.cse.iitb.ac.in/grc/gcc-workshop-13/
>>>>>>> and have been through the online docs.
>>>>>>>
>>>>>>> I have a couple of one-liner patches committed:
>>>>>>> http://gcc.gnu.org/ml/gcc-patches/2014-02/msg00490.html
>>>>>>> http://gcc.gnu.org/ml/gcc-patches/2014-02/msg01144.html
>>>>>>>
>>>>>>> A few pending patches:
>>>>>>> http://gcc.gnu.org/ml/gcc-patches/2014-02/msg00143.html
>>>>>>> http://gcc.gnu.org/ml/gcc-patches/2014-02/msg00143.html
>>>>>>> http://gcc.gnu.org/ml/gcc-patches/2014-02/msg01220.html
>>>>>>> http://gcc.gnu.org/ml/gcc/2014-01/msg00268.html
>>>>>>>
>>>>>>> and a couple of rejected ones:
>>>>>>> http://gcc.gnu.org/ml/gcc-patches/2014-02/msg00957.html
>>>>>>> http://gcc.gnu.org/ml/gcc-patches/2014-02/msg01366.html
>>>>>>>
>>>>>>> Please do let me know what you think.
>>>>>>>
>>>>>>> Thanks and Regards,
>>>>>>> Prathamesh
>>>>>>>
>>>>>>>> Richard.
>>>>>>>>
>>>>>>>>> On the following test-case:
>>>>>>>>> int main()
>>>>>>>>> {
>>>>>>>>>   int a, b, c;
>>>>>>>>>   a = ~~~~b;
>>>>>>>>>   c = ~-a;
>>>>>>>>>   return 0;
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> The following GIMPLE is generated:
>>>>>>>>> main ()
>>>>>>>>> gimple_bind <
>>>>>>>>>   int D.1748;
>>>>>>>>>   int D.1749;
>>>>>>>>>   int D.1750;
>>>>>>>>>   int D.1751;
>>>>>>>>>   int D.1752;
>>>>>>>>>   int a;
>>>>>>>>>   int b;
>>>>>>>>>   int c;
>>>>>>>>>
>>>>>>>>>   gimple_assign <var_decl, D.1749, b, NULL, NULL>
>>>>>>>>>   gimple_assign <var_decl, a, D.1749, NULL, NULL>
>>>>>>>>>   gimple_assign <plus_expr, c, a, -1, NULL>
>>>>>>>>>   gimple_assign <integer_cst, D.1752, 0, NULL, NULL>
>>>>>>>>>   gimple_return <D.1752>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> The patch generates two tuples for a = ~~~~b,
>>>>>>>>> where only one is needed, and extra temporaries, which
>>>>>>>>> are not removed after the folding. How should I go about
>>>>>>>>> removing that (should I worry about that since subsequent passes,
>>>>>>>>> shall remove those ?)
>>>>>>>>>
>>>>>>>>> b) Some front-ends, C, for example, requires constant folding in certain places,
>>>>>>>>> like case statement. If constant folding is completely moved off to gimple,
>>>>>>>>> how shall this be handled ? Shall we gimplify the expression immediately if
>>>>>>>>> it's required to be evaluated ?
>>>>>>>>>
>>>>>>>>> Thanks and Regards,
>>>>>>>>> Prathamesh
Index: gcc/fold-const.c
===================================================================
--- gcc/fold-const.c	(revision 208539)
+++ gcc/fold-const.c	(working copy)
@@ -68,6 +68,7 @@ along with GCC; see the file COPYING3.
 #include "gimplify.h"
 #include "tree-dfa.h"
 #include "hash-table.h"  /* Required for ENABLE_FOLD_CHECKING.  */
+#include "tree-pretty-print.h"
 
 /* Nonzero if we are folding constants inside an initializer; zero
    otherwise.  */
@@ -9698,6 +9699,33 @@ fold_comparison (location_t loc, enum tr
 				       fold_convert_loc (loc, cmp_type, arg1)));
     }
 
+  /* Fold (X >> CST1) > CST2 -> X > CST2 << CST1 */ 
+  if (TREE_CODE (arg0) == RSHIFT_EXPR && code == GT_EXPR)
+    {
+      tree var = TREE_OPERAND (arg0, 0);
+      tree cst1 = TREE_OPERAND (arg0, 1);
+      if ((TREE_CODE (cst1) == INTEGER_CST || TREE_CODE (arg1) == VECTOR_CST)
+	  && (TREE_CODE (arg1) == INTEGER_CST || TREE_CODE (arg1) == VECTOR_CST))
+	return fold_build2_loc (loc, GT_EXPR, type, var, 
+				fold_convert_loc (loc, TREE_TYPE (var),
+						  int_const_binop (LSHIFT_EXPR, cst1, arg1)));
+    }
+
+  /* Fold (X & CST) == 0 -> X <= ~CST */
+  if (TREE_CODE (arg0) == BIT_AND_EXPR && code == EQ_EXPR && integer_zerop (arg1))
+    { 
+      tree var = TREE_OPERAND (arg0, 0);
+      tree cst = TREE_OPERAND (arg0, 1);
+      tree bit_not_expr; 
+
+      if (TREE_CODE (cst) == INTEGER_CST || TREE_CODE (cst) == VECTOR_CST)
+        {
+          bit_not_expr = fold_unary (BIT_NOT_EXPR, TREE_TYPE (cst), cst);  
+	  return fold_build2_loc (loc, LE_EXPR, type, var,
+				  fold_convert_loc (loc, TREE_TYPE (var), bit_not_expr));
+	}
+    }
+  
   return NULL_TREE;
 }
 

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]