This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Index Nav: | [Date Index] [Subject Index] [Author Index] [Thread Index] | |
---|---|---|
Message Nav: | [Date Prev] [Date Next] | [Thread Prev] [Thread Next] |
Other format: | [Raw text] |
On Tue, Oct 29, 2013 at 3:28 AM, Richard Biener <richard.guenther@gmail.com> wrote: > On Sun, Oct 27, 2013 at 7:55 PM, Andrew Pinski <pinskia@gmail.com> wrote: >> On Sat, Oct 26, 2013 at 4:49 PM, Andrew Pinski <pinskia@gmail.com> wrote: >>> On Sat, Oct 26, 2013 at 2:30 PM, Andrew Pinski <pinskia@gmail.com> wrote: >>>> On Fri, Oct 18, 2013 at 2:21 AM, Zhenqiang Chen >>>> <zhenqiang.chen@linaro.org> wrote: >>>>> On 18 October 2013 00:58, Jeff Law <law@redhat.com> wrote: >>>>>> On 10/17/13 05:03, Richard Biener wrote: >>>>>>>>> >>>>>>>>> Is it OK for trunk? >>>>>>>> >>>>>>>> >>>>>>>> I had a much simpler change which did basically the same from 4.7 (I >>>>>>>> can update it if people think this is a better approach). >>>>>>> >>>>>>> >>>>>>> I like that more (note you can now use is_gimple_condexpr as predicate >>>>>>> for force_gimple_operand). >>>>>> >>>>>> The obvious question is whether or not Andrew's simpler change picks up as >>>>>> many transformations as Zhenqiang's change. If not are the things missed >>>>>> important. >>>>>> >>>>>> Zhenqiang, can you do some testing of your change vs Andrew P.'s change? >>>>> >>>>> Here is a rough compare: >>>>> >>>>> 1) Andrew P.'s change can not handle ssa-ifcombine-ccmp-3.c (included >>>>> in my patch). Root cause is that it does not skip "LABEL". The guard >>>>> to do this opt should be the same the bb_has_overhead_p in my patch. >>>> >>>> This should be an easy change, I am working on this right now. >>>> >>>>> >>>>> 2) Andrew P.'s change always generate TRUTH_AND_EXPR, which is not >>>>> efficient for "||". e.g. For ssa-ifcombine-ccmp-6.c, it will generate >>>>> >>>>> _3 = a_2(D) > 0; >>>>> _5 = b_4(D) > 0; >>>>> _6 = _3 | _5; >>>>> _9 = c_7(D) <= 0; >>>>> _10 = ~_6; >>>>> _11 = _9 & _10; >>>>> if (_11 == 0) >>>>> >>>>> With my patch, it will generate >>>>> >>>>> _3 = a_2(D) > 0; >>>>> _5 = b_4(D) > 0; >>>>> _6 = _3 | _5; >>>>> _9 = c_7(D) > 0; >>>>> _10 = _6 | _9; >>>>> if (_10 != 0) >>>> >>>> As mentioned otherwise, this seems like a missed optimization inside >>>> forwprop. When I originally wrote this code there used to be two >>>> cases one for & and one for |, but this was removed sometime and I >>>> just made the code evolve with that. >>> >>> Actually I can make a small patch (3 lines) to my current patch which >>> causes tree-ssa-ifcombine.c to produce the behavior of your patch. >>> >>> if (result_inv) >>> { >>> t = fold_build1 (TRUTH_NOT_EXPR, TREE_TYPE (t), t); >>> result_inv = false; >>> } >>> >>> I will submit a new patch after some testing of my current patch which >>> fixes items 1 and 2. >> >> >> Here is my latest patch which adds the testcases from Zhenqiang's >> patch and fixes item 1 and 2. >> >> OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions. > > return i; > } > +/* Return a new iterator pointing to the first non-debug non-label statement in > + basic block BB. */ > + > +static inline gimple_stmt_iterator > +gsi_start_nondebug_after_labels_bb (basic_block bb) > > > vertical space before the comment missing. > > @@ -631,7 +669,7 @@ tree_ssa_ifcombine (void) > bbs = single_pred_before_succ_order (); > calculate_dominance_info (CDI_DOMINATORS); > > - for (i = 0; i < n_basic_blocks - NUM_FIXED_BLOCKS; ++i) > + for (i = n_basic_blocks - NUM_FIXED_BLOCKS -1; i >= 0; i--) > { > basic_block bb = bbs[i]; > gimple stmt = last_stmt (bb); > > The original code matches what phiopt does which even has a comment: > > /* Search every basic block for COND_EXPR we may be able to optimize. > > We walk the blocks in order that guarantees that a block with > a single predecessor is processed before the predecessor. > This ensures that we collapse inner ifs before visiting the > outer ones, and also that we do not try to visit a removed > block. */ > bb_order = single_pred_before_succ_order (); > n = n_basic_blocks - NUM_FIXED_BLOCKS; > > for (i = 0; i < n; i++) > > so if the reverse order is better here (and in phiopt?) then it at least > needs a comment explaining why. > > Otherwise the patch looks good, but please iterate with Jeff over > the dom testcase issue. Here is what I applied in the end; Jeff told me just to remove the testcase. I added the comment trying to explain why it was the opposite order of PHI-opt. Thanks, Andrew Pinski ChangeLog: * tree-ssa-ifcombine.c: Include rtl.h and tm_p.h. (ifcombine_ifandif): Handle cases where maybe_fold_and_comparisons fails, combining the branches anyways. (tree_ssa_ifcombine): Inverse the order of the basic block walk, increases the number of combinings. * gimple.h (gsi_start_nondebug_after_labels_bb): New function. testsuite/ChangeLog: * gcc.dg/tree-ssa/ssa-ifcombine-ccmp-1.c: New test case. * gcc.dg/tree-ssa/ssa-ifcombine-ccmp-2.c: New test case. * gcc.dg/tree-ssa/ssa-ifcombine-ccmp-3.c: New test case. * gcc.dg/tree-ssa/ssa-ifcombine-ccmp-4.c: New test case. * gcc.dg/tree-ssa/ssa-ifcombine-ccmp-5.c: New test case. * gcc.dg/tree-ssa/ssa-ifcombine-ccmp-6.c: New test case. * gcc.dg/tree-ssa/phi-opt-9.c: Use a function call to prevent conditional move to be used. * gcc.dg/tree-ssa/ssa-dom-thread-3.c: Remove. > > Thanks, > Richard. > >> Thanks, >> Andrew Pinski >> >> ChangeLog: >> * tree-ssa-ifcombine.c: Include rtl.h and tm_p.h. >> (ifcombine_ifandif): Handle cases where >> maybe_fold_and_comparisons fails, combining the branches >> anyways. >> (tree_ssa_ifcombine): Inverse the order of >> the basic block walk, increases the number of combinings. >> * gimple.h (gsi_start_nondebug_after_labels_bb): New function. >> >> >> testsuite/ChangeLog: >> >> * gcc.dg/tree-ssa/ssa-ifcombine-ccmp-1.c: New test case. >> * gcc.dg/tree-ssa/ssa-ifcombine-ccmp-2.c: New test case. >> * gcc.dg/tree-ssa/ssa-ifcombine-ccmp-3.c: New test case. >> * gcc.dg/tree-ssa/ssa-ifcombine-ccmp-4.c: New test case. >> * gcc.dg/tree-ssa/ssa-ifcombine-ccmp-5.c: New test case. >> * gcc.dg/tree-ssa/ssa-ifcombine-ccmp-6.c: New test case. >> >> * gcc.dg/tree-ssa/phi-opt-9.c: Use a function call to prevent >> conditional move to be used. >> * gcc.dg/tree-ssa/ssa-dom-thread-3.c: Remove check for "one or more >> intermediate". >> >> >> >> >>> >>> Thanks, >>> Andrew Pinski >>> >>> >>>> >>>>> >>>>> 3) The good thing of Andrew P.'s change is that "Inverse the order of >>>>> the basic block walk" so it can do combine recursively. >>>>> >>>>> But I think we need some heuristic to control the number of ifs. Move >>>>> too much compares from >>>>> the inner_bb to outer_bb is not good. >>>> >>>> >>>> I think this depends on the target. For MIPS we don't want an upper >>>> bound as integer comparisons go directly to GPRs. I wrote this patch >>>> with MIPS in mind as that was the target I was working on. >>>> >>>>> >>>>> 4) Another good thing of Andrew P.'s change is that it reuses some >>>>> existing functions. So it looks much simple. >>>> >>>> >>>> Thanks, >>>> Andrew Pinski >>>> >>>>> >>>>>>> >>>>>>> With that we should be able to kill the fold-const.c transform? >>>>>> >>>>>> That would certainly be nice and an excellent follow-up for Zhenqiang. >>>>> >>>>> That's my final goal to "kill the fold-const.c transform". I think we >>>>> may combine the two changes to make a "simple" and "good" patch. >>>>> >>>>> Thanks! >>>>> -Zhenqiang
Attachment:
ssa_ifcombine.patch.txt
Description: Text document
Index Nav: | [Date Index] [Subject Index] [Author Index] [Thread Index] | |
---|---|---|
Message Nav: | [Date Prev] [Date Next] | [Thread Prev] [Thread Next] |