This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Re: [Patch,tree-optimization]: Add new path Splitting pass on tree ssa representation

From: Jeff Law <law at redhat dot com>
To: Ajit Kumar Agarwal <ajit dot kumar dot agarwal at xilinx dot com>, Richard Biener <richard dot guenther at gmail dot com>
Cc: GCC Patches <gcc-patches at gcc dot gnu dot org>, Vinod Kathail <vinodk at xilinx dot com>, Shail Aditya Gupta <shailadi at xilinx dot com>, Vidhumouli Hunsigida <vidhum at xilinx dot com>, Nagaraju Mekala <nmekala at xilinx dot com>
Date: Wed, 13 Jan 2016 01:10:54 -0700
Subject: Re: [Patch,tree-optimization]: Add new path Splitting pass on tree ssa representation
Authentication-results: sourceware.org; auth=none
References: <37378DC5BCD0EE48BA4B082E0B55DFAA41F3F56C at XAP-PVEXMBX02 dot xlnx dot xilinx dot com> <37378DC5BCD0EE48BA4B082E0B55DFAA4295ADCB at XAP-PVEXMBX02 dot xlnx dot xilinx dot com> <55D4F921 dot 2020708 at redhat dot com> <37378DC5BCD0EE48BA4B082E0B55DFAA4297704C at XAP-PVEXMBX02 dot xlnx dot xilinx dot com> <5643A732 dot 4040707 at redhat dot com> <CAFiYyc07fxKy=pxo4t81M8WVOx9PLnzmMthkbBm52wub93dErQ at mail dot gmail dot com> <5644C6CC dot 90203 at redhat dot com> <5644DB59 dot 9040809 at redhat dot com> <56450B62 dot 4090404 at redhat dot com> <CAFiYyc06=_w4+B0C1d8y0rgCmNCy7vvsFfdvCqstnh6t5zsqbA at mail dot gmail dot com> <56460F19 dot 5010009 at redhat dot com> <0B62FFB6-DF7A-4080-A655-3E51070E1DEE at gmail dot com> <564646AA dot 5030300 at redhat dot com> <564673DA dot 3020403 at redhat dot com> <CAFiYyc2WXOS5kJFeKPa0aqaWbHV-CnDeWJ98zRa5X=FbAZ2THw at mail dot gmail dot com> <5669DBCD dot 1060507 at redhat dot com> <37378DC5BCD0EE48BA4B082E0B55DFAA429D4950 at XAP-PVEXMBX02 dot xlnx dot xilinx dot com> <567A40E4 dot 1030508 at redhat dot com> <37378DC5BCD0EE48BA4B082E0B55DFAA429DED02 at XAP-PVEXMBX02 dot xlnx dot xilinx dot com>

On 01/04/2016 07:32 AM, Ajit Kumar Agarwal wrote:


I am consistently getting gains for office_ispell, office_stringsearch1 and telcom_adpcm_d. I ran it again today and we see gains in the same benchmark tests
with the split path changes.

What functions are being affected that in turn impact performance?


For office_ispell, the affected functions are, in lookup.c:
    "Function linit (linit, funcdef_no=0, decl_uid=2535, cgraph_uid=0, symbol_order=2)"
and in correct.c:
    "Function checkfile (checkfile, funcdef_no=1, decl_uid=2478, cgraph_uid=1, symbol_order=4)"
    "Function correct (correct, funcdef_no=2, decl_uid=2503, cgraph_uid=2, symbol_order=5)"
    "Function askmode (askmode, funcdef_no=24, decl_uid=2464, cgraph_uid=24, symbol_order=27)"

For office_stringsearch1, the affected function is "Function bmhi_search (bmhi_search, funcdef_no=1, decl_uid=2178, cgraph_uid=1, symbol_order=5)"
in bmhisrch.c.

So I can see split-paths affecting adpcm & lookup. I don't see it affecting correct.c or bmhisrch.c.

That's progress though. It's likely that one or more of the flags is critical, so thanks for passing those along.

I'm going to focus on adpcm for the moment, in particular adpcm_coder. It appears the key blocks are:



;;   basic block 14, loop depth 1, count 0, freq 9100, maybe hot
;;    prev block 13, next block 15, flags: (NEW, REACHABLE)
;;    pred:       12 [100.0%]  (FALLTHRU,EXECUTABLE)
;;                13 [100.0%]  (FALLTHRU,EXECUTABLE)
  # valpred_12 = PHI <valpred_54(12), valpred_55(13)>
  _112 = MAX_EXPR <valpred_12, -32768>;
  valpred_18 = MIN_EXPR <_112, 32767>;
  delta_56 = delta_7 | iftmp.1_114;
  _57 = indexTable[delta_56];
  index_58 = _57 + index_107;
  _113 = MIN_EXPR <index_58, 88>;
  index_111 = MAX_EXPR <_113, 0>;
  step_59 = stepsizeTable[index_111];
  if (bufferstep_93 != 0)
    goto <bb 15>;
  else
    goto <bb 16>;
;;    succ:       15 [50.0%]  (TRUE_VALUE,EXECUTABLE)
;;                16 [50.0%]  (FALSE_VALUE,EXECUTABLE)

;;   basic block 15, loop depth 1, count 0, freq 4550, maybe hot
;;    prev block 14, next block 16, flags: (NEW, REACHABLE)
;;    pred:       14 [50.0%]  (TRUE_VALUE,EXECUTABLE)
  _60 = delta_56 << 4;
  goto <bb 17>;
;;    succ:       17 [100.0%]  (FALLTHRU,EXECUTABLE)

;;   basic block 16, loop depth 1, count 0, freq 4550, maybe hot
;;    prev block 15, next block 17, flags: (NEW, REACHABLE)
;;    pred:       14 [50.0%]  (FALSE_VALUE,EXECUTABLE)
  outp_62 = outp_83 + 1;
  _63 = (signed char) delta_56;
  _65 = (signed char) outputbuffer_90;
  _66 = _63 | _65;
  *outp_83 = _66;
;;    succ:       17 [100.0%]  (FALLTHRU,EXECUTABLE)

;;   basic block 17, loop depth 1, count 0, freq 9100, maybe hot
;;    prev block 16, next block 18, flags: (NEW, REACHABLE)
;;    pred:       15 [100.0%]  (FALLTHRU,EXECUTABLE)
;;                16 [100.0%]  (FALLTHRU,EXECUTABLE)
  # outp_3 = PHI <outp_83(15), outp_62(16)>
  # outputbuffer_21 = PHI <_60(15), outputbuffer_90(16)>
  _109 = bufferstep_93 ^ 1;
  _98 = _109 & 1;
  ivtmp.11_68 = ivtmp.11_105 + 2;
  if (ivtmp.11_68 != _116)
    goto <bb 4>;
  else
    goto <bb 18>;

Block #17 is the join point that we're going to effectively copy into blocks #15 and #16. Doing so in turn exposes bufferstep_93 as the constant 0 in block #16, which in turn allows elimination of a couple statements in the extended version of block #16 and we propagate the constant 1 for bufferstep_93 to the top of the loop when reached via block #16. So we save a few instructions. However, I think we're actually doing a fairly poor job here.
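
To make the effect of the split concrete, here is a rough C-level sketch of the transformation described above. It is a simplified stand-in for the nibble-packing tail of the loop, not the actual adpcm_coder source, and the names pack_joined and pack_split are made up for illustration. The first function keeps the join block after the if/else; the second has the join block copied into both arms, where the false arm can fold the flip to the constant 1.

```c
#include <stddef.h>

/* Original shape: the flip of bufferstep lives in the join block
   (the analogue of bb 17) that follows both arms. */
size_t pack_joined(const unsigned char *delta, size_t n, unsigned char *out)
{
    unsigned char *outp = out;
    int outputbuffer = 0;
    int bufferstep = 1;

    for (size_t i = 0; i < n; i++) {
        int d = delta[i] & 0x0f;
        if (bufferstep != 0) {
            outputbuffer = d << 4;                      /* analogue of bb 15 */
        } else {
            *outp++ = (unsigned char)(d | outputbuffer); /* analogue of bb 16 */
        }
        bufferstep = (bufferstep ^ 1) & 1;              /* join block */
    }
    return (size_t)(outp - out);
}

/* After the split: the join block is duplicated into each arm. */
size_t pack_split(const unsigned char *delta, size_t n, unsigned char *out)
{
    unsigned char *outp = out;
    int outputbuffer = 0;
    int bufferstep = 1;

    for (size_t i = 0; i < n; i++) {
        int d = delta[i] & 0x0f;
        if (bufferstep != 0) {
            outputbuffer = d << 4;
            /* Only "nonzero" is known on this path, so the copy of the
               join block stays as it was. */
            bufferstep = (bufferstep ^ 1) & 1;
        } else {
            *outp++ = (unsigned char)(d | outputbuffer);
            /* bufferstep is known to be 0 on this path, so the xor and
               the mask fold away to the constant 1. */
            bufferstep = 1;
        }
    }
    return (size_t)(outp - out);
}
```

Both functions should produce identical output; the only difference is that the false arm of the split version no longer needs the xor and the mask.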

bufferstep is a great example of a flip-flop variable and its value is statically computable based on the path from the prior loop iteration which, if exploited, would allow the FSM threader to eliminate the conditional at the end of bb14. I'm going to have to play with that.
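
As a hand-written illustration of the flip-flop idea (not what the FSM threader actually emits, and with made-up function names), consider a loop driven by a flag that alternates 1, 0, 1, 0. Because the flag's value on entry to each iteration is fully determined by which arm ran last time, the state can be encoded in the control flow instead, and the test on the flag disappears:

```c
#include <stddef.h>

/* Flag-driven form: every iteration tests the flip-flop variable. */
long weave_flag(const int *v, size_t n)
{
    long acc = 0;
    int step = 1;                 /* flip-flop: 1, 0, 1, 0, ... */

    for (size_t i = 0; i < n; i++) {
        if (step != 0)
            acc += v[i];
        else
            acc -= v[i];
        step = !step;
    }
    return acc;
}

/* Threaded form: the value of the flag is implied by the position in
   the code, so the conditional on the flag is gone. */
long weave_threaded(const int *v, size_t n)
{
    long acc = 0;
    size_t i = 0;

    for (;;) {
        /* State "step == 1": reached from loop entry or from the
           "step == 0" copy below. */
        if (i == n)
            break;
        acc += v[i++];

        /* State "step == 0": reached only from the copy above. */
        if (i == n)
            break;
        acc -= v[i++];
    }
    return acc;
}
```

The two functions are intended to be equivalent for any n; the second simply has one copy of the loop body per value of the flag, with only the loop exit test remaining in each copy.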

Anyway, it's late and I want to rip this test apart a bit more and see how it interacts with the heuristic that I've cobbled together as well as see what it would take to have DOM or VRP get data on bufferstep_93 on the true path out of BB14 after a path-split.


Jeff

