This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: [Patch,tree-optimization]: Add new path Splitting pass on tree ssa representation


On 01/04/2016 07:32 AM, Ajit Kumar Agarwal wrote:


-----Original Message-----
From: Jeff Law [mailto:law@redhat.com]
Sent: Wednesday, December 23, 2015 12:06 PM
To: Ajit Kumar Agarwal; Richard Biener
Cc: GCC Patches; Vinod Kathail; Shail Aditya Gupta; Vidhumouli Hunsigida; Nagaraju Mekala
Subject: Re: [Patch,tree-optimization]: Add new path Splitting pass on tree ssa representation

On 12/11/2015 02:11 AM, Ajit Kumar Agarwal wrote:

Mibench/EEMBC benchmarks (Target Microblaze)

Automotive_qsort1 (4.03%), Office_ispell (4.29%), Office_stringsearch1 (3.5%), Telecom_adpcm_d (1.37%), ospfv2_lite (1.35%).
I'm having a real tough time reproducing any of these results.  In fact, I'm having a tough time seeing cases where path splitting even applies to the Mibench/EEMBC benchmarks mentioned above.

In the very few cases where split-paths might apply, the net resulting assembly code I get is the same with and without split-paths.

How consistent are these results?

I am consistently getting the gains for Office_ispell, Office_stringsearch1, and Telecom_adpcm_d. I ran it again today and we see gains in the same benchmark tests
with the split-path changes.

What functions are being affected that in turn impact performance?

For office_ispell, the functions are "linit (linit, funcdef_no=0, decl_uid=2535, cgraph_uid=0, symbol_order=2)" in lookup.c, and
                                    "checkfile (checkfile, funcdef_no=1, decl_uid=2478, cgraph_uid=1, symbol_order=4)",
                                    "correct (correct, funcdef_no=2, decl_uid=2503, cgraph_uid=2, symbol_order=5)",
                                    "askmode (askmode, funcdef_no=24, decl_uid=2464, cgraph_uid=24, symbol_order=27)"
                                    in correct.c.

For office_stringsearch1, the function is "bmhi_search (bmhi_search, funcdef_no=1, decl_uid=2178, cgraph_uid=1, symbol_order=5)"
in bmhisrch.c.
In linit there are two path splitting opportunities. Neither is a case where path splitting exposes any CSE or DCE opportunities at the tree level. In fact, in both cases no operands from the predecessors feed into the join block (the block we duplicate to split the path).

There's a path splitting opportunity in correct.c::givehelp which, AFAICT, is totally uninteresting from a performance standpoint. Ironically, it is one of the few cases where path splitting actually results in something that is better optimized at the tree level. We'd easily get the same result by sinking a statement down through a PHI, in a manner similar to what's been suggested for 64700.
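For readers following along, here is a hypothetical C-level sketch (not taken from ispell or any of the benchmarks discussed) of the kind of diamond where duplicating the join block into each arm exposes a tree-level simplification: once the join's statement is copied into each predecessor, each copy sees a known value and folds.

```c
/* Hypothetical sketch, not from the benchmarks discussed.  The join
   block ("return t + b;") has an operand `t` defined differently in
   each arm of the diamond.  If the pass duplicates the join into both
   arms, each copy sees a constant `t`, so constant propagation folds
   the addition away -- the CSE/DCE-style win path splitting aims for. */
int diamond(int a, int b)
{
    int t;
    if (a > 0)
        t = 1;        /* then arm */
    else
        t = 2;        /* else arm */
    return t + b;     /* join block: candidate for duplication */
}
```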

In correct.c::checkfile, correct.c::correct, and correct.c::askmode, the path splitting opportunities do not lead to any further simplifications at the gimple level.

Similarly for bmhisrch.c::bmhi_search.


So when I look across all these examples, the only one where path splitting exposes a CSE/DCE opportunity that actually matters for performance is adpcm_code.

The rest, AFAICT, benefit at a much lower level: a diamond in the CFG requires an unconditional branch from the end of one arm around the other arm to reach the join point, and path splitting eliminates that branch. So there's a small gain from that alone. The gain may be even larger on the MicroBlaze because of its exposed delay slot architecture -- one less slot to fill. It may also result in better code layouts, which help simplistic branch predictors.
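The branch-elimination effect can be sketched at the source level (hypothetical functions, not from any benchmark): `before` has a shared join that the then-arm must jump over the else-arm to reach; `after` shows the shape once the join is duplicated into each arm.

```c
/* Hypothetical before/after sketch of the control-flow effect only.
   In `before`, the generated code for the then-arm typically ends with
   an unconditional jump over the else-arm to reach the shared join.
   In `after`, the join ("t - 1") has been duplicated into each arm, so
   both paths fall through to their own exit and the jump disappears. */
int before(int a, int b)
{
    int t;
    if (a > 0)
        t = b * 2;    /* ends with a jump over the else arm */
    else
        t = b + 3;
    return t - 1;     /* shared join block */
}

int after(int a, int b)
{
    if (a > 0)
        return b * 2 - 1;   /* join copy 1: falls through, no jump */
    return b + 3 - 1;       /* join copy 2 */
}
```

Both versions compute the same values; only the branch structure differs.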

So I find myself wondering if the primary effect we're looking for most of the time is really the elimination of that unconditional branch. And if that's the case, then we're looking at a very different costing heuristic -- one that favors very small join blocks rather than larger ones (which supposedly help expose CSE/DCE, but in reality don't for the benchmarks I've looked at).

And if that's the case, then we may really be looking at something that belongs at the RTL level rather than at the tree/gimple level. Sadly, it's harder to do things like duplicate blocks at the RTL level.

Anyway, I'm going to ponder some more.

jeff

