This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]
Other format:	[Raw text]

Re: [PATH, SH] Small builtin_strlen improvement

From: Oleg Endo <oleg dot endo at t-online dot de>
To: Christian Bruel <christian dot bruel at st dot com>
Cc: "gcc-patches at gcc dot gnu dot org" <gcc-patches at gcc dot gnu dot org>, Kaz Kojima <kkojima at rr dot iij4u dot or dot jp>
Date: Fri, 18 Apr 2014 14:41:50 +0200
Subject: Re: [PATH, SH] Small builtin_strlen improvement
Authentication-results: sourceware.org; auth=none
References: <533288C1 dot 1080306 at st dot com> <1396213352 dot 2352 dot 12 dot camel at yam-132-YW-E178-FTW> <53391CDB dot 3030905 at st dot com>

Sorry for the delayed reply.

On Mon, 2014-03-31 at 09:44 +0200, Christian Bruel wrote:
> On 03/30/2014 11:02 PM, Oleg Endo wrote:
> > Hi,
> >
> > On Wed, 2014-03-26 at 08:58 +0100, Christian Bruel wrote:
> >
> >> This patches adds a few instructions to the inlined builtin_strlen to
> >> unroll the remaining bytes for word-at-a-time loop. This enables to have
> >> 2 distinct execution paths (no fall-thru in the byte-at-a-time loop),
> >> allowing block alignment assignation. This partially improves the
> >> problem reported with by Oleg. in [Bug target/0539] New: [SH] builtin
> >> string functions ignore loop and label alignment
> > Actually, my original concern was the (mis)alignment of the 4 byte inner
> > loop.  AFAIR it's better for the SH pipeline if the first insn of a loop
> > is 4 byte aligned.
> 
> yes, this is why I haven't closed the PR. IMHO the problem is with the
> non-aligned loop stems from to the generic alignment code in final.c.
> changing branch frequencies is quite impacting to BB reordering as well.
> Further tuning of static branch estimations, or tuning of the LOOP_ALIGN
> macro is needed. 

OK, I've updated PR 60539 accordingly.

> Note that my branch estimations in this code is very
> empirical, a dynamic profiling benchmarking would be nice as well.
> My point was just that forcing a local .align in this code is a
> workaround, as we should be able to rely on generic reordering/align
> code  for this. So the tuning of loop alignment is more global (and well
> exhibited here indeed)

I think that those two are separate issues.  I've opened a new PR 60884
for this.  Let's continue the discussions and experiments there.

Cheers,
Oleg

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]