This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.


[Bug target/31640] cache block alignment is too aggressive on sh-elf

From: "oleg dot endo at t-online dot de" <gcc-bugzilla at gcc dot gnu dot org>
To: gcc-bugs at gcc dot gnu dot org
Date: Sat, 31 Dec 2011 17:24:47 +0000
Subject: [Bug target/31640] cache block alignment is too aggressive on sh-elf
Auto-submitted: auto-generated
References: <bug-31640-4@http.gcc.gnu.org/bugzilla/>

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31640

--- Comment #3 from Oleg Endo <oleg.endo@t-online.de> 2011-12-31 17:24:47 UTC ---
Created attachment 26208
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=26208
Proposed patch

(In reply to comment #0)
> The sh4 port aligns blocks that have no fallthrus and that are either
> frequently executed (JUMP_ALIGN) or preceded by a barrier
> (LABEL_ALIGN_AFTER_BARRIER) on a cache line.
> 
> While in theory this helps to avoid cache misses if the block splits over 2 cache
> lines, in practice this reduces cache locality and lengthens the distance between
> blocks.
> The number of issued instructions is also impacted. For example the relative
> indirect address in jump tables needs a byte zero extend instruction if the
> distance occupies 8 bits instead of 7 bits.
> 
> I ran some experiments and benchmarked (eembc) with 2 strategies
> 1) -falign-jumps=1
> 2) Align the block if the size is bigger than a given threshold. (empirically
> set to 16 bytes, half of the cache line size). See illustrating attached patch.
> 
> My conclusion is that at -O3 the performance never degrades when this padding
> is removed (option 2 is a little bit better, even improving dhrystone by 3%),
> and the text size improves by ~15%.
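
For illustration, the threshold rule from option 2 could be sketched as below.
This is not the attached patch: the function names are made up for this
example, and the 32-byte cache line is only inferred from the remark that 16
bytes is half of the cache line size.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumption: 32-byte cache line, inferred from "16 bytes, half of the
   cache line size" in the report. */
#define CACHE_LINE_BYTES 32u
#define ALIGN_THRESHOLD_BYTES (CACHE_LINE_BYTES / 2u)

/* Hypothetical helper: align a block only if it is strictly bigger than
   the threshold. */
static bool should_align_block(size_t block_size_bytes)
{
    return block_size_bytes > ALIGN_THRESHOLD_BYTES;
}

/* Hypothetical helper: number of padding bytes to emit before a block that
   would otherwise start at `address`.  Small blocks get no padding. */
static size_t padding_before_block(size_t address, size_t block_size_bytes)
{
    if (!should_align_block(block_size_bytes))
        return 0;

    size_t misalignment = address % CACHE_LINE_BYTES;
    return misalignment == 0 ? 0 : CACHE_LINE_BYTES - misalignment;
}
```

With this rule an 8-byte block at address 40 is left where it is, while a
20-byte block at the same address is pushed to the next cache line boundary.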

Because of this I would like to propose the following alignment strategies
(unless they are changed by the user with -falign-??? options).

-Os:
  Align everything to 2 bytes to get compact code.

-O2,-O3:
  Align functions to 4 bytes.
  Align labels and jumps to 2 bytes (to avoid potential code bloat).
  Align loops to 4 bytes.

The attached patch should do that, although it has not been fully tested yet.
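
As a rough model of the defaults listed above (a sketch only, not the code in
the attached patch), the selection could look like this. The struct, the
function names and the convention that a user value of 0 means "not set" are
assumptions made for this example.

```c
#include <stdbool.h>

/* Hypothetical container for the default alignments, in bytes. */
struct sh_align_defaults {
    unsigned functions;
    unsigned labels;
    unsigned jumps;
    unsigned loops;
};

/* Return the proposed defaults.  `optimize_size` models -Os; any other
   value stands for -O2/-O3. */
static struct sh_align_defaults sh_default_alignments(bool optimize_size)
{
    struct sh_align_defaults a;

    if (optimize_size) {
        /* -Os: everything on 2 bytes for compact code. */
        a.functions = 2;
        a.labels = 2;
        a.jumps = 2;
        a.loops = 2;
    } else {
        /* -O2, -O3: functions and loops on 4 bytes, labels and jumps on
           2 bytes to avoid potential code bloat. */
        a.functions = 4;
        a.labels = 2;
        a.jumps = 2;
        a.loops = 4;
    }
    return a;
}

/* A value given by the user takes precedence over the default.
   Assumption: 0 means the user did not request a specific alignment. */
static unsigned effective_alignment(unsigned user_value, unsigned default_value)
{
    return user_value != 0 ? user_value : default_value;
}
```

For example, at -O2 with a user request of 16 bytes for loops, the effective
loop alignment in this model is 16, while functions stay at the default of 4.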
