This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


[PATCH] Speed up preprocess_constraints


Analysis with cachegrind of cc1 compiling combine.c for an ARM target 
shows that of the 31 million misses of the data cache for write 
operations, 13 million of these are coming from the memset at the 
beginning of preprocess_constraints.  Most of these writes miss the cache 
because the memory they zero is never accessed afterwards, so nothing 
brings it into the cache before the next iteration (assuming allocate on 
read; with allocate on write we bring it into the cache at the expense 
of other data, but then never use it).  The ARM target makes extensive 
use of this function when fixing up constant pool accesses.

Now it happens that on a StrongARM, writing to memory that isn't in the 
cache is very slow (even if we use store-multiple operations), since the 
write-buffer will only use streaming write operations for cache write-back 
operations.

Fixing this code to only zero those parts of memory that are really going 
to be used saves a not inconsiderable 2.7% on the total time to bootstrap 
the whole of gcc (19 minutes from a total of 11.4 hours).  On a single cc1 
run compiling combine.c, it saves as much as 4%.  Cachegrind shows that 
the number of data-cache write misses in this function dropped from 13 
million to 18900, a reduction of 99.85%!
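
The idea of zeroing only what an insn needs can be sketched roughly as 
follows.  This is not the attached patch: the table name, the record 
layout and the sizes (op_alt_table, struct op_alt, MAX_OPERANDS, 
MAX_ALTERNATIVES set to 30) are hypothetical stand-ins chosen for 
illustration, and the real definitions in GCC differ.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical sizes and record layout, for illustration only;
   the real definitions in GCC differ.  */
#define MAX_OPERANDS 30
#define MAX_ALTERNATIVES 30

struct op_alt
{
  const char *constraint;
  int matches;
  unsigned int memory_ok : 1;
};

struct op_alt op_alt_table[MAX_OPERANDS][MAX_ALTERNATIVES];

/* Old approach: clear the whole table for every insn, whether or not
   the insn uses it.  Returns the number of bytes written.  */
size_t
clear_all (void)
{
  memset (op_alt_table, 0, sizeof op_alt_table);
  return sizeof op_alt_table;
}

/* Sketch of the new approach: clear only the first N_ALTERNATIVES
   entries of the first N_OPERANDS rows.  Returns the number of bytes
   written.  */
size_t
clear_needed (int n_operands, int n_alternatives)
{
  size_t row_bytes;

  assert (n_operands >= 0 && n_operands <= MAX_OPERANDS);
  assert (n_alternatives >= 0 && n_alternatives <= MAX_ALTERNATIVES);

  row_bytes = (size_t) n_alternatives * sizeof (struct op_alt);
  for (int i = 0; i < n_operands; i++)
    memset (op_alt_table[i], 0, row_bytes);

  return (size_t) n_operands * row_bytes;
}
```

With these assumed sizes, an insn with three operands and two 
alternatives touches 6 records rather than 900, and an insn with no 
operands or no alternatives writes nothing at all, so any code reading 
the table must not rely on it having been cleared in that case.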

As a result of fixing this I discovered that the arm back end was in any 
case abusing the contents of this array by trying to access it when an 
insn had no constraints.  That was trivial to fix.

Bootstrapped on arm-netbsdelf with regression tests run, and also 
bootstrapped on x86 (linux) with BOOT_CFLAGS set to -O3 to ensure that 
other uses of preprocess_constraints were also exercised.  I've also 
checked that no other back-end accesses recog_op_alt in machine-dependent 
code.

R.

2003-11-23  Richard Earnshaw  <rearnsha@arm.com>

	* recog.c (preprocess_constraints): Only zero those elements of
	recog_op_alt that are needed for this insn.
	* arm.c (note_invalid_constants): A function can't contain invalid
	constants if it has no constraints.


Attachment: prep-constraint.patch
Description: prep-constraint.patch

