This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

[PATCH] PR/27733, synth_mult cache missing too much


Hello,

this patch fixes PR27733, where synth_mult takes definitely too much to find the best way to multiply a 64-bit by a very large number (the inverse of 23).

synth_mult's algorithm is known to explode in some cases, and for this reason in http://gcc.gnu.org/ml/gcc-patches/2004-11/msg01420.html
Kazu added a cache to it. Later on, he also reworked a bit the algorithm to improve the effectiveness of the cache and to have a lower branching factor, and fixed PR23971 with these changes.


However, all this very effective and appreciated hacking of synth_mult had a little oversight:

/* The entry for our multiplication cache/hash table.  */
struct alg_hash_entry {
 /* The number we are multiplying by.  */
 unsigned int t;
 ...


Gotcha! :-P The factor is stored as a 32-bit value only, while the argument of synth_mult is a HOST_WIDE_INT. While this could even be the cause of a wrong-code regression, this is extremely unlikely to happen. It is more likely, even in real-world cases like PR27733, that it causes a *lot* of cache misses. In fact, the PR is almost fixed by changing the type of this field.


The compilation time however is still pretty high. I'm therefore proposing to also tune a bit the cache, making it bigger for 64-bit hosts than it is for 32-bit hosts. This because multiplications by 50-60 bit factors are frequent because of the expansion of EXACT_DIV_EXPR, and they take up a lot of cache entries. With a size of 2069, and with a -O0-built compiler, the PR's testcase compilation time is down to 0.3s, and with a size of 1031 it is down to 0.8s.

The attached patch, bootstrapped on i686-pc-linux-gnu, picks the smaller size. Of course, if the reviewer will consider it better, it can be changed to the bigger size. Both sizes are prime numbers of course.

I am regtesting the patch now, but the additional coverage is zero because the patch has absolutely no effect on hosts with 32-bit and sizeof (HOST_WIDE_INT) == sizeof (int). Ok for mainline? What about 4.1?

Paolo
2006-06-08  Paolo Bonzini  <bonzini@gnu.org>

	* expmed.c (struct alg_hash_entry): Fix type of field T
	to match synth_mult argument.
	(NUM_ALG_HASH_ENTRIES): Make it bigger for 64-bit HOST_WIDE_INT.

Index: ../../gcc/expmed.c
===================================================================
--- ../../gcc/expmed.c	(revision 114482)
+++ ../../gcc/expmed.c	(working copy)
@@ -2395,7 +2395,7 @@ struct algorithm
 /* The entry for our multiplication cache/hash table.  */
 struct alg_hash_entry {
   /* The number we are multiplying by.  */
-  unsigned int t;
+  unsigned HOST_WIDE_INT t;
 
   /* The mode in which we are multiplying something by T.  */
   enum machine_mode mode;
@@ -2410,7 +2410,11 @@ struct alg_hash_entry {
 };
 
 /* The number of cache/hash entries.  */
+#if HOST_BITS_PER_WIDE_INT == 64
+#define NUM_ALG_HASH_ENTRIES 1031
+#else
 #define NUM_ALG_HASH_ENTRIES 307
+#endif
 
 /* Each entry of ALG_HASH caches alg_code for some integer.  This is
    actually a hash table.  If we have a collision, that the older

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]