This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Bitset statistics



Hi folks,

I have implemented Dan Berlin's ebitmap idea (well I think I have) and
integrated it with the common bitset interface I have written.  From
my observations the approach has its merits but is not a replacement
for the other bitset implementations (namely a dense array of words of
bits and a linked list of small arrays of words of bits).

To get a handle on things, I have profiled GCC many times using
different bitset implementations.  The bitset operation that dominates
timing profiles in all cases is the determination of which bits are
set in a bitset.  To see what we're dealing with, I instrumented this
function to log statistics about the number of bits set in a bitset,
the size of a bitset, and the density of the bits set.  I've included
an example of the information collected during a bootstrap of
i686-pc-linux-gnu at the end of this mail. 

The executive summary is:

1. Most of the time (87%) we iterate over a sbitset with zero bits set.

2. 40% of the sbitsets are 32 bits or smaller in size.

3. 20% of the time we iterate over a lbitset with zero bits set.

4. 60% of the time we iterate over a lbitset with only a single bit set.
   (I think is predominantly due to approx_reg_cost in cse.c.  The
    numbers suggest that this function may be better to use a simple list
    of registers it finds within an RTX.)

5. My bitset implementation gives approx. 5% speed improvement
   for a bootstrap of GCC.

 (Here a sbitset is my new implementation of GCC's sbitmap and an
  lbitset is my implementation of GCC's bitmap.)

The second most dominant bitset operation in the timing profile is the
operation required to find a given bit in a linked list of bitset
words.  As Dan mentioned, this time grows with the number of elements
in the list.  The number of elements can be reduced by storing more
bits with each element.  However, when bitsets are sparse this slows
down the operation of finding set bits.  I have experimented with a
range of different element sizes but get the best results on my PC
with two 32-bit words/element.

Using my ebitset implementation in place of my lbitset implementation
for regsets I found that there was a small drop in compilation speed
even when I flagged ebitsets when they were to known to have no set
bits.  However, the ebitset implementation is likely to be more
efficient for very large bitsets with a moderate density of set bits.
With the framework I have developed, it is straightforward to 
convert the bitsets transparently on the fly.


Anyway, here are the numbers.  If there is a GCC branch that would
like to try my bitset implementation, I can generate an appropriate
set of diffs.


517134 sbitsets xmalloced, 517134 freed.
0 sbitsets oballoced, 0 freed.
1027992 bitset_lists

sbitset count log histogram
      0	  898066 ( 87.4%)
      1	   59822 (  5.8%)
    2-3	   21294 (  2.1%)
    4-7	   18406 (  1.8%)
   8-15	   12940 (  1.3%)
  16-31	    9516 (  0.9%)
  32-63	    4884 (  0.5%)
 64-127	    1772 (  0.2%)
128-255	     840 (  0.1%)
256-511	     452 (  0.0%)

sbitset size log histogram
          0	       0 (  0.0%)
          1	    7304 (  0.7%)
        2-3	   18398 (  1.8%)
        4-7	   53880 (  5.2%)
       8-15	  123942 ( 12.1%)
      16-31	  201368 ( 19.6%)
      32-63	  213892 ( 20.8%)
     64-127	  164728 ( 16.0%)
    128-255	  113090 ( 11.0%)
    256-511	   76766 (  7.5%)
   512-1023	   37790 (  3.7%)
  1024-2047	   16646 (  1.6%)
  2048-4095	     188 (  0.0%)
  4096-8191	       0 (  0.0%)
 8192-16383	       0 (  0.0%)
16384-32767	       0 (  0.0%)

sbitset density histogram
0-5%	  950892 ( 92.5%)
5-10%	   17952 (  1.7%)
10-15%	   10542 (  1.0%)
15-20%	    5168 (  0.5%)
20-25%	    4234 (  0.4%)
25-30%	    4024 (  0.4%)
30-35%	    3080 (  0.3%)
35-40%	    1472 (  0.1%)
40-45%	    1974 (  0.2%)
45-50%	     722 (  0.1%)
50-55%	    1456 (  0.1%)
55-60%	     776 (  0.1%)
60-65%	    1142 (  0.1%)
65-70%	     938 (  0.1%)
70-75%	     728 (  0.1%)
75-80%	     788 (  0.1%)
80-85%	     738 (  0.1%)
85-90%	     560 (  0.1%)
90-95%	     482 (  0.0%)
95-100%	   20324 (  2.0%)

9879108 lbitsets xmalloced, 9878034 freed.
10928209 lbitsets oballoced, 5434927 freed.
110309237 bitset_lists

lbitset count log histogram
      0	23934104 ( 21.7%)
      1	68368292 ( 62.0%)
    2-3	 7795182 (  7.1%)
    4-7	 5596475 (  5.1%)
   8-15	 3559918 (  3.2%)
  16-31	  612434 (  0.6%)
  32-63	  441316 (  0.4%)
 64-127	     988 (  0.0%)
128-255	     384 (  0.0%)
256-511	     144 (  0.0%)

lbitset size log histogram
          0	20927661 ( 19.0%)
          1	       0 (  0.0%)
        2-3	       0 (  0.0%)
        4-7	       0 (  0.0%)
       8-15	       0 (  0.0%)
      16-31	       0 (  0.0%)
      32-63	       0 (  0.0%)
     64-127	39929374 ( 36.2%)
    128-255	28614366 ( 25.9%)
    256-511	10290740 (  9.3%)
   512-1023	 5719120 (  5.2%)
  1024-2047	 3038270 (  2.8%)
  2048-4095	 1467374 (  1.3%)
  4096-8191	  322332 (  0.3%)
 8192-16383	       0 (  0.0%)
16384-32767	       0 (  0.0%)

lbitset density histogram
0-5%	105177398 ( 95.3%)
5-10%	 4005545 (  3.6%)
10-15%	  677394 (  0.6%)
15-20%	   38670 (  0.0%)
20-25%	    6788 (  0.0%)
25-30%	    3856 (  0.0%)
30-35%	    2074 (  0.0%)
35-40%	    1082 (  0.0%)
40-45%	     660 (  0.0%)
45-50%	     320 (  0.0%)
50-55%	     222 (  0.0%)
55-60%	      70 (  0.0%)
60-65%	      58 (  0.0%)
65-70%	      24 (  0.0%)
70-75%	      24 (  0.0%)
75-80%	  253856 (  0.2%)
80-85%	  141190 (  0.1%)
85-90%	       0 (  0.0%)
90-95%	       6 (  0.0%)
95-100%	       0 (  0.0%)

0 ebitsets xmalloced, 0 freed.
0 ebitsets oballoced, 0 freed.
0 bitset_lists


Michael.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]