This is the mail archive of the
gcc@gcc.gnu.org
mailing list for the GCC project.
Bitset statistics
- From: Michael Hayes <m dot hayes at elec dot canterbury dot ac dot nz>
- To: gcc at gcc dot gnu dot org
- Cc: dan at cgsoftware dot com
- Date: Fri, 11 Jan 2002 00:54:54 +1300
- Subject: Bitset statistics
Hi folks,
I have implemented Dan Berlin's ebitmap idea (well I think I have) and
integrated it with the common bitset interface I have written. From
my observations the approach has its merits but is not a replacement
for the other bitset implementations (namely a dense array of words of
bits and a linked list of small arrays of words of bits).
To get a handle on things, I have profiled GCC many times using
different bitset implementations. The bitset operation that dominates
timing profiles in all cases is the determination of which bits are
set in a bitset. To see what we're dealing with, I instrumented this
function to log statistics about the number of bits set in a bitset,
the size of a bitset, and the density of the bits set. I've included
an example of the information collected during a bootstrap of
i686-pc-linux-gnu at the end of this mail.
The executive summary is:
1. Most of the time (87%) we iterate over a sbitset with zero bits set.
2. 40% of the sbitsets are 32 bits or smaller in size.
3. 20% of the time we iterate over a lbitset with zero bits set.
4. 60% of the time we iterate over a lbitset with only a single bit set.
(I think is predominantly due to approx_reg_cost in cse.c. The
numbers suggest that this function may be better to use a simple list
of registers it finds within an RTX.)
5. My bitset implementation gives approx. 5% speed improvement
for a bootstrap of GCC.
(Here a sbitset is my new implementation of GCC's sbitmap and an
lbitset is my implementation of GCC's bitmap.)
The second most dominant bitset operation in the timing profile is the
operation required to find a given bit in a linked list of bitset
words. As Dan mentioned, this time grows with the number of elements
in the list. The number of elements can be reduced by storing more
bits with each element. However, when bitsets are sparse this slows
down the operation of finding set bits. I have experimented with a
range of different element sizes but get the best results on my PC
with two 32-bit words/element.
Using my ebitset implementation in place of my lbitset implementation
for regsets I found that there was a small drop in compilation speed
even when I flagged ebitsets when they were to known to have no set
bits. However, the ebitset implementation is likely to be more
efficient for very large bitsets with a moderate density of set bits.
With the framework I have developed, it is straightforward to
convert the bitsets transparently on the fly.
Anyway, here are the numbers. If there is a GCC branch that would
like to try my bitset implementation, I can generate an appropriate
set of diffs.
517134 sbitsets xmalloced, 517134 freed.
0 sbitsets oballoced, 0 freed.
1027992 bitset_lists
sbitset count log histogram
0 898066 ( 87.4%)
1 59822 ( 5.8%)
2-3 21294 ( 2.1%)
4-7 18406 ( 1.8%)
8-15 12940 ( 1.3%)
16-31 9516 ( 0.9%)
32-63 4884 ( 0.5%)
64-127 1772 ( 0.2%)
128-255 840 ( 0.1%)
256-511 452 ( 0.0%)
sbitset size log histogram
0 0 ( 0.0%)
1 7304 ( 0.7%)
2-3 18398 ( 1.8%)
4-7 53880 ( 5.2%)
8-15 123942 ( 12.1%)
16-31 201368 ( 19.6%)
32-63 213892 ( 20.8%)
64-127 164728 ( 16.0%)
128-255 113090 ( 11.0%)
256-511 76766 ( 7.5%)
512-1023 37790 ( 3.7%)
1024-2047 16646 ( 1.6%)
2048-4095 188 ( 0.0%)
4096-8191 0 ( 0.0%)
8192-16383 0 ( 0.0%)
16384-32767 0 ( 0.0%)
sbitset density histogram
0-5% 950892 ( 92.5%)
5-10% 17952 ( 1.7%)
10-15% 10542 ( 1.0%)
15-20% 5168 ( 0.5%)
20-25% 4234 ( 0.4%)
25-30% 4024 ( 0.4%)
30-35% 3080 ( 0.3%)
35-40% 1472 ( 0.1%)
40-45% 1974 ( 0.2%)
45-50% 722 ( 0.1%)
50-55% 1456 ( 0.1%)
55-60% 776 ( 0.1%)
60-65% 1142 ( 0.1%)
65-70% 938 ( 0.1%)
70-75% 728 ( 0.1%)
75-80% 788 ( 0.1%)
80-85% 738 ( 0.1%)
85-90% 560 ( 0.1%)
90-95% 482 ( 0.0%)
95-100% 20324 ( 2.0%)
9879108 lbitsets xmalloced, 9878034 freed.
10928209 lbitsets oballoced, 5434927 freed.
110309237 bitset_lists
lbitset count log histogram
0 23934104 ( 21.7%)
1 68368292 ( 62.0%)
2-3 7795182 ( 7.1%)
4-7 5596475 ( 5.1%)
8-15 3559918 ( 3.2%)
16-31 612434 ( 0.6%)
32-63 441316 ( 0.4%)
64-127 988 ( 0.0%)
128-255 384 ( 0.0%)
256-511 144 ( 0.0%)
lbitset size log histogram
0 20927661 ( 19.0%)
1 0 ( 0.0%)
2-3 0 ( 0.0%)
4-7 0 ( 0.0%)
8-15 0 ( 0.0%)
16-31 0 ( 0.0%)
32-63 0 ( 0.0%)
64-127 39929374 ( 36.2%)
128-255 28614366 ( 25.9%)
256-511 10290740 ( 9.3%)
512-1023 5719120 ( 5.2%)
1024-2047 3038270 ( 2.8%)
2048-4095 1467374 ( 1.3%)
4096-8191 322332 ( 0.3%)
8192-16383 0 ( 0.0%)
16384-32767 0 ( 0.0%)
lbitset density histogram
0-5% 105177398 ( 95.3%)
5-10% 4005545 ( 3.6%)
10-15% 677394 ( 0.6%)
15-20% 38670 ( 0.0%)
20-25% 6788 ( 0.0%)
25-30% 3856 ( 0.0%)
30-35% 2074 ( 0.0%)
35-40% 1082 ( 0.0%)
40-45% 660 ( 0.0%)
45-50% 320 ( 0.0%)
50-55% 222 ( 0.0%)
55-60% 70 ( 0.0%)
60-65% 58 ( 0.0%)
65-70% 24 ( 0.0%)
70-75% 24 ( 0.0%)
75-80% 253856 ( 0.2%)
80-85% 141190 ( 0.1%)
85-90% 0 ( 0.0%)
90-95% 6 ( 0.0%)
95-100% 0 ( 0.0%)
0 ebitsets xmalloced, 0 freed.
0 ebitsets oballoced, 0 freed.
0 bitset_lists
Michael.