This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: Hash Function for "switch statement"


Hi Michael,

Thank you for the detailed response.


On Fri, Mar 19, 2010 at 9:54 AM, Michael Meissner
<meissner@linux.vnet.ibm.com> wrote:
> On Thu, Mar 18, 2010 at 05:10:18PM -0700, Jae Hyuk Kwak wrote:
>> Hi Michael,
>>
>> Thank you for the details.
>> If I understood correctly, your point is that O(log N) is fast enough
>> because the size of switch is small in practice.
>> But I am still not convinced that using hash table would not increase the speed.
>> As we know hash table requires O(N) only.
>> There must be some point that O(N) bits O(logN), depending on the size
>> of N and the constant cost of O(N).
>
> Sure, but you need to do a cost analysis whether it is worthwhile. ÂI suspect
> it would not win until the number of cases is about 2000 or so, assuming you
> can't use a jump table. ÂI'm not against the optimization per se, but I don't
> want any current switch statement to slow down because you just assume it is
> faster and the cost analysis used to decide which way to do it did not account
> for all costs. ÂThe only way to solve this is to implement it and run various
> benchmarks on real hardware.
>
> Modulus is a rather expensive operation in almost all processors. ÂSome
> processors additionally do not provide a modulus operation, and you have to get
> it by doing a divide, multiply, and subtract (and some machines don't have a
> divide at all, and you have to do a loop of shift/adds to replicate the
> divide).

I agree that we need to do the performance benchmarks whenever we talk
about performance issue.
It often gives us surprises when we see the actual results.

For the issue of Modulus operation, it does not need to be divide or
hardware modulus operation.
Let me give you an example here:
13 % 2 = 1
13 & (2-1) = 1

In English, we can use AND operator as substitutes of modulus operator.
In the way, the size of hash table should be restricted to certain
values like 2, 4, 8, 16, and so on.
But it is not a big problem because by the nature of Hash Table, we
don't make the table full.
For example, we have 257 cases on a switch statement, then we need a
hash table whose bucket size is 512.
It will have 50% filled hash table, but 50% filled hash table is usual.

> In addition to doing the modulus operation, you need to have an entry that
> gives the value in the switch statement as well as the jump address, and do an
> equality comparison against that value, otherwise you will get false positives.

In order to prevent the false positives, we need to store the key
value as you correctly pointed.
So the final process will be first to the hash key value calculation
and then go to the hash table, and compare whether the value is
correct or not.
In programming it will be like this.

typedef void (*CB)();
void doSomething1();
void doSomething40000();
std::map< int, CB > hashTable;
hashTable[ 1 ] = doSomething1;
hashTable[ 40000 ] = doSomething40000;
const std::pair< int, CB > found = hashTable.find( a );
if( found.first == a ) (*(found.second))();


> I believe the code has been in the compiler since 4.3 at least.

Thank you for the information.


Jay


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]