This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: Speedup and cleanup hash-table.h


> Jan Hubicka <hubicka@ucw.cz> writes:
> 
> > Hi,
> > this patch started as experiment moving hash_table_mod1 inline because it shows
> > high in streaming profiles and it represents a branch-less code that is good
> > to schedule to surrounding instructions.
> 
> FWIW with a good modern hash function it shouldn't be needed to have
> prime hash table sizes anymore. Without that just a power of two size 
> can be used, so it would be just a mask.
> 
> I considered this last time I messed with hashes, but I didn't actually
> see this function as beening hot.

If I measure the wpa-stream thread only, I get about 18% in the hashtable
lookup:
 18.33%  lto1-wpa-stream  lto1               [.] hash_table<hash_map<tree_node*, unsigned int, default_hashmap_traits>::hash_entry, xcallocator, true>::find_with_hash(tree_node* const&, uďż˝
  9.53%  lto1-wpa-stream  lto1               [.] DFS::DFS_write_tree(output_block*, DFS::sccs*, tree_node*, bool, bool, bool)                                                              ďż˝
  8.43%  lto1-wpa-stream  lto1               [.] linemap_lookup(line_maps*, unsigned int)                                                                                                  ďż˝
  5.81%  lto1-wpa-stream  lto1               [.] streamer_write_uhwi_stream(lto_output_stream*, unsigned long)                                                                             ďż˝
  4.47%  lto1-wpa-stream  lto1               [.] lto_output_tree(output_block*, tree_node*, bool, bool)                                                                                    ďż˝
  3.11%  lto1-wpa-stream  libc-2.13.so       [.] _int_malloc                                                                                                                               ďż˝
  2.85%  lto1-wpa-stream  lto1               [.] DFS::DFS_write_tree_body(output_block*, tree_node*, DFS::sccs*, bool, bool)                                                               ďż˝

I am not quite sure hash function is the bottleneck though - I would more
expect it to be cache miss sink.  It may be interesting though to replace it by
something more resonable than this pre-computed divide.

Honza
> 
> -Andi
> 
> -- 
> ak@linux.intel.com -- Speaking for myself only


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]