[patch] PR38586, quadratic behavior in find_temp_slot_from_address
Sat Jan 3 12:30:00 GMT 2009
The attached patch speeds up find_temp_slot_from_address by replacing
a quadratic loop with a hash table lookup. The motivation for the
patch is the test case from PR38474, which takes hours to compile even
at -O0. This is only one of the bottlenecks, but it is the biggest
one for me on ia64.
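
For readers without the attachment, the idea can be sketched roughly as
follows. This is a minimal stand-alone illustration, not the code from the
patch: the real table in function.c hashes and compares rtx expressions,
whereas this sketch assumes an address is a plain integer. The names
addr_t, addr_entry, addr_table, table_insert and table_find are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumption: an address is modelled as an integer.  In GCC it is an
   rtx that has to be hashed and compared structurally.  */
typedef uintptr_t addr_t;

struct temp_slot { int id; };

/* One item of the address -> temp slot map (chained hashing).  */
struct addr_entry {
  addr_t address;
  struct temp_slot *slot;
  struct addr_entry *next;
};

#define N_BUCKETS 1024u

struct addr_table { struct addr_entry *buckets[N_BUCKETS]; };

/* Multiplicative hash, reduced to a bucket index.  */
size_t hash_addr(addr_t a)
{
  return (size_t) ((a * 2654435761u) % N_BUCKETS);
}

/* Record that address A refers to slot S.  */
void table_insert(struct addr_table *t, addr_t a, struct temp_slot *s)
{
  size_t h = hash_addr(a);
  struct addr_entry *e = malloc(sizeof *e);
  assert(e != NULL);
  e->address = a;
  e->slot = s;
  e->next = t->buckets[h];
  t->buckets[h] = e;
}

/* Return the slot recorded for address A, or NULL if there is none.
   Only one bucket chain is walked, instead of every address of every
   live slot.  */
struct temp_slot *table_find(const struct addr_table *t, addr_t a)
{
  const struct addr_entry *e;
  for (e = t->buckets[hash_addr(a)]; e != NULL; e = e->next)
    if (e->address == a)
      return e->slot;
  return NULL;
}
```

With a reasonable hash, each lookup costs roughly constant time on
average, so N lookups against N recorded addresses no longer cost on the
order of N*N comparisons.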
Before the patch, the reduced test case from PR38474 needs ~1200s for
'expand', almost all of it spent in find_temp_slot_from_address.
With the patch, 'expand' "only" takes 120s and
find_temp_slot_from_address is no longer #1 in the profile (it is
still expensive when one of the address operands is
virtual_stack_var_rtx, but I don't see an easy fix for that).
Bootstrapped and tested on ia64, all languages except ada.
Bootstrapped and tested on ia64 with BOOT_CFLAGS="-O2
-fno-strict-aliasing" (to make combine_temp_slots() do something).
Compared assembly of all cc1-i files before and after the patch: no difference.
And I see Richi already OK'ed this in the Bugzilla audit trail :-)
I'll commit this patch later this weekend.
* function.c (struct temp_slot): Move to the section of the file
that deals with temp slots. Remove field 'address'.
(temp_slot_address_table): New hash table of address -> temp slot.
(struct temp_slot_address_entry): New struct, items for the table.
(temp_slot_address_eq, insert_temp_slot_address): Support functions
for the new table.
(find_temp_slot_from_address): Rewrite to use the new hash table.
(remove_unused_temp_slot_addresses): Remove addresses of temp
slots that have been made available.
(remove_unused_temp_slot_addresses_1): Call-back for htab_traverse,
worker function for remove_unused_temp_slot_addresses.
(assign_stack_temp_for_type): Don't clear the temp slot address list.
Add the temp slot address to the address -> temp slot map.
(update_temp_slot_address): Update via insert_temp_slot_address.
(free_temp_slots): Call remove_unused_temp_slot_addresses.
(init_temp_slots): Allocate the address -> temp slot map, or empty
the map if it is already allocated.
(prepare_function_start): Initialize temp slot processing.
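
The clean-up step named in the ChangeLog (dropping the addresses of temp
slots that have been made available) might look roughly like the sketch
below. It is an illustration under assumptions, not the patch itself: the
patch traverses the table with htab_traverse, while this stand-alone
version walks hand-rolled bucket chains. The in_use flag and the names
prune_chain, prune_table and push_entry are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Assumption: a slot carries a flag saying whether it is still in use.  */
struct temp_slot { bool in_use; };

struct addr_entry {
  unsigned long address;
  struct temp_slot *slot;
  struct addr_entry *next;
};

/* Prepend a new entry to a bucket chain and return the new head.  */
struct addr_entry *push_entry(struct addr_entry *head, unsigned long address,
                              struct temp_slot *slot)
{
  struct addr_entry *e = malloc(sizeof *e);
  assert(e != NULL);
  e->address = address;
  e->slot = slot;
  e->next = head;
  return e;
}

/* Unlink and free every entry of one chain whose slot is no longer in
   use.  Returns the number of entries removed.  */
size_t prune_chain(struct addr_entry **head)
{
  size_t removed = 0;
  while (*head != NULL)
    {
      struct addr_entry *e = *head;
      if (!e->slot->in_use)
        {
          *head = e->next;   /* unlink, then release the entry */
          free(e);
          removed++;
        }
      else
        head = &e->next;     /* keep the entry, move on */
    }
  return removed;
}

/* Apply prune_chain to every bucket of the table.  */
size_t prune_table(struct addr_entry **buckets, size_t n_buckets)
{
  size_t removed = 0;
  size_t i;
  for (i = 0; i < n_buckets; i++)
    removed += prune_chain(&buckets[i]);
  return removed;
}
```

Pruning when slots are freed keeps stale addresses from being matched
against a slot that has since been reused for something else.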