This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Index Nav: | [Date Index] [Subject Index] [Author Index] [Thread Index] | |
---|---|---|
Message Nav: | [Date Prev] [Date Next] | [Thread Prev] [Thread Next] |
Other format: | [Raw text] |
I was looking at the routelookup EEMBC benchmark and it has code of the form: while ( this_node->cmpbit > next_node->cmpbit ) { this_node = next_node; if ( proute->dst_addr & (0x1 << this_node->cmpbit) ) next_node = this_node->rlink; else next_node = this_node->llink; } This is where you have a binary tree/trie and you are iterating going down either the right link or left link until you find a stopping condition. The code in ifcvt.c does not handle optimizing these cases for conditional move since the load might trap, and generates code that does if-then-else with loads and jumps. However, since the two elements are next to each other in memory, they are likely in the same cache line, particularly with aligned stacks and malloc returning aligned pointers. Except in unusual circumstances where the pointer is not aligned, this means it is much faster to optimize it as: while ( this_node->cmpbit > next_node->cmpbit ) { this_node = next_node; rtmp = this_node->rlink; ltmp = this_node->llink; if ( proute->dst_addr & (0x1 << this_node->cmpbit) ) next_node = rtmp; else next_node = ltmp; } So I wrote some patches to do this optimization. In ifcvt.c I added a new hook that allows the backend to try and do conditional moves if the machine independent code doesn't handle the special cases that the machine might have. Then in rs6000.c I used that hook to see if the conditional moves are adjacent, and do the optimization. I will note that this type of code comes up quite frequently since binary trees and tries are common data structure. The file splay-tree.c in libiberty is one place in the compiler tree that has conditional adjacent memory moves. So I would like comments on the patch before the 4.8 tree opens up. I feel even if we decide not to add the adjacent memory move patch, the hook is useful, and I have some other ideas for using it for the powerpc. I was thinking about rewriting the rs6000 dependent parts to make it a normal optimization available to all ports. Is this something we want as a normal option? At the moment, I'm not sure it should be part of -O3 because it is possible for a trap to occur if the pointer straddles a page boundary and the test condition would guard against loading up the second value. However, -Ofast might be an appropriate place to do this optimization. At this time I don't have test cases, but I would add them for the real submission. I have bootstraped the compiler on powerpc with this option enabled and it passed the bootstrap and had no regressions in make check. I will do a spec run over the weekend as well. In addition to libibery/splay-tree.c the following files in gcc have adjacent conditional moves that this code would optimize: cfg.c c-typeck.c df-scan.c fold-const.c graphds.c ira-emit.c omp-low.c rs6000.c tree-cfg.c tree-ssa-dom.c tree-ssa-loop-ivops.c tree-ssa-phiopt.c tree-ssa-uncprop.c 2012-02-10 Michael Meissner <meissner@linux.vnet.ibm.com> * target.def (cmove_md_extra): New hook that is called from ifcvt.c to allow the backend to generate additional conditional moves that aren't handled by the machine independent code. Add support to call the hook at the appropriate places. * targhooks.h (default_cmove_md_extra): Likewise. * targhooks.c (default_cmove_md_extra): Likewise. * target.h (enum ifcvt_pass): Likewise. * ifcvt.c (find_if_header): Likewise. (noce_find_if_block): Likewise. (struct noce_if_info): Likewise. (noce_process_if_block): Likewise. (cond_move_process_if_block): Likewise. (if_convert): Likewise. (rest_of_handle_if_conversion): Likewise. (rest_of_handle_if_after_combine): Likewise. (rest_of_handle_if_after_reload): Likewise. * doc/tm.texi (TARGET_CMOVE_MD_EXTRA): Likewise. * doc/tm.texi.in (TARGET_CMOVE_MD_EXTRA): Likewise. * config/rs6000/rs6000.opt (-mcmove-adjacent-memory): Add support for a new switch to optimize conditional moves where each side is a memory reference and the memory locations are adjacent. * config/rs6000/rs6000.c (rs6000_cmove_md_extra): Likewise. (TARGET_CMOVE_MD_EXTRA): Likewise. (rs6000_decompose_offsettable_memref): Likewise. -- Michael Meissner, IBM 5 Technology Place Drive, M/S 2757, Westford, MA 01886-3141, USA meissner@linux.vnet.ibm.com fax +1 (978) 399-6899
Attachment:
gcc-power7.patch322b
Description: Text document
Index Nav: | [Date Index] [Subject Index] [Author Index] [Thread Index] | |
---|---|---|
Message Nav: | [Date Prev] [Date Next] | [Thread Prev] [Thread Next] |