This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Byte permutation optimization


Hi there.

I recently came across a nice micro-optimization to permutate bytes within a dword. On x86 all 24 combinations can be done using a maximum of 3 instructions (bswap, rotate32 and rotate16).

I'd like to give it a try and write an optimization pass that detects such dword permutations and replaces them with the optimal sequences. In my tests this seems to be a good win in size and speed. I have the feeling that the register-allocator likes the smaller sequences as well.

Anyway, I don't know where to start. I've browsed the source-code and searched for related optimization passes that I could use as a boiler-plate for my experiments.

So far I found two places that look interesting for me:

In optabs.c I've seen the code that combines two shifts and one or into a rotate. Looks like a hard-coded special case for me.

In tree-ssa-math-opts.c I've seen a code that scans for some common math operations and replaces them with faster code.

Since the codebase is huge I have the feeling that I have overlooked something. Does some kind of infrastructure to detect patterns within a SSA tree already exists somewhere else? Where would be the best place in gcc to add an automatic byteswap detection?

I don't know if I'll ever finish the experiment and submit a patch. The code-base *huge* and scary, but I'd at least like to give it a try.

Nils Pipenbrinck




Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]