[PATCH] Minor optimisations in operator new(size_t, align_val_t)

Marc Glisse marc.glisse@inria.fr
Fri Aug 17 18:31:00 GMT 2018


On Fri, 17 Aug 2018, Jonathan Wakely wrote:

> On 17/08/18 19:28 +0200, Marc Glisse wrote:
>> On Mon, 13 Aug 2018, Jonathan Wakely wrote:
>> 
>>> Thanks to Lars for the suggestions.
>>>
>>> 	* libsupc++/new_opa.cc (operator new(size_t, align_val_t)): Use
>>> 	__is_pow2 to check for valid alignment. Avoid branching when rounding
>>> 	size to multiple of alignment.
>>> 
>>> Tested x86_64-linux, committed to trunk.
>> 
>> Are you getting better code with __is_pow2 on many platforms? As far as I 
>> can tell from a quick look at the patch (I didn't actually test it, I could 
>> be completely off), this replaces (x&(x-1))==0 with popcount(x)==1. On a 
>> basic x86_64, popcount calls into libgcc, which doesn't seem so good. On a 
>> more recent x86_64 (BMI1), x&(x-1) is a single instruction that sets a flag 
>> when the result is 0, that's hard to beat.
>
> Then shouldn't we do that in __ispow2?
>
> Even better would be a peephole optimisation to turn
> __builtin_popcount(x)==1 into that.

There is the complication of how to handle 0. For x=0, ispow2(x) is false 
while (x&(x-1))==0 is true. So the transformation corresponds to 
__builtin_popcount(x)<=1, and for ==1 we need more complications unless we 
somehow know that x cannot be 0.

>> Or was the goal to accept an alignment of 0, and not an optimization?
>
> Accepting alignment of 0 isn't the goal :-)
>
> std::ispow2 should be the best way to check if an unsigned integer is
> a power of two, so I wanted to use that instead of manual bit
> twiddling.

Makes sense. The best way to test this may not be the same if we know that 
the number cannot be 0, but I guess that if we don't find a better way it 
would be possible to use __builtin_constant_p to select between x&(x-1) 
and popcount==1. Using range information to enable the compiler 
transformation is also possible. Both may require an explicit if(align==0) 
in operator new (whether it replaces align with another value or calls 
__builtin_unreachable() probably doesn't matter).

> I hope that check will go away soon,

In that case, please ignore my comments on the speed of the check.

-- 
Marc Glisse



More information about the Gcc-patches mailing list