[PATCH] Minor optimisations in operator new(size_t, align_val_t)
Marc Glisse
marc.glisse@inria.fr
Fri Aug 17 18:31:00 GMT 2018
On Fri, 17 Aug 2018, Jonathan Wakely wrote:
> On 17/08/18 19:28 +0200, Marc Glisse wrote:
>> On Mon, 13 Aug 2018, Jonathan Wakely wrote:
>>
>>> Thanks to Lars for the suggestions.
>>>
>>> * libsupc++/new_opa.cc (operator new(size_t, align_val_t)): Use
>>> __is_pow2 to check for valid alignment. Avoid branching when rounding
>>> size to multiple of alignment.
>>>
>>> Tested x86_64-linux, committed to trunk.
>>
>> Are you getting better code with __is_pow2 on many platforms? As far as I
>> can tell from a quick look at the patch (I didn't actually test it, I could
>> be completely off), this replaces (x&(x-1))==0 with popcount(x)==1. On a
>> basic x86_64, popcount calls into libgcc, which doesn't seem so good. On a
>> more recent x86_64 (BMI1), x&(x-1) is a single instruction that sets a flag
>> when the result is 0, that's hard to beat.
>
> Then shouldn't we do that in __ispow2?
>
> Even better would be a peephole optimisation to turn
> __builtin_popcount(x)==1 into that.
There is the complication of how to handle 0. For x=0, ispow2(x) is false
while (x&(x-1))==0 is true. So the transformation corresponds to
__builtin_popcount(x)<=1, and for ==1 we need more complications unless we
somehow know that x cannot be 0.
>> Or was the goal to accept an alignment of 0, and not an optimization?
>
> Accepting alignment of 0 isn't the goal :-)
>
> std::ispow2 should be the best way to check if an unsigned integer is
> a power of two, so I wanted to use that instead of manual bit
> twiddling.
Makes sense. The best way to test this may not be the same if we know that
the number cannot be 0, but I guess that if we don't find a better way it
would be possible to use __builtin_constant_p to select between x&(x-1)
and popcount==1. Using range information to enable the compiler
transformation is also possible. Both may require an explicit if(align==0)
in operator new (whether it replaces align with another value or calls
__builtin_unreachable() probably doesn't matter).
> I hope that check will go away soon,
In that case, please ignore my comments on the speed of the check.
--
Marc Glisse
More information about the Gcc-patches
mailing list