Builtin expansion versus headers optimization: Reductions

Jakub Jelinek jakub@redhat.com
Fri Jun 5 09:23:00 GMT 2015

On Fri, Jun 05, 2015 at 11:02:03AM +0200, Ondřej Bílka wrote:
> On Thu, Jun 04, 2015 at 02:34:40PM -0700, Andi Kleen wrote:
> > The compiler has much more information than the headers.
> > 
> > - It can do alias analysis, so to avoid needing to handle overlap
> > and similar.
> Could but it could also export that information which would benefit
> third parties.


> > - It can (sometimes) determine alignment, which is important
> > information for tuning.
> In general case yes, but here its useless. As most functions are aligned
> to 16 bytes in less than 10% of calls you shouldn't add cold branch to
> handle aligned data.
> Also as I mentioned bugs before gcc now doesn't handle alignment well so
> it doesn't optimize following to zero for aligned code.
>  align = ((uintptr_t) x) % 16;

That is simply not true.  E.g.
struct __attribute__((aligned (16))) S { char b[16]; };
struct S a;

unsigned long
foo (void)
{
  return (((unsigned long) &a) % 16);
}
is optimized into 0, as are many other testcases; the CCP pass takes
alignment info into account and optimizes based on that.  If you are talking
about the result of malloc, that is presumably because glibc headers don't
yet mark malloc with the alloc_align attribute.
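
As a rough sketch of how an allocator's alignment can be made visible to the
compiler: GCC has two related function attributes, assume_aligned (N), which
states a fixed alignment for the returned pointer, and alloc_align (POS),
which names the parameter that holds the alignment.  The wrappers my_alloc16
and my_alloc_aligned below are hypothetical and are not glibc's declarations;
whether the modulo is actually folded depends on the GCC version and
optimization level.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical wrapper (not a glibc declaration): promises that the
   returned pointer is aligned to at least 16 bytes.  */
__attribute__((malloc, assume_aligned (16)))
void *my_alloc16 (size_t size);

/* Hypothetical wrapper: parameter 1 carries the alignment of the result.  */
__attribute__((malloc, alloc_align (1)))
void *my_alloc_aligned (size_t alignment, size_t size);

void *
my_alloc16 (size_t size)
{
  /* C11 aligned_alloc wants size to be a multiple of the alignment.  */
  return aligned_alloc (16, (size + 15) & ~(size_t) 15);
}

void *
my_alloc_aligned (size_t alignment, size_t size)
{
  /* Assumes alignment is a power of two.  */
  return aligned_alloc (alignment, (size + alignment - 1) & ~(alignment - 1));
}

/* With the attribute visible, the compiler is allowed to treat this
   remainder as the constant 0.  */
unsigned long
misalign16 (size_t n)
{
  void *p = my_alloc16 (n);
  unsigned long r = (unsigned long) ((uintptr_t) p % 16);
  free (p);
  return r;
}

/* Same idea, with the alignment passed as an argument.  */
unsigned long
misalign64 (size_t n)
{
  void *p = my_alloc_aligned (64, n);
  unsigned long r = (unsigned long) ((uintptr_t) p % 64);
  free (p);
  return r;
}
```

Comparing the assembly from gcc -O2 -S with and without the attributes on
the declarations is one way to check what a given compiler version does.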

> > - With profile feedback it can use value histograms to determine the
> > best code.
> > 
> Problem is that histograms are not enough as I mentioned before. For
> profiling you need to measure useful data which differs per function and
> should be done in userspace.

For some builtin functions PGO can collect custom extra data that the
compiler can then use to decide how to expand the builtins.
E.g. for some string op builtins PGO already collects average alignment and
average size.
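
A minimal sketch of the kind of code this applies to, assuming the usual
-fprofile-generate / -fprofile-use workflow.  The function names and the
size mix are made up for illustration; what exactly gets profiled and how
the memcpy is expanded afterwards depends on the GCC version and target.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: n is only known at run time, so without profile
   data the compiler has little to go on when expanding the memcpy.  */
void
copy_record (char *dst, const char *src, size_t n)
{
  memcpy (dst, src, n);
}

/* Hypothetical training workload: nine out of ten copies are 24 bytes,
   the rest are 64 bytes.  Returns the number of bytes copied, so the
   result is easy to check.
   Possible build steps, given a main that calls train ():
     gcc -O2 -fprofile-generate prog.c -o prog && ./prog
     gcc -O2 -fprofile-use prog.c -o prog  */
unsigned long
train (void)
{
  char src[64], dst[64];
  unsigned long sum = 0;

  memset (src, 1, sizeof src);
  for (int i = 0; i < 1000; i++)
    {
      size_t n = (i % 10 == 0) ? 64 : 24;

      memset (dst, 0, sizeof dst);
      copy_record (dst, src, n);
      /* Each copied byte is 1, each untouched byte is 0.  */
      for (size_t j = 0; j < sizeof dst; j++)
        sum += (unsigned long) dst[j];
    }
  return sum;
}
```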

