As of posix_memalign the issue is not so much that of alias analysis (we could handle it but we don't have a builtin right now) but that of alignment analysis which doesn't implement alignment tracking of pointers stored in memory. We could "lower" posix_memalign (&ptr, align, size); to posix_memalign (&ptr, align, size); ptr = __builtin_assume_algined (ptr, align); and hope for FRE to fix things up enough to make that useful.
Well, the problem with doing it this way is that ptr is still considered address taken and the assume aligned then really can result in a noop move. What I meant is expand: ret = posix_memalign (&ptr, align, size); as { void *temp; ret = posix_memalign (&temp, align, size); void *temp2 = __builtin_passthru_attribute_malloc_size (temp, size); ptr = __typeof (ptr) __builtin_assume_aligned (temp2, align); temp ={v}{CLOBBER}; } The advantages of doing it this way would be that (if ptr is not address taken for other reasons) that it would not need to be address taken, escape and all the like, I think posix_memalign is not reading the old content of the pointer nor storing it anywhere, it just fills it in as another result, just by reference. And the optimizers would know it doesn't alias anything, like if it came from malloc (size);, and is aligned to align bytes at least. The disadvantage would be that if ptr is addressable for other reasons, we wouldn't reuse it's address for posix_memalign, but pass address of another temporary and then copied the mem.
Created attachment 32064 [details] part #1, aliasing I've implemented the aliasing parts (and the builtin obviously). It's true that doing posix_memalign (&ptr, ....); ptr = __builtin_assume_aligned (ptr, ...); will keep ptr address-taken - but isn't it kept address-taken anyway because it's passed to posix_memalign? I think you are mixing the possible optimization we can do to posix_memalign in general with the alignment issue, no? Thus, we could transform posix_memalign (&ptr, ....); to void *tem; posix_memalign (&tem, ....); ptr = tem; independently. Doing it as part of the alignment stuff of course makes sense. But as you say, eventually we'd just use an extra stack slot for no good reason. I've long thought of teaching some more tricks to update_address_taken - basically ignore some of the address-takens and apply simple transforms on the stmts causing them if that would make the var non-address-taken (memcpy comes to my mind as well here).
Hack: when the return value of posix_memalign is ignored, if the platform supports it, replace with a call to aligned_alloc (C11), which has an easier interface.
Created attachment 32066 [details] part #2, C11 aligned_alloc It was noted that aligned_alloc is standard enough to be supported (and with nicer interface albeit possibly clobbering errno ...).
(In reply to Marc Glisse from comment #4) > Hack: when the return value of posix_memalign is ignored, if the platform > supports it, replace with a call to aligned_alloc (C11), which has an easier > interface. The question is if posix_memalign is allowed to change errno. If it is, then making glibc contains say something like: extern int __REDIRECT_NTH (__posix_memalign_alias, (void ** __ptr, size_t __alignment, size_t __size), posix_memalign) __nonnull ((1)) __wur; extern void *__REDIRECT_NTH (__memalign_alias, (size_t __alignment, size_t __size), memalign) __attribute__ ((__malloc__, __alloc_size__ (2))); __extern_inline int posix_memalign (void **__ptr, size_t __alignment, size_t __size) { if (__builtin_constant_p (__alignment)) { if (__alignment == 0 || __alignment & (__alignment - 1)) != 0 || __alignment % sizeof (void *)) return EINVAL; void *__res = __memalign_alias (__alignment, __size); if (__res == NULL) return ENOMEM; *__ptr = __res; return 0; } return __posix_memalign_alias (__ptr, __alignment, __size); } But looking at glibc sources, even posix_memalign actually changes errno. Tbe problem with this inline version is that user aliasing bugs will trigger people more often, and that some hack will need to be find out for the EINVAL and ENOMEM values (because, stdlib.h is not supposed to include errno.h I guess, so it would need to be some __EINVAL/__ENOMEM value determined by configure or something).
(In reply to Jakub Jelinek from comment #6) > (In reply to Marc Glisse from comment #4) > > Hack: when the return value of posix_memalign is ignored, if the platform > > supports it, replace with a call to aligned_alloc (C11), which has an easier > > interface. > > The question is if posix_memalign is allowed to change errno. If it is, > then making glibc contains say something like: > > extern int __REDIRECT_NTH (__posix_memalign_alias, > (void ** __ptr, size_t __alignment, size_t > __size), > posix_memalign) __nonnull ((1)) __wur; > extern void *__REDIRECT_NTH (__memalign_alias, > (size_t __alignment, size_t __size), > memalign) __attribute__ ((__malloc__, > __alloc_size__ (2))); > > __extern_inline int > posix_memalign (void **__ptr, size_t __alignment, size_t __size) > { > if (__builtin_constant_p (__alignment)) > { > if (__alignment == 0 > || __alignment & (__alignment - 1)) != 0 > || __alignment % sizeof (void *)) > return EINVAL; > void *__res = __memalign_alias (__alignment, __size); > if (__res == NULL) > return ENOMEM; > *__ptr = __res; > return 0; > } > return __posix_memalign_alias (__ptr, __alignment, __size); > } > > But looking at glibc sources, even posix_memalign actually changes errno. > Tbe problem with this inline version is that user aliasing bugs will trigger > people more often, and that some hack will need to be find out for the > EINVAL and ENOMEM values (because, stdlib.h is not supposed to include > errno.h I guess, so it would need to be some __EINVAL/__ENOMEM value > determined by configure or something). According to the specification this is wrong. Note that changing errno is hindering optimization. For example int foo (int *p) { *p = 1; malloc (4); return *p; } cannot CSE *p because p may point to errno. (works for float *p and works when using posix_memalign with my patch)
(In reply to Richard Biener from comment #7) > According to the specification this is wrong. Note that changing errno > is hindering optimization. For example > > int foo (int *p) > { > *p = 1; > malloc (4); > return *p; > } > > cannot CSE *p because p may point to errno. (works for float *p and > works when using posix_memalign with my patch) Well, e.g. http://pubs.opengroup.org/onlinepubs/007904975/functions/posix_memalign.html says nothing about errno, I think only functions which explicitly document not to clobber errno may not, all other functions may, but it's value is undefined after the call. For calls that are documented to change errno in some cases, it is again defined only if those calls return a particular value (e.g. -1), otherwise errno is still undefined.
On Thu, 6 Feb 2014, jakub at gcc dot gnu.org wrote: > http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60092 > > --- Comment #8 from Jakub Jelinek <jakub at gcc dot gnu.org> --- > (In reply to Richard Biener from comment #7) > > According to the specification this is wrong. Note that changing errno > > is hindering optimization. For example > > > > int foo (int *p) > > { > > *p = 1; > > malloc (4); > > return *p; > > } > > > > cannot CSE *p because p may point to errno. (works for float *p and > > works when using posix_memalign with my patch) > > Well, e.g. > http://pubs.opengroup.org/onlinepubs/007904975/functions/posix_memalign.html > says nothing about errno, I think only functions which explicitly document not > to clobber errno may not, all other functions may, but it's value is undefined > after the call. For calls that are documented to change errno in some cases, > it is again defined only if those calls return a particular value (e.g. -1), > otherwise errno is still undefined. Ok, my manpage says RETURN VALUE aligned_alloc(), memalign(), valloc(), and pvalloc() return a pointer to the allocated memory, or NULL if the request fails. posix_memalign() returns zero on success, or one of the error values listed in the next section on failure. Note that errno is not set. so that must be incorrect. If the value of errno is undefined after posix_memalign that doesn't help us as we then still cannot CSE.
(In reply to rguenther@suse.de from comment #9) > Ok, my manpage says > > RETURN VALUE > aligned_alloc(), memalign(), valloc(), and pvalloc() return a > pointer > to the allocated memory, or NULL if the request fails. > > posix_memalign() returns zero on success, or one of the error > values > listed in the next section on failure. Note that errno is not set. > > so that must be incorrect. > > If the value of errno is undefined after posix_memalign that doesn't > help us as we then still cannot CSE. Linux man pages are often incorrect, better look at the 3p ones ;).
If a function is not allowed to change errno this must be explicitly documented.
(In reply to Andreas Schwab from comment #11) > If a function is not allowed to change errno this must be explicitly > documented. That means Index: gcc/tree-ssa-alias.c =================================================================== --- gcc/tree-ssa-alias.c.orig 2014-02-06 15:43:42.138266256 +0100 +++ gcc/tree-ssa-alias.c 2014-02-06 15:43:33.046266882 +0100 @@ -1847,7 +1847,9 @@ call_may_clobber_ref_p_1 (gimple call, a ao_ref dref; ao_ref_init_from_ptr_and_size (&dref, ptrptr, TYPE_SIZE_UNIT (ptr_type_node)); - return refs_may_alias_p_1 (&dref, ref, false); + return (refs_may_alias_p_1 (&dref, ref, false) + || (flag_errno_math + && targetm.ref_may_alias_errno (ref))); } /* Freeing memory kills the pointed-to memory. More importantly the call has to serve as a barrier for moving loads and stores is necessary.
(In reply to Richard Biener from comment #12) > (In reply to Andreas Schwab from comment #11) > > If a function is not allowed to change errno this must be explicitly > > documented. > > That means > > Index: gcc/tree-ssa-alias.c > =================================================================== > --- gcc/tree-ssa-alias.c.orig 2014-02-06 15:43:42.138266256 +0100 > +++ gcc/tree-ssa-alias.c 2014-02-06 15:43:33.046266882 +0100 > @@ -1847,7 +1847,9 @@ call_may_clobber_ref_p_1 (gimple call, a > ao_ref dref; > ao_ref_init_from_ptr_and_size (&dref, ptrptr, > TYPE_SIZE_UNIT (ptr_type_node)); > - return refs_may_alias_p_1 (&dref, ref, false); > + return (refs_may_alias_p_1 (&dref, ref, false) > + || (flag_errno_math > + && targetm.ref_may_alias_errno (ref))); > } > /* Freeing memory kills the pointed-to memory. More importantly > the call has to serve as a barrier for moving loads and stores > > is necessary. For posix_memalign? I think errno can contain any value, except that library is not allowed to clear errno. So, IMHO *p = 1; posix_memalign (...); return *p; can be still optimized into return 1;, because if p = &errno; then *p after the call has undefined value (just known not to be zero).
On Thu, 6 Feb 2014, jakub at gcc dot gnu.org wrote: > http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60092 > > --- Comment #13 from Jakub Jelinek <jakub at gcc dot gnu.org> --- > (In reply to Richard Biener from comment #12) > > (In reply to Andreas Schwab from comment #11) > > > If a function is not allowed to change errno this must be explicitly > > > documented. > > > > That means > > > > Index: gcc/tree-ssa-alias.c > > =================================================================== > > --- gcc/tree-ssa-alias.c.orig 2014-02-06 15:43:42.138266256 +0100 > > +++ gcc/tree-ssa-alias.c 2014-02-06 15:43:33.046266882 +0100 > > @@ -1847,7 +1847,9 @@ call_may_clobber_ref_p_1 (gimple call, a > > ao_ref dref; > > ao_ref_init_from_ptr_and_size (&dref, ptrptr, > > TYPE_SIZE_UNIT (ptr_type_node)); > > - return refs_may_alias_p_1 (&dref, ref, false); > > + return (refs_may_alias_p_1 (&dref, ref, false) > > + || (flag_errno_math > > + && targetm.ref_may_alias_errno (ref))); > > } > > /* Freeing memory kills the pointed-to memory. More importantly > > the call has to serve as a barrier for moving loads and stores > > > > is necessary. > > For posix_memalign? I think errno can contain any value, except that library > is not allowed to clear errno. > So, IMHO *p = 1; posix_memalign (...); return *p; can be still optimized into > return 1;, because if p = &errno; then *p after the call has undefined value > (just known not to be zero). But on allocation failure posix_memalign may set it to 2, no? So for errno = 0; posix_memalign () errno = 0; ptr = malloc (); if (!ptr) perror (); we may not DSE the 2nd errno = 0 store. Richard.
Author: rguenth Date: Fri Feb 7 09:33:23 2014 New Revision: 207595 URL: http://gcc.gnu.org/viewcvs?rev=207595&root=gcc&view=rev Log: 2014-02-07 Richard Biener <rguenther@suse.de> PR middle-end/60092 * builtin-types.def (BT_FN_INT_PTRPTR_SIZE_SIZE): Add. * builtins.def (BUILT_IN_POSIX_MEMALIGN): Likewise. * tree-ssa-structalias.c (find_func_aliases_for_builtin_call): Handle BUILT_IN_POSIX_MEMALIGN. (find_func_clobbers): Likewise. * tree-ssa-alias.c (ref_maybe_used_by_call_p_1): Likewise. (call_may_clobber_ref_p_1): Likewise. * gcc.dg/tree-ssa/alias-30.c: New testcase. * gcc.dg/tree-ssa/alias-31.c: Likewise. Added: trunk/gcc/testsuite/gcc.dg/tree-ssa/alias-30.c trunk/gcc/testsuite/gcc.dg/tree-ssa/alias-31.c Modified: trunk/gcc/ChangeLog trunk/gcc/builtin-types.def trunk/gcc/builtins.def trunk/gcc/testsuite/ChangeLog trunk/gcc/tree-ssa-alias.c trunk/gcc/tree-ssa-structalias.c
Author: rguenth Date: Fri Feb 7 13:41:10 2014 New Revision: 207598 URL: http://gcc.gnu.org/viewcvs?rev=207598&root=gcc&view=rev Log: 2014-02-07 Richard Biener <rguenther@suse.de> PR middle-end/60092 * gimple-low.c (lower_builtin_posix_memalign): New function. (lower_stmt): Call it to lower posix_memalign in a way to make alignment info accessible. * gcc.dg/vect/pr60092-2.c: New testcase. Added: trunk/gcc/testsuite/gcc.dg/vect/pr60092-2.c Modified: trunk/gcc/ChangeLog trunk/gcc/gimple-low.c trunk/gcc/testsuite/ChangeLog
Author: jakub Date: Sat Feb 8 09:09:01 2014 New Revision: 207628 URL: http://gcc.gnu.org/viewcvs?rev=207628&root=gcc&view=rev Log: PR middle-end/60092 * tree-ssa-ccp.c (surely_varying_stmt_p): Don't return true if TYPE_ATTRIBUTES (gimple_call_fntype ()) contain assume_aligned or alloc_align attributes. (bit_value_assume_aligned): Add ATTR, PTRVAL and ALLOC_ALIGN arguments. Handle also assume_aligned and alloc_align attributes. (evaluate_stmt): Adjust bit_value_assume_aligned caller. Handle calls to functions with assume_aligned or alloc_align attributes. * doc/extend.texi: Document assume_aligned and alloc_align attributes. c-family/ * c-common.c (handle_alloc_size_attribute): Use tree_fits_uhwi_p and tree_to_uhwi. (handle_alloc_align_attribute, handle_assume_aligned_attribute): New functions. (c_common_attribute_table): Add alloc_align and assume_aligned attributes. testsuite/ * gcc.dg/attr-alloc_align-1.c: New test. * gcc.dg/attr-alloc_align-2.c: New test. * gcc.dg/attr-alloc_align-3.c: New test. * gcc.dg/attr-assume_aligned-1.c: New test. * gcc.dg/attr-assume_aligned-2.c: New test. * gcc.dg/attr-assume_aligned-3.c: New test. Added: trunk/gcc/testsuite/gcc.dg/attr-alloc_align-1.c trunk/gcc/testsuite/gcc.dg/attr-alloc_align-2.c trunk/gcc/testsuite/gcc.dg/attr-alloc_align-3.c trunk/gcc/testsuite/gcc.dg/attr-assume_aligned-1.c trunk/gcc/testsuite/gcc.dg/attr-assume_aligned-2.c trunk/gcc/testsuite/gcc.dg/attr-assume_aligned-3.c Modified: trunk/gcc/ChangeLog trunk/gcc/c-family/ChangeLog trunk/gcc/c-family/c-common.c trunk/gcc/doc/extend.texi trunk/gcc/testsuite/ChangeLog trunk/gcc/tree-ssa-ccp.c
(In reply to Richard Biener from comment #1) > We could "lower" > posix_memalign (&ptr, align, size); > to > posix_memalign (&ptr, align, size); > ptr = __builtin_assume_algined (ptr, align); > and hope for FRE to fix things up enough to make that useful. I wonder about mm_malloc. I assume for config/i386/pmm_malloc.h, it is already handled via posix_memalign, but shouldn't one also handle config/i386/gmm_malloc.h? For instance via --- a/gcc/config/i386/gmm_malloc.h +++ b/gcc/config/i386/gmm_malloc.h @@ -61,7 +61,11 @@ _mm_malloc (size_t size, size_t align) /* Store the original pointer just before p. */ ((void **) aligned_ptr) [-1] = malloc_ptr; +#if defined(__GNUC__) && __GNUC__ >= 4 && __GNUC_MINOR__ >= 7 + return __builtin_assume_aligned(aligned_ptr, align); +#else return aligned_ptr; +#endif } static __inline__ void
(In reply to Tobias Burnus from comment #18) > (In reply to Richard Biener from comment #1) > > We could "lower" > > posix_memalign (&ptr, align, size); > > to > > posix_memalign (&ptr, align, size); > > ptr = __builtin_assume_algined (ptr, align); > > and hope for FRE to fix things up enough to make that useful. > > > I wonder about mm_malloc. I assume for config/i386/pmm_malloc.h, it is > already handled via posix_memalign, but shouldn't one also handle > config/i386/gmm_malloc.h? For instance via > > --- a/gcc/config/i386/gmm_malloc.h > +++ b/gcc/config/i386/gmm_malloc.h > @@ -61,7 +61,11 @@ _mm_malloc (size_t size, size_t align) > /* Store the original pointer just before p. */ > ((void **) aligned_ptr) [-1] = malloc_ptr; > > +#if defined(__GNUC__) && __GNUC__ >= 4 && __GNUC_MINOR__ >= 7 > + return __builtin_assume_aligned(aligned_ptr, align); > +#else > return aligned_ptr; > +#endif > } > > static __inline__ void No, why? ccp of course understands the dynamic realignment: aligned_ptr = (void *) (((size_t) malloc_ptr + align) & ~((size_t) (align) - 1)); so will know that aligned_ptr is align bytes aligned.
Author: rguenth Date: Wed Feb 12 13:36:08 2014 New Revision: 207720 URL: http://gcc.gnu.org/viewcvs?rev=207720&root=gcc&view=rev Log: 2014-02-12 Richard Biener <rguenther@suse.de> PR middle-end/60092 * gimple-low.c (lower_builtin_posix_memalign): Lower conditional of posix_memalign being successful. (lower_stmt): Restrict lowering of posix_memalign to when -ftree-bit-ccp is enabled. * gcc.dg/torture/pr60092.c: New testcase. * gcc.dg/tree-ssa/alias-31.c: Disable SRA. Added: trunk/gcc/testsuite/gcc.dg/torture/pr60092.c Modified: trunk/gcc/ChangeLog trunk/gcc/gimple-low.c trunk/gcc/testsuite/ChangeLog trunk/gcc/testsuite/gcc.dg/tree-ssa/alias-31.c
The new test FAILs on Solaris 11 (both SPARC and x86), which, unlike Solaris 10, has posix_memalign in libc: FAIL: gcc.dg/torture/pr60092.c -O0 execution test The posix_memalign invocation with size = -1 cannot be right, and indeed the function returns ENOMEM. Rainer
(In reply to Rainer Orth from comment #21) > The new test FAILs on Solaris 11 (both SPARC and x86), which, unlike Solaris > 10, > has posix_memalign in libc: > > FAIL: gcc.dg/torture/pr60092.c -O0 execution test > > The posix_memalign invocation with size = -1 cannot be right, and indeed the > function returns ENOMEM. The invocation uses size -1 intentionally, and expects ENOMEM, but if Solaris libc stores to what the first argument points to even when it fails, then it violates POSIX.
> --- Comment #22 from Jakub Jelinek <jakub at gcc dot gnu.org> --- > (In reply to Rainer Orth from comment #21) >> The new test FAILs on Solaris 11 (both SPARC and x86), which, unlike Solaris >> 10, >> has posix_memalign in libc: >> >> FAIL: gcc.dg/torture/pr60092.c -O0 execution test >> >> The posix_memalign invocation with size = -1 cannot be right, and indeed the >> function returns ENOMEM. > > The invocation uses size -1 intentionally, and expects ENOMEM, but if Solaris > libc stores to what the first argument points to even when it fails, then it > violates POSIX. POSIX.1 doesn't explicitly forbid setting *memptr on error: http://pubs.opengroup.org/onlinepubs/007904975/functions/posix_memalign.html But I agree it seems strange, and in the OpenSolaris sources you see that *memptr is set to the memalign() return value (NULL in this case). Rainer
As the standard doesn't specify that the value is undefined upon error and it only specifies its contents upon successfull completion it implicitely says that it retains the previous value on error. I'd say xfail/skip this on solaris11 and report a bug to them.
> --- Comment #24 from Richard Biener <rguenth at gcc dot gnu.org> --- > As the standard doesn't specify that the value is undefined upon error and it > only specifies its contents upon successfull completion it implicitely says > that it retains the previous value on error. > > I'd say xfail/skip this on solaris11 and report a bug to them. Patch here: http://gcc.gnu.org/ml/gcc-patches/2014-02/msg01039.html I'd already reported the bug: 18253126 posix_memalign writes to first arg on error Rainer
GCC 4.9.0 has been released
Author: rguenth Date: Mon Apr 28 14:36:13 2014 New Revision: 209863 URL: http://gcc.gnu.org/viewcvs?rev=209863&root=gcc&view=rev Log: 2014-04-28 Richard Biener <rguenther@suse.de> PR middle-end/60092 * builtins.def (DEF_C11_BUILTIN): Add. (BUILT_IN_ALIGNED_ALLOC): Likewise. * coretypes.h (enum function_class): Add function_c11_misc. * tree-ssa-alias.c (ref_maybe_used_by_call_p_1): Handle BUILT_IN_ALIGNED_ALLOC like BUILT_IN_MALLOC. (call_may_clobber_ref_p_1): Likewise. * tree-ssa-dce.c (mark_stmt_if_obviously_necessary): Likewise. (mark_all_reaching_defs_necessary_1): Likewise. (propagate_necessity): Likewise. (eliminate_unnecessary_stmts): Likewise. * tree-ssa-ccp.c (evaluate_stmt): Handle BUILT_IN_ALIGNED_ALLOC. ada/ * gcc-interface/utils.c: Define flag_isoc11. lto/ * lto-lang.c: Define flag_isoc11. * gcc.dg/tree-ssa/alias-32.c: New testcase. * gcc.dg/vect/pr60092.c: Likewise. Added: trunk/gcc/testsuite/gcc.dg/tree-ssa/alias-32.c trunk/gcc/testsuite/gcc.dg/vect/pr60092.c Modified: trunk/gcc/ChangeLog trunk/gcc/ada/ChangeLog trunk/gcc/ada/gcc-interface/utils.c trunk/gcc/builtins.def trunk/gcc/coretypes.h trunk/gcc/lto/ChangeLog trunk/gcc/lto/lto-lang.c trunk/gcc/testsuite/ChangeLog trunk/gcc/tree-ssa-alias.c trunk/gcc/tree-ssa-ccp.c trunk/gcc/tree-ssa-dce.c
Fixed for 4.9.0.