Bug 93031 - Wish: When the underlying ISA does not force pointer alignment, option to make GCC not assume it
Status: RESOLVED WONTFIX
Alias: None
Product: gcc
Classification: Unclassified
Component: c
Version: 9.2.0
Importance: P3 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
Reported: 2019-12-20 16:12 UTC by Pascal Cuoq
Modified: 2021-05-03 14:58 UTC
Description Pascal Cuoq 2019-12-20 16:12:50 UTC
GCC assumes that pointers must be aligned as part of its optimizations, even if the ISA does not force it (for instance, x86-64 without the vector instructions). The present feature wish asks for an option that makes GCC drop this assumption.

Since the late 1990s, GCC has been adding optimizations based on undefined behavior, and “breaking” existing C programs that used to “work” by relying on the assumption that since they were compiled for architecture X, they would be fine. The reasonable developers have been kept happy by giving them options to preserve the old behavior. These options are -fno-strict-aliasing, -fwrapv, ... and I think there should be another one.

In 2016, Pavel Zemtsov showed that a C program that was written assuming that misaligned pointer accesses are allowed could be compiled by GCC to code that did not work as intended:

http://pzemtsov.github.io/2016/11/06/bug-story-alignment-on-x86.html

This was an interesting development, but GCC's behavior was fair: Pavel had not disabled the vector instructions, GCC had automatically inserted these instructions in the generated code, so the target architecture was not really one that allowed all misaligned pointer accesses. Using -mno-sse would have fixed the behavior.

I have recently noticed that since at least version 8.1, GCC assumes that all pointers are aligned even when the target ISA really has no such restriction. This can be seen by compiling the two functions below (Compiler Explorer link: https://gcc.godbolt.org/z/UBBD2Y):

int h(int *p, int *q){
  *p = 1;
  *q = 1;
  return *p;
}

typedef __attribute__((__may_alias__)) int I;

I k(I *p, I *q){
  *p = 1;
  *q = 1;
  return *p;
}

Compiling with GCC 8.1 or 9.2, with either “-O2” or “-O2 -fno-strict-aliasing”, GCC generates code that assumes that the functions will always return 1.
It does so because it assumes both p and q to be aligned pointers: two aligned int objects either coincide or do not overlap at all, so the store through q cannot leave *p with a value other than 1. I would like to have an option that makes GCC drop this assumption.

(This feature wish is not related to strict aliasing. I have only used the option and the attribute in order to show that the optimization at play here is not related to strict aliasing, and that there is currently no documented way (that I know of) to disable it after having passed “-O2”.)

A context in which the functions are called may be:

int main(void) {
    char t[6];
    return h((int*)t, (int*)(t+2));
}

This context violates strict aliasing, but when using __attribute__((__may_alias__)) or -fno-strict-aliasing, this should not be relevant.

The idea here would be to have an option that keeps legacy code that used to work working. Modern C code should probably use memcpy to access sequences of bytes that are to be treated as a word but may not be aligned in memory, since both Clang and GCC usually make these memcpy calls free.
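As a minimal sketch of the memcpy approach mentioned above: the helper names load_u32, store_u32 and h_bytes are hypothetical and were chosen for this illustration only. h_bytes mirrors the function h from the report, but goes through memcpy, so the two accesses are allowed to be misaligned and to overlap partially.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read a 32-bit word from a possibly misaligned address. */
static inline uint32_t load_u32(const void *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);   /* byte-wise copy, no alignment requirement */
    return v;
}

/* Hypothetical helper: write a 32-bit word to a possibly misaligned address. */
static inline void store_u32(void *p, uint32_t v)
{
    memcpy(p, &v, sizeof v);
}

/* Same shape as h() in the report, expressed with the helpers above.
   The compiler may not assume that p and q are aligned here, so it has to
   account for the second store partially overwriting the first one. */
uint32_t h_bytes(unsigned char *p, unsigned char *q)
{
    store_u32(p, 1);
    store_u32(q, 1);
    return load_u32(p);
}
```

Called as h_bytes(t, t + 2) on a 6-byte buffer, this returns a value other than 1, because the second store overwrites the upper two bytes of the first word.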

This feature wish is partially related to the bug report about packed structs https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51628 which ended with the addition of a new warning -Waddress-of-packed-member, but it is concerned with architectures on which misaligned accesses are allowed, not with architectures on which they are not.
Comment 1 Andrew Pinski 2019-12-21 01:38:35 UTC
I think it is wrong to assume that.
In fact it will make the code generation worse.  NOTE on x86_64, some instructions (SSE load/store ones) only handle aligned load/stores.  So the underlying ISA for x86_64 does enforce alignment there.  That is really what the blog post is talking about; it just happens that the GPR loads don't enforce alignment.

The C standard is clear here.
What you want really is the aligned attribute applied to all types.  This is not going to be supported really as it is not useful in real code and will cause most code to be worse off.

>GCC assumes that all pointers are aligned even when the target ISA really has no such restriction
Because the C standard says that is the way pointers are designed.  Writing code in C means understanding the standard's requirements, including alignment, aliasing, etc.
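For reference, the aligned attribute can already be applied to an individual type through a typedef, which GNU C documents as able to lower as well as raise alignment. The sketch below assumes a compiler that supports the GNU aligned and may_alias attributes; the names unaligned_int and h_unaligned are hypothetical. It is a per-type opt-in, not the global option requested in this report.

```c
/* GNU C extension: an int type with alignment 1 that may alias other types.
   The typedef name is hypothetical. */
typedef int unaligned_int __attribute__((__aligned__(1), __may_alias__));

/* Same shape as h() in the report. Because the pointed-to type has
   alignment 1, the compiler cannot use alignment to rule out a partial
   overlap between *p and *q. */
int h_unaligned(unaligned_int *p, unaligned_int *q)
{
    *p = 1;
    *q = 1;
    return *p;
}
```

With a 4-byte int, calling h_unaligned on t and t + 2 of a char buffer is expected to return something other than 1.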
Comment 2 Alexander Monakov 2019-12-21 13:04:40 UTC
That must be the most well-written report I've seen so far sacrificed to the God of Unfairly Closed Bugreports.

Note that GCC aims to allow partial overlap for situations when alignment<size, see responses from Richi in PR 91091 ("Note that GCC middle-end semantics would allow partial overlaps here unless you factor in alignment requirements") and PR 91419.

It sounds messy but possible to respect alignment for ABI purposes and "forget" it for optimization purposes, but hard to say if such feature would have any users.

If there are examples of legacy code in need of such option, it might be good to know.
Comment 3 Pascal Cuoq 2019-12-21 16:42:55 UTC
@amonakov

The two blog posts below exist, and describe tools that exist,
because software that makes misaligned accesses exists, although it seems to
be an “examples too numerous to list” situation (or, more optimistically,
perhaps one where the source code is fixed upstream as problems are found).

https://blogs.oracle.com/d/on-misaligned-memory-accesses
https://blog.quarkslab.com/unaligned-accesses-in-cc-what-why-and-solutions-to-do-it-properly.html

(In the end it's the binary executable that doesn't work, and both
posts deal with that aspect at some point, but these executables were
not written in SPARC or respectively ARM assembly. If they had been,
they would have been written to work. Instead, they were written in a
higher-level language that was translated to SPARC/ARM assembly,
presumably C.)


For a specific example, fifteen minutes of looking around knowing what
one is looking for turns up the LZO implementation from http://www.oberhumer.com/opensource/lzo/ . In the latest
version to date, 2.10:

#if (LZO_ARCH_ALPHA)
#  define LZO_OPT_AVOID_UINT_INDEX          1
#elif (LZO_ARCH_AMD64)
#  define LZO_OPT_AVOID_INT_INDEX           1
#  define LZO_OPT_AVOID_UINT_INDEX          1
#  ifndef LZO_OPT_UNALIGNED16
#  define LZO_OPT_UNALIGNED16               1
#  endif
#  ifndef LZO_OPT_UNALIGNED32
#  define LZO_OPT_UNALIGNED32               1
#  endif
#  ifndef LZO_OPT_UNALIGNED64
#  define LZO_OPT_UNALIGNED64               1
#  endif
#elif (LZO_ARCH_ARM)
#  if defined(__ARM_FEATURE_UNALIGNED)
#   if ((__ARM_FEATURE_UNALIGNED)+0)
#    ifndef LZO_OPT_UNALIGNED16
#    define LZO_OPT_UNALIGNED16             1
#    endif
#    ifndef LZO_OPT_UNALIGNED32
#    define LZO_OPT_UNALIGNED32             1
#    endif
#   endif
#  elif 1 && (LZO_ARCH_ARM_THUMB2)

...

#if (LZO_OPT_UNALIGNED32)
LZO_COMPILE_TIME_ASSERT_HEADER(sizeof(*(lzo_memops_TU4p)0)==4)
#define LZO_MEMOPS_COPY4(dd,ss) \
    * (lzo_memops_TU4p) (lzo_memops_TU0p) (dd) = * (const lzo_memops_TU4p) (const lzo_memops_TU0p) (ss)
#elif defined(lzo_memops_tcheck__)
#define LZO_MEMOPS_COPY4(dd,ss) \
    LZO_BLOCK_BEGIN if (lzo_memops_tcheck__(lzo_memops_TU4,4,1)) { \
        * (lzo_memops_TU4p) (lzo_memops_TU0p) (dd) = * (const lzo_memops_TU4p) (const lzo_memops_TU0p) (ss); \
    } else { LZO_MEMOPS_MOVE4(dd,ss); } LZO_BLOCK_END
#else
#define LZO_MEMOPS_COPY4(dd,ss) LZO_MEMOPS_MOVE4(dd,ss)
#endif

...

It is good news that this particular piece of software is already
designed to work on compilation platforms where misaligned accesses
are forbidden. But if anyone compiles it today with GCC for amd64 or
for “__ARM_FEATURE_UNALIGNED”, they are at risk of an optimization
firing and making the library not work as intended, perhaps in obscure
cases with safety or security implications.

I will state, despite the risk of tedious repetition, that the LZO
implementation invokes Undefined Behavior. I know it, Oberhumer
clearly knows it, and anyone who has read this far knows it. However
the perception of some GCC users (not me!) may be that GCC is once
again changing the rules and taking an Undefined Behavior that would
obviously never cause an actual demon to come out of one's nose, such
as a source file missing a final newline, and changing it into
one that does.
Comment 4 felix 2020-05-13 14:18:36 UTC
Given the discussion above apparently ended with the conclusion that code which performs misaligned accesses is ill-formed even on architectures that are permissive of such accesses, would it not make sense to make -Wcast-align synonymous to -Wcast-align=strict?
Comment 5 Vladislav Valtchev 2021-03-16 15:38:29 UTC
Guys, in the Linux kernel too unaligned access is used when the ISA supports it natively. Take a look at:

https://elixir.bootlin.com/linux/latest/source/lib/strncpy_from_user.c#L15

When CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS is 1, we have:
    #define IS_UNALIGNED(src, dst) 0

The initial IS_UNALIGNED() check in do_strncpy_from_user() always fails and the
following statement is executed:

    *(unsigned long *)(dst+res) = c;

Where 'dst' is a char* pointer, while 'res' is an unsigned long.

CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS is widely used: https://elixir.bootlin.com/linux/latest/A/ident/CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS

What protects us from undefined behavior? We all agree that -fno-strict-aliasing doesn't help in this case. Maybe we're *temporarily* kind of safe because FPU instructions are not allowed in the kernel and we're not aliasing anything? But that doesn't guarantee us that, in the future, some new fancy optimization won't break everything, right?


My point is that out there there is a *ton* of code using unaligned access (when the ISA supports it) and while it's great to optimize as much as the C standard allows the new code, we still need to build legacy code as well, in MANY cases. And for "legacy" code, I mean all the code that we used to compile with GCC up to version 7.x, which is fairly recent.

I believe that it's *not a good idea* to just drop support for millions of lines of code that potentially, somewhere, might rely on unaligned access without even adding an option to make that safe. It's simply not realistic to fix ALL the "legacy" C code that uses unaligned access, even if the C standard already stated at the time that unaligned access is UB. Therefore, it would be really great to have an option such as "-fno-strict-align" or something like that.


Side question: now we have "-Wcast-align=strict", which will trigger a warning in cases like the example above, which is helpful, even if it cannot warn us in all the cases (see Pascal's example). BUT, this warning can be suppressed this way:

    *(unsigned long *)(void *)(dst+res) = c;

Question: does that mean that GCC *won't make assumptions* anymore about the alignment of the pointer, or is it still UB from GCC's point of view?

Thanks in advance,
Vlad
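For comparison with the plain cast above, here is a sketch of a word store that tells the compiler about the missing alignment, using a packed struct. This is similar in spirit to the kernel's get_unaligned()/put_unaligned() helpers, but the names una_ulong, put_unaligned_ulong and get_unaligned_ulong are hypothetical, and the code assumes a compiler with the GNU packed and may_alias attributes.

```c
/* A packed struct has alignment 1, so accesses to its member carry no
   alignment assumption; may_alias keeps the access valid under strict
   aliasing when the underlying storage is a char buffer. */
struct una_ulong {
    unsigned long x;
} __attribute__((__packed__, __may_alias__));

/* Hypothetical helper: store v at p, which may be misaligned. */
static inline void put_unaligned_ulong(void *p, unsigned long v)
{
    ((struct una_ulong *)p)->x = v;
}

/* Hypothetical helper: load an unsigned long from p, which may be misaligned. */
static inline unsigned long get_unaligned_ulong(const void *p)
{
    return ((const struct una_ulong *)p)->x;
}
```

On targets where misaligned accesses are cheap, compilers typically turn these helpers into a single load or store, so the cost compared with the raw cast should be small.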
Comment 6 Richard Biener 2021-05-03 07:54:22 UTC
It's still UB.  Note that GCC has for a _long_ time made this assumption - just the places we take advantage of it have grown.

Note it would be _very_ difficult to provide a -fno-strict-alignment option
because we can't really make all data types unaligned since that would break
the ABI.  So instead we'd have to sprinkle flag_strict_alignment checks
all over the place so I'm very sure you won't get a "fixed" compiler and
you might get one that doesn't adhere to the platform ABI any more (in case
we put one flag_strict_alignment too many).
Comment 7 Alexander Monakov 2021-05-03 14:58:08 UTC
In comment #2 I touched upon a potentially more practical way to offer -fno-strict-alignment:

Run early work with ABI alignments: compute __alignof correctly, lay out composite types as required by ABI, and assign alignments to variables (including stack variables and function parameters). Then make a pass over types and reduce their alignment. This way, optimizations see a universe where types have alignment 1, and variables are defined as if they had an explicit attribute-align with increased alignment (and likewise for structure fields).
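The "universe" described above can be approximated by hand at the source level, which may help to picture it. This is only an illustration using GNU C attributes, not a description of how such a pass would be implemented inside GCC; the names int_a1, counter, type_alignment and variable_is_abi_aligned are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* What optimizations would see: the type itself has alignment 1. */
typedef int int_a1 __attribute__((__aligned__(1)));

/* What the ABI still gets: the variable is defined with the alignment
   that plain int would have had, as an explicit attribute. */
static int_a1 counter __attribute__((__aligned__(__alignof__(int))));

/* Alignment that can be assumed for an arbitrary int_a1 pointer. */
size_t type_alignment(void)
{
    return __alignof__(int_a1);
}

/* Nonzero if the variable's address is aligned as for a plain int. */
int variable_is_abi_aligned(void)
{
    return (uintptr_t)&counter % __alignof__(int) == 0;
}
```

Here a pointer to int_a1 promises nothing about alignment, while the object counter itself is still placed where the ABI would place an int.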