Bug 56564 - movdqa on possibly-8-byte-aligned struct with -O3
Status: ASSIGNED
Alias: None
Product: gcc
Classification: Unclassified
Component: target
Version: 4.7.2
Importance: P3 normal
Target Milestone: ---
Assignee: Jakub Jelinek
URL:
Keywords: ABI, wrong-code
Depends on:
Blocks:
 
Reported: 2013-03-07 16:23 UTC by lukeocamden
Modified: 2017-07-14 08:46 UTC
CC List: 4 users

See Also:
Host:
Target: x86_64-*-*
Build:
Known to work:
Known to fail: 4.6.4, 4.7.2, 4.8.0
Last reconfirmed: 2013-03-08 00:00:00


Attachments
Preprocessed source file (146.08 KB, application/x-gzip)
2013-03-07 16:26 UTC, lukeocamden
Preprocessed with ICC (144.20 KB, application/x-gzip)
2013-03-08 10:47 UTC, lukeocamden
Generated by icc 13 (328.72 KB, application/x-gzip)
2013-03-08 11:07 UTC, lukeocamden
gcc49-pr56564.patch (4.21 KB, patch)
2013-06-07 14:01 UTC, Jakub Jelinek

Description lukeocamden 2013-03-07 16:23:22 UTC
#include <boost/exception_ptr.hpp>

struct foo { };

int main()
{
  boost::copy_exception(foo());
}

Compiling the above with -O3 results in the following instruction being emitted:

movdqa  %xmm0, _ZZN5boost16exception_detail27get_static_exception_objectINS0_10bad_alloc_EEENS_13exception_ptrEvE2ep(%rip)

But that symbol need not be 16-byte aligned (it's a boost::exception_ptr, which contains a boost::shared_ptr, which is just a pair of pointers).

This crashes if _ZZN5boost16exception_detail27get_static_exception_objectINS0_10bad_alloc_EEENS_13exception_ptrEvE2ep comes from another object file where it is declared with 8-byte alignment.

Possible duplicate of 54167? Works fine with 4.6.2

Preprocessed source is attached.
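
The alignment argument in this description can be checked with a few lines of C. This is only a minimal sketch: `struct two_ptrs` is a hypothetical stand-in for the pair of pointers inside boost::shared_ptr (it is not the real Boost type), and the helper names are made up for illustration. The type is 16 bytes in size on x86-64 but only guarantees pointer alignment, while an aligned 16-byte access such as movdqa needs the low four address bits to be zero.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the two pointers held by boost::shared_ptr. */
struct two_ptrs {
    void *px;
    void *pn;
};

/* The struct is exactly two pointers wide (16 bytes on x86-64). */
static_assert(sizeof(struct two_ptrs) == 2 * sizeof(void *),
              "unexpected padding in two_ptrs");

/* Alignment the type itself guarantees: that of a pointer (8 on x86-64). */
size_t two_ptrs_type_align(void)
{
    return _Alignof(struct two_ptrs);
}

/* Nonzero if an aligned 16-byte access (such as movdqa) at addr is safe. */
int is_16_byte_aligned(uintptr_t addr)
{
    return (addr & 15u) == 0;
}
```

An object placed at an address ending in 0x8 satisfies the type's own alignment on x86-64 but fails `is_16_byte_aligned`, which is the situation described for the 8-byte-aligned definition from the other object file.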
Comment 1 lukeocamden 2013-03-07 16:26:39 UTC
Created attachment 29611
Preprocessed source file
Comment 2 Richard Biener 2013-03-08 10:00:54 UTC
This instruction appears in an EH region of function

_ZN5boost16exception_detail27get_static_exception_objectINS0_10bad_alloc_EEENS_13exception_ptrEv

(AFAIK).  It's defined twice, once weak and aligned 8 and once strong
and aligned 16, so AFAIK it _is_ aligned properly.

        .align 8
        .type   _ZGVZN5boost16exception_detail27get_static_exception_objectINS0_10bad_alloc_EEENS_13exception_ptrEvE2ep, @gnu_unique_object
        .size   _ZGVZN5boost16exception_detail27get_static_exception_objectINS0_10bad_alloc_EEENS_13exception_ptrEvE2ep, 8
_ZGVZN5boost16exception_detail27get_static_exception_objectINS0_10bad_alloc_EEENS_13exception_ptrEvE2ep:
        .zero   8
        .weak   _ZZN5boost16exception_detail27get_static_exception_objectINS0_10bad_alloc_EEENS_13exception_ptrEvE2ep
        .section        .bss._ZZN5boost16exception_detail27get_static_exception_objectINS0_10bad_alloc_EEENS_13exception_ptrEvE2ep,"awG",@nobits,_ZZN5boost16exception_detail27get_static_exception_objectINS0_10bad_alloc_EEENS_13exception_ptrEvE2ep,comdat
        .align 16
        .type   _ZZN5boost16exception_detail27get_static_exception_objectINS0_10bad_alloc_EEENS_13exception_ptrEvE2ep, @gnu_unique_object
        .size   _ZZN5boost16exception_detail27get_static_exception_objectINS0_10bad_alloc_EEENS_13exception_ptrEvE2ep, 16
_ZZN5boost16exception_detail27get_static_exception_objectINS0_10bad_alloc_EEENS_13exception_ptrEvE2ep:
        .zero   16

and readelf shows:

  [192] .bss._ZZN5boost16 NOBITS           0000000000000000  00001ca0
       0000000000000010  0000000000000000 WAG       0     0     16

with alignment of 16.

> This crashes if
>_ZZN5boost16exception_detail27get_static_exception_objectINS0_10bad_alloc_EEENS_13exception_ptrEvE2ep
> comes from another object file where it is declared with 8-byte alignment.

so this would be a bug and a violation of ODR(?)

What's this "other object file"?

The code piece in question is:

        template <class Exception>
        exception_ptr
        get_static_exception_object()
            {
            Exception ba;
            exception_detail::clone_impl<Exception> c(ba);
            static exception_ptr ep(shared_ptr<exception_detail::clone_base const>(new exception_detail::clone_impl<Exception>(c)));
            return ep;
            }

OTOH, not sure what increases the alignment of that object from its
type alignment.

Same alignment is emitted with 4.8 and also 4.6 - so you must be unlucky
with that other object file (compiled with which compiler?)

Please also attach preprocessed source of the "other object file"
Comment 3 lukeocamden 2013-03-08 10:32:34 UTC
Sorry for my cryptic comments about the "other object file". It's compiled with icc 13. I will attach the preprocessed source and generated assembly.
Comment 4 lukeocamden 2013-03-08 10:47:10 UTC
Created attachment 29618
Preprocessed with ICC
Comment 5 lukeocamden 2013-03-08 11:07:45 UTC
Created attachment 29619
Generated by icc 13
Comment 6 Jakub Jelinek 2013-03-08 11:20:19 UTC
This is due to ix86_data_alignment, which has:

  /* x86-64 ABI requires arrays greater than 16 bytes to be aligned
     to 16byte boundary.  */
  if (TARGET_64BIT)
    {
      if (AGGREGATE_TYPE_P (type)
           && TYPE_SIZE (type)
           && TREE_CODE (TYPE_SIZE (type)) == INTEGER_CST
           && (TREE_INT_CST_LOW (TYPE_SIZE (type)) >= 128
               || TREE_INT_CST_HIGH (TYPE_SIZE (type))) && align < 128)
        return 128;
    }

The comment and the wording of http://refspecs.linuxfoundation.org/elf/x86_64-abi-0.95.pdf
seem to be inconsistent with what the code does.
The comment and the 0.95 version of the psABI only talk about arrays:
"An array uses the same alignment as its elements, except that a local or global
array variable that requires at least 16 bytes, or a C99 local or global variable-length array variable, always has alignment of at least 16 bytes."
but AGGREGATE_TYPE_P isn't solely about array local/global variables, but any aggregates (arrays, structs, unions, ...).  ep apparently has size of 16 and the above code aligns it to 16 bytes, but icc probably aligns it just to 8 bytes, as the maximum alignment of its members.  Now, changing the above to only look at arrays would probably cause more harm than good, because while code compiled by fixed gcc would be compatible with icc, it would be incompatible with code compiled by older gcc.  Guess if we want to change something, we'd need to change it in a way that such aggregates (non-array ones) of size 16 and above are still 16-byte aligned, but if the variable isn't known to bind locally, don't increase DECL_ALIGN of the var, so that no optimizers actually rely on it.
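
The mismatch described in this comment can be modelled in a few lines. The sketch below is a simplification and not GCC code: `abi_data_align` and `gcc_data_align` are hypothetical names, sizes and alignments are in bits as in the quoted snippet, and the variable-length array case from the psABI text is left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified model only; neither function exists in GCC.
   All sizes and alignments are in bits. */

/* psABI 0.95 wording: only array variables of at least 16 bytes
   are raised to 16-byte alignment. */
unsigned abi_data_align(bool is_array, unsigned long size_bits,
                        unsigned align_bits)
{
    if (is_array && size_bits >= 128 && align_bits < 128)
        return 128;
    return align_bits;
}

/* Behaviour of the quoted ix86_data_alignment fragment: any aggregate
   (array, struct, union) of at least 16 bytes is raised. */
unsigned gcc_data_align(bool is_aggregate, unsigned long size_bits,
                        unsigned align_bits)
{
    if (is_aggregate && size_bits >= 128 && align_bits < 128)
        return 128;
    return align_bits;
}
```

For a 16-byte struct with 8-byte members, `abi_data_align(false, 128, 64)` stays at 64 while `gcc_data_align(true, 128, 64)` gives 128, which is the gap between what another compiler may emit and what GCC assumes.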
Comment 7 Richard Biener 2013-03-08 11:26:19 UTC
Confirmed.
Comment 8 Jakub Jelinek 2013-03-08 11:35:36 UTC
Guess we'd need to split DATA_ALIGNMENT into two different macros (or one with an extra argument), so that align_variable would know what alignment is part of ABI and what is just an optimization above that, then align_variable could call targetm.binds_local_p to see if DECL_ALIGN can be increased to the optimization level or needs to stay at the ABI guaranteed level.  And then when assembling vars, we'd increase the emitted alignment to the optimization level.
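
The proposal in this comment amounts to tracking two alignments per variable. A rough sketch of that decision, assuming a hypothetical helper `pick_alignment` and struct `var_align` (neither is part of GCC; the real change went into align_variable and related code in varasm.c):

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model, not GCC code. Alignments are in bits. */
struct var_align {
    unsigned assumed; /* what optimizers may rely on (DECL_ALIGN) */
    unsigned emitted; /* what is written out when the variable is defined */
};

/* abi_align: alignment guaranteed by the ABI.
   opt_align: larger alignment chosen purely as an optimization.
   binds_locally: true if every reference is known to resolve to
   this definition. */
struct var_align pick_alignment(unsigned abi_align, unsigned opt_align,
                                bool binds_locally)
{
    assert(opt_align >= abi_align);

    struct var_align r;
    /* Only trust the optimization alignment when no other definition
       can replace this one at link or load time. */
    r.assumed = binds_locally ? opt_align : abi_align;
    /* Our own definition can still be emitted with the larger alignment. */
    r.emitted = opt_align;
    return r;
}
```

With `pick_alignment(64, 128, false)` the definition is still emitted 16-byte aligned, but nothing is allowed to assume more than 8 bytes, so a less-aligned definition from another object file stays safe.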
Comment 9 Jakub Jelinek 2013-03-08 12:38:20 UTC
Smaller testcase (-O2 -fpic):

struct S { long a, b; } s;
int foo (void)
{
  return ((long) &s) & 15;
}

Since http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=162943 this has been optimized into return 0, even though the psABI (probably) doesn't guarantee that.  But e.g. for
__builtin_memset (&s, 0, sizeof (s)); one can already see in the 4.0 RTL dumps with -O2 -fpic that MEM_ALIGN of s is assumed to be 128 bits.
Comment 10 Jan Hubicka 2013-04-08 15:22:21 UTC
Mine.
Comment 11 Sandra Loosemore 2013-05-25 17:39:23 UTC
This affects at least PowerPC, too, which implements DATA_ALIGNMENT to add additional alignment beyond that specified by the ABI.

Isn't TYPE_ALIGN already supposed to return the ABI-mandated alignment for objects of a given type?  The documentation for DATA_ALIGNMENT already suggests that its purpose is to add additional alignment for optimization purposes and I suspect other targets may be using it that way, too.  Perhaps what's needed here is more careful monitoring of the places where DATA_ALIGNMENT is being used, rather than splitting it into two macros or adding an argument to control the two uses.  Or at least, we'd have to clarify how the requirements for the ABI-conforming use of DATA_ALIGNMENT differ from what TYPE_ALIGN is supposed to do.

It seems to me that DATA_ALIGNMENT's original purpose was to add additional alignment on variable definitions, and IIUC the problem now is either that it is being used in other contexts or that its intended use is not taking into account common, weak, and/or comdat definitions where the linker may substitute a less-aligned definition from another compilation unit.  

Also, somebody should check whether vect_can_force_dr_alignment_p in tree-vect-data-refs.c is catching all the cases it needs to for ABI conformance.
Comment 12 Jakub Jelinek 2013-05-25 18:21:49 UTC
Maybe that was the original purpose of DATA_ALIGNMENT, but right now it certainly serves both, which is wrong.  We need one macro for the ABI mandated alignment and one for the optimization alignment beyond that; the optimization alignment can be used if it can be proved that we'll bind to the optimized decl, but the ABI alignment has to be used otherwise.

E.g. the x86_64 ABI says that certain arrays are aligned in a particular way, which is certainly something beyond what TYPE_ALIGN provides (changing TYPE_ALIGN of the arrays would affect the layout of structures, and that would be wrong).
Comment 13 Jakub Jelinek 2013-06-07 14:01:08 UTC
Created attachment 30275
gcc49-pr56564.patch

Untested fix.  Honza, is the array type >= 16 bytes alignment increase the only ABI mandated one and all the rest is just optimization?
Comment 14 Jakub Jelinek 2013-06-10 15:52:18 UTC
Author: jakub
Date: Mon Jun 10 15:41:52 2013
New Revision: 199898

URL: http://gcc.gnu.org/viewcvs?rev=199898&root=gcc&view=rev
Log:
	PR target/56564
	* varasm.c (align_variable): Don't use DATA_ALIGNMENT or
	CONSTANT_ALIGNMENT if !decl_binds_to_current_def_p (decl).
	Use DATA_ABI_ALIGNMENT for that case instead if defined.
	(get_variable_align): New function.
	(get_variable_section, emit_bss, emit_common,
	assemble_variable_contents, place_block_symbol): Use
	get_variable_align instead of DECL_ALIGN.
	(assemble_noswitch_variable): Add align argument, use it
	instead of DECL_ALIGN.
	(assemble_variable): Adjust caller.  Use get_variable_align
	instead of DECL_ALIGN.
	* config/i386/i386.h (DATA_ALIGNMENT): Adjust x86_data_alignment
	caller.
	(DATA_ABI_ALIGNMENT): Define.
	* config/i386/i386-protos.h (x86_data_alignment): Adjust prototype.
	* config/i386/i386.c (x86_data_alignment): Add opt argument.  If
	opt is false, only return the psABI mandated alignment increase.
	* config/c6x/c6x.h (DATA_ALIGNMENT): Renamed to...
	(DATA_ABI_ALIGNMENT): ... this.
	* config/mmix/mmix.h (DATA_ALIGNMENT): Renamed to...
	(DATA_ABI_ALIGNMENT): ... this.
	* config/mmix/mmix.c (mmix_data_alignment): Adjust function comment.
	* config/s390/s390.h (DATA_ALIGNMENT): Renamed to...
	(DATA_ABI_ALIGNMENT): ... this.
	* doc/tm.texi.in (DATA_ABI_ALIGNMENT): Document.
	* doc/tm.texi: Regenerated.

	* gcc.target/i386/pr56564-1.c: New test.
	* gcc.target/i386/pr56564-2.c: New test.
	* gcc.target/i386/pr56564-3.c: New test.
	* gcc.target/i386/pr56564-4.c: New test.
	* gcc.target/i386/avx256-unaligned-load-4.c: Add -fno-common.
	* gcc.target/i386/avx256-unaligned-store-1.c: Likewise.
	* gcc.target/i386/avx256-unaligned-store-3.c: Likewise.
	* gcc.target/i386/avx256-unaligned-store-4.c: Likewise.
	* gcc.target/i386/vect-sizes-1.c: Likewise.
	* gcc.target/i386/memcpy-1.c: Likewise.
	* gcc.dg/vect/costmodel/i386/costmodel-vect-31.c (tmp): Initialize.
	* gcc.dg/vect/costmodel/x86_64/costmodel-vect-31.c (tmp): Likewise.

Added:
    trunk/gcc/testsuite/gcc.target/i386/pr56564-1.c
    trunk/gcc/testsuite/gcc.target/i386/pr56564-2.c
    trunk/gcc/testsuite/gcc.target/i386/pr56564-3.c
    trunk/gcc/testsuite/gcc.target/i386/pr56564-4.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/c6x/c6x.h
    trunk/gcc/config/i386/i386-protos.h
    trunk/gcc/config/i386/i386.c
    trunk/gcc/config/i386/i386.h
    trunk/gcc/config/mmix/mmix.c
    trunk/gcc/config/mmix/mmix.h
    trunk/gcc/config/s390/s390.h
    trunk/gcc/doc/tm.texi
    trunk/gcc/doc/tm.texi.in
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/testsuite/gcc.dg/vect/costmodel/i386/costmodel-vect-31.c
    trunk/gcc/testsuite/gcc.dg/vect/costmodel/x86_64/costmodel-vect-31.c
    trunk/gcc/testsuite/gcc.target/i386/avx256-unaligned-load-4.c
    trunk/gcc/testsuite/gcc.target/i386/avx256-unaligned-store-1.c
    trunk/gcc/testsuite/gcc.target/i386/avx256-unaligned-store-3.c
    trunk/gcc/testsuite/gcc.target/i386/avx256-unaligned-store-4.c
    trunk/gcc/testsuite/gcc.target/i386/memcpy-1.c
    trunk/gcc/testsuite/gcc.target/i386/vect-sizes-1.c
    trunk/gcc/varasm.c
Comment 15 Jakub Jelinek 2013-06-11 06:17:17 UTC
Author: jakub
Date: Tue Jun 11 06:03:46 2013
New Revision: 199934

URL: http://gcc.gnu.org/viewcvs?rev=199934&root=gcc&view=rev
Log:
	PR target/56564
	* varasm.c (get_variable_align): Move #endif to the right place.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/varasm.c
Comment 16 Dominique d'Humieres 2013-06-11 09:15:58 UTC
On x86_64-apple-darwin10.8 at revision 199935, I get the following failures for the tests added at revision 199898:

FAIL: gcc.target/i386/pr56564-1.c scan-tree-dump-times optimized "&s" 1
FAIL: gcc.target/i386/pr56564-1.c scan-tree-dump-times optimized "return 0" 1
FAIL: gcc.target/i386/pr56564-3.c scan-tree-dump-times optimized "&s" 1
FAIL: gcc.target/i386/pr56564-3.c scan-tree-dump-times optimized "&t" 1

The optimized dumps are (blank lines removed):

[macbook] f90/bug% cat pr56564-1.c.165t.optimized
;; Function foo (foo, funcdef_no=0, decl_uid=1741, symbol_order=2)
foo ()
{
  <bb 2>:
  return 0;
}
;; Function bar (bar, funcdef_no=1, decl_uid=1744, symbol_order=3)
bar ()
{
  <bb 2>:
  return 0;
}

[macbook] f90/bug% cat pr56564-3.c.165t.optimized
;; Function foo (foo, funcdef_no=0, decl_uid=1741, symbol_order=2)
foo ()
{
  struct S * D.1770;
  long int s.0;
  int _2;
  int _3;
  <bb 2>:
  _5 = __builtin___emutls_get_address (&__emutls_v.s);
  s.0_1 = (long int) _5;
  _2 = (int) s.0_1;
  _3 = _2 & 15;
  return _3;
}
;; Function bar (bar, funcdef_no=1, decl_uid=1744, symbol_order=3)
bar ()
{
  char * D.1769;
  char[16] * D.1768;
  long int _1;
  int _2;
  int _3;
  <bb 2>:
  _5 = __builtin___emutls_get_address (&__emutls_v.t);
  _6 = &*_5[0];
  _1 = (long int) _6;
  _2 = (int) _1;
  _3 = _2 & 15;
  return _3;
}
Comment 17 Jakub Jelinek 2013-06-11 09:26:56 UTC
(In reply to Dominique d'Humieres from comment #16)
> On x86_64-apple-darwin10.8 at revision 199935, I get the following failures
> for the tests added at revision 199898:
> 
> FAIL: gcc.target/i386/pr56564-1.c scan-tree-dump-times optimized "&s" 1
> FAIL: gcc.target/i386/pr56564-1.c scan-tree-dump-times optimized "return 0" 1
> FAIL: gcc.target/i386/pr56564-3.c scan-tree-dump-times optimized "&s" 1
> FAIL: gcc.target/i386/pr56564-3.c scan-tree-dump-times optimized "&t" 1

Yeah, MachO is broken by design, guess the tests need to be restricted to non-darwin non-PE.
Comment 18 Dominique d'Humieres 2013-06-11 15:10:55 UTC
(In reply to comment #17)
> Yeah, MachO is broken by design, guess the tests need to be restricted 
> to non-darwin non-PE.

Questions:
(1) What is PE?
(2) Is the second "return 0;" wrong code or valid optimization? If the former, why?
(3) Is the decoration "__emutls_v." the same for all the emutls platforms? If not, where can I find the variants?
Comment 19 Jakub Jelinek 2013-06-11 15:27:16 UTC
The mingw/cygwin stuff.  The testcases assume that the symbols have decl_binds_to_current_def_p false, if that isn't the case (because darwin/mingw apparently don't allow symbol interposition), then the testcase can't work on those.
Comment 20 Jakub Jelinek 2013-06-12 06:50:03 UTC
Author: jakub
Date: Wed Jun 12 06:43:05 2013
New Revision: 199984

URL: http://gcc.gnu.org/viewcvs?rev=199984&root=gcc&view=rev
Log:
	PR target/56564
	* varasm.c (decl_binds_to_current_def_p): Call binds_local_p
	target hook even for !TREE_PUBLIC decls.  If no resolution info
	is available, return false for common and external decls.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/varasm.c

Author: jakub
Date: Wed Jun 12 06:46:53 2013
New Revision: 199985

URL: http://gcc.gnu.org/viewcvs?rev=199985&root=gcc&view=rev
Log:
	PR target/56564
	* gcc.target/i386/pr56564-1.c: Skip on darwin, mingw and cygwin.
	* gcc.target/i386/pr56564-3.c: Likewise.

Modified:
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/testsuite/gcc.target/i386/pr56564-1.c
    trunk/gcc/testsuite/gcc.target/i386/pr56564-3.c
Comment 21 H.J. Lu 2015-12-11 22:39:43 UTC
This bug isn't fixed in GCC 4.9.  -O3 increases alignment from
64 bits to 128 bits on the original testcase:

Hardware watchpoint 6: *(unsigned int *) 0x7fffee9b4468

Old value = 64
New value = 128
ensure_base_align (stmt_info=0x1c8f990, dr=0x1db5b20)
    at /export/gnu/import/git/gcc-release/gcc/tree-vect-stmts.c:4907
4907	      DECL_USER_ALIGN (base_decl) = 1;
(gdb) bt
#0  ensure_base_align (stmt_info=0x1c8f990, dr=0x1db5b20)
    at /export/gnu/import/git/gcc-release/gcc/tree-vect-stmts.c:4907
#1  0x0000000000d33471 in vectorizable_store (stmt=0x7fffed95a280, 
    gsi=0x7fffffffd830, vec_stmt=0x7fffffffd790, slp_node=0x1d9e7a0)
    at /export/gnu/import/git/gcc-release/gcc/tree-vect-stmts.c:5131
#2  0x0000000000d38f80 in vect_transform_stmt (stmt=0x7fffed95a280, 
    gsi=0x7fffffffd830, grouped_store=0x7fffffffd84a, slp_node=0x1d9e7a0, 
    slp_node_instance=0x1cb3e10)
    at /export/gnu/import/git/gcc-release/gcc/tree-vect-stmts.c:7211
#3  0x0000000000d5a980 in vect_schedule_slp_instance (node=0x1d9e7a0, 
    instance=0x1cb3e10, vectorization_factor=1)
    at /export/gnu/import/git/gcc-release/gcc/tree-vect-slp.c:3084
#4  0x0000000000d5abd0 in vect_schedule_slp (loop_vinfo=0x0, 
    bb_vinfo=0x1ddf410)
    at /export/gnu/import/git/gcc-release/gcc/tree-vect-slp.c:3154
#5  0x0000000000d5aea7 in vect_slp_transform_bb (bb=0x7fffece8ec30)
    at /export/gnu/import/git/gcc-release/gcc/tree-vect-slp.c:3230
#6  0x0000000000d5e41b in execute_vect_slp ()
    at /export/gnu/import/git/gcc-release/gcc/tree-vectorizer.c:605
#7  0x0000000000d5e4c9 in (anonymous namespace)::pass_slp_vectorize::execute (
    this=0x1b97010)
    at /export/gnu/import/git/gcc-release/gcc/tree-vectorizer.c:649
#8  0x0000000000a7da14 in execute_one_pass (pass=0x1b97010)
(gdb) f 1
#1  0x0000000000d33471 in vectorizable_store (stmt=0x7fffed95a280, 
    gsi=0x7fffffffd830, vec_stmt=0x7fffffffd790, slp_node=0x1d9e7a0)
    at /export/gnu/import/git/gcc-release/gcc/tree-vect-stmts.c:5131
5131	  ensure_base_align (stmt_info, dr);
(gdb) f 2
#2  0x0000000000d38f80 in vect_transform_stmt (stmt=0x7fffed95a280, 
    gsi=0x7fffffffd830, grouped_store=0x7fffffffd84a, slp_node=0x1d9e7a0, 
    slp_node_instance=0x1cb3e10)
    at /export/gnu/import/git/gcc-release/gcc/tree-vect-stmts.c:7211
7211	      done = vectorizable_store (stmt, gsi, &vec_stmt, slp_node);
(gdb) 

This bug may be really fixed by r221268:

diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
index aa9d43f..41ff802 100644
--- a/gcc/tree-vect-stmts.c
+++ b/gcc/tree-vect-stmts.c
@@ -4956,8 +4956,13 @@ ensure_base_align (stmt_vec_info stmt_info, struct data_reference *dr)
       tree vectype = STMT_VINFO_VECTYPE (stmt_info);
       tree base_decl = ((dataref_aux *)dr->aux)->base_decl;
 
-      DECL_ALIGN (base_decl) = TYPE_ALIGN (vectype);
-      DECL_USER_ALIGN (base_decl) = 1;
+      if (decl_in_symtab_p (base_decl))
+  symtab_node::get (base_decl)->increase_alignment (TYPE_ALIGN (vectype));
+      else
+  {
+          DECL_ALIGN (base_decl) = TYPE_ALIGN (vectype);
+          DECL_USER_ALIGN (base_decl) = 1;
+  }
       ((dataref_aux *)dr->aux)->base_misaligned = false;
     }
 }

in GCC 5.
Comment 22 Thomas Gereke 2017-07-13 21:43:17 UTC
Seems the bug does still exist in 6.3.0 20170516 (Debian 6.3.0-18). I get a GP on

  > 0x55555574d8c8 <...[abi:cxx11]() const+264>    movdqa 0x68(%rsp),%xmm0
    0x55555574d8ce <...[abi:cxx11]() const+270>    lea    0x80(%rsp),%r13
    0x55555574d8d6 <...[abi:cxx11]() const+278>    movq   $0x0,0x50(%rsp)
    0x55555574d8df <...[abi:cxx11]() const+287>    movl   $0x0,0x10(%rsp)
    0x55555574d8e7 <...[abi:cxx11]() const+295>    movaps %xmm0,(%rsp)
    0x55555574d8eb <...[abi:cxx11]() const+299>    movq   $0x0,0x6(%rsp)
    0x55555574d8f4 <...[abi:cxx11]() const+308>    movw   $0x0,0xe(%rsp)
    0x55555574d8fb <...[abi:cxx11]() const+315>    movdqa (%rsp),%xmm1
    0x55555574d900 <...[abi:cxx11]() const+320>    movaps %xmm1,0x40(%rsp)

The asm code is obviously wrong, because movdqa 0x68(%rsp),%xmm0 followed by movdqa (%rsp),%xmm1 without changes to %rsp has to fail. %rsp was 0x7fffecd477d0.

Code was C++ compiled with -O3 and x86_64. The underlying data structure is boost::asio::ip::address, which consists of an enum (4 bytes), address_v4 (4 bytes) and address_v6 (16 bytes). The GP occurs when accessing the ipv6 address.
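
The arithmetic behind the claim above can be spelled out. A small sketch, assuming the reported %rsp value is the one in effect at the faulting instruction; `misalign16` is a hypothetical helper introduced only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Distance of addr from the previous 16-byte boundary (0 means aligned). */
unsigned misalign16(uint64_t addr)
{
    return (unsigned)(addr & 15u);
}
```

With %rsp = 0x7fffecd477d0, `misalign16(0x7fffecd477d0)` is 0, so (%rsp) is 16-byte aligned, while `misalign16(0x7fffecd477d0 + 0x68)` is 8. Because 0x68 is not a multiple of 16, the two movdqa operands can never both be aligned for the same %rsp, and here it is the first one, 0x68(%rsp), that faults.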
Comment 23 Jakub Jelinek 2017-07-14 08:46:52 UTC
The bug is fixed, you must be running into a different issue, either in the source you're compiling, or in the compiler.  So, please open a new bugreport instead of commenting on a different one, and supply all the needed information (see http://gcc.gnu.org/bugs/ for details on what we need).