Jakub Jelinek [Tue, 24 Apr 2018 16:05:19 +0000 (18:05 +0200)]
re PR target/85503 (ICE in replace_swapped_load_constant, at config/rs6000/rs6000-p8swap.c:1853 on powerpc64le-linux-gnu)
PR target/85503
* config/rs6000/rs6000-p8swap.c (const_load_sequence_p): Punt if
const_vector is not CONST_VECTOR or SYMBOL_REF for a constant pool
containing a CONST_VECTOR.
Eric Botcazou [Mon, 23 Apr 2018 20:19:39 +0000 (20:19 +0000)]
re PR middle-end/85496 (internal compiler error: in emit_move_insn, at expr.c:3722)
PR middle-end/85496
* expr.c (store_field): In the bitfield case, if the value comes from
a function call and is returned in registers by means of a PARALLEL,
do not change the mode of the temporary unless BLKmode and VOIDmode.
Kito Cheng [Fri, 20 Apr 2018 19:03:19 +0000 (19:03 +0000)]
RISC-V: Make sure stack is always aligned during adjusting stack.
gcc/
2018-04-20 Kito Cheng <kito.cheng@gmail.com>
* config/riscv/riscv.c (riscv_first_stack_step): Round up min
step to make sure stack always aligned.
PR testsuite/85483: Move aarch64/sve/vcond_1.c test to g++.dg/other/
I totally botched up this sve test file in 259437.
It needs C++, so move it to g++.dg/other and make it a .C file.
Also adds the target guards to prevent it from running on non-aarch64 targets.
Tested that it passes on aarch64-none-elf and doesn't get run on arm-none-eabi.
Jakub Jelinek [Thu, 19 Apr 2018 19:16:18 +0000 (21:16 +0200)]
re PR tree-optimization/85467 (ICE: verify_gimple failed: non-trivial conversion at assignment with -O2 -fno-tree-ccp --param=sccvn-max-scc-size=10)
PR tree-optimization/85467
* fold-const.c (fold_ternary_loc) <case BIT_FIELD_REF>: Use
VECTOR_TYPE_P macro. If type is vector type, VIEW_CONVERT_EXPR the
VECTOR_CST element to type.
H.J. Lu [Thu, 19 Apr 2018 17:05:39 +0000 (17:05 +0000)]
libgcc/CET: Skip signal frames when unwinding shadow stack
When -fcf-protection -mcet is used, I got
FAIL: g++.dg/eh/sighandle.C
(gdb) bt
#0 _Unwind_RaiseException (exc=exc@entry=0x416ed0)
at /export/gnu/import/git/sources/gcc/libgcc/unwind.inc:140
#1 0x00007ffff7d9936b in __cxxabiv1::__cxa_throw (obj=<optimized out>,
tinfo=0x403dd0 <typeinfo for int@@CXXABI_1.3>, dest=0x0)
at /export/gnu/import/git/sources/gcc/libstdc++-v3/libsupc++/eh_throw.cc:90
#2 0x0000000000401255 in sighandler (signo=11, si=0x7fffffffd6f8,
uc=0x7fffffffd5c0)
at /export/gnu/import/git/sources/gcc/gcc/testsuite/g++.dg/eh/sighandle.C:9
#3 <signal handler called> <<<< Signal frame which isn't on shadow stack
#4 dosegv ()
at /export/gnu/import/git/sources/gcc/gcc/testsuite/g++.dg/eh/sighandle.C:14
#5 0x00000000004012e3 in main ()
at /export/gnu/import/git/sources/gcc/gcc/testsuite/g++.dg/eh/sighandle.C:30
(gdb) p frames
$6 = 5
(gdb)
frame count should be 4, not 5. This patch skips signal frames when
unwinding shadow stack.
gcc/testsuite/
PR libgcc/85334
* g++.dg/torture/pr85334.C: New test.
H.J. Lu [Thu, 19 Apr 2018 16:36:34 +0000 (16:36 +0000)]
i386: Add save_stack_nonlocal and restore_stack_nonlocal
Define STACK_SAVEAREA_MODE to hold both shadow stack and stack pointers.
Replace builtin_setjmp_setup and builtin_longjmp with save_stack_nonlocal
and restore_stack_nonlocal to support both builtin setjmp/longjmp as well
as non-local goto in nested functions.
H.J. Lu [Thu, 19 Apr 2018 15:22:27 +0000 (15:22 +0000)]
libgcc/CET: Add _CET_ENDBR to __stack_split_initialize
Program received signal SIGSEGV, Segmentation fault.
__stack_split_initialize ()
at /export/gnu/import/git/sources/gcc/libgcc/config/i386/morestack.S:751
751 leaq -16000(%rsp),%rax # We should have at least 16K.
Missing separate debuginfos, use: dnf debuginfo-install libgcc-8.0.1-0.21.0.fc28.x86_64
(gdb) disass
Dump of assembler code for function __stack_split_initialize:
=> 0x0000000000402858 <+0>: lea -0x3e80(%rsp),%rax
0x0000000000402860 <+8>: mov %rax,%fs:0x70
0x0000000000402869 <+17>: sub $0x8,%rsp
0x000000000040286d <+21>: mov %rsp,%rdi
0x0000000000402870 <+24>: mov $0x3e80,%esi
0x0000000000402875 <+29>: callq 0x401810 <__generic_morestack_set_initial_sp>
0x000000000040287a <+34>: add $0x8,%rsp
0x000000000040287e <+38>: retq
End of assembler dump.
(gdb)
This patch adds the missing ENDBR to __stack_split_initialize.
H.J. Lu [Thu, 19 Apr 2018 15:15:04 +0000 (15:15 +0000)]
x86: Enable -fcf-protection with multi-byte NOPs
-fcf-protection -mcet can't be used with IFUNC features, like symbol
multiversioning or target clone, since IBT/SHSTK are applied to the whole
program and they may be disabled in some functions. But -fcf-protection
is implemented with multi-byte NOPs on all 64-bit processors as well as
32-bit processors starting with Pentium Pro. If -fcf-protection requires
-mcet, IFUNC features can't be used on Linux when -fcf-protection is
enabled by default.
This patch changes -fcf-protection to implement indirect branch and
return address tracking with multi-byte NOPs. -mibt and -mshstk are
changed to only enable CET built-in functions. CET tests are updated
to allow -fcf-protection without -mibt, -mshstk and -mcet on x86.
-fcf-protection=none are also added to tests which fail with
-fcf-protection so that -fcf-protection can be added to RUNTESTFLAGS
to verify -fcf-protection implementation.
gcc/
PR target/85417
* config/i386/cet.c (file_end_indicate_exec_stack_and_cet):
Check flag_cf_protection instead of TARGET_IBT and TARGET_SHSTK.
* config/i386/i386-c.c (ix86_target_macros_internal): Also
define __IBT__ and __SHSTK__ for -fcf-protection.
* config/i386/i386.c (pass_insert_endbranch::gate): Don't check
TARGET_IBT.
(ix86_trampoline_init): Likewise.
(x86_output_mi_thunk): Likewise.
(ix86_notrack_prefixed_insn_p): Likewise.
(ix86_option_override_internal): Don't disallow -fcf-protection.
* config/i386/i386.md (rdssp<mode>): Also enable for
-fcf-protection.
(incssp<mode>): Likewise.
(nop_endbr): Likewise.
* config/i386/i386.opt (mcet): Change help message to built-in
functions only.
(mibt): Likewise.
(mshstk): Likewise.
* doc/invoke.texi: Remove -mcet, -mibt and -mshstk condition
on -fcf-protection. Change -mcet, -mibt and -mshstk to only
enable CET built-in functions.
Richard Biener [Thu, 19 Apr 2018 12:41:42 +0000 (12:41 +0000)]
re PR tree-optimization/84737 (20% degradation in CPU2000 172.mgrid starting with r256888)
2018-04-19 Richard Biener <rguenther@suse.de>
PR tree-optimization/84737
* tree-vect-data-refs.c (vect_copy_ref_info): New function
copying restrict info.
(vect_setup_realignment): Use it.
* tree-vectorizer.h (vect_copy_ref_info): Declare.
* tree-vect-stmts.c (vectorizable_store): Copy ref info from
the first DR to all generated stores.
(vectorizable_load): Likewise for loads.
Jakub Jelinek [Thu, 19 Apr 2018 07:46:54 +0000 (09:46 +0200)]
re PR tree-optimization/85446 (wrong-code on riscv64)
PR tree-optimization/85446
* match.pd ((intptr_t) x eq/ne CST to x eq/ne (typeof x) cst): Require
the integral and pointer types to have the same precision.
Martin Liska [Wed, 18 Apr 2018 20:08:44 +0000 (22:08 +0200)]
Make Wodr warnings stable.
2018-04-18 Martin Liska <mliska@suse.cz>
PR ipa/83983
PR ipa/85391
* lto.c (cmp_type_location): New function.
(lto_read_decls): First collect all types, then
sort them according by location before register_odr_type
is called.
2018-04-18 Martin Liska <mliska@suse.cz>
David Malcolm [Wed, 18 Apr 2018 09:46:58 +0000 (09:46 +0000)]
re PR jit/85384 (libgccjit does not work if --with-gcc-major-version is used)
PR jit/85384
* acx.m4 (GCC_BASE_VER): Remove \$\$ from sed expression.
* configure.ac (gcc-driver-name.h): Honor --with-gcc-major-version
by using gcc_base_ver to generate a gcc_driver_version, and use
it when generating GCC_DRIVER_NAME.
* configure: Regenerate.
Jakub Jelinek [Wed, 18 Apr 2018 06:57:45 +0000 (08:57 +0200)]
re PR c++/84463 (Supposedly-incompliant "error: '* key0' is not a constant expression")
PR c++/84463
* typeck.c (cp_build_addr_expr_1): Move handling of offsetof-like
tricks from here to ...
* cp-gimplify.c (cp_fold) <case ADDR_EXPR>: ... here. Only use it
if INDIRECT_REF's operand is INTEGER_CST cast to pointer type.
* g++.dg/cpp0x/constexpr-nullptr-1.C: Add -O1 to dg-options.
* g++.dg/cpp0x/constexpr-nullptr-2.C: Expect different diagnostics
in two cases. Uncomment two other tests and add expected dg-error for
them.
* g++.dg/init/struct2.C: Cast to int rather than long to avoid
-Wnarrowing diagnostics on some targets for c++11.
* g++.dg/parse/array-size2.C: Remove xfail.
* g++.dg/cpp0x/constexpr-84463.C: New test.
tinst_level objects are created during template instantiation, and
they're most often quite short-lived, but since there's no intervening
garbage collection, they accumulate throughout the pass while most by
far could be recycled after a short while. The original testcase in
PR80290, for example, creates almost 55 million tinst_level objects,
all but 10 thousand of them without intervening GC, but we don't need
more than 284 of them live at a time.
Furthermore, in many cases, TREE_LIST objects are created to stand for
the decl in tinst_level. In most cases, those can be recycled as soon
as the tinst_level object is recycled; in some relatively common
cases, they are modified and reused in up to 3 tinst_level objects.
In the same testcase, TREE_LISTs are used in all but 3 thousand of the
tinst_level objects, and we don't need more than 89 tinst_level
objects with TREE_LISTs live at a time. Furthermore, all but 2
thousand of those are created without intervening GC.
This patch arranges for tinst_level objects to be refcounted (I've
seen as many as 20 live references to a single tinst_level object in
my testing), and for pending_template, tinst_level and TREE_LIST
objects that can be recycled to be added to freelists; that's much
faster than ggc_free()ing them and reallocating them shortly
thereafter. In fact, the introduction of such freelists is what makes
this entire patch lighter-weight, when it comes to memory use, and
faster. With refcounting alone, the total memory footprint was still
about the same, and we spent more time.
In order to further reduce memory use, I've arranged for us to create
TREE_LISTs lazily, only at points that really need them (when printing
error messages). This grows tinst_level by an additional pointer, but
since a TREE_LIST holds a lot more than an extra pointer, and
tinst_levels with TREE_LISTs used to be allocated tens of thousands of
times more often than plain decl ones, we still save memory overall.
I was still not quite happy about growing memory use in cases that
used template classes but not template overload resolution, so I
changed the type of the errors field to unsigned short, from int.
With that change, in_system_header_p and refcount move into the same
word or half-word that used to hold errors, releasing a full word,
bringing tinst_level back to its original size, now without any
padding.
The errors field is only used to test whether we've had any errors
since the expansion of some template, to skip the expansion of further
templates. If we get 2^16 errors or more, it is probably reasonable
to refrain from expanding further templates, even if we would expand
them before this change.
With these changes, compile time for the original testcase at -O0,
with default checking enabled, is cut down by some 3.7%, total GCable
memory allocation is cut down by almost 20%, and total memory use (as
reported by GNU time as maxresident) is cut down by slightly over 15%.
for gcc/cp/ChangeLog
PR c++/80290
* cp-tree.h (struct tinst_level): Split decl into tldcl and
targs. Add private split_list_p, tree_list_p, and not_list_p
inline const predicates; to_list private member function
declaration; free public member function declaration; list_p,
get_node and maybe_get_node accessors, and refcount data
member. Narrow errors to unsigned short.
* error.c (print_instantiation_full_context): Use new
accessors.
(print_instantiation_partial_context_line): Likewise. Drop
const from tinst_level-typed parameter.
* mangle.c (mangle_decl_string): Likewise.
* pt.c (freelist): New template class.
(tree_list_freelist_head): New var.
(tree_list_freelist): New fn, along with specializations.
(tinst_level_freelist_head): New var.
(pending_template_freelist_head): Likewise.
(tinst_level_freelist, pending_template_freelist): New fns.
(tinst_level::to_list, tinst_level::free): Define.
(inc_refcount_use, dec_refcount_use): New fns for tinst_level.
(set_refcount_ptr): New template fn.
(add_pending_template): Adjust for refcounting, freelists and
new accessors.
(neglectable_inst_p): Take a NULL d as a non-DECL.
(limit_bad_template_recursion): Use new accessors.
(push_tinst_level): New overload to create the list.
(push_tinst_level_loc): Make it static, split decl into two
args, adjust tests and initialization to cope with split
lists, use freelist, adjust for refcounting.
(push_tinst_level_loc): New wrapper with the old interface.
(pop_tinst_level): Adjust for refcounting.
(record_last_problematic_instantiation): Likewise.
(reopen_tinst_level): Likewise. Use new accessors.
(instantiate_alias_template): Adjust for split list.
(fn_type_unification): Likewise.
(get_partial_spec_bindings): Likewise.
(instantiate_pending_templates): Use new accessors. Adjust
for refcount. Release pending_template to freelist.
(instantiating_current_function_p): Use new accessors.
Ian Lance Taylor [Tue, 17 Apr 2018 23:55:17 +0000 (23:55 +0000)]
os/signal: disable loading of history during test
Bring in https://golang.org/cl/98616 from gc tip.
Original CL description:
This change modifies Go to disable loading of users' shell history for
TestTerminalSignal tests. TestTerminalSignal, as part of its workload,
will execute a new interactive bash shell. Bash will attempt to load the
user's history from the file pointed to by the HISTFILE environment
variable. For users with large histories that may take up to several
seconds, pushing the whole test past the 5 second timeout and causing
it to fail.
Jakub Jelinek [Tue, 17 Apr 2018 22:18:47 +0000 (00:18 +0200)]
re PR debug/84637 (gcc/dbxout.c:684:14: runtime error: negation of -9223372036854775808 cannot be represented in type 'long int'; cast to an unsigned type to negate this value to itself)
PR debug/84637
* dbxout.c (dbxout_int): Perform negation in unsigned int type.
(stabstr_D): Change type of unum from unsigned int to
unsigned HOST_WIDE_INT. Perform negation in unsigned HOST_WIDE_INT
type.
gcc/
PR 84856
* config/riscv/riscv.c (riscv_compute_frame_info): Add calls to
RISCV_STACK_ALIGN when using outgoing_args_size and pretend_args_size.
Set arg_pointer_offset after using pretend_args_size.
Jakub Jelinek [Tue, 17 Apr 2018 20:22:50 +0000 (22:22 +0200)]
re PR sanitizer/85230 (asan: false positives in kernel on allocas)
PR sanitizer/85230
* asan.c (handle_builtin_stack_restore): Adjust comment. Emit
__asan_allocas_unpoison call and last_alloca_addr = new_sp before
__builtin_stack_restore rather than after it.
* builtins.c (expand_asan_emit_allocas_unpoison): Pass
arg1 + (virtual_dynamic_stack_rtx - stack_pointer_rtx) as second
argument instead of virtual_dynamic_stack_rtx.
rs6000-protos.h (rs6000_builtin_is_supported_p): New prototype.
gcc/ChangeLog:
2018-04-13 Kelvin Nilsen <kelvin@gcc.gnu.org>
* config/rs6000/rs6000-protos.h (rs6000_builtin_is_supported_p):
New prototype.
* config/rs6000/rs6000-c.c (altivec_resolve_overloaded_builtin):
Add note to error message to explain internal mapping of overloaded
built-in function name to non-overloaded built-in function name.
* config/rs6000/rs6000.c (rs6000_builtin_is_supported_p): New
function.
gcc/testsuite/ChangeLog:
2018-04-13 Kelvin Nilsen <kelvin@gcc.gnu.org>
* gcc.target/powerpc/bfp/scalar-extract-sig-5.c: Simplify to
prevent cascading of errors and change expected error message.
* gcc.target/powerpc/bfp/scalar-test-neg-4.c: Restrict this test
to 64-bit targets.
* gcc.target/powerpc/bfp/scalar-test-data-class-8.c: Likewise.
* gcc.target/powerpc/bfp/scalar-test-data-class-9.c: Likewise.
* gcc.target/powerpc/bfp/scalar-test-data-class-10.c: Likewise.
* gcc.target/powerpc/bfp/scalar-insert-exp-11.c: Change expected
error message.
* gcc.target/powerpc/bfp/scalar-extract-exp-5.c: Likewise.
Jakub Jelinek [Tue, 17 Apr 2018 17:18:20 +0000 (19:18 +0200)]
sse.md (vec_extract_lo_<mode><mask_name>): Add (=v, v) alternative and explicit "memory" attribute.
* config/i386/sse.md (vec_extract_lo_<mode><mask_name>): Add
(=v, v) alternative and explicit "memory" attribute.
(vec_extract_lo_<mode><mask_name>): Likewise. Also add
"type", "prefix", "prefix_extra", "length_immediate" and "mode"
attributes.
(vec_extract_lo_<mode><mask_name>): Add (=v, v) alternative and use
"sselog1" type instead of "sselog".
(vec_extract_hi_<mode><mask_name>): Use "sselog1" type instead of
"sselog". Remove explicit "memory" attribute.
(vec_extract_lo_v32hi): Add (=v, v) alternative and explicit "memory",
"type", "prefix", "prefix_extra", "length_immediate" and "mode"
attributes.
(vec_extract_hi_v32hi): Merge all alternatives into one, use
"sselog1" type instead of "sselog". Remove explicit "memory"
attribute.
(vec_extract_hi_v16hi): Merge each pair of alternatives into one,
use "sselog1" type instead of "sselog". Remove explicit "memory"
attribute.
(vec_extract_lo_v64qi): Add (=v, v) alternative and explicit "memory",
"type", "prefix", "prefix_extra", "length_immediate" and "mode"
attributes.
(vec_extract_hi_v64qi): Merge all alternatives into one, use
"sselog1" type instead of "sselog". Remove explicit "memory"
attribute.
(vec_extract_hi_v32qi): Merge each pair of alternatives into one,
use "sselog1" type instead of "sselog". Remove explicit "memory"
attribute.