gcc/
* config/mips/mips.c (mips_output_jump): Output R_MICROMIPS_JALR
rather than R_MIPS_JALR relocation in microMIPS code. Do not
cancel short delay slots in PIC call relaxation.
gcc/testsuite/
* gcc.target/mips/call-1.c (dg-options): Add `-mno-micromips'.
(dg-final): Remove microMIPS JALRS mnemonic matching.
* gcc.target/mips/call-2.c (dg-options): Add `-mno-micromips'.
(dg-final): Remove microMIPS JALRS mnemonic matching.
* gcc.target/mips/call-3.c (dg-options): Add `-mno-micromips'.
(dg-final): Remove microMIPS JALRS mnemonic matching.
* gcc.target/mips/call-4.c (dg-options): Add `-mno-micromips'.
* gcc.target/mips/call-5.c (dg-options): Add `-mno-micromips'.
* gcc.target/mips/call-6.c (dg-options): Add `-mno-micromips'.
* gcc.target/mips/call-1u.c: New test case.
* gcc.target/mips/call-2u.c: New test case.
* gcc.target/mips/call-3u.c: New test case.
* gcc.target/mips/call-4u.c: New test case.
* gcc.target/mips/call-5u.c: New test case.
* gcc.target/mips/call-6u.c: New test case.
Wilco Dijkstra [Wed, 16 Nov 2016 18:10:34 +0000 (18:10 +0000)]
Looking at PR77308, one of the issues is that the bswap optimization phase doesn't work on ARM.
Looking at PR77308, one of the issues is that the bswap optimization
phase doesn't work on ARM. This is due to an odd check that uses
SLOW_UNALIGNED_ACCESS (which is always true on ARM). Since the testcase
in PR77308 generates much better code with this patch (~13% fewer
instructions), it seems best to remove this check.
gcc/
* tree-ssa-math-opts.c (bswap_replace): Remove test
of SLOW_UNALIGNED_ACCESS.
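For illustration only (this is not the PR77308 testcase): a minimal sketch of the kind of byte-by-byte load that a bswap/load-merging pass can turn into a single word load plus a byte reversal on targets where that is profitable. The function name load_be32 is hypothetical.

```cpp
#include <cstdint>

// Assemble a 32-bit value from four bytes stored most-significant first.
// Written out byte by byte, this is the classic shape of a hand-coded
// big-endian load.
std::uint32_t load_be32(const unsigned char *p)
{
  return (static_cast<std::uint32_t>(p[0]) << 24)
       | (static_cast<std::uint32_t>(p[1]) << 16)
       | (static_cast<std::uint32_t>(p[2]) << 8)
       |  static_cast<std::uint32_t>(p[3]);
}
```

Whether the compiler actually merges the four loads depends on target, options and GCC version; the sketch only shows the source-level pattern.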
Szabolcs Nagy [Wed, 16 Nov 2016 17:27:04 +0000 (17:27 +0000)]
[PR libgfortran/78314] Fix ieee_support_halting
ieee_support_halting only checked the availability of status
flags, not trapping support. On some targets the latter can
only be checked at runtime: feenableexcept reports if
enabling traps failed.
So check trapping support by enabling/disabling it.
Updated the test that enabled trapping to check if it is
supported.
gcc/testsuite/
PR libgfortran/78314
* gfortran.dg/ieee/ieee_6.f90: Use ieee_support_halting.
libgfortran/
PR libgfortran/78314
* config/fpu-glibc.h (support_fpu_trap): Use feenableexcept.
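A rough sketch of the runtime probe described above, assuming a glibc target where the GNU extensions feenableexcept, fedisableexcept and fegetexcept are available. The helper name fpu_trap_supported is hypothetical and this is not the libgfortran code.

```cpp
#ifndef _GNU_SOURCE
#define _GNU_SOURCE 1   // feenableexcept and friends are GNU extensions
#endif
#include <fenv.h>

// Try to enable trapping for FLAG; report whether the target accepted it.
// Any trap that was not enabled before the probe is disabled again.
bool fpu_trap_supported(int flag)
{
  const int previously_enabled = feenableexcept(flag);
  if (previously_enabled == -1)
    return false;                       // enabling traps failed at runtime

  const int newly_enabled = flag & ~previously_enabled;
  if (newly_enabled != 0)
    fedisableexcept(newly_enabled);     // restore the original trap mask
  return true;
}
```

The point of the probe is that a status flag being available says nothing about whether the corresponding trap can be enabled.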
gcc/
* config/nvptx/mkoffload.c (main): Check that either OpenACC or OpenMP
is selected. Pass -mgomp to offload compiler in OpenMP case.
* config/nvptx/nvptx-protos.h (nvptx_shuffle_kind): Move enum
declaration from nvptx.c.
(nvptx_gen_shuffle): Declare.
(nvptx_output_set_softstack): Declare.
* config/nvptx/nvptx.c (nvptx_shuffle_kind): Move to nvptx-protos.h.
(need_softstack_decl): New variable.
(need_unisimt_decl): New variable.
(diagnose_openacc_conflict): New. Use it...
(nvptx_option_override): ...here. Handle TARGET_GOMP.
(nvptx_encode_section_info): Handle "shared" attribute.
(write_as_kernel): Restrict to OpenACC target regions.
(init_softstack_frame): New.
(nvptx_init_unisimt_predicate): New.
(write_omp_entry): New. Use it...
(nvptx_declare_function_name): ...here to emit OpenMP target region
entrypoints. Handle TARGET_SOFT_STACK. Call
nvptx_init_unisimt_predicate.
(nvptx_output_set_softstack): New.
(nvptx_get_drap_rtx): Return %argp as the DRAP if needed.
(nvptx_gen_shuffle): Export.
(nvptx_output_call_insn): Handle COND_EXEC patterns. Emit instruction
predicate.
(nvptx_print_operand): Fix handling of instruction predicates.
(nvptx_get_unisimt_master): New helper function.
(nvptx_get_unisimt_predicate): Ditto.
(nvptx_call_insn_is_syscall_p): Ditto.
(nvptx_unisimt_handle_set): Ditto.
(nvptx_reorg_uniform_simt): New. Transform code for -muniform-simt.
(nvptx_reorg): Call nvptx_reorg_uniform_simt.
(nvptx_handle_shared_attribute): New. Use it...
(nvptx_attribute_table): ...here (new entry).
(nvptx_record_offload_symbol): Handle NULL attributes.
(nvptx_file_end): Handle need_softstack_decl and need_unisimt_decl.
(nvptx_simt_vf): New.
(TARGET_SIMT_VF): Define.
* config/nvptx/nvptx.h (TARGET_CPU_CPP_BUILTINS): Define
__nvptx_softstack__ or __nvptx_unisimt__ when the -msoft-stack or
-muniform-simt option, respectively, is active.
(STACK_SIZE_MODE): Define.
(FIXED_REGISTERS): Adjust.
(SOFTSTACK_SLOT_REGNUM): New.
(SOFTSTACK_PREV_REGNUM): New.
(REGISTER_NAMES): Adjust.
(struct machine_function): New fields.
* config/nvptx/nvptx.md (UNSPEC_SET_SOFTSTACK): New.
(UNSPEC_VOTE_BALLOT): Ditto.
(UNSPEC_LANEID): Ditto.
(UNSPECV_NOUNROLL): Ditto.
(atomic): New attribute.
(predicable): New attribute. Generate predicated forms via
define_cond_exec.
(br_true): Mark as not predicable.
(br_false): Ditto.
(br_true_uni): Ditto.
(br_false_uni): Ditto.
(return): Ditto.
(trap_if_true): Ditto.
(trap_if_false): Ditto.
(nvptx_fork): Ditto.
(nvptx_forked): Ditto.
(nvptx_joining): Ditto.
(nvptx_join): Ditto.
(nvptx_barsync): Ditto.
(epilogue): Emit stack restore if TARGET_SOFT_STACK.
(allocate_stack): Implement for TARGET_SOFT_STACK. Remove unused code.
(allocate_stack_<mode>): Remove unused pattern.
(set_softstack_insn): New pattern.
(restore_stack_block): Handle for TARGET_SOFT_STACK.
(nvptx_vote_ballot): New pattern.
(omp_simt_lane): Ditto.
(omp_simt_last_lane): Ditto.
(omp_simt_ordered): Ditto.
(omp_simt_vote_any): Ditto.
(omp_simt_xchg_bfly): Ditto.
(omp_simt_xchg_idx): Ditto.
(nvptx_nounroll): Ditto.
(atomic_compare_and_swap<mode>_1): Mark with atomic attribute.
(atomic_exchange<mode>): Ditto.
(atomic_fetch_add<mode>): Ditto.
(atomic_fetch_addsf): Ditto.
(atomic_fetch_<logic><mode>): Ditto.
* config/nvptx/nvptx.opt (msoft-stack): New option.
(muniform-simt): Ditto.
(mgomp): Ditto.
* config/nvptx/t-nvptx (MULTILIB_OPTIONS): New.
* doc/extend.texi (Nvidia PTX Variable Attributes): New section.
* doc/invoke.texi (msoft-stack): Document.
(muniform-simt): Document.
(mgomp): Document.
* doc/tm.texi: Regenerate.
* doc/tm.texi.in (TARGET_SIMT_VF): New hook.
* target.def: Define it.
* target-insns.def (omp_simt_lane): New.
(omp_simt_last_lane): New.
(omp_simt_ordered): New.
(omp_simt_vote_any): New.
(omp_simt_xchg_bfly): New.
(omp_simt_xchg_idx): New.
libgcc/
* config/nvptx/crt0.c (__main): Setup __nvptx_stacks and __nvptx_uni.
* config/nvptx/mgomp.c: New file.
* config/nvptx/t-nvptx: Add mgomp.c.
gcc/testsuite/
* lib/target-supports.exp (check_effective_target_alloca): Use a
compile test.
* gcc.target/nvptx/softstack.c: New test.
* gcc.target/nvptx/decl-shared.c: New test.
* gcc.target/nvptx/decl-shared-init.c: New test.
gcc/
* config/mips/mips-protos.h (mips_set_text_contents_type): New
prototype.
* config/mips/mips.h (ASM_OUTPUT_BEFORE_CASE_LABEL): New macro.
(ASM_OUTPUT_CASE_END): Likewise.
* config/mips/mips.c (mips_set_text_contents_type): New
function.
(mips16_emit_constants): Record the pool's initial label number
with the `consttable' insn. Emit a `consttable_end' insn at the
end.
(mips_final_prescan_insn): Call `mips_set_text_contents_type'
for `consttable' insns.
(mips_final_postscan_insn): Call `mips_set_text_contents_type'
for `consttable_end' insns.
* config/mips/mips.md (unspec): Add UNSPEC_CONSTTABLE_END enum
value.
(consttable): Add operand.
(consttable_end): New insn.
gcc/testsuite/
* gcc.target/mips/data-sym-jump.c: New test case.
* gcc.target/mips/data-sym-pool.c: New test case.
* gcc.target/mips/insn-pseudo-4.c: Adjust for constant pool
annotation.
Yuri Rumyantsev [Wed, 16 Nov 2016 16:22:39 +0000 (16:22 +0000)]
Support non-masked epilogue vectorization
gcc/
2016-11-16 Yuri Rumyantsev <ysrumyan@gmail.com>
* params.def (PARAM_VECT_EPILOGUES_NOMASK): New.
* tree-if-conv.c (tree_if_conversion): Make public.
* tree-if-conv.h: New file.
* tree-vect-data-refs.c (vect_analyze_data_ref_dependences): Avoid
dynamic alias checks for epilogues.
* tree-vect-loop-manip.c (vect_do_peeling): Return created epilog.
* tree-vect-loop.c: Include tree-if-conv.h.
(new_loop_vec_info): Zero the new orig_loop_info field.
(vect_analyze_loop_2): Don't try to enhance alignment for epilogues.
(vect_analyze_loop): Add argument ORIG_LOOP_INFO, which is non-NULL
when an epilogue is being vectorized; use it to set the orig_loop_info
field of loop_vinfo.
(vect_transform_loop): Check whether the created epilogue should be
returned for further vectorization with a smaller VF. If-convert the
epilogue if required. Print vectorization success for the epilogue.
* tree-vectorizer.c (vectorize_loops): Add epilogue vectorization
if it is required, pass loop_vinfo produced during vectorization of
loop body to vect_analyze_loop.
* tree-vectorizer.h (struct _loop_vec_info): Add new field
orig_loop_info.
(LOOP_VINFO_ORIG_LOOP_INFO): New.
(LOOP_VINFO_EPILOGUE_P): New.
(LOOP_VINFO_ORIG_VECT_FACTOR): New.
(vect_do_peeling): Change prototype to return epilogue.
(vect_analyze_loop): Add argument of loop_vec_info type.
(vect_transform_loop): Return created loop.
gcc/testsuite/
2016-11-16 Yuri Rumyantsev <ysrumyan@gmail.com>
* lib/target-supports.exp (check_avx2_hw_available): New.
(check_effective_target_avx2_runtime): New.
* gcc.dg/vect/vect-tail-nomask-1.c: New test.
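As a back-of-the-envelope model of what vectorizing the epilogue with a smaller VF buys, the sketch below splits an iteration count into main vector iterations, epilogue vector iterations and a scalar remainder. The struct and function names are hypothetical, and the model ignores peeling for alignment and cost-model decisions.

```cpp
#include <cassert>
#include <cstddef>

struct ChunkCounts {
  std::size_t main_vector;      // iterations of the main vector loop
  std::size_t epilogue_vector;  // iterations of the vectorized epilogue
  std::size_t scalar;           // iterations left for the scalar tail
};

// Split N scalar iterations between a main loop with factor VF and an
// epilogue vectorized with the smaller factor EPILOGUE_VF.
ChunkCounts split_iterations(std::size_t n, std::size_t vf,
                             std::size_t epilogue_vf)
{
  assert(vf > 0 && epilogue_vf > 0 && epilogue_vf <= vf);
  ChunkCounts c{};
  c.main_vector = n / vf;
  const std::size_t remainder = n % vf;
  c.epilogue_vector = remainder / epilogue_vf;
  c.scalar = remainder % epilogue_vf;
  return c;
}
```

With 23 iterations, VF 8 and an epilogue VF of 4, the scalar tail shrinks from 7 iterations to 3.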
df: Change defs in entry and uses in exit block during separate shrink-wrapping
So far all target implementations of the separate shrink-wrapping hooks
use the DF LIVE info to figure out around which basic blocks the non-
volatile registers need to be saved. This is done by looking at the
IN+GEN+KILL sets of the basic blocks. However, that doesn't work for
registers that DF says are defined in the entry block, or used in the
exit block.
This patch introduces a local flag DF_SCAN_EMPTY_ENTRY_EXIT that says
no registers should be defined in the entry block, and none used in the
exit block. It also makes try_shrink_wrapping_separate use it. The
rs6000 port is changed to use IN+GEN+KILL for the LR component.
* config/rs6000/rs6000.c (rs6000_components_for_bb): Mark the LR
component as used also if LR_REGNO is a live input to the bb.
* df-scan.c (df_get_entry_block_def_set): Return immediately after
clearing the set if DF_SCAN_EMPTY_ENTRY_EXIT is set.
(df_get_exit_block_use_set): Ditto.
* df.h (df_scan_flags): New enum.
* shrink-wrap.c (try_shrink_wrapping_separate): Set
DF_SCAN_EMPTY_ENTRY_EXIT in df_scan->local_flags, and call
df_update_entry_block_defs and df_update_exit_block_uses
at the start; clear the flag and call those functions at the end.
Ian Lance Taylor [Wed, 16 Nov 2016 14:47:28 +0000 (14:47 +0000)]
compiler: separate incomparable types from comparable ones
Otherwise we can accidentally and incorrectly mark an actual user type
as incomparable. This fixes the gccgo version of
https://golang.org/issue/17752. The test case for gccgo is
https://golang.org/cl/33249.
Fix nb_iterations calculation in tree-vect-loop-manip.c
We previously stored the number of loop iterations rather
than the number of latch iterations.
gcc/
2016-11-15 Richard Sandiford <richard.sandiford@arm.com>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
* tree-vect-loop-manip.c (slpeel_make_loop_iterate_ntimes): Set
nb_iterations to the number of latch iterations rather than the
number of loop iterations.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r242493
The transformations made by make_compound_operation apply
only to scalar integer modes. The fix for PR70944 had enforced
that by returning early for vector modes at the top of the
function. However, the function is supposed to be recursive,
so we should continue to look at integer suboperands even if
the outer operation is a vector one.
This patch instead splits out the non-recursive parts
of make_compound_operation into a subroutine and checks
that the mode is a scalar integer before calling it.
The patch was originally written to help with the later
conversion to static type checking of mode classes, but it
also happened to reenable optimisation of things like
vec_duplicate operands.
Note that the gen_lowparts in the PLUS and MINUS cases
were redundant, since new_rtx already had mode "mode"
at those points.
gcc/
2016-11-15 Richard Sandiford <richard.sandiford@arm.com>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
* combine.c (maybe_swap_commutative_operands): New function.
(combine_simplify_rtx): Use it.
(change_zero_ext): Likewise.
(make_compound_operation_int): New function, split out of...
(make_compound_operation): ...here. Use
maybe_swap_commutative_operands for both.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r242492
Richard Earnshaw [Wed, 16 Nov 2016 14:02:10 +0000 (14:02 +0000)]
[arm] Add vfpv2 and neon-vfpv3
* arm/arm-fpus.def (vfpv2): New FPU, currently an alias for 'vfp'.
(neon-vfpv3): New FPU, currently an alias for 'neon'.
* arm/arm-tables.opt: Regenerated.
* arm/t-aprofile (MULTILIB_REUSE): Add reuse rules for vfpv2 and
neon-vfpv3.
* doc/invoke.texi (ARM: -mfpu): Document new options. Note that 'vfp'
and 'neon' are aliases for specific implementations.
re PR fortran/78356 ([OOP] segfault allocating polymorphic variable with polymorphic component with allocatable component)
gcc/fortran/ChangeLog:
2016-11-16 Andre Vehreschild <vehre@gcc.gnu.org>
PR fortran/78356
* class.c (gfc_is_class_scalar_expr): Prevent taking an array ref for
a component ref.
* trans-expr.c (gfc_trans_assignment_1): Ensure a reference to the
object to copy is generated, when assigning class objects.
gcc/testsuite/ChangeLog:
2016-11-16 Andre Vehreschild <vehre@gcc.gnu.org>
PR fortran/78356
* gfortran.dg/class_allocate_23.f08: New test.
vec_cmps assign the result of a vector comparison to a mask.
The optab was called with the destination having mode mask_mode
but with the source (the comparison) having mode VOIDmode,
which led to invalid rtl if the source operand was used directly.
gcc/
2016-11-15 Richard Sandiford <richard.sandiford@arm.com>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
* optabs.c (vector_compare_rtx): Add a cmp_mode parameter
and use it in the final call to gen_rtx_fmt_ee.
(expand_vec_cond_expr): Update accordingly.
(expand_vec_cmp_expr): Likewise.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r242489
local_cprop_find_used_regs punted on all multiword registers,
with the comment:
/* Setting a subreg of a register larger than word_mode leaves
the non-written words unchanged. */
But this only applies if the outer mode is smaller than the
inner mode. If they're the same size then writes to the subreg
are a normal full update.
This patch uses df_read_modify_subreg_p instead. A later patch
adds more uses of the same routine, but this part had a (positive)
effect on code generation for the testsuite whereas the others
seemed to be simple clean-ups.
gcc/
2016-11-15 Richard Sandiford <richard.sandiford@arm.com>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
* cprop.c (local_cprop_find_used_regs): Use df_read_modify_subreg_p.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r242488
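The distinction drawn in the cprop change above can be modelled with plain sizes. This is a sketch under assumptions: the function and parameter names are hypothetical, and the extra comparison against the natural register size is my recollection of how df_read_modify_subreg_p behaves rather than something stated in the message.

```cpp
// A write through a subreg only preserves part of the inner register
// (a read-modify-write) when the outer mode is narrower than the inner
// one and the inner register spans more than one natural-size register.
// A same-size subreg write is an ordinary full update.
bool read_modify_subreg(unsigned outer_size, unsigned inner_size,
                        unsigned natural_reg_size)
{
  return inner_size > outer_size && inner_size > natural_reg_size;
}
```

Under this model a 4-byte write into an 8-byte register on a 4-byte-word target is a partial update, while an 8-byte write into the same register is not.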
Andrew Burgess [Wed, 16 Nov 2016 11:42:43 +0000 (11:42 +0000)]
[ARC] Fix LE tests for nps400 variant.
gcc/arc: New peephole2 and little endian arc test fixes
Resolve some test failures introduced for little endian arc as a result
of the recent arc/nps400 additions.
There's a new peephole2 optimisation to merge together two zero_extracts
in order that the movb instruction can be used.
One of the test cases is extended so that the test does something
meaningful in both big and little endian arc mode.
Other tests have their expected results updated to reflect improvements
in other areas of GCC.
gcc/ChangeLog:
Andrew Burgess <andrew.burgess@embecosm.com>
* config/arc/arc.md (movb peephole2): New peephole2 to merge two
zero_extract operations to allow a movb to occur.
* gcc.target/arc/movb-1.c: Update little endian arc results.
* gcc.target/arc/movb-2.c: Likewise.
* gcc.target/arc/movb-5.c: Likewise.
* gcc.target/arc/movh_cl-1.c: Extend test to cover little endian
arc.
Fix PR78294 - thread sanitizer broken when using ld.gold
When one uses ld.gold to build gcc, the thread sanitizer doesn't work,
because gold is more conservative when applying TLS relaxations than
ld.bfd. In this case a missing initial-exec attribute on a declaration
causes gcc to assume the general dynamic model. With ld.bfd this gets
relaxed to initial exec when linking the shared library, so the missing
attribute doesn't matter. But ld.gold doesn't perform this optimization
and this leads to crashes on tsan instrumented binaries.
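A minimal sketch of what "an initial-exec attribute on a declaration" looks like with the GNU tls_model attribute. The variable and function names are hypothetical and unrelated to the sanitizer sources.

```cpp
// Declaration as it might appear in a header: the attribute asks the
// compiler for the initial-exec TLS model instead of leaving the choice
// (possibly general dynamic) to the defaults.
extern __thread int call_depth __attribute__((tls_model("initial-exec")));

// Matching definition.
__thread int call_depth __attribute__((tls_model("initial-exec"))) = 0;

// Each thread gets its own counter.
int enter_call()
{
  return ++call_depth;
}
```

With the attribute present on every declaration, the generated accesses do not depend on the linker relaxing general-dynamic sequences.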
The CONCAT handling in emit_group_load chooses between doing
an extraction from a single component or forcing the whole
thing to memory and extracting from there. The condition for
the former (more efficient) option was:
On the one hand this seems dangerous, since the second line
allows bit ranges that start in the first component and leak
into the second. On the other hand it seems strange to allow
references that start after the first byte of the second
component but not those that start after the first byte
of the first component. This led to a pessimisation of
things like gcc.dg/builtins-54.c for hppa64-hp-hpux11.23.
This patch simply checks whether the reference is contained
within a single component. It also makes sure that we do
an extraction on anything that doesn't span the whole
component (even if it's constant).
gcc/
2016-11-15 Richard Sandiford <richard.sandiford@arm.com>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
* expr.c (emit_group_load_1): Tighten check for whether an
access involves only one operand of a CONCAT. Use extract_bit_field
for constants if the bit range does span the whole operand.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r242477
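The tightened test in emit_group_load_1 amounts to asking whether a byte range lies entirely inside one operand of the CONCAT. A sketch with hypothetical names, using plain byte counts rather than RTL:

```cpp
// Return 0 or 1 if the range [bytepos, bytepos + bytelen) is contained
// in the first or second component respectively, and -1 if it is empty,
// straddles the boundary, or runs past the end.
int concat_component(unsigned bytepos, unsigned bytelen,
                     unsigned len0, unsigned len1)
{
  if (bytelen == 0)
    return -1;
  const unsigned end = bytepos + bytelen;
  if (end <= len0)
    return 0;
  if (bytepos >= len0 && end <= len0 + len1)
    return 1;
  return -1;
}
```

A range that starts partway through either component is accepted as long as it stays inside that component, which is the symmetry the message argues for.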
Fix handling of unknown sizes in rtx_addr_can_trap_p
If the size passed in to rtx_addr_can_trap_p was zero, the frame
handling would get the size from the mode instead. However, this
too can be zero if the mode is BLKmode, i.e. if we have a BLKmode
memory reference with no MEM_SIZE (which should be rare these days).
This meant that the conditions for a 4-byte access at offset X were
stricter than those for an access of unknown size at offset X.
This patch checks whether the size is still zero, as the
SYMBOL_REF handling does.
gcc/
2016-11-15 Richard Sandiford <richard.sandiford@arm.com>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
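A simplified model of the size handling described above. All names are hypothetical, the frame is reduced to a [frame_low, frame_high) byte range, and treating a still-unknown size as "may trap" is an assumption about the conservative answer, not a quote of the GCC code.

```cpp
// SIZE == 0 means "unknown"; MODE_SIZE is the fallback taken from the
// mode and may itself be 0 (as for BLKmode).
bool frame_access_may_trap(long long offset, long long size,
                           long long mode_size,
                           long long frame_low, long long frame_high)
{
  if (size == 0)
    size = mode_size;
  if (size == 0)
    return true;  // still unknown: assume the access may trap
  return offset < frame_low || offset + size > frame_high;
}
```

In this model an access of unknown size is never treated more leniently than a 4-byte access at the same offset.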
Fix nb_iterations_estimate calculation in tree-vect-loop.c
vect_transform_loop has to reduce three iteration counts by
the vectorisation factor: nb_iterations_upper_bound,
nb_iterations_likely_upper_bound and nb_iterations_estimate.
All three are latch execution counts rather than loop body
execution counts. The calculations were taking that into
account for the first two, but not for nb_iterations_estimate.
This patch updates the way the calculations are done to fix
this and to add a bit more commentary about what is going on.
gcc/
2016-11-15 Richard Sandiford <richard.sandiford@arm.com>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
* tree-vect-loop.c (vect_transform_loop): Protect the updates of
all three iteration counts with an any_* test. Use a single update
for each count. Fix the calculation of nb_iterations_estimate.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r242475
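To see why latch counts cannot simply be divided by the vectorisation factor, here is a small arithmetic sketch. The function name is hypothetical, and the model ignores peeling and epilogue handling; it is not the code in vect_transform_loop.

```cpp
#include <cassert>
#include <cstdint>

// A latch count is one less than the number of loop body executions.
// Convert to a body count, divide by VF (rounding down to full vector
// iterations), then convert back to a latch count.
std::uint64_t vectorized_latch_count(std::uint64_t latch_count, unsigned vf)
{
  assert(vf > 0);
  const std::uint64_t body_count = latch_count + 1;
  const std::uint64_t vector_body_count = body_count / vf;
  return vector_body_count == 0 ? 0 : vector_body_count - 1;
}
```

For a latch count of 16 and VF 4 the body runs 17 times, giving 4 full vector iterations and a latch count of 3, whereas dividing the latch count directly would give 4.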
Kyrylo Tkachov [Wed, 16 Nov 2016 09:02:18 +0000 (09:02 +0000)]
[ARM] PR target/78364: Add proper restrictions to zero and sign_extract patterns operands
PR target/78364
* config/arm/arm.md (*extv_reg): Restrict operands 2 and 3 to the
proper ranges for an SBFX instruction.
(extzv_t2): Likewise for UBFX.
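For reference, a sketch of the operand ranges an SBFX or UBFX instruction accepts on a 32-bit register (lsb in 0..31, width in 1..32-lsb), together with a software model of the signed extraction. The function names are hypothetical and this is not the arm.md predicate.

```cpp
#include <cstdint>

// True if (lsb, width) describes a bit-field that fits in 32 bits.
bool valid_bitfield_operands(int lsb, int width)
{
  return lsb >= 0 && lsb <= 31 && width >= 1 && width <= 32 - lsb;
}

// Software model of a signed bit-field extract for valid operands.
std::int32_t sbfx_model(std::uint32_t value, int lsb, int width)
{
  const std::uint32_t mask =
      width == 32 ? 0xFFFFFFFFu : ((1u << width) - 1u);
  const std::uint32_t field = (value >> lsb) & mask;
  const std::uint32_t sign = 1u << (width - 1);
  // Classic sign-extension trick: flip the sign bit, then subtract it.
  return static_cast<std::int32_t>(
      static_cast<std::int64_t>(field ^ sign) - static_cast<std::int64_t>(sign));
}
```

Operand pairs outside these ranges have no valid encoding, which is why the patterns must reject them.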
Richard Biener [Wed, 16 Nov 2016 08:42:20 +0000 (08:42 +0000)]
re PR tree-optimization/78348 ([7 REGRESSION] 15% performance drop for coremark-pro/nnet-test after r242038)
2016-11-16 Richard Biener <rguenther@suse.de>
PR tree-optimization/78348
* tree-loop-distribution.c (enum partition_kind): Add PKIND_MEMMOVE.
(generate_memcpy_builtin): Honor PKIND_MEMCPY on the partition.
(classify_partition): Set PKIND_MEMCPY if dependence analysis
revealed no dependency, PKIND_MEMMOVE otherwise.
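A small example of a copy loop whose source and destination overlap, so it can only be expressed as memmove and not memcpy. The function names are hypothetical; whether loop distribution actually replaces such a loop depends on options and GCC version.

```cpp
#include <cstddef>
#include <cstring>

// Shift the array left by one element. Each iteration reads a[i + 1]
// before any later iteration overwrites it.
void shift_left_loop(int *a, std::size_t n)
{
  for (std::size_t i = 0; i + 1 < n; ++i)
    a[i] = a[i + 1];
}

// Same effect via memmove, which is defined for overlapping ranges.
// memcpy would be undefined here because source and destination overlap.
void shift_left_memmove(int *a, std::size_t n)
{
  if (n > 1)
    std::memmove(a, a + 1, (n - 1) * sizeof(int));
}
```

When dependence analysis can prove the ranges are disjoint, the cheaper memcpy is the better choice, which is the distinction the new partition kind records.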
Jakub Jelinek [Wed, 16 Nov 2016 08:28:50 +0000 (09:28 +0100)]
re PR sanitizer/77823 (ICE: in ubsan_encode_value, at ubsan.c:137 with -fsanitize=undefined and vector types)
PR sanitizer/77823
* ubsan.c (ubsan_build_overflow_builtin): Add DATAP argument, if
it points to non-NULL tree, use it instead of ubsan_create_data.
(instrument_si_overflow): Handle vector signed integer overflow
checking.
* ubsan.h (ubsan_build_overflow_builtin): Add DATAP argument.
* tree-vrp.c (simplify_internal_call_using_ranges): Punt for
vector IFN_UBSAN_CHECK_*.
* internal-fn.c (expand_addsub_overflow): Add DATAP argument,
pass it through to ubsan_build_overflow_builtin.
(expand_neg_overflow, expand_mul_overflow): Likewise.
(expand_vector_ubsan_overflow): New function.
(expand_UBSAN_CHECK_ADD, expand_UBSAN_CHECK_SUB,
expand_UBSAN_CHECK_MUL): Use it for vector arithmetics.
(expand_arith_overflow): Adjust expand_*_overflow callers.
* c-c++-common/ubsan/overflow-vec-1.c: New test.
* c-c++-common/ubsan/overflow-vec-2.c: New test.
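Conceptually, checking a vector addition for signed overflow means checking each lane. The sketch below models that with the GCC/Clang builtin __builtin_add_overflow over plain arrays; the function name is hypothetical and this is not how the sanitizer expands the check internally.

```cpp
#include <climits>
#include <cstddef>

// Return true if adding any pair of corresponding lanes overflows int.
bool any_lane_add_overflows(const int *a, const int *b, std::size_t lanes)
{
  for (std::size_t i = 0; i < lanes; ++i)
    {
      int sum;
      if (__builtin_add_overflow(a[i], b[i], &sum))
        return true;
    }
  return false;
}
```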
* genattrtab.c (attr_rtx_1): Avoid allocating new rtx objects.
Clear ATTR_CURR_SIMPLIFIED_P for re-used binary rtx objects.
Use DEF_ATTR_STRING for string arguments. Use RTL_HASH for
integer arguments. Only set ATTR_PERMANENT_P on newly hashed
rtx when all sub-rtx are also permanent.
(attr_eq): Simplify.
(attr_copy_rtx): Remove.
(make_canonical, get_attr_value): Use attr_equal_p.
(copy_boolean): Rehash NOT.
(simplify_test_exp_in_temp,
optimize_attrs): Remove call to attr_copy_rtx.
(attr_alt_intersection, attr_alt_union,
attr_alt_complement, mk_attr_alt): Rehash EQ_ATTR_ALT.
(make_automaton_attrs): Use attr_eq.
Jonathan Wakely [Tue, 15 Nov 2016 19:32:44 +0000 (19:32 +0000)]
Make std::tuple_size<cv T> SFINAE-friendly (LWG 2770)
* doc/xml/manual/intro.xml: Document LWG 2770 status. Remove entries
for 2742 and 2748.
* doc/html/*: Regenerate.
* include/std/utility (__tuple_size_cv_impl): New helper to safely
detect tuple_size<T>::value, as per LWG 2770.
(tuple_size<cv T>): Adjust partial specializations to derive from
__tuple_size_cv_impl.
* testsuite/20_util/tuple/cv_tuple_size.cc: Test SFINAE-friendliness.
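What SFINAE-friendliness means in practice: a detection trait can ask whether tuple_size<const T>::value exists without triggering a hard error when T is not tuple-like. A sketch, assuming a C++17 library; the trait name has_tuple_size is hypothetical.

```cpp
#include <tuple>
#include <type_traits>
#include <utility>

// Primary template: assume no usable tuple_size<T>::value.
template <class T, class = void>
struct has_tuple_size : std::false_type {};

// Chosen only when tuple_size<T>::value is a valid expression.
template <class T>
struct has_tuple_size<T, std::void_t<decltype(std::tuple_size<T>::value)>>
    : std::true_type {};
```

With the LWG 2770 behaviour, has_tuple_size<const int> is simply false instead of failing to compile.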
Mark Wielaard [Tue, 15 Nov 2016 19:31:59 +0000 (19:31 +0000)]
libiberty: demangler crash with missing ?: or fold expression component.
When constructing a ?: or fold expression that requires a third
expression only the first and second were explicitly checked to
not be NULL. Since the third expression is also required in these
constructs it needs to be explicitly checked and rejected when missing.
Otherwise the demangler will crash once it tries to d_print the
NULL component. Added two examples to demangle-expected of strings
that would crash before this fix.
Mark Wielaard [Tue, 15 Nov 2016 19:31:50 +0000 (19:31 +0000)]
libiberty: Fix some demangler crashes caused by reading past end of input.
In various situations the cplus_demangle () function could read past the
end of input causing crashes. Add checks in various places to not advance
the demangle string location and fail early when end of string is reached.
Add various examples of input strings to the testsuite that would crash
test-demangle before the fixes.
Found by using the American Fuzzy Lop (afl) fuzzer.
libiberty/ChangeLog:
* cplus-dem.c (demangle_signature): After 'H', template function,
no success and don't advance position if end of string reached.
(demangle_template): After 'z', template name, return zero on
premature end of string.
(gnu_special): Guard strchr against searching for zero characters.
(do_type): If member, only advance mangled string when 'F' found.
* testsuite/demangle-expected: Add examples of strings that could
crash the demangler by reading past end of input.
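The strchr guard is needed because searching for the terminating zero always succeeds: strchr returns a pointer to the string's terminator. A minimal sketch of a guarded membership test; the helper name in_set is hypothetical and not taken from cplus-dem.c.

```cpp
#include <cstring>

// True if C is one of the characters in SET. The explicit check for
// '\0' matters: strchr (set, '\0') would otherwise report a match at
// the terminator, letting a parser step past the end of its input.
bool in_set(const char *set, char c)
{
  return c != '\0' && std::strchr(set, c) != nullptr;
}
```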
Uros Bizjak [Tue, 15 Nov 2016 19:26:41 +0000 (20:26 +0100)]
funcspec-56.inc: New file.
* gcc.target/i386/funcspec-56.inc: New file.
* gcc.target/i386/funcspec-5.c: Include funcspec-56.inc. Remove
common 32-bit and 64-bit function specific options.
* gcc.target/i386/funcspec-6.c: Ditto.
Several definitions of INCOMING_RETURN_ADDR_RTX used
gen_rtx_REG (VOIDmode, ...), which with later patches
would trip an assert. This patch converts them to use
Pmode instead.
gcc/
2016-11-15 Richard Sandiford <richard.sandiford@arm.com>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
After simplifying the operands of a PLUS, canon_rtx checked only
for cases in which one of the simplified operands was a constant,
falling back to gen_rtx_PLUS otherwise. This left the PLUS in a
non-canonical order if one of the simplified operands was
(plus (reg R1) (const_int X)); we'd end up with:
(plus (plus (reg R1) (const_int Y)) (reg R2))
rather than:
(plus (plus (reg R1) (reg R2)) (const_int Y))
Fixing this exposed new DSE opportunities on spu-elf in
gcc.c-torture/execute/builtins/strcat-chk.c but otherwise
it doesn't seem to have much practical effect.
gcc/
2016-11-15 Richard Sandiford <richard.sandiford@arm.com>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
* alias.c (canon_rtx): Use simplify_gen_binary.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r242445
LOAD_EXTEND_OP only applies to scalar integer modes that are narrower
than a word. However, callers weren't consistent about which of these
checks they made beforehand, and also weren't consistent about whether
"smaller" was based on (bit)size or precision (IMO it's the latter).
This patch adds a wrapper to try to make the macro easier to use.
LOAD_EXTEND_OP is often used to disable transformations that aren't
beneficial when extends from memory are free, so being stricter about
the check accidentally exposed more optimisation opportunities.
"SUBREG_BYTE (...) == 0" and subreg_lowpart_p are implied by
paradoxical_subreg_p, so the patch also removes some redundant tests.
The patch doesn't change reload, since different checks could have
unforeseen consequences.
gcc/
2016-11-15 Richard Sandiford <richard.sandiford@arm.com>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
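The shape of such a wrapper, modelled outside GCC: the target's answer is only consulted for scalar integer modes whose precision is below the word size. The types ExtendKind and Mode and the extra parameters are hypothetical stand-ins for GCC's rtx codes, machine modes and target macros.

```cpp
enum class ExtendKind { Unknown, Zero, Sign };

struct Mode {
  bool scalar_int;     // is this a scalar integer mode?
  unsigned precision;  // precision in bits
};

// Return the target's extension kind for loads in mode M, or Unknown
// when the query is not meaningful for M.
ExtendKind load_extend_op(Mode m, unsigned bits_per_word,
                          ExtendKind target_answer)
{
  if (m.scalar_int && m.precision < bits_per_word)
    return target_answer;
  return ExtendKind::Unknown;
}
```

Putting both checks in one place means callers no longer have to remember which of them they are responsible for.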
Fix simplify_shift_const_1 handling of vector shifts
simplify_shift_const_1 handles both shifts of scalars by scalars
and shifts of vectors by scalars. For vectors this means that
each element is shifted by the same amount.
However:
(a) the two cases weren't always distinguished, so we'd try
things for vectors that only made sense for scalars.
(b) a lot of the range and bitcount checks were based on the
bitsize or precision of the full shifted operand, rather
than the mode of each element.
Fixing (b) accidentally exposed more optimisation opportunities,
although that wasn't the point of the patch.
gcc/
2016-11-15 Richard Sandiford <richard.sandiford@arm.com>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
* combine.c (simplify_shift_const_1): Use the number of bits
in the inner mode to determine the range of the shift.
When handling shifts of vectors, skip any rules that apply
only to scalars.
Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>
From-SVN: r242442
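Point (b) in the message above comes down to which bit width bounds the shift count. A sketch with hypothetical names, using plain bit counts instead of machine modes:

```cpp
#include <cassert>

// A shift count is in range when it is smaller than the width of each
// element, not the width of the whole vector. A scalar is the special
// case num_elements == 1.
bool vector_shift_count_valid(unsigned count, unsigned vector_bits,
                              unsigned num_elements)
{
  assert(num_elements > 0 && vector_bits % num_elements == 0);
  return count < vector_bits / num_elements;
}
```

For a 128-bit vector of four 32-bit elements, a shift by 40 is out of range even though 40 is less than 128.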
Tim Shen [Tue, 15 Nov 2016 17:26:59 +0000 (17:26 +0000)]
variant: Remove variant<T&>...
* include/std/variant: Remove variant<T&>, variant<void>, variant<> support
to rebase on the post-Issaquah design.
* testsuite/20_util/variant/compile.cc: Likewise.
Jakub Jelinek [Tue, 15 Nov 2016 17:05:23 +0000 (18:05 +0100)]
decl.c (cp_finish_decomp): For DECL_NAMESPACE_SCOPE_P decl, set DECL_ASSEMBLER_NAME.
* decl.c (cp_finish_decomp): For DECL_NAMESPACE_SCOPE_P decl,
set DECL_ASSEMBLER_NAME.
* parser.c (cp_parser_decomposition_declaration): Likewise
if returning error_mark_node.
* mangle.c (mangle_decomp): New function.
* cp-tree.h (mangle_decomp): New declaration.
gcc/
* config/mips/mips.c (mips16_emit_constants): Emit `consttable'
insn at the beginning of the constant pool.
(mips_insert_insn_pseudos): New function.
(mips_machine_reorg2): Call it.
* config/mips/mips.md (unspec): Add UNSPEC_CONSTTABLE and
UNSPEC_INSN_PSEUDO enum values.
(insn_pseudo, consttable): New insns.
gcc/testsuite/
* gcc.target/mips/insn-casesi.c: New test case.
* gcc.target/mips/insn-pseudo-1.c: New test case.
* gcc.target/mips/insn-pseudo-2.c: New test case.
* gcc.target/mips/insn-pseudo-3.c: New test case.
* gcc.target/mips/insn-pseudo-4.c: New test case.
* gcc.target/mips/insn-tablejump.c: New test case.
Jonathan Wakely [Tue, 15 Nov 2016 14:33:20 +0000 (14:33 +0000)]
Add std::string constructor for substring of string_view (LWG 2742)
* doc/xml/manual/intro.xml: Document LWG 2742 status.
* doc/html/*: Regenerate.
* include/bits/basic_string.h
(basic_string(const T&, size_type, size_type, const Allocator&)): Add
constructor for substring of basic_string_view, as per LWG 2742 but
with additional constraint to fix ambiguity.
* testsuite/21_strings/basic_string/cons/char/9.cc: New test.
* testsuite/21_strings/basic_string/cons/wchar_t/9.cc: New test.
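A short usage sketch of the new constructor, assuming a C++17 library: a std::string is built directly from a position and length within a std::string_view. The helper name substring_of is hypothetical.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Copy N characters of SV starting at POS into a new std::string,
// using the (const T&, size_type, size_type) constructor.
std::string substring_of(std::string_view sv, std::size_t pos, std::size_t n)
{
  return std::string(sv, pos, n);
}
```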
Jonathan Wakely [Tue, 15 Nov 2016 14:33:09 +0000 (14:33 +0000)]
Constrain swap overload for std::optional (LWG 2748)
* doc/xml/manual/intro.xml: Document LWG 2748 status.
* include/std/optional (optional<T>::swap): Use is_nothrow_swappable_v
for exception specification.
(swap(optional<T>&, optional<T>&)): Disable when T is not swappable.
* testsuite/20_util/optional/swap/2.cc: New test.
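The visible effect of constraining the swap overload can be checked with std::is_swappable_v, assuming a C++17 library. The type Pinned is a hypothetical example of a type that can be neither copied nor moved.

```cpp
#include <optional>
#include <type_traits>

// A type with no usable copy or move operations, hence not swappable.
struct Pinned {
  Pinned() = default;
  Pinned(const Pinned &) = delete;
  Pinned &operator=(const Pinned &) = delete;
};

// With the constrained overload these are plain compile-time facts
// rather than hard errors inside the swap body.
constexpr bool optional_int_swappable =
    std::is_swappable_v<std::optional<int>>;
constexpr bool optional_pinned_swappable =
    std::is_swappable_v<std::optional<Pinned>>;
```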
Michael Matz [Tue, 15 Nov 2016 14:02:28 +0000 (14:02 +0000)]
re PR target/77881 (Non-optimal signed comparison on x86_64 since r146817)
PR missed-optimization/77881
* combine.c (simplify_comparison): Remove useless subregs
also inside the loop, not just after it.
(make_compound_operation): Recognize some subregs as being
masking as well.
Jason Merrill [Tue, 15 Nov 2016 05:22:28 +0000 (00:22 -0500)]
Various C++17 decomposition fixes.
* tree.c (bitfield_p): New.
* cp-tree.h: Declare it.
* typeck.c (cxx_sizeof_expr, cxx_alignof_expr)
(cp_build_addr_expr_1): Use it instead of DECL_C_BIT_FIELD.
* decl.c (cp_finish_decomp): Look through reference. Always
SET_DECL_DECOMPOSITION_P.
* semantics.c (finish_decltype_type): Adjust decomposition handling.
Ian Lance Taylor [Mon, 14 Nov 2016 23:16:04 +0000 (23:16 +0000)]
runtime: don't crash if signal handler info argument is nil
Apparently on Solaris 10 a SA_SIGINFO signal handler can be invoked with
a nil info argument. I would not have believed it but I've now seen it
happen, and the sigaction man page actually says "If the second argument
is not equal to NULL, it points to a siginfo_t structure...." So, if
that happens, don't crash.
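The same defensive pattern expressed against the POSIX interface, as an illustrative sketch rather than the Go runtime code: an SA_SIGINFO-style handler that tolerates a null info pointer. The handler and variable names are hypothetical.

```cpp
#include <signal.h>

volatile sig_atomic_t saw_null_info = 0;
volatile sig_atomic_t last_signal_code = 0;

// Handler with the three-argument SA_SIGINFO signature. INFO is
// checked before use because some systems may pass a null pointer.
extern "C" void handle_signal(int, siginfo_t *info, void *)
{
  if (info == nullptr)
    {
      saw_null_info = 1;
      return;
    }
  last_signal_code = info->si_code;
}
```

The handler would be installed with sigaction and the SA_SIGINFO flag; the null check costs nothing on systems that always supply the structure.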
Also fix another case where we want to make sure that &T{} does not
allocate.
Implement P0504R0 (Revisiting in-place tag types for any/optional/variant).
Implement P0504R0 (Revisiting in-place tag types for
any/optional/variant).
* include/std/any (any(_ValueType&& __value)): Constrain
the __is_in_place_type with the decayed type.
(make_any): Adjust to use the new tag type.
* include/std/utility (in_place_tag): Remove.
(in_place_t): Turn into a non-reference tag type.
(__in_place, __in_place_type, __in_place_index): Remove.
(in_place): Turn into an inline variable of non-reference
tag type.
(in_place<_Tp>): Remove.
(in_place_index<_Idx>): Remove.
(in_place_type_t): New.
(in_place_type): Turn into a variable template of non-reference
type.
(in_place_index_t): New.
(in_place_index): Turn into a variable template of non-reference
type.
* include/std/variant
(_Variant_storage(in_place_index_t<_Np>, _Args&&...)): Adjust to
use the new tag type.
(_Union(in_place_index_t<0>, _Args&&...)): Likewise.
(_Union(in_place_index_t<_Np>, _Args&&...)): Likewise.
(_Variant_base()): Likewise.
(variant(_Tp&&)): Likewise.
(variant(in_place_type_t<_Tp>, _Args&&...)): Likewise.
(variant(in_place_type_t<_Tp>, initializer_list<_Up>,
_Args&&...)): Likewise.
(variant(in_place_index_t<_Np>, _Args&&...)): Likewise.
(variant(in_place_index_t<_Np>, initializer_list<_Up>,
_Args&&...)): Likewise.
(variant(allocator_arg_t, const _Alloc&)): Likewise.
(variant(allocator_arg_t, const _Alloc&, _Tp&&)): Likewise.
(variant(allocator_arg_t, const _Alloc&, in_place_type_t<_Tp>,
_Args&&...)): Likewise.
(variant(allocator_arg_t, const _Alloc&, in_place_type_t<_Tp>,
initializer_list<_Up>, _Args&&...)): Likewise.
(variant(allocator_arg_t, const _Alloc&, in_place_index_t<_Np>,
_Args&&...)): Likewise.
(variant(allocator_arg_t, const _Alloc&, in_place_index_t<_Np>,
initializer_list<_Up>, _Args&&...)): Likewise.
(emplace(_Args&&...)): Likewise.
(emplace(initializer_list<_Up>, _Args&&...)): Likewise.
* testsuite/20_util/any/cons/explicit.cc: Likewise.
* testsuite/20_util/any/cons/in_place.cc: Likewise.
* testsuite/20_util/any/requirements.cc: Add tests to
check that any is not constructible from the new in_place_type_t
of any value category.
* testsuite/20_util/in_place/requirements.cc: Adjust to
use the new tag type.
* testsuite/20_util/variant/compile.cc: Likewise.
* testsuite/20_util/variant/run.cc: Likewise.
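How the tag types look from user code once they are ordinary objects and variable templates, assuming a C++17 library. The factory function names are hypothetical and exist only to keep the sketch self-contained.

```cpp
#include <any>
#include <optional>
#include <string>
#include <utility>
#include <variant>

// std::in_place is an object of type std::in_place_t.
std::optional<std::string> make_optional_xxx()
{
  return std::optional<std::string>(std::in_place, 3, 'x');
}

// std::in_place_type<T> is a variable template of type in_place_type_t<T>.
std::variant<int, std::string> make_variant_by_type()
{
  return std::variant<int, std::string>(std::in_place_type<std::string>, "hi");
}

// std::in_place_index<I> is a variable template of type in_place_index_t<I>.
std::variant<int, std::string> make_variant_by_index()
{
  return std::variant<int, std::string>(std::in_place_index<1>, 2, 'y');
}

std::any make_any_int()
{
  return std::any(std::in_place_type<int>, 42);
}
```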
tree-ssa-math-opts.c (find_bswap_or_nop): Zero out bytes in cmpxchg and cmpnop in two steps...
2016-11-14 Thomas Preud'homme <thomas.preudhomme@arm.com>
gcc/
* tree-ssa-math-opts.c (find_bswap_or_nop): Zero out bytes in cmpxchg
and cmpnop in two steps: first the ones not accessed in original gimple
expression in an endian-independent way and then the ones not accessed
in the final result in an endian-specific way.
(bswap_replace): Stop doing big endian adjustment.