Ilya Enkovich [Tue, 10 Nov 2015 12:17:30 +0000 (12:17 +0000)]
optabs.c (expand_binop_directly): Allow scalar mode for vec_pack_trunc_optab.
gcc/
* optabs.c (expand_binop_directly): Allow scalar mode for
vec_pack_trunc_optab.
* tree-vect-loop.c (vect_determine_vectorization_factor): Skip
boolean vector producers from pattern sequence when computing VF.
* tree-vect-patterns.c (vect_vect_recog_func_ptrs) Add
vect_recog_mask_conversion_pattern.
(search_type_for_mask): Choose the smallest
type if different size types are mixed.
(build_mask_conversion): New.
(vect_recog_mask_conversion_pattern): New.
(vect_pattern_recog_1): Allow scalar mode for boolean vectype.
* tree-vect-stmts.c (vectorizable_mask_load_store): Support masked
load with pattern.
(vectorizable_conversion): Support boolean vectors.
(free_stmt_vec_info): Allow patterns for statements with no lhs.
* tree-vectorizer.h (NUM_PATTERNS): Increase to 14.
Richard Biener [Tue, 10 Nov 2015 10:14:02 +0000 (10:14 +0000)]
re PR tree-optimization/68240 (compilation hangs on valid code at -O1 and above on x86_64-linux-gnu)
2015-11-10 Richard Biener <rguenther@suse.de>
PR tree-optimization/68240
* tree-ssa-sccvn.c (cond_stmts_equal_p): Handle commutative compares
properly.
(visit_phi): For PHIs with just a single executable edge
take its value directly.
(expressions_equal_p): Handle VN_TOP properly.
Kyrylo Tkachov [Tue, 10 Nov 2015 09:37:51 +0000 (09:37 +0000)]
[AArch64][2/3] Implement negcc, notcc optabs
* config/aarch64/aarch64.md (<neg_not_op><mode>cc): New define_expand.
* config/aarch64/iterators.md (NEG_NOT): New code iterator.
(neg_not_op): New code attribute.
Robert Suchanek [Tue, 10 Nov 2015 09:12:52 +0000 (09:12 +0000)]
Tie chains for move instructions.
gcc/
* regrename.c (create_new_chain): Initialize renamed and tied_chain.
(build_def_use): Initialize terminated_this_insn.
(find_best_rename_reg): Pick and check register from the tied chain.
(regrename_do_replace): Mark head as renamed.
(struct du_head *terminated_this_insn). New static variable.
(scan_rtx_reg): Tie chains in move insns. Set terminated_this_insn.
* regrename.h (struct du_head): Add tied_chain, renamed members.
> This is causing a bootstrap comparison failure in gcc/go/gogo.o.
I've had a look at this and the trigger is the
aarch64_use_constant_blocks_p change which appears to be causing a
bootstrap comparison failure because of differences to offsets when
built with debug and without debug. I don't think the problem is
specifically in the backend but this needs some careful
investigation. For now, in the interest of go bootstraps continuing on
trunk - I'm proposing a patch that partially rolls back the change in
aarch64_use_constant_blocks_p and am still looking into the issue but
it will take me some more time to get to the bottom of the issue.
Bootstrapped on aarch64-none-linux-gnu including (c,c++ and go) -
testing finished ok.
parser.c (cp_finalize_oacc_routine): New boolean first argument.
gcc/cp/
* parser.c (cp_finalize_oacc_routine): New boolean first argument.
(cp_ensure_no_oacc_routine): Update call to cp_finalize_oacc_routine.
(cp_parser_simple_declaration): Maintain a boolean first to keep track
of each new declarator. Propagate it to cp_parser_init_declarator.
(cp_parser_init_declarator): New boolean first argument. Propagate it
to cp_parser_save_member_function_body and cp_finalize_oacc_routine.
(cp_parser_member_declaration): Likewise.
(cp_parser_single_declaration): Update call to
cp_parser_init_declarator.
(cp_parser_save_member_function_body): New boolean first_decl argument.
Propagate it to cp_finalize_oacc_routine.
(cp_parser_finish_oacc_routine): New boolean first argument. Use it to
determine if multiple declarators follow a routine construct.
(cp_parser_oacc_routine): Update call to cp_parser_finish_oacc_routine.
gcc/testsuite/
* c-c++-common/goacc/routine-5.c: Enable c++ tests.
Martin Sebor [Tue, 10 Nov 2015 02:23:34 +0000 (02:23 +0000)]
PR c++/67913 - new expression with negative size not diagnosed
PR c++/67913 - new expression with negative size not diagnosed
PR c++/67927 - array new expression with excessive number of elements
not diagnosed
gcc/cp/
* call.c (build_operator_new_call): Do not assume size_check
is non-null, analogously to the top half of the function.
* init.c (build_new_1): Detect and diagnose array sizes in
excess of the maximum of roughly SIZE_MAX / 2.
Insert a runtime check only for arrays with a non-constant size.
(build_new): Detect and diagnose negative array sizes.
gcc/testsuite/
* init/new45.C: New test to verify that operator new is invoked
with or without overhead for a cookie.
* init/new44.C: New test for placement new expressions for arrays
with excessive number of elements.
* init/new43.C: New test for placement new expressions for arrays
with negative number of elements.
* other/new-size-type.C: Expect array new expression with
an excessive number of elements to be rejected.
Michael Meissner [Tue, 10 Nov 2015 00:04:03 +0000 (00:04 +0000)]
constraints.md (wF constraint): New constraints for power9/toc fusion.
[gcc]
2015-11-08 Michael Meissner <meissner@linux.vnet.ibm.com>
* config/rs6000/constraints.md (wF constraint): New constraints
for power9/toc fusion.
(wG constraint): Likewise.
* config/rs6000/predicates.md (u6bit_cint_operand): New
predicate, recognize 0..63.
(upper16_cint_operand): New predicate for power9 and toc fusion.
(fpr_reg_operand): Likewise.
(toc_fusion_or_p9_reg_operand): Likewise.
(toc_fusion_mem_raw): Likewise.
(toc_fusion_mem_wrapped): Likewise.
(fusion_gpr_addis): If power9 fusion, allow fusion for a larger
address range.
(fusion_gpr_mem_combo): Delete, use fusion_addis_mem_combo_load
instead.
(fusion_addis_mem_combo_load): Add support for power9 fusion of
floating point loads, floating point stores, and gpr stores.
(fusion_addis_mem_combo_store): Likewise.
(fusion_offsettable_mem_operand): Likewise.
* config/rs6000/rs6000.c (struct rs6000_reg_addr): Add new
elements for power9 fusion.
(rs6000_debug_print_mode): Rework debug information to print more
information about fusion.
(rs6000_init_hard_regno_mode_ok): Setup for power9 fusion
support.
(rs6000_legitimate_address_p): Recognize toc fusion as a valid
offsettable memory address.
(rs6000_rtx_costs): Update costs for new ISA 3.0 instructions.
(emit_fusion_gpr_load): Move most of the code from
emit_fusion_gpr_load into emit_fusion-addis that handles both
power8 and power9 fusion.
(emit_fusion_addis): Likewise.
(emit_fusion_load_store): Likewise.
(fusion_wrap_memory_address): Add support for TOC fusion.
(fusion_split_address): Likewise.
(fusion_p9_p): Add support for power9 fusion.
(expand_fusion_p9_load): Likewise.
(expand_fusion_p9_store): Likewise.
(emit_fusion_p9_load): Likewise.
(emit_fusion_p9_store): Likewise.
* config/rs6000/rs6000.h (TARGET_EXTSWSLI): Macros for support for
new instructions in ISA 3.0.
(TARGET_CTZ): Likewise.
(TARGET_TOC_FUSION_INT): Macros for power9 fusion support.
(TARGET_TOC_FUSION_FP): Likewise.
* config/rs6000/rs6000.md (UNSPEC_FUSION_P9): New power9/toc
fusion unspecs.
(UNSPEC_FUSION_ADDIS): Likewise.
(QHSI mode iterator): New iterator for power9 fusion.
(GPR_FUSION): Likewise.
(FPR_FUSION): Likewise.
(mod<mode>3): Add support for ISA 3.0
modulus instructions.
(umod<mode>3): Likewise.
(divmod peephole): Likewise.
(udivmod peephole): Likewise.
(ctz<mode>2): Add support for ISA 3.0 count trailing zeros scalar
instructions.
(ctz<mode>2_h): Likewise.
(ashdi3_extswsli): Add support for ISA 3.0 EXTSWSLI instruction.
(ashdi3_extswsli_dot): Likewise.
(ashdi3_extswsli_dot2): Likewise.
(power9 fusion splitter): New power9/toc fusion support.
(toc_fusionload_<mode>): Likewise.
(toc_fusionload_di): Likewise.
(fusion_gpr_load_<mode>): Update predicate function.
(power9 fusion peephole2s): New power9/toc fusion support.
(fusion_gpr_<P:mode>_<GPR_FUSION:mode>_load): Likewise.
(fusion_gpr_<P:mode>_<GPR_FUSION:mode>_store): Likewise.
(fusion_fpr_<P:mode>_<FPR_FUSION:mode>_load): Likewise.
(fusion_fpr_<P:mode>_<FPR_FUSION:mode>_store): Likewise.
(fusion_p9_<mode>_constant): Likewise.
[gcc/testsuite]
2015-11-08 Michael Meissner <meissner@linux.vnet.ibm.com>
* lib/target-supports.exp (check_p8vector_hw_available): Split
long line.
(check_vsx_hw_available): Likewise.
(check_p9vector_hw_available): Add new checks for ISA 3.0 hardware
support and for PowerPC float128 support.
(check_p9modulo_hw_available): Likewise.
(check_ppc_float128_sw_available): Likewise.
(check_ppc_float128_hw_available): Likewise.
(check_effective_target_powerpc_p9vector_ok): Likewise.
(check_effective_target_powerpc_p9modulo_ok): Likewise.
(check_effective_target_powerpc_float128_sw_ok): Likewise.
(check_effective_target_powerpc_float128_hw_ok): Likewise.
(is-effective-target): Add new PowerPc targets.
(is-effective-target-keyword): Likewise.
(check_vect_support_and_set_flags): If we have ISA 3.0 vector
instructions, use it.
* gcc.target/powerpc/mod-1.c: New test for ISA 3.0 instructions.
* gcc.target/powerpc/mod-2.c: Likewise.
* gcc.target/powerpc/ctz-1.c: Likewise.
* gcc.target/powerpc/ctz-2.c: Likewise.
* gcc.target/powerpc/extswsli-1.c: Likewise.
* gcc.target/powerpc/extswsli-2.c: Likewise.
* gcc.target/powerpc/extswsli-3.c: Likewise.
* gcc.target/powerpc/fusion.c (fusion_vector): Move to fusion2.c
and allow the test on PowerPC LE.
* gcc.target/powerpc/fusion2.c (fusion_vector): Likewise.
* gcc.target/powerpc/fusion3.c: New file, test power9 fusion.
* gcc.target/powerpc/float128-call.c: Use powerpc_float128_sw_ok
check instead of powerpc_vsx_ok.
* gcc.target/powerpc/float128-mix.c: Likewise.
* haifa-sched.c (setup_sched_dump): Don't redirect output to stderr.
* common.opt (-fsched-verbose): Set default value to 1.
* invoke.texi (-fsched-verbose): Update the option's description.
* config/rs6000/rs6000.c (power9_cost): Initial cost setup for
power9.
(rs6000_debug_reg_global): Add support for power9 fusion.
(rs6000_setup_reg_addr_masks): Cache mode size.
(rs6000_option_override_internal): Until real power9 tuning is
added, use -mtune=power8 for -mcpu=power9.
(rs6000_setup_reg_addr_masks): Do not allow pre-increment,
pre-decrement, or pre-modify on SFmode/DFmode if we allow the use
of Altivec registers.
(rs6000_option_override_internal): Add support for ISA 3.0
switches.
(rs6000_loop_align): Add support for power9 cpu.
(rs6000_file_start): Likewise.
(rs6000_adjust_cost): Likewise.
(rs6000_issue_rate): Likewise.
(insn_must_be_first_in_group): Likewise.
(insn_must_be_last_in_group): Likewise.
(force_new_group): Likewise.
(rs6000_register_move_cost): Likewise.
(rs6000_opt_masks): Likewise.
Martin Liska [Mon, 9 Nov 2015 15:45:59 +0000 (16:45 +0100)]
Fix memory leaks and use a pool_allocator
* gcc.c (record_temp_file): Release name string.
* ifcvt.c (noce_convert_multiple_sets): Use auto_vec instead
of vec.
* lra-lives.c (free_live_range_list): Utilize
lra_live_range_pool for allocation and deallocation.
(create_live_range): Likewise.
(copy_live_range): Likewise.
(lra_merge_live_ranges): Likewise.
(remove_some_program_points_and_update_live_ranges): Likewise.
(lra_create_live_ranges_1): Release point_freq_vec that can
be not freed from previous iteration of the function.
* tree-eh.c (lower_try_finally_switch): Use auto_vec instead of
vec.
* tree-sra.c (sra_deinitialize): Release all vectors in
base_access_vec.
* tree-ssa-dom.c (free_dom_edge_info): Make the function extern.
* tree-ssa-threadupdate.c (remove_ctrl_stmt_and_useless_edges):
Release edge_info for a removed edge.
(thread_through_all_blocks): Free region vector.
* tree-ssa.h (free_dom_edge_info): Declare function extern.
Ilya Enkovich [Mon, 9 Nov 2015 15:11:02 +0000 (15:11 +0000)]
optabs.c (expand_vec_cond_expr): Always get sign from type.
gcc/
* optabs.c (expand_vec_cond_expr): Always get sign from type.
* tree.c (wide_int_to_tree): Support negative values for boolean.
(build_nonstandard_boolean_type): Use signed type for booleans.
Richard Biener [Mon, 9 Nov 2015 12:59:17 +0000 (12:59 +0000)]
re PR tree-optimization/56118 (Piecewise vector / complex initialization from constants not combined)
2015-11-09 Richard Biener <rguenther@suse.de>
PR tree-optimization/56118
* tree-vectorizer.h (vect_find_last_scalar_stmt_in_slp): Declare.
* tree-vect-slp.c (vect_find_last_scalar_stmt_in_slp): Export.
* tree-vect-data-refs.c (vect_slp_analyze_node_dependences): New
function.
(vect_slp_analyze_data_ref_dependences): Instead of computing
all dependences of the region DRs just analyze the code motions
SLP vectorization will perform. Remove SLP instances that
cannot have their store/load motions applied.
(vect_analyze_data_refs): Allow DRs without a vectype
in BB vectorization.
Kyrylo Tkachov [Mon, 9 Nov 2015 11:40:17 +0000 (11:40 +0000)]
[RTL-ifcvt] PR rtl-optimization/67749: Do not emit separate SET insn in IF-ELSE case
PR rtl-optimization/67749
* ifcvt.c (noce_try_cmove_arith): Do not emit move in IF-ELSE
case before emitting the two blocks. Instead modify the register
in the corresponding final insn of the basic block.
thumb2-slow-flash-data.c: Add missing typespec for labelref and check use of constant pool by looking for...
2015-11-09 Thomas Preud'homme <thomas.preudhomme@arm.com>
gcc/
* gcc.target/arm/thumb2-slow-flash-data.c: Add missing typespec for
labelref and check use of constant pool by looking for .word and
similar directives.
While cmps and movs allow a segment override of the ds:esi
source, the es:edi source/destination cannot be overriden.
Simplify things in the backend for now by disallowing
segments for string insns entirely.
* config/i386/i386.c (ix86_check_no_addr_space): New.
(decide_alg): Add have_as parameter.
(alg_usable_p): Likewise; disable rep algorithms if set.
(ix86_expand_set_or_movmem): Notice if either MEM has a
non-default address space.
(ix86_expand_strlen): Likewise.
* config/i386/i386.md (strmov, strset): Likewise.
(*strmovdi_rex_1): Use ix86_check_no_addr_space.
(*strmovsi_1, *strmovqi_1, *rep_movdi_rex64, *rep_movsi, *rep_movqi,
*strsetdi_rex_1, *strsetsi_1, *strsethi_1, *strsetqi_1,
*rep_stosdi_rex64, *rep_stossi, *rep_stosqi, *cmpstrnqi_nz_1,
*cmpstrnqi_1, *strlenqi_1): Likewise.
Add hook for modifying debug info for address spaces
* dwarf2out.c (modified_type_die): Pass the address space number
through TARGET_ADDR_SPACE_DEBUG to produce the dwarf address class.
* target.def (TARGET_ADDR_SPACE_DEBUG): New.
* targhooks.c (default_addr_space_debug): New.
* targhooks.h (default_addr_space_debug): Declare.
* doc/tm.texi.in (TARGET_ADDR_SPACE_DEBUG): Mark it.
* doc/tm.texi: Rebuild.
* gimple.c (check_loadstore): Return false when 0 is a valid address.
* fold-const.c (const_unop) [ADDR_SPACE_CONVERT_EXPR]: Do not fold
null when 0 is valid in the source address space.
* target.def (TARGET_ADDR_SPACE_ZERO_ADDRESS_VALID): New.
* targhooks.c (default_addr_space_zero_address_valid): New.
* targhooks.h (default_addr_space_zero_address_valid): Declare.
* doc/tm.texi.in (TARGET_ADDR_SPACE_ZERO_ADDRESS_VALID): Mark it.
* doc/tm.texi: Rebuild.
Jeff Law [Mon, 9 Nov 2015 09:02:27 +0000 (02:02 -0700)]
[PATCH] Minor refactoring in tree-ssanames.c & freelists verifier
[PATCH] Minor refactoring in tree-ssanames.c & freelists verifier
* tree-into-ssa.c (names_to_release): No longer static.
* tree-into-ssa.h (names_to_release): Declare.
* tree-ssanames.c (verify_ssaname_freelists): New debug function.
(release_free_names_and_compact_live_names): New function extracted
from pass_release_ssa_names::execute.
(pass_release_ssa_names::execute): Use it.
Alan Modra [Mon, 9 Nov 2015 04:28:21 +0000 (14:58 +1030)]
Modify obstack.[hc] to avoid having to include other gnulib files
Using the standard gnulib obstack source requires importing quite a
lot of other files from gnulib, and requires build changes.
include/
* obstack.h (__attribute_pure__): Expand _GL_ATTRIBUTE_PURE.
libiberty/
* obstack.c (__alignof__): Expand alignof_type from alignof.h.
(obstack_exit_failure): Don't use exitfail.h.
(_): Include libintl.h when HAVE_LIBINTL_H and nls enabled.
Provide default. Don't include gettext.h.
(_Noreturn): Define.
* obstacks.texi: Adjust node references to external libc info files.
Alan Modra [Mon, 9 Nov 2015 04:23:25 +0000 (14:53 +1030)]
Update libsanitizer obstack interceptors
New obstack uses sensible types, size_t instead of int for length
params. Since libsanitizer does not use prototypes from obstack.h to
call the real functions, it's necessary to update the libsanitizer
function declarations emitted by the INTERCEPTOR macro.
Alan Modra [Mon, 9 Nov 2015 04:19:43 +0000 (14:49 +1030)]
Correct libvtv obstack use
Fixes a compile error with both old and new obstacks due to
obstack_chunk_free having the wrong signature. Also, setting chunk
size and alignment before obstack_init is pointless since they are
overwritten.
* vtv_malloc.cc (obstack_chunk_free): Correct param type.
(__vtv_malloc_init): Use obstack_specify_allocation.
Alan Modra [Mon, 9 Nov 2015 04:17:53 +0000 (14:47 +1030)]
New obstack_next_free is not an lvalue
New obstack.h casts obstack_next_free to (void *), resulting in it
being a non-lvalue, and warnings on pointer arithmetic.
gcc/
* gensupport.c (add_mnemonic_string): Make len param a size_t.
(gen_mnemonic_setattr): Make "size" var a size_t. Use
obstack_blank_fast to shrink obstack. Cast obstack_next_free
return value.
gcc/objc/
* objc-encoding.c (encode_aggregate_within): Cast obstack_next_free
return value.
Fix bb-reorder problem with degenerate cond_jump (PR68182)
The code mistakenly thinks any cond_jump has two successors. This is
not true if both destinations are the same, as can happen with weird
patterns as in the PR.
PR rtl-optimization/68182
* gcc/bb-reorder.c (reorder_basic_blocks_simple): Treat a conditional
branch with only one successor just like unconditional branches.
Jeff Law [Mon, 9 Nov 2015 03:19:09 +0000 (20:19 -0700)]
[PATCH] Remove backedge handling support in tree-ssa-threadupdate.c
* tree-ssa-threadupdate.c (register_jump_thraed): Assert that a
non-FSM path has no edges marked with EDGE_DFS_BACK.
(ssa_redirect_edges): No longer call mark_loop_for_removal.
(thread_single_edge, def_split_header_continue_p): Remove.
(bb_ends_with_multiway_branch): Likewise.
(thread_through_loop_header): Remove cases of threading from
latch through the header. Simplify knowing we won't thread
the latch.
(thread_through_all_blocks): Simplify knowing that only the FSM
threader needs to handle backedges.
* gfortran.dg/PR67518.f90: move from here...
* gfortran.dg/graphite/PR67518.f90: to here.
* gfortran.dg/PR53852.f90: move from here...
* gfortran.dg/graphite/PR53852.f90: to here.
Paul Thomas [Sun, 8 Nov 2015 16:47:58 +0000 (16:47 +0000)]
re PR fortran/68196 (ICE on function result with procedure pointer component)
2015-11-08 Paul Thomas <pault@gcc.gnu.org>
PR fortran/68196
* class.c (has_finalizer_component): Prevent infinite recursion
through this function if the derived type and that of its
component are the same.
* trans-types.c (gfc_get_derived_type): Do the same for proc
pointers by ignoring the explicit interface for the component.
PR fortran/66465
* check.c (same_type_check): If either of the expressions is
BT_PROCEDURE, use the typespec from the symbol, rather than the
expression.
2015-11-08 Paul Thomas <pault@gcc.gnu.org>
PR fortran/68196
* gfortran.dg/proc_ptr_47.f90: New test.
PR fortran/66465
* gfortran.dg/pr66465.f90: New test.
i386: Use the STC bb-reorder algorithm at -Os (PR67864)
For x86, STC still gives better results for optimise-for-size than
"simple" does. So use STC at -Os as well.
PR rtl-optimization/67864
* common/config/i386/i386-common.c (ix86_option_optimization_table)
<OPT_freorder_blocks_algorithm_>: Use REORDER_BLOCKS_ALGORITHM_STC
at -Os and up.
The upcoming changes to use internal functions for things like sqrt
caused a failure in gcc.dg/tm/20100610.c, because we were trying to get
call flags from the null gimple_call_fn of an IFN_SQRT call. We've been
making fairly heavy use of internal functions for a while now so I think
this might be latent.
Tested on x86_64-linux-gnu, aarch64-linux-gnu and arm-linux-gnueabi.
gcc/
* trans-mem.c (is_tm_pure_call): Use gimple_call_flags for
internal functions.
I was confused at first why tree-core.h was undefining DEF_BUILTIN_CHKP
before defining it, then undefining it again after including builtins.def.
This is because builtins.def provides a default definition of
DEF_BUILTIN_CHKP, but leaves it up to the caller to undefine it where
necessary. Similarly to the previous internal-fn.def patch, it seems
more obvious for builtins.def to #undef things unconditionally.
One argument might have been that keeping preprocessor stuff
out of the .def files makes it easier for non-cpp parsers. In practice
though we already have #ifs and multiline #defines, so single-line #undefs
should be easy in comparison.
Tested on x86_64-linux-gnu, aarch64-linux-gnu and arm-linux-gnueabi.
In practice the definition of DEF_INTERNAL_FN is never reused after
including internal-fn.def, so we might as well #undef it there.
This becomes more obvious with a later patch that adds other
DEF_INTERNAL_* directives, such as DEF_INTERNAL_OPTAB_FN.
If the includer doesn't care about the information carried in
these new directives, it can simply leave the macro undefined
and internals.def will provide a definition that forwards to
DEF_INTERNAL_FN. It doesn't make much sense for includers to have
to #undef macros that are defined by internals.def and it seems overly
complicated to get internals.def to undef macros only in the cases
where it provided a definition. Instead I went with the approach of
#undeffing all the DEF_INTERNAL_* macros unconditionally.
Tested on x86_64-linux-gnu, aarch64-linux-gnu and arm-linux-gnueabi.
gcc/
* internal-fn.def: #undef DEF_INTERNAL_FN at the end.
* internal-fn.c: Don't undef it here.
* tree-core.h: Likewise.