Steven Munroe [Fri, 21 Jul 2017 17:44:22 +0000 (17:44 +0000)]
Now completeting the mmintrin.h intrinsic headers for PowerPC by
adding the DG tests.
2017-07-21 Steven Munroe <munroesj@gcc.gnu.org>
* gcc.target/powerpc/mmx-check.h: New file.
* gcc.target/powerpc/mmx-packs.c: New file.
* gcc.target/powerpc/mmx-packssdw-1.c: New file.
* gcc.target/powerpc/mmx-packsswb-1.c: New file.
* gcc.target/powerpc/mmx-packuswb-1.c: New file.
* gcc.target/powerpc/mmx-paddb-1.c: New file.
* gcc.target/powerpc/mmx-paddd-1.c: New file.
* gcc.target/powerpc/mmx-paddsb-1.c: New file.
* gcc.target/powerpc/mmx-paddsw-1.c: New file.
* gcc.target/powerpc/mmx-paddusb-1.c: New file.
* gcc.target/powerpc/mmx-paddusw-1.c: New file.
* gcc.target/powerpc/mmx-paddw-1.c: New file.
* gcc.target/powerpc/mmx-pcmpeqb-1.c: New file.
* gcc.target/powerpc/mmx-pcmpeqd-1.c: New file.
* gcc.target/powerpc/mmx-pcmpeqw-1.c: New file.
* gcc.target/powerpc/mmx-pcmpgtb-1.c: New file.
* gcc.target/powerpc/mmx-pcmpgtd-1.c: New file.
* gcc.target/powerpc/mmx-pcmpgtw-1.c: New file.
* gcc.target/powerpc/mmx-pmaddwd-1.c: New file.
* gcc.target/powerpc/mmx-pmulhw-1.c: New file.
* gcc.target/powerpc/mmx-pmullw-1.c: New file.
* gcc.target/powerpc/mmx-pslld-1.c: New file.
* gcc.target/powerpc/mmx-psllw-1.c: New file.
* gcc.target/powerpc/mmx-psrad-1.c: New file.
* gcc.target/powerpc/mmx-psraw-1.c: New file.
* gcc.target/powerpc/mmx-psrld-1.c: New file.
* gcc.target/powerpc/mmx-psrlw-1.c: New file.
* gcc.target/powerpc/mmx-psubb-2.c: New file.
* gcc.target/powerpc/mmx-psubd-2.c: New file.
* gcc.target/powerpc/mmx-psubsb-1.c: New file.
* gcc.target/powerpc/mmx-psubsw-1.c: New file.
* gcc.target/powerpc/mmx-psubusb-1.c: New file.
* gcc.target/powerpc/mmx-psubusw-1.c: New file.
* gcc.target/powerpc/mmx-psubw-2.c: New file.
* gcc.target/powerpc/mmx-punpckhbw-1.c: New file.
* gcc.target/powerpc/mmx-punpckhdq-1.c: New file.
* gcc.target/powerpc/mmx-punpckhwd-1.c: New file.
* gcc.target/powerpc/mmx-punpcklbw-1.c: New file.
* gcc.target/powerpc/mmx-punpckldq-1.c: New file.
* gcc.target/powerpc/mmx-punpcklwd-1.c: New file.
Andrew Pinski [Fri, 21 Jul 2017 17:16:51 +0000 (17:16 +0000)]
tree-ssa-sccvn.c (vn_nary_op_eq): Check BIT_INSERT_EXPR's operand 1 to see if the types precision matches.
2017-07-21 Andrew Pinski <apinski@cavium.com>
* tree-ssa-sccvn.c (vn_nary_op_eq): Check BIT_INSERT_EXPR's
operand 1 to see if the types precision matches.
* fold-const.c (operand_equal_p): Likewise.
re PR lto/81487 ([mingw32] ld.exe: error: asprintf failed)
lto-plugin/
PR lto/81487
* lto-plugin.c (claim_file_handler): Use xasprintf instead of
asprintf.
[hi!=0]: Swap hi and lo arguments supplied to xasprintf.
Richard Biener [Fri, 21 Jul 2017 11:32:39 +0000 (11:32 +0000)]
re PR tree-optimization/81303 (410.bwaves regression caused by r249919)
2017-07-21 Richard Biener <rguenther@suse.de>
PR tree-optimization/81303
* tree-vect-data-refs.c (vect_get_peeling_costs_all_drs): Pass
in datarefs vector. Allow NULL dr0 for no peeling cost estimate.
(vect_peeling_hash_get_lowest_cost): Adjust.
(vect_enhance_data_refs_alignment): Likewise. Use
vect_get_peeling_costs_all_drs to compute the penalty for no
peeling to match up costs.
Jan Hubicka [Fri, 21 Jul 2017 07:17:22 +0000 (09:17 +0200)]
bb-reorder.c (find_rarely_executed_basic_blocks_and_crossing_edges): Put all BBs reachable only via paths crossing cold region to cold region.
* bb-reorder.c (find_rarely_executed_basic_blocks_and_crossing_edges):
Put all BBs reachable only via paths crossing cold region to cold
region.
* cfgrtl.c (find_bbs_reachable_by_hot_paths): New function.
Richard Biener [Fri, 21 Jul 2017 07:13:57 +0000 (07:13 +0000)]
re PR tree-optimization/81303 (410.bwaves regression caused by r249919)
2016-07-21 Richard Biener <rguenther@suse.de>
PR tree-optimization/81303
* tree-vect-loop.c (vect_estimate_min_profitable_iters): Take
into account prologue and epilogue iterations when raising
min_profitable_iters to sth at least covering one vector iteration.
gcc/
Remove TYPE_METHODS.
* tree.h (TYPE_METHODS): Delete.
* dwarf2out.c (gen_member_die): Member fns are on TYPE_FIELDS.
* dbxout.c (dbxout_type_fields): Ignore FUNCTION_DECLs.
(dbxout_type_methods): Scan TYPE_FIELDS.
(dbxout_type): Don't check TYPE_METHODS here.
* function.c (use_register_for_decl): Always ignore register for
class types when not optimizing.
* ipa-devirt.c (odr_types_equivalent_p): Delete TYPE_METHODS scan.
* tree.c (free_lang_data_in_type): Stitch out member functions and
templates from TYPE_FIELDS.
(build_distinct_type_copy, verify_type_variant,
verify_type): Member fns are on TYPE_FIELDS.
* tree-dump.c (dequeue_and_dump): No TYPE_METHODS.
* tree-pretty-print.c (dump_generic_node): Likewise.
gcc/c-family/
Remove TYPE_METHODS.
* c-ada-spec.c (is_tagged_type, has_nontrivial_methods,
dump_ada_template, print_ada_methods,
print_ada_declaration): Member fns are on TYPE_FIELDS.
gcc/objc/
Remove TYPE_METHODS.
* objc-runtime-shared-support.c (build_ivar_list_initializer):
Don't presume first item is a FIELD_DECL.
Ian Lance Taylor [Thu, 20 Jul 2017 19:53:54 +0000 (19:53 +0000)]
compiler: add explicit convert in Type_guard_expression::do_get_backend
The current recipe in Type_guard_expression for creating a Bexpression
performs a type conversion at the Type level, but doesn't invoke
Backend::convert_expression to capture the conversion for the back
end. Add code to create a BE type conversion operation.
Jakub Jelinek [Thu, 20 Jul 2017 16:36:18 +0000 (18:36 +0200)]
re PR target/80846 (auto-vectorized AVX2 horizontal sum should narrow to 128b right away, to be more efficient for Ryzen and Intel)
PR target/80846
* config/i386/i386.c (ix86_expand_vector_init_general): Handle
V2TImode and V4TImode.
(ix86_expand_vector_extract): Likewise.
* config/i386/sse.md (VMOVE): Enable V4TImode even for just
TARGET_AVX512F, instead of only for TARGET_AVX512BW.
(ssescalarmode): Handle V4TImode and V2TImode.
(VEC_EXTRACT_MODE): Add V4TImode and V2TImode.
(*vec_extractv2ti, *vec_extractv4ti): New insns.
(VEXTRACTI128_MODE): New mode iterator.
(splitter for *vec_extractv?ti first element): New.
(VEC_INIT_MODE): New mode iterator.
(vec_init<mode>): Consolidate 3 expanders into one using
VEC_INIT_MODE mode iterator.
* gcc.target/i386/avx-pr80846.c: New test.
* gcc.target/i386/avx2-pr80846.c: New test.
* gcc.target/i386/avx512f-pr80846.c: New test.
* gimple.h (gimple_phi_result): Add gphi * overload.
(gimple_phi_result_ptr): Likewise.
(gimple_phi_arg): Likewise. Adjust index assert to only
allow actual argument accesses rather than all slots available
by capacity.
(gimple_phi_arg_def): Add gphi * overload.
* tree-phinodes.c (make_phi_node): Initialize only actual
arguments.
(resize_phi_node): Clear memory not covered by old node,
do not initialize excess argument slots.
(reserve_phi_args_for_new_edge): Initialize new argument slot
completely.
Richard Biener [Thu, 20 Jul 2017 11:17:21 +0000 (11:17 +0000)]
re PR tree-optimization/61171 (vectorization fails for a reduction in presence of subtraction)
2017-07-20 Richard Biener <rguenther@suse.de>
PR tree-optimization/61171
* tree-vectorizer.h (slp_instance): Add reduc_phis member.
(vect_analyze_stmt): Add slp instance parameter.
(vectorizable_reduction): Likewise.
* tree-vect-loop.c (vect_analyze_loop_operations): Adjust.
(vect_is_simple_reduction): Deal with chains not detected
as SLP reduction chain, specifically not properly associated
chains containing a mix of plus/minus.
(get_reduction_op): Remove.
(get_initial_defs_for_reduction): Simplify, pass in whether
this is a reduction chain, pass in the SLP node for the PHIs.
(vect_create_epilog_for_reduction): Get the SLP instance as
arg and adjust.
(vectorizable_reduction): Get the SLP instance as arg.
During analysis remember the SLP node with the PHIs in the
instance. Simplify getting at the vectorized reduction PHIs.
* tree-vect-slp.c (vect_slp_analyze_node_operations): Pass
through SLP instance.
(vect_slp_analyze_operations): Likewise.
* tree-vect-stms.c (vect_analyze_stmt): Likewise.
(vect_transform_stmt): Likewise.
* g++.dg/vect/pr61171.cc: New testcase.
* gfortran.dg/vect/pr61171.f: Likewise.
* gcc.dg/vect/vect-reduc-11.c: Likewise.
Tom de Vries [Thu, 20 Jul 2017 07:16:01 +0000 (07:16 +0000)]
Fix phi arg location in find_implicit_erroneous_behavior
2017-07-20 Tom de Vries <tom@codesourcery.com>
PR tree-optimization/81489
* gimple-ssa-isolate-paths.c (find_implicit_erroneous_behavior): Move
read of phi arg location to before loop that modifies phi.
Jonathan Wakely [Wed, 19 Jul 2017 19:32:15 +0000 (20:32 +0100)]
PR libstdc++/81476 Optimise vector insertion from input iterators
PR libstdc++/81476
* include/bits/vector.tcc (vector::_M_range_insert<_InputIterator>):
Only insert elements one-by-one when inserting at the end.
* testsuite/performance/23_containers/insert/81476.cc: New.
We here have an AND of a SUBREG of an LSHIFTRT. If that SUBREG is
paradoxical, the extraction we form is the length of the size of the
inner mode, which includes some bits that should not be in the result.
Just give up in that case.
PR rtl-optimization/81423
* combine.c (make_compound_operation_int): Don't try to optimize
the AND of a SUBREG of an LSHIFTRT if that SUBREG is paradoxical.
Michael Meissner [Wed, 19 Jul 2017 19:29:16 +0000 (19:29 +0000)]
cpu-builtin-1.c: Change test to use #ifdef __BUILTIN_CPU_SUPPORTS to see if...
2017-07-19 Michael Meissner <meissner@linux.vnet.ibm.com>
* gcc.target/powerpc/cpu-builtin-1.c: Change test to use #ifdef
__BUILTIN_CPU_SUPPORTS to see if the GLIBC is new enough that
__builtin_cpu_is and __builtin_cpu_supports are supported.
Steven Munroe [Wed, 19 Jul 2017 19:04:50 +0000 (19:04 +0000)]
Fix up plafform testes in check headers.
Fix up plafform testes in check headers. After a resent GCC change
the previously submitted BMI/BMI2 intrinsic test started to fail
with a warning/error.
Jan Hubicka [Wed, 19 Jul 2017 18:09:39 +0000 (18:09 +0000)]
predict.c (propagate_unlikely_bbs_forward): Break out from ...
* predict.c (propagate_unlikely_bbs_forward): Break out from ...
(determine_unlikely_bbs): ... here.
* predict.h (propagate_unlikely_bbs_forward): Declare.
* cfgexpand.c (pass_expand::execute): Use it.
* bb-reorder.c (sanitize_hot_paths): Do not consider known to be
unlikely edges.
(find_rarely_executed_basic_blocks_and_crossing_edges): Use
propagate_unlikely_bbs_forward.
Jan Hubicka [Wed, 19 Jul 2017 18:08:53 +0000 (20:08 +0200)]
predict.c (propagate_unlikely_bbs_forward): Break out from ...
* predict.c (propagate_unlikely_bbs_forward): Break out from ...
(determine_unlikely_bbs): ... here.
* predict.h (propagate_unlikely_bbs_forward): Declare.
* cfgexpand.c (pass_expand::execute): Use it.
* bb-reorder.c (sanitize_hot_paths): Do not consider known to be
unlikely edges.
(find_rarely_executed_basic_blocks_and_crossing_edges): Use
propagate_unlikely_bbs_forward.
Jakub Jelinek [Wed, 19 Jul 2017 15:55:47 +0000 (17:55 +0200)]
ada-tree.h (TYPE_OBJECT_RECORD_TYPE, [...]): Use TYPE_MIN_VALUE_RAW instead of TYPE_MINVAL.
* gcc-interface/ada-tree.h (TYPE_OBJECT_RECORD_TYPE,
TYPE_GCC_MIN_VALUE): Use TYPE_MIN_VALUE_RAW instead of TYPE_MINVAL.
(TYPE_GCC_MAX_VALUE): Use TYPE_MAX_VALUE_RAW instead of TYPE_MAXVAL.
Tom de Vries [Wed, 19 Jul 2017 13:05:21 +0000 (13:05 +0000)]
Add v2si support for nvptx
2017-07-19 Tom de Vries <tom@codesourcery.com>
* config/nvptx/nvptx-modes.def: New file. Add V2SImode.
* config/nvptx/nvptx.c (nvptx_ptx_type_from_mode): Handle V2SImode.
(nvptx_vector_mode_supported): New function. Allow V2SImode.
(TARGET_VECTOR_MODE_SUPPORTED_P): Redefine to nvptx_vector_mode_supported.
* config/nvptx/nvptx.md (VECIM): New mode iterator. Add V2SI.
(mov<VECIM>_insn): New define_insn.
(define_expand "mov<VECIM>): New define_expand.
* gcc.target/nvptx/slp-run.c: New test.
* gcc.target/nvptx/slp.c: New test.
* gcc.target/nvptx/v2si-cvt.c: New test.
* gcc.target/nvptx/v2si-run.c: New test.
* gcc.target/nvptx/v2si.c: New test.
* gcc.target/nvptx/vec.inc: New test.
Jakub Jelinek [Wed, 19 Jul 2017 12:31:59 +0000 (14:31 +0200)]
re PR tree-optimization/81346 (Missed constant propagation into comparison)
PR tree-optimization/81346
* fold-const.h (fold_div_compare, range_check_type): Declare.
* fold-const.c (range_check_type): New function.
(build_range_check): Use range_check_type.
(fold_div_compare): No longer static, rewritten into
a match.pd helper function.
(fold_comparison): Don't call fold_div_compare here.
* match.pd (X / C1 op C2): New optimization using fold_div_compare
as helper function.
* gcc.dg/tree-ssa/pr81346-1.c: New test.
* gcc.dg/tree-ssa/pr81346-2.c: New test.
* gcc.dg/tree-ssa/pr81346-3.c: New test.
* gcc.dg/tree-ssa/pr81346-4.c: New test.
* gcc.target/i386/umod-3.c: Hide comparison against 1 from the
compiler to avoid X / C1 op C2 optimization to trigger.
Ian Lance Taylor [Tue, 18 Jul 2017 23:29:15 +0000 (23:29 +0000)]
compiler: insert backend type conversion for closure func ptr
In Func_expression::do_get_backend when creating the backend
representation for a closure, create a backend type conversion to
account for potential differences between the closure struct type
(where the number of fields is dependent on the number of values
referenced in the closure) and the generic function descriptor type
(struct with single function pointer field).
Ian Lance Taylor [Tue, 18 Jul 2017 23:14:29 +0000 (23:14 +0000)]
re PR go/81451 (missing futex check - libgo/runtime/thread-linux.c:12:0 futex.h:13:12: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘__attribute__’ before ‘long’)
PR go/81451
runtime: inline runtime_osinit
We had two identical copies of runtime_osinit. They set runtime_ncpu,
a variable that is no longer used. Removing that leaves us with two lines.
Inline those two lines in the two places the function was called.
Ian Lance Taylor [Tue, 18 Jul 2017 22:31:00 +0000 (22:31 +0000)]
compiler: pass correct 'function' flag to circular_pointer_type
The code in Named_type::do_get_backend was not passing the correct
flag value for circular function types to Backend::circular_pointer_type
(it was always setting this flag to false). Pass a true value if the
type being converted is a function type.
Ian Lance Taylor [Tue, 18 Jul 2017 22:06:31 +0000 (22:06 +0000)]
re PR go/81324 (libgo does not build with glibc 2.18)
PR go/81324
sysinfo.c: ignore ptrace_peeksiginfo_args from <linux/ptrace.h>
With some versions of glibc and GNU/Linux ptrace_pseeksiginfo_args is
defined in both <sys/ptrace.h> and <linux/ptrace.h>. We don't actually
care about the struct, so use a #define to avoid a redefinition error.
re PR target/81471 (internal compiler error: in curr_insn_transform, at lra-constraints.c:3495)
PR target/81471
* config/i386/i386.md (rorx_immediate_operand): New mode attribute.
(*bmi2_rorx<mode>3_1): Use rorx_immediate_operand as
operand 2 predicate.
(*bmi2_rorxsi3_1_zext): Use const_0_to_31_operand as
operand 2 predicate.
(ror,rol -> rorx splitters): Use const_int_operand as
operand 2 predicate.
testsuite/ChangeLog:
PR target/81471
* gcc.target/i386/pr81471.c: New test.
Jan Hubicka [Tue, 18 Jul 2017 13:49:30 +0000 (15:49 +0200)]
re PR tree-optimization/81462 (ICE in estimate_bb_frequencies at gcc/predict.c:3546)
PR middle-end/81462
* predict.c (set_even_probabilities): Cleanup; do not affect
probabilities that are already known.
(combine_predictions_for_bb): Call even when count is set.
* class.c (classtype_has_move_assign_or_move_ctor): Declare.
(add_implicitly_declared_members): Use it.
(type_has_move_constructor, type_has_move_assign): Merge into ...
(classtype_has_move_assign_or_move_ctor): ... this new function.
* cp-tree.h (type_has_move_constructor, type_has_move_assign): Delete.
Robin Dapp [Tue, 18 Jul 2017 09:23:35 +0000 (09:23 +0000)]
Fix PR81362: Vector peeling
npeel was erroneously overwritten by vect_peeling_hash_get_lowest_cost
although the corresponding dataref is not used afterwards. It should
be safe to get rid of the npeel parameter since we use the returned
peeling_info's npeel anyway. Also removed the body_cost_vec parameter
which is not used elsewhere.
gcc/ChangeLog:
2017-07-18 Robin Dapp <rdapp@linux.vnet.ibm.com>
* tree-vect-data-refs.c (vect_enhance_data_refs_alignment): Remove
body_cost_vec from _vect_peel_extended_info.
(vect_peeling_hash_get_lowest_cost): Do not set body_cost_vec.
(vect_peeling_hash_choose_best_peeling): Remove body_cost_vec and
npeel.
Richard Biener [Tue, 18 Jul 2017 07:35:40 +0000 (07:35 +0000)]
re PR tree-optimization/80620 (gcc produces wrong code with -O3)
2017-07-18 Richard Biener <rguenther@suse.de>
PR tree-optimization/80620
PR tree-optimization/81403
* tree-ssa-pre.c (phi_translate_1): Clear range and points-to
info when re-using a VN table entry.
* gcc.dg/torture/pr80620.c: New testcase.
* gcc.dg/torture/pr81403.c: Likewise.
Richard Biener [Tue, 18 Jul 2017 07:26:04 +0000 (07:26 +0000)]
re PR tree-optimization/81418 (ICE in vect_get_vec_def_for_stmt_copy)
2017-06-18 Richard Biener <rguenther@suse.de>
PR tree-optimization/81418
* tree-vect-loop.c (vectorizable_reduction): Properly compute
vectype_in. Verify that with lane-reducing reduction operations
we have a single def-use cycle.
class.c (type_has_user_declared_move_constructor, [...]): Combine into ...
* class.c (type_has_user_declared_move_constructor,
type_has_user_declared_move_assign): Combine into ...
(classtype_has_user_move_assign_or_move_ctor_p): ... this new function.
* cp-tree.h (type_has_user_declared_move_constructor,
type_has_user_declared_move_assign): Combine into ...
(classtype_has_user_move_assign_or_move_ctor_p): ... this. Declare.
* method.c (maybe_explain_implicit_delete): Use it.
(lazily_declare_fn): Use it.
* tree.c (type_has_nontrivial_copy_init): Use it.
Emitting subregs in the expand will result in broken code due to LRA handling of them. Issue observed while turning on mlra and mexpand-adddi options using dejagnu test suite. Deprecate this
option.
[ARC] [LRA] Avoid emitting COND_EXEC during expand.
Emmitting COND_EXEC rtxes during expand does introduces errors due to LRA handling of them. Issue discovered while running dejagnu test suit with mlra option on.
* config/arc/arc.md (clzsi2): Expand to an arc_clzsi2 instruction
that also clobbers the CC register. The old expand code is moved
to ...
(*arc_clzsi2): ... here.
(ctzsi2): Expand to an arc_ctzsi2 instruction that also clobbers
the CC register. The old expand code is moved to ...
(arc_ctzsi2): ... here.
Bin Cheng [Mon, 17 Jul 2017 11:40:54 +0000 (11:40 +0000)]
re PR tree-optimization/81369 (ICE in generate_code_for_partition)
PR target/81369
* tree-loop-distribution.c (classify_partition): Only assert on
numer of iterations.
(merge_dep_scc_partitions): Delete prameter. Update function call.
(distribute_loop): Remove code handling loop with unknown niters.
(pass_loop_distribution::execute): Skip loop with unknown niters.
Bin Cheng [Mon, 17 Jul 2017 11:34:30 +0000 (11:34 +0000)]
re PR tree-optimization/81374 (ICE in bb_top_order_cmp, at tree-loop-distribution.c:391)
PR tree-optimization/81374
* tree-loop-distribution.c (pass_loop_distribution::execute): Record
the max index of basic blocks, rather than number of basic blocks.
This patch refactors a number of functions and compiler hooks into using a
single function which checks if a rtx is suited for pic or not. Removed
functions are arc_legitimate_pc_offset_p and arc_legitimate_pic_operand_p
beeing replaced by calls to arc_legitimate_pic_addr_p. Thus we have an
unitary way of checking a rtx beeing pic.
* config/arc/arc-protos.h (arc_legitimate_pc_offset_p): Remove
proto.
(arc_legitimate_pic_operand_p): Likewise.
* config/arc/arc.c (arc_legitimate_pic_operand_p): Remove
function.
(arc_needs_pcl_p): Likewise.
(arc_legitimate_pc_offset_p): Likewise.
(arc_legitimate_pic_addr_p): Remove LABEL_REF case, as this
function is also used in constrains.md.
(arc_legitimate_constant_p): Use arc_legitimate_pic_addr_p to
validate pic constants. Handle CONST_INT, CONST_DOUBLE, MINUS and
PLUS. Only return true/false in known cases, otherwise assert.
(arc_legitimate_address_p): Remove arc_legitimate_pic_addr_p as it
is already called in arc_legitimate_constant_p.
* config/arc/arc.h (CONSTANT_ADDRESS_P): Consider also LABEL for
pic addresses.
(LEGITIMATE_PIC_OPERAND_P): Use
arc_raw_symbolic_reference_mentioned_p function.
* config/arc/constraints.md (Cpc): Use arc_legitimate_pic_addr_p
function.
(Cal): Likewise.
(C32): Likewise.