gcc.gnu.org Git - gcc.git/log

attribs: Don't diagnose attribute exclusions during error recovery [PR94705]

On the following testcase GCC ICEs, because last_decl is error_mark_node,
and diag_attr_exclusions assumes that if it is not NULL, it must be a decl.

The following patch just doesn't diagnose attribute exclusions if the
other decl is erroneous (and thus we've already reported errors for it).

2020-04-23 Jakub Jelinek <jakub@redhat.com>

PR c/94705
* attribs.c (decl_attribute): Don't diagnose attribute exclusions
if last_decl is error_mark_node or has such a TREE_TYPE.

* gcc.dg/pr94705.c: New test.

(cherry picked from commit e2a71816b4949225498bec37e947293aa7f5841b)

ubsan: Avoid -Wpadded warnings [PR94641]

-Wpadded warnings aren't really useful for the artificial types that GCC
lays out for ubsan.

2020-04-21 Jakub Jelinek <jakub@redhat.com>

PR c/94641
* stor-layout.c (place_field, finalize_record_size): Don't emit
-Wpadded warning on TYPE_ARTIFICIAL rli->t.
* ubsan.c (ubsan_get_type_descriptor_type,
ubsan_get_source_location_type, ubsan_create_data): Set
TYPE_ARTIFICIAL.
* asan.c (asan_global_struct): Likewise.

* c-c++-common/ubsan/pr94641.c: New test.

(cherry picked from commit 73f8e9dca5ff891ed19001b213fd1f6ce31417e3)

Fix -fcompare-debug issue in delete_insn_and_edges [PR94618]

delete_insn_and_edges calls purge_dead_edges whenever deleting the last insn
in a bb, whatever it is. If it called it only for mandatory last insns
in the basic block (that may not be followed by DEBUG_INSNs, dunno if that
is control_flow_insn_p or something more complex), that wouldn't be a
problem, but as it calls it on any last insn and can actually do something
in the bb, if such an insn is followed by one more more DEBUG_INSNs and
nothing else in the same bb, we don't call purge_dead_edges with -g and do
call it with -g0.

On the testcase, there are two reg-to-reg moves with REG_EH_REGION notes
(previously memory accesses but simplified and yet not optimized), and the
second is followed by DEBUG_INSNs; the second move is delete_insn_and_edges
and after removing it, for -g0 purge_dead_edges removes the REG_EH_REGION
from the now last insn in the bb (the first reg-to-reg move), while
for -g it isn't called and things diverge from that quickly on.

Fixed by calling purdge_dead_edges even if we remove the last real insn
followed only by DEBUG_INSNs in the same bb.

2020-04-17 Jakub Jelinek <jakub@redhat.com>

PR rtl-optimization/94618
* cfgrtl.c (delete_insn_and_edges): Set purge not just when
insn is the BB_END of its block, but also when it is only followed
by DEBUG_INSNs in its block.

* g++.dg/opt/pr94618.C: New test.

(cherry picked from commit c41884a09206be0e21cad7eea71b9754daa969d4)

c++: Fix pasto in structured binding diagnostics [PR94571]

This snippet has been copied from the non-structured binding declaration
parsing later in the function, and while for non-structured bindings
it can be followed by comma or semicolon, structured bindings may be
only followed by semicolon.

Or, do we want to have a different message for the case when there is
a comma (and keep this corrected one only if there is something else)
that would explain better what is the bug (or add a fix-it hint)?
Marek said in the PR that clang++ reports
error: decomposition declaration must be the only declaration in its group

There is another thing Marek noted (though, something for different spot),
that diagnostic for auto x(1), [e,f] = test2; could also use a clearer
wording like the above (or a fix-it hint), but the question is if we should
assume [ after , as a structured binding or if we should do some tentative
parsing first to figure out if it looks like a structured binding.

2020-04-16 Jakub Jelinek <jakub@redhat.com>

PR c++/94571
* parser.c (cp_parser_simple_declaration): Fix up a pasto in
diagnostics.

* g++.dg/cpp1z/decomp51.C: New test.

(cherry picked from commit e4658c7dbbe88f742c96e5f58ee4a6d549d642ca)

vect: Fix up lowering of TRUNC_MOD_EXPR by negative constant [PR94524]

The first testcase below is miscompiled, because for the division part
of the lowering we canonicalize negative divisors to their absolute value
(similarly how expmed.c canonicalizes it), but when multiplying the division
result back by the VECTOR_CST, we use the original constant, which can
contain negative divisors.

Fixed by computing ABS_EXPR of the VECTOR_CST.  Unfortunately, fold-const.c
doesn't support const_unop (ABS_EXPR, VECTOR_CST) and I think it is too late
in GCC 10 cycle to add it now.

Furthermore, while modulo by most negative constant happens to return the
right value, it does that only by invoking UB in the IL, because
we then expand division by that 1U+INT_MAX and say for INT_MIN % INT_MIN
compute the division as -1, and then multiply by INT_MIN, which is signed
integer overflow.  We in theory could do the computation in unsigned vector
types instead, but is it worth bothering.  People that are doing % INT_MIN
are either testing for standard conformance, or doing something wrong.
So, I've also added punting on % INT_MIN, both in vect lowering and vect
pattern recognition (we punt already for / INT_MIN).

2020-04-08  Jakub Jelinek  <jakub@redhat.com>

PR tree-optimization/94524
* tree-vect-generic.c (expand_vector_divmod): If any elt of op1 is
negative for signed TRUNC_MOD_EXPR, multiply with absolute value of
op1 rather than op1 itself at the end.  Punt for signed modulo by
most negative constant.
* tree-vect-patterns.c (vect_recog_divmod_pattern): Punt for signed
modulo by most negative constant.

* gcc.c-torture/execute/pr94524-1.c: New test.
* gcc.c-torture/execute/pr94524-2.c: New test.

(cherry picked from commit f52eb4f988992d393c69ee4ab76f236dced80e36)

i386: Don't use AVX512F integral masks for V*TImode [PR94438]

The ix86_get_mask_mode hook uses int mask for 512-bit vectors or 128/256-bit
vectors with AVX512VL (that is correct), and only for V*[SD][IF]mode if not
AVX512BW (also correct), but with AVX512BW it would stop checking the
elem_size altogether and pretend the hw has masking support for V*TImode
etc., which it doesn't. That can lead to various ICEs later on.

2020-04-08 Jakub Jelinek <jakub@redhat.com>

PR target/94438
* config/i386/i386.c (ix86_get_mask_mode): Only use int mask for elem_size
1, 2, 4 and 8.

* gcc.target/i386/avx512bw-pr94438.c: New test.
* gcc.target/i386/avx512vlbw-pr94438.c: New test.

(cherry picked from commit 8bf5faa9c463f0d53ffe835ba03d4502edfb959d)

c++: Further fix for -fsanitize=vptr [PR94325]

For -fsanitize=vptr, we insert a NULL store into the vptr instead of just
adding a CLOBBER of this. build_clobber_this makes the CLOBBER conditional
on in_charge (implicit) parameter whenever CLASSTYPE_VBASECLASSES, but when
adding this conditionalization to the -fsanitize=vptr code in PR87095,
I wanted it to catch some more cases when the class has CLASSTYPE_VBASECLASSES,
but the vptr is still not shared with something else, otherwise the
sanitization would be less effective.
The following testcase shows that the chosen test that CLASSTYPE_PRIMARY_BINFO
is non-NULL and has BINFO_VIRTUAL_P set wasn't sufficient,
the D class has still sizeof(D) == sizeof(void*) and thus contains just
a single vptr, but while in B::~B() this results in the vptr not being
cleared, in C::~C() this condition isn't true, as CLASSTYPE_PRIMARY_BINFO
in that case is B and is not BINFO_VIRTUAL_P, so it clears the vptr, but the
D::~D() dtor after invoking C::~C() invokes A::~A() with an already cleared
vptr, which is then reported.
The following patch is just a shot in the dark, keep looking through
CLASSTYPE_PRIMARY_BINFO until we find BINFO_VIRTUAL_P, but it works on the
existing testcase as well as this new one.

2020-04-08 Jakub Jelinek <jakub@redhat.com>

PR c++/94325
* decl.c (begin_destructor_body): For CLASSTYPE_VBASECLASSES class
dtors, if CLASSTYPE_PRIMARY_BINFO is non-NULL, but not BINFO_VIRTUAL_P,
look at CLASSTYPE_PRIMARY_BINFO of its BINFO_TYPE if it is not
BINFO_VIRTUAL_P, and so on.

* g++.dg/ubsan/vptr-15.C: New test.

(cherry picked from commit 4cf6b06cb5b02c053738e2975e3b7a4b3c577401)

i386: Fix V{64QI,32HI}mode constant permutations [PR94509]

The following testcases are miscompiled, because expand_vec_perm_pshufb
incorrectly thinks it can use vpshufb instruction for the permutations
when it can't.
The
          if (vmode == V32QImode)
            {
              /* vpshufb only works intra lanes, it is not
                 possible to shuffle bytes in between the lanes.  */
              for (i = 0; i < nelt; ++i)
                if ((d->perm[i] ^ i) & (nelt / 2))
                  return false;
            }
intra-lane check which is correct has been copied and adjusted for 64-byte
modes into:
          if (vmode == V64QImode)
            {
              /* vpshufb only works intra lanes, it is not
                 possible to shuffle bytes in between the lanes.  */
              for (i = 0; i < nelt; ++i)
                if ((d->perm[i] ^ i) & (nelt / 4))
                  return false;
            }
which is not correct, because 64-byte modes have 4 lanes rather than just
two and the above is only testing that the permutation grabs even lane elts
from even lanes and odd lane elts from odd lanes, but not that they are
from the same 256-bit half.

The following patch fixes it by using 3 * nelt / 4 instead of nelt / 4,
so we actually check the most significant 2 bits rather than just one.

2020-04-07  Jakub Jelinek  <jakub@redhat.com>

PR target/94509
* config/i386/i386.c (expand_vec_perm_pshufb): Fix the check
for inter-lane permutation for 64-byte modes.

* gcc.target/i386/avx512bw-pr94509-1.c: New test.
* gcc.target/i386/avx512bw-pr94509-2.c: New test.

(cherry picked from commit 14192f1ed48cb3982b1b3c794e0f313835d0cdcd)

aarch64: Fix {ash[lr],lshr}<mode>3 expanders [PR94488]

The following testcase ICEs on aarch64 apparently since the introduction of
the aarch64 port.  The reason is that the {ashl,ashr,lshr}<mode>3 expanders
completely unnecessarily FAIL; if operands[2] is something other than
a CONST_INT or REG or MEM and the middle-end code can't cope with the
pattern giving up in these cases.  All the expanders use general_operand
predicate for the shift amount operand, but then have just a special case
for CONST_INT (if in-bound, emit an immediate shift, otherwise force into
REG), or MEM (force into REG), or REG (that is the case it handles).
In the testcase, operands[2] is a lowpart SUBREG of a REG, which is valid
general_operand.
I don't see any reason what is magic about MEMs that it should be forced
into REG and others like SUBREGs that it shouldn't, there isn't even a
reason to check for !REG_P because force_reg will do nothing if the operand
is already a REG, and otherwise can handle general_operand just fine.

2020-04-07  Jakub Jelinek  <jakub@redhat.com>

PR target/94488
* config/aarch64/aarch64-simd.md (ashl<mode>3, lshr<mode>3,
ashr<mode>3): Force operands[2] into reg whenever it is not CONST_INT.
Assume it is a REG after that instead of testing it and doing FAIL
otherwise.  Formatting fix.

* gcc.c-torture/compile/pr94488.c: New test.

(cherry picked from commit 7f3ac38b3c765d49a46f65f1e5e9a812fb1da49c)

debug: Improve debug info of c++14 deduced return type [PR94459]

On the following testcase, in gdb ptype S<long>::m1 prints long as return
type, but all the other methods show void instead.
PR53756 added code to add_type_attribute if the return type is
auto/decltype(auto), but we actually should look through references,
pointers and qualifiers.
Haven't included there DW_TAG_atomic_type, because I think at least ATM
one can't use that in C++.  Not sure about DW_TAG_array_type or what else
could be deduced.

> http://eel.is/c++draft/dcl.spec.auto#3 says it has to appear as a
> decl-specifier.
>
> http://eel.is/c++draft/temp.deduct.type#8 lists the forms where a template
> argument can be deduced.
>
> Looks like you are missing arrays, pointers to members, and function return
> types.

2020-04-04  Hannes Domani  <ssbssa@yahoo.de>
    Jakub Jelinek  <jakub@redhat.com>

PR debug/94459
* dwarf2out.c (gen_subprogram_die): Look through references, pointers,
arrays, pointer-to-members, function types and qualifiers when
checking if in-class DIE had an 'auto' or 'decltype(auto)' return type
to emit type again on definition.

* g++.dg/debug/pr94459.C: New test.

Co-Authored-By: Hannes Domani <ssbssa@yahoo.de>
(cherry picked from commit b5039b7259e64a92f5c077fe4d023556d6b12550)

i386: Fix vph{add,subs?}[wd] 256-bit AVX2 RTL patterns [PR94460]

The following testcase is miscompiled, because the AVX2 patterns don't
describe correctly what the insn does.  E.g. vphaddd with %ymm* operands
(the second pattern) instruction as per:
https://software.intel.com/sites/landingpage/IntrinsicsGuide/#text=_mm256_hadd_epi32&expand=2941
does { a0+a1, a2+a3, b0+b1, b2+b3, a4+a5, a6+a7, b4+b5, b6+b7 }
but our RTL pattern did
     { a0+a1, a2+a3, a4+a5, a6+a7, b0+b1, b2+b3, b4+b5, b6+b7 }
where the first and last 64 bits are the same and two middle 64 bits
swapped.
https://software.intel.com/sites/landingpage/IntrinsicsGuide/#text=_mm256_hadd_epi16&expand=2939
similarly, insn does:
     { a0+a1, a2+a3, a4+a5, a6+a7, b0+b1, b2+b3, b4+b5, b6+b7,
       a8+a9, a10+a11, a12+a13, a14+a15, b8+b9, b10+b11, b12+b13, b14+b15 }
but RTL pattern did
     { a0+a1, a2+a3, a4+a5, a6+a7, a8+a9, a10+a11, a12+a13, a14+a15,
       b0+b1, b2+b3, b4+b5, b6+b7, b8+b9, b10+b11, b12+b13, b14+b15 }
again, first and last 64 bits are the same and the two middle 64 bits
swapped.

2020-04-03  Jakub Jelinek  <jakub@redhat.com>

PR target/94460
* config/i386/sse.md (avx2_ph<plusminus_mnemonic>wv16hi3,
avx2_ph<plusminus_mnemonic>dv8si3): Fix up RTL pattern to do
second half of first lane from first lane of second operand and
first half of second lane from second lane of first operand.

* gcc.target/i386/avx2-pr94460.c: New test.

(cherry picked from commit dbff1829843180dc2a6c8ce5ce7883146b5cf083)

objsz: Don't call replace_uses_by on SSA_NAME_OCCURS_IN_ABNORMAL_PHI [PR94423]

The following testcase ICEs because the objsz pass calls replace_uses_by
on SSA_NAME_OCCURS_IN_ABNORMAL_PHI SSA_NAME.  The following patch instead
of that calls replace_call_with_value, which will turn it into
  xyz_123(ab) = 234;

2020-04-01  Jakub Jelinek  <jakub@redhat.com>

PR middle-end/94423
* tree-object-size.c (pass_object_sizes::execute): Don't call
replace_uses_by for SSA_NAME_OCCURS_IN_ABNORMAL_PHI lhs, instead
call replace_call_with_value.

* gcc.dg/ubsan/pr94423.c: New test.

(cherry picked from commit 4486a537f14bc3b05ac552c3cbe18e540e397ed7)

fold-const: Fix division folding with vector operands [PR94412]

The following testcase is miscompiled since 4.9, we treat unsigned
vector types as if they were signed and "optimize" negations across it.

2020-03-31 Marc Glisse <marc.glisse@inria.fr>
Jakub Jelinek <jakub@redhat.com>

PR middle-end/94412
* fold-const.c (fold_binary_loc) <case TRUNC_DIV_EXPR>: Use
ANY_INTEGRAL_TYPE_P instead of INTEGRAL_TYPE_P.

* gcc.c-torture/execute/pr94412.c: New test.

Co-authored-by: Marc Glisse <marc.glisse@inria.fr>
(cherry picked from commit 8f99f9e6ccec167a5ba67dcc08e6c14948595b82)

Fix vextract* masked patterns [PR93069]

The AVX512F documentation clearly states that in instructions where the
destination is a memory only merging-masking is possible, not zero-masking,
and the assembler enforces that.

The testcase in this patch fails to assemble because of
Error: unsupported masking for `vextracti32x8'
on
vextracti32x8 $0x0, %zmm1, -64(%rsp){%k1}{z}
For the vector extraction patterns, we apparently have 7 *_maskm patterns
that only accept memory destinations and rtx_equal_p merge-masking source
for it, 7 *<mask_name> corresponding patterns that allow memory destination
only for the non-masked cases (through <store_mask_constraint>), then 2
*<mask_name> patterns (lo ssehalf V16FI and lo ssehalf VI8F_256 ones) which
do allow memory destination even for masked cases and are the cause of the
testsuite failure, because we must not allow C constraint if the destination
is m, and finally one pair of patterns (separate * and *_mask, hi ssehalf
VI4F_256), which has another issue (for which I don't have a testcase
though), where if it would match zero-masking with register destination,
it wouldn't emit the needed {z} into assembly.
The attached patch fixes those 3 issues only, perhaps more suitable for
backporting.

2020-03-30 Jakub Jelinek <jakub@redhat.com>

PR target/93069
* config/i386/sse.md (vec_extract_lo_<mode><mask_name>): Use
<store_mask_constraint> instead of m in output operand constraint.
(vec_extract_hi_<mode><mask_name>): Use <mask_operand2> instead of
%{%3%}.

* gcc.target/i386/avx512vl-pr93069.c: New test.
* gcc.dg/vect/pr93069.c: New test.

(cherry picked from commit 57e276f3e304ef92483763ee1028e5b3e1345e0f)

reassoc: Fix -fcompare-debug bug in reassociate_bb [PR94329]

The following testcase FAILs with -fcompare-debug, because reassociate_bb
mishandles the case when the last stmt in a bb has zero uses.  In that case
reassoc_remove_stmt (like gsi_remove) moves the iterator to the next stmt,
i.e. gsi_end_p is true, which means the code sets the iterator back to
gsi_last_bb.  The problem is that the for loop does gsi_prev on that before
handling the next statement, which means the former penultimate stmt, now
last one, is not processed by reassociate_bb.
Now, with -g, if there is at least one debug stmt at the end of the bb,
reassoc_remove_stmt moves the iterator to that following debug stmt and we
just do gsi_prev and continue with the former penultimate non-debug stmt,
now last non-debug stmt.

The following patch fixes that by not doing the gsi_prev in this case; there
are too many continue; cases, so I didn't want to copy over the gsi_prev to
all of them, so this patch uses a bool for that instead.  The second
gsi_end_p check isn't needed anymore, because when we don't do the
undesirable gsi_prev after gsi = gsi_last_bb, the loop !gsi_end_p (gsi)
condition will catch the removal of the very last stmt from a bb.

2020-03-28  Jakub Jelinek  <jakub@redhat.com>

PR tree-optimization/94329
* tree-ssa-reassoc.c (reassociate_bb): When calling reassoc_remove_stmt
on the last stmt in a bb, make sure gsi_prev isn't done immediately
after gsi_last_bb.

* gfortran.dg/pr94329.f90: New test.

(cherry picked from commit aa9c08ef97f4df1ebb1fc8d72f2e7f9f8c1045c2)

varasm: Fix output_constructor where a RANGE_EXPR index needs to skip some elts [PR94303]

The following testcase is miscompiled, because output_constructor doesn't
output the initializer correctly.  The FE creates {[1...2] = 9} in this
case, and we emit .long 9; long 9; .zero 8 instead of the expected
.zero 8; .long 9; .long 9.  If the CONSTRUCTOR is {[1] = 9, [2] = 9},
output_constructor_regular_field has code to notice that the current
location (local->total_bytes) is smaller than the location we want to write
to (1*sizeof(elt)) and will call assemble_zeros to skip those.  But
RANGE_EXPRs are handled by a different function which didn't do this,
so for RANGE_EXPRs we emitted them properly only if local->total_bytes
was always equal to the location where the RANGE_EXPR needs to start.

2020-03-25  Jakub Jelinek  <jakub@redhat.com>

PR middle-end/94303
* varasm.c (output_constructor_array_range): If local->index
RANGE_EXPR doesn't start at the current location in the constructor,
skip needed number of bytes using assemble_zeros or assert we don't
go backwards.

PR middle-end/94303
* g++.dg/torture/pr94303.C: New test.

(cherry picked from commit 56407bab53a514ffcd6ac011965cebdc5eb3ef54)

if-conv: Delete dead stmts backwards in ifcvt_local_dce [PR94283]

> > This patch caused:
> >
> > gcc /home/marxin/Programming/gcc/gcc/testsuite/gcc.c-torture/compile/990625-2.c -O3 -g -fno-tree-dce -c
> > during GIMPLE pass: ifcvt
> > /home/marxin/Programming/gcc/gcc/testsuite/gcc.c-torture/compile/990625-2.c: In function ‘broken030599’:
> > /home/marxin/Programming/gcc/gcc/testsuite/gcc.c-torture/compile/990625-2.c:2:1: internal compiler error: Segmentation fault
>
> Likely
>
>   /* Delete dead statements.  */
>   gsi = gsi_start_bb (bb);
>   while (!gsi_end_p (gsi))
>     {
>
> needs to instead work back-to-front for debug stmt adjustment to work

Indeed, that seems to work.

2020-03-25  Richard Biener  <rguenther@suse.de>
    Jakub Jelinek  <jakub@redhat.com>

PR debug/94283
* tree-if-conv.c (ifcvt_local_dce): Delete dead statements backwards.

* gcc.dg/pr94283.c: New test.

Co-authored-by: Richard Biener <rguenther@suse.de>
(cherry picked from commit 8ea7970c4968517fb73f42bcca40d316adacf215)

if-conv: Fix -fcompare-debug bugs in ifcvt_local_dce [PR94283]

The following testcase shows -fcompare-debug bugs in ifcvt_local_dce,
where the decisions what statements are needed is based also on debug stmt
operands, which is wrong.
So, this patch makes sure to never add debug stmt to the worklist, or never
add an assign to worklist just because it is used in a debug stmt in another
bb.

2020-03-24  Jakub Jelinek  <jakub@redhat.com>

PR debug/94283
* tree-if-conv.c (ifcvt_local_dce): For gimple debug stmts, just set
GF_PLF_2, but don't add them to worklist.  Don't add an assigment to
worklist or set GF_PLF_2 just because it is used in a debug stmt in
another bb.  Formatting improvements.

* gcc.target/i386/pr94283.c: New test.

(cherry picked from commit 4dcfd4e56b0d22af12750372f3e0b49249b1d473)

c++: Fix up handling of captured vars in lambdas in OpenMP clauses [PR93931]

Without the parser.c change we were ICEing on the testcase, because while the
uses of the captured vars inside of the constructs were replaced with capture
proxy decls, we didn't do that for decls in OpenMP clauses.

With that fixed, we don't ICE anymore, but the testcase is miscompiled and FAILs
at runtime.  This is because the capture proxy decls have DECL_VALUE_EXPR and
during gimplification we were gimplifying those to their DECL_VALUE_EXPRs.
That is fine for shared vars, but for privatized ones we must not do that.
So that is what the cp-gimplify.c changes do.  Had to add a DECL_CONTEXT check
before calling is_capture_proxy because some VAR_DECLs don't have DECL_CONTEXT
set (yet) and is_capture_proxy relies on that being non-NULL always.

2020-03-19  Jakub Jelinek  <jakub@redhat.com>

PR c++/93931
* parser.c (cp_parser_omp_var_list_no_open): Call process_outer_var_ref
on outer_automatic_var_p decls.
* cp-gimplify.c (cxx_omp_disregard_value_expr): Return true also for
capture proxy decls.

* testsuite/libgomp.c++/pr93931.C: New test.

(cherry picked from commit 484206967f958fc47827a71654fe52a98adc95cb)

phiopt: Avoid -fcompare-debug bug in phiopt [PR94211]

Two years ago, I've added support for up to 2 simple preparation statements
in value_replacement, but the
-      && estimate_num_insns (assign, &eni_time_weights)
+      && estimate_num_insns (bb_seq (middle_bb), &eni_time_weights)
change, meant that we compute the cost of all those statements rather than
just the single assign that has been the single supported non-debug
statement in the bb before, doesn't do what I thought would do, gimple_seq
is just gimple * and thus it can't be really overloaded depending on whether
we pass a single gimple * or a whole sequence.  Which means in the last
two years it doesn't count all the statements, but only the first one.
With -g that happens to be a DEBUG_STMT, or it could be e.g. the first
preparation statement which could be much cheaper than the actual assign.

2020-03-19  Jakub Jelinek  <jakub@redhat.com>

PR tree-optimization/94211
* tree-ssa-phiopt.c (value_replacement): Use estimate_num_insns_seq
instead of estimate_num_insns for bb_seq (middle_bb).  Rename
emtpy_or_with_defined_p variable to empty_or_with_defined_p, adjust
all uses.

* gcc.dg/pr94211.c: New test.

(cherry picked from commit 8db876e9c045c57d2dc5bd08a6e250f822efaad0)

c: Handle C_TYPE_INCOMPLETE_VARS even for ENUMERAL_TYPEs [PR94172]

The following testcases ICE, because they contain extern variable
declarations with incomplete enum types that is later completed and after
that those variables are accessed.  The ICEs are because the vars then may have
incorrect DECL_MODE etc., e.g. in the first case the var has SImode
DECL_MODE (the guessed mode for the enum), but the enum then actually has
DImode because its enumerators don't fit into unsigned int.

The following patch fixes it by using C_TYPE_INCOMPLETE_VARS not just on
incomplete struct/union types, but also incomplete enum types.
TYPE_VFIELD can't be used as it is TYPE_MIN_VALUE on ENUMERAL_TYPE,
thankfully TYPE_LANG_SLOT_1 has been used in the C FE only on
FUNCTION_TYPEs.

2020-03-17  Jakub Jelinek  <jakub@redhat.com>

PR c/94172
* c-tree.h (C_TYPE_INCOMPLETE_VARS): Define to TYPE_LANG_SLOT_1
instead of TYPE_VFIELD, and support it on {RECORD,UNION,ENUMERAL}_TYPE.
(TYPE_ACTUAL_ARG_TYPES): Check that it is only used on FUNCTION_TYPEs.
* c-decl.c (pushdecl): Push C_TYPE_INCOMPLETE_VARS also to
ENUMERAL_TYPEs.
(finish_incomplete_vars): New function, moved from finish_struct.  Use
relayout_decl instead of layout_decl.
(finish_struct): Remove obsolete comment about C_TYPE_INCOMPLETE_VARS
being TYPE_VFIELD.  Use finish_incomplete_vars.
(finish_enum): Clear C_TYPE_INCOMPLETE_VARS.  Call
finish_incomplete_vars.
* c-typeck.c (c_build_qualified_type): Clear C_TYPE_INCOMPLETE_VARS
also on ENUMERAL_TYPEs.

* gcc.dg/pr94172-1.c: New test.
* gcc.dg/pr94172-2.c: New test.

(cherry picked from commit 87ce34fa00cd6b87452d747235da40dfe5b6e00f)

c++: Fix parsing of invalid enum specifiers [PR90995]

The testcase shows some accepts-invalid (the ones without alignas) and
ice-on-invalid-code (the ones with alignas) cases.
If the enum doesn't have an underlying type and is not a definition,
the caller retries to parse it as elaborated type specifier.
E.g. for enum struct S s it will then pedwarn that elaborated type specifier
shouldn't have the struct/class keywords.
The problem is if the enum specifier is not followed by { when it has
underlying type.  In that case we have already called
cp_parser_parse_definitely to end the tentative parsing started at the
beginning of cp_parser_enum_specifier.  But the
cp_parser_error (parser, "expected %<;%> or %<{%>");
doesn't emit any error because the whole function is called from yet another
tentative parse and the caller starts parsing the elaborated type
specifier where the cp_parser_enum_specifier stopped (i.e. after the
underlying type token(s)).  The ultimate caller than commits the tentative
parsing (and even if it wouldn't, it wouldn't know what kind of error
to report).  I think after seeing enum {,struct,class} : type not being
followed by { or ;, there is no reason not to report it right away, as it
can't be valid C++, which is what the patch does.  Not sure if we shouldn't
also return error_mark_node instead of NULL_TREE, so that the caller doesn't
try to parse it as elaborated type specifier (the patch doesn't do that
right now).

Furthermore, while reading the code, I've noticed that
parser->colon_corrects_to_scope_p is saved and set to false at the start
of the function, but not restored back in some cases.  Don't have a testcase
where this would be a problem, but it just seems wrong.  Either we can in
the two spots replace return NULL_TREE; with { type = NULL_TREE; goto out; }
or we could perhaps abuse warning_sentinel or create a special class with
dtor to clean the flag up.

And lastly, I've fixed some formatting issues in the function while reading
it.

2020-03-17  Jakub Jelinek  <jakub@redhat.com>

PR c++/90995
* parser.c (cp_parser_enum_specifier): Use temp_override for
parser->colon_corrects_to_scope_p, replace goto out with return.
If scoped enum or enum with underlying type is not followed by
{ or ;, call cp_parser_commit_to_tentative_parse before calling
cp_parser_error and make sure to return error_mark_node instead of
NULL_TREE.  Formatting fixes.

* g++.dg/cpp0x/enum40.C: New test.

(cherry picked from commit 980a7a0be5a114e285c49ab05ac70881e4f27fc3)

tree-inline: Fix a -fcompare-debug issue in the inliner [PR94167]

The following testcase fails with -fcompare-debug.  The problem is that
bar is marked as address_taken only with -g and not without.
I've tracked it down to insert_init_stmt calling gimple_regimplify_operands
even on DEBUG_STMTs.  That function will just insert normal stmts before
the DEBUG_STMT if the DEBUG_STMT operand isn't gimple val or invariant.
While DCE will turn those statements into debug temporaries, it can cause
differences in SSA_NAMEs and more importantly, the ipa references are
generated from those before the DCE happens.
On the testcase, the DEBUG_STMT value is (int)bar.

We could generate DEBUG_STMTs with debug temporaries instead, but I fail to
see the reason to do that, DEBUG_STMTs allow other expressions and all we
want to ensure is that the expressions aren't too large (arbitrarily
complex), but during inlining/function versioning I don't see why something
would queue a DEBUG_STMT with arbitrarily complex expressions in there.

2020-03-16  Jakub Jelinek  <jakub@redhat.com>

PR debug/94167
* tree-inline.c (insert_init_stmt): Don't gimple_regimplify_operands
DEBUG_STMTs.

* gcc.dg/pr94167.c: New test.

(cherry picked from commit 378e830538afd4a02e41674cc9161fa59b5e09a9)

tree-nested: Fix handling of *reduction clauses with C array sections [PR93566]

tree-nested.c didn't handle C array sections in {,task_,in_}reduction clauses.

2020-03-14 Jakub Jelinek <jakub@redhat.com>

PR middle-end/93566
* tree-nested.c (convert_nonlocal_omp_clauses,
convert_local_omp_clauses): Handle {,in_,task_}reduction clauses
with C/C++ array sections.

* testsuite/libgomp.c/pr93566.c: New test.

(cherry picked from commit a8fc40fd551a60a97efbfe3fee08721accd80964)

aarch64: Fix another bug in aarch64_add_offset_1 [PR94121]

> I'm getting this ICE with -mabi=ilp32:
>
> during RTL pass: fwprop1
> /opt/gcc/gcc-20200312/gcc/testsuite/gcc.dg/pr94121.c: In function 'bar':
> /opt/gcc/gcc-20200312/gcc/testsuite/gcc.dg/pr94121.c:16:1: internal compiler error: in decompose, at rtl.h:2279

That is a preexisting issue, caused by another bug in the same function.
When mode is SImode and moffset is 0x80000000 (or anything else with the
bit 31 set), we need to sign-extend it.

2020-03-13 Jakub Jelinek <jakub@redhat.com>

PR target/94121
* config/aarch64/aarch64.c (aarch64_add_offset_1): Use gen_int_mode
instead of GEN_INT.

(cherry picked from commit c2f836c413b1e9ae45598338b4a2ecd33bd926fb)

maintainer-scripts: Fix up gcc_release without -l, where mkdir was using umask 077 after migration

2020-03-12 Jakub Jelinek <jakub@redhat.com>

* gcc_release (upload_files): Without -l, pass -m 755 to the mkdir
command invoked through ssh.

(cherry picked from commit 3739894d0cfc88b6d84134b827f33b31d646d32a)

doc: Fix up ASM_OUTPUT_ALIGNED_DECL_LOCAL description

When looking into PR94134, I've noticed bugs in the
ASM_OUTPUT_ALIGNED_DECL_LOCAL documentation.  varasm.c has:
  #if defined ASM_OUTPUT_ALIGNED_DECL_LOCAL
    unsigned int align = symtab_node::get (decl)->definition_alignment ();
    ASM_OUTPUT_ALIGNED_DECL_LOCAL (asm_out_file, decl, name,
                                   size, align);
    return true;
  #elif defined ASM_OUTPUT_ALIGNED_LOCAL
    unsigned int align = symtab_node::get (decl)->definition_alignment ();
    ASM_OUTPUT_ALIGNED_LOCAL (asm_out_file, name, size, align);
    return true;
  #else
    ASM_OUTPUT_LOCAL (asm_out_file, name, size, rounded);
    return false;
  #endif
and the ASM_OUTPUT_ALIGNED_LOCAL documentation properly mentions:
Like @code{ASM_OUTPUT_LOCAL} and mentions the same macro in another place.
The ASM_OUTPUT_ALIGNED_DECL_LOCAL description mentions non-existing macros
ASM_OUTPUT_ALIGNED_DECL and ASM_OUTPUT_DECL instead of the right ones
ASM_OUTPUT_ALIGNED_LOCAL and ASM_OUTPUT_LOCAL.

2020-03-12  Jakub Jelinek  <jakub@redhat.com>

* doc/tm.texi.in (ASM_OUTPUT_ALIGNED_DECL_LOCAL): Change
ASM_OUTPUT_ALIGNED_DECL in description to ASM_OUTPUT_ALIGNED_LOCAL
and ASM_OUTPUT_DECL to ASM_OUTPUT_LOCAL.
* doc/tm.texi: Regenerated.

(cherry picked from commit 9a8af207d7d03149a438185a2a0c50eeeb96a402)

tree-dse: Fix mem* head trimming if call has lhs [PR94130]

As the testcase shows, if DSE decides to head trim {mem{set,cpy,move},strncpy}
and the call has lhs, it is incorrect to leave the lhs as is, because it
will then point to the adjusted address (base + head_trim) instead of the
original base.
The following patch fixes that by dropping the lhs of the call and assigning
lhs the original base in a following statement.

2020-03-12 Jakub Jelinek <jakub@redhat.com>

PR tree-optimization/94130
* tree-ssa-dse.c: Include gimplify.h.
(increment_start_addr): If stmt has lhs, drop the lhs from call and
set it after the call to the original value of the first argument.
Formatting fixes.
(decrement_count): Formatting fix.

* gcc.c-torture/execute/pr94130.c: New test.

(cherry picked from commit a545ffafa380fa958393e1dfbf7f5f8f129bc5cf)

pdp11: Fix handling of common (local and global) vars [PR94134]

As mentioned in the PR, the generic code decides to put the a variable into
lcomm_section, which is a NOSWITCH section and thus the generic code doesn't
switch into a particular section before using
ASM_OUTPUT{_ALIGNED{,_DECL}_}_LOCAL, on many targets that results just in
.lcomm (or for non-local .comm) directives which don't need a switch to some
section, other targets put switch_to_section (bss_section) at the start of
that macro.
pdp11 doesn't do that (and doesn't have bss_section), and so emits the
lcomm/comm variables in whatever section is current (it has only .text/.data
and for DEC assembler rodata).

The following patch fixes that by putting it always into data section, and
additionally avoids emitting an empty line in the assembly for the lcomm
vars.

2020-03-11 Jakub Jelinek <jakub@redhat.com>

PR target/94134
* config/pdp11/pdp11.c (pdp11_asm_output_var): Call switch_to_section
at the start to switch to data section. Don't print extra newline if
.globl directive has not been emitted.

* gcc.c-torture/execute/pr94134.c: New test.

(cherry picked from commit f1125cf88ac0c97d819e4f81d556fbcd1161270e)

aarch64: Fix ICE in aarch64_add_offset_1 [PR94121]

abs_hwi asserts that the argument is not HOST_WIDE_INT_MIN and as the
(invalid) testcase shows, the function can be called with such an offset.
The following patch is IMHO minimal fix, absu_hwi unlike abs_hwi allows even
that value and will return (unsigned HOST_WIDE_INT) HOST_WIDE_INT_MIN
in that case.  The function then uses moffset in two spots which wouldn't
care if the value is (unsigned HOST_WIDE_INT) HOST_WIDE_INT_MIN or
HOST_WIDE_INT_MIN and wouldn't accept it (!moffset and
aarch64_uimm12_shift (moffset)), then in one spot where the signedness of
moffset does matter and using unsigned is the right thing -
moffset < 0x1000000 - and finally has code which will handle even this
value right; the assembler doesn't really care for DImode immediates if
        mov     x1, -9223372036854775808
or
        mov     x1, 9223372036854775808
is used and similarly it doesn't matter if we add or sub it in DImode.

2020-03-11  Jakub Jelinek  <jakub@redhat.com>

PR target/94121
* config/aarch64/aarch64.c (aarch64_add_offset_1): Use absu_hwi
instead of abs_hwi, change moffset type to unsigned HOST_WIDE_INT.

* gcc.dg/pr94121.c: New test.

(cherry picked from commit a644079a702a6228df2ffaace1d88a5f74e4bb9f)

dfp: Fix decimal_to_binary [PR94111]

As e.g. decimal_from_decnumber shows, the REAL_VALUE_TYPE representation
contains a decimal128 embedded in ->sig only if it is rvc_normal, for
other kinds like rvc_inf or rvc_nan, ->sig is ignored and everything is
contained in the REAL_VALUE_TYPE flags (cl, sign, signalling and decimal).
decimal_to_binary which is used when folding a decimal{32,64,128} constant
to a binary floating point type ignores this and thus folds infinities and
NaNs into +0.0.
The following patch fixes that by only doing that for rvc_normal.
Similarly to the binary to decimal folding, it goes through a string, in
order to e.g. deal with canonical NaN mantissas, or binary float formats
that don't support infinities and/or NaNs.

2020-03-11 Jakub Jelinek <jakub@redhat.com>

PR middle-end/94111
* dfp.c (decimal_to_binary): Only use decimal128ToString if from->cl
is rvc_normal, otherwise use real_to_decimal to print the number to
string.

* gcc.dg/dfp/pr94111.c: New test.

(cherry picked from commit 343c467ccdc24edb9acd7c60d54914d9656ab499)

ldist: Further fixes for -ftrapv [PR94114]

As the testcase shows, arithmetics that for -ftrapv would need multiple
basic blocks can show up not just in nb_bytes expressions where we
are calling rewrite_to_non_trapping_overflow for a while already,
but also in the pointer expression to the start of the region.
While the testcase covers just the first hunk and I've failed to create
a testcase for the latter, it is at least in theory possible too, so I've
adjusted that hunk too.

2020-03-11 Jakub Jelinek <jakub@redhat.com>

PR tree-optimization/94114
* tree-loop-distribution.c (generate_memset_builtin): Call
rewrite_to_non_trapping_overflow even on mem.
(generate_memcpy_builtin): Call rewrite_to_non_trapping_overflow even
on dest and src.

* gcc.dg/pr94114.c: New test.

(cherry picked from commit 2fd27691f213f2e808626c4cd492b00c801a00fa)

print-rtl: Fix printing of CONST_STRING in DEBUG_INSNs [PR93399]

The following testcase fails to assemble, as CONST_STRING in the DEBUG_INSNs
is printed as is, so if it contains \n and/or \r, we are in trouble:
        .loc 1 14 3
        # DEBUG haystack => [si]
        # DEBUG needle => "
"
In the gimple dumps we print those (STRING_CSTs) as
  # DEBUG haystack => D#1
  # DEBUG needle => "\n"
so this patch uses what we use in tree printing for the CONST_STRINGs too.

2020-03-05  Jakub Jelinek  <jakub@redhat.com>

PR middle-end/93399
* tree-pretty-print.h (pretty_print_string): Declare.
* tree-pretty-print.c (pretty_print_string): Remove forward
declaration, no longer static.  Change nbytes parameter type
from unsigned to size_t.
* print-rtl.c (print_value) <case CONST_STRING>: Use
pretty_print_string and for shrink way too long strings.

* gcc.dg/pr93399.c: New test.

(cherry picked from commit e0d6777cda966b04fc47d544c09839c4fa94343c)

inliner: Copy DECL_BY_REFERENCE in copy_decl_to_var [PR93888]

In the following testcase we emit wrong debug info for the karg
parameter in the DW_TAG_inlined_subroutine into main.
The problem is that the karg PARM_DECL is DECL_BY_REFERENCE and thus
in the IL has const K & type, but in the source just const K.
When the function is inlined, we create a VAR_DECL for it, but don't
set DECL_BY_REFERENCE, so when emitting DW_AT_location, we treat it like
a const K & typed variable, but it has DW_AT_abstract_origin which has
just the const K type and thus the debugger thinks the variable has
const K type.

Fixed by copying the DECL_BY_REFERENCE flag. Not doing it in
copy_decl_for_dup_finish, because copy_decl_no_change already copies
that flag through copy_node and in copy_result_decl_to_var it is
undesirable, as we handle DECL_BY_REFERENCE in that case instead
by changing the type.

2020-03-04 Jakub Jelinek <jakub@redhat.com>

PR debug/93888
* tree-inline.c (copy_decl_to_var): Copy DECL_BY_REFERENCE flag.

* g++.dg/guality/pr93888.C: New test.

(cherry picked from commit d2a810ee83e2952bf351498cecf8f5db28860a24)

i386: Fix some -O0 avx2intrin.h and xopintrin.h intrinsic macros [PR94046]

As the testcases show, the macros we have for -O0 for intrinsics that require
constant argument(s) should first cast the argument to the type the -O1+
inline uses and afterwards to whatever type e.g. a builtin needs.
The PR reported one which violated this, and I've grepped for all double-casts
and grepped out from that meaningful casts where the __m{128,256,512}{,d,i}
first cast is cast to same sized __v* type and has the same kind of element
type (float, double, integral). These 7 macros were using different casts,
and I've double checked them against the inline function types.

2020-03-05 Jakub Jelinek <jakub@redhat.com>

PR target/94046
* config/i386/avx2intrin.h (_mm_mask_i32gather_ps): Fix first cast of
SRC and MASK arguments to __m128 from __m128d.
(_mm256_mask_i32gather_ps): Fix first cast of MASK argument to __m256
from __m256d.
(_mm_mask_i64gather_ps): Fix first cast of MASK argument to __m128
from __m128d.
* config/i386/xopintrin.h (_mm_permute2_pd): Fix first cast of C
argument to __m128i from __m128d.
(_mm256_permute2_pd): Fix first cast of C argument to __m256i from
__m256d.
(_mm_permute2_ps): Fix first cast of C argument to __m128i from __m128.
(_mm256_permute2_ps): Fix first cast of C argument to __m256i from
__m256.

* g++.dg/ext/pr94046-1.C: New test.
* g++.dg/ext/pr94046-2.C: New test.

(cherry picked from commit 07d52e63d999a0a10c7598c34c48365a357d3d5a)

explow: Fix ICE caused by plus_constant [PR94002]

The following testcase ICEs in cross to riscv64-linux.  The problem is
that we have a DImode integral constant (that doesn't fit into SImode),
which is pushed into a constant pool and later access just the first half of
it using a MEM.  When plus_constant is called on such a MEM, if the constant
has mode, we verify the mode, but if it doesn't, we don't and ICE later on
when we think the CONST_INT is a valid SImode constant.

2020-03-03  Jakub Jelinek  <jakub@redhat.com>

PR rtl-optimization/94002
* explow.c (plus_constant): Punt if cst has VOIDmode and
get_pool_mode is different from mode.

* gcc.dg/pr94002.c: New test.

(cherry picked from commit e913d4f4771e04d4254bf6c0e720fec5e324a898)

Daily bump.

[PATCH, rs6000] Fix vector long long subtype (PR96139)

Hi,
This corrects an issue with the powerpc vector long long subtypes.
As reported by SjMunroe, when building some code with -Wall, and
attempting to print an element of a "long long vector" with a
long long printf format string, we will report an error because
the vector sub-type was improperly defined as int.

When defining a V2DI_type_node we use a TARGET_POWERPC64 ternary to
define the V2DI_type_node with "vector long" or "vector long long".
We also need to specify the proper sub-type when we define the type.

Due to some file renames, This is a backport and rework of both
    [PATCH, rs6000] Fix vector long long subtype (PR96139)
and
    [PATCH, rs6000] Testsuite fixup pr96139 tests

PR target/96139

gcc/ChangeLog:
* config/rs6000/rs6000.c (rs6000_init_builtin): Update V2DI_type_node
  and unsigned_V2DI_type_node definitions.

gcc/testsuite/ChangeLog:
* gcc.target/powerpc/pr96139-a.c: New test.
* gcc.target/powerpc/pr96139-b.c: New test.
* gcc.target/powerpc/pr96139-c.c: New test.

add intrinsics for vld1(q)_x4 and vst1(q)_x4

This patch adds the intrinsic functions for:
- vld1_<mode>_x4
- vst1_<mode>_x4
- vld1q_<mode>_x4
- vst1q_<mode>_x4

Bootstrapped and tested on aarch64-none-linux-gnu.

Committed on behalf of Sylvia Taylor.

2019-08-22 Sylvia Taylor <sylvia.taylor@arm.com>

gcc/
* config/aarch64/aarch64-simd-builtins.def:
(ld1x4): New.
(st1x4): Likewise.
* config/aarch64/aarch64-simd.md:
(aarch64_ld1x4<VALLDIF:mode>): New pattern.
(aarch64_st1x4<VALLDIF:mode>): Likewise.
(aarch64_ld1_x4_<mode>): Likewise.
(aarch64_st1_x4_<mode>): Likewise.
* config/aarch64/arm_neon.h:
(vld1_s8_x4): New function.
(vld1q_s8_x4): Likewise.
(vld1_s16_x4): Likewise.
(vld1q_s16_x4): Likewise.
(vld1_s32_x4): Likewise.
(vld1q_s32_x4): Likewise.
(vld1_u8_x4): Likewise.
(vld1q_u8_x4): Likewise.
(vld1_u16_x4): Likewise.
(vld1q_u16_x4): Likewise.
(vld1_u32_x4): Likewise.
(vld1q_u32_x4): Likewise.
(vld1_f16_x4): Likewise.
(vld1q_f16_x4): Likewise.
(vld1_f32_x4): Likewise.
(vld1q_f32_x4): Likewise.
(vld1_p8_x4): Likewise.
(vld1q_p8_x4): Likewise.
(vld1_p16_x4): Likewise.
(vld1q_p16_x4): Likewise.
(vld1_s64_x4): Likewise.
(vld1_u64_x4): Likewise.
(vld1_p64_x4): Likewise.
(vld1q_s64_x4): Likewise.
(vld1q_u64_x4): Likewise.
(vld1q_p64_x4): Likewise.
(vld1_f64_x4): Likewise.
(vld1q_f64_x4): Likewise.
(vst1_s8_x4): Likewise.
(vst1q_s8_x4): Likewise.
(vst1_s16_x4): Likewise.
(vst1q_s16_x4): Likewise.
(vst1_s32_x4): Likewise.
(vst1q_s32_x4): Likewise.
(vst1_u8_x4): Likewise.
(vst1q_u8_x4): Likewise.
(vst1_u16_x4): Likewise.
(vst1q_u16_x4): Likewise.
(vst1_u32_x4): Likewise.
(vst1q_u32_x4): Likewise.
(vst1_f16_x4): Likewise.
(vst1q_f16_x4): Likewise.
(vst1_f32_x4): Likewise.
(vst1q_f32_x4): Likewise.
(vst1_p8_x4): Likewise.
(vst1q_p8_x4): Likewise.
(vst1_p16_x4): Likewise.
(vst1q_p16_x4): Likewise.
(vst1_s64_x4): Likewise.
(vst1_u64_x4): Likewise.
(vst1_p64_x4): Likewise.
(vst1q_s64_x4): Likewise.
(vst1q_u64_x4): Likewise.
(vst1q_p64_x4): Likewise.
(vst1_f64_x4): Likewise.
(vst1q_f64_x4): Likewise.

gcc/testsuite/
* gcc.target/aarch64/advsimd-intrinsics/vld1x4.c: New test.
* gcc.target/aarch64/advsimd-intrinsics/vst1x4.c: New test.

(cherry picked from commit 391625888d4d97f9016ab9ac04acc55d81f0c26f)

Patch implementing vld1_*_x3, vst1_*_x2 and vst1_*_x3 intrinsics for AARCH64 for all types.

2018-05-31 Sameera Deshpande <sameera.deshpande@linaro.org>

gcc/
* config/aarch64/aarch64-simd-builtins.def (ld1x3): New.
(st1x2): Likewise.
(st1x3): Likewise.
* config/aarch64/aarch64-simd.md
(aarch64_ld1x3<VALLDIF:mode>): New pattern.
(aarch64_ld1_x3_<mode>): Likewise
(aarch64_st1x2<VALLDIF:mode>): Likewise
(aarch64_st1_x2_<mode>): Likewise
(aarch64_st1x3<VALLDIF:mode>): Likewise
(aarch64_st1_x3_<mode>): Likewise
* config/aarch64/arm_neon.h (vld1_u8_x3): New function.
(vld1_s8_x3): Likewise.
(vld1_u16_x3): Likewise.
(vld1_s16_x3): Likewise.
(vld1_u32_x3): Likewise.
(vld1_s32_x3): Likewise.
(vld1_u64_x3): Likewise.
(vld1_s64_x3): Likewise.
(vld1_f16_x3): Likewise.
(vld1_f32_x3): Likewise.
(vld1_f64_x3): Likewise.
(vld1_p8_x3): Likewise.
(vld1_p16_x3): Likewise.
(vld1_p64_x3): Likewise.
(vld1q_u8_x3): Likewise.
(vld1q_s8_x3): Likewise.
(vld1q_u16_x3): Likewise.
(vld1q_s16_x3): Likewise.
(vld1q_u32_x3): Likewise.
(vld1q_s32_x3): Likewise.
(vld1q_u64_x3): Likewise.
(vld1q_s64_x3): Likewise.
(vld1q_f16_x3): Likewise.
(vld1q_f32_x3): Likewise.
(vld1q_f64_x3): Likewise.
(vld1q_p8_x3): Likewise.
(vld1q_p16_x3): Likewise.
(vld1q_p64_x3): Likewise.
(vst1_s64_x2): Likewise.
(vst1_u64_x2): Likewise.
(vst1_f64_x2): Likewise.
(vst1_s8_x2): Likewise.
(vst1_p8_x2): Likewise.
(vst1_s16_x2): Likewise.
(vst1_p16_x2): Likewise.
(vst1_s32_x2): Likewise.
(vst1_u8_x2): Likewise.
(vst1_u16_x2): Likewise.
(vst1_u32_x2): Likewise.
(vst1_f16_x2): Likewise.
(vst1_f32_x2): Likewise.
(vst1_p64_x2): Likewise.
(vst1q_s8_x2): Likewise.
(vst1q_p8_x2): Likewise.
(vst1q_s16_x2): Likewise.
(vst1q_p16_x2): Likewise.
(vst1q_s32_x2): Likewise.
(vst1q_s64_x2): Likewise.
(vst1q_u8_x2): Likewise.
(vst1q_u16_x2): Likewise.
(vst1q_u32_x2): Likewise.
(vst1q_u64_x2): Likewise.
(vst1q_f16_x2): Likewise.
(vst1q_f32_x2): Likewise.
(vst1q_f64_x2): Likewise.
(vst1q_p64_x2): Likewise.
(vst1_s64_x3): Likewise.
(vst1_u64_x3): Likewise.
(vst1_f64_x3): Likewise.
(vst1_s8_x3): Likewise.
(vst1_p8_x3): Likewise.
(vst1_s16_x3): Likewise.
(vst1_p16_x3): Likewise.
(vst1_s32_x3): Likewise.
(vst1_u8_x3): Likewise.
(vst1_u16_x3): Likewise.
(vst1_u32_x3): Likewise.
(vst1_f16_x3): Likewise.
(vst1_f32_x3): Likewise.
(vst1_p64_x3): Likewise.
(vst1q_s8_x3): Likewise.
(vst1q_p8_x3): Likewise.
(vst1q_s16_x3): Likewise.
(vst1q_p16_x3): Likewise.
(vst1q_s32_x3): Likewise.
(vst1q_s64_x3): Likewise.
(vst1q_u8_x3): Likewise.
(vst1q_u16_x3): Likewise.
(vst1q_u32_x3): Likewise.
(vst1q_u64_x3): Likewise.
(vst1q_f16_x3): Likewise.
(vst1q_f32_x3): Likewise.
(vst1q_f64_x3): Likewise.
(vst1q_p64_x3): Likewise.

gcc/testsuite/
* gcc.target/aarch64/advsimd-intrinsics/vld1x3.c: New test for
vld1x3 intrinsics for aarch64.
* gcc.target/aarch64/advsimd-intrinsics/vst1x2.c: New test for
vst1x2 intrinsics for aarch64.
* gcc.target/aarch64/advsimd-intrinsics/vst1x3.c: New test for
vst1x3 intrinsics for aarch64.

(cherry picked from commit 568421baa5a4cdb7bb7c5ac323c939492ee3f052)

Daily bump.

libstdc++: Fix is_trivially_constructible (PR 94033)

This attempts to make is_nothrow_constructible more robust (and
efficient to compile) by not depending on is_constructible. Instead the
__is_constructible intrinsic is used directly. The helper class
__is_nt_constructible_impl which checks whether the construction is
non-throwing now takes a bool template parameter that is substituted by
the result of the instrinsic. This fixes the reported bug by not using
the already-instantiated (and incorrect) value of std::is_constructible.
I don't think it really fixes the problem in general, because
std::is_nothrow_constructible itself could already have been
instantiated in a context where it gives the wrong result. A proper fix
needs to be done in the compiler.

Backported to the gcc-8 and gcc-9 branches to fix PR 96999.

PR libstdc++/94033
* include/std/type_traits (__is_nt_default_constructible_atom): Remove.
(__is_nt_default_constructible_impl): Remove.
(__is_nothrow_default_constructible_impl): Remove.
(__is_nt_constructible_impl): Add bool template parameter. Adjust
partial specializations.
(__is_nothrow_constructible_impl): Replace class template with alias
template.
(is_nothrow_default_constructible): Derive from alias template
__is_nothrow_constructible_impl instead of
__is_nothrow_default_constructible_impl.
* testsuite/20_util/is_nothrow_constructible/94003.cc: New test.

(cherry picked from commit b3341826531e80e02f194460b4fbe1b0541c0463)

libstdc++: Enable assertions in constexpr string_view members [PR 71960]

There is no longer any reason we can't just use __glibcxx_assert in
constexpr functions. As long as the condition is true, there will be no
call to std::__replacement_assert that would make the function
ineligible for constant evaluation.

PR libstdc++/71960
* include/experimental/string_view (basic_string_view):
Enable debug assertions.
* include/std/string_view (basic_string_view):
Likewise.

Daily bump.

PR fortran/96890 - Wrong answer with intrinsic IALL

The IALL intrinsic would always return 0 when the DIM and MASK arguments
were present since the initial value of repeated BIT-AND operations was
set to 0 instead of -1.

libgfortran/ChangeLog:

* m4/iall.m4: Initial value for result should be -1.
* generated/iall_i1.c (miall_i1): Generated.
* generated/iall_i16.c (miall_i16): Likewise.
* generated/iall_i2.c (miall_i2): Likewise.
* generated/iall_i4.c (miall_i4): Likewise.
* generated/iall_i8.c (miall_i8): Likewise.

gcc/testsuite/ChangeLog:

* gfortran.dg/iall_masked.f90: New test.

(cherry picked from commit 8eeeecbcc17041fdfd3ccc928161ae86e7f9b456)

Daily bump.

Update links to Arm docs

gcc/
* doc/extend.texi: Update links to Arm docs.
* doc/invoke.texi: Likewise.

(cherry picked from commit 09698e44c766c4a05ee463d2e36bc1fdac21dce4)
(cherry picked from commit 0fc33daacbdf993ab0d5830b0af3468b0df7c187)

AArch64: Fix hwasan failure in readline.

My previous fix added an unchecked call to fgets in the new function readline.
fgets can fail when there's an error reading the file in which case it returns
NULL. It also returns NULL when the next character is EOF.

The EOF case is already covered by the existing code but the error case isn't.
This fixes it by returning the empty string on error.

Also I now use strnlen instead of strlen to make sure we never read outside the
buffer.

This was flagged by Matthew Malcomson during his hwasan work.

gcc/ChangeLog:

* config/aarch64/driver-aarch64.c (readline): Check return value fgets.

(cherry picked from commit 341573406b392f4d57e052ce22f80e85a7c479e9)

AArch64: Add test for -mcpu=native

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/cpunative/aarch64-cpunative.exp: New file.
* gcc.target/aarch64/cpunative/info_0: New test.
* gcc.target/aarch64/cpunative/info_1: New test.
* gcc.target/aarch64/cpunative/info_10: New test.
* gcc.target/aarch64/cpunative/info_11: New test.
* gcc.target/aarch64/cpunative/info_12: New test.
* gcc.target/aarch64/cpunative/info_13: New test.
* gcc.target/aarch64/cpunative/info_14: New test.
* gcc.target/aarch64/cpunative/info_15: New test.
* gcc.target/aarch64/cpunative/info_2: New test.
* gcc.target/aarch64/cpunative/info_3: New test.
* gcc.target/aarch64/cpunative/info_4: New test.
* gcc.target/aarch64/cpunative/info_5: New test.
* gcc.target/aarch64/cpunative/info_6: New test.
* gcc.target/aarch64/cpunative/info_7: New test.
* gcc.target/aarch64/cpunative/info_8: New test.
* gcc.target/aarch64/cpunative/info_9: New test.
* gcc.target/aarch64/cpunative/native_cpu_0.c: New test.
* gcc.target/aarch64/cpunative/native_cpu_1.c: New test.
* gcc.target/aarch64/cpunative/native_cpu_10.c: New test.
* gcc.target/aarch64/cpunative/native_cpu_13.c: New test.
* gcc.target/aarch64/cpunative/native_cpu_14.c: New test.
* gcc.target/aarch64/cpunative/native_cpu_2.c: New test.
* gcc.target/aarch64/cpunative/native_cpu_3.c: New test.
* gcc.target/aarch64/cpunative/native_cpu_4.c: New test.
* gcc.target/aarch64/cpunative/native_cpu_5.c: New test.
* gcc.target/aarch64/cpunative/native_cpu_6.c: New test.
* gcc.target/aarch64/cpunative/native_cpu_7.c: New test.
* gcc.target/aarch64/cpunative/native_cpu_8.c: New test.

(cherry picked from commit 8bc83ee378e1cac65d75752b5137ec35d9e1aca1)

Testuite: Document environment setting directives

This document some of the existing DejaGnu directives to modify
environment variables before test or compiler invocations.

gcc/ChangeLog:

* doc/sourcebuild.texi (dg-set-compiler-env-var,
dg-set-target-env-var): Document.

(cherry picked from commit 7c4491e33d1be16bfb85d448862a8b956d35e4d8)

Testsuite: Make it easier to debug environment setting functions

This adds verbose output to dg-set-compiler-env-var and dg-set-target-env-var
so you can actually see what they're setting when you add -v -v.

gcc/testsuite/ChangeLog:

* lib/gcc-dg.exp (dg-set-compiler-env-var, dg-set-target-env-var): Add
verbose output.

(cherry picked from commit e410cbff5d5a408b7c64a0c426951afc2a24df93)

Arm: Add GCC_CPUINFO override

This adds an in intentionally undocumented environment variable
GCC_CPUINFO which can be used to test -mcpu=native.

Tests using these are added later on.

gcc/ChangeLog:

* config/arm/driver-arm.c (host_detect_local_cpu): Add GCC_CPUINFO.

(cherry picked from commit 34a6c43487caf3a2a0ec9c7c79c526d116abc8b9)

AArch64: Add GCC_CPUINFO override

This adds an in intentionally undocumented environment variable
GCC_CPUINFO which can be used to test -mcpu=native.

Tests using this are added later on.

gcc/ChangeLog:

* config/aarch64/driver-aarch64.c (host_detect_local_cpu):
Add GCC_CPUINFO.

(cherry picked from commit 55f6addc0c102eab2bf19d94de3ce52f9de0ab91)

AArch64: Fix bugs in -mcpu=native detection.

This patch fixes a couple of issues in AArch64's -mcpu=native processing:

The buffer used to read the lines from /proc/cpuinfo is 128 bytes long.  While
this was enough in the past with the increase in architecture extensions it is
no longer enough.   It results in two bugs:

1) No option string longer than 127 characters is correctly parsed.  Features
   that are supported are silently ignored.

2) It incorrectly enables features that are not present on the machine:
  a) It checks for substring matching instead of full word matching.  This makes
     it incorrectly detect sb support when ssbs is provided instead.
  b) Due to the truncation at the 127 char border it also incorrectly enables
     features due to the full feature being cut off and the part that is left
     accidentally enables something else.

This breaks -mcpu=native detection on some of our newer system.

The patch fixes these issues by reading full lines up to the \n in a string.
This gives us the full feature line.  Secondly it creates a set from this string
to:

1) Reduce matching complexity from O(n*m) to O(n*logm).
2) Perform whole word matching instead of substring matching.

To make this code somewhat cleaner I also changed from using char* to using
std::string and std::set.

Note that I have intentionally avoided the use of ifstream and stringstream
to make it easier to backport.  I have also not change the substring matching
for the initial line classification as I cannot find a documented cpuinfo format
which leads me to believe there may be kernels out there that require this which
may be why the original code does this.

I also do not want this to break if the kernel adds a new line that is long and
indents the file by two tabs to keep everything aligned.  In short I think an
imprecise match is the right thing here.

Test for this is added as the last thing in this series as it requires some
changes to be made to be able to test this.

gcc/ChangeLog:

* config/aarch64/driver-aarch64.c (INCLUDE_SET): New.
(parse_field): Use std::string.
(split_words, readline, find_field): New.
(host_detect_local_cpu): Fix truncation issues.

(cherry picked from commit b399f3c6425f6c33b64e813899cbd589288ef716)

Daily bump.

i386: Fix restore_stack_nonlocal expander [PR96536].

-fcf-protection code in restore_stack_nonlocal uses a branch based on
a clobber result. The patch adds missing compare.

2020-08-18 Uroš Bizjak <ubizjak@gmail.com>

gcc/ChangeLog:

PR target/96536
* config/i386/i386.md (restore_stack_nonlocal):
Add missing compare RTX.

Daily bump.

testsuite: Add -fno-common to pr82374.c [PR94077]

As the PR comments show, the case gcc.dg/gomp/pr82374.c fails on
Power7 since gcc8. But it passes from gcc10. By looking into
the difference, it's due to that gcc10 sets -fno-common as default,
which makes vectorizer force the alignment and be able to use
aligned vector load/store on those targets which doesn't support
unaligned vector load/store (here it's Power7).

As Jakub suggested in the PR, this patch is to append -fno-common
into dg-options.

Verified with gcc8/gcc9 releases on ppc64-redhat-linux (Power7).

gcc/testsuite/ChangeLog:

PR testsuite/94077
* gcc.dg/gomp/pr82374.c: Add option -fno-common.

Daily bump.

libstdc++: Remove unused Makefile.in

This file was unintentionally added by r271568 when backporting a change
from trunk.

* src/c++17/Makefile.in: Remove unused file.

Daily bump.

aarch64: Fix up __aarch64_cas16_acq_rel fallback

As mentioned in the PR, the fallback path when LSE is unavailable writes
incorrect registers to the memory if the previous content compares equal
to x0, x1 - it writes copy of x0, x1 from the start of function, but it
should write x2, x3.

2020-08-03 Jakub Jelinek <jakub@redhat.com>

PR target/96402
* config/aarch64/lse.S (__aarch64_cas16_acq_rel): Use x2, x3 instead
of x(tmp0), x(tmp1) in STXP arguments.

* gcc.target/aarch64/pr96402.c: New test.

(cherry picked from commit 90b43856fdff7d96d93d22970eca8a86c56e0ddc)

libstdc++: Fix FS-dependent filesystem tests

These tests were failing on XFS because it doesn't support setting file
timestamps past 2038, so the expected overflow when reading back a huge
timestamp into a file_time_type didn't happen.

Additionally, the std::filesystem::file_time_type::clock has an
epoch that is out of range of 32-bit time_t so testing times around that
epoch may also fail.

This fixes the tests to give up gracefully if the filesystem doesn't
support times that can't be represented in 32-bit time_t.

Backport from mainline
2020-02-28 Jonathan Wakely <jwakely@redhat.com>

* testsuite/27_io/filesystem/operations/last_write_time.cc: Fixes for
filesystems that silently truncate timestamps.
* testsuite/experimental/filesystem/operations/last_write_time.cc:
Likewise.

(cherry picked from commit 2fa3247fef79ede9ec3638605ea137b0e4d76075)

Fix filesystem::last_write_time failure with 32-bit time_t

* testsuite/27_io/filesystem/operations/last_write_time.cc: Fix
test failures on targets with 32-bit time_t.

(cherry picked from commit 174f1d264274d3f77133713a3853fc016ba527b4)

libstdc++: Fix experimental::path::generic_string (PR 93245)

This function was unimplemented, simply returning the native format
string instead.

PR libstdc++/93245
* include/experimental/bits/fs_path.h (path::generic_string<C,T,A>()):
Return the generic format not the native format.
* testsuite/experimental/filesystem/path/generic/generic_string.cc:
Improve test coverage.

(cherry picked from commit a577c0c26931090e7c25e56ef5ffc807627961ec)

libstdc++: Fix path::generic_string allocator handling (PR 94242)

It's not possible to construct a path::string_type from an allocator of
a different type. Create the correct specialization of basic_string, and
adjust path::_S_str_convert to use a basic_string_view so that it is
independent of the allocator type.

PR libstdc++/94242
* include/bits/fs_path.h (path::_S_str_convert): Replace first
parameter with basic_string_view so that strings with different
allocators can be accepted.
(path::generic_string<C,T,A>()): Use basic_string object that uses the
right allocator type.
* testsuite/27_io/filesystem/path/generic/94242.cc: New test.
* testsuite/27_io/filesystem/path/generic/generic_string.cc: Improve
test coverage.

(cherry picked from commit 9fc985118d9f5014afc1caf32a411ee5803fba61)

Daily bump.