gcc.gnu.org Git - gcc.git/log
gcc.git
7 weeks ago [MAINTAINERS] Update my email address
Claudiu Zissulescu [Mon, 1 Jul 2024 07:49:29 +0000 (10:49 +0300)]
[MAINTAINERS] Update my email address

Update my email address.

ChangeLog:

* MAINTAINERS: Update claziss email address.

Signed-off-by: Claudiu Zissulescu <claziss@gmail.com>
7 weeks ago tree-optimization/115694 - ICE with complex store rewrite
Richard Biener [Sun, 30 Jun 2024 11:07:14 +0000 (13:07 +0200)]
tree-optimization/115694 - ICE with complex store rewrite

The following adds a missed check when forwprop attempts to rewrite
a complex store.

PR tree-optimization/115694
* tree-ssa-forwprop.cc (pass_forwprop::execute): Check the
store is complex before rewriting it.

* g++.dg/torture/pr115694.C: New testcase.

7 weeks ago Remove vcond{,u,eq}<mode> expanders since they will be obsolete.
liuhongt [Mon, 24 Jun 2024 01:19:01 +0000 (09:19 +0800)]
Remove vcond{,u,eq}<mode> expanders since they will be obsolete.

gcc/ChangeLog:

PR target/115517
* config/i386/mmx.md (vcond<mode>v2sf): Removed.
(vcond<MMXMODE124:mode><MMXMODEI:mode>): Ditto.
(vcond<mode><mode>): Ditto.
(vcondu<MMXMODE124:mode><MMXMODEI:mode>): Ditto.
(vcondu<mode><mode>): Ditto.
* config/i386/sse.md (vcond<V_512:mode><VF_512:mode>): Ditto.
(vcond<V_256:mode><VF_256:mode>): Ditto.
(vcond<V_128:mode><VF_128:mode>): Ditto.
(vcond<VI2HFBF_AVX512VL:mode><VHF_AVX512VL:mode>): Ditto.
(vcond<V_512:mode><VI_AVX512BW:mode>): Ditto.
(vcond<V_256:mode><VI_256:mode>): Ditto.
(vcond<V_128:mode><VI124_128:mode>): Ditto.
(vcond<VI8F_128:mode>v2di): Ditto.
(vcondu<V_512:mode><VI_AVX512BW:mode>): Ditto.
(vcondu<V_256:mode><VI_256:mode>): Ditto.
(vcondu<V_128:mode><VI124_128:mode>): Ditto.
(vcondu<VI8F_128:mode>v2di): Ditto.
(vcondeq<VI8F_128:mode>v2di): Ditto.

7 weeks ago Optimize a < 0 ? -1 : 0 to (signed)a >> 31.
liuhongt [Thu, 20 Jun 2024 04:41:13 +0000 (12:41 +0800)]
Optimize a < 0 ? -1 : 0 to (signed)a >> 31.

Try to optimize x < 0 ? -1 : 0 into (signed) x >> 31
and x < 0 ? 1 : 0 into (unsigned) x >> 31.
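
The two rewrites can be checked on scalars. This is a minimal sketch, assuming 32-bit elements and an arithmetic right shift for negative signed values (guaranteed since C++20, and what GCC does before that); the function names are illustrative and do not come from the patch:

```cpp
#include <cassert>
#include <cstdint>

// Branchy forms that the patch description starts from.
inline int32_t select_all_ones(int32_t x) { return x < 0 ? -1 : 0; }
inline uint32_t select_one(int32_t x) { return x < 0 ? 1u : 0u; }

// Shift forms they are rewritten into.
inline int32_t shift_all_ones(int32_t x) { return x >> 31; }  // (signed) x >> 31
inline uint32_t shift_one(int32_t x) {                        // (unsigned) x >> 31
  return static_cast<uint32_t>(x) >> 31;
}

// Compare both forms on boundary values.
inline void check_sign_shift() {
  const int32_t samples[] = {INT32_MIN, -2, -1, 0, 1, INT32_MAX};
  for (int32_t x : samples) {
    assert(select_all_ones(x) == shift_all_ones(x));
    assert(select_one(x) == shift_one(x));
  }
}
```

The vector patterns apply the same shift to every element; the scalar version only illustrates why the two forms agree.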

Add define_insn_and_split for the optimization done in
ix86_expand_int_vcond.

gcc/ChangeLog:

PR target/115517
* config/i386/sse.md ("*ashr<mode>3_1"): New
define_insn_and_split.
(*avx512_ashr<mode>3_1): Ditto.
(*avx2_lshr<mode>3_1): Ditto.
(*avx2_lshr<mode>3_2): Ditto and add 2 combine splitters after
it.
* config/i386/mmx.md (mmxscalarsize): New mode attribute.
(*mmw_ashr<mode>3_1): New define_insn_and_split.
("mmx_<insn><mode>3): Add a combine splitter after it.
(*mmx_ashrv2hi3_1): New define_insn_and_split, also add a
combine splitter after it.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr111023-2.c: Adjust testcase.
* gcc.target/i386/vect-div-1.c: Ditto.

7 weeks ago Adjust testcase for the regressed testcases after obsolete of vcond{,u,eq}.
liuhongt [Wed, 19 Jun 2024 08:05:58 +0000 (16:05 +0800)]
Adjust testcase for the regressed testcases after obsolete of vcond{,u,eq}.

> Richard suggests that we implement the "obvious" transforms like
> inversion in the middle-end but if for example unsigned compares
> are not supported the us_minus + eq + negative trick isn't on
> that list.
>
> The main reason to restrict vec_cmp would be to avoid
> a <= b ? c : d going with an unsupported vec_cmp but instead
> do a > b ? d : c - the alternative is trying to fix this
> on the RTL side via combine.  I understand the non-native

Yes, I have a patch which can fix most regressions via pattern match
in combine.
Still there is a situation that is difficult to deal with, mainly the
optimization w/o sse4.1. Because pblendvb/blendvps/blendvpd only
exist under sse4.1, w/o sse4.1 it takes 3
instructions (pand, pandn, por) to simulate the vcond_mask, and
combine matches up to 4 instructions, which makes it currently
impossible to use combine to recover those optimizations in
vcond{,u,eq}, i.e. min/max.

In the case of sse 4.1 and above, there is basically no regression anymore.
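
The three-instruction sequence mentioned above is the usual bitwise select. A scalar sketch, assuming one 32-bit lane whose mask is either all ones or all zeros; the helper names are hypothetical and the code only models what pand, pandn and por compute per lane:

```cpp
#include <cassert>
#include <cstdint>

// (mask & if_true) | (~mask & if_false): pand, pandn, por on one lane.
inline uint32_t select_bits(uint32_t mask, uint32_t if_true, uint32_t if_false) {
  return (mask & if_true) | (~mask & if_false);
}

// Unsigned min written as compare + select, the shape that needs
// a compare plus the three logic instructions without SSE4.1.
inline uint32_t umin_via_select(uint32_t a, uint32_t b) {
  const uint32_t mask = (a < b) ? 0xFFFFFFFFu : 0u;  // per-lane compare result
  return select_bits(mask, a, b);
}
```

Counting the compare, this is four instructions, which is why a pattern that also has to absorb the surrounding operation exceeds what combine can match.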

The regressed testcases w/o sse4.1:

FAIL: g++.target/i386/pr100637-1b.C  -std=gnu++14  scan-assembler-times pcmpeqb 2
FAIL: g++.target/i386/pr100637-1b.C  -std=gnu++17  scan-assembler-times pcmpeqb 2
FAIL: g++.target/i386/pr100637-1b.C  -std=gnu++20  scan-assembler-times pcmpeqb 2
FAIL: g++.target/i386/pr100637-1b.C  -std=gnu++98  scan-assembler-times pcmpeqb 2
FAIL: g++.target/i386/pr100637-1w.C  -std=gnu++14  scan-assembler-times pcmpeqw 2
FAIL: g++.target/i386/pr100637-1w.C  -std=gnu++17  scan-assembler-times pcmpeqw 2
FAIL: g++.target/i386/pr100637-1w.C  -std=gnu++20  scan-assembler-times pcmpeqw 2
FAIL: g++.target/i386/pr100637-1w.C  -std=gnu++98  scan-assembler-times pcmpeqw 2
FAIL: g++.target/i386/pr103861-1.C  -std=gnu++14  scan-assembler-times pcmpeqb 2
FAIL: g++.target/i386/pr103861-1.C  -std=gnu++17  scan-assembler-times pcmpeqb 2
FAIL: g++.target/i386/pr103861-1.C  -std=gnu++20  scan-assembler-times pcmpeqb 2
FAIL: g++.target/i386/pr103861-1.C  -std=gnu++98  scan-assembler-times pcmpeqb 2
FAIL: gcc.target/i386/pr88540.c scan-assembler minpd

gcc/testsuite/ChangeLog:

PR target/115517
* g++.target/i386/pr100637-1b.C: Add xfail and -mno-sse4.1.
* g++.target/i386/pr100637-1w.C: Ditto.
* g++.target/i386/pr103861-1.C: Ditto.
* gcc.target/i386/pr88540.c: Ditto.
* gcc.target/i386/pr103941-2.c: Add -mno-avx512f.
* g++.target/i386/sse4_1-pr100637-1b.C: New test.
* g++.target/i386/sse4_1-pr100637-1w.C: New test.
* g++.target/i386/sse4_1-pr103861-1.C: New test.
* gcc.target/i386/sse4_1-pr88540.c: New test.

7 weeks ago Add more splitter for mskmov with avx512 comparison.
liuhongt [Wed, 19 Jun 2024 05:12:00 +0000 (13:12 +0800)]
Add more splitter for mskmov with avx512 comparison.

gcc/ChangeLog:

PR target/115517
* config/i386/sse.md
(*<sse>_movmsk<ssemodesuffix><avxsizesuffix>_lt_avx512): New
define_insn_and_split.
(*<sse>_movmsk<ssemodesuffix><avxsizesuffix>_<u>ext_lt_avx512):
Ditto.
(*<sse2_avx2>_pmovmskb_lt_avx512): Ditto.
(*<sse2_avx2>_pmovmskb_zext_lt_avx512): Ditto.
(*sse2_pmovmskb_ext_lt_avx512): Ditto.
(*pmovsk_kmask_v16qi_avx512): Ditto.
(*pmovsk_mask_v32qi_avx512): Ditto.
(*pmovsk_mask_cmp_<mode>_avx512): Ditto.
(*pmovsk_ptest_<mode>_avx512): Ditto.

7 weeks ago Match IEEE min/max with UNSPEC_IEEE_{MIN,MAX}.
liuhongt [Tue, 18 Jun 2024 07:52:02 +0000 (15:52 +0800)]
Match IEEE min/max with UNSPEC_IEEE_{MIN,MAX}.

These versions of the min/max patterns implement exactly the operations
   min = (op1 < op2 ? op1 : op2)
   max = (!(op1 < op2) ? op1 : op2)
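
The point of spelling out these formulas is that they are not commutative once NaNs or signed zeros are involved. A scalar sketch that transcribes the two expressions literally (the function names are illustrative, and the code assumes it is not built with -ffast-math):

```cpp
#include <cassert>
#include <cmath>

// Literal transcriptions of the two formulas from the commit message.
inline double ieee_min(double op1, double op2) { return op1 < op2 ? op1 : op2; }
inline double ieee_max(double op1, double op2) { return !(op1 < op2) ? op1 : op2; }

// Operand order matters when the comparison is false for both orders.
inline void check_ieee_minmax() {
  assert(ieee_min(NAN, 1.0) == 1.0);         // NaN < 1.0 is false, so op2
  assert(std::isnan(ieee_min(1.0, NAN)));    // 1.0 < NaN is false, so op2
  assert(std::signbit(ieee_min(0.0, -0.0))); // 0.0 < -0.0 is false, so op2
  assert(!std::signbit(ieee_min(-0.0, 0.0)));
}
```

Because swapping the operands changes the result in these cases, a generic smin/smax cannot stand in for the expression, which is why a dedicated unspec is matched.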

gcc/ChangeLog:
PR target/115517
* config/i386/sse.md (*minmax<mode>3_1): New pre_reload
define_insn_and_split.
(*minmax<mode>3_2): Ditto.

7 weeks ago Lower AVX512 kmask comparison back to AVX2 comparison when op_{true,false} is vector...
liuhongt [Tue, 18 Jun 2024 06:03:42 +0000 (14:03 +0800)]
Lower AVX512 kmask comparison back to AVX2 comparison when op_{true,false} is vector -1/0.

gcc/ChangeLog
PR target/115517
* config/i386/sse.md
(*<avx512>_cvtmask2<ssemodesuffix><mode>_not): New pre_reload
splitter.
(*<avx512>_cvtmask2<ssemodesuffix><mode>_not): Ditto.
(*avx2_pcmp<mode>3_6): Ditto.
(*avx2_pcmp<mode>3_7): Ditto.

7 weeks ago Add more splitters to match (unspec [op1 op2 (gt op3 constm1_operand)] UNSPEC_BLENDV)
liuhongt [Mon, 17 Jun 2024 09:16:46 +0000 (17:16 +0800)]
Add more splitters to match (unspec [op1 op2 (gt op3 constm1_operand)] UNSPEC_BLENDV)

These define_insn_and_split are needed after vcond{,u,eq} is obsolete.

gcc/ChangeLog:

PR target/115517
* config/i386/sse.md
(*<sse4_1>_blendv<ssemodesuffix><avxsizesuffix>_gt): New
define_insn_and_split.
(*<sse4_1>_blendv<ssefltmodesuffix><avxsizesuffix>_gtint):
Ditto.
(*<sse4_1>_blendv<ssefltmodesuffix><avxsizesuffix>_not_gtint):
Ditto.
(*<sse4_1_avx2>_pblendvb_gt): Ditto.
(*<sse4_1_avx2>_pblendvb_gt_subreg_not): Ditto.

7 weeks ago Enable flate-combine.
liuhongt [Wed, 26 Jun 2024 05:52:24 +0000 (13:52 +0800)]
Enable flate-combine.

Move pass_stv2 and pass_rpad after pre_reload pass_late_combine, also
define target_insn_cost to prevent post_reload pass_late_combine from
reverting the optimization done in pass_rpad.

Adjust testcases since pass_late_combine generates better code but
breaks scan assembly.

i.e.
Under 32-bit target, gcc used to generate broadcast from stack and
then do the real operation.
After flate_combine, they're combined into embedded broadcast
operations.

gcc/ChangeLog:

* config/i386/i386-features.cc (ix86_rpad_gate): New function.
* config/i386/i386-options.cc (ix86_override_options_after_change):
Don't disable flate_combine.
* config/i386/i386-passes.def: Move pass_stv2 and pass_rpad
after pre_reload pass_late_combine.
* config/i386/i386-protos.h (ix86_rpad_gate): New declare.
* config/i386/i386.cc (ix86_insn_cost): New function.
(TARGET_INSN_COST): Define.

gcc/testsuite/ChangeLog:

* gcc.target/i386/avx512f-broadcast-pr87767-1.c: Adjust
testcase.
* gcc.target/i386/avx512f-broadcast-pr87767-5.c: Ditto.
* gcc.target/i386/avx512f-fmadd-sf-zmm-7.c: Ditto.
* gcc.target/i386/avx512f-fmsub-sf-zmm-7.c: Ditto.
* gcc.target/i386/avx512f-fnmadd-sf-zmm-7.c: Ditto.
* gcc.target/i386/avx512f-fnmsub-sf-zmm-7.c: Ditto.
* gcc.target/i386/avx512vl-broadcast-pr87767-1.c: Ditto.
* gcc.target/i386/avx512vl-broadcast-pr87767-5.c: Ditto.
* gcc.target/i386/pr91333.c: Ditto.
* gcc.target/i386/vect-strided-4.c: Ditto.

7 weeks ago Extend lshifrtsi3_1_zext to ?k alternative.
liuhongt [Wed, 26 Jun 2024 05:07:31 +0000 (13:07 +0800)]
Extend lshifrtsi3_1_zext to ?k alternative.

late_combine will combine lshift + zero into *lshifrtsi3_1_zext which
causes an extra mov between gpr and kmask, add ?k to the pattern.

gcc/ChangeLog:

PR target/115610
* config/i386/i386.md (<*insnsi3_zext): Add alternative ?k,
enable it only for lshiftrt and under avx512bw.
* config/i386/sse.md (*klshrsi3_1_zext): New define_insn, and
add corresponding define_split after it.

7 weeks ago Define mask as extern instead of uninitialized local variables.
liuhongt [Wed, 26 Jun 2024 03:17:46 +0000 (11:17 +0800)]
Define mask as extern instead of uninitialized local variables.

The testcases are supposed to scan for vpopcnt{b,w,d,q} operations
with k mask, but mask is defined as an uninitialized local variable which
is set to 0 at the rtl expand phase.
It is then simplified away by late_combine, which caused the scan assembly failure.
Move the definition of mask outside to make the testcases more stable.

gcc/testsuite/ChangeLog:

PR target/115610
* gcc.target/i386/avx512bitalg-vpopcntb.c: Define mask as
extern instead of uninitialized local variables.
* gcc.target/i386/avx512bitalg-vpopcntbvl.c: Ditto.
* gcc.target/i386/avx512bitalg-vpopcntw.c: Ditto.
* gcc.target/i386/avx512bitalg-vpopcntwvl.c: Ditto.
* gcc.target/i386/avx512vpopcntdq-vpopcntd.c: Ditto.
* gcc.target/i386/avx512vpopcntdq-vpopcntq.c: Ditto.

7 weeks ago Daily bump.
GCC Administrator [Mon, 1 Jul 2024 00:17:45 +0000 (00:17 +0000)]
Daily bump.

7 weeks ago hppa: Fix ICE caused by mismatched predicate and constraint in xmpyu patterns
John David Anglin [Sun, 30 Jun 2024 13:48:21 +0000 (09:48 -0400)]
hppa: Fix ICE caused by mismatched predicate and constraint in xmpyu patterns

2024-06-30  John David Anglin  <danglin@gcc.gnu.org>

gcc/ChangeLog:

PR target/115691
* config/pa/pa.md: Remove incorrect xmpyu patterns.

7 weeks ago tree-optimization/115701 - fix maybe_duplicate_ssa_info_at_copy
Richard Biener [Sun, 30 Jun 2024 09:34:43 +0000 (11:34 +0200)]
tree-optimization/115701 - fix maybe_duplicate_ssa_info_at_copy

The following restricts copying of points-to info from defs that
might be in regions invoking UB and are never executed.

PR tree-optimization/115701
* tree-ssanames.cc (maybe_duplicate_ssa_info_at_copy):
Only copy info from within the same BB.

* gcc.dg/torture/pr115701.c: New testcase.

7 weeks ago tree-optimization/115701 - factor out maybe_duplicate_ssa_info_at_copy
Richard Biener [Sun, 30 Jun 2024 09:28:11 +0000 (11:28 +0200)]
tree-optimization/115701 - factor out maybe_duplicate_ssa_info_at_copy

The following factors out the code that preserves SSA info of the LHS
of a SSA copy LHS = RHS when LHS is about to be eliminated to RHS.

PR tree-optimization/115701
* tree-ssanames.h (maybe_duplicate_ssa_info_at_copy): Declare.
* tree-ssanames.cc (maybe_duplicate_ssa_info_at_copy): New
function, split out from ...
* tree-ssa-copy.cc (fini_copy_prop): ... here.
* tree-ssa-sccvn.cc (eliminate_dom_walker::eliminate_stmt): ...
and here.

7 weeks ago Harden SLP reduction support wrt STMT_VINFO_REDUC_IDX
Richard Biener [Thu, 27 Jun 2024 09:36:07 +0000 (11:36 +0200)]
Harden SLP reduction support wrt STMT_VINFO_REDUC_IDX

The following makes sure that for SLP reductions all lanes have
the same STMT_VINFO_REDUC_IDX.  Once we move that info and can adjust
it we can implement swapping.  It also makes the existing protection
against operand swapping trigger for all stmts participating in a
reduction, not just the final one marked as reduction-def.

* tree-vect-slp.cc (vect_build_slp_tree_1): Compare
STMT_VINFO_REDUC_IDX.
(vect_build_slp_tree_2): Prevent operand swapping for
all stmts participating in a reduction.

7 weeks ago vect: Determine input vectype for multiple lane-reducing operations
Feng Xue [Sun, 16 Jun 2024 05:00:32 +0000 (13:00 +0800)]
vect: Determine input vectype for multiple lane-reducing operations

The input vectype of reduction PHI statement must be determined before
vect cost computation for the reduction. Since a lane-reducing operation has
a different input vectype from a normal one, we need to traverse all reduction
statements to find out the input vectype with the least lanes, and set that to
the PHI statement.

2024-06-16 Feng Xue <fxue@os.amperecomputing.com>

gcc/
* tree-vect-loop.cc (vectorizable_reduction): Determine input vectype
during traversal of reduction statements.

7 weeks ago vect: Fix shift-by-induction for single-lane slp
Feng Xue [Wed, 26 Jun 2024 14:02:53 +0000 (22:02 +0800)]
vect: Fix shift-by-induction for single-lane slp

Allow shift-by-induction for slp node, when it is single lane, which is
aligned with the original loop-based handling.

2024-06-26 Feng Xue <fxue@os.amperecomputing.com>

gcc/
* tree-vect-stmts.cc (vectorizable_shift): Allow shift-by-induction
for single-lane slp node.

gcc/testsuite/
* gcc.dg/vect/vect-shift-6.c
* gcc.dg/vect/vect-shift-7.c

8 weeks ago Daily bump.
GCC Administrator [Sun, 30 Jun 2024 00:16:45 +0000 (00:16 +0000)]
Daily bump.

8 weeks ago [PR115565] cse: Don't use a valid regno for non-register in comparison_qty
Maciej W. Rozycki [Sat, 29 Jun 2024 22:26:55 +0000 (23:26 +0100)]
[PR115565] cse: Don't use a valid regno for non-register in comparison_qty

Use INT_MIN rather than -1 in `comparison_qty' where a comparison is not
with a register, because the value of -1 is actually a valid reference
to register 0 in the case where it has not been assigned a quantity.

Using -1 makes `REG_QTY (REGNO (folded_arg1)) == ent->comparison_qty'
comparison in `fold_rtx' to incorrectly trigger in rare circumstances
and return true for a memory reference, making CSE consider a comparison
operation to evaluate to a constant expression and consequently make the
resulting code incorrectly execute or fail to execute conditional
blocks.

This has caused a miscompilation of rwlock.c from LinuxThreads for the
`alpha-linux-gnu' target, where `rwlock->__rw_writer != thread_self ()'
expression (where `thread_self' returns the thread pointer via a PALcode
call) has been decided to be always true (with `ent->comparison_qty'
using -1 for a reference to `rwlock->__rw_writer', while register 0
holding the thread pointer retrieved by `thread_self') and code for the
false case has been optimized away where it mustn't have, causing
program lockups.

The issue has been observed as a regression from commit 08a692679fb8
("Undefined cse.c behaviour causes 3.4 regression on HPUX"),
<https://gcc.gnu.org/ml/gcc-patches/2004-10/msg02027.html>, and up to
commit 932ad4d9b550 ("Make CSE path following use the CFG"),
<https://gcc.gnu.org/ml/gcc-patches/2006-12/msg00431.html>, where CSE
has been restructured sufficiently for the issue not to trigger with the
original reproducer anymore.  However the original bug remains and can
trigger, because `comparison_qty' will still be assigned -1 for a memory
reference and the `reg_qty' member of a `cse_reg_info_table' entry will
still be assigned -1 for register 0 where the entry has not been
assigned a quantity, e.g. at initialization.

Use INT_MIN then as noted above, so that the value remains negative, for
consistency with the REGNO_QTY_VALID_P macro (even though not used on
`comparison_qty'), and then so that it should not ever match a valid
negated register number, fixing the regression with commit 08a692679fb8.
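
The collision can be modelled in a few lines. This sketch assumes the encoding implied by "valid negated register number", namely that a register without a quantity stores -regno - 1; the names below are hypothetical and are not the identifiers used in cse.cc:

```cpp
#include <cassert>
#include <climits>

// Assumed encoding: a register with no quantity holds -regno - 1,
// so register 0 holds -1 and every such value stays negative.
constexpr int unassigned_qty(int regno) { return -regno - 1; }

// Sentinels for "the comparison is not with a register".
constexpr int old_non_reg_sentinel = -1;       // before the fix
constexpr int new_non_reg_sentinel = INT_MIN;  // after the fix

// The equality test that fold_rtx effectively performs.
constexpr bool qty_matches(int regno, int comparison_qty) {
  return unassigned_qty(regno) == comparison_qty;
}

// No realistic register number maps to INT_MIN.
inline void check_no_collision(int max_regno) {
  for (int regno = 0; regno <= max_regno; ++regno)
    assert(!qty_matches(regno, new_non_reg_sentinel));
}
```

Under that assumption the old sentinel equals the value stored for an unassigned register 0, while INT_MIN would only be reached for register number INT_MAX.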

gcc/
PR rtl-optimization/115565
* cse.cc (record_jump_cond): Use INT_MIN rather than -1 for
`comparison_qty' if !REG_P.

8 weeks ago [to-be-committed,RISC-V,V4] movmem for RISCV with V extension
Sergei Lewis [Sat, 29 Jun 2024 20:34:31 +0000 (14:34 -0600)]
[to-be-committed,RISC-V,V4] movmem for RISCV with V extension

I hadn't updated my repo on the host where I handle email, so it picked
up the older version of this patch without the testsuite fix.  So, V4
with the testsuite option for lmul fixed.

--

And Sergei's movmem patch.  Just trivial testsuite adjustment for an
option name change and a whitespace fix from me.

I've spun this in my tester for rv32 and rv64.  I'll wait for pre-commit
CI before taking further action.

Just a reminder, this patch is designed to handle the case where we can
issue a single vector load/store which avoids all the complexities of
determining which direction to copy.

--

gcc/ChangeLog

* config/riscv/riscv.md (movmem<mode>): New expander.

gcc/testsuite/ChangeLog

PR target/112109
* gcc.target/riscv/rvv/base/movmem-1.c: New test

8 weeks ago Fortran: fix ALLOCATE with SOURCE of deferred character length [PR114019]
Harald Anlauf [Fri, 28 Jun 2024 19:44:06 +0000 (21:44 +0200)]
Fortran: fix ALLOCATE with SOURCE of deferred character length [PR114019]

gcc/fortran/ChangeLog:

PR fortran/114019
* trans-stmt.cc (gfc_trans_allocate): Fix handling of case of
scalar character expression being used for SOURCE.

gcc/testsuite/ChangeLog:

PR fortran/114019
* gfortran.dg/allocate_with_source_33.f90: New test.

8 weeks ago Match: Support imm form for unsigned scalar .SAT_ADD
Pan Li [Fri, 28 Jun 2024 03:33:41 +0000 (11:33 +0800)]
Match: Support imm form for unsigned scalar .SAT_ADD

This patch adds support for the form of unsigned scalar .SAT_ADD
where one of the operands is an immediate (IMM).  For example:

Form IMM:
  #define DEF_SAT_U_ADD_IMM_FMT_1(T)       \
  T __attribute__((noinline))              \
  sat_u_add_imm_##T##_fmt_1 (T x)          \
  {                                        \
    return (T)(x + 9) >= x ? (x + 9) : -1; \
  }

DEF_SAT_U_ADD_IMM_FMT_1(uint64_t)

Before this patch:
__attribute__((noinline))
uint64_t sat_u_add_imm_uint64_t_fmt_1 (uint64_t x)
{
  long unsigned int _1;
  uint64_t _3;

;;   basic block 2, loop depth 0
;;    pred:       ENTRY
  _1 = MIN_EXPR <x_2(D), 18446744073709551606>;
  _3 = _1 + 9;
  return _3;
;;    succ:       EXIT

}

After this patch:
__attribute__((noinline))
uint64_t sat_u_add_imm_uint64_t_fmt_1 (uint64_t x)
{
  uint64_t _3;

;;   basic block 2, loop depth 0
;;    pred:       ENTRY
  _3 = .SAT_ADD (x_2(D), 9); [tail call]
  return _3;
;;    succ:       EXIT

}
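
Both dumps compute the same value: a saturating add of 9. A scalar sketch that compares the source form with the MIN_EXPR form from the "before" dump (the function names are illustrative; UINT64_MAX stands in for the `-1` converted to uint64_t):

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Source form from the commit message, instantiated for uint64_t.
inline uint64_t sat_add_9_branch(uint64_t x) {
  return static_cast<uint64_t>(x + 9) >= x ? (x + 9) : UINT64_MAX;
}

// Form seen before the patch: clamp first, then add.
// 18446744073709551606 in the dump is UINT64_MAX - 9.
inline uint64_t sat_add_9_min(uint64_t x) {
  return std::min<uint64_t>(x, UINT64_MAX - 9) + 9;
}

// Compare both forms around the saturation boundary.
inline void check_sat_add() {
  const uint64_t samples[] = {0, 1, UINT64_MAX - 10, UINT64_MAX - 9,
                              UINT64_MAX - 8, UINT64_MAX};
  for (uint64_t x : samples)
    assert(sat_add_9_branch(x) == sat_add_9_min(x));
}
```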

The following test suites passed with this patch:
1. The rv64gcv full regression test with newlib.
2. The x86 bootstrap test.
3. The x86 full regression test.

gcc/ChangeLog:

* match.pd: Add imm form for .SAT_ADD matching.
* tree-ssa-math-opts.cc (math_opts_dom_walker::after_dom_children):
Add .SAT_ADD matching under PLUS_EXPR.

Signed-off-by: Pan Li <pan2.li@intel.com>
8 weeks ago jit: Fix Darwin bootstrap after r15-1699.
Iain Sandoe [Sat, 29 Jun 2024 02:10:59 +0000 (03:10 +0100)]
jit: Fix Darwin bootstrap after r15-1699.

r15-1699-g445c62ee492 contains changes that trigger two maybe-uninitialized
warnings on Darwin, which result in a bootstrap failure.

Note that the warnings are false positives, in fact the variables should be
initialized in the cases of a switch (all values of the switch condition are
covered).

Fixed here by providing default initializations for the relevant variables.

gcc/jit/ChangeLog:

* jit-recording.cc
(recording::memento_of_typeinfo::make_debug_string): Default the value
of ident.
(recording::memento_of_typeinfo::write_reproducer): Default the value
of type.

Signed-off-by: Iain Sandoe <iains@gcc.gnu.org>
8 weeks ago [committed] Fix mcore-elf regression after recent IRA change
Jeff Law [Sat, 29 Jun 2024 00:36:50 +0000 (18:36 -0600)]
[committed] Fix mcore-elf regression after recent IRA change

So the recent IRA change exposed a bug in the mcore backend.

The mcore has a special instruction (xtrb3) which can zero extend a GPR into
R1.  It's useful because zextb requires a matching source/destination.
Unfortunately xtrb3 modifies CC.

The IRA changes twiddle register allocation such that we want to use xtrb3.
Unfortunately CC is live at the point where we want to use xtrb3 and clobbering
CC causes the test to fail.

Exposing the clobber in the expander and insn seems like the best path forward.
We could also drop the xtrb3 alternative, but that seems like it would hurt
codegen more than exposing the clobber.

The bitfield extraction patterns using xtrb look problematic as well, but I
didn't try to fix those.

This fixes the builtin-arith-overflow regressions and appears to fix
20010122-1.c as a side effect.

gcc/
* config/mcore/mcore.md  (zero_extendqihi2): Clobber CC in expander
and matching insn.
(zero_extendqisi2): Likewise.

8 weeks ago Daily bump.
GCC Administrator [Sat, 29 Jun 2024 00:17:22 +0000 (00:17 +0000)]
Daily bump.

8 weeks ago c++: bad 'this' conversion for nullary memfn [PR106760]
Patrick Palka [Fri, 28 Jun 2024 23:45:21 +0000 (19:45 -0400)]
c++: bad 'this' conversion for nullary memfn [PR106760]

Here we notice the 'this' conversion for the call f<void>() is bad, so
we correctly defer deduction for the template candidate, but we end up
never adding it to 'bad_cands' since missing_conversion_p for it returns
false (its only argument is 'this' which has already been determined to
be bad).  This is not a huge deal, but it causes us to no longer accept the
call with -fpermissive in release builds, and a tree check ICE in checking
builds.

So if we have a non-strictly viable template candidate that has not been
instantiated, then we need to add it to 'bad_cands' even if no argument
conversion is missing.

PR c++/106760

gcc/cp/ChangeLog:

* call.cc (add_candidates): Relax test for adding a candidate
to 'bad_cands' to also accept an uninstantiated template candidate
that has no missing conversions.

gcc/testsuite/ChangeLog:

* g++.dg/ext/conv3.C: New test.

Reviewed-by: Jason Merrill <jason@redhat.com>
8 weeks ago libstdc++: Define __glibcxx_assert_fail for non-verbose build [PR115585]
Jonathan Wakely [Fri, 28 Jun 2024 14:14:15 +0000 (15:14 +0100)]
libstdc++: Define __glibcxx_assert_fail for non-verbose build [PR115585]

When the library is configured with --disable-libstdcxx-verbose the
assertions just abort instead of calling __glibcxx_assert_fail, and so I
didn't export that function for the non-verbose build. However, that
option is documented to not change the library ABI, so we still need to
export the symbol from the library. It could be needed by programs
compiled against the headers from a verbose build.

The non-verbose definition can just call abort so that it doesn't pull
in I/O symbols, which are unwanted in a non-verbose build.

libstdc++-v3/ChangeLog:

PR libstdc++/115585
* src/c++11/assert_fail.cc (__glibcxx_assert_fail): Add
definition for non-verbose builds.

8 weeks ago libstdc++: Extend std::equal memcmp optimization to std::byte [PR101485]
Jonathan Wakely [Fri, 28 Jun 2024 10:14:39 +0000 (11:14 +0100)]
libstdc++: Extend std::equal memcmp optimization to std::byte [PR101485]

We optimize std::equal to memcmp for integers and pointers, which means
that std::byte comparisons generate bigger code than char comparisons.

We can't use memcmp for arbitrary enum types, because they could have an
overloaded operator== that has custom semantics, but we know that
std::byte doesn't do that.
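
From the user's side nothing changes in the result; only the generated code does. A sketch, assuming C++17, of the call that benefits and of the memcmp comparison it is equivalent to (the helper names are illustrative and this is not the library's internal dispatch code):

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>

// The user-level call: element-wise comparison of std::byte ranges.
inline bool equal_bytes(const std::byte* first, const std::byte* last,
                        const std::byte* other) {
  return std::equal(first, last, other);
}

// The equivalent bulk comparison that is valid for std::byte.
inline bool equal_bytes_memcmp(const std::byte* first, const std::byte* last,
                               const std::byte* other) {
  const std::size_t n = static_cast<std::size_t>(last - first);
  return n == 0 || std::memcmp(first, other, n) == 0;
}

// Both agree on equal and unequal inputs.
inline void check_byte_equal() {
  const std::byte a[] = {std::byte{1}, std::byte{2}, std::byte{3}};
  const std::byte b[] = {std::byte{1}, std::byte{2}, std::byte{3}};
  const std::byte c[] = {std::byte{1}, std::byte{9}, std::byte{3}};
  assert(equal_bytes(a, a + 3, b) && equal_bytes_memcmp(a, a + 3, b));
  assert(!equal_bytes(a, a + 3, c) && !equal_bytes_memcmp(a, a + 3, c));
}
```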

libstdc++-v3/ChangeLog:

PR libstdc++/101485
* include/bits/stl_algobase.h (__equal_aux1): Check for
std::byte as well.
* testsuite/25_algorithms/equal/101485.cc: New test.

8 weeks ago libstdc++: Do not use C++11 alignof in C++98 mode [PR104395]
Jonathan Wakely [Wed, 26 Jun 2024 13:09:07 +0000 (14:09 +0100)]
libstdc++: Do not use C++11 alignof in C++98 mode [PR104395]

When -faligned-new (or Clang's -faligned-allocation) is used our
allocators try to support extended alignments, gated on the
__cpp_aligned_new macro. However, because they use alignof(_Tp) which is
not a keyword in C++98 mode, using -std=c++98 -faligned-new results in
errors from <memory> and other headers.

We could change them to use __alignof__ instead of alignof, but that
would potentially alter the result of the conditions, because e.g.
alignof(long long) != __alignof__(long long) on some targets. That's
probably not an issue for any types with extended alignment, so maybe it
would be a safe change.

For now, it seems acceptable to just disable the extended alignment
support in C++98 mode, so that -faligned-new enables std::align_val_t
and the corresponding operator new overloads, but doesn't affect
std::allocator, __gnu_cxx::__bitmap_allocator etc.

libstdc++-v3/ChangeLog:

PR libstdc++/104395
* include/bits/new_allocator.h: Disable extended alignment
support in C++98 mode.
* include/bits/stl_tempbuf.h: Likewise.
* include/ext/bitmap_allocator.h: Likewise.
* include/ext/malloc_allocator.h: Likewise.
* include/ext/mt_allocator.h: Likewise.
* include/ext/pool_allocator.h: Likewise.
* testsuite/ext/104395.cc: New test.

8 weeks ago libstdc++: Simplify <ext/aligned_buffer.h> class templates
Jonathan Wakely [Wed, 26 Jun 2024 11:40:51 +0000 (12:40 +0100)]
libstdc++: Simplify <ext/aligned_buffer.h> class templates

As noted in a comment, the __gnu_cxx::__aligned_membuf class template
can be simplified, because alignof(T) and alignas(T) use the correct
alignment for a data member. That's true since GCC 8 and Clang 8. The
EDG front end (as used by Intel icc, aka "Intel C++ Compiler Classic")
does not implement the PR c++/69560 change, so keep using the old
implementation when __EDG__ is defined, to avoid an ABI change for icc.

For __gnu_cxx::__aligned_buffer<T> all supported compilers agree on the
value of __alignof__(T), but we can still simplify it by removing the
dependency on std::aligned_storage<sizeof(T), __alignof__(T)>.

Add a test that checks that the aligned buffer types have the expected
alignment, so that we can tell if changes like this affect their ABI
properties.

libstdc++-v3/ChangeLog:

* include/ext/aligned_buffer.h (__aligned_membuf): Use
alignas(T) directly instead of defining a struct and using its
alignment.
(__aligned_buffer): Remove use of std::aligned_storage.
* testsuite/abi/aligned_buffers.cc: New test.

8 weeks ago ssa_lazy_cache takes an optional bitmap_obstack pointer.
Andrew MacLeod [Wed, 26 Jun 2024 18:53:54 +0000 (14:53 -0400)]
ssa_lazy_cache takes an optional bitmap_obstack pointer.

Allow ssa_lazy_cache to allocate bitmaps from a client provided obstack
if so desired.

* gimple-range-cache.cc (ssa_lazy_cache::ssa_lazy_cache): Relocate here.
Check for provided obstack.
(ssa_lazy_cache::~ssa_lazy_cache): Relocate here.  Free bitmap or obstack.
* gimple-range-cache.h (ssa_lazy_cache::ssa_lazy_cache): Move.
(ssa_lazy_cache::~ssa_lazy_cache): Move.
(ssa_lazy_cache::m_ob): New.
* gimple-range.cc (dom_ranger::dom_ranger): Initialize obstack.
(dom_ranger::~dom_ranger): Release obstack.
(dom_ranger::pre_bb): Create ssa_lazy_cache using obstack.
* gimple-range.h (m_bitmaps): New.

8 weeks ago i386: Cleanup tmp variable usage in ix86_expand_move
Uros Bizjak [Fri, 28 Jun 2024 15:49:43 +0000 (17:49 +0200)]
i386: Cleanup tmp variable usage in ix86_expand_move

Remove extra assignment, extra temp variable and variable shadowing.

No functional changes intended.

gcc/ChangeLog:

* config/i386/i386-expand.cc (ix86_expand_move): Remove extra
assignment to tmp variable, reuse tmp variable instead of
declaring new temporary variable and remove tmp variable shadowing.

8 weeks ago Use move-aware auto_vec in map
Jørgen Kvalsvik [Fri, 28 Jun 2024 06:35:31 +0000 (08:35 +0200)]
Use move-aware auto_vec in map

Using auto_vec rather than vec means the vectors are released
automatically upon return, to stop the leak. The problem seems to be that
auto_vec<T, N> is not really move-aware, only the <T, 0> specialization
is.

gcc/ChangeLog:

* tree-profile.cc (find_conditions): Use auto_vec without
embedded storage.

8 weeks ago tree-optimization/115652 - more fixing of the fix
Richard Biener [Fri, 28 Jun 2024 11:29:21 +0000 (13:29 +0200)]
tree-optimization/115652 - more fixing of the fix

The following addresses the corner case of an outer loop with an empty
header where we end up asking for the BB of a NULL stmt by
special-casing this case.

PR tree-optimization/115652
* tree-vect-slp.cc (vect_schedule_slp_node): Handle the case
where the outer loop header block is empty.

8 weeks ago i386: Fix regression after refactoring legitimize_pe_coff_symbol, ix86_GOT_alias_set...
Evgeny Karpov [Fri, 28 Jun 2024 12:37:12 +0000 (12:37 +0000)]
i386: Fix regression after refactoring legitimize_pe_coff_symbol, ix86_GOT_alias_set and PE_COFF_LEGITIMIZE_EXTERN_DECL [PR115635]

This patch fixes 3 bugs reported after merging the "Add DLL
import/export implementation to AArch64" series.
https://gcc.gnu.org/pipermail/gcc-patches/2024-June/653955.html The
series refactors the i386 codebase to reuse it in AArch64, which
triggers some bugs.

Bug 115661 - [15 Regression] wrong code at -O{2,3} on x86_64-linux-gnu
since r15-1599-g63512c72df09b4
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115661

Bug 115635 - [15 regression] Bootstrap fails with failed self-test
with the rust fe (diagnostic-path.cc:1153: test_empty_path: FAIL:
ASSERT_FALSE ((path.interprocedural_p ()))) since
r15-1599-g63512c72df09b4
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115635

Issue 1. In some code, i386 has been relying on the
legitimize_pe_coff_symbol call on all platforms and should return
NULL_RTX if it is not supported.

Fix: NULL_RTX handling has been added when the target does not support
PECOFF.

Issue 2. ix86_GOT_alias_set is used on all platforms and cannot be
extracted to mingw.

Fix: ix86_GOT_alias_set has been returned as it was and is used on all
platforms for i386.

Bug 115643 - [15 regression] aarch64-w64-mingw32 support today breaks
x86_64-w64-mingw32 build cannot represent relocation type BFD_RELOC_64
since r15-1602-ged20feebd9ea31
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115643

Issue 3. PE_COFF_EXTERN_DECL_SHOULD_BE_LEGITIMIZED has been added and
used with a negative operator for a complex expression without braces.

Fix: Braces have been added, and
PE_COFF_EXTERN_DECL_SHOULD_BE_LEGITIMIZED has been renamed to
PE_COFF_LEGITIMIZE_EXTERN_DECL.
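
Issue 3 is the classic macro precedence problem: negating a macro whose body is an unparenthesized compound expression negates only its first operand. A sketch with hypothetical macros (they are not the definitions from cygming.h, and "braces" in the message corresponds to the parentheses here):

```cpp
#include <cassert>

// Hypothetical macros that only illustrate the precedence problem.
#define COND_NO_PARENS(a, b) a && b
#define COND_WITH_PARENS(a, b) ((a) && (b))

// Expands to: !a && b  (only the first operand is negated).
inline bool negate_no_parens(bool a, bool b) { return !COND_NO_PARENS(a, b); }

// Expands to: !((a) && (b))  (the whole condition is negated).
inline bool negate_with_parens(bool a, bool b) { return !COND_WITH_PARENS(a, b); }
```

With `a` true and `b` false the unparenthesized form yields false while the intended negation yields true, so the two versions take different branches.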

2024-06-28  Evgeny Karpov <Evgeny.Karpov@microsoft.com>

gcc/ChangeLog:
PR bootstrap/115635
PR target/115643
PR target/115661
* config/aarch64/cygming.h
(PE_COFF_EXTERN_DECL_SHOULD_BE_LEGITIMIZED): Rename to
PE_COFF_LEGITIMIZE_EXTERN_DECL.
(PE_COFF_LEGITIMIZE_EXTERN_DECL): Likewise.
* config/i386/cygming.h (GOT_ALIAS_SET): Remove the definition to
reuse it from i386.h.
(PE_COFF_EXTERN_DECL_SHOULD_BE_LEGITIMIZED): Rename to
PE_COFF_LEGITIMIZE_EXTERN_DECL.
(PE_COFF_LEGITIMIZE_EXTERN_DECL): Likewise.
* config/i386/i386-expand.cc (ix86_expand_move): Return
ix86_GOT_alias_set.
* config/i386/i386-expand.h (ix86_GOT_alias_set): Likewise.
* config/i386/i386.cc (ix86_GOT_alias_set): Likewise.
* config/i386/i386.h (GOT_ALIAS_SET): Likewise.
* config/mingw/winnt-dll.cc (get_dllimport_decl): Use
GOT_ALIAS_SET.
(legitimize_pe_coff_symbol): Rename to
PE_COFF_LEGITIMIZE_EXTERN_DECL.
* config/mingw/winnt-dll.h (ix86_GOT_alias_set): Declare
ix86_GOT_alias_set.

8 weeks ago Remove unused hybrid_* operators in range-ops.
Aldy Hernandez [Fri, 28 Jun 2024 09:27:24 +0000 (11:27 +0200)]
Remove unused hybrid_* operators in range-ops.

gcc/ChangeLog:

* range-op-ptr.cc (class hybrid_and_operator): Remove.
(class hybrid_or_operator): Same.
(class hybrid_min_operator): Same.
(class hybrid_max_operator): Same.

8 weeks ago tree-optimization/115640 - outer loop vect with inner SLP permute
Richard Biener [Wed, 26 Jun 2024 12:07:51 +0000 (14:07 +0200)]
tree-optimization/115640 - outer loop vect with inner SLP permute

The following fixes wrong-code when using outer loop vectorization
and an inner loop SLP access with permutation.  A wrong adjustment
to the IV increment is then applied on GCN.

PR tree-optimization/115640
* tree-vect-stmts.cc (vectorizable_load): With an inner
loop SLP access do not apply a gap adjustment.

8 weeks ago amdgcn: Fix RDNA V32 permutations [PR115640]
Andrew Stubbs [Fri, 28 Jun 2024 10:47:50 +0000 (10:47 +0000)]
amdgcn: Fix RDNA V32 permutations [PR115640]

There was an off-by-one error in the RDNA validation check, plus I forgot to
allow for two-to-one permute-and-merge operations.

PR target/115640

gcc/ChangeLog:

* config/gcn/gcn.cc (gcn_vectorize_vec_perm_const): Modify RDNA checks.

8 weeks ago Add gfc_class_set_vptr.
Andre Vehreschild [Tue, 11 Jun 2024 10:52:26 +0000 (12:52 +0200)]
Add gfc_class_set_vptr.

First step towards adding a general routine that assigns all of a
class type's data members.  Having a general routine prevents
forgetting to tackle the edge cases, e.g. setting _len.

gcc/fortran/ChangeLog:

* trans-expr.cc (gfc_class_set_vptr): Add setting of _vptr
member.
* trans-intrinsic.cc (conv_intrinsic_move_alloc): First use
of gfc_class_set_vptr and refactor very similar code.
* trans.h (gfc_class_set_vptr): Declare the new function.

gcc/testsuite/ChangeLog:

* gfortran.dg/unlimited_polymorphic_11.f90: Remove unnecessary
casts in dg-final expression.

8 weeks ago Use gfc_reset_vptr more consistently.
Andre Vehreschild [Fri, 7 Jun 2024 06:57:36 +0000 (08:57 +0200)]
Use gfc_reset_vptr more consistently.

The vptr for a class type is set in various ways in different
locations.  Refactor the use and simplify code.

gcc/fortran/ChangeLog:

* trans-array.cc (structure_alloc_comps): Use reset_vptr.
* trans-decl.cc (gfc_trans_deferred_vars): Same.
(gfc_generate_function_code): Same.
* trans-expr.cc (gfc_reset_vptr): Allow supplying the class
type.
(gfc_conv_procedure_call): Use reset_vptr.
* trans-intrinsic.cc (gfc_conv_intrinsic_transfer): Same.

8 weeks ago i386: Handle sign_extend like zero_extend in *concatditi3_[346]
Roger Sayle [Fri, 28 Jun 2024 06:16:07 +0000 (07:16 +0100)]
i386: Handle sign_extend like zero_extend in *concatditi3_[346]

This patch generalizes some of the patterns in i386.md that recognize
double word concatenation, so they handle sign_extend the same way that
they handle zero_extend in appropriate contexts.

As a motivating example consider the following function:

__int128 foo(long long x, unsigned long long y)
{
  return ((__int128)x<<64) | y;
}

when compiled with -O2, x86_64 currently generates:

foo: movq    %rdi, %rdx
        xorl    %eax, %eax
        xorl    %edi, %edi
        orq     %rsi, %rax
        orq     %rdi, %rdx
        ret

with this patch we now generate (the same as if x is unsigned):

foo: movq    %rsi, %rax
        movq    %rdi, %rdx
        ret

Treating both extensions the same way using any_extend is valid as
the top (extended) bits are "unused" after the shift by 64 (or more).
In theory, the RTL optimizers might consider canonicalizing the form
of extension used in these cases, but zero_extend is faster on some
machines, whereas sign extension is supported via addressing modes on
others, so handling both in the machine description is probably best.
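
The equivalence of the two extensions can be checked with a small
sketch.  The helper names are hypothetical, and the code assumes a
target with __int128 and a 64-bit long long.  The shifts are done on
the unsigned type so that they stay well defined for negative x.

```c
#include <assert.h>

typedef unsigned __int128 u128;

/* Sign-extend x to 128 bits, then shift it into the high half.  */
u128 concat_sext (long long x, unsigned long long y)
{
  return ((u128) (__int128) x << 64) | y;
}

/* Zero-extend x to 128 bits instead, then shift.  */
u128 concat_zext (long long x, unsigned long long y)
{
  return ((u128) (unsigned long long) x << 64) | y;
}

/* Small self-check; call it from main () or a test harness.  */
void check_concat (void)
{
  /* The extension bits are shifted out, so both forms agree.  */
  assert (concat_sext (-1, 5) == concat_zext (-1, 5));
  assert (concat_sext (-2, 7) == concat_zext (-2, 7));
  assert (concat_sext (3, 0) == concat_zext (3, 0));
}
```

Whatever the extension writes into bits 64..127 is discarded by the
shift by 64, so only the low 64 bits of x reach the result.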

2024-06-28  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
* config/i386/i386.md (*concat<mode><dwi>3_3): Change zero_extend
to any_extend in first operand to left shift by mode precision.
(*concat<mode><dwi>3_4): Likewise.
(*concat<mode><dwi>3_6): Likewise.

gcc/testsuite/ChangeLog
* gcc.target/i386/concatditi-1.c: New test case.

8 weeks ago i386: Some additional AVX512 ternlog refinements.
Roger Sayle [Fri, 28 Jun 2024 06:12:53 +0000 (07:12 +0100)]
i386: Some additional AVX512 ternlog refinements.

This patch is another round of refinements to fine tune the new ternlog
infrastructure in i386's sse.md.  This patch tweaks ix86_ternlog_idx
to allow multiple MEM/CONST_VECTOR/VEC_DUPLICATE operands prior to
splitting (before reload), when force_register is called on all but
one of these operands.  Conceptually during the dynamic programming,
registers fill the args slots in the order 0, 1, 2, and mem-like
operands fill the slots in the order 2, 0, 1 [preferring the memory
operand to come last].

This patch allows us to remove some of the legacy ternlog patterns
in sse.md without regressions [which is left to the next and final
patch in this series].  An indication that these patterns are no
longer required is shown by the necessary testsuite tweaks below,
where the assembler output for the legacy instructions used hexadecimal,
whereas the new ternlog infrastructure consistently uses decimal.

2024-06-28  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
* config/i386/i386-expand.cc (ix86_ternlog_idx) <case VEC_DUPLICATE>:
Add a "goto do_mem_operand" as this need not match memory_operand.
<case CONST_VECTOR>: Only args[2] may be volatile memory operand.
Allow MEM/VEC_DUPLICATE/CONST_VECTOR as args[0] and args[1].

gcc/testsuite/ChangeLog
* gcc.target/i386/avx512f-andn-di-zmm-2.c: Match decimal instead
of hexadecimal immediate operand to ternlog.
* gcc.target/i386/avx512f-andn-si-zmm-2.c: Likewise.
* gcc.target/i386/avx512f-orn-si-zmm-1.c: Likewise.
* gcc.target/i386/avx512f-orn-si-zmm-2.c: Likewise.
* gcc.target/i386/pr100711-3.c: Likewise.
* gcc.target/i386/pr100711-4.c: Likewise.
* gcc.target/i386/pr100711-5.c: Likewise.

8 weeks ago Daily bump.
GCC Administrator [Fri, 28 Jun 2024 00:18:04 +0000 (00:18 +0000)]
Daily bump.

8 weeks ago libgccjit: Add ability to get the alignment of a type
Antoni Boucher [Thu, 4 Apr 2024 22:57:07 +0000 (18:57 -0400)]
libgccjit: Add ability to get the alignment of a type

gcc/jit/ChangeLog:

* docs/topics/compatibility.rst (LIBGCCJIT_ABI_28): New ABI tag.
* docs/topics/expressions.rst: Document gcc_jit_context_new_alignof.
* jit-playback.cc (new_alignof): New method.
* jit-playback.h: New method.
* jit-recording.cc (recording::context::new_alignof): New
method.
(recording::memento_of_sizeof::replay_into,
recording::memento_of_typeinfo::replay_into,
recording::memento_of_sizeof::make_debug_string,
recording::memento_of_typeinfo::make_debug_string,
recording::memento_of_sizeof::write_reproducer,
recording::memento_of_typeinfo::write_reproducer): Rename.
* jit-recording.h (enum type_info_type): New enum.
(class memento_of_sizeof, class memento_of_typeinfo): Rename.
* libgccjit.cc (gcc_jit_context_new_alignof): New function.
* libgccjit.h (gcc_jit_context_new_alignof): New function.
* libgccjit.map: New function.

gcc/testsuite/ChangeLog:

* jit.dg/all-non-failing-tests.h: New test.
* jit.dg/test-alignof.c: New test.

8 weeks ago c: Error message for incorrect use of static in array declarations.
Martin Uecker [Thu, 27 Jun 2024 19:47:56 +0000 (21:47 +0200)]
c: Error message for incorrect use of static in array declarations.

Add an explicit error message when C99's static is
used without a size expression in an array declarator.
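
As an illustration (not taken from the new testcase), the sketch below
shows a valid use of static in a parameter array declarator, with the
rejected form shown only in a comment.  The function names are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Valid: `static 4' promises that callers pass at least 4 elements.  */
int sum4 (const int a[static 4])
{
  int sum = 0;
  for (size_t i = 0; i < 4; i++)
    sum += a[i];
  return sum;
}

/* Invalid, shown only as a comment because it must not compile:
     int bad (const int a[static]);
   The C grammar requires a size expression after static.  */

/* Small self-check; call it from main () or a test harness.  */
void check_sum4 (void)
{
  int v[4] = { 1, 2, 3, 4 };
  assert (sum4 (v) == 10);
}
```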

gcc/c:
* c-parser.cc (c_parser_direct_declarator_inner): Add
error message.

gcc/testsuite:
* gcc.dg/c99-arraydecl-4.c: New test.

8 weeks ago fixincludes: adjust stdio fix for macOS 15 headers
Francois-Xavier Coudert [Thu, 27 Jun 2024 16:55:22 +0000 (18:55 +0200)]
fixincludes: adjust stdio fix for macOS 15 headers

fixincludes/ChangeLog:

* fixincl.x: Regenerate.
* inclhack.def (apple_local_stdio_fn_deprecation): Also apply to
_stdio.h.

8 weeks ago Disable late-combine for -O0 [PR115677]
Richard Sandiford [Thu, 27 Jun 2024 13:51:37 +0000 (14:51 +0100)]
Disable late-combine for -O0 [PR115677]

late-combine relies on df, which for -O0 is only initialised late
(pass_df_initialize_no_opt, after split1).  Other df-based passes
cope with this by requiring optimize > 0, so this patch does the
same for late-combine.

gcc/
PR rtl-optimization/115677
* late-combine.cc (pass_late_combine::gate): New function.

8 weeks ago s390: Check for ADDR_REGS in s390_decompose_addrstyle_without_index
Stefan Schulze Frielinghaus [Thu, 27 Jun 2024 13:46:24 +0000 (15:46 +0200)]
s390: Check for ADDR_REGS in s390_decompose_addrstyle_without_index

An explicit check for address registers was not required so far since
during register allocation the processing of address constraints was
sufficient.  However, address constraints themselves do not check for
REGNO_OK_FOR_{BASE,INDEX}_P.  Thus, with the newly introduced
late-combine pass in r15-1579-g792f97b44ffc5e we generate new insns with
invalid address registers which aren't fixed up afterwards.

Fixed by explicitly checking for address registers in
s390_decompose_addrstyle_without_index such that those new insns are
rejected.

gcc/ChangeLog:

PR target/115634
* config/s390/s390.cc (s390_decompose_addrstyle_without_index):
Check for ADDR_REGS in s390_decompose_addrstyle_without_index.

8 weeks ago tree-optimization/115669 - fix SLP reduction association
Richard Biener [Thu, 27 Jun 2024 09:26:08 +0000 (11:26 +0200)]
tree-optimization/115669 - fix SLP reduction association

The following avoids associating a reduction path as that might
get STMT_VINFO_REDUC_IDX out-of-sync with the SLP operand order.
This is a latent issue with SLP reductions but now easily exposed
as we're doing single-lane SLP reductions.

Once we have achieved SLP only, we can move and update this meta-data.

PR tree-optimization/115669
* tree-vect-slp.cc (vect_build_slp_tree_2): Do not reassociate
chains that participate in a reduction.

* gcc.dg/vect/pr115669.c: New testcase.

8 weeks ago libstdc++: Fix std::codecvt<wchar_t, char, mbstate_t> for empty dest [PR37475]
Jonathan Wakely [Tue, 11 Jun 2024 15:45:43 +0000 (16:45 +0100)]
libstdc++: Fix std::codecvt<wchar_t, char, mbstate_t> for empty dest [PR37475]

For the GNU locale model, codecvt::do_out and codecvt::do_in incorrectly
return 'ok' when the destination range is empty. That happens because
detecting incomplete output is done in the loop body, and the loop is
never even entered if to == to_end.

By restructuring the loop condition so that we check the output range
separately, we can ensure that for a non-empty source range, we always
enter the loop at least once, and detect if the destination range is too
small.

The loops also seem easier to reason about if we return immediately on
any error, instead of checking the result twice on every iteration. We
can use an RAII type to restore the locale before returning, which also
simplifies all the other member functions.

libstdc++-v3/ChangeLog:

PR libstdc++/37475
* config/locale/gnu/codecvt_members.cc (Guard): New RAII type.
(do_out, do_in): Return partial if the destination is empty but
the source is not. Use Guard to restore locale on scope exit.
Return immediately on any conversion error.
(do_encoding, do_max_length, do_length): Use Guard.
* testsuite/22_locale/codecvt/in/char/37475.cc: New test.
* testsuite/22_locale/codecvt/in/wchar_t/37475.cc: New test.
* testsuite/22_locale/codecvt/out/char/37475.cc: New test.
* testsuite/22_locale/codecvt/out/wchar_t/37475.cc: New test.

8 weeks ago [libstdc++] [testsuite] defer to check_vect_support* [PR115454]
Alexandre Oliva [Thu, 27 Jun 2024 10:22:48 +0000 (07:22 -0300)]
[libstdc++] [testsuite] defer to check_vect_support* [PR115454]

The newly-added testcase overrides the default dg-do action set by
check_vect_support_and_set_flags (in libstdc++-dg/conformance.exp), so
it attempts to run the test even if runtime vector support is not
available.

Remove the explicit dg-do directive, so that the default is honored,
and the test is run if vector support is found, and only compiled
otherwise.

for  libstdc++-v3/ChangeLog

PR libstdc++/115454
* testsuite/experimental/simd/pr115454_find_last_set.cc: Defer
to check_vect_support_and_set_flags's default dg-do action.

8 weeks ago Avoid global bitmap space in ranger.
Aldy Hernandez [Wed, 19 Jun 2024 09:42:16 +0000 (11:42 +0200)]
Avoid global bitmap space in ranger.

gcc/ChangeLog:

* gimple-range-cache.cc (update_list::update_list): Add m_bitmaps.
(update_list::~update_list): Initialize m_bitmaps.
* gimple-range-cache.h (ssa_lazy_cache): Add m_bitmaps.
* gimple-range.cc (enable_ranger): Remove global bitmap
initialization.
(disable_ranger): Remove global bitmap release.

8 weeks ago libstdc++: Fix std::format for chrono::duration with unsigned rep [PR115668]
Jonathan Wakely [Wed, 26 Jun 2024 19:22:54 +0000 (20:22 +0100)]
libstdc++: Fix std::format for chrono::duration with unsigned rep [PR115668]

Using std::chrono::abs is only valid if numeric_limits<rep>::is_signed
is true, so using it unconditionally made it ill-formed to format a
duration with an unsigned rep.

The duration formatter might as well negate the duration itself instead of
using chrono::abs, because it already needs to check for a negative
value.

libstdc++-v3/ChangeLog:

PR libstdc++/115668
* include/bits/chrono_io.h (formatter<duration<R,P>, C>::format):
Do not use chrono::abs.
* testsuite/20_util/duration/io.cc: Check formatting a duration
with unsigned rep.

8 weeks ago libstdc++: Add debug assertions to std::vector<bool> [PR103191]
Jonathan Wakely [Tue, 18 Jun 2024 09:57:45 +0000 (10:57 +0100)]
libstdc++: Add debug assertions to std::vector<bool> [PR103191]

This adds debug assertions for std::vector<bool> element access.

libstdc++-v3/ChangeLog:

PR libstdc++/103191
* include/bits/stl_bvector.h (vector<bool>::operator[])
(vector<bool>::front, vector<bool>::back): Add debug assertions.
* testsuite/23_containers/vector/bool/element_access/constexpr.cc:
Remove dg-error that no longer triggers.

8 weeks ago libstdc++: Enable more debug assertions during constant evaluation [PR111250]
Jonathan Wakely [Tue, 18 Jun 2024 19:57:13 +0000 (20:57 +0100)]
libstdc++: Enable more debug assertions during constant evaluation [PR111250]

Some of our debug assertions expand to nothing unless
_GLIBCXX_ASSERTIONS is defined, which means they are not checked during
constant evaluation. By making them unconditionally expand to a
__glibcxx_assert expression they will be checked during constant
evaluation. This allows us to diagnose more instances of undefined
behaviour at compile-time, such as accessing a vector past-the-end.

libstdc++-v3/ChangeLog:

PR libstdc++/111250
* include/debug/assertions.h (__glibcxx_requires_non_empty_range)
(__glibcxx_requires_nonempty, __glibcxx_requires_subscript):
Define to __glibcxx_assert expressions or to debug mode
__glibcxx_check_xxx expressions.
* testsuite/23_containers/array/element_access/constexpr_c++17.cc:
Add checks for out-of-bounds accesses in constant expressions.
* testsuite/23_containers/vector/element_access/constexpr.cc:
Likewise.

8 weeks ago ada: Remove last uses of System.Address_Operations in runtime library
Eric Botcazou [Wed, 12 Jun 2024 14:05:57 +0000 (16:05 +0200)]
ada: Remove last uses of System.Address_Operations in runtime library

This completes the switch from using System.Address_Operations to using only
System.Storage_Elements in the runtime library.  The remaining uses were for
simple optimizations that can be done by the optimizer alone.

gcc/ada/

* libgnat/s-carsi8.adb: Remove clauses for System.Address_Operations
and use only operations of System.Storage_Elements for addresses.
* libgnat/s-casi16.adb: Likewise.
* libgnat/s-casi32.adb: Likewise.
* libgnat/s-casi64.adb: Likewise.
* libgnat/s-casi128.adb: Likewise.
* libgnat/s-carun8.adb: Likewise.
* libgnat/s-caun16.adb: Likewise.
* libgnat/s-caun32.adb: Likewise.
* libgnat/s-caun64.adb: Likewise.
* libgnat/s-caun128.adb: Likewise.
* libgnat/s-geveop.adb: Likewise.

8 weeks ago ada: Reject ambiguous function calls in interpolated string expressions
Javier Miranda [Mon, 10 Jun 2024 17:17:59 +0000 (17:17 +0000)]
ada: Reject ambiguous function calls in interpolated string expressions

gcc/ada/

* sem_ch2.adb (Analyze_Interpolated_String_Literal): Report
interpretations of ambiguous parameterless function calls.

8 weeks ago ada: Add missing dimension information for target names
Eric Botcazou [Tue, 11 Jun 2024 17:29:22 +0000 (19:29 +0200)]
ada: Add missing dimension information for target names

It is computed from the Etype of N_Target_Name nodes.

gcc/ada/

* sem_ch5.adb (Analyze_Target_Name): Call Analyze_Dimension on the
node once the Etype is set.
* sem_dim.adb (OK_For_Dimension): Set to True for N_Target_Name.
(Analyze_Dimension): Call Analyze_Dimension_Has_Etype for it.

8 weeks ago ada: Fix array-manipulating code in Mdll
Ronan Desplanques [Thu, 2 May 2024 07:52:34 +0000 (09:52 +0200)]
ada: Fix array-manipulating code in Mdll

This patch fixes a pair of array assignments in Mdll that were bound
to fail.

gcc/ada/

* mdll.adb (Build_Non_Reloc_DLL): Fix incorrect assignment
to array object.
(Ada_Build_Non_Reloc_DLL): Likewise.

8 weeks ago ada: Bug using user defined string literals with interpolated strings
Javier Miranda [Thu, 6 Jun 2024 11:48:02 +0000 (11:48 +0000)]
ada: Bug using user defined string literals with interpolated strings

The frontend rejects the use of user defined string literals
using interpolated strings.

gcc/ada/

* sem_res.adb (Has_Applicable_User_Defined_Literal): Add missing
support for interpolated strings.

8 weeks ago ada: Overridden operation field not correctly set for controlling result wrappers
Martin Clochard [Fri, 7 Jun 2024 09:44:45 +0000 (11:44 +0200)]
ada: Overridden operation field not correctly set for controlling result wrappers

Implicit wrapper overridings generated for functions with
controlling result when deriving with null extension may
have field Overridden_Operation incorrectly set, when making
several such derivations in succession. This happens because
overridings were assumed to come from source, and entities
generated by Derive_Subprograms were also assumed to be
derived from source subprograms. Overridden_Operation could
be set to the entity generated by Derive_Subprograms for the
same type, resulting in a cycle between Overridden_Operation
and Alias fields, causing non-termination in GNATprove.

gcc/ada/

* sem_ch6.adb (Check_Overriding_Indicator): Remove Comes_From_Source filter.
(New_Overloaded_Entity): Move up special case of LSP_Subprogram,
and remove Comes_From_Source filter.

8 weeks ago ada: Implement first half of Generalized Finalization
Eric Botcazou [Wed, 5 Jun 2024 21:19:53 +0000 (23:19 +0200)]
ada: Implement first half of Generalized Finalization

This implements the first half of the Generalized Finalization proposal,
namely the Finalizable aspect as well as its optional relaxed semantics
for the finalization operations, but the latter part is only implemented
for dynamically allocated objects.

In accordance with the spirit, if not the letter, of the proposal, this
implements the finalizable types declared with strict semantics for the
finalization operations as a direct generalization of controlled types,
which in turn makes it possible to reimplement the latter types in terms
of the former types and ensures full interoperability between them.

The relaxed semantics for the finalization operations is also a direct
generalization of the GNAT pragma No_Heap_Finalization for dynamically
allocated objects, in that it extends the effects of the pragma to all
access types designating the finalizable type, instead of just applying
them to library-level named access types.

gcc/ada/

* aspects.ads (Aspect_Id): Add Aspect_Finalizable.
(Implementation_Defined_Aspect): Add True for Aspect_Finalizable.
(Operational_Aspect): Add True for Aspect_Finalizable.
(Aspect_Argument): Add Expression for Aspect_Finalizable.
(Is_Representation_Aspect): Add False for Aspect_Finalizable.
(Aspect_Names): Add Name_Finalizable for Aspect_Finalizable.
(Aspect_Delay): Add Always_Delay for Aspect_Finalizable.
* checks.adb: Add with and use clauses for Sem_Elab.
(Install_Primitive_Elaboration_Check): Call Is_Controlled_Procedure.
* einfo.ads (Has_Relaxed_Finalization): Document new flag.
(Is_Controlled_Active): Update documentation.
* exp_aggr.adb (Generate_Finalization_Actions): Replace Find_Prim_Op
with Find_Controlled_Prim_Op for Name_Finalize.
* exp_attr.adb (Expand_N_Attribute_Reference) <Finalization_Size>:
Return 0 if the prefix type has relaxed finalization.
* exp_ch3.adb (Build_Equivalent_Record_Aggregate): Return Empty if
the type needs finalization.
(Expand_Freeze_Record_Type): Call Find_Controlled_Prim_Op instead of
Find_Prim_Op for Name_{Adjust,Initialize,Finalize}.
Call Make_Finalize_Address_Body for all controlled types.
* exp_ch4.adb (Insert_Dereference_Action): Do not generate a call to
Adjust_Controlled_Dereference if the designated type has relaxed
finalization.
* exp_ch6.adb (Needs_BIP_Collection): Return false for an untagged
type that has relaxed finalization.
* exp_ch7.adb (Allows_Finalization_Collection): Return false if the
designated type has relaxed finalization.
(Check_Visibly_Controlled): Call Find_Controlled_Prim_Op instead of
Find_Prim_Op.
(Make_Adjust_Call): Likewise.
(Make_Deep_Record_Body): Likewise.
(Make_Final_Call): Likewise.
(Make_Init_Call): Likewise.
* exp_disp.adb (Set_All_DT_Position): Remove obsolete warning.
* exp_util.ads: Add with and use clauses for Snames.
(Find_Prim_Op): Add precondition.
(Find_Controlled_Prim_Op): New function declaration.
(Name_Of_Controlled_Prim_Op): Likewise.
* exp_util.adb: Remove with and use clauses for Snames.
(Build_Allocate_Deallocate_Proc): Do not build finalization actions
if the designated type has relaxed finalization.
(Find_Controlled_Prim_Op): New function.
(Find_Last_Init): Call Find_Controlled_Prim_Op instead of
Find_Prim_Op.
(Name_Of_Controlled_Prim_Op): New function.
* freeze.adb (Freeze_Entity.Freeze_Record_Type): Propagate the
Has_Relaxed_Finalization flag from components.
* gen_il-fields.ads (Opt_Field_Enum): Add Has_Relaxed_Finalization.
* gen_il-gen-gen_entities.adb (Entity_Kind): Likewise.
* sem_aux.adb (Is_By_Reference_Type): Return true for all controlled
types.
* sem_ch3.adb (Build_Derived_Record_Type): Do not special case types
declared in Ada.Finalization.
(Record_Type_Definition): Propagate the Has_Relaxed_Finalization
flag from components.
* sem_ch13.adb (Analyze_Aspects_At_Freeze_Point): Also process the
Finalizable aspect.
(Analyze_Aspect_Specifications): Likewise. Call Flag_Non_Static_Expr
in more cases.
(Check_Aspect_At_Freeze_Point): Likewise.
(Inherit_Aspects_At_Freeze_Point): Likewise.
(Resolve_Aspect_Expressions): Likewise.
(Resolve_Finalizable_Argument): New procedure.
(Validate_Finalizable_Aspect): Likewise.
* sem_elab.ads: Add with and use clauses for Snames.
(Is_Controlled_Procedure): New function declaration.
* sem_elab.adb: Remove with and use clauses for Snames.
(Is_Controlled_Proc): Move to...
(Is_Controlled_Procedure): ...here and rename.
(Check_A_Call): Call Find_Controlled_Prim_Op instead of
Find_Prim_Op.
(Is_Finalization_Procedure): Likewise.
* sem_util.ads (Propagate_Controlled_Flags): Update documentation.
* sem_util.adb (Is_Fully_Initialized_Type): Replace call to
Find_Optional_Prim_Op with Find_Controlled_Prim_Op.
Call Has_Null_Extension only for derived tagged types.
(Propagate_Controlled_Flags): Propagate Has_Relaxed_Finalization.
* snames.ads-tmpl (Name_Finalizable): New name.
(Name_Relaxed_Finalization): Likewise.
* libgnat/s-finroo.ads (Root_Controlled): Add Finalizable aspect.
* doc/gnat_rm/gnat_language_extensions.rst: Document implementation
of Generalized Finalization.
* gnat_rm.texi: Regenerate.
* gnat_ugn.texi: Regenerate.

8 weeks ago i386: Refactor vcvttps2qq/vcvtqq2ps patterns.
Hu, Lin1 [Tue, 25 Jun 2024 10:25:59 +0000 (18:25 +0800)]
i386: Refactor vcvttps2qq/vcvtqq2ps patterns.

Refactor vcvttps2qq/vcvtqq2ps patterns to remove the redundant
round_*_modev8sf_condition.

gcc/ChangeLog:

* config/i386/sse.md
(float<floatunssuffix><sselongvecmodelower><mode>2<mask_name>
<round_name>): Refactor the pattern.
(unspec_fix<vcvtt_uns_suffix>_trunc<mode><sselongvecmodelower>2
<mask_name><round_saeonly_name>): Ditto.
(fix<fixunssuffix>_trunc<mode><sselongvecmodelower>2<mask_name>
<round_saeonly_name>): Ditto.
* config/i386/subst.md (round_modev8sf_condition): Remove.
(round_saeonly_modev8sf_condition): Ditto.

8 weeks ago vect: support direct conversion under x86-64-v3.
Hu, Lin1 [Wed, 6 Mar 2024 11:58:48 +0000 (19:58 +0800)]
vect: support direct conversion under x86-64-v3.

gcc/ChangeLog:

PR target/107432
* config/i386/i386-expand.cc (ix86_expand_trunc_with_avx2_noavx512f):
New function to generate a series of suitable insns.
* config/i386/i386-protos.h (ix86_expand_trunc_with_avx2_noavx512f):
Define new function.
* config/i386/sse.md: Extend trunc<mode><mode>2 for x86-64-v3.
(ssebytemode): Add V8HI.
(PMOV_DST_MODE_2_AVX2): New mode iterator.
(PMOV_SRC_MODE_3_AVX2): Ditto.
* config/i386/mmx.md
(trunc<mode><mmxhalfmodelower>2): Ditto.
(avx512vl_trunc<mode><mmxhalfmodelower>2): Ditto.
(truncv2si<mode>2): Ditto.
(avx512vl_truncv2si<mode>2): Ditto.
(mmxbytemode): New mode attr.

gcc/testsuite/ChangeLog:

PR target/107432
* gcc.target/i386/pr107432-8.c: New test.
* gcc.target/i386/pr107432-9.c: Ditto.
* gcc.target/i386/pr92645-4.c: Modify test.

8 weeks ago vect: Support v4hi -> v4qi.
Hu, Lin1 [Wed, 28 Feb 2024 10:11:55 +0000 (18:11 +0800)]
vect: Support v4hi -> v4qi.

gcc/ChangeLog:

PR target/107432
* config/i386/mmx.md
(VI2_32_64): New mode iterator.
(mmxhalfmode): New mode attr.
(mmxhalfmodelower): Ditto.
(truncv2hiv2qi2): Extend mode v4hi and change name from
truncv2hiv2qi to trunc<mode><mmxhalfmodelower>2.

gcc/testsuite/ChangeLog:

PR target/107432
* gcc.target/i386/pr107432-1.c: Modify test.
* gcc.target/i386/pr107432-6.c: Add test.
* gcc.target/i386/pr108938-3.c: Supporting truncv4hiv4qi
affects the bswap optimization, so I added the -mno-avx
option for now and opened a bugzilla.

8 weeks ago vect: generate suitable convert insn for int -> int, float -> float and int <-> float.
Hu, Lin1 [Thu, 1 Feb 2024 07:15:01 +0000 (15:15 +0800)]
vect: generate suitable convert insn for int -> int, float -> float and int <-> float.

gcc/ChangeLog:

PR target/107432
* tree-vect-generic.cc
(expand_vector_conversion): Support convert for int -> int,
float -> float and int <-> float.
* tree-vect-stmts.cc (vectorizable_conversion): Wrap the
indirect convert part.
(supportable_indirect_convert_operation): New function.
* tree-vectorizer.h (supportable_indirect_convert_operation):
Define the new function.

gcc/testsuite/ChangeLog:

PR target/107432
* gcc.target/i386/pr107432-1.c: New test.
* gcc.target/i386/pr107432-2.c: Ditto.
* gcc.target/i386/pr107432-3.c: Ditto.
* gcc.target/i386/pr107432-4.c: Ditto.
* gcc.target/i386/pr107432-5.c: Ditto.
* gcc.target/i386/pr107432-6.c: Ditto.
* gcc.target/i386/pr107432-7.c: Ditto.

8 weeks ago RISC-V: Add testcases for vector truncate after .SAT_SUB
Pan Li [Mon, 24 Jun 2024 14:25:57 +0000 (22:25 +0800)]
RISC-V: Add testcases for vector truncate after .SAT_SUB

This patch adds test cases for vector truncation after .SAT_SUB,
i.e.:

  #define DEF_VEC_SAT_U_SUB_TRUNC_FMT_1(OUT_T, IN_T)                   \
  void __attribute__((noinline))                                       \
  vec_sat_u_sub_trunc_##OUT_T##_fmt_1 (OUT_T *out, IN_T *op_1, IN_T y, \
        unsigned limit)                 \
  {                                                                    \
    unsigned i;                                                        \
    for (i = 0; i < limit; i++)                                        \
      {                                                                \
        IN_T x = op_1[i];                                              \
        out[i] = (OUT_T)(x >= y ? x - y : 0);                          \
      }                                                                \
  }

The following 3 cases are included.

DEF_VEC_SAT_U_SUB_TRUNC_FMT_1(uint8_t, uint16_t)
DEF_VEC_SAT_U_SUB_TRUNC_FMT_1(uint16_t, uint32_t)
DEF_VEC_SAT_U_SUB_TRUNC_FMT_1(uint32_t, uint64_t)
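
A scalar sketch of the uint8_t/uint16_t case may help when reading the
tests.  The function names are hypothetical and this is not the test
header itself.  Note that only the subtraction saturates; the final
conversion to the narrow type wraps.

```c
#include <assert.h>
#include <stdint.h>

/* One loop iteration: saturating unsigned subtraction followed by a
   wrapping truncation to the narrow type.  */
uint8_t sat_u_sub_trunc_u8 (uint16_t x, uint16_t y)
{
  return (uint8_t) (x >= y ? x - y : 0);
}

/* The loop form, mirroring the shape of the macro body.  */
void vec_sat_u_sub_trunc_u8 (uint8_t *out, const uint16_t *op_1,
                             uint16_t y, unsigned limit)
{
  for (unsigned i = 0; i < limit; i++)
    out[i] = sat_u_sub_trunc_u8 (op_1[i], y);
}

/* Small self-check; call it from main () or a test harness.  */
void check_sat_u_sub_trunc (void)
{
  assert (sat_u_sub_trunc_u8 (10, 20) == 0);    /* saturates at 0 */
  assert (sat_u_sub_trunc_u8 (300, 45) == 255); /* 255 fits in uint8_t */
  assert (sat_u_sub_trunc_u8 (300, 44) == 0);   /* 256 wraps to 0 */
}
```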

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/binop/vec_sat_arith.h: Add helper
test macros.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_binary_scalar.h: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub_trunc-1.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub_trunc-2.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub_trunc-3.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub_trunc-run-1.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub_trunc-run-2.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub_trunc-run-3.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>
8 weeks ago LoongArch: NFC: Dedup and sort the comment in loongarch_print_operand_reloc
Xi Ruoyao [Sun, 16 Jun 2024 04:22:40 +0000 (12:22 +0800)]
LoongArch: NFC: Dedup and sort the comment in loongarch_print_operand_reloc

gcc/ChangeLog:

* config/loongarch/loongarch.cc (loongarch_print_operand_reloc):
Dedup and sort the comment describing modifiers.

8 weeks ago LoongArch: Tweak IOR rtx_cost for bstrins
Xi Ruoyao [Sat, 15 Jun 2024 10:29:43 +0000 (18:29 +0800)]
LoongArch: Tweak IOR rtx_cost for bstrins

Consider

    c &= 0xfff;
    a &= ~0xfff;
    b &= ~0xfff;
    a |= c;
    b |= c;

This can be done with 2 bstrins instructions.  But we need to recognize
it in loongarch_rtx_costs or the compiler will not propagate "c & 0xfff"
forward.
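
The sequence above amounts to a bit-field insertion: each of a and b
ends up with its low 12 bits replaced by the low 12 bits of c.  A
minimal sketch of that operation, with a hypothetical helper name and
assuming 64-bit operands:

```c
#include <assert.h>
#include <stdint.h>

/* Replace the low 12 bits of dst with the low 12 bits of src; this is
   (dst & ~0xfff) | (src & 0xfff) written as one helper.  */
uint64_t insert_low12 (uint64_t dst, uint64_t src)
{
  return (dst & ~(uint64_t) 0xfff) | (src & 0xfff);
}

/* Small self-check; call it from main () or a test harness.  */
void check_insert_low12 (void)
{
  assert (insert_low12 (0xABCDE123u, 0xFFFFF456u) == 0xABCDE456u);
  assert (insert_low12 (0, 0x1fff) == 0xfff);
}
```

Applying the helper once to a and once to b, with the same c,
corresponds to the two insertions mentioned above.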

gcc/ChangeLog:

* config/loongarch/loongarch.cc
(loongarch_use_bstrins_for_ior_with_mask): Split the main logic
into ...
(loongarch_use_bstrins_for_ior_with_mask_1): ... here.
(loongarch_rtx_costs): Special-case IORs that can be
implemented with bstrins.

gcc/testsuite/ChangeLog:

* gcc.target/loongarch/bstrins-3.c: New test.

8 weeks ago Fix wrong cost of MEM when addr is a lea.
liuhongt [Mon, 24 Jun 2024 09:53:22 +0000 (17:53 +0800)]
Fix wrong cost of MEM when addr is a lea.

416.gamess regressed 4-6% on x86_64 since my r15-882-g1d6199e5f8c1c0.
That commit adjusted the rtx_cost of MEM to reduce the cost of
(add op0 disp).  But the cost of ADDR can be cheaper than that of
XEXP (addr, 0) when it is a lea, which is the case in the PR.  This
patch adjusts rtx_cost to handle only reg + disp; the other forms are
basically all LEAs, which do not have the additional cost of an ADD.

gcc/ChangeLog:

PR target/115462
* config/i386/i386.cc (ix86_rtx_costs): Make cost of MEM (reg +
disp) just a little bit more than MEM (reg).

gcc/testsuite/ChangeLog:
* gcc.target/i386/pr115462.c: New test.

8 weeks ago Internal-fn: Support new IFN SAT_TRUNC for unsigned scalar int
Pan Li [Wed, 26 Jun 2024 01:28:05 +0000 (09:28 +0800)]
Internal-fn: Support new IFN SAT_TRUNC for unsigned scalar int

This patch adds the middle-end representation for saturating
truncation, i.e. the result of the truncation is set to the max
value of the narrow type on overflow.  It matches patterns similar
to the one below.

Form 1:
  #define DEF_SAT_U_TRUC_FMT_1(WT, NT) \
  NT __attribute__((noinline))         \
  sat_u_truc_##T##_fmt_1 (WT x)        \
  {                                    \
    bool overflow = x > (WT)(NT)(-1);  \
    return ((NT)x) | (NT)-overflow;    \
  }

For example, when truncating uint16_t to uint8_t, we have

* SAT_TRUNC (254)   => 254
* SAT_TRUNC (255)   => 255
* SAT_TRUNC (256)   => 255
* SAT_TRUNC (65535) => 255
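
A hand-expanded sketch of Form 1 for the uint16_t to uint8_t case can
be used to check values like these.  The function name is
hypothetical and this is not code from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Form 1 with WT = uint16_t and NT = uint8_t.  */
uint8_t sat_u_trunc_u16_to_u8 (uint16_t x)
{
  bool overflow = x > (uint16_t) (uint8_t) -1;  /* x > 255 */
  return (uint8_t) x | (uint8_t) -overflow;     /* OR with 0xff on overflow */
}

/* Small self-check; call it from main () or a test harness.  */
void check_sat_u_trunc (void)
{
  assert (sat_u_trunc_u16_to_u8 (254) == 254);
  assert (sat_u_trunc_u16_to_u8 (255) == 255);
  assert (sat_u_trunc_u16_to_u8 (256) == 255);
}
```

Negating the boolean gives an all-ones mask on overflow, so the OR
forces the result to the max value of the narrow type.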

Consider the SAT_TRUNC below, from uint64_t to uint32_t.

DEF_SAT_U_TRUC_FMT_1 (uint64_t, uint32_t)

Before this patch:
__attribute__((noinline))
uint32_t sat_u_truc_T_fmt_1 (uint64_t x)
{
  _Bool overflow;
  unsigned int _1;
  unsigned int _2;
  unsigned int _3;
  uint32_t _6;

;;   basic block 2, loop depth 0
;;    pred:       ENTRY
  overflow_5 = x_4(D) > 4294967295;
  _1 = (unsigned int) x_4(D);
  _2 = (unsigned int) overflow_5;
  _3 = -_2;
  _6 = _1 | _3;
  return _6;
;;    succ:       EXIT

}

After this patch:
__attribute__((noinline))
uint32_t sat_u_truc_T_fmt_1 (uint64_t x)
{
  uint32_t _6;

;;   basic block 2, loop depth 0
;;    pred:       ENTRY
  _6 = .SAT_TRUNC (x_4(D)); [tail call]
  return _6;
;;    succ:       EXIT

}

The following tests pass with this patch:
* The rv64gcv full regression tests.
* The rv64gcv build with glibc.
* The x86 bootstrap tests.
* The x86 full regression tests.

gcc/ChangeLog:

* internal-fn.def (SAT_TRUNC): Add new signed IFN sat_trunc as
unary_convert.
* match.pd: Add new matching pattern for unsigned int sat_trunc.
* optabs.def (OPTAB_CL): Add unsigned and signed optab.
* tree-ssa-math-opts.cc (gimple_unsigned_integer_sat_trunc): Add
new decl for the matching pattern generated func.
(match_unsigned_saturation_trunc): Add new func impl to match
the .SAT_TRUNC.
(math_opts_dom_walker::after_dom_children): Add .SAT_TRUNC match
function under BIT_IOR_EXPR case.

Signed-off-by: Pan Li <pan2.li@intel.com>
8 weeks agoVect: Support truncate after .SAT_SUB pattern in zip
Pan Li [Thu, 27 Jun 2024 01:28:04 +0000 (09:28 +0800)]
Vect: Support truncate after .SAT_SUB pattern in zip

The zip benchmark of coremark-pro has one SAT_SUB-like pattern, but
with a truncation, as below:

void test (uint16_t *x, unsigned b, unsigned n)
{
  unsigned a = 0;
  register uint16_t *p = x;

  do {
    a = *--p;
    *p = (uint16_t)(a >= b ? a - b : 0); // Truncate after .SAT_SUB
  } while (--n);
}

Before the vect pass it has the gimple below, which cannot hit any
SAT_SUB pattern and so cannot be vectorized to SAT_SUB.

_2 = a_11 - b_12(D);
iftmp.0_13 = (short unsigned int) _2;
_18 = a_11 >= b_12(D);
iftmp.0_5 = _18 ? iftmp.0_13 : 0;

This patch improves the pattern match to recognize the above as a
truncate-after-.SAT_SUB pattern.  We then get a pattern similar to
the one below, and the first 3 stmts become dead and are eliminated.

_2 = a_11 - b_12(D);
iftmp.0_13 = (short unsigned int) _2;
_18 = a_11 >= b_12(D);
iftmp.0_5 = (short unsigned int).SAT_SUB (a_11, b_12(D));
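
A small scalar sketch of why the rewrite is valid: truncating the
result of an unsigned saturating subtraction gives the same value as
the original conditional.  The helper names below are hypothetical
and only model one loop iteration.

```c
#include <assert.h>
#include <stdint.h>

/* Unsigned saturating subtraction: a - b, clamped to 0 when b > a.  */
unsigned sat_sub_u (unsigned a, unsigned b)
{
  return a >= b ? a - b : 0;
}

/* One element of the loop body as written in the benchmark.  */
uint16_t body_original (uint16_t elem, unsigned b)
{
  unsigned a = elem;
  return (uint16_t) (a >= b ? a - b : 0);
}

/* The same element after the match: truncate the SAT_SUB result.  */
uint16_t body_rewritten (uint16_t elem, unsigned b)
{
  unsigned a = elem;
  return (uint16_t) sat_sub_u (a, b);
}

/* Usage: both forms agree, including when b exceeds the element.  */
void demo_sat_sub_trunc (void)
{
  assert (body_original (10, 3) == 7);
  assert (body_rewritten (10, 3) == 7);
  assert (body_original (3, 10) == 0);
  assert (body_rewritten (3, 10) == 0);
}
```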

The following tests pass with this patch.
1. The rv64gcv full regression tests.
2. The rv64gcv build with glibc.
3. The x86 bootstrap tests.
4. The x86 full regression tests.

gcc/ChangeLog:

* match.pd: Add convert description for minus and capture.
* tree-vect-patterns.cc (vect_recog_build_binary_gimple_call): Add
new logic to handle in_type being incompatible with out_type, as
well as rename from.
(vect_recog_build_binary_gimple_stmt): Rename to.
(vect_recog_sat_add_pattern): Leverage above renamed func.
(vect_recog_sat_sub_pattern): Ditto.

Signed-off-by: Pan Li <pan2.li@intel.com>
8 weeks agotree-optimization/115652 - amend last fix
Richard Biener [Wed, 26 Jun 2024 17:23:26 +0000 (19:23 +0200)]
tree-optimization/115652 - amend last fix

The previous fix breaks in the degenerate case when the discovered
last_stmt is equal to the first stmt in the block since then we
undo a required stmt advancement.

PR tree-optimization/115652
* tree-vect-slp.cc (vect_schedule_slp_node): Only insert
at the start of the block if that strictly dominates
the discovered dependent stmt.

8 weeks agotree-optimization/115493 - complete previous fix
Richard Biener [Wed, 26 Jun 2024 17:11:04 +0000 (19:11 +0200)]
tree-optimization/115493 - complete previous fix

The following fixes the 2nd occurrence of new_temp missed with the
previous fix.

PR tree-optimization/115493
* tree-vect-loop.cc (vect_create_epilog_for_reduction): Use
first scalar result.

8 weeks agoDaily bump.
GCC Administrator [Thu, 27 Jun 2024 00:17:31 +0000 (00:17 +0000)]
Daily bump.

8 weeks agolibstdc++: Add script to update docs for a new release branch
Jonathan Wakely [Tue, 25 Jun 2024 22:59:19 +0000 (23:59 +0100)]
libstdc++: Add script to update docs for a new release branch

This should be run on a release branch after branching from trunk.
Various links and references to trunk in the docs will be updated to
refer to the new release branch.

libstdc++-v3/ChangeLog:

* scripts/update_release_branch.sh: New file.

8 weeks agolibstdc++: Remove duplicate test
Jonathan Wakely [Thu, 20 Jun 2024 21:17:08 +0000 (22:17 +0100)]
libstdc++: Remove duplicate test

We currently have 808590.cc which only runs for C++98 mode, and
808590-cxx11.cc which only runs for C++11 and later, but have almost
identical content (except for a defaulted special member in the C++11
one, to suppress a -Wdeprecated-copy warning).

This was done originally to ensure that the test ran for both C++98 mode
and C++11 mode, because the logic being tested was different enough to
need both to be tested. But it's trivial to run all tests in multiple
-std modes now, using GLIBCXX_TESTSUITE_STDS, so we don't need two
separate tests. We can remove one of the tests and allow the other one
to run in any -std mode.

libstdc++-v3/ChangeLog:

* testsuite/20_util/specialized_algorithms/uninitialized_copy/808590.cc:
Copy defaulted assignment operator from 808590-cxx11.cc to
suppress a warning.
* testsuite/20_util/specialized_algorithms/uninitialized_copy/808590-cxx11.cc:
Removed.

8 weeks agolibstdc++: Increase timeouts for PSTL tests in debug mode [PR90276]
Jonathan Wakely [Wed, 12 Jun 2024 16:11:23 +0000 (17:11 +0100)]
libstdc++: Increase timeouts for PSTL tests in debug mode [PR90276]

These tests compile very slowly in debug mode.

libstdc++-v3/ChangeLog:

PR libstdc++/90276
* testsuite/25_algorithms/pstl/alg_modifying_operations/rotate_copy.cc:
Increase timeout for debug mode.
* testsuite/25_algorithms/pstl/alg_modifying_operations/transform_binary.cc:
Likewise.
* testsuite/25_algorithms/pstl/alg_nonmodifying/mismatch.cc:
Likewise.
* testsuite/25_algorithms/pstl/alg_sorting/lexicographical_compare.cc:
Likewise.
* testsuite/25_algorithms/pstl/alg_sorting/minmax_element.cc:
Likewise.
* testsuite/25_algorithms/pstl/alg_sorting/set_symmetric_difference.cc:
Likewise.

8 weeks agolibstdc++: Work around some PSTL test failures for debug mode [PR90276]
Jonathan Wakely [Thu, 6 Jun 2024 10:50:06 +0000 (11:50 +0100)]
libstdc++: Work around some PSTL test failures for debug mode [PR90276]

This addresses one known failure due to a bug in the upstream tests, and
a number of timeouts due to the algorithms running much more slowly with
debug mode checks enabled.

libstdc++-v3/ChangeLog:

PR libstdc++/90276
* testsuite/25_algorithms/pstl/alg_sorting/partial_sort.cc
[_GLIBCXX_DEBUG]: Add xfail-run-if for debug mode.
* testsuite/25_algorithms/pstl/alg_nonmodifying/nth_element.cc
[_GLIBCXX_DEBUG]: Reduce size of test data.
* testsuite/25_algorithms/pstl/alg_sorting/includes.cc:
Likewise.
* testsuite/25_algorithms/pstl/alg_sorting/set_util.h:
Likewise.

8 weeks agolibstdc++: Fix std::chrono::tzdb to work with vanguard format
Jonathan Wakely [Tue, 30 Apr 2024 08:52:13 +0000 (09:52 +0100)]
libstdc++: Fix std::chrono::tzdb to work with vanguard format

I found some issues in the std::chrono::tzdb parser by testing the
tzdata "vanguard" format, which uses new features that aren't enabled in
the "main" and "rearguard" data formats.

Since 2024a the keyword "minimum" is no longer valid for the FROM and TO
fields in a Rule line, which means that "m" is now a valid abbreviation
for "maximum". Previously we expected either "mi" or "ma". For backwards
compatibility, a FROM field beginning with "mi" is still supported and
is treated as 1900. The "maximum" keyword is only allowed in TO now,
because it makes no sense in FROM. To support these changes the
minmax_year and minmax_year2 classes for parsing FROM and TO are
replaced with a single years_from_to class that reads both fields.
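
The FROM/TO rules above can be sketched in C as follows.  This is
only an illustration of the described behaviour, not the libstdc++
years_from_to class: the function names, the use of INT_MAX for
"maximum", and the prefix-based abbreviation check are assumptions.
The "only" keyword for TO comes from the zic Rule format rather than
from this commit.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Nonzero if S is a non-empty prefix of WORD, e.g. "m" for "maximum".  */
static int is_abbrev (const char *s, const char *word)
{
  size_t n = strlen (s);
  return n > 0 && n <= strlen (word) && strncmp (s, word, n) == 0;
}

/* Parse a decimal year; returns 0 on success, -1 on failure.  */
static int parse_year (const char *s, int *year)
{
  char *end;
  long v = strtol (s, &end, 10);
  if (end == s || *end != '\0' || v < INT_MIN || v > INT_MAX)
    return -1;
  *year = (int) v;
  return 0;
}

/* FROM: a legacy field beginning with "mi" is treated as 1900;
   "maximum" (or an abbreviation) is rejected.  */
int parse_from (const char *field, int *year)
{
  if (strncmp (field, "mi", 2) == 0)
    {
      *year = 1900;
      return 0;
    }
  if (is_abbrev (field, "maximum"))
    return -1;
  return parse_year (field, year);
}

/* TO: accepts a year, "maximum" (now also as "m"), or "only".  */
int parse_to (const char *field, int from_year, int *year)
{
  if (is_abbrev (field, "maximum"))
    {
      *year = INT_MAX;  /* Assumed stand-in for "no upper bound".  */
      return 0;
    }
  if (is_abbrev (field, "only"))
    {
      *year = from_year;
      return 0;
    }
  return parse_year (field, year);
}

/* Usage example for the two parsers.  */
void demo_years_from_to (void)
{
  int y;
  assert (parse_from ("min", &y) == 0 && y == 1900);
  assert (parse_from ("m", &y) != 0);
  assert (parse_to ("m", 1990, &y) == 0 && y == INT_MAX);
  assert (parse_to ("o", 1990, &y) == 0 && y == 1990);
}
```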

The vanguard format makes use of %z in Zone FORMAT fields, which caused
an exception to be thrown from ZoneInfo::set_abbrev because no % or /
characters were expected when a Zone doesn't use a named Rule. The
ZoneInfo::to(sys_info&) function now uses format_abbrev_str to replace
any %z with the current offset. Although format_abbrev_str also checks
for %s and STD/DST formats, those only make sense when a named Rule is
in effect, so won't occur when ZoneInfo::to(sys_info&) is used.

This change also implements a feature that has always been missing from
time_zone::_M_get_sys_info: finding the Rule that is active before the
specified time point, so that we can correctly handle %s in the FORMAT
for the first new sys_info that gets created. This requires implementing
a poorly documented feature of zic, to get the LETTERS field from a
later transition, as described at
https://mm.icann.org/pipermail/tz/2024-April/058891.html
In order for this to work we need to be able to distinguish an empty
letters field (as used by CE%sT where the variable part is either empty
or "S") from "the letters field is not known for this transition". The
tzdata file uses "-" for an empty letters field, which libstdc++ was
previously replacing with "" when the Rule was parsed. Instead, we now
preserve the "-" in the Rule object, so that "" can be used for the case
where we don't know the letters (and so need to decide it).

libstdc++-v3/ChangeLog:

* src/c++20/tzdb.cc (minmax_year, minmax_year2): Remove.
(years_from_to): New class replacing minmax_year and
minmax_year2.
(format_abbrev_str, select_std_or_dst_abbrev): Move earlier in
the file. Handle "-" for letters.
(ZoneInfo::to): Use format_abbrev_str to expand %z.
(ZoneInfo::set_abbrev): Remove exception. Change parameter from
reference to value.
(operator>>(istream&, Rule&)): Do not clear letters when it
contains "-".
(time_zone::_M_get_sys_info): Add missing logic to find the Rule
in effect before the time point.
* testsuite/std/time/tzdb/1.cc: Adjust for vanguard format using
"GMT" as the Zone name, not as a Link to "Etc/GMT".
* testsuite/std/time/time_zone/sys_info_abbrev.cc: New test.

8 weeks agotree-optimization/115629 - missed tail merging
Richard Biener [Tue, 25 Jun 2024 12:04:31 +0000 (14:04 +0200)]
tree-optimization/115629 - missed tail merging

The following fixes a missed tail-merging observed for the testcase
in PR115629.  The issue is that when deps_ok_for_redirect does not
find that both blocks would be valid prevailing blocks, it rejects
the merge.  The following instead makes sure to record the working
block as prevailing.  Also, stmt comparison fails for indirect
references and does not handle memory references thoroughly, failing
to unify array indices and indirected pointers.  The following
attempts to fix this.

PR tree-optimization/115629
* tree-ssa-tail-merge.cc (gimple_equal_p): Handle
memory references better.
(deps_ok_for_redirect): Handle the case where not both blocks
are considered valid prevailing blocks.

* gcc.dg/tree-ssa/tail-merge-1.c: New testcase.

8 weeks agoRISC-V: Update testcase comments to point to PSABI rather than Table A.6
Patrick O'Neill [Tue, 25 Jun 2024 21:14:18 +0000 (14:14 -0700)]
RISC-V: Update testcase comments to point to PSABI rather than Table A.6

Table A.6 was originally the source of truth for the recommended mappings.
Point to the PSABI doc since the memory model mappings have been moved there.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/amo/a-rvwmo-fence.c: Replace A.6 reference with PSABI.
* gcc.target/riscv/amo/a-rvwmo-load-acquire.c: Ditto.
* gcc.target/riscv/amo/a-rvwmo-load-relaxed.c: Ditto.
* gcc.target/riscv/amo/a-rvwmo-load-seq-cst.c: Ditto.
* gcc.target/riscv/amo/a-rvwmo-store-compat-seq-cst.c: Ditto.
* gcc.target/riscv/amo/a-rvwmo-store-relaxed.c: Ditto.
* gcc.target/riscv/amo/a-rvwmo-store-release.c: Ditto.
* gcc.target/riscv/amo/a-ztso-fence.c: Ditto.
* gcc.target/riscv/amo/a-ztso-load-acquire.c: Ditto.
* gcc.target/riscv/amo/a-ztso-load-relaxed.c: Ditto.
* gcc.target/riscv/amo/a-ztso-load-seq-cst.c: Ditto.
* gcc.target/riscv/amo/a-ztso-store-compat-seq-cst.c: Ditto.
* gcc.target/riscv/amo/a-ztso-store-relaxed.c: Ditto.
* gcc.target/riscv/amo/a-ztso-store-release.c: Ditto.
* gcc.target/riscv/amo/zaamo-rvwmo-amo-add-int.c: Ditto.
* gcc.target/riscv/amo/zaamo-ztso-amo-add-int.c: Ditto.
* gcc.target/riscv/amo/zalrsc-rvwmo-compare-exchange-int-acquire-release.c: Ditto.
* gcc.target/riscv/amo/zalrsc-rvwmo-compare-exchange-int-acquire.c: Ditto.
* gcc.target/riscv/amo/zalrsc-rvwmo-compare-exchange-int-consume.c: Ditto.
* gcc.target/riscv/amo/zalrsc-rvwmo-compare-exchange-int-relaxed.c: Ditto.
* gcc.target/riscv/amo/zalrsc-rvwmo-compare-exchange-int-release.c: Ditto.
* gcc.target/riscv/amo/zalrsc-rvwmo-compare-exchange-int-seq-cst-relaxed.c: Ditto.
* gcc.target/riscv/amo/zalrsc-rvwmo-compare-exchange-int-seq-cst.c: Ditto.
* gcc.target/riscv/amo/zalrsc-rvwmo-subword-amo-add-char-acq-rel.c: Ditto.
* gcc.target/riscv/amo/zalrsc-rvwmo-subword-amo-add-char-acquire.c: Ditto.
* gcc.target/riscv/amo/zalrsc-rvwmo-subword-amo-add-char-relaxed.c: Ditto.
* gcc.target/riscv/amo/zalrsc-rvwmo-subword-amo-add-char-release.c: Ditto.
* gcc.target/riscv/amo/zalrsc-rvwmo-subword-amo-add-char-seq-cst.c: Ditto.
* gcc.target/riscv/amo/zalrsc-ztso-compare-exchange-int-acquire-release.c: Ditto.
* gcc.target/riscv/amo/zalrsc-ztso-compare-exchange-int-acquire.c: Ditto.
* gcc.target/riscv/amo/zalrsc-ztso-compare-exchange-int-consume.c: Ditto.
* gcc.target/riscv/amo/zalrsc-ztso-compare-exchange-int-relaxed.c: Ditto.
* gcc.target/riscv/amo/zalrsc-ztso-compare-exchange-int-release.c: Ditto.
* gcc.target/riscv/amo/zalrsc-ztso-compare-exchange-int-seq-cst-relaxed.c: Ditto.
* gcc.target/riscv/amo/zalrsc-ztso-compare-exchange-int-seq-cst.c: Ditto.
* gcc.target/riscv/amo/zalrsc-ztso-subword-amo-add-char-acq-rel.c: Ditto.
* gcc.target/riscv/amo/zalrsc-ztso-subword-amo-add-char-acquire.c: Ditto.
* gcc.target/riscv/amo/zalrsc-ztso-subword-amo-add-char-relaxed.c: Ditto.
* gcc.target/riscv/amo/zalrsc-ztso-subword-amo-add-char-release.c: Ditto.
* gcc.target/riscv/amo/zalrsc-ztso-subword-amo-add-char-seq-cst.c: Ditto.

Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>
8 weeks agoRISC-V: Consolidate amo testcase variants
Patrick O'Neill [Tue, 25 Jun 2024 21:14:17 +0000 (14:14 -0700)]
RISC-V: Consolidate amo testcase variants

Many riscv/amo/ testcases use check-function-bodies. These testcases can be
consolidated with related testcases (memory ordering variants) without affecting
the assertions.

Give functions descriptive names so testsuite failures are obvious from the
'FAIL:' line.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/amo/amo-table-a-6-amo-add-1.c: Removed.
* gcc.target/riscv/amo/amo-table-a-6-amo-add-2.c: Removed.
* gcc.target/riscv/amo/amo-table-a-6-amo-add-3.c: Removed.
* gcc.target/riscv/amo/amo-table-a-6-amo-add-4.c: Removed.
* gcc.target/riscv/amo/amo-table-a-6-amo-add-5.c: Removed.
* gcc.target/riscv/amo/amo-table-a-6-fence-1.c: Removed.
* gcc.target/riscv/amo/amo-table-a-6-fence-2.c: Removed.
* gcc.target/riscv/amo/amo-table-a-6-fence-3.c: Removed.
* gcc.target/riscv/amo/amo-table-a-6-fence-4.c: Removed.
* gcc.target/riscv/amo/amo-table-a-6-fence-5.c: Removed.
* gcc.target/riscv/amo/amo-table-ztso-amo-add-1.c: Removed.
* gcc.target/riscv/amo/amo-table-ztso-amo-add-2.c: Removed.
* gcc.target/riscv/amo/amo-table-ztso-amo-add-3.c: Removed.
* gcc.target/riscv/amo/amo-table-ztso-amo-add-4.c: Removed.
* gcc.target/riscv/amo/amo-table-ztso-amo-add-5.c: Removed.
* gcc.target/riscv/amo/amo-table-ztso-fence-1.c: Removed.
* gcc.target/riscv/amo/amo-table-ztso-fence-2.c: Removed.
* gcc.target/riscv/amo/amo-table-ztso-fence-3.c: Removed.
* gcc.target/riscv/amo/amo-table-ztso-fence-4.c: Removed.
* gcc.target/riscv/amo/amo-table-ztso-fence-5.c: Removed.
* gcc.target/riscv/amo/amo-zalrsc-amo-add-1.c: Removed.
* gcc.target/riscv/amo/amo-zalrsc-amo-add-2.c: Removed.
* gcc.target/riscv/amo/amo-zalrsc-amo-add-3.c: Removed.
* gcc.target/riscv/amo/amo-zalrsc-amo-add-4.c: Removed.
* gcc.target/riscv/amo/amo-zalrsc-amo-add-5.c: Removed.
* gcc.target/riscv/amo/a-rvwmo-fence.c: New test.
* gcc.target/riscv/amo/a-ztso-fence.c: New test.
* gcc.target/riscv/amo/zaamo-rvwmo-amo-add-int.c: New test.
* gcc.target/riscv/amo/zaamo-ztso-amo-add-int.c: New test.
* gcc.target/riscv/amo/zalrsc-rvwmo-amo-add-int.c: New test.
* gcc.target/riscv/amo/zalrsc-ztso-amo-add-int.c: New test.

Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>
8 weeks agoRISC-V: Rename amo testcases
Patrick O'Neill [Tue, 25 Jun 2024 21:14:16 +0000 (14:14 -0700)]
RISC-V: Rename amo testcases

Rename riscv/amo/ testcases to follow a '{ext}-{model}-{name}-{memory order}.c'
naming convention.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/amo/amo-table-a-6-load-2.c: Move to...
* gcc.target/riscv/amo/a-rvwmo-load-acquire.c: ...here.
* gcc.target/riscv/amo/amo-table-a-6-load-1.c: Move to...
* gcc.target/riscv/amo/a-rvwmo-load-relaxed.c: ...here.
* gcc.target/riscv/amo/amo-table-a-6-load-3.c: Move to...
* gcc.target/riscv/amo/a-rvwmo-load-seq-cst.c: ...here.
* gcc.target/riscv/amo/amo-table-a-6-store-compat-3.c: Move to...
* gcc.target/riscv/amo/a-rvwmo-store-compat-seq-cst.c: ...here.
* gcc.target/riscv/amo/amo-table-a-6-store-1.c: Move to...
* gcc.target/riscv/amo/a-rvwmo-store-relaxed.c: ...here.
* gcc.target/riscv/amo/amo-table-a-6-store-2.c: Move to...
* gcc.target/riscv/amo/a-rvwmo-store-release.c: ...here.
* gcc.target/riscv/amo/amo-table-ztso-load-2.c: Move to...
* gcc.target/riscv/amo/a-ztso-load-acquire.c: ...here.
* gcc.target/riscv/amo/amo-table-ztso-load-1.c: Move to...
* gcc.target/riscv/amo/a-ztso-load-relaxed.c: ...here.
* gcc.target/riscv/amo/amo-table-ztso-load-3.c: Move to...
* gcc.target/riscv/amo/a-ztso-load-seq-cst.c: ...here.
* gcc.target/riscv/amo/amo-table-ztso-store-3.c: Move to...
* gcc.target/riscv/amo/a-ztso-store-compat-seq-cst.c: ...here.
* gcc.target/riscv/amo/amo-table-ztso-store-1.c: Move to...
* gcc.target/riscv/amo/a-ztso-store-relaxed.c: ...here.
* gcc.target/riscv/amo/amo-table-ztso-store-2.c: Move to...
* gcc.target/riscv/amo/a-ztso-store-release.c: ...here.
* gcc.target/riscv/amo/amo-zaamo-preferred-over-zalrsc.c: Move to...
* gcc.target/riscv/amo/zaamo-preferred-over-zalrsc.c: ...here.
* gcc.target/riscv/amo/amo-table-a-6-compare-exchange-6.c: Move to...
* gcc.target/riscv/amo/zalrsc-rvwmo-compare-exchange-int-acquire-release.c: ...here.
* gcc.target/riscv/amo/amo-table-a-6-compare-exchange-3.c: Move to...
* gcc.target/riscv/amo/zalrsc-rvwmo-compare-exchange-int-acquire.c: ...here.
* gcc.target/riscv/amo/amo-table-a-6-compare-exchange-2.c: Move to...
* gcc.target/riscv/amo/zalrsc-rvwmo-compare-exchange-int-consume.c: ...here.
* gcc.target/riscv/amo/amo-table-a-6-compare-exchange-1.c: Move to...
* gcc.target/riscv/amo/zalrsc-rvwmo-compare-exchange-int-relaxed.c: ...here.
* gcc.target/riscv/amo/amo-table-a-6-compare-exchange-4.c: Move to...
* gcc.target/riscv/amo/zalrsc-rvwmo-compare-exchange-int-release.c: ...here.
* gcc.target/riscv/amo/amo-table-a-6-compare-exchange-7.c: Move to...
* gcc.target/riscv/amo/zalrsc-rvwmo-compare-exchange-int-seq-cst-relaxed.c: ...here.
* gcc.target/riscv/amo/amo-table-a-6-compare-exchange-5.c: Move to...
* gcc.target/riscv/amo/zalrsc-rvwmo-compare-exchange-int-seq-cst.c: ...here.
* gcc.target/riscv/amo/amo-table-a-6-subword-amo-add-4.c: Move to...
* gcc.target/riscv/amo/zalrsc-rvwmo-subword-amo-add-char-acq-rel.c: ...here.
* gcc.target/riscv/amo/amo-table-a-6-subword-amo-add-2.c: Move to...
* gcc.target/riscv/amo/zalrsc-rvwmo-subword-amo-add-char-acquire.c: ...here.
* gcc.target/riscv/amo/amo-table-a-6-subword-amo-add-1.c: Move to...
* gcc.target/riscv/amo/zalrsc-rvwmo-subword-amo-add-char-relaxed.c: ...here.
* gcc.target/riscv/amo/amo-table-a-6-subword-amo-add-3.c: Move to...
* gcc.target/riscv/amo/zalrsc-rvwmo-subword-amo-add-char-release.c: ...here.
* gcc.target/riscv/amo/amo-table-a-6-subword-amo-add-5.c: Move to...
* gcc.target/riscv/amo/zalrsc-rvwmo-subword-amo-add-char-seq-cst.c: ...here.
* gcc.target/riscv/amo/amo-table-ztso-compare-exchange-6.c: Move to...
* gcc.target/riscv/amo/zalrsc-ztso-compare-exchange-int-acquire-release.c: ...here.
* gcc.target/riscv/amo/amo-table-ztso-compare-exchange-3.c: Move to...
* gcc.target/riscv/amo/zalrsc-ztso-compare-exchange-int-acquire.c: ...here.
* gcc.target/riscv/amo/amo-table-ztso-compare-exchange-2.c: Move to...
* gcc.target/riscv/amo/zalrsc-ztso-compare-exchange-int-consume.c: ...here.
* gcc.target/riscv/amo/amo-table-ztso-compare-exchange-1.c: Move to...
* gcc.target/riscv/amo/zalrsc-ztso-compare-exchange-int-relaxed.c: ...here.
* gcc.target/riscv/amo/amo-table-ztso-compare-exchange-4.c: Move to...
* gcc.target/riscv/amo/zalrsc-ztso-compare-exchange-int-release.c: ...here.
* gcc.target/riscv/amo/amo-table-ztso-compare-exchange-7.c: Move to...
* gcc.target/riscv/amo/zalrsc-ztso-compare-exchange-int-seq-cst-relaxed.c: ...here.
* gcc.target/riscv/amo/amo-table-ztso-compare-exchange-5.c: Move to...
* gcc.target/riscv/amo/zalrsc-ztso-compare-exchange-int-seq-cst.c: ...here.
* gcc.target/riscv/amo/amo-table-ztso-subword-amo-add-4.c: Move to...
* gcc.target/riscv/amo/zalrsc-ztso-subword-amo-add-char-acq-rel.c: ...here.
* gcc.target/riscv/amo/amo-table-ztso-subword-amo-add-2.c: Move to...
* gcc.target/riscv/amo/zalrsc-ztso-subword-amo-add-char-acquire.c: ...here.
* gcc.target/riscv/amo/amo-table-ztso-subword-amo-add-1.c: Move to...
* gcc.target/riscv/amo/zalrsc-ztso-subword-amo-add-char-relaxed.c: ...here.
* gcc.target/riscv/amo/amo-table-ztso-subword-amo-add-3.c: Move to...
* gcc.target/riscv/amo/zalrsc-ztso-subword-amo-add-char-release.c: ...here.
* gcc.target/riscv/amo/amo-table-ztso-subword-amo-add-5.c: Move to...
* gcc.target/riscv/amo/zalrsc-ztso-subword-amo-add-char-seq-cst.c: ...here.

Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>
8 weeks agors6000, change altivec*-runnable.c test file names
Carl Love [Fri, 21 Jun 2024 15:56:36 +0000 (11:56 -0400)]
rs6000, change altivec*-runnable.c test file names

Changed the names of the test files.

gcc/testsuite/ChangeLog:
* gcc.target/powerpc/altivec-1-runnable.c: Change the name to
altivec-38.c.
* gcc.target/powerpc/altivec-2-runnable.c: Change the name to
p8vector-builtin-9.c.

8 weeks agors6000, altivec-2-runnable.c update the require-effective-target
Carl Love [Fri, 14 Jun 2024 16:46:00 +0000 (12:46 -0400)]
rs6000, altivec-2-runnable.c update the require-effective-target

The test requires a minimum of Power8 vector HW and a compile level
of -O2.  Update the dg test directives.

gcc/testsuite/ChangeLog:
* gcc.target/powerpc/altivec-2-runnable.c: Change the
require-effective-target for the test.

8 weeks agors6000, altivec-1-runnable.c update the require-effective-target
Carl Love [Mon, 24 Jun 2024 16:31:19 +0000 (12:31 -0400)]
rs6000, altivec-1-runnable.c update the require-effective-target

Update the dg test directives.

gcc/testsuite/ChangeLog:
* gcc.target/powerpc/altivec-1-runnable.c: Change the
require-effective-target for the test.

8 weeks ago[committed] Remove compromised sh test
Jeff Law [Wed, 26 Jun 2024 13:20:29 +0000 (07:20 -0600)]
[committed] Remove compromised sh test

Surya's recent patch to IRA improves the code for sh/pr54602-1.c slightly.
Specifically it's able to eliminate a save/restore in the prologue/epilogue and
a bit of register shuffling.

As a result there literally aren't any insns that can be used to fill the delay
slot of the return, so a nop gets emitted and the test fails.

Given there literally aren't any insns to move into the delay slot, the best
course of action is to just drop the test.

gcc/testsuite
* gcc.target/sh/pr54602-1.c: Delete test.

8 weeks ago[committed][RISC-V] Fix expected output for thead store pair test
Jeff Law [Wed, 26 Jun 2024 12:59:26 +0000 (06:59 -0600)]
[committed][RISC-V] Fix expected output for thead store pair test

Surya's patch to IRA has improved the code we generate for one of the thead
store pair tests for both rv32 and rv64.  This patch adjusts the expectations
of that test.

I've verified that the test now passes on rv32 and rv64 in my tester.  Pushing
to the trunk.

gcc/testsuite
* gcc.target/riscv/xtheadmempair-3.c: Update expected output.

8 weeks agotree-optimization/115652 - adjust insertion gsi for SLP
Richard Biener [Wed, 26 Jun 2024 07:25:27 +0000 (09:25 +0200)]
tree-optimization/115652 - adjust insertion gsi for SLP

The following adjusts how SLP computes the insertion location.  In
particular it advanced the insert iterator of the found last_stmt.
The vectorizer will later insert stmts _before_ it.  But we also
have the constraint that possibly masked ops may not be scheduled
outside of the loop and as we do not model the loop mask in the
SLP graph we have to adjust for that.  The following moves this
to after the advance since it isn't compatible with that as the
current GIMPLE_COND exception shows.  The PR is about in-order
reduction vectorization which also isn't happy when that's the
very first stmt.

PR tree-optimization/115652
* tree-vect-slp.cc (vect_schedule_slp_node): Advance the
iterator based on last_stmt only for vector defs.

8 weeks agoRecord edge true/false value for gcov
Jørgen Kvalsvik [Tue, 4 Jun 2024 12:16:22 +0000 (14:16 +0200)]
Record edge true/false value for gcov

Make gcov aware of which edges are the true/false edges so it can
more accurately reconstruct the CFG.  There are plenty of bits left
in arc_info, and this opens up for richer reporting.
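
A rough sketch of the idea: the writer sets one flag bit per edge
kind, and the reader turns those bits into two one-bit fields.  The
SKETCH_* names, their bit positions and struct arc_sketch are
hypothetical and are not copied from gcov-io.h or gcov.cc.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag bits, for illustration only.  */
enum { SKETCH_ARC_TRUE = 1 << 3, SKETCH_ARC_FALSE = 1 << 4 };

/* Hypothetical reader-side record with one bit per edge kind.  */
struct arc_sketch
{
  unsigned int true_value : 1;
  unsigned int false_value : 1;
};

/* Writer side: fold the edge kind into the arc flags word.  */
unsigned encode_arc_flags (bool on_true_edge, bool on_false_edge)
{
  unsigned flags = 0;
  if (on_true_edge)
    flags |= SKETCH_ARC_TRUE;
  if (on_false_edge)
    flags |= SKETCH_ARC_FALSE;
  return flags;
}

/* Reader side: recover the two bits from the flags word.  */
struct arc_sketch decode_arc_flags (unsigned flags)
{
  struct arc_sketch arc;
  arc.true_value = (flags & SKETCH_ARC_TRUE) != 0;
  arc.false_value = (flags & SKETCH_ARC_FALSE) != 0;
  return arc;
}

/* Usage: a true edge round-trips through the flags word.  */
void demo_arc_flags (void)
{
  struct arc_sketch arc = decode_arc_flags (encode_arc_flags (true, false));
  assert (arc.true_value == 1 && arc.false_value == 0);
}
```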

gcc/ChangeLog:

* gcov-io.h (GCOV_ARC_TRUE): New.
(GCOV_ARC_FALSE): New.
* gcov.cc (struct arc_info): Add true_value, false_value.
(read_graph_file): Read true_value, false_value.
* profile.cc (branch_prob): Write GCOV_ARC_TRUE, GCOV_ARC_FALSE.

8 weeks agoUse the term MC/DC in help for gcov --conditions
Jørgen Kvalsvik [Tue, 25 Jun 2024 06:41:45 +0000 (08:41 +0200)]
Use the term MC/DC in help for gcov --conditions

Without key terms like "masking" and "MC/DC" it is not at all obvious
what --conditions actually reports on, and there is no easy path for the
user to figure out. By at least including the two key terms MC/DC and
masking users have something to search for.

gcc/ChangeLog:

* gcov.cc (print_usage): Reference masking MC/DC.

8 weeks agoAdd section on MC/DC in gcov manual
Jørgen Kvalsvik [Fri, 21 Jun 2024 18:28:01 +0000 (20:28 +0200)]
Add section on MC/DC in gcov manual

gcc/ChangeLog:

* doc/gcov.texi: Add MC/DC section.

8 weeks agoUse auto_vec for memory release on return
Jørgen Kvalsvik [Mon, 24 Jun 2024 19:55:46 +0000 (21:55 +0200)]
Use auto_vec for memory release on return

Using auto_vec ensures this memory is cleaned up on function exit.

gcc/ChangeLog:

* tree-profile.cc (find_conditions): Use auto_vec.

8 weeks agoarm: make arm_predict_doloop_p reject loops with calls
Andre Vieira [Wed, 26 Jun 2024 10:07:01 +0000 (11:07 +0100)]
arm: make arm_predict_doloop_p reject loops with calls

With the introduction of low overhead loops we defined arm_predict_doloop_p.
This is meant to be a lightweight check to rule out loops we are not considering
for doloop optimization, and it is used by other passes to prevent optimizations
that may hurt the doloop optimization later on. The check has to be lightweight
because it is used by pre-RTL optimizations, meaning we can't do the same checks
that doloop does.

After the definition of arm_predict_doloop_p, when testing for armv8.1-m.main,
tree-ssa/ivopts-3.c failed the scan-dump check as the dump now matched an extra
'!= 0' introduced by:
Doloop cmp iv use: if (ivtmp_1 != 0)
Predict loop 1 can perform doloop optimization later.

where previously we had:
Predict doloop failure due to target specific checks.

and after this patch:
Predict doloop failure due to call in loop.
Predict doloop failure due to target specific checks.

Added a copy of the original tree-ssa/ivopts-3.c as a target-specific test to
check for the new dump message.

gcc/ChangeLog:

* config/arm/arm.cc (arm_predict_doloop_p): Reject loops with function
calls that are not builtins.

gcc/testsuite/ChangeLog:

* gcc.target/arm/mve/ivopts-3.c: New test.

8 weeks ago[aarch64] Add support for -mcpu=grace
Kyrylo Tkachov [Wed, 26 Jun 2024 07:42:11 +0000 (09:42 +0200)]
[aarch64] Add support for -mcpu=grace

This adds support for the NVIDIA Grace CPU to aarch64.
We reuse the tuning decisions for the Neoverse V2 core, but include a
number of architecture features that are not enabled by default in
-mcpu=neoverse-v2.

This allows Grace users to more simply target the CPU with -mcpu=grace
rather than remembering what extensions to tag on top of
-mcpu=neoverse-v2.

Bootstrapped and tested on aarch64-none-linux-gnu.
gcc/

* config/aarch64/aarch64-cores.def (grace): New entry.
* config/aarch64/aarch64-tune.md: Regenerate.
* doc/invoke.texi (AArch64 Options): Document the above.

Signed-off-by: Kyrylo Tkachov <ktkachov@nvidia.com>
8 weeks agoi386: Remove declaration of unused functions
Evgeny Karpov [Tue, 25 Jun 2024 21:59:35 +0000 (21:59 +0000)]
i386: Remove declaration of unused functions

The patch fixes the issue introduced in
https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=63512c72df09b43d56ac7680cdfd57a66d40c636
and reported at
https://gcc.gnu.org/pipermail/gcc-patches/2024-June/655599.html .

Regards,
Evgeny

The patch fixes the issue with compilation on x86_64-gnu-linux
when warnings for unused functions are treated as errors.

gcc/ChangeLog:

* config/i386/i386.cc (legitimize_dllimport_symbol): Remove unused
functions.
(legitimize_pe_coff_extern_decl): Likewise.

8 weeks agors6000: Fix wrong RTL patterns for vector merge high/low short on LE
Kewen Lin [Wed, 26 Jun 2024 07:16:17 +0000 (02:16 -0500)]
rs6000: Fix wrong RTL patterns for vector merge high/low short on LE

Commit r12-4496 changed some define_expands and define_insns
for vector merge high/low short, namely altivec_vmrg[hl]h.
These defines mainly serve the built-in functions vec_merge{h,l}
and some internal gen function needs.  These functions should
take endianness into account.  Taking vec_mergeh as an example,
PVIPR defines that vec_mergeh "Merges the first halves (in element
order) of two vectors"; note that it says element order.  So it
maps to vmrghh on BE and to vmrglh on LE.
Although the mapped insns differ, as discussed in PR106069 the
RTL pattern should still be the same.  That held before commit
r12-4496, but since that commit the patterns differ between BE
and LE.  Similar to the 32-bit element case described in the
commit log of r15-1504, this 16-bit element pattern on LE doesn't
actually match what the underlying insn is intended to represent,
so once an optimization such as combine makes changes based on
it, unexpected consequences follow.  The newly constructed test
case pr106069-2.c is a typical example of this issue for element
type short.
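
To make the element-order wording concrete, here is a plain-C scalar
model of what vec_mergeh and vec_mergel compute for eight 16-bit
elements.  It is a sketch of the built-in semantics only, not of the
RTL patterns or the machine insns; the function names are made up.

```c
#include <assert.h>
#include <stddef.h>

enum { NELTS = 8 };  /* Eight 16-bit elements per 128-bit vector.  */

/* Model of vec_mergeh: interleave the first halves of A and B,
   in element order.  */
void merge_high_short (const short a[NELTS], const short b[NELTS],
                       short out[NELTS])
{
  for (size_t i = 0; i < NELTS / 2; i++)
    {
      out[2 * i] = a[i];
      out[2 * i + 1] = b[i];
    }
}

/* Model of vec_mergel: interleave the second halves of A and B.  */
void merge_low_short (const short a[NELTS], const short b[NELTS],
                      short out[NELTS])
{
  for (size_t i = 0; i < NELTS / 2; i++)
    {
      out[2 * i] = a[NELTS / 2 + i];
      out[2 * i + 1] = b[NELTS / 2 + i];
    }
}

/* Usage: the result starts a0, b0, a1, b1 regardless of endianness.  */
void demo_merge_high (void)
{
  const short a[NELTS] = { 0, 1, 2, 3, 4, 5, 6, 7 };
  const short b[NELTS] = { 10, 11, 12, 13, 14, 15, 16, 17 };
  short out[NELTS];
  merge_high_short (a, b, out);
  assert (out[0] == 0 && out[1] == 10 && out[2] == 1 && out[3] == 11);
}
```

The point of the fix is that this element-order result is the same on
BE and LE, even though a different insn implements it on each.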

So this patch fixes the wrong RTL pattern and ensures the
associated RTL patterns become the same as before, with the
same semantics as their mapped insns.  With the proposed
patch, expanders like altivec_vmrghh expand into
altivec_vmrghh_direct_be or altivec_vmrglh_direct_le
depending on endianness; "direct" shows which insn would be
generated, and _be and _le distinguish the different RTL
patterns per endianness.

Co-authored-by: Xionghu Luo <xionghuluo@tencent.com>
PR target/106069
PR target/115355

gcc/ChangeLog:

* config/rs6000/altivec.md (altivec_vmrghh_direct): Rename to ...
(altivec_vmrghh_direct_be): ... this.  Add condition BYTES_BIG_ENDIAN.
(altivec_vmrghh_direct_le): New define_insn.
(altivec_vmrglh_direct): Rename to ...
(altivec_vmrglh_direct_be): ... this.  Add condition BYTES_BIG_ENDIAN.
(altivec_vmrglh_direct_le): New define_insn.
(altivec_vmrghh): Adjust by calling gen_altivec_vmrghh_direct_be
for BE and gen_altivec_vmrglh_direct_le for LE.
(altivec_vmrglh): Adjust by calling gen_altivec_vmrglh_direct_be
for BE and gen_altivec_vmrghh_direct_le for LE.
(vec_widen_umult_hi_v16qi): Adjust the call to
gen_altivec_vmrghh_direct by gen_altivec_vmrghh for BE
and by gen_altivec_vmrglh for LE.
(vec_widen_smult_hi_v16qi): Likewise.
(vec_widen_umult_lo_v16qi): Adjust the call to
gen_altivec_vmrglh_direct by gen_altivec_vmrglh for BE
and by gen_altivec_vmrghh for LE.
(vec_widen_smult_lo_v16qi): Likewise.
* config/rs6000/rs6000.cc (altivec_expand_vec_perm_const): Replace
CODE_FOR_altivec_vmrghh_direct by
CODE_FOR_altivec_vmrghh_direct_be for BE and
CODE_FOR_altivec_vmrghh_direct_le for LE.  And replace
CODE_FOR_altivec_vmrglh_direct by
CODE_FOR_altivec_vmrglh_direct_be for BE and
CODE_FOR_altivec_vmrglh_direct_le for LE.

gcc/testsuite/ChangeLog:

* gcc.target/powerpc/pr106069-2.c: New test.
