This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PR 80293] Don't totally-sRA char arrays
Hi,
On Wed, Apr 12, 2017 at 01:55:01PM +0200, Richard Biener wrote:
> On Wed, 12 Apr 2017, Martin Jambor wrote:
>
> > Hi,
> >
> > the patch below is an attempt to deal with PR 80293 as non-invasively
> > as possible. Basically, it switches off total SRA scalarization of
> > any local aggregates which contains an array of elements that have one
> > byte (or less).
> >
> > The logic behind this is that accessing such arrays element-wise
> > usually results in poor code and that such char arrays are often used
> > for non-statically-typed content anyway, and we do not want to copy
> > that byte per byte.
> >
> > Alan, do you think this could impact your constant pool scalarization
> > too severely?
>
> Hmm, isn't one of the issues that we have
>
> if (VAR_P (var) && scalarizable_type_p (TREE_TYPE (var)))
> {
> if (tree_to_uhwi (TYPE_SIZE (TREE_TYPE (var)))
> <= max_scalarization_size)
> {
> create_total_scalarization_access (var);
>
> which limits the size of scalarizable vars but not the number
> of accesses we create for total scalarization?
Well, yes, but at least I understood from your comment #4 in the bug
that you did not want to limit the number of replacements for gcc 7?
>
> Is scalarizable_type_p only used in contexts where we have no hint
> of the actual accesses?
There are should_scalarize_away_bitmap and
cannot_scalarize_away_bitmap hints.
Total scalarization is a hack process where we chop up small
aggregates according to their types - as opposed to actual accesses,
meaning it is an alien process to the rest of SRA - knowing that they
will go completely away afterwards (that is ensured by
cannot_scalarize_away_bitmap) and that this will facilitate copy
propagation (this is indicated by should_scalarize_away_bitmap, which
has a bit set if there is a non-scalar assignment _from_ (a part of)
the aggregate).
> That is, for the constant pool case we
> usually have
>
> x = .LC0;
> .. = x[2];
>
> so we have a "hint" that accesses on x are those we'd want to
> optimize to accesses to .LC0.
You don't need total scalarization for this, just the existence of
read from x[2] would be sufficient but thanks for pointing me to the
right direction. For constant pool decl scalarization, it is not
important to introduce artificial accesses for x but for .LC0.
Therefore, I have adapted the patch to allow char arrays for const
decls only and verified that it scalarizes a const-pool array of chars
on Aarch64. The (otherwise yet untested) result is below.
What do you think?
Martin
2017-04-13 Martin Jambor <mjambor@suse.cz>
* tree-sra.c (scalarizable_type_p): New parameter const_decl, make
char arrays not totally scalarizable if it is false.
(analyze_all_variable_accesses): Pass correct value in the new
parameter.
testsuite/
* g++.dg/tree-ssa/pr80293.C: New test.
---
gcc/testsuite/g++.dg/tree-ssa/pr80293.C | 45 +++++++++++++++++++++++++++++++++
gcc/tree-sra.c | 21 ++++++++++-----
2 files changed, 60 insertions(+), 6 deletions(-)
create mode 100644 gcc/testsuite/g++.dg/tree-ssa/pr80293.C
diff --git a/gcc/testsuite/g++.dg/tree-ssa/pr80293.C b/gcc/testsuite/g++.dg/tree-ssa/pr80293.C
new file mode 100644
index 00000000000..7faf35ae983
--- /dev/null
+++ b/gcc/testsuite/g++.dg/tree-ssa/pr80293.C
@@ -0,0 +1,45 @@
+// { dg-do compile }
+// { dg-options "-O2 -std=gnu++11 -fdump-tree-optimized" } */
+
+#include <array>
+
+// Return a copy of the underlying memory of an arbitrary value.
+template <
+ typename T,
+ typename = typename std::enable_if<std::is_trivially_copyable<T>::value>::type
+>
+auto getMem(
+ T const & value
+) -> std::array<char, sizeof(T)> {
+ auto ret = std::array<char, sizeof(T)>{};
+ __builtin_memcpy(ret.data(), &value, sizeof(T));
+ return ret;
+}
+
+template <
+ typename T,
+ typename = typename std::enable_if<std::is_trivially_copyable<T>::value>::type
+>
+auto fromMem(
+ std::array<char, sizeof(T)> const & buf
+) -> T {
+ auto ret = T{};
+ __builtin_memcpy(&ret, buf.data(), sizeof(T));
+ return ret;
+}
+
+double foo1(std::uint64_t arg) {
+ return fromMem<double>(getMem(arg));
+}
+
+double foo2(std::uint64_t arg) {
+ return *reinterpret_cast<double*>(&arg);
+}
+
+double foo3(std::uint64_t arg) {
+ double ret;
+ __builtin_memcpy(&ret, &arg, sizeof(arg));
+ return ret;
+}
+
+// { dg-final { scan-tree-dump-not "BIT_FIELD_REF" "optimized" } }
diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c
index 02453d3ed9a..ab06be66131 100644
--- a/gcc/tree-sra.c
+++ b/gcc/tree-sra.c
@@ -949,10 +949,12 @@ create_access (tree expr, gimple *stmt, bool write)
/* Return true iff TYPE is scalarizable - i.e. a RECORD_TYPE or fixed-length
ARRAY_TYPE with fields that are either of gimple register types (excluding
- bit-fields) or (recursively) scalarizable types. */
+ bit-fields) or (recursively) scalarizable types. CONST_DECL must be true if
+ we are considering a decl from constant pool. If it is false, char arrays
+ will be refused. */
static bool
-scalarizable_type_p (tree type)
+scalarizable_type_p (tree type, bool const_decl)
{
gcc_assert (!is_gimple_reg_type (type));
if (type_contains_placeholder_p (type))
@@ -970,7 +972,7 @@ scalarizable_type_p (tree type)
return false;
if (!is_gimple_reg_type (ft)
- && !scalarizable_type_p (ft))
+ && !scalarizable_type_p (ft, const_decl))
return false;
}
@@ -978,10 +980,16 @@ scalarizable_type_p (tree type)
case ARRAY_TYPE:
{
+ HOST_WIDE_INT min_elem_size;
+ if (const_decl)
+ min_elem_size = 0;
+ else
+ min_elem_size = BITS_PER_UNIT;
+
if (TYPE_DOMAIN (type) == NULL_TREE
|| !tree_fits_shwi_p (TYPE_SIZE (type))
|| !tree_fits_shwi_p (TYPE_SIZE (TREE_TYPE (type)))
- || (tree_to_shwi (TYPE_SIZE (TREE_TYPE (type))) <= 0)
+ || (tree_to_shwi (TYPE_SIZE (TREE_TYPE (type))) <= min_elem_size)
|| !tree_fits_shwi_p (TYPE_MIN_VALUE (TYPE_DOMAIN (type))))
return false;
if (tree_to_shwi (TYPE_SIZE (type)) == 0
@@ -995,7 +1003,7 @@ scalarizable_type_p (tree type)
tree elem = TREE_TYPE (type);
if (!is_gimple_reg_type (elem)
- && !scalarizable_type_p (elem))
+ && !scalarizable_type_p (elem, const_decl))
return false;
return true;
}
@@ -2653,7 +2661,8 @@ analyze_all_variable_accesses (void)
{
tree var = candidate (i);
- if (VAR_P (var) && scalarizable_type_p (TREE_TYPE (var)))
+ if (VAR_P (var) && scalarizable_type_p (TREE_TYPE (var),
+ constant_decl_p (var)))
{
if (tree_to_uhwi (TYPE_SIZE (TREE_TYPE (var)))
<= max_scalarization_size)
--
2.12.0