[PATCH] fold strlen() of aggregate members (PR 77357)
Richard Biener
richard.guenther@gmail.com
Fri Jul 6 15:52:00 GMT 2018
On Fri, Jul 6, 2018 at 1:54 AM Martin Sebor <msebor@gmail.com> wrote:
>
> GCC folds accesses to members of constant aggregates except
> for character arrays/strings. For example, the strlen() call
> below is not folded:
>
> const char a[][4] = { "1", "12" };
>
> int f (void) { return strlen (a[1]); }
>
> The attached change set enhances the string_constant() function
> to make it possible to extract string constants from aggregate
> initializers (CONSTRUCTORS).
>
> The initial solution was much simpler but as is often the case,
> MEM_REF made it fail to fold things like:
>
> int f (void) { return strlen (a[1] + 1); }
>
> Handling those made the project a bit more interesting and
> the final solution somewhat more involved.
>
> To handle offsets into aggregate string members the patch also
> extends the fold_ctor_reference() function to extract entire
> string array initializers even if the offset points past
> the beginning of the string and even though the size and
> exact type of the reference are not known (there isn't enough
> information in a MEM_REF to determine that).
>
> Tested along with the patch for PR 86415 on x86_64-linux.
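
For reference, the two calls quoted above can be put in one self-contained C file. This is only a sketch: the names f1 and f2 are made up here so both variants can coexist, and nothing in it depends on the patch.

```c
#include <string.h>

/* Each row is a 4-byte array; "1" and "12" are NUL-padded to fit.  */
const char a[][4] = { "1", "12" };

/* Reads the string in the second row: "12".  */
int f1 (void) { return strlen (a[1]); }

/* Same row, one byte past its start: "2".  */
int f2 (void) { return strlen (a[1] + 1); }
```

Both results are compile-time constants (2 and 1), so these are the values a successful fold would be expected to produce.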

+  if (TREE_CODE (init) == CONSTRUCTOR)
+    {
+      tree type;
+      if (TREE_CODE (arg) == ARRAY_REF
+          || TREE_CODE (arg) == MEM_REF)
+        type = TREE_TYPE (arg);
+      else if (TREE_CODE (arg) == COMPONENT_REF)
+        {
+          tree field = TREE_OPERAND (arg, 1);
+          type = TREE_TYPE (field);
+        }
+      else
+        return NULL_TREE;

what's wrong with just

    type = TREE_TYPE (field);

?

+  base_off *= BITS_PER_UNIT;

poly_uint64 isn't enough for "bits". With wide-int you'd use offset_int;
for poly you'd then use poly_offset?
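
The concern about scaling in a 64-bit type can be sketched in plain C. This is not GCC internals: CHAR_BIT stands in for BITS_PER_UNIT, both helper names are hypothetical, and the numbers assume 8-bit bytes.

```c
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/* Scale a byte offset to bits in 64-bit unsigned arithmetic.
   Unsigned overflow wraps, so the top bits are silently lost.  */
uint64_t
bytes_to_bits_wrapping (uint64_t byte_off)
{
  return byte_off * CHAR_BIT;
}

/* True if BYTE_OFF scaled to bits is still representable in 64 bits.  */
bool
bytes_to_bits_fits (uint64_t byte_off)
{
  return byte_off <= UINT64_MAX / CHAR_BIT;
}
```

Any byte offset with one of its top three bits set no longer fits once multiplied by 8, which is the headroom a wider offset type is meant to provide.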

You extend fold_ctor_reference to treat size == 0 specially, but then
bother to compute a size here. That looks unneeded?

While the offset of the reference determines the first field in the
CONSTRUCTOR, how do you know the access doesn't touch
adjacent ones? STRING_CSTs do not have to be '\0' terminated,
so consider

    char x[2][4] = { "abcd", "abcd" };

and MEM[&x] with a char[8] type? memcpy "inlining" will create
such MEMs, for example.
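
At the source level the situation looks roughly like this. It is a sketch only: the helper names are hypothetical, and the 8-byte memcpy merely approximates the kind of access such a MEM represents.

```c
#include <stdbool.h>
#include <string.h>

/* Valid C (not C++): each initializer fills its row exactly, so no
   terminating NUL is stored.  */
char x[2][4] = { "abcd", "abcd" };

/* True if row I contains a NUL byte anywhere within its 4 bytes.  */
bool
row_has_nul (int i)
{
  return memchr (x[i], '\0', sizeof x[i]) != NULL;
}

/* An 8-byte access starting at &x, similar in spirit to what an
   inlined memcpy would read: it covers both rows.  */
void
copy_whole_array (char out[sizeof x])
{
  memcpy (out, x, sizeof x);
}
```

Neither row contains a terminator, and the 8-byte copy yields the bytes of both initializers back to back, so looking only at the first element of the CONSTRUCTOR would miss half of the access.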

@@ -6554,8 +6577,16 @@ fold_nonarray_ctor_reference (tree type, tree ctor,
       tree byte_offset = DECL_FIELD_OFFSET (cfield);
       tree field_offset = DECL_FIELD_BIT_OFFSET (cfield);
       tree field_size = DECL_SIZE (cfield);
-      offset_int bitoffset;
-      offset_int bitoffset_end, access_end;
+
+      if (!field_size && TREE_CODE (cval) == STRING_CST)
+        {
+          /* Determine the size of the flexible array member from
+             the size of the string initializer provided for it.  */
+          unsigned HOST_WIDE_INT len = TREE_STRING_LENGTH (cval);
+          tree eltype = TREE_TYPE (TREE_TYPE (cval));
+          len *= tree_to_uhwi (TYPE_SIZE (eltype));
+          field_size = build_int_cst (size_type_node, len);
+        }

Why does this only apply to STRING_CST initializers and not CONSTRUCTORs,
say, for

    struct S { int i; int a[]; } s = { 1, { 2, 3, 4, 5, 6 } };

? And why not simply use

    field_size = TYPE_SIZE (TREE_TYPE (cval));

like you do in c_strlen?
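
To illustrate why the size of such a member has to come from the initializer, here is a small C sketch. It relies on the GNU extension that permits initializing a flexible array member of a static object, and the two helper names are hypothetical.

```c
#include <stddef.h>

/* Initializing a flexible array member of a static object is a GNU
   extension; ISO C does not allow it.  */
struct S { int i; int a[]; };
struct S s = { 1, { 2, 3, 4, 5, 6 } };

/* Size of the object according to its type alone; the five
   initializer elements are not counted.  */
size_t
size_from_type (void)
{
  return sizeof s;
}

/* Last element supplied by the initializer.  */
int
last_element (void)
{
  return s.a[4];
}
```

The type gives the array no size, yet the five elements are present in the object, so the extent has to be derived from the initializer whether it is a string or a brace-enclosed list.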

Otherwise looks reasonable.

Thanks,
Richard.

> Martin