Bug 31150 - [4.2/4.3/4.4 Regression] Not promoting an whole array to be static const
Summary: [4.2/4.3/4.4 Regression] Not promoting an whole array to be static const
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: middle-end (show other bugs)
Version: 4.3.0
: P2 normal
Target Milestone: 4.4.0
Assignee: Not yet assigned to anyone
URL:
Keywords: missed-optimization
Depends on: 21478
Blocks:
  Show dependency treegraph
 
Reported: 2007-03-12 16:46 UTC by Andrew Pinski
Modified: 2008-12-22 23:45 UTC (History)
4 users (show)

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2007-08-06 15:19:28


Attachments
fix PR31150 (718 bytes, patch)
2007-06-26 07:42 UTC, revital eres
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Andrew Pinski 2007-03-12 16:46:38 UTC
Take:
int main()
{
  char str[2][34] = {"a","b"};

  __builtin_puts(str[0]);

  return 0;
}

In 4.0.2, we used to get:
  static char C.0[2][34] = {"a", "b"};
      char str[2][34];

      str = C.0;
      __builtin_puts (&str[0][0]);


But in 4.1.1 and above, we get:
  char str[2][34];

  str = {};
  str[0] = "a";
  str[1] = "b";

Which causes us to first zero out the array and then do load/stores on the stack.
Comment 1 Andrew Pinski 2007-04-16 00:56:22 UTC
This was introduced by the fix for PR 21478.
Comment 2 revital eres 2007-06-25 12:31:52 UTC
I would like to be assigned to this bug.

Thanks,
Revital
Comment 3 revital eres 2007-06-26 07:42:44 UTC
Created attachment 13791 [details]
fix PR31150

Attached is a patch to initialize the scalar elmenets of the array
which should fix this problem. 

char str[2][34] = {"a","b"};
==>
str = {};
str[0][0] = 97;
str[1][0] = 98;

I appreciate any comments.

Thanks,
Revital
Comment 4 patchapp@dberlin.org 2007-06-28 05:50:19 UTC
Subject: Bug number PR31150

A patch for this bug has been added to the patch tracker.
The mailing list url for the patch is http://gcc.gnu.org/ml/gcc-patches/2007-06/msg02037.html
Comment 5 revital eres 2007-06-28 11:55:05 UTC
(Form off-line discussion with Richard Guenther)

For- 

char str[2][16] = {"thisis16charslo","thisis16charslo"};

On ppc64 we will get -

static char C.0[2][16] = {"thisis16charslo", "thisis16charslo"};

while on x86_64 - 

str[0] = "thisis16charslo";
str[1] = "thisis16charslo";

and this patch makes this output much worse (although it should be
applied only on sparse data)
Comment 6 Jakub Jelinek 2007-08-23 20:36:13 UTC
It is unfortunate that gimple can only initialize the whole array, unless
__builtin_memcpy is used.  Unfortunately __builtin_memcpy has a different
drawback - the var will be forced to be TREE_ADDRESSABLE even when it otherwise
wouldn't have to be.
So, we have the option to add some flag to count_type_elements etc., so that
they will behave one way for expr.c etc. and one way for the gimplifier - there it would need to count whole arrays as units if they are initialized from string
literals, etc.
The other option is IMHO to hint during expansion expand_assignment that
the memory is already cleared by previous var = {} and so it can avoid storing
zeros.  Or teach some RTL pass to remove the redundant zero stores.  If
the big clear_storage is done through clearing by pieces, DSE will already handle this to some extent (will remove redundant stores from the big clear_storage), but if it is done through a special insn (like rep stosb) or
through memset call, nothing will remove it.

While looking at this, I have noticed that a simple
struct A { char c[10]; };

void
foo (void)
{
  struct A a = { "abcdefghi" };
  baz (&a);
}

void
bar (void)
{
  struct A a;
  __builtin_strcpy (&a.c[0], "abcdefghi");
  baz (&a);
}

is not as efficient as it ought to be in the foo function - while strcpy/memcpy
can and in this case will use store_by_pieces, store_expr will always emit the string literal to memory and then emit a block move from it to target followed
optionally by clear_storage of the rest of the memory.
I'm testing a patch to improve this ATM.
Comment 7 Andrew Pinski 2008-07-03 23:22:50 UTC
(In reply to comment #4)
> http://gcc.gnu.org/ml/gcc-patches/2007-06/msg02037.html

This patch causes the following simple testcase to fail as we are stopping at the first NULL character:
int main()
{
  char str[2][34] = {"a\0c","b"};

  if (str[0][2] != 'c')
    __builtin_abort ();

  return 0;
}
Comment 8 Joseph S. Myers 2008-07-04 22:00:29 UTC
Closing 4.1 branch.
Comment 9 revital eres 2008-07-06 05:29:29 UTC
Following http://gcc.gnu.org/ml/gcc/2008-07/msg00104.html
I would like to ask to be unassign from this bug.

Thanks
Comment 10 Jakub Jelinek 2008-12-22 23:42:44 UTC
Subject: Bug 31150

Author: jakub
Date: Mon Dec 22 23:41:17 2008
New Revision: 142892

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=142892
Log:
	PR middle-end/31150
	* dse.c (struct store_info): Add const_rhs field.
	(clear_rhs_from_active_local_stores): Clear also const_rhs.
	(record_store): Try also cselib_expand_value_rtx to get a constant.
	(find_shift_sequence, get_stored_val): Use const_rhs instead of
	rhs if worthwhile.
	* cselib.c (cselib_record_sets): If !cselib_record_memory and
	there is just one set from read-only MEM, look at REG_EQUAL or
	REG_EQUIV note.

	* dse.c (struct store_info): Add redundant_reason field.
	(record_store): When storing the same constant as has been
	stored by an earlier store, set redundant_reason field
	to the earlier store's insn_info_t.  Don't delete cannot_delete
	insns.
	(find_shift_sequence): Remove read_info argument, add read_mode
	and require_cst arguments.  Return early if require_cst and
	constant wouldn't be returned.
	(get_stored_val): New function.
	(replace_read): Use it.
	(scan_insn): Put even cannot_delete insns with exactly 1 store
	into active_local_stores.
	(dse_step1): Don't delete cannot_delete insns.  Remove redundant
	constant stores if contains_cselib_groups and earlier store storing
	the same value hasn't been eliminated.
	(dse_step6): Renamed to dse_step7.  New function.
	(dse_step7): Renamed from dse_step6.
	(rest_of_handle_dse): Call dse_step6 and dse_step7 at the end.
	* cselib.c (cselib_expand_value_rtx): Don't wrap CONST_INTs
	into CONST unless really necessary.  Handle SUBREG, unary,
	ternary, bitfield and compares specially, to be able to simplify
	operations on constants.
	(expand_loc): Try to optimize LO_SUM.

	* dse.c (get_call_args): New function.
	(scan_insn): Don't handle BUILT_IN_BZERO.  For memset, attempt
	to get call arguments and if successful and both len and val are
	constants, handle the call as (mem:BLK) (const_int) store.

	* dse.c (struct store_info): Add is_large bool field, change
	positions_needed into a union of a bitmask and bitmap + count.
	(free_store_info): Free bitmap if is_large.
	(set_usage_bits): Don't look at stores where
	offset + width >= MAX_OFFSET.
	(set_position_unneeded, set_all_positions_unneeded,
	any_positions_needed_p, all_positions_needed_p): New static inline
	functions.
	(record_store): Handle BLKmode stores of CONST_INT, if
	MEM_SIZE is set on the MEM.  Use the new positions_needed
	accessor inlines.
	(replace_read): Handle reads from BLKmode CONST_INT stores.
	(check_mem_read_rtx): Use all_positions_needed_p function.
	(dse_step1): Free large positions_needed bitmaps and clear is_large.

	* dse.c (struct store_info): Change begin and end types to
	HOST_WIDE_INT.

	* dse.c (record_store): Fix check for unused store.

	* expr.c (block_clear_fn): No longer static.
	* expr.h (block_clear_fn): Declare.
	* dse.c (scan_insn): Memset and bzero can just read their
	arguments.

	* gcc.c-torture/execute/20081218-1.c: New test.

Added:
    trunk/gcc/testsuite/gcc.c-torture/execute/20081218-1.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/cselib.c
    trunk/gcc/dse.c
    trunk/gcc/expr.c
    trunk/gcc/expr.h
    trunk/gcc/testsuite/ChangeLog

Comment 11 Jakub Jelinek 2008-12-22 23:45:14 UTC
Fixed on the trunk.  Won't fix for older releases.