This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



[RFC] Request for comments on ivopts patch


I have an ivopts optimization question/proposal.  When compiling the
attached program the ivopts pass prefers the original ivs over new ivs
and that causes us to generate less efficient code on MIPS.  It may
affect other platforms too.

The source code is a C implementation of strcmp:

int strcmp (const char *p1, const char *p2)
{
  const unsigned char *s1 = (const unsigned char *) p1;
  const unsigned char *s2 = (const unsigned char *) p2;
  unsigned char c1, c2;
  do {
      c1 = (unsigned char) *s1++;
      c2 = (unsigned char) *s2++;
      if (c1 == '\0') return c1 - c2;
  } while (c1 == c2);
  return c1 - c2;
}

Currently ivopts prefers the original ivs, so it generates code that
increments s1 and s2 before doing the loads, compensating with a -1
offset on each load (shown as 4294967295B in the 32-bit dump):

  <bb 3>:
  # s1_1 = PHI <p1_4(D)(2), s1_6(6)>
  # s2_2 = PHI <p2_5(D)(2), s2_9(6)>
  s1_6 = s1_1 + 1;
  c1_8 = MEM[base: s1_6, offset: 4294967295B];
  s2_9 = s2_2 + 1;
  c2_10 = MEM[base: s2_9, offset: 4294967295B];
  if (c1_8 == 0)
    goto <bb 4>;
  else
    goto <bb 5>;

If I remove the cost increment for non-original ivs, GCC instead
does the loads before the increments:

 <bb 3>:
  # ivtmp.6_17 = PHI <ivtmp.6_24(2), ivtmp.6_14(6)>
  # ivtmp.7_21 = PHI <ivtmp.7_22(2), ivtmp.7_23(6)>
  _25 = (void *) ivtmp.6_17;
  c1_8 = MEM[base: _25, offset: 0B];
  _26 = (void *) ivtmp.7_21;
  c2_10 = MEM[base: _26, offset: 0B];
  if (c1_8 == 0)
    goto <bb 4>;
  else
    goto <bb 5>;
  ...
  <bb 5>:
  ivtmp.6_14 = ivtmp.6_17 + 1;
  ivtmp.7_23 = ivtmp.7_21 + 1;
  if (c1_8 == c2_10)
    goto <bb 6>;
  else
    goto <bb 7>;


This second case (without the preference for the original ivs)
generates better code on MIPS because, in the final assembly, the
increment instructions land between the loads and the tests of the
loaded values, so there is no delay (or less delay) between each
load and its use.  This could easily be the case on other platforms
too, so I was wondering what people think of this patch:


2015-12-08  Steve Ellcey  <sellcey@imgtec.com>

	* tree-ssa-loop-ivopts.c (determine_iv_cost): Remove preference for
	original ivs.


diff --git a/gcc/tree-ssa-loop-ivopts.c b/gcc/tree-ssa-loop-ivopts.c
index 98dc451..26daabc 100644
--- a/gcc/tree-ssa-loop-ivopts.c
+++ b/gcc/tree-ssa-loop-ivopts.c
@@ -5818,14 +5818,6 @@ determine_iv_cost (struct ivopts_data *data, struct iv_cand *cand)
 
   cost = cost_step + adjust_setup_cost (data, cost_base.cost);
 
-  /* Prefer the original ivs unless we may gain something by replacing it.
-     The reason is to make debugging simpler; so this is not relevant for
-     artificial ivs created by other optimization passes.  */
-  if (cand->pos != IP_ORIGINAL
-      || !SSA_NAME_VAR (cand->var_before)
-      || DECL_ARTIFICIAL (SSA_NAME_VAR (cand->var_before)))
-    cost++;
-
   /* Prefer not to insert statements into latch unless there are some
      already (so that we do not create unnecessary jumps).  */
   if (cand->pos == IP_END

