This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

[PATCH] Don't emit cmpstrsi and movstr* insns with -Os (take 2)


On Sun, 7 Sep 2003, Richard Henderson wrote:
> On Sat, Sep 06, 2003 at 04:58:40PM -0600, Roger Sayle wrote:
> > Disabling cmpstrsi with -Os shouldn't be controversial, the only backend
> > .md file that defines a pattern for it is i386.md.  My belief is that
> > disabling movstr* should also be a space win on other platforms, most of
> > which don't have the CISC advantages of x86's variable instruction length
> > encoding, complex memory move instructions and unaligned memory accesses.
>
> Any chance you could get this tested on s390?  If you get
> similar results there, then this is ok.

I've managed to get access to an s390, and unfortunately movstr* _is_
a space win on that target.  Doh!  Interestingly, the Linux instance
that I had access to on the zSeries mainframe only had 256Mb of memory
and 800Mb of hard-disk virtualized (not enough to fully bootstrap GCC),
so even though these machines can run thousands of simultaneous Linux
instances, -Os might not be the academic exercise that I initially
thought.  The CSiBE results for s390-ibm-linux-gnu are:

Configuration		Size (bytes)	Delta
-Os			1203178
-Os -mmvcle		1203054		-124
-Os (+patch)		1203658		+480
-Os -mmvcle (+patch)	1203774		+596


So my original patch that universally disabled movstr* patterns in the
middle-end actually increases code size on s390.  Also notice that
Ulrich's suggestion of enabling the -mmvcle instruction(s) when using
-Os helps code size, and appears to be a good idea.  [I'll let him
submit the patch to do that].


In the face of these results, I've completely rewritten the original
patch to only disable cmpstrsi and movstrsi patterns with -Os in the
i386 backend.  The change to do this in i386.md is trivial, unfortunately
the middle-end now had to be able to handle the consequences of
gen_cmpstrsi failing (which previously it never did).  This requires
stabilization of parameter lists to prevent re-evaluation of arguments
in builtins.c, much like we are already doing for the mathematical
builtins.

Before someone points out the apparent duplication, the code is written
the way it is so that if we fail to inline strcmp as a cmpstrsi we fall
back to calling the library strcmp and not the library strncmp (which
would increase code size!).

I also needed to disable some of the tests in string-opt-8.c that were
checking that we were inlining strncmp on i386.  Now that we don't do
this when optimizing for size, the original test fails at -Os.


The CSiBE results against this new patch on i686-pc-linux-gnu are:

Configuration		Size (bytes)	Delta
-Os			989446
-Os (disable cmpstrsi)	987312		-2134
-Os (disable movstrsi)	988896		-550
-Os (disable both)	986627		-2819


The following patch has been tested on i686-pc-linux-gnu with a complete
"make bootstrap", all languages except treelang, and regression tested
with a top-level "make -k check" with no new failures.

Ok for mainline?


2003-09-11  Roger Sayle  <roger@eyesopen.com>

	* builtins.c (expand_builtin_strcmp): Defend against the possibility
	that gen_cmpstrsi may fail: Stabilize the argument list against
	re-evaluation and expand the library call directly using this saved
	argument list if a cmpstrsi sequence can't be generated.
	(expand_builtin_strncmp): Likewise.

	* config/i386/i386.md (cmpstrsi, movstrsi): Disable with -Os.

	* gcc.c-torture/execute/string-opt-8.c: Don't test optimizations
	that inline strncmp as cmpstrsi on i386 when compiled with -Os.


Index: builtins.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/builtins.c,v
retrieving revision 1.250
diff -c -3 -p -r1.250 builtins.c
*** builtins.c	9 Sep 2003 19:20:42 -0000	1.250
--- builtins.c	11 Sep 2003 04:18:01 -0000
*************** expand_builtin_strcmp (tree exp, rtx tar
*** 3286,3291 ****
--- 3286,3292 ----
      tree len, len1, len2;
      rtx arg1_rtx, arg2_rtx, arg3_rtx;
      rtx result, insn;
+     tree fndecl;

      int arg1_align
        = get_pointer_alignment (arg1, BIGGEST_ALIGNMENT) / BITS_PER_UNIT;
*************** expand_builtin_strcmp (tree exp, rtx tar
*** 3341,3364 ****
  	   && REGNO (result) >= FIRST_PSEUDO_REGISTER))
        result = gen_reg_rtx (insn_mode);

      arg1_rtx = get_memory_rtx (arg1);
      arg2_rtx = get_memory_rtx (arg2);
      arg3_rtx = expand_expr (len, NULL_RTX, VOIDmode, 0);
      insn = gen_cmpstrsi (result, arg1_rtx, arg2_rtx, arg3_rtx,
  			 GEN_INT (MIN (arg1_align, arg2_align)));
!     if (!insn)
!       return 0;
!
!     emit_insn (insn);
!
!     /* Return the value in the proper mode for this function.  */
!     mode = TYPE_MODE (TREE_TYPE (exp));
!     if (GET_MODE (result) == mode)
!       return result;
!     if (target == 0)
!       return convert_to_mode (mode, result, 0);
!     convert_move (target, result, 0);
!     return target;
    }
  #endif
    return 0;
--- 3342,3377 ----
  	   && REGNO (result) >= FIRST_PSEUDO_REGISTER))
        result = gen_reg_rtx (insn_mode);

+     /* Stabilize the arguments in case gen_cmpstrsi fails.  */
+     arg1 = save_expr (arg1);
+     arg2 = save_expr (arg2);
+
      arg1_rtx = get_memory_rtx (arg1);
      arg2_rtx = get_memory_rtx (arg2);
      arg3_rtx = expand_expr (len, NULL_RTX, VOIDmode, 0);
      insn = gen_cmpstrsi (result, arg1_rtx, arg2_rtx, arg3_rtx,
  			 GEN_INT (MIN (arg1_align, arg2_align)));
!     if (insn)
!       {
! 	emit_insn (insn);
!
! 	/* Return the value in the proper mode for this function.  */
! 	mode = TYPE_MODE (TREE_TYPE (exp));
! 	if (GET_MODE (result) == mode)
! 	  return result;
! 	if (target == 0)
! 	  return convert_to_mode (mode, result, 0);
! 	convert_move (target, result, 0);
! 	return target;
!       }
!
!     /* Expand the library call ourselves using a stabilized argument
!        list to avoid re-evaluating the function's arguments twice.  */
!     arglist = build_tree_list (NULL_TREE, arg2);
!     arglist = tree_cons (NULL_TREE, arg1, arglist);
!     fndecl = get_callee_fndecl (exp);
!     exp = build_function_call_expr (fndecl, arglist);
!     return expand_call (exp, target, target == const0_rtx);
    }
  #endif
    return 0;
*************** expand_builtin_strncmp (tree exp, rtx ta
*** 3433,3438 ****
--- 3446,3452 ----
      tree len, len1, len2;
      rtx arg1_rtx, arg2_rtx, arg3_rtx;
      rtx result, insn;
+     tree fndecl;

      int arg1_align
        = get_pointer_alignment (arg1, BIGGEST_ALIGNMENT) / BITS_PER_UNIT;
*************** expand_builtin_strncmp (tree exp, rtx ta
*** 3491,3514 ****
  	   && REGNO (result) >= FIRST_PSEUDO_REGISTER))
        result = gen_reg_rtx (insn_mode);

      arg1_rtx = get_memory_rtx (arg1);
      arg2_rtx = get_memory_rtx (arg2);
      arg3_rtx = expand_expr (len, NULL_RTX, VOIDmode, 0);
      insn = gen_cmpstrsi (result, arg1_rtx, arg2_rtx, arg3_rtx,
  			 GEN_INT (MIN (arg1_align, arg2_align)));
!     if (!insn)
!       return 0;
!
!     emit_insn (insn);
!
!     /* Return the value in the proper mode for this function.  */
!     mode = TYPE_MODE (TREE_TYPE (exp));
!     if (GET_MODE (result) == mode)
!       return result;
!     if (target == 0)
!       return convert_to_mode (mode, result, 0);
!     convert_move (target, result, 0);
!     return target;
    }
  #endif
    return 0;
--- 3505,3542 ----
  	   && REGNO (result) >= FIRST_PSEUDO_REGISTER))
        result = gen_reg_rtx (insn_mode);

+     /* Stabilize the arguments in case gen_cmpstrsi fails.  */
+     arg1 = save_expr (arg1);
+     arg2 = save_expr (arg2);
+     len = save_expr (len);
+
      arg1_rtx = get_memory_rtx (arg1);
      arg2_rtx = get_memory_rtx (arg2);
      arg3_rtx = expand_expr (len, NULL_RTX, VOIDmode, 0);
      insn = gen_cmpstrsi (result, arg1_rtx, arg2_rtx, arg3_rtx,
  			 GEN_INT (MIN (arg1_align, arg2_align)));
!     if (insn)
!       {
! 	emit_insn (insn);
!
! 	/* Return the value in the proper mode for this function.  */
! 	mode = TYPE_MODE (TREE_TYPE (exp));
! 	if (GET_MODE (result) == mode)
! 	  return result;
! 	if (target == 0)
! 	  return convert_to_mode (mode, result, 0);
! 	convert_move (target, result, 0);
! 	return target;
!       }
!
!     /* Expand the library call ourselves using a stabilized argument
!        list to avoid re-evaluating the function's arguments twice.  */
!     arglist = build_tree_list (NULL_TREE, len);
!     arglist = tree_cons (NULL_TREE, arg2, arglist);
!     arglist = tree_cons (NULL_TREE, arg1, arglist);
!     fndecl = get_callee_fndecl (exp);
!     exp = build_function_call_expr (fndecl, arglist);
!     return expand_call (exp, target, target == const0_rtx);
    }
  #endif
    return 0;
Index: config/i386/i386.md
===================================================================
RCS file: /cvs/gcc/gcc/gcc/config/i386/i386.md,v
retrieving revision 1.481
diff -c -3 -p -r1.481 i386.md
*** config/i386/i386.md	23 Aug 2003 21:18:58 -0000	1.481
--- config/i386/i386.md	11 Sep 2003 04:18:06 -0000
***************
*** 16129,16135 ****
     (use (match_operand:BLK 1 "memory_operand" ""))
     (use (match_operand:SI 2 "nonmemory_operand" ""))
     (use (match_operand:SI 3 "const_int_operand" ""))]
!   ""
  {
   if (ix86_expand_movstr (operands[0], operands[1], operands[2], operands[3]))
     DONE;
--- 16129,16135 ----
     (use (match_operand:BLK 1 "memory_operand" ""))
     (use (match_operand:SI 2 "nonmemory_operand" ""))
     (use (match_operand:SI 3 "const_int_operand" ""))]
!   "! optimize_size"
  {
   if (ix86_expand_movstr (operands[0], operands[1], operands[2], operands[3]))
     DONE;
***************
*** 16849,16855 ****
  		    (match_operand:BLK 2 "general_operand" "")))
     (use (match_operand 3 "general_operand" ""))
     (use (match_operand 4 "immediate_operand" ""))]
!   ""
  {
    rtx addr1, addr2, out, outlow, count, countreg, align;

--- 16849,16855 ----
  		    (match_operand:BLK 2 "general_operand" "")))
     (use (match_operand 3 "general_operand" ""))
     (use (match_operand 4 "immediate_operand" ""))]
!   "! optimize_size"
  {
    rtx addr1, addr2, out, outlow, count, countreg, align;

Index: testsuite/gcc.c-torture/execute/string-opt-8.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/testsuite/gcc.c-torture/execute/string-opt-8.c,v
retrieving revision 1.10
diff -c -3 -p -r1.10 string-opt-8.c
*** testsuite/gcc.c-torture/execute/string-opt-8.c	15 Jul 2003 21:26:36 -0000	1.10
--- testsuite/gcc.c-torture/execute/string-opt-8.c	11 Sep 2003 15:20:32 -0000
*************** int main ()
*** 65,71 ****
    s2 = s1; s3 = s1+4;
    if (strncmp (++s2, ++s3+2, 1) >= 0 || s2 != s1+1 || s3 != s1+5)
      abort();
! #if !defined(__OPTIMIZE__) || defined(__i386__)
    /* These tests work on platforms which support cmpstrsi.  We test it
       at -O0 on all platforms to ensure the strncmp logic is correct.  */
    s2 = s1;
--- 65,71 ----
    s2 = s1; s3 = s1+4;
    if (strncmp (++s2, ++s3+2, 1) >= 0 || s2 != s1+1 || s3 != s1+5)
      abort();
! #if !defined(__OPTIMIZE__) || (defined(__i386__) && !defined(__OPTIMIZE_SIZE__))
    /* These tests work on platforms which support cmpstrsi.  We test it
       at -O0 on all platforms to ensure the strncmp logic is correct.  */
    s2 = s1;


Roger
--
Roger Sayle,                         E-mail: roger@eyesopen.com
OpenEye Scientific Software,         WWW: http://www.eyesopen.com/
Suite 1107, 3600 Cerrillos Road,     Tel: (+1) 505-473-7385
Santa Fe, New Mexico, 87507.         Fax: (+1) 505-473-0833


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]