This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: [wide-int] Handle more add and sub cases inline


I would like to see a comment to the effect that this is to allow inlining of the common case for widest_int and offset_int without inlining the uncommon case for regular wide_int.




On 11/28/2013 12:38 PM, Richard Sandiford wrote:
Currently add and sub have no fast path for offset_int and widest_int,
they just call the out-of-line version.  This patch handles the
single-HWI cases inline.  At least on x86_64, this only adds one branch
per call; the fast path itself is straight-line code.

On the same fold-const.ii testcase, this reduces the number of
add_large calls from 877507 to 42459.  It reduces the number of
sub_large calls from 25707 to 148.

Tested on x86_64-linux-gnu.  OK to install?

Thanks,
Richard


Index: gcc/wide-int.h
===================================================================
--- gcc/wide-int.h	2013-11-28 13:34:19.596839877 +0000
+++ gcc/wide-int.h	2013-11-28 16:08:11.387731775 +0000
@@ -2234,6 +2234,17 @@ wi::add (const T1 &x, const T2 &y)
        val[0] = xi.ulow () + yi.ulow ();
        result.set_len (1);
      }
+  else if (STATIC_CONSTANT_P (precision > HOST_BITS_PER_WIDE_INT)
+	   && xi.len + yi.len == 2)
+    {
+      unsigned HOST_WIDE_INT xl = xi.ulow ();
+      unsigned HOST_WIDE_INT yl = yi.ulow ();
+      unsigned HOST_WIDE_INT resultl = xl + yl;
+      val[0] = resultl;
+      val[1] = (HOST_WIDE_INT) resultl < 0 ? 0 : -1;
+      result.set_len (1 + (((resultl ^ xl) & (resultl ^ yl))
+			   >> (HOST_BITS_PER_WIDE_INT - 1)));
+    }
    else
      result.set_len (add_large (val, xi.val, xi.len,
  			       yi.val, yi.len, precision,
@@ -2288,6 +2299,17 @@ wi::sub (const T1 &x, const T2 &y)
        val[0] = xi.ulow () - yi.ulow ();
        result.set_len (1);
      }
+  else if (STATIC_CONSTANT_P (precision > HOST_BITS_PER_WIDE_INT)
+	   && xi.len + yi.len == 2)
+    {
+      unsigned HOST_WIDE_INT xl = xi.ulow ();
+      unsigned HOST_WIDE_INT yl = yi.ulow ();
+      unsigned HOST_WIDE_INT resultl = xl - yl;
+      val[0] = resultl;
+      val[1] = (HOST_WIDE_INT) resultl < 0 ? 0 : -1;
+      result.set_len (1 + (((resultl ^ xl) & (xl ^ yl))
+			   >> (HOST_BITS_PER_WIDE_INT - 1)));
+    }
    else
      result.set_len (sub_large (val, xi.val, xi.len,
  			       yi.val, yi.len, precision,


