[PATCH] x86: Speed up target attribute handling by using a cache

Martin Liška mliska@suse.cz
Mon Nov 22 13:03:19 GMT 2021


On 11/22/21 10:36, Jakub Jelinek via Gcc-patches wrote:
> Hi!
> 
> The target attribute handling is very expensive and for the common case
> from x86intrin.h where many functions get implicitly the same target
> attribute, we can speed up compilation a lot by caching it.
> 
> The following patches both create a single entry cache, where they cache
> for a particular target attribute argument list the resulting
> DECL_FUNCTION_SPECIFIC_TARGET and DECL_FUNCTION_SPECIFIC_OPTIMIZATION
> values from ix86_valid_target_attribute_p and use the cache if the
> args are the same as last time and we start either from NULL values
> of those, or from the recorded values for those from last time.
> 
> Compiling a simple:
>   #include <x86intrin.h>
> 
>   int i;
> testcase with ./cc1 -quiet -O2 -isystem include/ test.c
> takes on my WS without the patches ~0.392s and with either of the
> patches ~0.182s, i.e. roughly half the time as before.
> For ./cc1plus -quiet -O2 -isystem include/ test.c
> it is slightly worse, the speed up is from ~0.613s to ~0.403s.
> 
> The difference between the 2 patches is that the first one uses copy_list
> while the second one uses a vec, so I think the second one has the advantage
> of creating less GC garbage.

Hello.

I see only one patch attached, Jakub. Could you please also send the second one?

> I've verified both patches achieve the same content of those
> DECL_FUNCTION_SPECIFIC_TARGET and DECL_FUNCTION_SPECIFIC_OPTIMIZATION
> nodes as before on x86intrin.h by doing debug_tree on those and comparing
> the stderr from without these patches to with these patches.
> 
> Both patches were bootstrapped/regtested on x86_64-linux and i686-linux,
> ok for trunk (and which one)?
> 
> 2021-11-22  Jakub Jelinek  <jakub@redhat.com>
> 
> 	* attribs.h (simple_cst_list_equal): Declare.
> 	* attribs.c (simple_cst_list_equal): No longer static.
> 	* config/i386/i386-options.c (target_attribute_cache): New variable.
> 	(ix86_valid_target_attribute_p): Cache DECL_FUNCTION_SPECIFIC_TARGET
> 	and DECL_FUNCTION_SPECIFIC_OPTIMIZATION based on args.
> 
> --- gcc/attribs.h.jj	2021-11-11 14:35:37.442350841 +0100
> +++ gcc/attribs.h	2021-11-19 11:52:08.843252645 +0100
> @@ -60,6 +60,7 @@ extern tree build_type_attribute_variant
>   extern tree build_decl_attribute_variant (tree, tree);
>   extern tree build_type_attribute_qual_variant (tree, tree, int);
>   
> +extern bool simple_cst_list_equal (const_tree, const_tree);
>   extern bool attribute_value_equal (const_tree, const_tree);
>   
>   /* Return 0 if the attributes for two types are incompatible, 1 if they
> --- gcc/attribs.c.jj	2021-11-11 14:35:37.442350841 +0100
> +++ gcc/attribs.c	2021-11-19 11:51:43.473615692 +0100
> @@ -1290,7 +1290,7 @@ cmp_attrib_identifiers (const_tree attr1
>   /* Compare two constructor-element-type constants.  Return 1 if the lists
>      are known to be equal; otherwise return 0.  */
>   
> -static bool
> +bool
>   simple_cst_list_equal (const_tree l1, const_tree l2)
>   {
>     while (l1 != NULL_TREE && l2 != NULL_TREE)
> --- gcc/config/i386/i386-options.c.jj	2021-11-15 13:19:07.347900863 +0100
> +++ gcc/config/i386/i386-options.c	2021-11-20 00:27:32.919505947 +0100
> @@ -1403,6 +1403,8 @@ ix86_valid_target_attribute_tree (tree f
>     return t;
>   }
>   
> +static GTY(()) tree target_attribute_cache[3];

I would introduce a struct that wraps the 3 trees, as accessing
target_attribute_cache[index] is not very intuitive.
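
To illustrate the idea, here is a minimal standalone sketch. It is not GCC
code: node, tree_t, fndecl_t, target_attribute_cache_t, cache_lookup and
cache_store are all hypothetical names, the argument list is modeled as a
vector of strings rather than a TREE_LIST, and the valid flag is an addition
that the patch does not have. In i386-options.c the struct would hold real
trees and would presumably still need a GTY marker so the garbage collector
can see them.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Stand-in for a GCC tree node (hypothetical, for illustration only).
struct node { std::string name; };
using tree_t = const node *;

// Stand-in for a function decl carrying the two per-function nodes.
struct fndecl_t
{
  tree_t specific_target = nullptr;
  tree_t specific_optimization = nullptr;
};

// Hypothetical wrapper replacing target_attribute_cache[0], [1] and [2]
// with named fields.
struct target_attribute_cache_t
{
  std::vector<std::string> args;  // attribute arguments seen last time
  tree_t target = nullptr;        // resulting target node
  tree_t optimize = nullptr;      // resulting optimization node, or null
  bool valid = false;             // assumption: explicit "filled" flag
};

target_attribute_cache_t target_attribute_cache;

// Record the result of a full (slow) attribute evaluation.
void
cache_store (const std::vector<std::string> &args, tree_t target,
             tree_t optimize)
{
  target_attribute_cache.args = args;
  target_attribute_cache.target = target;
  target_attribute_cache.optimize = optimize;
  target_attribute_cache.valid = true;
}

// Mirror of the fast path in the patch: reuse the cached nodes when the
// args match and the decl starts from null or from the cached values.
bool
cache_lookup (fndecl_t &fn, const std::vector<std::string> &args)
{
  const target_attribute_cache_t &c = target_attribute_cache;
  if (!c.valid)
    return false;
  if ((fn.specific_target == c.target || fn.specific_target == nullptr)
      && (fn.specific_optimization == c.optimize
          || fn.specific_optimization == nullptr)
      && args == c.args)
    {
      fn.specific_target = c.target;
      fn.specific_optimization = c.optimize;
      return true;
    }
  return false;
}
```

With named fields, the condition reads as cache.target and cache.optimize
instead of indices 1 and 2, which is the only point of the suggestion; the
caching logic itself stays as in the patch.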

Thanks,
Martin

> +
>   /* Hook to validate attribute((target("string"))).  */
>   
>   bool
> @@ -1423,6 +1425,19 @@ ix86_valid_target_attribute_p (tree fnde
>         && strcmp (TREE_STRING_POINTER (TREE_VALUE (args)), "default") == 0)
>       return true;
>   
> +  if ((DECL_FUNCTION_SPECIFIC_TARGET (fndecl) == target_attribute_cache[1]
> +       || DECL_FUNCTION_SPECIFIC_TARGET (fndecl) == NULL_TREE)
> +      && (DECL_FUNCTION_SPECIFIC_OPTIMIZATION (fndecl)
> +	  == target_attribute_cache[2]
> +	  || DECL_FUNCTION_SPECIFIC_OPTIMIZATION (fndecl) == NULL_TREE)
> +      && simple_cst_list_equal (args, target_attribute_cache[0]))
> +    {
> +      DECL_FUNCTION_SPECIFIC_TARGET (fndecl) = target_attribute_cache[1];
> +      DECL_FUNCTION_SPECIFIC_OPTIMIZATION (fndecl)
> +	= target_attribute_cache[2];
> +      return true;
> +    }
> +
>     tree old_optimize = build_optimization_node (&global_options,
>   					       &global_options_set);
>   
> @@ -1456,8 +1471,17 @@ ix86_valid_target_attribute_p (tree fnde
>     if (new_target == error_mark_node)
>       ret = false;
>   
> -  else if (fndecl && new_target)
> +  else if (new_target)
>       {
> +      if (DECL_FUNCTION_SPECIFIC_TARGET (fndecl) == NULL_TREE
> +	  && DECL_FUNCTION_SPECIFIC_OPTIMIZATION (fndecl) == NULL_TREE)
> +	{
> +	  target_attribute_cache[0] = copy_list (args);
> +	  target_attribute_cache[1] = new_target;
> +	  target_attribute_cache[2]
> +	    = old_optimize != new_optimize ? new_optimize : NULL_TREE;
> +	}
> +
>         DECL_FUNCTION_SPECIFIC_TARGET (fndecl) = new_target;
>   
>         if (old_optimize != new_optimize)
> 
> 	Jakub
> 


