[PATCH] options, lto: Optimize streaming of optimization nodes

Jan Hubicka hubicka@ucw.cz
Mon Sep 14 09:02:26 GMT 2020


> On Mon, Sep 14, 2020 at 09:31:52AM +0200, Richard Biener wrote:
> > But does it make any noticeable difference in the end?  Using
> 
> Yes.
> 
> > bp_pack_var_len_unsigned just causes us to [u]leb encode half-bytes
> > rather than full bytes.  Using hardcoded 8/16/32/64 makes it still
> > dependent on what 'int' is at maximum on the host.
> > 
> > That is, I'd indeed prefer bp_pack_var_len_unsigned over hard-coding
> > 8, 16, etc., but can you share a size comparison of the bitpack?
> > I guess with bp_pack_var_len_unsigned it might shrink in half
> > compared to the current code and streaming standard -O2?
> 
> So, I've tried
> --- gcc/tree-streamer-out.c.jj	2020-07-28 15:39:10.079755251 +0200
> +++ gcc/tree-streamer-out.c	2020-09-14 10:31:29.106957258 +0200
> @@ -489,7 +489,11 @@ streamer_write_tree_bitfields (struct ou
>      pack_ts_translation_unit_decl_value_fields (ob, &bp, expr);
>  
>    if (CODE_CONTAINS_STRUCT (code, TS_OPTIMIZATION))
> +{
> +long ts = ob->main_stream->total_size;
>      cl_optimization_stream_out (ob, &bp, TREE_OPTIMIZATION (expr));
> +fprintf (stderr, "total_size %ld\n", (long) (ob->main_stream->total_size - ts));
> +}

You should be able to read the sizes from the streaming dump file as well.
>  
>    if (CODE_CONTAINS_STRUCT (code, TS_CONSTRUCTOR))
>      bp_pack_var_len_unsigned (&bp, CONSTRUCTOR_NELTS (expr));
> hack without and with the following patch on a simple small testcase with
> -O2 -flto.
> Got 574 bytes without the optc-save-gen.awk change and 454 bytes with it,
> that is ~ 21% saving on the TREE_OPTIMIZATION streaming.
> 
> 2020-09-14  Jakub Jelinek  <jakub@redhat.com>
> 
> 	* optc-save-gen.awk: In cl_optimization_stream_out use
> 	bp_pack_var_len_{int,unsigned} instead of bp_pack_value.  In
> 	cl_optimization_stream_in use bp_unpack_var_len_{int,unsigned}
> 	instead of bp_unpack_value.  Formatting fix.
> 
> --- gcc/optc-save-gen.awk.jj	2020-09-14 09:04:35.879854156 +0200
> +++ gcc/optc-save-gen.awk	2020-09-14 10:38:47.722424942 +0200
> @@ -1257,8 +1257,10 @@ for (i = 0; i < n_opt_val; i++) {
>  	otype = var_opt_val_type[i];
>  	if (otype ~ "^const char \\**$")
>  		print "  bp_pack_string (ob, bp, ptr->" name", true);";
> +	else if (otype ~ "^unsigned")
> +		print "  bp_pack_var_len_unsigned (bp, ptr->" name");";
>  	else
> -		print "  bp_pack_value (bp, ptr->" name", 64);";
> +		print "  bp_pack_var_len_int (bp, ptr->" name");";
>  }
>  print "  for (size_t i = 0; i < sizeof (ptr->explicit_mask) / sizeof (ptr->explicit_mask[0]); i++)";
>  print "    bp_pack_value (bp, ptr->explicit_mask[i], 64);";
> @@ -1274,14 +1276,15 @@ print "{";
>  for (i = 0; i < n_opt_val; i++) {
>  	name = var_opt_val[i]
>  	otype = var_opt_val_type[i];
> -	if (otype ~ "^const char \\**$")
> -	{
> -	      print "  ptr->" name" = bp_unpack_string (data_in, bp);";
> -	      print "  if (ptr->" name")";
> -	      print "    ptr->" name" = xstrdup (ptr->" name");";
> +	if (otype ~ "^const char \\**$") {
> +		print "  ptr->" name" = bp_unpack_string (data_in, bp);";
> +		print "  if (ptr->" name")";
> +		print "    ptr->" name" = xstrdup (ptr->" name");";
>  	}
> +	else if (otype ~ "^unsigned")
> +		print "  ptr->" name" = (" var_opt_val_type[i] ") bp_unpack_var_len_unsigned (bp);";
>  	else
> -	      print "  ptr->" name" = (" var_opt_val_type[i] ") bp_unpack_value (bp, 64);";
> +		print "  ptr->" name" = (" var_opt_val_type[i] ") bp_unpack_var_len_int (bp);";

Not distinguishing between signed and unsigned values was implementation
laziness on my part when the code was added, so this looks like a nice cleanup.
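
To illustrate why the signed/unsigned split matters for a variable-length
encoding, here is a minimal standalone sketch.  It assumes a half-byte scheme
(3 payload bits plus 1 continuation bit per nibble), which is how I read
Richard's description above; the sketch_* names and the one-nibble-per-byte
buffer are made up for illustration and are not the bitpack API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: store VALUE as nibbles (one per output byte, low
   4 bits used).  Each nibble carries 3 payload bits; bit 3 means "more
   nibbles follow".  Returns the number of nibbles written.  */
size_t
sketch_pack_unsigned (uint64_t value, uint8_t *out, size_t cap)
{
  size_t n = 0;
  do
    {
      uint8_t nib = (uint8_t) (value & 0x7);
      value >>= 3;
      if (value != 0)
        nib |= 0x8;
      assert (out != NULL && n < cap);
      out[n++] = nib;
    }
  while (value != 0);
  return n;
}

/* Signed variant: stop once the remaining bits are pure sign extension
   of the last payload bit, so small negative values stay short too.  */
size_t
sketch_pack_signed (int64_t value, uint8_t *out, size_t cap)
{
  size_t n = 0;
  int more;
  do
    {
      uint8_t nib = (uint8_t) (value & 0x7);
      /* Floor division by 8, avoiding a right shift of a negative value.  */
      value = (value - nib) / 8;
      more = !((value == 0 && (nib & 0x4) == 0)
               || (value == -1 && (nib & 0x4) != 0));
      if (more)
        nib |= 0x8;
      assert (out != NULL && n < cap);
      out[n++] = nib;
    }
  while (more);
  return n;
}

/* Decode a value written by sketch_pack_unsigned; *USED gets the number
   of nibbles consumed.  */
uint64_t
sketch_unpack_unsigned (const uint8_t *in, size_t *used)
{
  uint64_t result = 0;
  unsigned shift = 0;
  size_t n = 0;
  uint8_t nib;
  do
    {
      nib = in[n++];
      result |= (uint64_t) (nib & 0x7) << shift;
      shift += 3;
    }
  while (nib & 0x8);
  *used = n;
  return result;
}

/* Decode a value written by sketch_pack_signed, sign-extending from the
   top payload bit of the last nibble.  */
int64_t
sketch_unpack_signed (const uint8_t *in, size_t *used)
{
  uint64_t result = 0;
  unsigned shift = 0;
  size_t n = 0;
  uint8_t nib;
  do
    {
      nib = in[n++];
      result |= (uint64_t) (nib & 0x7) << shift;
      shift += 3;
    }
  while (nib & 0x8);
  if (shift < 64 && (nib & 0x4))
    result |= ~(uint64_t) 0 << shift;
  *used = n;
  return (int64_t) result;
}
```

With this layout a small value such as 2 or -1 takes a single nibble, where a
fixed 64-bit field takes sixteen.  A -1 pushed through the unsigned path
would need the maximum number of nibbles, which is why the type has to pick
the encoder.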

Especially with the new param machinery, most of the streamed values are
probably going to be the defaults.  Perhaps we could find a way to stream
them more efficiently.
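
One possible direction, purely as a sketch and not something the patch does:
stream a presence bitmask and then only the values that differ from their
defaults.  Everything below (the sketch_* names, the flat int64_t arrays, the
64-value limit) is hypothetical, and it assumes writer and reader agree on
the table of defaults, which would need checking for LTO.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: record in *MASK which of the N values differ from
   DEFAULTS and copy only those to OUT.  Returns the count copied.  */
size_t
sketch_pack_non_defaults (const int64_t *vals, const int64_t *defaults,
                          size_t n, uint64_t *mask, int64_t *out)
{
  size_t written = 0;
  assert (n <= 64);
  *mask = 0;
  for (size_t i = 0; i < n; i++)
    if (vals[i] != defaults[i])
      {
        *mask |= (uint64_t) 1 << i;
        out[written++] = vals[i];
      }
  return written;
}

/* Hypothetical inverse: rebuild all N values from MASK, the streamed
   non-default values in IN, and the shared DEFAULTS table.  */
void
sketch_unpack_non_defaults (uint64_t mask, const int64_t *in,
                            const int64_t *defaults, size_t n,
                            int64_t *vals)
{
  size_t next = 0;
  assert (n <= 64);
  for (size_t i = 0; i < n; i++)
    vals[i] = ((mask >> i) & 1) ? in[next++] : defaults[i];
}
```

If nearly all params are at their defaults, the payload shrinks to roughly
the mask plus a handful of values.  Whether that is worth the extra coupling
to the defaults table is an open question.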

Overall we should not get much more than one optimize/target node per unit,
so the size should show up only when you stream a lot of very small .o
files.

Honza
>  }
>  print "  for (size_t i = 0; i < sizeof (ptr->explicit_mask) / sizeof (ptr->explicit_mask[0]); i++)";
>  print "    ptr->explicit_mask[i] = bp_unpack_value (bp, 64);";
> 
> 
> 	Jakub
> 

