[patch]: Enhance rtl-dse to remove unaligned read after writes.

Ian Lance Taylor iant@google.com
Wed Sep 12 21:21:00 GMT 2007


Kenneth Zadeck <Kenneth.Zadeck@NaturalBridge.com> writes:

> 2007-07-24  Eric Christopher  <echristo@apple.com>
> 	    Kenneth Zadeck <zadeck@naturalbridge.com>
> 
> 	* dse.c (find_shift_sequence): New function.
> 	(replace_read): Add case to remove read if it requires shift.
> 	* config/i386/i386.c (ix86_expand_prologue): Fixed typo in comment.

This is OK with the changes below.

Thanks.


> +/* The modes are different and the values source and target do not
> +   line up, we need to extract the value in the right from the rhs of
> +   the store, shift it, and then put it into a form that can be shoved
> +   into the read_insn.  This function generates a right SHIFT of a
> +   value that is at least ACCESS_SIZE bytes wide of READ_MODE.  The
> +   shift sequence is returned or NULL if we failed to find a
> +   shift.  */

Please rewrite the first sentence to be grammatical, and rewrite the
last sentence into active voice.


> +static rtx
> +find_shift_sequence (rtx read_reg,
> +		     int access_size,
> +		     store_info_t store_info,
> +		     read_info_t read_info,
> +		     int shift)
> +{
> +  enum machine_mode store_mode = GET_MODE (store_info->mem);
> +  enum machine_mode read_mode = GET_MODE (read_info->mem);
> +
> +  /* Some machines like the x86 have shift insns for each size of
> +     operand.  Other machines like the ppc or the ia-64 may only have
> +     shift insns that shift values within 32 or 64 bit registers.
> +     This loop tries to find the smallest shift insn that will right
> +     justify the value we want to read but is available in one insn on
> +     the machine.  */
> +
> +  while (access_size < UNITS_PER_WORD)
> +    {
> +      rtx target;
> +      enum machine_mode new_mode
> +	= smallest_mode_for_size (access_size * BITS_PER_UNIT,
> +				  GET_MODE_CLASS (read_mode));
> +      rtx new_reg = gen_reg_rtx (new_mode);
> +
> +      start_sequence ();
> +
> +      /* In theory we could also check for an ashr.  Ian Taylor knows
> +	 of one dsp where the cost of these two was not the same.  But
> +	 this really is a rare case anyway.  */
> +      target = expand_binop (new_mode, lshr_optab, new_reg,
> +			     GEN_INT (shift), new_reg, 1, OPTAB_DIRECT);
> +
> +      if (target == new_reg)
> +	{
> +	  rtx shift_seq = get_insns ();
> +	  end_sequence ();
> +
> +	  /* If cost is too great, set target to NULL and
> +	     let the iteration happen. */
> +	  if (shift_seq != NULL)
> +	    {
> +	      int cost = 0;
> +	      rtx insn;
> +
> +	      for (insn = shift_seq; insn != NULL_RTX; insn = NEXT_INSN (insn))
> +		cost += insn_rtx_cost (insn);
> +
> +	      if (cost <= COSTS_N_INSNS (1))
> +		{

The computation up to here is essentially independent of the arguments
and could be precomputed.  It may not be worth doing so.  But please
add a comment saying that we could precompute if worthwhile.  Or we
could at least cache the results.  The result technically depends on
SHIFT, ACCESS_SIZE, and GET_MODE_CLASS (READ_MODE).  But in practice
the answer will depend only on ACCESS_SIZE.


> +   Depending on the alignement and the mode of the store and
> +   subsequent load.

Typo: "alignement" -> "alignment".


Ian



More information about the Gcc-patches mailing list