This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Re: [wide-int] Avoid some changes in output code
- From: Richard Sandiford <rdsandiford at googlemail dot com>
- To: Richard Biener <rguenther at suse dot de>
- Cc: gcc-patches at gcc dot gnu dot org, zadeck at naturalbridge dot com, mikestump at comcast dot net
- Date: Mon, 04 Nov 2013 09:34:39 +0000
- Subject: Re: [wide-int] Avoid some changes in output code
- Authentication-results: sourceware.org; auth=none
- References: <871u30mrv6 dot fsf at talisman dot default> <alpine dot LNX dot 2 dot 00 dot 1311041010361 dot 4261 at zhemvz dot fhfr dot qr> <alpine dot LNX dot 2 dot 00 dot 1311041014190 dot 4261 at zhemvz dot fhfr dot qr>
Richard Biener <rguenther@suse.de> writes:
> On Mon, 4 Nov 2013, Richard Biener wrote:
>
>> On Fri, 1 Nov 2013, Richard Sandiford wrote:
>>
>> > I'm building one target for each supported CPU and comparing the wide-int
>> > assembly output of gcc.c-torture, gcc.dg and g++.dg with the corresponding
>> > output from the merge point. This patch removes all the differences I saw
>> > for alpha-linux-gnu in gcc.c-torture.
>> >
>> > Hunk 1: Preserve the current trunk behaviour in which the shift count
>> > is truncated if SHIFT_COUNT_TRUNCATED and not otherwise. This was by
>> > inspection after hunk 5.
>> >
>> > Hunks 2 and 3: Two cases where force_fit_to_type could extend to wider
>> > types and where we weren't extending according to the sign of the source.
>> > We should probably assert that the input is at least as wide as the type...
>> >
>> > Hunk 4: The "&" in:
>> >
>> >   if ((TREE_INT_CST_HIGH (arg1) & mask_hi) == mask_hi
>> >       && (TREE_INT_CST_LOW (arg1) & mask_lo) == mask_lo)
>> >
>> > had got dropped during the conversion.
>> >
>> > Hunk 5: The old code was:
>> >
>> >   if (shift > 0)
>> >     {
>> >       *mask = r1mask.llshift (shift, TYPE_PRECISION (type));
>> >       *val = r1val.llshift (shift, TYPE_PRECISION (type));
>> >     }
>> >   else if (shift < 0)
>> >     {
>> >       shift = -shift;
>> >       *mask = r1mask.rshift (shift, TYPE_PRECISION (type), !uns);
>> >       *val = r1val.rshift (shift, TYPE_PRECISION (type), !uns);
>> >     }
>> >
>> > and these precision arguments had two purposes: to control which
>> > bits of the first argument were shifted, and to control the truncation
mask for SHIFT_COUNT_TRUNCATED. We need to pass a width to the shift functions
>> > for the second.
>> >
>> > (BTW, I'm running the comparisons with CONST_WIDE_INT locally moved to the
>> > end of the !GENERATOR_FILE list in rtl.def, since the current position caused
>> > some spurious differences. The "problem" AFAICT is that hash_rtx hashes
>> > on code, RTL PRE creates registers in the hash order of the associated
>> > expressions, RA uses register numbers as a tie-breaker during ordering,
>> > and so the order of rtx_code can indirectly influence register allocation.
>> > First time I'd realised that could happen, so just thought I'd mention it.
>> > I think we should keep rtl.def in the current (logical) order though.)
>> >
>> > Tested on x86_64-linux-gnu and powerpc64-linux-gnu. OK for wide-int?
>>
>> Bah - can you instead try removing the use of SHIFT_COUNT_TRUNCATED
>> from double-int.c on the trunk? If not then putting this at the
>> callers of wi::rshift and friends is clearly asking for future
>> mistakes.
>>
>> Oh, and I think that honoring SHIFT_COUNT_TRUNCATED anywhere else
>> than in machine instruction patterns was a mistake in the past.
>
> Oh, and my understanding is that it may be required for correctness on
> the RTL side if middle-end code removes masking operations:
> constant propagation later on must not change the result because
> of that removal. I'm not sure this is the way it happens, but if we have
>
> (lshift:si (reg:si 1) (and:qi (reg:qi 2) 0x1f))
>
> and we'd transform it to (based on SHIFT_COUNT_TRUNCATED)
>
> (lshift:si (reg:si 1) (reg:qi 2))
>
> and constant propagation later substitutes 77 for reg:qi 2.
>
> That isn't the way SHIFT_COUNT_TRUNCATED should work, of course,
> but instead the backends should have a define_insn which matches
> the 'and' and combine should try to match that.
>
> You can see that various architectures may have truncating shifts,
> but not for all patterns, which results in stuff like
>
> config/aarch64/aarch64.h:#define SHIFT_COUNT_TRUNCATED !TARGET_SIMD
>
> which clearly means that it's not implemented in terms of providing
> shift insns which can match a shift operand that is masked.
>
> That is, either try to clean that up (hooray!), or preserve
> the double-int.[ch] behavior (how does CONST_INT RTL constant
> folding honor it?).
The CONST_INT code does the modulus explicitly:
  /* Truncate the shift if SHIFT_COUNT_TRUNCATED, otherwise make sure
     the value is in range.  We can't return any old value for
     out-of-range arguments because either the middle-end (via
     shift_truncation_mask) or the back-end might be relying on
     target-specific knowledge.  Nor can we rely on
     shift_truncation_mask, since the shift might not be part of an
     ashlM3, lshrM3 or ashrM3 instruction.  */
  if (SHIFT_COUNT_TRUNCATED)
    arg1 = (unsigned HOST_WIDE_INT) arg1 % width;
  else if (arg1 < 0 || arg1 >= GET_MODE_BITSIZE (mode))
    return 0;
There was a strong feeling (from me and others) that wide-int.h
should not depend on tm.h. I don't think wi::lshift itself should
know about SHIFT_COUNT_TRUNCATED. Especially since (and I think we agree
on this :-)) it's a terrible interface. It would be better if it took
a mode as an argument and (perhaps) returned a mask as a result. But that
would make it even harder for wide-int.h to do the right thing, because
it doesn't know about modes. Having the callers do it seems like the
right thing to me, both now and after any future tweaks to
SHIFT_COUNT_TRUNCATED.
If your objection is that it's too easy to forget, then I don't think
that's a problem at the RTL level. simplify_const_binary_operation
is the only place that does constant RTX shifts.
Sorry, I don't have time to replace SHIFT_COUNT_TRUNCATED right now.
Thanks,
Richard