[wide-int] Avoid some changes in output code

Kenneth Zadeck zadeck@naturalbridge.com
Mon Nov 4 14:47:00 GMT 2013


On 11/04/2013 04:12 AM, Richard Biener wrote:
> On Fri, 1 Nov 2013, Richard Sandiford wrote:
>
>> I'm building one target for each supported CPU and comparing the wide-int
>> assembly output of gcc.c-torture, gcc.dg and g++.dg with the corresponding
>> output from the merge point.  This patch removes all the differences I saw
>> for alpha-linux-gnu in gcc.c-torture.
>>
>> Hunk 1: Preserve the current trunk behaviour in which the shift count
>> is truncated if SHIFT_COUNT_TRUNCATED and not otherwise.  This was by
>> inspection after hunk 5.
>>
>> Hunks 2 and 3: Two cases where force_fit_to_type could extend to wider
>> types and where we weren't extending according to the sign of the source.
>> We should probably assert that the input is at least as wide as the type...
>>
>> Hunk 4: The "&" in:
>>
>> 	      if ((TREE_INT_CST_HIGH (arg1) & mask_hi) == mask_hi
>> 		  && (TREE_INT_CST_LOW (arg1) & mask_lo) == mask_lo)
>>
>> had got dropped during the conversion.
>>
>> Hunk 5: The old code was:
>>
>> 	  if (shift > 0)
>> 	    {
>> 	      *mask = r1mask.llshift (shift, TYPE_PRECISION (type));
>> 	      *val = r1val.llshift (shift, TYPE_PRECISION (type));
>> 	    }
>> 	  else if (shift < 0)
>> 	    {
>> 	      shift = -shift;
>> 	      *mask = r1mask.rshift (shift, TYPE_PRECISION (type), !uns);
>> 	      *val = r1val.rshift (shift, TYPE_PRECISION (type), !uns);
>> 	    }
>>
>> and these precision arguments had two purposes: to control which
>> bits of the first argument were shifted, and to control the truncation
>> mask for SHIFT_TRUNCATED.  We need to pass a width to the shift functions
>> for the second.
>>
>> (BTW, I'm running the comparisons with CONST_WIDE_INT locally moved to the
>> end of the !GENERATOR_FILE list in rtl.def, since the current position caused
>> some spurious differences.  The "problem" AFAICT is that hash_rtx hashes
>> on code, RTL PRE creates registers in the hash order of the associated
>> expressions, RA uses register numbers as a tie-breaker during ordering,
>> and so the order of rtx_code can indirectly influence register allocation.
>> First time I'd realised that could happen, so just thought I'd mention it.
>> I think we should keep rtl.def in the current (logical) order though.)
>>
>> Tested on x86_64-linux-gnu and powerpc64-linux-gnu.  OK for wide-int?
> Bah - can you instead try removing the use of SHIFT_COUNT_TRUNCATED
> from double-int.c on the trunk?  If not then putting this at the
> callers of wi::rshift and friends is clearly asking for future
> mistakes.
>
> Oh, and I think that honoring SHIFT_COUNT_TRUNCATED anywhere else
> than in machine instruction patterns was a mistake in the past.
I disagree.   The programmer has the right to expect that the optimizer 
transforms the program in a manner that is consistent, but faster, with 
the underlying machine.   Saying you are going to do it in the patterns 
is not realistic.

I should point out that i had originally put the shift count truncated 
inside wide-int and you (richi) told me to push it back out to the 
callers.   I personally think that it should be returned to wide-int to 
assure consistent behavior.

However, i would also say that it would be nice if were actually done 
right.   There are (as far as i know) actually 3 ways handling shift 
amounts that have been used in the past:

(1) mask the shift amount by (bitsize - 1).   This is the most popular 
and is what is done if you set SHIFT_COUNT_TRUNCATED.
(2) mask the shift amount by ((2*bitsize) -1).   There are a couple of 
architectures that do this including the ppc.   However, gcc has never 
done it correctly for these platforms.
(3) assume that the shift amount can be a big number.    The only 
machine that i know of that ever did this was the Knuth's mmix, and 
AFAIK, it has never "known" silicon.   (it turns out that this is almost 
impossible to build efficiently).   And yet, we support this if you do 
not set SHIFT_COUNT_TRUNCATED.
I will, if you like, fix this properly.   But i want to see a plan that 
at least richi and richard are happy with first.    My gut feeling is to 
provide a target hook that takes a bitsize and a shift value as wide-int 
(or possibly a hwi) and returns a machine specific shift amount.    Then 
i would put that inside of the shift routines of wide-int.

kenny







More information about the Gcc-patches mailing list