This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] Implement switch statements with bit tests


> 
> The following patch allows GCC to implement switch/case statements
> using a short sequence of bit comparison operations.  The original
> idea was suggested to to me by Andi Kleen and Jan Hubicka of SuSe.
> 
> The theory is that for small dense switch statements with few
> unique targets, it is possible to implement "switch(x)" with
> comparisons of the form "if ((1<<x) & CST)", where CST is a
> suitable integer constant.
> 
> To be slightly more precise the generated RTL looks like:
> 
> 	t1 = x - MINVAL
> 	if (t1 > RANGE)
> 	  goto DEFAULT
> 	t2 = 1 << t1
> 	if (t2 & CST1)
> 	  goto CASE1
> 	if (t2 & CST2)
> 	  goto CASE2
> 	if (t2 & CST3)
> 	  goto CASE3
> 	goto DEFAULT
> 
> Here MINVAL is the lowest case value in the switch, RANGE is
> the value MAXVAL - MINVAL, i.e. the highest case value - MINVAL,
> DEFAULT is the default case target.  To work RANGE must be less
> than the number of bits in SImode, i.e. 31 or less on typical
> 32-bit architectures.
> 
> The implementation below allows atmost three unique non-default case
> targets, as longer sequences are potentially better implemented
> by tablejumps.  Similarly, we require several bits to be set
> in order to benefit from this approach.  In the patch below,
> we'll select this strategy if there is a single target with atleast
> three bits set, two targets with atleast five bits set or three
> targets with atleast 6 bits set.
> 
> 
> In an anaylsis of the 5414 switch statements encountered by GCC
> during stages 2 and 3 of a bootstrap on i686-pc-cygwin, 236 or
> about 4.35% meet the required criteria.  For example, 3786 have
> a range < 32, 812 of these branched to a only single target or
> the default, but only 60 of these had three of more cases values
> that branched to that single target.
> 
> As a performance optimization, the case with the most bits set
> is tested first, then if there is more than one bit-test, by
> decreasing bit/case counts.
> 
> As another optimization, we also try to avoid the subtraction
> by MINVAL, if MAXVAL is less than the number of bits in SImode.
> In the same switch analysis above, 130 of the 236 eligible
> switch statements could avoid the subtraction.
> 
> 
> As an example,
> 
> int foo(int x)
> {
>   switch (x)
>     {
>     case 4:
>     case 6:
>     case 9:
>     case 11:
>       return 30;
>     }
>   return 31;
> }
> 
> generates the following code with -O2 on i686-pc-cygwin
> 
> _foo:	pushl	%ebp
> 	movl	%esp, %ebp
> 	movl	8(%ebp), %ecx
> 	cmpl	$8, $ecx
> 	ja	L2
> 	movl	$1, $eax
> 	movl	$30, $edx
> 	sall	$cl, $eax
> 	testl	$274, $eax
> 	jne	L1
> L2:
> 	movl	$31, %edx
> L1:
> 	popl	%ebp
> 	movl	%edx, %eax
> 	ret
> 

When written as a 'C' like transformation, what you are doing is 
undefined, since you can end up with a shift that is more than the word 
size (consider calling foo with x=65536).  Many processors can end up 
doing various random things when called with over-length shifts (some may 
take many cycles to produce the result, others -- such as the ARM -- may 
reduce the range of the value in some way -- mod 256 on the ARM).


> The one caveat to the above approach is that this is only a win
> if the target provides an efficient instruction to perform 1<<x.
> To account for this, I introduce a new target macro that allows
> backends to disable this implementation strategy.  The default
> value of CASE_USE_BIT_TESTS checks the optabs to see if the
> target .md defines an ashlsi3 pattern.  This is a heuristic, as
> I suspect there may be targets that define ashlsi3 but actually
> implement it via an inlined loop or a library call.  On these
> targets, the patch below will not affect correctness, but may
> negatively impact performance unless the target's .h file is
> tweaked to contain the line:
> 
> #define CASE_USE_BIT_TESTS  0
> 
> [Is there a better way to test for an efficient left shift insn?]

Just testing for such an insn isn't sufficient.  You really need to know 
that the behaviour is correct for oversized shits and maybe wrap the whole 
test in something that checks the range of the shift.

> The following patch has been tested by a complete bootstrap, all
> languages except Ada and treelang, on i686-pc-linux-gnu with no
> new regressions.  It has also been bootstrapped on i686-pc-cygwin.

As currently implemented I believe this will produce *incorrect* code on 
the ARM.

R.



Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]