rfc: another switch optimization idea

Dinar Temirbulatov dtemirbulatov@gmail.com
Mon Mar 25 21:24:00 GMT 2013


Hi,
We noticed some performance gains if we are not using jump over some
simple switch statements. Here is the idea: Check whether the switch
statement can be expanded with conditional instructions. In that case
jump tables should be avoided since some branch instructions can be
eliminated in further passes (replaced by conditional execution).

   For example:
   switch (i)
   {
     case 1: sum += 1;
     case 2: sum += 3;
     case 3: sum += 5;
     case 4: sum += 10;
   }

Using jump tables the following code will be generated (ARM assembly):

   ldrcc pc, [pc, r0, lsl #2]
   b .L5
   .L0:
        .word L1
        .word L2
        .word L3
        .word L4

   .L1:
        add r3, #1
   .L2:
        add r3, #4
   .L3:
        add r3, #5
   .L4:
        add r3, #10
   .L5

Although this code has a constant complexity it can be improved by the
conditional execution to avoid implicit branching:

   cmp r0,1
   addeq r3, #1
   cmp r0,2
   addeq r3, #4
   cmp r0,3
   addeq r3, #5
   cmp r0,4
   addeq r3, #10

Although the assembly below requires more assembly instructions to be
executed, it doesn't violate the CPU pipeline (since no branching is
performed).

The original version of patch for was developed by Alexey Kravets. I
measured some performance improvements/regressions using spec 2000 int
benchmark on Samsumg's exynos 5250. Here is the result:

before:
                           Base      Base      Base      Peak
Peak      Peak
   Benchmarks    Ref Time  Run Time   Ratio    Ref Time  Run Time   Ratio
   ------------           --------      --------  --------  --------
--------  --------
   164.gzip          1400        287       487*     1400       288       485*
   175.vpr           1400        376       373*     1400       374       374*
   176.gcc           1100        121       912*     1100       118       933*
   181.mcf           1800        242       743*     1800       251       718*
   186.crafty        1000        159       628*     1000       165       608*
   197.parser        1800       347       518*     1800       329       547*
   252.eon           1300       960       135*     1300       960       135*
   253.perlbmk       1800      214       842*     1800       212       848*
   254.gap           1100       138       797*     1100       136       806*
   255.vortex        1900       253       750*     1900       255       744*
   256.bzip2         1500       237       632*     1500       230       653*
   300.twolf                                 X                             X
   SPECint_base2000                       561
   SPECint2000                                                          563

After:
   164.gzip          1400   286       490    *     1400   288       486    *
   175.vpr           1400   213       656    *     1400   215       650    *
   176.gcc           1100   119       923    *     1100   118       933    *
   181.mcf          1800   247       730    *     1800   251       717    *
   186.crafty        1000   145       688    *     1000   150       664    *
   197.parser       1800   296       608    *     1800   275       654    *
   252.eon                                   X                             X
   253.perlbmk     1800   206       872    *     1800   211       853    *
   254.gap           1100   133       825    *     1100   131       838    *
   255.vortex        1900   241       789    *     1900   239       797    *
   256.bzip2         1500   235       638    *     1500   226       663    *
   300.twolf                                 X                             X

The error in 252.eon was due to incorrect setup. Also "if (count >
3*PARAM_VALUE (PARAM_SWITCH_JUMP_TABLES_BB_OPS_LIMIT))" does not look
correct, and probably it is better to move this code in the earlier
stage just before the gimple expand and keep preference expand state
(jump-tables or not) for every switch statement to avoid dealing with
the RTL altogether.

                     thanks, Dinar.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: switch.patch
Type: application/octet-stream
Size: 8871 bytes
Desc: not available
URL: <https://gcc.gnu.org/pipermail/gcc/attachments/20130325/ef6637c9/attachment.obj>


More information about the Gcc mailing list