[patch] expmed.c: Improve multiplication by more constants.
Kazu Hirata
kazu@codesourcery.com
Mon Apr 13 17:20:00 GMT 2009
Hi,
Attached is a patch to improve synth_mult.
Specifically, this patch improves multiplication by constants of the
form
-m * (2^n - 1) for some positive integers m and n.
Without this patch, arm-none-eabi-gcc -mcpu=cortex-a8 produces a
multiplication by -28 like so:
<multminus28>:
mvn r3, #27 ; r3 = -28
mul r0, r3, r0 ; r0 = r3 * r0
bx lr
I've inserted comments after ';' for readability. With this patch,
you get:
<multminus28>:
sub r0, r0, r0, lsl #3 ; r0 = r0 - (r0 << 3)
lsl r0, r0, #2 ; r0 = r0 << 2
bx lr
When synth_mult is called with T == -28 and mode == SImode, it tries
a shift by 2 bits because T is divisible by 4.  Without this patch,
synth_mult always performs an unsigned (logical) shift on T, so the
recursive call receives T as:
0x3ffffff9
rather than
0xfffffff9
A multiplication by 0x3ffffff9 is hard to synthesize, but 0xfffffff9,
which is -7, is doable. Specifically, we can do:
sub r0, r0, r0, lsl #3
on ARM.
The patch fixes the problem by also trying the result of a signed
(arithmetic) shift on T.  For example, when T == -28 and mode ==
SImode are given, we pass:
0xfffffff9 (= 0xffffffe4 >> 2, shifted arithmetically)
to the recursively called synth_mult, which gives you:
A * -7 = A - A * 8
if costs permit.
Tested on x86_64-pc-linux-gnu and arm-none-eabi on top of the previous
synth_mult patch. OK to apply?
Kazu Hirata
2009-04-13 Kazu Hirata <kazu@codesourcery.com>
* expmed.c (synth_mult): When trying out a shift, pass the result
of a signed shift.
Index: gcc/expmed.c
===================================================================
--- gcc/expmed.c (revision 145996)
+++ gcc/expmed.c (working copy)
@@ -2542,6 +2542,38 @@ synth_mult (struct algorithm *alg_out, u
best_alg->log[best_alg->ops] = m;
best_alg->op[best_alg->ops] = alg_shift;
}
+
+ /* See if treating ORIG_T as a signed number yields a better
+ sequence. Try this sequence only for a negative ORIG_T
+ as it would be useless for a non-negative ORIG_T. */
+ if ((HOST_WIDE_INT) orig_t < 0)
+ {
+ /* Shift ORIG_T as follows because a right shift of a
+ negative-valued signed type is implementation
+ defined. */
+ q = ~(~orig_t >> m);
+ /* The function expand_shift will choose between a shift
+ and a sequence of additions, so the observed cost is
+ given as MIN (m * add_cost[speed][mode],
+ shift_cost[speed][mode][m]). */
+ op_cost = m * add_cost[speed][mode];
+ if (shift_cost[speed][mode][m] < op_cost)
+ op_cost = shift_cost[speed][mode][m];
+ new_limit.cost = best_cost.cost - op_cost;
+ new_limit.latency = best_cost.latency - op_cost;
+ synth_mult (alg_in, q, &new_limit, mode);
+
+ alg_in->cost.cost += op_cost;
+ alg_in->cost.latency += op_cost;
+ if (CHEAPER_MULT_COST (&alg_in->cost, &best_cost))
+ {
+ struct algorithm *x;
+ best_cost = alg_in->cost;
+ x = alg_in, alg_in = best_alg, best_alg = x;
+ best_alg->log[best_alg->ops] = m;
+ best_alg->op[best_alg->ops] = alg_shift;
+ }
+ }
}
if (cache_hit)
goto done;