This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: [PATCH] Reduce the number of extraneous rtl when expanding multiply


On Tue, 9 Nov 2004, Andrew Pinski wrote:
> 	* expmed.c (expand_mult_const): If we have an alg_m as the first
> 	operation and there is only one other operation, then don't copy
> 	the register into a new one.

I don't think this is safe.  I was just about to approve this patch
(under the condition that you mention PR middle-end/18293 in the
ChangeLog) when it struck me that we can't use op0 directly as
the accumulator, as op0's pseudo mustn't be modified, but the
accum pseudo may potentially be modified in place.

Consider the possibility where op0 is the pseudo corresponding to
a user-declared register variable upon calling expand_mult_const.
If the second step in the "struct algorithm" is anything other than
alg_shift, when not optimizing, we can call force_operand with
accum (== op0) as target.  This will destructively modify op0,
which may still be live outside of the call to expand_mult_const.
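
To make the hazard concrete, here is a toy model in plain C rather than
real RTL.  The "pseudo" struct and the mult5 function are hypothetical
names invented for illustration and are not part of expmed.c.  The
sketch assumes a multiply by 5 expanded as alg_m followed by a single
add step of the form accum = accum + (op0 << 2), performed in place.

```c
#include <assert.h>

/* Toy stand-in for a pseudo register (hypothetical, not GCC's rtx). */
typedef struct { long value; } pseudo;

/* Multiply *op0 by 5 with one in-place add step on the accumulator.
   When copy_first is zero, the accumulator aliases op0, which mimics
   using op0 directly instead of copying it to a fresh pseudo.  */
static long mult5(pseudo *op0, int copy_first)
{
    pseudo temp;
    pseudo *accum = op0;

    assert(op0 != 0);
    if (copy_first) {
        temp = *op0;        /* the reg-reg copy the patch wanted to drop */
        accum = &temp;
    }

    /* The add step writes its result back into the accumulator.  */
    accum->value = accum->value + (op0->value << 2);
    return accum->value;
}
```

In this model both variants return the correct product, but only the
copying variant leaves op0 intact.  Without the copy, op0 holds the
product afterwards, which is wrong if the caller still needs the
original value.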


Interestingly, for the test case in the PR, the second (and final)
step in the algorithm is alg_shift, so this approach is safe...


I think a better solution to this particular PR, which would also
further improve compile-time performance and memory usage, would
be to special case multiplications by powers of two at the start
of expand_mult, just before the current call to choose_mult_variant,
and instead directly call expand_shift (LSHIFT_EXPR, ...).  This
bypasses synth_mult (which is starting to show up in profiling)
for the common cases of multiplications by 1, 2, 4, 8, 16 etc...
expand_shift contains the necessary shift by addition optimizations
and should always be the solution ultimately chosen by synth_mult
for powers of two.  It also has the added benefit of avoiding the
reg-reg copy to a temporary accumulator.
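
The shape of that special case can be sketched in standalone C as
follows.  This is only an illustration on plain integers, not a patch
against expand_mult: exact_log2_sketch and mult_sketch are hypothetical
helpers, and the general multiply stands in for the choose_mult_variant
path.

```c
#include <assert.h>
#include <stdint.h>

/* Return n if coeff == 2^n, or -1 if coeff is not a power of two.  */
static int exact_log2_sketch(uint64_t coeff)
{
    int n = 0;

    if (coeff == 0 || (coeff & (coeff - 1)) != 0)
        return -1;
    while ((coeff >>= 1) != 0)
        n++;
    return n;
}

/* Multiply x by coeff, taking the shift shortcut for powers of two.
   *used_shift records which path was taken.  */
static uint64_t mult_sketch(uint64_t x, uint64_t coeff, int *used_shift)
{
    int n = exact_log2_sketch(coeff);

    assert(used_shift != 0);
    if (n >= 0) {
        *used_shift = 1;    /* fast path: no search for an algorithm */
        return x << n;
    }
    *used_shift = 0;        /* general path, standing in for synth_mult */
    return x * coeff;
}
```

For example, mult_sketch (7, 8, &flag) takes the shift path and yields
56, while a coefficient of 12 falls through to the general path.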


Sorry for not noticing the flaw in your original solution earlier.
Do you agree with the above analysis and/or benefits of the above
counter proposal?


Roger
--

