[PATCH][GCC7] Remove scaling of COMPONENT_REF/ARRAY_REF ops 2/3
Eric Botcazou
ebotcazou@adacore.com
Mon May 2 09:13:00 GMT 2016
> The following experiment resulted from looking at making
> array_ref_low_bound and array_ref_element_size non-mutating. Again
> I wondered why we do this strange scaling by offset/element alignment.
The idea is to expose the alignment factor to the RTL expander:
      tree tem
        = get_inner_reference (exp, &bitsize, &bitpos, &offset, &mode1,
                               &unsignedp, &reversep, &volatilep, true);
[...]
          rtx offset_rtx = expand_expr (offset, NULL_RTX, VOIDmode,
                                        EXPAND_SUM);
[...]
          op0 = offset_address (op0, offset_rtx,
                                highest_pow2_factor (offset));
With the scaling, offset is something like _69 * 4, so highest_pow2_factor can
see the factor, which is then passed down to offset_address:
(gdb) p debug_rtx(op0)
(mem/c:SI (plus:SI (reg/f:SI 193)
        (reg:SI 194)) [3 *s.16_63 S4 A32])
With your patch in the same situation:
(gdb) p debug_rtx(op0)
(mem/c:SI (plus:SI (reg/f:SI 139)
        (reg:SI 116 [ _33 ])) [3 *s.16_63 S4 A8])
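To make the difference concrete, here is a minimal standalone sketch of the
idea behind highest_pow2_factor. It is not the GCC implementation: the
struct expr type, the kind enumeration, the pow2_factor name and the cap of
128 are all assumptions made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, much-simplified expression node; not GCC's tree type.  */
enum kind { K_VAR, K_CONST, K_PLUS, K_MULT };

struct expr
{
  enum kind kind;
  unsigned long value;           /* Used by K_CONST only.  */
  const struct expr *op0, *op1;  /* Used by K_PLUS and K_MULT only.  */
};

/* Arbitrary cap on the returned factor, chosen for this sketch.  */
#define FACTOR_CAP 128UL

/* Return the largest power of 2 (at most FACTOR_CAP) that is known to
   divide the value of E.  An unknown variable only guarantees 1.  */
static unsigned long
pow2_factor (const struct expr *e)
{
  unsigned long c0, c1;

  assert (e != NULL);
  switch (e->kind)
    {
    case K_CONST:
      /* Zero is divisible by anything; otherwise isolate the lowest
         set bit of the constant.  */
      if (e->value == 0)
        return FACTOR_CAP;
      c0 = e->value & -e->value;
      return c0 < FACTOR_CAP ? c0 : FACTOR_CAP;

    case K_PLUS:
      /* A sum is only guaranteed the smaller of the two factors.  */
      c0 = pow2_factor (e->op0);
      c1 = pow2_factor (e->op1);
      return c0 < c1 ? c0 : c1;

    case K_MULT:
      /* The factors of a product multiply.  */
      c0 = pow2_factor (e->op0) * pow2_factor (e->op1);
      return c0 < FACTOR_CAP ? c0 : FACTOR_CAP;

    default:
      /* K_VAR: nothing is known about the value.  */
      return 1;
    }
}
```

In this model an offset of the form i * 4 yields 4, whereas a bare i yields
only 1, which mirrors the A32 vs A8 difference in the two dumps above.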
On strict-alignment targets, this makes a big difference, e.g. SPARC:
        ld      [%i4+%i5], %i0
vs
        ldub    [%i5+%i4], %g1
        sll     %g1, 24, %g1
        add     %i5, %i4, %i5
        ldub    [%i5+1], %i0
        sll     %i0, 16, %i0
        or      %i0, %g1, %i0
        ldub    [%i5+2], %g1
        sll     %g1, 8, %g1
        or      %g1, %i0, %g1
        ldub    [%i5+3], %i0
        or      %i0, %g1, %i0
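For readers less familiar with SPARC, the longer sequence is roughly
equivalent to the following C. This is only a sketch: the function name
load_be32_bytewise is made up for illustration, and the byte order is
big-endian to match the shift amounts in the assembly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assemble a 32-bit big-endian value from four single-byte loads.
   This works for any address, which is why it is what the expander
   falls back to when the MEM is only known to be byte-aligned (A8).  */
static uint32_t
load_be32_bytewise (const unsigned char *p)
{
  assert (p != NULL);
  return ((uint32_t) p[0] << 24)
         | ((uint32_t) p[1] << 16)
         | ((uint32_t) p[2] << 8)
         | (uint32_t) p[3];
}
```

The single ld does the same job in one instruction, but it is only valid
when the address is known to be 4-byte aligned.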
Now this is mitigated by a couple of things:
1. the above pessimization only happens on the RHS; on the LHS, the expander
calls highest_pow2_factor_for_target instead of highest_pow2_factor and the
former takes into account the type's alignment thanks to the MAX:
/* Similar, except that the alignment requirements of TARGET are
   taken into account.  Assume it is at least as aligned as its
   type, unless it is a COMPONENT_REF in which case the layout of
   the structure gives the alignment.  */

static unsigned HOST_WIDE_INT
highest_pow2_factor_for_target (const_tree target, const_tree exp)
{
  unsigned HOST_WIDE_INT talign = target_align (target) / BITS_PER_UNIT;
  unsigned HOST_WIDE_INT factor = highest_pow2_factor (exp);

  return MAX (factor, talign);
}
2. highest_pow2_factor can be rescued by the set_nonzero_bits machinery of
the SSA CCP pass because it calls tree_ctz. The above example was compiled
with -O -fno-tree-ccp on SPARC; at -O, the code isn't pessimized.
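Both mitigations can be modelled with a few lines of standalone C. Again
this is only a sketch under my own assumptions: factor_for_target and
ctz_from_nonzero_bits are hypothetical names, and the second function is a
loose model of what a known-nonzero-bits mask lets tree_ctz conclude, not
the actual GCC code.

```c
#include <assert.h>

/* Model of mitigation 1: the factor used for a store is never smaller
   than the target's own alignment, expressed in bytes.  */
static unsigned long
factor_for_target (unsigned long factor, unsigned long talign_bytes)
{
  return factor > talign_bytes ? factor : talign_bytes;
}

/* Model of mitigation 2: MASK has a 1 in every bit position that may be
   nonzero.  The number of trailing zero bits in MASK is the number of
   low bits of the value that are known to be zero.  */
static unsigned
ctz_from_nonzero_bits (unsigned long mask)
{
  unsigned n = 0;

  assert (mask != 0);
  while ((mask & 1) == 0)
    {
      mask >>= 1;
      n++;
    }
  return n;
}
```

With a factor of 1 and a 4-byte aligned target, factor_for_target gives 4.
Likewise, if the low two bits of the offset are known to be zero, the mask
has two trailing zeros, so a factor of 1 << 2 = 4 can be recovered even
though the multiplication by 4 is no longer visible in the expression.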
> So - the following patch gets rid of that scaling. For a "simple"
> C testcase
>
> void bar (void *);
> void foo (int n)
> {
>   struct S { struct R { int b[n]; } a[2]; int k; } s;
>   s.k = 1;
>   s.a[1].b[7] = 3;
>   bar (&s);
> }
This only exposes the LHS case; here's a more complete testcase:
void bar (void *);

int foo (int n)
{
  struct S { struct R { char b[n]; } a[2]; int k; } s;
  s.k = 1;
  s.a[1].b[7] = 3;
  bar (&s);
  return s.k;
}
--
Eric Botcazou