autoinc / postinc not used
stefan@franke.ms
Tue Jan 16 20:02:38 GMT 2024
Hi all,
I work a lot with the good old m68k target, which supports post-increment addressing, and I was surprised that the generated code uses almost no post-increments.
This simple code:
void memclr (int length, long * ptr) {
for(;length--;){
*ptr++= 0;
}
}
does not use post-increments on AVR or SH. See also https://godbolt.org/z/fTvdv65rr
On M68K a post-increment appears:
memclr:
move.l 4(%sp),%d1
move.l 8(%sp),%a0
move.l %d1,%d0
subq.l #1,%d0
tst.l %d1
jeq .L1
.L3:
clr.l (%a0)+
dbra %d0,.L3
clr.w %d0
subq.l #1,%d0
jcc .L3
.L1:
rts
If you change the code and add a 2nd statement to the loop:
void memclr (int length, long * ptr) {
for(;length--;){
*ptr++= 0;
*ptr++= 0;
}
}
the post-increment disappears:
memclr:
move.l 4(%sp),%d1
move.l 8(%sp),%a0
move.l %d1,%d0
subq.l #1,%d0
tst.l %d1
jeq .L1
.L3:
clr.l (%a0)
addq.l #8,%a0
clr.l -4(%a0)
dbra %d0,.L3
clr.w %d0
subq.l #1,%d0
jcc .L3
.L1:
rts
This is caused by several unfortunate conversions/optimizations. Here comes the first:
The gimplification pass lowers each post-increment by computing the next pointer before the current pointer is used, which looks like this:
ptr.0 = ptr;
ptr = ptr.0 + 4;
*ptr.0 = 0;
ptr.1 = ptr;
ptr = ptr.1 + 4;
*ptr.1 = 0;
In the following passes this gets optimized further, but the addition always stays in front of the last zero assignment and ends up as a +8. Since the +8 does not match the 4-byte access size, the first post-increment is lost as well, and the last zero assignment is done with offset -4. That explains the generated code.
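To make the ordering concrete, here is a small C sketch of the two statement orders for the loop body. It is only a model: the function names are made up for illustration, and `+ 1` on a `long *` stands in for the `+ 4` bytes of the dump (on m68k a long is 4 bytes). Both orders clear the same two elements and return the same pointer, so moving the add behind the store should not change the result for this simple pattern.

```c
#include <assert.h>
#include <stddef.h>

/* Order as in the GIMPLE dump above: the pointer is advanced
   before the store through the saved old value.
   (Hypothetical helper name, for illustration only.)  */
static long *clear2_add_first(long *ptr)
{
    long *ptr0 = ptr;
    ptr = ptr0 + 1;     /* ptr = ptr.0 + 4 (bytes on m68k) */
    *ptr0 = 0;          /* *ptr.0 = 0 */
    long *ptr1 = ptr;
    ptr = ptr1 + 1;     /* ptr = ptr.1 + 4 */
    *ptr1 = 0;          /* *ptr.1 = 0 */
    return ptr;
}

/* Reordered form: store first, then advance.  Each store is
   directly followed by an add of the access size, which is the
   shape a post-increment addressing mode corresponds to.
   (Hypothetical helper name, for illustration only.)  */
static long *clear2_store_first(long *ptr)
{
    long *ptr0 = ptr;
    *ptr0 = 0;
    ptr = ptr0 + 1;
    long *ptr1 = ptr;
    *ptr1 = 0;
    ptr = ptr1 + 1;
    return ptr;
}
```

Whether the later passes actually turn the second form into two `clr.l (%a0)+` is a separate matter; the sketch only shows that the two orders are equivalent at the source level.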
Now here comes my question:
Is there a more conforming/easier/better way to swap the generated GIMPLE statements than patching gimplify_modify_expr to check for assignment pairs where the pointer add can be moved behind the memory assignment?
My hack is ugly:
/* Look at the last two statements emitted into *pre_p.  */
gimple * p2 = gimple_seq_last_stmt(*pre_p);
if (p2->code == GIMPLE_ASSIGN && p2->prev && p2->prev != p2)
  {
    gimple * p1 = p2->prev;
    if (p1->code == GIMPLE_ASSIGN)
      {
        tree b = gimple_assign_lhs(p1);
        tree x1 = gimple_assign_lhs(p2);
        tree x2 = gimple_assign_rhs1(p2);
        if (b != x2 && (TREE_CODE(b) == VAR_DECL || TREE_CODE(x2) == VAR_DECL || TREE_CODE(b) == PARM_DECL || TREE_CODE(x2) == PARM_DECL) &&
            ((TREE_CODE(x1) == VAR_DECL && TREE_CODE(x2) == MEM_REF && TREE_OPERAND(x2, 0) != b
              && (TREE_CODE(TREE_OPERAND(x2, 0)) == VAR_DECL || TREE_CODE(TREE_OPERAND(x2, 0)) == PARM_DECL)) ||
             (TREE_CODE(x1) == MEM_REF && (TREE_CODE(x2) == INTEGER_CST || (TREE_CODE(x2) == VAR_DECL && TREE_OPERAND(x1, 0) != b)))
             && (TREE_CODE(TREE_OPERAND(x1, 0)) == VAR_DECL || TREE_CODE(TREE_OPERAND(x1, 0)) == PARM_DECL)))
          {
            /* Unlink p1 and re-insert it behind p2.  */
            gimple_stmt_iterator to = gsi_last (*pre_p);
            gimple_stmt_iterator from = to;
            from.ptr = p1;
            gsi_remove (&from, false);
            gsi_insert_after (&to, p1, GSI_NEW_STMT);
          }
      }
  }
(More modifications are necessary to create better code, but it's possible.)
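The idea behind the hack can also be sketched outside of GCC as a toy model. Nothing here is GCC API: `toy_kind`, `toy_stmt` and `toy_sink_adds` are hypothetical names, variables are plain integer ids, and the model assumes that a store through a pointer never modifies the pointer variables themselves. It swaps an adjacent (pointer add, store) pair whenever the store does not read the variable the add defines.

```c
#include <stddef.h>

/* Toy statement kinds; these are NOT GCC's gimple codes.  */
enum toy_kind { TOY_COPY, TOY_PTR_ADD, TOY_STORE };

struct toy_stmt {
    enum toy_kind kind;
    int def;   /* variable written, or -1 for a store to memory */
    int use;   /* variable read (for a store: the address variable) */
};

/* Move each pointer add behind a directly following store, as long
   as the store does not use the variable defined by the add.
   Returns the number of swaps performed.  */
static size_t toy_sink_adds(struct toy_stmt *s, size_t n)
{
    size_t swaps = 0;
    for (size_t i = 0; i + 1 < n; i++) {
        if (s[i].kind == TOY_PTR_ADD
            && s[i + 1].kind == TOY_STORE
            && s[i + 1].use != s[i].def) {
            struct toy_stmt tmp = s[i];
            s[i] = s[i + 1];
            s[i + 1] = tmp;
            swaps++;
            i++;   /* skip the add that was just moved */
        }
    }
    return swaps;
}
```

Fed with the six statements from the dump above (ptr = 0, ptr.0 = 1, ptr.1 = 2), this yields copy, store, add, copy, store, add. The real patch has to check far more (tree codes, operands, aliasing), so this is only meant to illustrate the intended reordering.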
Thanks
Stefan