Re: setmemsi, movmemsi and post_inc
Jeff Law
jeffreyalaw@gmail.com
Fri Mar 26 01:25:47 GMT 2021
On 3/25/2021 2:16 PM, Stefan Franke wrote:
>
> At least it seems possible to use auto_inc inside an emitted loop, since that yields a separate bb...
I wouldn't rely on that.
>
> Loop unrolling and auto_inc (post_inc) do not play well together; there are two issues. Consider these mem refs, with mode size 4:
> a[0] = ...
> a[4] = ...
> a[8] = ...
> a[12] = ...
>
> loop unrolling does something like
> b = a
> b[0] = ...
> b[4] = ...
> b[8] = ...
> b[12] = ...
> b = b + 16
> b[0] = ...
> ...
>
> 1. CSE folds the memory refs from b back to a, but not the one at [4]:
> b = a
> a[0] = ...
> b[4] = ...
> a[8] = ...
> a[12] = ...
> ...
> And you end up with one post_inc at the beginning and none for the rest.
So that argues that you want to fix up CSE and possibly your port's
costing model.
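
For readers following along, the two store shapes under discussion can be written out in C. This is only a minimal sketch: the function names, the uint32_t element type and the unroll factor of 4 are assumptions for illustration, not code from GCC or from Stefan's port. Whether a target actually selects post_inc for the second form depends on the port and its costs.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: the unrolled shape, four stores at fixed offsets
   from one base (byte offsets 0, 4, 8, 12), then one add of 16 bytes. */
void fill_indexed(uint32_t *a, size_t blocks, uint32_t v)
{
    for (size_t i = 0; i < blocks; i++) {
        a[0] = v;   /* b[0]  */
        a[1] = v;   /* b[4]  */
        a[2] = v;   /* b[8]  */
        a[3] = v;   /* b[12] */
        a += 4;     /* b = b + 16, counted in bytes */
    }
}

/* Hypothetical helper: the same stores in post-increment form, where
   every store also advances the pointer by the mode size. */
void fill_postinc(uint32_t *a, size_t blocks, uint32_t v)
{
    for (size_t i = 0; i < blocks; i++) {
        *a++ = v;
        *a++ = v;
        *a++ = v;
        *a++ = v;
    }
}
```

Both functions write the same memory; they differ only in how the address is formed, which is the part CSE and the cost model influence.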
>
> My workaround here is to consider DF_REG_USE_COUNT and DF_REG_DEF_COUNT to decide if b[4] should be folded too:
>   if (DF_REG_USE_COUNT(REGNO(folded_arg0)) <= 2 || DF_REG_DEF_COUNT(REGNO(folded_arg0)) > 1)
>     break;
>
> 2. auto-inc-dec does not yet handle mem refs with an offset that are followed by a matching add.
> Since the above pattern, as insns, looks like
> b = a + x
> *b = ...
> a = a + x + 4
> and is detected as PRE_ADD, I convert it into
> a = a + x
> *a = ...
> a = a + 4
> which is now a POST_INC. I then update the variables, and auto-inc-dec generates post-increments all the way up to the top, where x becomes zero.
>
> => there is room for improvements^^
Yup, and that's where I'd focus my efforts. Emitting auto-increment
addressing modes at a point where the compiler isn't expecting them is
just asking for trouble. It may work today, it may work for a decade,
but experience shows that if you break the rules it will break one day.
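
As a sanity check on the rewrite quoted in point 2, here is a small C model of the two insn sequences. It is a sketch under stated assumptions: addresses are modelled as byte offsets into a buffer, the mode size is 4, and struct seq_result, run_pre_add and run_post_inc are hypothetical names, not anything from auto-inc-dec. It only shows that both sequences store to the same address and leave a with the same value; it does not check that b is dead afterwards, which a real pass would have to establish.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Outcome of one sequence: where the store landed and the final a. */
struct seq_result {
    size_t store_off;   /* byte offset that was written */
    size_t a_final;     /* value of a after the sequence */
};

/* Original shape:  b = a + x;  *b = ...;  a = a + x + 4 */
struct seq_result run_pre_add(unsigned char *mem, size_t a, size_t x,
                              uint32_t v)
{
    size_t b = a + x;
    memcpy(mem + b, &v, sizeof v);
    a = a + x + 4;
    return (struct seq_result){ b, a };
}

/* Rewritten shape:  a = a + x;  *a = ...;  a = a + 4
   The store plus the final add is what a post-increment expresses. */
struct seq_result run_post_inc(unsigned char *mem, size_t a, size_t x,
                               uint32_t v)
{
    a = a + x;
    size_t off = a;
    memcpy(mem + a, &v, sizeof v);
    a = a + 4;
    return (struct seq_result){ off, a };
}
```

With x equal to zero the leading add disappears and only the post-increment store remains, which matches the "up to the top" case described in the quoted text.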
jeff