[PATCH] RISC-V: Enable overlap-by-pieces in case of fast unaligned access

Christoph Müllner cmuellner@gcc.gnu.org
Tue Nov 2 21:18:43 GMT 2021


On Tue, Nov 2, 2021 at 9:15 PM Vineet Gupta <vineetg@rivosinc.com> wrote:
>
>
>
> On 11/2/21 1:09 PM, Christoph Müllner wrote:
> >>>> Without overlap_op_by_pieces we get:
> >>>>     8e:   00053023                sd      zero,0(a0)
> >>>>     92:   00052423                sw      zero,8(a0)
> >>>>     96:   00051623                sh      zero,12(a0)
> >>>>     9a:   00050723                sb      zero,14(a0)
> >> To generate even the non-optimized code above with gcc 11 [1][2], what
> >> do I need to do? Despite -mno-strict-align and trying -mtune={rocket,
> >> sifive-7-series}, I only get the fully unrolled version.
> > You need a tuning struct with slow_unaligned_access == false.
> > Both Rocket and SiFive 7 have slow_unaligned_access set to true.
> > On mainline you have thead-c906, which would work.
>
> But doesn't -mno-strict-align imply that?

No, it works in the opposite direction.
With `-mno-strict-align`, the emitted code might contain unaligned accesses,
but only if `slow_unaligned_access == false`.
Even if `slow_unaligned_access == false`, `-mstrict-align` will
prevent unaligned accesses.
Usually, there is a good reason why `slow_unaligned_access` is set to
`true` (e.g. a significant penalty in case of unaligned accesses),
so it wouldn't make sense to overrule this.
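
To make the interaction concrete, here is a minimal sketch of the rule
described above. It is a simplified model and not the actual GCC source:
the function name `may_emit_unaligned` and its parameters are made up for
illustration. The first argument stands for the tuning struct's
`slow_unaligned_access` field, the second for `-mstrict-align`.

```c
#include <stdbool.h>

/*
 * Hypothetical helper (not a GCC function): models whether the compiler
 * may emit unaligned accesses.
 *
 *   tune_slow_unaligned_access: slow_unaligned_access from the tuning struct
 *   strict_align:               true for -mstrict-align,
 *                               false for -mno-strict-align
 *
 * Unaligned accesses are only allowed if the tuning says they are fast
 * AND strict alignment was not requested. Either setting alone is enough
 * to forbid them.
 */
static bool may_emit_unaligned(bool tune_slow_unaligned_access,
                               bool strict_align)
{
    return !(tune_slow_unaligned_access || strict_align);
}
```

Under this model, `-mno-strict-align` combined with `-mtune=rocket`
(slow unaligned access set to `true`) corresponds to
`may_emit_unaligned(true, false)`, which is false. That matches the
observation that the flag alone does not change the generated code.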


>
> Thx,
> -Vineet

