This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

[Bug target/57339] New: [SH] Wrong ISR FPU register save/restore


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57339

            Bug ID: 57339
           Summary: [SH] Wrong ISR FPU register save/restore
           Product: gcc
           Version: 4.8.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: olegendo at gcc dot gnu.org
                CC: kkojima at gcc dot gnu.org
            Target: sh*-*-*

The following test case

extern void foo (void);

#pragma interrupt
#pragma nosave_low_regs
void
isr (void)
{
  foo ();
}

with -O2 -m4 -ml compiles to:

isr:
        fmov.s    fr0,@-r15
        fmov.s    fr1,@-r15
        fmov.s    fr2,@-r15
        fmov.s    fr3,@-r15
        fmov.s    fr4,@-r15
        fmov.s    fr5,@-r15
        fmov.s    fr6,@-r15
        fmov.s    fr7,@-r15
        fmov.s    fr8,@-r15
        fmov.s    fr9,@-r15
        fmov.s    fr10,@-r15
        fmov.s    fr11,@-r15
        sts.l    mach,@-r15
        sts.l    macl,@-r15
        sts.l    fpul,@-r15
        sts.l    fpscr,@-r15
        sts.l    pr,@-r15
        mov.l    .L3,r1
        lds.l    @r1+,fpscr      ! 57    fpu_switch/2    [length = 2]
        mov.l    .L4,r1
        jsr    @r1
        nop
    lds.l    @r15+,pr
        lds.l    @r15+,fpscr     ! 38    fpu_switch/2    [length = 2]
        lds.l    @r15+,fpul
        lds.l    @r15+,macl
        lds.l    @r15+,mach
        fmov.s    @r15+,fr11
        fmov.s    @r15+,fr10
        fmov.s    @r15+,fr9
        fmov.s    @r15+,fr8
        fmov.s    @r15+,fr7
        fmov.s    @r15+,fr6
        fmov.s    @r15+,fr5
        fmov.s    @r15+,fr4
        fmov.s    @r15+,fr3
        fmov.s    @r15+,fr2
        fmov.s    @r15+,fr1
        rte
        fmov.s    @r15+,fr0


The FPU register save/restore insns in the ISR epilogue and prologue are wrong
since they refer to the FPSCR setting at the point of the interruption, but the
FPSCR setting is not known at ISR function entry.

For example, if the interrupted context's FPSCR has FPSCR.SZ = 1 (64 bit fmov)
the code above will not work properly and is most likely going to result in an
address error.

A better way would be to do something like...

isr:
                                ! FPSCR.SZ, FPSCR.PR, FPSCR.FR of interrupted
                                ! context is not known.
        mov     r15,r0
        or      #7,r0
        xor     #7,r0
        mov     r0,r15          ! r15 = 64 bit aligned stack ptr
        mova   .Lvariables,r0

        sts.l   fpscr,@-r15     ! push FPSCR of interrupted context
        add     #-4,r15         ! keep r15 64 bit aligned

        lds.l   @r0+,fpscr      ! FPSCR.SZ = 1, FPSCR.PR = 0, FPSCR.FR = 0
                                ! if FPSCR.FR was 1 we have now switched the
                                ! FP reg banks and will be saving XF regs.
                                ! that's OK if the ISR code does not switch
                                ! FPSCR.FR.
        fmov.d  dr0,@-r15
        fmov.d  dr2,@-r15
        fmov.d  dr4,@-r15
        fmov.d  dr6,@-r15
        fmov.d  dr8,@-r15
        fmov.d  dr10,@-r15

        sts.l    fpscr,@-r15     ! push FPSCR setting for restoring FP regs
                                ! before ISR returns.

        lds.l   @r0+,fpscr      ! default FPSCR setting for ISRs
                                ! FPSCR.PR must be kept = 0
        sts.l   fpul,@-r15
        sts.l   mach,@-r15
        sts.l   macl,@-r15
        sts.l   pr,@-r15

        ...
        < ISR code >
        ...

        lds.l   @r15+,pr
        lds.l   @r15+,macl
        lds.l   @r15+,mach
        lds.l   @r15+,fpul
        lds.l   @r15+,fpscr     ! pop FPSCR setting for FP reg restore
                                ! FPSCR.SZ = 1, FPSCR.PR = 0, FPSCR.FR = 0
        fmov.d  @r15+,dr10
        fmov.d  @r15+,dr8
        fmov.d  @r15+,dr6
        fmov.d  @r15+,dr4
        fmov.d  @r15+,dr2
        fmov.d  @r15+,dr0
        add     #4,r15
        rte
        lds.l   @r15+,fpscr     ! restore FPSCR of interrupted context
                                ! this potentially switches FPSCR.FR

        .align 2
.Lvariables:
        .long   1 << 20         ! FPSCR.SZ = 1, FPSCR.PR = 0, FPSCR.FR = 0
        .long   <default FPSCR value for ISR>  ! FPSCR.FR must be = 0


Some remarks:

---------
This insn:
        add     #-4,r15         ! keep r15 64 bit aligned

can be replaced with one of the push insns that follow the fmov.d pushes.

---------
The above won't work for SH2E and SH3E.  Those need their own sequence.

---------
The stack alignment can be omitted if it has been aligned already somehow else,
e.g. by specifying 'sp_switch' function attribute.  On SH3* and SH4* the ISR
vectors are grouped, so there is always some common code that is executed
before it invokes the actual ISR.  That code might as well align the stack.

---------
An alternative stack alignment sequence could be:

        mova   .Lvariables,r0
        mov.l   @r0+,r1
        and     r1,r15

        ...

        .align 2
.Lvariables:
        .long   0xFFFFFFF8
        .long   1 << 20         ! FPSCR.SZ = 1, FPSCR.PR = 0, FPSCR.FR = 0
        .long   <default FPSCR value for ISR>  ! FPSCR.FR must be = 0

However, this is probably a bit slower.


---------
On SH4, if the ISR code path changes the FPSCR.FR setting, the FP regs will get
scrambled.  ISR code must not modify the FPSCR.FR setting.  This restriction
could be removed by also pushing/popping XF regs (additional 8 push + 8 pop
insns).

---------
On SH2A and SH2E R0 is not a banked register and must be pushed before dealing
with the FP regs.


I'm not sure how this should be implemented in the compiler, since currently
there are some issues with FP mode switching and fmov.d support, which probably
would need to be fixed first.
One idea for now would be to emit fixed ISR prologue / epilogue asm blocks that
deal with the FP regs, if FP regs need to be saved/restored for an ISR.

Kaz, do you happen to have any suggestions/advices/opinions?


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]