x32 is a 64-bit process with 32-bit software pointer and kernel may place x32 shadow stack above 4GB. We need to save and restore 64-bit shadow stack register for x32: [hjl@gnu-tools-1 tmp]$ cat x.c void *buf[5]; int execute(void) { if (__builtin_setjmp (buf) == 0) return 0; else return 1; } [hjl@gnu-tools-1 tmp]$ /usr/gcc-8.0.0-x32/bin/gcc -O2 -fcf-protection -mcet x.c -S -mx32 [hjl@gnu-tools-1 tmp]$ cat x.s .file "x.c" .text .p2align 4,,15 .globl execute .type execute, @function execute: .LFB0: .cfi_startproc endbr64 .L2: endbr64 xorl %eax, %eax movl %esp, buf(%rip) movl $.L2, buf+4(%rip) movl %esp, buf+8(%rip) rdsspd %eax <<<<<<<<<< Only 32-bit shadow stack register is saved. movl %eax, buf+12(%rip) xorl %eax, %eax ret .cfi_endproc .LFE0: .size execute, .-execute .comm buf,20,16 Since builtin jmp buf size is 5 pointers. We have space to save 64-bit shadow stack pointers: 32-bit SP, 32-bit FP, 32-bit IP, 64-bit SSP.
Created attachment 43274 [details] x32 patch
updated __builtin_setjmp and __builtin_longjmp to use 64bit instructions and registers. The patch is attached.
(In reply to igor.v.tsimbalist from comment #1) > Created attachment 43274 [details] > x32 patch machine_mode mode; - mem = gen_rtx_MEM (Pmode, plus_constant (Pmode, operands[0], - 3 * GET_MODE_SIZE (Pmode))); - reg_ssp = gen_reg_rtx (Pmode); + mode = (TARGET_X32 || TARGET_64BIT) ? DImode : SImode; Just replace Pmode with word_mode and use word_mode to save/restore SSP. Please also add comments to show why it is needed: x32 is a 64-bit process with 32-bit software pointer and kernel may place x32 shadow stack above 4GB. We need to save and restore 64-bit shadow stack register for x32 and why it is OK: builtin jmp buf size is 5 pointers. We have space to save 64-bit shadow stack pointers: 32-bit SP, 32-bit FP, 32-bit IP, 64-bit SSP for x32.
Created attachment 43280 [details] updated patch
(In reply to igor.v.tsimbalist from comment #4) > Created attachment 43280 [details] > updated patch - mem = gen_rtx_MEM (Pmode, plus_constant (Pmode, operands[0], - 3 * GET_MODE_SIZE (Pmode))); - reg_ssp = gen_reg_rtx (Pmode); + mem = gen_rtx_MEM (word_mode, plus_constant (Pmode, operands[0], + 3 * GET_MODE_SIZE (Pmode))); The first 3 fields are SP, FP and IP, which are in ptr_mode, not Pmode. /* Compute the numebr of frames to adjust. */ + reg_adj = gen_rtx_SUBREG (Pmode, reg_ssp, 0); reg_ssp must be in word_mode, not in Pmode. Please show the assembly outputs of __builtin_setjmp and __builtin_longjmp with the updated patch.
(In reply to H.J. Lu from comment #5) > (In reply to igor.v.tsimbalist from comment #4) > > Created attachment 43280 [details] > > updated patch > > - mem = gen_rtx_MEM (Pmode, plus_constant (Pmode, operands[0], > - 3 * GET_MODE_SIZE (Pmode))); > - reg_ssp = gen_reg_rtx (Pmode); > + mem = gen_rtx_MEM (word_mode, plus_constant (Pmode, operands[0], > + 3 * GET_MODE_SIZE (Pmode))); > > The first 3 fields are SP, FP and IP, which are in ptr_mode, not Pmode. ok. > /* Compute the numebr of frames to adjust. */ > + reg_adj = gen_rtx_SUBREG (Pmode, reg_ssp, 0); > > reg_ssp must be in word_mode, not in Pmode. reg_ssp is word_mode. It's reg_adj that is Pmode (it's increment to shadow stack). > Please show the assembly outputs of __builtin_setjmp and __builtin_longjmp > with the updated patch. A snippet for __builtin_longjmp for -mx32 movl $0, %eax rdsspq %rax subq buf+12(%rip), %rax je .L2 negl %eax shrl $2, %eax cmpl $255, %eax jbe .L3 .L4: incsspq %rax subl $255, %eax cmpl $255, %eax ja .L4 .L3: incsspq %rax .L2: Snippet for __builtin_setjmp for -mx32 movl %eax, buf(%rip) movl $.L8, buf+4(%rip) movl %esp, buf+8(%rip) movl $0, %eax rdsspq %rax movq %rax, buf+12(%rip)
(In reply to igor.v.tsimbalist from comment #6) > > > > reg_ssp must be in word_mode, not in Pmode. > > reg_ssp is word_mode. It's reg_adj that is Pmode (it's increment to shadow > stack). OK. > > Please show the assembly outputs of __builtin_setjmp and __builtin_longjmp > > with the updated patch. > > A snippet for __builtin_longjmp for -mx32 > > movl $0, %eax Please use xor. > rdsspq %rax > subq buf+12(%rip), %rax > je .L2 > negl %eax > shrl $2, %eax ^^^^^^^^^^^^^^^^^^^^^ Shouldn't be "shrl $3, %eax" since SSP is 64-bit? > cmpl $255, %eax > jbe .L3 > .L4: > incsspq %rax > subl $255, %eax > cmpl $255, %eax > ja .L4 > .L3: > incsspq %rax > .L2: > > Snippet for __builtin_setjmp for -mx32 > > movl %eax, buf(%rip) > movl $.L8, buf+4(%rip) > movl %esp, buf+8(%rip) > movl $0, %eax xor. > rdsspq %rax > movq %rax, buf+12(%rip)
Author: itsimbal Date: Fri Feb 2 10:06:39 2018 New Revision: 257326 URL: https://gcc.gnu.org/viewcvs?rev=257326&root=gcc&view=rev Log: PR84066 Wrong shadow stack register size is saved for x32 x32 is a 64-bit process with 32-bit software pointer and kernel may place x32 shadow stack above 4GB. We need to save and restore 64-bit shadow stack register for x32. builtin jmp buf size is 5 pointers. We have space to save 64-bit shadow stack pointers: 32-bit SP, 32-bit FP, 32-bit IP, 64-bit SSP for x32. PR target/84066 * gcc/config/i386/i386.md: Replace Pmode with word_mode in builtin_setjmp_setup and builtin_longjmp to support x32. * gcc/testsuite/gcc.target/i386/cet-sjlj-6a.c: New test. * gcc/testsuite/gcc.target/i386/cet-sjlj-6b.c: Likewise. Added: trunk/gcc/testsuite/gcc.target/i386/cet-sjlj-6a.c trunk/gcc/testsuite/gcc.target/i386/cet-sjlj-6b.c Modified: trunk/gcc/ChangeLog trunk/gcc/config/i386/i386.md trunk/gcc/testsuite/ChangeLog
Fixed.