Bug 110996 - RISC-V vector Fortran: SEGV ICE during parsing
Summary: RISC-V vector Fortran: SEGV ICE during parsing
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: fortran (show other bugs)
Version: 14.0
: P3 normal
Target Milestone: 14.0
Assignee: Mikael Morin
URL:
Keywords: error-recovery
Depends on:
Blocks:
 
Reported: 2023-08-11 15:59 UTC by Jeremy Bennett
Modified: 2023-09-15 17:27 UTC (History)
2 users (show)

See Also:
Host: x86_64-linux-gnu
Target: riscv64-unknown-linux-gnu
Build: x86_64-linux-gnu
Known to work:
Known to fail:
Last reconfirmed: 2023-08-17 00:00:00


Attachments
Fortran 90 source code for the test case (59 bytes, text/plain)
2023-08-11 15:59 UTC, Jeremy Bennett
Details
Draft patch, to be cleaned up. (3.94 KB, patch)
2023-08-26 20:04 UTC, Mikael Morin
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Jeremy Bennett 2023-08-11 15:59:40 UTC
Created attachment 55726 [details]
Fortran 90 source code for the test case

This appears to be a failure in the Fortran parser, which only occurs if the RISC-V vector ISA extension is specified.

This is the reproducer (testcase.f90):

  CONTAINS
   SUBROUTINE a
   END
   SUBROUTINE b
   END
   SUBROUTINE c(d) e
SUBROUTINE f
      END
END

(which has a syntax error with the stray "e" at the end of the declaration of "SUBROUTINE c").

This is compiled with:

riscv64-unknown-linux-gnu-gfortran -march=rv64gcv -mabi=lp64d -c \
    -Ofast testcase.f90

NOTE. The error only occurs if the "v" extensions is specified in the "-march" string.

The output is:

testcase.f90:6:20:

    6 |    SUBROUTINE c(d) e
      |                    1
Error: Syntax error in SUBROUTINE statement at (1)
f951: internal compiler error: Segmentation fault
0x112a263 crash_signal
	/home/jeremy/gittrees/mustang/gcc/gcc/toplev.cc:314
0x7fdd1203c4af ???
	./signal/../sysdeps/unix/sysv/linux/x86_64/libc_sigaction.c:0
0xaaea0a resolve_symbol
	/home/jeremy/gittrees/mustang/gcc/gcc/fortran/resolve.cc:16643
0xadadc2 do_traverse_symtree
	/home/jeremy/gittrees/mustang/gcc/gcc/fortran/symbol.cc:4190
0xab95ed resolve_types
	/home/jeremy/gittrees/mustang/gcc/gcc/fortran/resolve.cc:17963
0xab9697 resolve_types
	/home/jeremy/gittrees/mustang/gcc/gcc/fortran/resolve.cc:17977
0xac08ec gfc_resolve(gfc_namespace*)
	/home/jeremy/gittrees/mustang/gcc/gcc/fortran/resolve.cc:18083
0xa9f250 resolve_all_program_units
	/home/jeremy/gittrees/mustang/gcc/gcc/fortran/parse.cc:6909
0xa9f250 gfc_parse_file()
	/home/jeremy/gittrees/mustang/gcc/gcc/fortran/parse.cc:7165
0xaf3533 gfc_be_parse_file
	/home/jeremy/gittrees/mustang/gcc/gcc/fortran/f95-lang.cc:229
Please submit a full bug report, with preprocessed source (by using -freport-bug).
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.

System information
------------------

Using built-in specs.
COLLECT_GCC=riscv64-unknown-linux-gnu-gfortran
COLLECT_LTO_WRAPPER=/home/jeremy/gittrees/mustang/install/libexec/gcc/riscv64-unknown-linux-gnu/14.0.0/lto-wrapper
Target: riscv64-unknown-linux-gnu
Configured with: /home/jeremy/gittrees/mustang/gcc/configure --target=riscv64-unknown-linux-gnu --prefix=/home/jeremy/gittrees/mustang/install --with-sysroot=/home/jeremy/gittrees/mustang/install/sysroot --with-pkgversion=g68783211f66 --with-system-zlib --enable-shared --enable-tls --enable-languages=c,c++,fortran --disable-libmudflap --disable-libssp --disable-libquadmath --disable-libsanitizer --disable-nls --disable-bootstrap --src=/home/jeremy/gittrees/mustang/gcc --enable-multilib --with-abi=lp64d --with-arch=rv64gc --with-tune= --with-isa-spec=20191213 'CFLAGS_FOR_TARGET=-O2    -mcmodel=medany' 'CXXFLAGS_FOR_TARGET=-O2    -mcmodel=medany'
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 14.0.0 20230811 (experimental) (g68783211f66)
Comment 1 JuzheZhong 2023-08-12 02:35:06 UTC
I don't think this is RISC-V target specific issue, at leas it should not be
the RVV specific issue:

[jzzhong@rios-cad5:/work/home/jzzhong/work/insn]$~/work/rvv-opensource/output/gcc-rv64/bin/riscv64-rivai-elf-gfortran -march=rv64gc -mabi=lp64d -Ofast pr110996.f90
pr110996.f90:6:20:

    6 |    SUBROUTINE c(d) e
      |                    1
Error: Syntax error in SUBROUTINE statement at (1)
f951: internal compiler error: Segmentation fault
0x17bd167 crash_signal
        ../../../riscv-gcc/gcc/toplev.cc:314
0xe9d659 resolve_symbol
        ../../../riscv-gcc/gcc/fortran/resolve.cc:16643
0xec5dfd do_traverse_symtree
        ../../../riscv-gcc/gcc/fortran/symbol.cc:4190
0xec5eb8 gfc_traverse_ns(gfc_namespace*, void (*)(gfc_symbol*))
        ../../../riscv-gcc/gcc/fortran/symbol.cc:4215
0xea0421 resolve_types
        ../../../riscv-gcc/gcc/fortran/resolve.cc:17963
0xea04d9 resolve_types
        ../../../riscv-gcc/gcc/fortran/resolve.cc:17977
0xea0915 gfc_resolve(gfc_namespace*)
        ../../../riscv-gcc/gcc/fortran/resolve.cc:18083
0xe6bd6b resolve_all_program_units
        ../../../riscv-gcc/gcc/fortran/parse.cc:6909
0xe6c558 gfc_parse_file()
        ../../../riscv-gcc/gcc/fortran/parse.cc:7165
0xed11d9 gfc_be_parse_file
        ../../../riscv-gcc/gcc/fortran/f95-lang.cc:229
Please submit a full bug report, with preprocessed source (by using -freport-bug).
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.


It can be reproduced without 'v' extension in '-march'.
Comment 2 anlauf 2023-08-13 19:35:06 UTC
I don't see an ICE on x86_64-pc-linux-gnu, but valgrind shows several
invalid reads, even on e.g. a shorter version:

PROGRAM p
CONTAINS
  SUBROUTINE c(d) e
  SUBROUTINE f
  END
END

We also have a couple of other PRs on error recovery on weird syntax errors.
Comment 3 Jeremy Bennett 2023-08-14 14:56:51 UTC
@JuzheZhong I believe this is in someway related to RVV.  If I remove `v' from the march:

riscv64-unknown-linux-gnu-gfortran -march=rv64gc -mabi=lp64d -c     -Ofast testcase.f90

The output I get is correct:

testcase.f90:6:20:

    6 |    SUBROUTINE c(d) e
      |                    1
Error: Syntax error in SUBROUTINE statement at (1)

Why does adding `v' to the -march string cause a SEGV?
Comment 4 JuzheZhong 2023-08-15 01:54:48 UTC
(In reply to Jeremy Bennett from comment #3)
> @JuzheZhong I believe this is in someway related to RVV.  If I remove `v'
> from the march:
> 
> riscv64-unknown-linux-gnu-gfortran -march=rv64gc -mabi=lp64d -c     -Ofast
> testcase.f90
> 
> The output I get is correct:
> 
> testcase.f90:6:20:
> 
>     6 |    SUBROUTINE c(d) e
>       |                    1
> Error: Syntax error in SUBROUTINE statement at (1)
> 
> Why does adding `v' to the -march string cause a SEGV?

I didn't reproduce the issue. What I see is GCC has ICE even without 'v' in 
-march. And I have no idea for it.
Comment 5 Mikael Morin 2023-08-17 19:49:57 UTC
Confirmed on the reduced example from comment #2.

The problem arises because of the bogus subroutine statement:
   SUBROUTINE c(d) e

This is rejected and symbols C and D should be released (no symbol has been built for E at the time the statement is rejected).
D is released, but C is not, and it keeps a pointer to D through it's formal field. BOOM.

The reason C is not released is its reference count is 2 and the condition to break cycles at the beginning of gfc_release_symbol doesn't hold:

3118  if (sym->formal_ns != NULL && sym->refs == 2 && sym->formal_ns != sym->ns
3119      && (!sym->attr.entry || !sym->module))

Here sym->formal_ns is NULL because the symbol C has not been completely setup.
Comment 6 Mikael Morin 2023-08-22 08:00:11 UTC
(In reply to Mikael Morin from comment #5)
> Here sym->formal_ns is NULL because the symbol C has not been completely
> setup.

This makes the following an "obvious" fix:

diff --git a/gcc/fortran/decl.cc b/gcc/fortran/decl.cc
index 8182ef29f43..5afb9f2e2d3 100644
--- a/gcc/fortran/decl.cc
+++ b/gcc/fortran/decl.cc
@@ -7496,6 +7496,7 @@ gfc_match_function_decl (void)
     sym->attr.module_procedure = 1;
 
   gfc_new_block = sym;
+  sym->formal_ns = gfc_current_ns;
 
   m = gfc_match_formal_arglist (sym, 0, 0);
   if (m == MATCH_NO)
@@ -7993,6 +7994,7 @@ gfc_match_subroutine (void)
     sym = sym->result;
 
   gfc_new_block = sym;
+  sym->formal_ns = gfc_current_ns;
 
   /* Check what next non-whitespace character is so we can tell if there
      is the required parens if we have a BIND(C).  */


However, that patch makes the situation worse by releasing the namespace for "c" too early, as we access it later in reject_statement (it is the "current" namespace).
Comment 7 JuzheZhong 2023-08-22 08:14:44 UTC
(In reply to Mikael Morin from comment #6)
> (In reply to Mikael Morin from comment #5)
> > Here sym->formal_ns is NULL because the symbol C has not been completely
> > setup.
> 
> This makes the following an "obvious" fix:
> 
> diff --git a/gcc/fortran/decl.cc b/gcc/fortran/decl.cc
> index 8182ef29f43..5afb9f2e2d3 100644
> --- a/gcc/fortran/decl.cc
> +++ b/gcc/fortran/decl.cc
> @@ -7496,6 +7496,7 @@ gfc_match_function_decl (void)
>      sym->attr.module_procedure = 1;
>  
>    gfc_new_block = sym;
> +  sym->formal_ns = gfc_current_ns;
>  
>    m = gfc_match_formal_arglist (sym, 0, 0);
>    if (m == MATCH_NO)
> @@ -7993,6 +7994,7 @@ gfc_match_subroutine (void)
>      sym = sym->result;
>  
>    gfc_new_block = sym;
> +  sym->formal_ns = gfc_current_ns;
>  
>    /* Check what next non-whitespace character is so we can tell if there
>       is the required parens if we have a BIND(C).  */
> 
> 
> However, that patch makes the situation worse by releasing the namespace for
> "c" too early, as we access it later in reject_statement (it is the
> "current" namespace).

Thanks for helping with this issue. 
Do you have a solution that we can fix RISC-V backend?
Or you will fix it in Fortran front-end?

Thanks.
Comment 8 Mikael Morin 2023-08-22 19:03:29 UTC
(In reply to JuzheZhong from comment #7)
> Do you have a solution that we can fix RISC-V backend?

No, this is not RISC-V specific.

> Or you will fix it in Fortran front-end?

Yes, the fix will have to be in the fortran front-end, but I haven't found it yet.
Comment 9 Mikael Morin 2023-08-26 20:04:17 UTC
Created attachment 55801 [details]
Draft patch, to be cleaned up.

I need to clean up this, but it passes the fortran testsuite in its current form.
Comment 10 GCC Commits 2023-09-12 08:24:40 UTC
The master branch has been updated by Mikael Morin <mikael@gcc.gnu.org>:

https://gcc.gnu.org/g:b9cbd1a2c2f50d4e305938d97916011bd5839ce0

commit r14-3864-gb9cbd1a2c2f50d4e305938d97916011bd5839ce0
Author: Mikael Morin <mikael@gcc.gnu.org>
Date:   Tue Sep 12 10:24:20 2023 +0200

    fortran: Undo new symbols in all namespaces [PR110996]
    
    Remove new symbols from all namespaces they may have been added to in case a
    statement mismatches in the end and all the symbols referenced in it have to
    be reverted.
    
    This fixes memory errors and random internal compiler errors caused by
    a new symbol left with dangling pointers but not properly removed from the
    namespace at statement rejection.
    
    Before this change, new symbols were removed from their own namespace
    (their ns field) only.  This change additionally removes them from the
    namespaces pointed to by respectively the gfc_current_ns global variable,
    and the symbols' formal_ns field.
    
            PR fortran/110996
    
    gcc/fortran/ChangeLog:
    
            * gfortran.h (gfc_release_symbol): Set return type to bool.
            * symbol.cc (gfc_release_symbol): Ditto.  Return whether symbol was
            freed.
            (delete_symbol_from_ns): New, outline code from...
            (gfc_restore_last_undo_checkpoint): ... here.  Delete new symbols
            from two more namespaces.
    
    gcc/testsuite/ChangeLog:
    
            * gfortran.dg/pr110996.f90: New test.
Comment 11 Mikael Morin 2023-09-12 08:33:51 UTC
Fixed for gfortran 14.  Closing.