Bug 31645 - Error on reading Byte Order Mark
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: fortran
Version: 4.3.0
Importance: P3 enhancement
Target Milestone: 4.3.0
Assignee: Francois-Xavier Coudert
URL: http://gcc.gnu.org/ml/gcc-patches/200...
Keywords: patch
Depends on:
Blocks:
 
Reported: 2007-04-21 09:22 UTC by Francois-Xavier Coudert
Modified: 2007-04-29 12:31 UTC

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2007-04-21 09:24:13


Description Francois-Xavier Coudert 2007-04-21 09:22:46 UTC
We should probably handle files that begin with a byte order mark (BOM; see http://en.wikipedia.org/wiki/Byte_Order_Mark), because some editors (such as Windows Notepad) write one. gfortran currently fails on such a file:

$ xxd bom.f
0000000: fffe 2020 2020 2020 7072 696e 7420 2a2c  ..      print *,
0000010: 2022 4865 6c6c 6f20 776f 726c 6422 0a20   "Hello world". 
0000020: 2020 2020 2065 6e64                           end
$ gfortran bom.f 
bom.f:1.1:

\xFF\xFE      print *, "Hello world"                                          
1
Error: Non-numeric character in statement label at (1)
bom.f:1.2:

\xFF\xFE      print *, "Hello world"                                          
 1
Error: Invalid character in name at (1)
Comment 1 Andrew Pinski 2007-04-21 16:23:30 UTC
Note that you might also need to add support to the preprocessor (which means adding it to the C family of languages, which is a good thing). You might want to support not just the UTF-8 BOM but the UTF-16 and UTF-32 ones too.
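
For reference, the BOM signatures mentioned above could be recognized roughly as follows. This is only a sketch and not the code from the eventual patch: the helper name bom_length and its interface are hypothetical, chosen for illustration. The byte sequences themselves are the standard ones (UTF-8: EF BB BF; UTF-16: FF FE or FE FF; UTF-32: FF FE 00 00 or 00 00 FE FF).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not taken from scanner.c.
   Returns the length in bytes of a byte order mark at the start of
   BUF (0 if there is none) and stores the encoding name in *NAME
   (NULL if there is none).  */
static size_t
bom_length (const unsigned char *buf, size_t len, const char **name)
{
  static const struct
  {
    const char *name;
    size_t n;
    unsigned char bytes[4];
  } boms[] = {
    /* UTF-32LE must be tested before UTF-16LE, because
       FF FE 00 00 also starts with FF FE.  */
    { "UTF-32LE", 4, { 0xFF, 0xFE, 0x00, 0x00 } },
    { "UTF-32BE", 4, { 0x00, 0x00, 0xFE, 0xFF } },
    { "UTF-8",    3, { 0xEF, 0xBB, 0xBF } },
    { "UTF-16LE", 2, { 0xFF, 0xFE } },
    { "UTF-16BE", 2, { 0xFE, 0xFF } },
  };
  size_t i;

  assert (buf != NULL && name != NULL);
  *name = NULL;

  for (i = 0; i < sizeof boms / sizeof boms[0]; i++)
    if (len >= boms[i].n && memcmp (buf, boms[i].bytes, boms[i].n) == 0)
      {
        *name = boms[i].name;
        return boms[i].n;
      }

  return 0;
}
```

For the bom.f shown in the description, whose first bytes are FF FE 20 20, this sketch would report a 2-byte UTF-16LE mark; a caller could then skip that many bytes before scanning the first line. One caveat of this ordering: a UTF-16LE file whose first character after the BOM is U+0000 would be taken for UTF-32LE, which should not matter for source files.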
Comment 2 Francois-Xavier Coudert 2007-04-29 11:46:15 UTC
Subject: Bug 31645

Author: fxcoudert
Date: Sun Apr 29 11:45:57 2007
New Revision: 124274

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=124274
Log:
	PR fortran/31645

	* scanner.c (load_file): Discard the byte order mark if one is
	found on the first non-preprocessor line of a file.

	* testsuite/gfortran.dg/bom_error.f90: New test.
	* testsuite/gfortran.dg/bom_include.f90: New test.
	* testsuite/gfortran.dg/bom_UTF16-LE.f90: New test.
	* testsuite/gfortran.dg/bom_UTF16-BE.f90: New test.
	* testsuite/gfortran.dg/bom_UTF-8.f90: New test.
	* testsuite/gfortran.dg/bom_UTF-32.f90: New test.
	* testsuite/gfortran.dg/bom_UTF-8.F90: New test.
	* testsuite/gfortran.dg/bom_include.inc: New file.

Added:
    trunk/gcc/testsuite/gfortran.dg/bom_UTF-32.f90
    trunk/gcc/testsuite/gfortran.dg/bom_UTF-8.F90
    trunk/gcc/testsuite/gfortran.dg/bom_UTF-8.f90
    trunk/gcc/testsuite/gfortran.dg/bom_UTF16-BE.f90
    trunk/gcc/testsuite/gfortran.dg/bom_UTF16-LE.f90
    trunk/gcc/testsuite/gfortran.dg/bom_error.f90
    trunk/gcc/testsuite/gfortran.dg/bom_include.f90
    trunk/gcc/testsuite/gfortran.dg/bom_include.inc
Modified:
    trunk/gcc/fortran/ChangeLog
    trunk/gcc/fortran/scanner.c
    trunk/gcc/testsuite/ChangeLog

Comment 3 Francois-Xavier Coudert 2007-04-29 12:31:51 UTC
Fixed