This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: [gfortran,patch] Ignore byte order mark at start of file


Brooks Moses wrote:
Index: gcc/fortran/scanner.c
===================================================================
@@ -1467,6 +1469,24 @@
       if (feof (input) && len == 0)
         break;
+      /* If this is the first line of the file, it can contain a byte
+         order mark (BOM), which we will ignore:
+           FF FE is UTF-16 little endian,
+           FE FF is UTF-16 big endian,
+           EF BB BF is UTF-8.  */
+      if (first_line && ((line[0] == '\xFF' && line[1] == '\xFE')
+                         || (line[0] == '\xFE' && line[1] == '\xFF')
+                         || (line[0] == '\xEF' && line[1] == '\xBB'
+                             && line[2] == '\xBF')))

This part assumes that line[1] and line[2] exist. However, line[] is allocated in load_line as having length maxlen, which is set to gfc_option.free_line_length if we have free-form source with limited line lengths, and there is no guarantee that free_line_length is 3 or higher.


Thus, these checks should be conditioned on the line holding at least 2 or 3 characters, as appropriate (2 for the UTF-16 marks, 3 for the UTF-8 mark).
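
To illustrate the kind of guard meant here, a minimal sketch follows. It is not the patch itself: bom_length is a hypothetical helper, and its len argument stands for however many valid characters the caller knows the line buffer holds. It only reads line[1] and line[2] after checking that they exist, and it compares through unsigned char so the result does not depend on whether plain char is signed.

```c
#include <stddef.h>

/* Hypothetical helper, not part of scanner.c: return the number of
   leading bytes of LINE that form a byte order mark, looking at no
   more than LEN bytes.  Returns 0 if there is no BOM.  */
static size_t
bom_length (const char *line, size_t len)
{
  const unsigned char *p = (const unsigned char *) line;

  /* UTF-8 mark: EF BB BF.  Needs at least three bytes.  */
  if (len >= 3 && p[0] == 0xEF && p[1] == 0xBB && p[2] == 0xBF)
    return 3;

  /* UTF-16 marks: FF FE (little endian) or FE FF (big endian).
     Need at least two bytes.  */
  if (len >= 2 && ((p[0] == 0xFF && p[1] == 0xFE)
                   || (p[0] == 0xFE && p[1] == 0xFF)))
    return 2;

  return 0;
}
```

With this shape, a buffer shorter than the mark being tested simply yields 0 instead of reading past the end.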

Actually, loading the BOM into the line buffer and then throwing away the first two or three characters is wrong for another reason: it would reduce the available line length in an unexpected way. If a 2-byte BOM is present and no other options are given, the first line of a free-form file would be truncated after 130 characters instead of the expected 132.
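
One way to avoid eating into the line length would be to consume the BOM from the stream before the first line is loaded at all. The following is only a sketch under assumptions, not a proposal for the actual patch: skip_bom is a hypothetical helper, and it assumes the input is a seekable stream opened in binary mode and positioned at its start.

```c
#include <stdio.h>

/* Hypothetical helper, not part of scanner.c: if INPUT starts with a
   byte order mark, leave the stream positioned just past it;
   otherwise leave it at offset 0.  Returns the number of bytes
   skipped, or -1 if the stream could not be repositioned.
   Assumes INPUT is seekable and currently at its beginning.  */
static int
skip_bom (FILE *input)
{
  unsigned char buf[3];
  size_t n = fread (buf, 1, sizeof buf, input);
  int skip = 0;

  if (n >= 3 && buf[0] == 0xEF && buf[1] == 0xBB && buf[2] == 0xBF)
    skip = 3;                   /* UTF-8 */
  else if (n >= 2 && ((buf[0] == 0xFF && buf[1] == 0xFE)
                      || (buf[0] == 0xFE && buf[1] == 0xFF)))
    skip = 2;                   /* UTF-16, either byte order */

  /* Undo the look-ahead: go back to just past the BOM, or to the
     very start if there was none.  fseek also clears the EOF flag
     that fread may have set on a very short file.  */
  if (fseek (input, (long) skip, SEEK_SET) != 0)
    return -1;

  return skip;
}
```

Called once right after the file is opened, this would leave the line buffer with its full length available for the first line. A non-seekable input such as a pipe would need a different approach.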


- Tobi

