This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: Eager newline handling for cpplib [patch]


Hi Zack,

The only problem I have with the patch is the use of 'magic' numbers to
control the switch in read_and_prescan. I would like to make the attached
modification to your patch before committing it.
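
For readers who want the idea in isolation, here is a minimal stand-alone
sketch of a character-class table driven by named constants rather than
bare integers. It is not the cpplib code: the names char_class,
class_table, init_class_table and plain_span are hypothetical and exist
only for this illustration, and an enum stands in for the #defines used
in the attached diff.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical names for this sketch only.  The values are kept
   contiguous from 0 so a switch over them can compile to a jump table. */
enum char_class
{
  CLASS_PLAIN = 0,      /* needs no special handling */
  CLASS_NUL,
  CLASS_CR,
  CLASS_BACKSLASH,
  CLASS_QUESTION
};

static unsigned char class_table[256];

/* Fill the table.  '?' is only special when trigraphs matter. */
static void
init_class_table (int want_trigraphs)
{
  memset (class_table, CLASS_PLAIN, sizeof (class_table));
  class_table['\0'] = CLASS_NUL;
  class_table['\r'] = CLASS_CR;
  class_table['\\'] = CLASS_BACKSLASH;
  if (want_trigraphs)
    class_table['?'] = CLASS_QUESTION;
}

/* Return the length of the leading run of plain characters in S.
   The loop always stops at the terminating NUL, because NUL is
   classed as special. */
static size_t
plain_span (const unsigned char *s)
{
  size_t span = 0;
  while (class_table[s[span]] == CLASS_PLAIN)
    span++;
  return span;
}
```

Comparing against CLASS_PLAIN explicitly, instead of testing the table
entry for truth, keeps the loop correct even if the "plain" value is
ever changed from zero.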

Dave

Zack Weinberg wrote:

> The appended patch implements "eager newline handling" in cpplib.
> This fixes at least two bugs involving backslash-newline in odd
> places.  It also squelches the number one performance bottleneck;
> cpplib's performance is now neck and neck with cccp.
>
> Some statistics: times to compile glibc (shared/static only) with the
> cc1 from egcs-2.93.11 19990311 (gcc2 ss-980929 experimental) on
> i586-linux [kernel 2.2.1, glibc 2.1, 32Mb]
>
> - cccp:                   1:47:31 - usr 4797.29 sys 581.91
> - cpplib without patch:   2:29:06 - usr 7342.79 sys 575.91
> - cpplib with patch:      1:48:37 - usr 4935.84 sys 568.37
>
> There are no regressions in the testsuite, in fact this fixes
> gcc.dg/990119-1.c and 990228-1.c.
>
> Patch is relative to my previous patch for cpplib ("cpplib
> initialization overhaul, revised") which is at
> http://egcs.cygnus.com/ml/egcs-patches/1999-03/msg00218.html.
>
> zw
>
> 1999-03-13 12:56 -0500  Zack Weinberg  <zack@rabi.columbia.edu>
>
>         "Eager" newline handling for cpplib.
>
>         * cppfiles.c (read_and_prescan): Map backslash-newline to '\r'
>         (which cannot otherwise appear in the processed buffer) and
>         move it out of tokens that it appears in the middle of.
>         Improve performance.
>         (find_position): New function.
>
>         * cpplib.c: \r (one character) indicates backslash
>         newline, not \\\n (two characters).  It cannot appear in the
>         middle of a token.  Call CPP_BUMP_LINE (pfile) whenever
>         parsing moves past \n or \r.  Increment pfile->lineno whenever
>         a \n is placed into token_buffer.  Only one mark can exist at
>         a time, and CPP_BUMP_LINE must not be used while it is
>         active.  It is automatically cleared by cpp_pop_buffer and
>         parse_goto_mark.  \r is not in is_hor_space or is_space.
>
>         (NEWLINE_FIX, NEWLINE_FIX1, adjust_position,
>         update_position, count_newlines, parse_move_mark): Removed.
>         (parse_string, copy_comment): New functions.
>         (parse_name): Returns void.
>         (parse_set_mark, parse_clear_mark, parse_goto_mark): Take only
>         one argument, a cpp_reader *.  Change for new marking scheme.
>         (skip_comment): Handle CHILL line comments too.  Second
>         argument is now first character of comment marker; all callers
>         changed.  Issue error for unterminated block comment here.
>         (cpp_skip_hspace): Recognize CHILL comments.
>         (copy_rest_of_line): Likewise.  Call skip_comment and
>         parse_string directly, don't go through cpp_get_token.  Emit
>         "/**/" for block comments if -traditional (create_definition
>         needs this).
>         (do_define): Don't play with put_out_comments.
>         (cpp_push_buffer): Initialize ->mark to -1.
>         (cpp_buf_line_and_col): Just read out the values in the buffer
>         structure.
>         (output_line_command): Use cpp_buf_line_and_col.  Fix
>         formatting.  Remove stale code.
>         (cpp_get_token): Break out string parsing code to
>         parse_string.  Use skip_comment for CHILL comments too.  Use
>         copy_comment for put_out_comments instead of dinking with
>         marks.  Remove stale code.  Don't call output_line_command
>         unless it's necessary.
>
>         * cpplib.h (parse_marker): Removed.
>         (struct cpp_buffer): line_base is now an unsigned char *; add
>         `mark' [long], remove `marks' [struct parse_marker *].
>         (parse_set_mark, parse_clear_mark, parse_goto_mark): Update
>         prototypes.
>         (CPP_BUMP_LINE, CPP_BUMP_BUFFER_LINE): New macros.
>         * cppinit.c (is_hor_space, is_space): '\r' is not considered
>         whitespace.
>         * cppexp.c (cpp_parse_expression): Use cpp_skip_hspace, not
>         SKIP_WHITE_SPACE.
>         * cpphash.c (macarg): Disable line commands while expanding.
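
The ChangeLog entry for read_and_prescan above describes mapping each
backslash-newline to a single '\r' and moving it out of any token it
falls inside. A much-simplified sketch of that idea follows. It is not
the code from the patch: the function name fold_continuations is
hypothetical, a "token" is assumed here to be any run of characters other
than space, tab and newline, the input is assumed to contain no raw '\r',
and the block buffering, trigraph handling and CR line-ending handling
done by the real function are all omitted.

```c
#include <stddef.h>

/* Copy the NUL-terminated string SRC to DST, which must have room for
   at least strlen (SRC) + 1 bytes.  Each backslash-newline pair becomes
   one '\r'.  A '\r' that would land in the middle of a token is held
   back and emitted right after the token ends.  Returns the number of
   bytes written, not counting the terminating NUL. */
static size_t
fold_continuations (const char *src, char *dst)
{
  size_t out = 0;
  size_t deferred = 0;  /* continuations seen inside the current token */
  int in_token = 0;

  while (*src)
    {
      if (src[0] == '\\' && src[1] == '\n')
        {
          src += 2;
          if (in_token)
            deferred++;         /* hold it until the token is complete */
          else
            dst[out++] = '\r';
          continue;
        }

      if (*src == ' ' || *src == '\t' || *src == '\n')
        {
          /* Token boundary: flush held-back markers first. */
          while (deferred)
            {
              dst[out++] = '\r';
              deferred--;
            }
          in_token = 0;
        }
      else
        in_token = 1;

      dst[out++] = *src++;
    }

  /* Input ended inside a token: flush whatever is still pending. */
  while (deferred)
    {
      dst[out++] = '\r';
      deferred--;
    }
  dst[out] = '\0';
  return out;
}
```

With this sketch, the input "fo", backslash, newline, "o bar" comes out
as "foo", '\r', " bar": the identifier is contiguous again and one '\r'
remains so that a later pass can still count the line.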


*** cppfiles.c	Tue Mar 16 14:08:55 1999
--- /home/brolley/comp/egcs/egcs/gcc/cppfiles.c	Tue Mar 16 14:23:15 1999
***************
*** 822,828 ****
       the end of a block. */
    U_CHAR intermed[PIPE_BUF + 2 + 2];
  
!   /* Table of characters that can't be handled in the inner loop. */
    U_CHAR speccase[256];
  
    offset = 0;
--- 822,835 ----
       the end of a block. */
    U_CHAR intermed[PIPE_BUF + 2 + 2];
  
!   /* Table of characters that can't be handled in the inner loop.
!      Keep these contiguous to optimize the performance of the code generated
!      for the switch that uses them.  */
!   #define SPECCASE_EMPTY     0
!   #define SPECCASE_NUL       1
!   #define SPECCASE_CR        2
!   #define SPECCASE_BACKSLASH 3
!   #define SPECCASE_QUESTION  4
    U_CHAR speccase[256];
  
    offset = 0;
***************
*** 832,843 ****
    ibase = intermed + 2;
    deferred_newlines = 0;
  
!   memset (speccase, 0, 256);
!   speccase['\0'] = 1;
!   speccase['\r'] = 2;
!   speccase['\\'] = 3;
    if (CPP_OPTIONS (pfile)->trigraphs || CPP_OPTIONS (pfile)->warn_trigraphs)
!     speccase['?'] = 4;
  
    for (;;)
      {
--- 839,850 ----
    ibase = intermed + 2;
    deferred_newlines = 0;
  
!   memset (speccase, SPECCASE_EMPTY, sizeof (speccase));
!   speccase['\0'] = SPECCASE_NUL;
!   speccase['\r'] = SPECCASE_CR;
!   speccase['\\'] = SPECCASE_BACKSLASH;
    if (CPP_OPTIONS (pfile)->trigraphs || CPP_OPTIONS (pfile)->warn_trigraphs)
!     speccase['?'] = SPECCASE_QUESTION;
  
    for (;;)
      {
***************
*** 879,885 ****
  	  /* Deal with \-newline in the middle of a token. */
  	  if (deferred_newlines)
  	    {
! 	      while (! speccase[ip[span]]
  		     && ip[span] != '\n'
  		     && ip[span] != '\t'
  		     && ip[span] != ' ')
--- 886,892 ----
  	  /* Deal with \-newline in the middle of a token. */
  	  if (deferred_newlines)
  	    {
! 	      while (speccase[ip[span]] == SPECCASE_EMPTY
  		     && ip[span] != '\n'
  		     && ip[span] != '\t'
  		     && ip[span] != ' ')
***************
*** 895,912 ****
  	    }
  
  	  /* Copy as much as we can without special treatment. */
! 	  while (! speccase[ip[span]]) span++;
  	  memcpy (op, ip, span);
  	  op += span;
  	  ip += span;
  
  	  switch (speccase[*ip++])
  	    {
! 	    case 1:  /* \0 */
  	      ibase[-1] = op[-1];
  	      goto read_next;
  
! 	    case 2:  /* \r */
  	      if (*ip == '\n')
  		ip++;
  	      else if (*ip == '\0')
--- 902,919 ----
  	    }
  
  	  /* Copy as much as we can without special treatment. */
! 	  while (speccase[ip[span]] == SPECCASE_EMPTY) span++;
  	  memcpy (op, ip, span);
  	  op += span;
  	  ip += span;
  
  	  switch (speccase[*ip++])
  	    {
! 	    case SPECCASE_NUL:  /* \0 */
  	      ibase[-1] = op[-1];
  	      goto read_next;
  
! 	    case SPECCASE_CR:  /* \r */
  	      if (*ip == '\n')
  		ip++;
  	      else if (*ip == '\0')
***************
*** 920,926 ****
  	      *op++ = '\n';
  	      break;
  
! 	    case 3:  /* \ */
  	    backslash:
  	    {
  	      /* If we're at the end of the intermediate buffer,
--- 927,933 ----
  	      *op++ = '\n';
  	      break;
  
! 	    case SPECCASE_BACKSLASH:  /* \ */
  	    backslash:
  	    {
  	      /* If we're at the end of the intermediate buffer,
***************
*** 969,975 ****
  	    }
  	    break;
  
! 	    case 4: /* ? */
  	      {
  		unsigned int d;
  		/* If we're at the end of the intermediate buffer,
--- 976,982 ----
  	    }
  	    break;
  
! 	    case SPECCASE_QUESTION: /* ? */
  	      {
  		unsigned int d;
  		/* If we're at the end of the intermediate buffer,
