This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

[PATCH] Undefined cse.c behaviour causes 3.4 regression on HPUX


It's a long story...

I recently ran into a segmentation fault in cc1 on hppa-hp-hpux11.00
compiling a large proprietary application with gcc 3.4.2.  The same
source code compiles and runs fine with gcc 3.3.3 and earlier versions.
Investigating the problem, it turns out that the cause was a micompilation
of the compiler itself.  The cc1 from stage1/ works fine, but those from
stage2/ and stage3 exhibit the segmentation fault on the same test case.

My thanks to Dave Anglin for his help identify the problematic code in
cse.c.  It turns out that we're using an unsafe idiom to represent a
qty_table data structure.  Currently, quantity numbers are denoted by
integers, where all values between zero and max_reg-1 represent the
initial values of registers, and all values greater than or equal to
max_reg denote constants being propagated by the CSE pass.  We only
need to record the integer_csts for quantity numbers greater than max_reg,
so the code currently uses the following trick:

  /* This array is undefined before max_reg, so only allocate
     the space actually needed and adjust the start.  */

  qty_table = xmalloc ((max_qty - max_reg) * sizeof (struct qty_table_elem));
  qty_table -= max_reg;

This tricks allows cse.c to only allocate the memory it needs, yet still
index this table via qty_table[REG_QTY (reg)].


Unfortunately, this idiom invokes undefined behaviour in the ISO C99
standard, and in turn causes an unexpected memory exception on HPUX.
I'll explain both, but in reverse order.

HP's HP-UX on the PA-Risc processor uses a segmented memory architecture,
where the two two bits of a 32-bit pointer are used to determine which
segment a particular reference is to.  A quirk of this architecture is
that when using base+displacement addressing the segment to use is
choosen based upon the upper two bits of the base register (and not
their sum).

OpenEye's large proprietary application above, a legacy  physics code to
perform a multi-grid finite difference solution to the Poisson-Boltzmann
electrostatics equations of large proteins, contains a hideously large
function of over 5000 lines.  This results in an extremely large value
of max_reg (much larger than found during bootstrap or in GCC's
testsuite).  On HP-UX, this large value of max_reg results in subtracting
a value from qty_table that's larger than the absolute value returned
by malloc in the heap segment.  Hence when we access qty_table[q], the
current "base" pointer is to a different segment and with dump core.

After a long discussion with Dave Anglin, it turns out that this wierd
memory organization used by HP-UX is explicitly allowed by ISO C99.
Section 6.5.6 paragraph 8 (concerning pointer arithmetic) contains the
lines:


  If both the pointer operand and the result point to elements of the
  same array object, or one past the last element of the array object,
  the evaluation shall not produce an overflow; otherwise, the behavior
  is *undefined*.

Hence GCC's current subtraction of a value from the pointer returned
by malloc/xmalloc invokes undefined behavior, which is ultimately the
cause of this regression.  If this file were only compiled by GCC, we
could potentially argue this is a quality of implementation issue in
the PA backend.  Unfortunately, cse.c is compiled by the "host" compiler
and as such should never invoke unsafe undefined behaviour.  This same
problematic code is present on mainline, but latent for the above test
case.


My initial thoughts on a solution where to change all references
to qty_table, from qty_table[q] to become qty_table[q-max_reg] to fix
the problem.  This avoids initially offseting the qty_table pointer
(and hence the undefined behaviour), but at the overhead of an extra
subtraction of a global variable from the many qty_table references.

However, David Anglin suggested an alternate strategy of subtracting
max_reg from the stored quantity values themselves.  Elaborating on
that theme, the patch below reorganizes the encoding used for quantity
numbers altogther.  The original register values, previously in the
range 0..max_reg-1, are now represented by negative values, and the
constants being propagated are assigned sequential values >= 0.  Then
instead of representing registers as "regno - max_reg", requiring a
memory reference to max_reg, I choose to assign them unique quantity
values of "-regno - 1", or equivalently "-(regno+1)".  On most CPUs,
this can be done with two fast arithmetic operations, and avoids any
memory reference to max_reg.  As an added bonus, checking a quantity
number for validity now tests for "qty >= 0" (instead of "qty >= max_reg")
which should also provide a small speed improvement.


The following patch has been tested against mainline on i686-pc-linux-gnu
with a full "make bootstrap", all default languages, and regression tested
with a top-level "make -k check" with no new failures.  It's also been
tested against the gcc-3_4-branch on hppa2.0w-hp-hpux11.00 where it
survives a full "make bootstrap", all default languages, and allows the
above application to build successfully.  I've also confirmed by hand
that there are absolutely no changes to the code generated for whetstone
with this patch.


Ok for mainline?  Regression tests for 3.4.x are currently running on the
above HP box, and I'll also attempt a full bootstrap and regression test
on x86 for gcc-3_4-branch.  Ok for the gcc-3_4-branch is testing passes?

I apologise there isn't a PR number for this, and that I'm unable to
contribute a testcase that exhibits this problem.  By its very nature,
this problem requires a single basic block with several thousand pseudos,
so the probability that I'll be able to reduce a non-confidential test
is low.  Hopefully, the above description/explanation makes it clear
that this is the correct solution.



2004-10-24  Roger Sayle  <roger@eyesopen.com>
	    John David Anglin  <dave.anglin@nrc-cnrc.gc.ca>

	* cse.c: Change encoding of quantity numbers to avoid undefined
	pointer arithmetic on qty_table.
	(REGNO_QTY_VALID_P): A quantity is now valid if it isn't negative.
	(get_cse_reg_info): Initialize reg_qty to a unique negative value.
	(new_basic_block): Assign "real" quantity numbers from zero.
	(delete_reg_equiv): Do nothing if quantity is invalid.  Reset the
	REG_QTY to its unique negative value.
	(merge_equiv_classes): Calculate need_rehash if quantity is valid.
	(cse_main): Don't include max_reg when determining max_qty.
	(cse_basic_block): Avoid subtracting a large offset from qty_table,
	which causes undefined C99 behaviour.  Only allocate needed memory.



Index: cse.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/cse.c,v
retrieving revision 1.317
diff -c -3 -p -r1.317 cse.c
*** cse.c	6 Oct 2004 07:30:11 -0000	1.317
--- cse.c	24 Oct 2004 21:24:53 -0000
*************** Registers and "quantity numbers":
*** 84,94 ****
     `reg_qty' records what quantity a register is currently thought
     of as containing.

!    All real quantity numbers are greater than or equal to `max_reg'.
!    If register N has not been assigned a quantity, reg_qty[N] will equal N.

!    Quantity numbers below `max_reg' do not exist and none of the `qty_table'
!    entries should be referenced with an index below `max_reg'.

     We also maintain a bidirectional chain of registers for each
     quantity number.  The `qty_table` members `first_reg' and `last_reg',
--- 84,95 ----
     `reg_qty' records what quantity a register is currently thought
     of as containing.

!    All real quantity numbers are greater than or equal to zero.
!    If register N has not been assigned a quantity, reg_qty[N] will
!    equal -N - 1, which is always negative.

!    Quantity numbers below zero do not exist and none of the `qty_table'
!    entries should be referenced with a negative index.

     We also maintain a bidirectional chain of registers for each
     quantity number.  The `qty_table` members `first_reg' and `last_reg',
*************** struct table_elt
*** 546,552 ****
  /* Determine if the quantity number for register X represents a valid index
     into the qty_table.  */

! #define REGNO_QTY_VALID_P(N) (REG_QTY (N) != (int) (N))

  static struct table_elt *table[HASH_SIZE];

--- 547,553 ----
  /* Determine if the quantity number for register X represents a valid index
     into the qty_table.  */

! #define REGNO_QTY_VALID_P(N) (REG_QTY (N) >= 0)

  static struct table_elt *table[HASH_SIZE];

*************** get_cse_reg_info (unsigned int regno)
*** 844,850 ****
        p->reg_tick = 1;
        p->reg_in_table = -1;
        p->subreg_ticked = -1;
!       p->reg_qty = regno;
        p->regno = regno;
        p->next = cse_reg_info_used_list;
        cse_reg_info_used_list = p;
--- 845,851 ----
        p->reg_tick = 1;
        p->reg_in_table = -1;
        p->subreg_ticked = -1;
!       p->reg_qty = -regno - 1;
        p->regno = regno;
        p->next = cse_reg_info_used_list;
        cse_reg_info_used_list = p;
*************** new_basic_block (void)
*** 868,874 ****
  {
    int i;

!   next_qty = max_reg;

    /* Clear out hash table state for this pass.  */

--- 869,875 ----
  {
    int i;

!   next_qty = 0;

    /* Clear out hash table state for this pass.  */

*************** delete_reg_equiv (unsigned int reg)
*** 1012,1018 ****
    int p, n;

    /* If invalid, do nothing.  */
!   if (q == (int) reg)
      return;

    ent = &qty_table[q];
--- 1013,1019 ----
    int p, n;

    /* If invalid, do nothing.  */
!   if (! REGNO_QTY_VALID_P (reg))
      return;

    ent = &qty_table[q];
*************** delete_reg_equiv (unsigned int reg)
*** 1029,1035 ****
    else
      ent->first_reg = n;

!   REG_QTY (reg) = reg;
  }

  /* Remove any invalid expressions from the hash table
--- 1030,1036 ----
    else
      ent->first_reg = n;

!   REG_QTY (reg) = -reg - 1;
  }

  /* Remove any invalid expressions from the hash table
*************** merge_equiv_classes (struct table_elt *c
*** 1627,1633 ****

  	  if (REG_P (exp))
  	    {
! 	      need_rehash = (unsigned) REG_QTY (REGNO (exp)) != REGNO (exp);
  	      delete_reg_equiv (REGNO (exp));
  	    }

--- 1628,1634 ----

  	  if (REG_P (exp))
  	    {
! 	      need_rehash = REGNO_QTY_VALID_P (REGNO (exp));
  	      delete_reg_equiv (REGNO (exp));
  	    }

*************** cse_main (rtx f, int nregs, FILE *file)
*** 6739,6746 ****
        if (max_qty < 500)
  	max_qty = 500;

-       max_qty += max_reg;
-
        /* If this basic block is being extended by following certain jumps,
           (see `cse_end_of_basic_block'), we reprocess the code from the start.
           Otherwise, we start after this basic block.  */
--- 6740,6745 ----
*************** cse_basic_block (rtx from, rtx to, struc
*** 6801,6811 ****
    int num_insns = 0;
    int no_conflict = 0;

!   /* This array is undefined before max_reg, so only allocate
!      the space actually needed and adjust the start.  */
!
!   qty_table = xmalloc ((max_qty - max_reg) * sizeof (struct qty_table_elem));
!   qty_table -= max_reg;

    new_basic_block ();

--- 6800,6807 ----
    int num_insns = 0;
    int no_conflict = 0;

!   /* Allocate the space needed by qty_table.  */
!   qty_table = xmalloc (max_qty * sizeof (struct qty_table_elem));

    new_basic_block ();

*************** cse_basic_block (rtx from, rtx to, struc
*** 6916,6922 ****
  	{
  	  if (to == 0)
  	    {
! 	      free (qty_table + max_reg);
  	      return 0;
  	    }

--- 6912,6918 ----
  	{
  	  if (to == 0)
  	    {
! 	      free (qty_table);
  	      return 0;
  	    }

*************** cse_basic_block (rtx from, rtx to, struc
*** 6951,6957 ****
  	  /* If TO was the last insn in the function, we are done.  */
  	  if (insn == 0)
  	    {
! 	      free (qty_table + max_reg);
  	      return 0;
  	    }

--- 6947,6953 ----
  	  /* If TO was the last insn in the function, we are done.  */
  	  if (insn == 0)
  	    {
! 	      free (qty_table);
  	      return 0;
  	    }

*************** cse_basic_block (rtx from, rtx to, struc
*** 6960,6966 ****
  	  prev = prev_nonnote_insn (to);
  	  if (prev && BARRIER_P (prev))
  	    {
! 	      free (qty_table + max_reg);
  	      return insn;
  	    }

--- 6956,6962 ----
  	  prev = prev_nonnote_insn (to);
  	  if (prev && BARRIER_P (prev))
  	    {
! 	      free (qty_table);
  	      return insn;
  	    }

*************** cse_basic_block (rtx from, rtx to, struc
*** 6995,7001 ****

    gcc_assert (next_qty <= max_qty);

!   free (qty_table + max_reg);

    return to ? NEXT_INSN (to) : 0;
  }
--- 6991,6997 ----

    gcc_assert (next_qty <= max_qty);

!   free (qty_table);

    return to ? NEXT_INSN (to) : 0;
  }



Roger
--
Roger Sayle,                         E-mail: roger@eyesopen.com
OpenEye Scientific Software,         WWW: http://www.eyesopen.com/
Suite 1107, 3600 Cerrillos Road,     Tel: (+1) 505-473-7385
Santa Fe, New Mexico, 87507.         Fax: (+1) 505-473-0833


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]