[patch] reimplement -fstrict-volatile-bitfields


Background: on ARM and some other targets, the ABI requires that volatile bit-fields be accessed atomically in a mode that corresponds to the declared type of the field, which conflicts with GCC's normal behavior of doing accesses in a mode that might correspond to the size of a general-purpose register, the size of the bit-field, or the bit range corresponding to the C++ memory model. This is what the -fstrict-volatile-bitfields flag does, and it is the default on ARM and other targets where the ABI requires this behavior.
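
To make the background concrete, here is a small example (illustrative only, not part of the patch; the struct and field names are made up):

  struct reg
  {
    volatile unsigned int mode : 3;
    volatile unsigned int enable : 1;
  };

  void set_mode (struct reg *r)
  {
    /* With -fstrict-volatile-bitfields, this assignment must be done as a
       single 32-bit (SImode) read followed by a single 32-bit write,
       matching the declared type unsigned int, rather than e.g. a byte-sized
       access covering just the bit-field.  */
    r->mode = 5;
  }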

Both the original patch that added -fstrict-volatile-bitfields support and a subsequent followup patch that tried to unbreak handling of packed structures (where fields might not be properly aligned to do the single access otherwise mandated by -fstrict-volatile-bitfields) only handled bit-field reads, and not writes. Last year I submitted a patch we've had locally for some time to extend the existing implementation to writes, but it was rejected on the grounds that the current implementation is too broken to fix or extend. I didn't have time then to start over from scratch, and meanwhile, the bug reports about -fstrict-volatile-bitfields have continued to pile up. So let's try again to fix this, this time working from the ground up.

From last year's discussion, it seemed that there were two primary objections to the current implementation:

(1) It was seen as inappropriate that warnings about conflicts between unaligned fields and -fstrict-volatile-bitfields were being emitted during expand. It was suggested that any diagnostics ought to be emitted by the various language front ends instead.

(2) The way packed fields are being detected is buggy and an abstraction violation, and passing around a packedp flag to all the bit-field expand functions is ugly.

And, my own complaints about the current implementation:

(3) Users expect packed structures to work even on targets where -fstrict-volatile-bitfields is the default, so the compiler shouldn't generate code for accesses to unaligned fields that faults at run time because of the unaligned access, or that silently produces an incorrect result (e.g., by accessing only part of the bit-field), whether or not a warning is issued at compile time.

(4) There's pointless divergence between the bit-field store and extract code that has led to a number of bugs.

I've come up with a new patch that tries to address all these issues.

For problem (1), I've eliminated the warnings from expand.  I'm not opposed to adding them back in the front ends, as previously suggested, but given that the warnings were previously implemented only for reads and not writes, and that they got the different-warning-for-packed-fields behavior wrong in some cases, getting rid of them entirely is at least as correct as adding them for bit-field writes, too.  ;-)

I've also completely removed the packedp flag mentioned in item (2).

For problem (3), my reading of the ARM ABI document is that the requirements for atomic access to volatile bit-fields only apply to bit-fields that are aligned according to the ABI requirements. If a user has used GCC extensions to create a non-ABI-compliant packed structure where an atomic bit-field access of the correct size is not possible or valid on the target, then GCC ought to define some reasonable access behavior for those bit-fields too as a further extension -- whether or not it complies with the ABI requirements for unpacked structures -- rather than just generating invalid code. In particular, generating the access using whatever technique it would fall back to if -fstrict-volatile-bitfields didn't apply, in cases where it *cannot* be applied, seems perfectly reasonable to me.
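
As a concrete example of the kind of structure I mean (illustrative only; this is essentially what the new pr48784-1.c test below exercises):

  #pragma pack(1)
  volatile struct S0
  {
    signed a : 7;
    unsigned b : 28;   /* not accessible with a single aligned 32-bit access */
  } g;

On a strict-alignment target, b cannot be read or written with one aligned access in the mode of its declared type, so with this patch GCC falls back to the same multiple-access sequence it would use without -fstrict-volatile-bitfields, instead of generating an access that faults or truncates.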

To address problem (4), I've tried to make the code for handling -fstrict-volatile-bitfields similar in the read and write cases.  There is probably more that can be done here in terms of refactoring some of the now-common code and making the interfaces more consistent, but I think it's clearer to separate changes that are just code cleanup from those that are intended to change behavior.  I'm willing to work on that refactoring as a followup patch if the maintainers recommend or require it.

I've regression-tested the attached patch on arm-none-eabi, and bootstrapped and regression-tested it on x86_64-linux-gnu.  I also did some spot testing on mipsisa32r2-sde-elf.  I verified that all the new test cases pass on these targets with this patch.  Without the patch, I saw these failures:

pr23623.c: ARM, x86_64, MIPS
pr48784-1.c: ARM, x86_64, MIPS
pr48784-2.c: none
pr56341-1.c: ARM, MIPS
pr56341-2.c: none
pr56997-1.c: ARM
pr56997-2.c: ARM, MIPS
pr56997-3.c: ARM, MIPS
volatile-bitfields-3.c: ARM, x86_64, MIPS

Here are some comments on specific parts of the patch.

The "meat" of the patch is rewriting the logic for -fstrict-volatile-bitfields in extract_fixed_bit_field, and making store_fixed_bit_field use parallel logic.

The change to make extract_bit_field_1 not skip over the simple_mem_bitfield_p case was necessary to avoid introducing new failures of the pr56997-* test cases on x86_64.  store_bit_field_1 already has parallel code, except that it gets the field mode passed in explicitly as an argument.

The change to store_bit_field_1 and extract_bit_field_1 to make them skip both cheap register alternatives if -fstrict-volatile-bitfields applies, instead of just the first one, is for the volatile-bitfields-3.c test case. As noted above, this example was previously failing on all three targets I tried, and I was specifically using x86_64 to debug this one.

The hack to get_bit_range fixes a regression in the pr23623 test case that has been present since the bit range support was added, I think. There might be a better place to do this, but it seems clear to me that when -fstrict-volatile-bitfields applies it has to override the bit range that would be used for normal accesses. It seemed to me during my incremental testing and debugging that this change was independent of the others (e.g., it fixed the regression even with no other changes).
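
For illustration (this mirrors the new pr23623.c test below): for a field such as

  extern volatile struct
  {
    unsigned int b : 1;
  } bf2;

the bit range derived from DECL_BIT_FIELD_REPRESENTATIVE could otherwise restrict the store to bf2.b to a narrower region than the SImode access that -fstrict-volatile-bitfields requires, so the patch makes get_bit_range return a zero bit range (i.e., no restriction) for volatile accesses when the flag is in effect.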

Anyway, what do you think of this patch? It does fix a whole lot of bugs, and seems like at least an incremental improvement over the current code. I realize this is probably going to take a couple iterations to get right. If there are additional cleanups or refactoring of the bit-field code required, in addition to the functional changes, can they be done as followup patches or do I need to work out the entire series of patches in advance?

-Sandra

2013-06-12  Sandra Loosemore  <sandra@codesourcery.com>

	PR middle-end/23623
	PR middle-end/48784
	PR middle-end/56341
	PR middle-end/56997

	gcc/
	* expr.h (extract_bit_field): Remove packedp parameter.
	* expmed.c (extract_fixed_bit_field): Remove packedp parameter
	from forward declaration.
	(store_bit_field_1): Skip both cheap register alternatives if
	-fstrict-volatile-bitfields applies, not just the first one.
	(store_fixed_bit_field): Adjust flag_strict_volatile_bitfields
	handling and consolidate code for the default case.
	(store_split_bit_field): Remove packedp arg from calls to
	extract_fixed_bit_field.
	(extract_bit_field_1): Remove packedp parameter.  Adjust
	flag_strict_volatile_bitfields handling to match
	store_bit_field_1.  Remove packedp argument from recursive calls
	and calls to extract_fixed_bit_field.
	(extract_bit_field): Remove packedp parameter and corresponding
	arg to extract_bit_field_1.
	(extract_fixed_bit_field): Remove packedp parameter.  Simplify
	flag_strict_volatile_bitfields handling and remove code to issue
	warnings.  Use narrow_bit_field_mem to match store_fixed_bit_field.
	(extract_split_bit_field): Remove packedp arg from call to
	extract_fixed_bit_field.
	* expr.c (emit_group_load_1): Adjust calls to extract_bit_field.
	(copy_blkmode_from_reg): Likewise.
	(copy_blkmode_to_reg): Likewise.
	(read_complex_part): Likewise.
	(get_bit_range): Handle flag_strict_volatile_bitfields.
	(store_field): Adjust calls to extract_bit_field.
	(expand_expr_real_1): Likewise.
	* calls.c (store_unaligned_arguments_into_pseudos): Adjust call
	to extract_bit_field.
	* config/tilegx/tilegx.c (tilegx_expand_unaligned_load): Adjust call
	to extract_bit_field.
	* config/tilepro/tilepro.c (tilepro_expand_unaligned_load): Adjust call
	to extract_bit_field.
	* doc/invoke.texi (Code Gen Options): Update documentation of
	-fstrict-volatile-bitfields to reflect new behavior.  Note 
	that AAPCS requires this.

	gcc/testsuite/
	* gcc.dg/pr23623.c: New test.
	* gcc.dg/pr48784-1.c: New test.
	* gcc.dg/pr48784-2.c: New test.
	* gcc.dg/pr56341-1.c: New test.
	* gcc.dg/pr56341-2.c: New test.
	* gcc.dg/pr56997-1.c: New test.
	* gcc.dg/pr56997-2.c: New test.
	* gcc.dg/pr56997-3.c: New test.
	* gcc.dg/volatile-bitfields-3.c: New test.
Index: gcc/expr.h
===================================================================
--- gcc/expr.h	(revision 199963)
+++ gcc/expr.h	(working copy)
@@ -704,7 +704,7 @@ extern void store_bit_field (rtx, unsign
 			     unsigned HOST_WIDE_INT,
 			     enum machine_mode, rtx);
 extern rtx extract_bit_field (rtx, unsigned HOST_WIDE_INT,
-			      unsigned HOST_WIDE_INT, int, bool, rtx,
+			      unsigned HOST_WIDE_INT, int, rtx,
 			      enum machine_mode, enum machine_mode);
 extern rtx extract_low_bits (enum machine_mode, enum machine_mode, rtx);
 extern rtx expand_mult (enum machine_mode, rtx, rtx, rtx, int);
Index: gcc/expmed.c
===================================================================
--- gcc/expmed.c	(revision 199963)
+++ gcc/expmed.c	(working copy)
@@ -54,7 +54,7 @@ static void store_split_bit_field (rtx, 
 				   rtx);
 static rtx extract_fixed_bit_field (enum machine_mode, rtx,
 				    unsigned HOST_WIDE_INT,
-				    unsigned HOST_WIDE_INT, rtx, int, bool);
+				    unsigned HOST_WIDE_INT, rtx, int);
 static rtx mask_rtx (enum machine_mode, int, int, int);
 static rtx lshift_value (enum machine_mode, rtx, int, int);
 static rtx extract_split_bit_field (rtx, unsigned HOST_WIDE_INT,
@@ -810,15 +810,15 @@ store_bit_field_1 (rtx str_rtx, unsigned
     return true;
 
   /* If OP0 is a memory, try copying it to a register and seeing if a
-     cheap register alternative is available.  */
-  if (MEM_P (op0))
+     cheap register alternative is available.  Do not try these tricks if
+     -fstrict-volatile-bitfields is in effect, since they may not respect
+     the mode of the access.  */
+  if (MEM_P (op0)
+      && !(MEM_VOLATILE_P (op0)
+	   && flag_strict_volatile_bitfields > 0))
     {
-      /* Do not use unaligned memory insvs for volatile bitfields when
-	 -fstrict-volatile-bitfields is in effect.  */
-      if (!(MEM_VOLATILE_P (op0)
-	    && flag_strict_volatile_bitfields > 0)
-	  && get_best_mem_extraction_insn (&insv, EP_insv, bitsize, bitnum,
-					   fieldmode)
+      if (get_best_mem_extraction_insn (&insv, EP_insv, bitsize, bitnum,
+					fieldmode)
 	  && store_bit_field_using_insv (&insv, op0, bitsize, bitnum, value))
 	return true;
 
@@ -933,19 +933,35 @@ store_fixed_bit_field (rtx op0, unsigned
 	 a word, we won't be doing the extraction the normal way.
 	 We don't want a mode bigger than the destination.  */
 
-      mode = GET_MODE (op0);
-      if (GET_MODE_BITSIZE (mode) == 0
-	  || GET_MODE_BITSIZE (mode) > GET_MODE_BITSIZE (word_mode))
-	mode = word_mode;
-
       if (MEM_VOLATILE_P (op0)
           && GET_MODE_BITSIZE (GET_MODE (op0)) > 0
 	  && GET_MODE_BITSIZE (GET_MODE (op0)) <= maxbits
 	  && flag_strict_volatile_bitfields > 0)
-	mode = GET_MODE (op0);
+	{
+	  unsigned int modesize;
+
+	  mode = GET_MODE (op0);
+
+	  /* Check for oversized or unaligned field that we can't write
+	     with a single access of the required type, and fall back to
+	     the default behavior.  */
+	  modesize = GET_MODE_BITSIZE (mode);
+	  if (bitnum % BITS_PER_UNIT + bitsize > modesize
+	      || (STRICT_ALIGNMENT
+		  && bitnum % GET_MODE_ALIGNMENT (mode) + bitsize > modesize))
+	    mode = get_best_mode (bitsize, bitnum, 0, 0,
+				  MEM_ALIGN (op0), word_mode, true);
+	}
       else
-	mode = get_best_mode (bitsize, bitnum, bitregion_start, bitregion_end,
-			      MEM_ALIGN (op0), mode, MEM_VOLATILE_P (op0));
+	{
+	  mode = GET_MODE (op0);
+	  if (GET_MODE_BITSIZE (mode) == 0
+	      || GET_MODE_BITSIZE (mode) > GET_MODE_BITSIZE (word_mode))
+	    mode = word_mode;
+
+	  mode = get_best_mode (bitsize, bitnum, bitregion_start, bitregion_end,
+				MEM_ALIGN (op0), mode, MEM_VOLATILE_P (op0));
+	}
 
       if (mode == VOIDmode)
 	{
@@ -1128,7 +1144,7 @@ store_split_bit_field (rtx op0, unsigned
 		 endianness compensation) to fetch the piece we want.  */
 	      part = extract_fixed_bit_field (word_mode, value, thissize,
 					      total_bits - bitsize + bitsdone,
-					      NULL_RTX, 1, false);
+					      NULL_RTX, 1);
 	    }
 	}
       else
@@ -1140,7 +1156,7 @@ store_split_bit_field (rtx op0, unsigned
 			    & (((HOST_WIDE_INT) 1 << thissize) - 1));
 	  else
 	    part = extract_fixed_bit_field (word_mode, value, thissize,
-					    bitsdone, NULL_RTX, 1, false);
+					    bitsdone, NULL_RTX, 1);
 	}
 
       /* If OP0 is a register, then handle OFFSET here.
@@ -1301,8 +1317,7 @@ extract_bit_field_using_extv (const extr
 
 static rtx
 extract_bit_field_1 (rtx str_rtx, unsigned HOST_WIDE_INT bitsize,
-		     unsigned HOST_WIDE_INT bitnum,
-		     int unsignedp, bool packedp, rtx target,
+		     unsigned HOST_WIDE_INT bitnum, int unsignedp, rtx target,
 		     enum machine_mode mode, enum machine_mode tmode,
 		     bool fallback_p)
 {
@@ -1430,24 +1445,35 @@ extract_bit_field_1 (rtx str_rtx, unsign
      If that's wrong, the solution is to test for it and set TARGET to 0
      if needed.  */
 
-  /* If the bitfield is volatile, we need to make sure the access
-     remains on a type-aligned boundary.  */
+  /* Get the mode of the field to use for atomic access or subreg
+     conversion.  */
+  mode1 = mode;
+
+  /* For the -fstrict-volatile-bitfields case, we must use the mode of the
+     field instead.  */
   if (GET_CODE (op0) == MEM
       && MEM_VOLATILE_P (op0)
       && GET_MODE_BITSIZE (GET_MODE (op0)) > 0
       && flag_strict_volatile_bitfields > 0)
-    goto no_subreg_mode_swap;
+    {
+      if (GET_MODE_BITSIZE (GET_MODE (op0)) > 0)
+	mode1 = GET_MODE (op0);
+      else if (target && GET_MODE_BITSIZE (GET_MODE (target)) > 0)
+	mode1 = GET_MODE (target);
+      else
+	mode1 = tmode;
+    }
 
   /* Only scalar integer modes can be converted via subregs.  There is an
      additional problem for FP modes here in that they can have a precision
      which is different from the size.  mode_for_size uses precision, but
      we want a mode based on the size, so we must avoid calling it for FP
      modes.  */
-  mode1 = mode;
-  if (SCALAR_INT_MODE_P (tmode))
+  else if (SCALAR_INT_MODE_P (tmode))
     {
       enum machine_mode try_mode = mode_for_size (bitsize,
 						  GET_MODE_CLASS (tmode), 0);
+
       if (try_mode != BLKmode)
 	mode1 = try_mode;
     }
@@ -1475,8 +1501,6 @@ extract_bit_field_1 (rtx str_rtx, unsign
       return convert_extracted_bit_field (op0, mode, tmode, unsignedp);
     }
 
- no_subreg_mode_swap:
-
   /* Handle fields bigger than a word.  */
 
   if (bitsize > BITS_PER_WORD)
@@ -1517,7 +1541,7 @@ extract_bit_field_1 (rtx str_rtx, unsign
 	  rtx result_part
 	    = extract_bit_field_1 (op0, MIN (BITS_PER_WORD,
 					     bitsize - i * BITS_PER_WORD),
-				   bitnum + bit_offset, 1, false, target_part,
+				   bitnum + bit_offset, 1, target_part,
 				   mode, word_mode, fallback_p);
 
 	  gcc_assert (target_part);
@@ -1593,14 +1617,14 @@ extract_bit_field_1 (rtx str_rtx, unsign
     }
 
   /* If OP0 is a memory, try copying it to a register and seeing if a
-     cheap register alternative is available.  */
-  if (MEM_P (op0))
+     cheap register alternative is available.  Do not try these tricks if
+     -fstrict-volatile-bitfields is in effect, since they may not respect
+     the mode of the access.  */
+  if (MEM_P (op0)
+      && !(MEM_VOLATILE_P (op0) && flag_strict_volatile_bitfields > 0))
     {
-      /* Do not use extv/extzv for volatile bitfields when
-         -fstrict-volatile-bitfields is in effect.  */
-      if (!(MEM_VOLATILE_P (op0) && flag_strict_volatile_bitfields > 0)
-	  && get_best_mem_extraction_insn (&extv, pattern, bitsize, bitnum,
-					   tmode))
+      if (get_best_mem_extraction_insn (&extv, pattern, bitsize, bitnum,
+					tmode))
 	{
 	  rtx result = extract_bit_field_using_extv (&extv, op0, bitsize,
 						     bitnum, unsignedp,
@@ -1621,7 +1645,7 @@ extract_bit_field_1 (rtx str_rtx, unsign
 	{
 	  xop0 = copy_to_reg (xop0);
 	  rtx result = extract_bit_field_1 (xop0, bitsize, bitpos,
-					    unsignedp, packedp, target,
+					    unsignedp, target,
 					    mode, tmode, false);
 	  if (result)
 	    return result;
@@ -1641,7 +1665,7 @@ extract_bit_field_1 (rtx str_rtx, unsign
   gcc_assert (int_mode != BLKmode);
 
   target = extract_fixed_bit_field (int_mode, op0, bitsize, bitnum,
-				    target, unsignedp, packedp);
+				    target, unsignedp);
   return convert_extracted_bit_field (target, mode, tmode, unsignedp);
 }
 
@@ -1652,7 +1676,6 @@ extract_bit_field_1 (rtx str_rtx, unsign
 
    STR_RTX is the structure containing the byte (a REG or MEM).
    UNSIGNEDP is nonzero if this is an unsigned bit field.
-   PACKEDP is nonzero if the field has the packed attribute.
    MODE is the natural mode of the field value once extracted.
    TMODE is the mode the caller would like the value to have;
    but the value may be returned with type MODE instead.
@@ -1664,10 +1687,10 @@ extract_bit_field_1 (rtx str_rtx, unsign
 
 rtx
 extract_bit_field (rtx str_rtx, unsigned HOST_WIDE_INT bitsize,
-		   unsigned HOST_WIDE_INT bitnum, int unsignedp, bool packedp,
-		   rtx target, enum machine_mode mode, enum machine_mode tmode)
+		   unsigned HOST_WIDE_INT bitnum, int unsignedp, rtx target,
+		   enum machine_mode mode, enum machine_mode tmode)
 {
-  return extract_bit_field_1 (str_rtx, bitsize, bitnum, unsignedp, packedp,
+  return extract_bit_field_1 (str_rtx, bitsize, bitnum, unsignedp,
 			      target, mode, tmode, true);
 }
 
@@ -1675,8 +1698,6 @@ extract_bit_field (rtx str_rtx, unsigned
    from bit BITNUM of OP0.
 
    UNSIGNEDP is nonzero for an unsigned bit field (don't sign-extend value).
-   PACKEDP is true if the field has the packed attribute.
-
    If TARGET is nonzero, attempts to store the value there
    and return TARGET, but this is not guaranteed.
    If TARGET is not used, create a pseudo-reg of mode TMODE for the value.  */
@@ -1685,7 +1706,7 @@ static rtx
 extract_fixed_bit_field (enum machine_mode tmode, rtx op0,
 			 unsigned HOST_WIDE_INT bitsize,
 			 unsigned HOST_WIDE_INT bitnum, rtx target,
-			 int unsignedp, bool packedp)
+			 int unsignedp)
 {
   enum machine_mode mode;
 
@@ -1698,12 +1719,24 @@ extract_fixed_bit_field (enum machine_mo
       if (MEM_VOLATILE_P (op0)
 	  && flag_strict_volatile_bitfields > 0)
 	{
+	  unsigned int modesize;
+
 	  if (GET_MODE_BITSIZE (GET_MODE (op0)) > 0)
 	    mode = GET_MODE (op0);
 	  else if (target && GET_MODE_BITSIZE (GET_MODE (target)) > 0)
 	    mode = GET_MODE (target);
 	  else
 	    mode = tmode;
+
+	  /* Check for oversized or unaligned field that we can't read
+	     with a single access of the required type, and fall back to
+	     the default behavior.  */
+	  modesize = GET_MODE_BITSIZE (mode);
+	  if (bitnum % BITS_PER_UNIT + bitsize > modesize
+	      || (STRICT_ALIGNMENT
+		  && bitnum % GET_MODE_ALIGNMENT (mode) + bitsize > modesize))
+	    mode = get_best_mode (bitsize, bitnum, 0, 0,
+				  MEM_ALIGN (op0), word_mode, true);
 	}
       else
 	mode = get_best_mode (bitsize, bitnum, 0, 0,
@@ -1714,61 +1747,7 @@ extract_fixed_bit_field (enum machine_mo
 	   boundaries.  */
 	return extract_split_bit_field (op0, bitsize, bitnum, unsignedp);
 
-      unsigned int total_bits = GET_MODE_BITSIZE (mode);
-      HOST_WIDE_INT bit_offset = bitnum - bitnum % total_bits;
-
-      /* If we're accessing a volatile MEM, we can't apply BIT_OFFSET
-	 if it results in a multi-word access where we otherwise wouldn't
-	 have one.  So, check for that case here.  */
-      if (MEM_P (op0)
-	  && MEM_VOLATILE_P (op0)
-	  && flag_strict_volatile_bitfields > 0
-	  && bitnum % BITS_PER_UNIT + bitsize <= total_bits
-	  && bitnum % GET_MODE_BITSIZE (mode) + bitsize > total_bits)
-	{
-	  if (STRICT_ALIGNMENT)
-	    {
-	      static bool informed_about_misalignment = false;
-
-	      if (packedp)
-		{
-		  if (bitsize == total_bits)
-		    warning_at (input_location, OPT_fstrict_volatile_bitfields,
-				"multiple accesses to volatile structure"
-				" member because of packed attribute");
-		  else
-		    warning_at (input_location, OPT_fstrict_volatile_bitfields,
-				"multiple accesses to volatile structure"
-				" bitfield because of packed attribute");
-
-		  return extract_split_bit_field (op0, bitsize, bitnum,
-						  unsignedp);
-		}
-
-	      if (bitsize == total_bits)
-		warning_at (input_location, OPT_fstrict_volatile_bitfields,
-			    "mis-aligned access used for structure member");
-	      else
-		warning_at (input_location, OPT_fstrict_volatile_bitfields,
-			    "mis-aligned access used for structure bitfield");
-
-	      if (! informed_about_misalignment)
-		{
-		  informed_about_misalignment = true;
-		  inform (input_location,
-			  "when a volatile object spans multiple type-sized"
-			  " locations, the compiler must choose between using"
-			  " a single mis-aligned access to preserve the"
-			  " volatility, or using multiple aligned accesses"
-			  " to avoid runtime faults; this code may fail at"
-			  " runtime if the hardware does not allow this"
-			  " access");
-		}
-	    }
-	  bit_offset = bitnum - bitnum % BITS_PER_UNIT;
-	}
-      op0 = adjust_bitfield_address (op0, mode, bit_offset / BITS_PER_UNIT);
-      bitnum -= bit_offset;
+      op0 = narrow_bit_field_mem (op0, mode, bitsize, bitnum, &bitnum);
     }
 
   mode = GET_MODE (op0);
@@ -1939,7 +1918,7 @@ extract_split_bit_field (rtx op0, unsign
 	 whose meaning is determined by BYTES_PER_UNIT.
 	 OFFSET is in UNITs, and UNIT is in bits.  */
       part = extract_fixed_bit_field (word_mode, word, thissize,
-				      offset * unit + thispos, 0, 1, false);
+				      offset * unit + thispos, 0, 1);
       bitsdone += thissize;
 
       /* Shift this part into place for the result.  */
Index: gcc/expr.c
===================================================================
--- gcc/expr.c	(revision 199963)
+++ gcc/expr.c	(working copy)
@@ -1704,7 +1704,7 @@ emit_group_load_1 (rtx *tmps, rtx dst, r
 		  && (!REG_P (tmps[i]) || GET_MODE (tmps[i]) != mode))
 		tmps[i] = extract_bit_field (tmps[i], bytelen * BITS_PER_UNIT,
 					     (bytepos % slen0) * BITS_PER_UNIT,
-					     1, false, NULL_RTX, mode, mode);
+					     1, NULL_RTX, mode, mode);
 	    }
 	  else
 	    {
@@ -1714,7 +1714,7 @@ emit_group_load_1 (rtx *tmps, rtx dst, r
 	      mem = assign_stack_temp (GET_MODE (src), slen);
 	      emit_move_insn (mem, src);
 	      tmps[i] = extract_bit_field (mem, bytelen * BITS_PER_UNIT,
-					   0, 1, false, NULL_RTX, mode, mode);
+					   0, 1, NULL_RTX, mode, mode);
 	    }
 	}
       /* FIXME: A SIMD parallel will eventually lead to a subreg of a
@@ -1755,7 +1755,7 @@ emit_group_load_1 (rtx *tmps, rtx dst, r
 	tmps[i] = src;
       else
 	tmps[i] = extract_bit_field (src, bytelen * BITS_PER_UNIT,
-				     bytepos * BITS_PER_UNIT, 1, false, NULL_RTX,
+				     bytepos * BITS_PER_UNIT, 1, NULL_RTX,
 				     mode, mode);
 
       if (shift)
@@ -2198,7 +2198,7 @@ copy_blkmode_from_reg (rtx target, rtx s
 	 bitpos for the destination store (left justified).  */
       store_bit_field (dst, bitsize, bitpos % BITS_PER_WORD, 0, 0, copy_mode,
 		       extract_bit_field (src, bitsize,
-					  xbitpos % BITS_PER_WORD, 1, false,
+					  xbitpos % BITS_PER_WORD, 1,
 					  NULL_RTX, copy_mode, copy_mode));
     }
 }
@@ -2275,7 +2275,7 @@ copy_blkmode_to_reg (enum machine_mode m
       store_bit_field (dst_word, bitsize, xbitpos % BITS_PER_WORD,
 		       0, 0, word_mode,
 		       extract_bit_field (src_word, bitsize,
-					  bitpos % BITS_PER_WORD, 1, false,
+					  bitpos % BITS_PER_WORD, 1,
 					  NULL_RTX, word_mode, word_mode));
     }
 
@@ -3020,7 +3020,7 @@ read_complex_part (rtx cplx, bool imag_p
     }
 
   return extract_bit_field (cplx, ibitsize, imag_p ? ibitsize : 0,
-			    true, false, NULL_RTX, imode, imode);
+			    true, NULL_RTX, imode, imode);
 }
 
 /* A subroutine of emit_move_insn_1.  Yet another lowpart generator.
@@ -4508,6 +4508,16 @@ get_bit_range (unsigned HOST_WIDE_INT *b
 
   gcc_assert (TREE_CODE (exp) == COMPONENT_REF);
 
+  /* If -fstrict-volatile-bitfields was given and this is a volatile
+     access, then we must ignore any DECL_BIT_FIELD_REPRESENTATIVE and
+     do the access in the mode of the field.  */
+  if (TREE_THIS_VOLATILE (exp)
+      && flag_strict_volatile_bitfields > 0)
+    {
+      *bitstart = *bitend = 0;
+      return;
+    }
+
   field = TREE_OPERAND (exp, 1);
   repr = DECL_BIT_FIELD_REPRESENTATIVE (field);
   /* If we do not have a DECL_BIT_FIELD_REPRESENTATIVE there is no
@@ -6502,7 +6512,7 @@ store_field (rtx target, HOST_WIDE_INT b
 	      temp_target = gen_reg_rtx (mode);
 	      temp_target
 	        = extract_bit_field (temp, size * BITS_PER_UNIT, 0, 1,
-				     false, temp_target, mode, mode);
+				     temp_target, mode, mode);
 	      temp = temp_target;
 	    }
 	}
@@ -9695,8 +9705,8 @@ expand_expr_real_1 (tree exp, rtx target
 	    else if (SLOW_UNALIGNED_ACCESS (mode, align))
 	      temp = extract_bit_field (temp, GET_MODE_BITSIZE (mode),
 					0, TYPE_UNSIGNED (TREE_TYPE (exp)),
-					true, (modifier == EXPAND_STACK_PARM
-					       ? NULL_RTX : target),
+					(modifier == EXPAND_STACK_PARM
+					 ? NULL_RTX : target),
 					mode, mode);
 	  }
 	return temp;
@@ -9890,7 +9900,6 @@ expand_expr_real_1 (tree exp, rtx target
 	HOST_WIDE_INT bitsize, bitpos;
 	tree offset;
 	int volatilep = 0, must_force_mem;
-	bool packedp = false;
 	tree tem = get_inner_reference (exp, &bitsize, &bitpos, &offset,
 					&mode1, &unsignedp, &volatilep, true);
 	rtx orig_op0, memloc;
@@ -9901,11 +9910,6 @@ expand_expr_real_1 (tree exp, rtx target
 	   infinitely recurse.  */
 	gcc_assert (tem != exp);
 
-	if (TYPE_PACKED (TREE_TYPE (TREE_OPERAND (exp, 0)))
-	    || (TREE_CODE (TREE_OPERAND (exp, 1)) == FIELD_DECL
-		&& DECL_PACKED (TREE_OPERAND (exp, 1))))
-	  packedp = true;
-
 	/* If TEM's type is a union of variable size, pass TARGET to the inner
 	   computation, since it will need a temporary and TARGET is known
 	   to have to do.  This occurs in unchecked conversion in Ada.  */
@@ -10133,7 +10137,7 @@ expand_expr_real_1 (tree exp, rtx target
 	    if (MEM_P (op0) && REG_P (XEXP (op0, 0)))
 	      mark_reg_pointer (XEXP (op0, 0), MEM_ALIGN (op0));
 
-	    op0 = extract_bit_field (op0, bitsize, bitpos, unsignedp, packedp,
+	    op0 = extract_bit_field (op0, bitsize, bitpos, unsignedp,
 				     (modifier == EXPAND_STACK_PARM
 				      ? NULL_RTX : target),
 				     ext_mode, ext_mode);
Index: gcc/calls.c
===================================================================
--- gcc/calls.c	(revision 199963)
+++ gcc/calls.c	(working copy)
@@ -1026,7 +1026,7 @@ store_unaligned_arguments_into_pseudos (
 	    int bitsize = MIN (bytes * BITS_PER_UNIT, BITS_PER_WORD);
 
 	    args[i].aligned_regs[j] = reg;
-	    word = extract_bit_field (word, bitsize, 0, 1, false, NULL_RTX,
+	    word = extract_bit_field (word, bitsize, 0, 1, NULL_RTX,
 				      word_mode, word_mode);
 
 	    /* There is no need to restrict this code to loading items
Index: gcc/config/tilegx/tilegx.c
===================================================================
--- gcc/config/tilegx/tilegx.c	(revision 199963)
+++ gcc/config/tilegx/tilegx.c	(working copy)
@@ -1872,7 +1872,7 @@ tilegx_expand_unaligned_load (rtx dest_r
       rtx extracted =
 	extract_bit_field (gen_lowpart (DImode, wide_result),
 			   bitsize, bit_offset % BITS_PER_UNIT,
-			   !sign, false, gen_lowpart (DImode, dest_reg),
+			   !sign, gen_lowpart (DImode, dest_reg),
 			   DImode, DImode);
 
       if (extracted != dest_reg)
Index: gcc/config/tilepro/tilepro.c
===================================================================
--- gcc/config/tilepro/tilepro.c	(revision 199963)
+++ gcc/config/tilepro/tilepro.c	(working copy)
@@ -1676,7 +1676,7 @@ tilepro_expand_unaligned_load (rtx dest_
       rtx extracted =
 	extract_bit_field (gen_lowpart (SImode, wide_result),
 			   bitsize, bit_offset % BITS_PER_UNIT,
-			   !sign, false, gen_lowpart (SImode, dest_reg),
+			   !sign, gen_lowpart (SImode, dest_reg),
 			   SImode, SImode);
 
       if (extracted != dest_reg)
Index: gcc/doc/invoke.texi
===================================================================
--- gcc/doc/invoke.texi	(revision 199963)
+++ gcc/doc/invoke.texi	(working copy)
@@ -20887,16 +20887,16 @@ instruction, even though that accesses b
 any portion of the bit-field, or memory-mapped registers unrelated to
 the one being updated.
 
-If the target requires strict alignment, and honoring the field
-type would require violating this alignment, a warning is issued.
-If the field has @code{packed} attribute, the access is done without
-honoring the field type.  If the field doesn't have @code{packed}
-attribute, the access is done honoring the field type.  In both cases,
-GCC assumes that the user knows something about the target hardware
-that it is unaware of.
+In some cases, such as when the @code{packed} attribute is applied to a 
+structure field, it may not be possible to access the field with a single
+read or write that is correctly aligned for the target machine.  In this
+case GCC falls back to generating multiple accesses rather than code that 
+will fault or truncate the result at run time, regardless of whether
+@option{-fstrict-volatile-bitfields} is in effect.
 
 The default value of this option is determined by the application binary
-interface for the target processor.
+interface for the target processor.  In particular, on ARM targets using
+AAPCS, @option{-fstrict-volatile-bitfields} is the default.
 
 @item -fsync-libcalls
 @opindex fsync-libcalls
Index: gcc/testsuite/gcc.dg/pr23623.c
===================================================================
--- gcc/testsuite/gcc.dg/pr23623.c	(revision 0)
+++ gcc/testsuite/gcc.dg/pr23623.c	(revision 0)
@@ -0,0 +1,45 @@
+/* { dg-do compile } */
+/* { dg-options "-fstrict-volatile-bitfields -fdump-rtl-final" } */
+
+/* With -fstrict-volatile-bitfields, the volatile accesses to bf2.b
+   and bf3.b must do unsigned int reads/writes.  The non-volatile
+   accesses to bf1.b are not so constrained.  */
+
+extern struct
+{
+  unsigned int b : 1;
+} bf1;
+
+extern volatile struct
+{
+  unsigned int b : 1;
+} bf2;
+
+extern struct
+{
+  volatile unsigned int b : 1;
+} bf3;
+
+void writeb(void)
+{
+  bf1.b = 1;
+  bf2.b = 1;	/* volatile read + volatile write */
+  bf3.b = 1;	/* volatile read + volatile write */
+}
+
+extern unsigned int x1, x2, x3;
+
+void readb(void)
+{
+  x1 = bf1.b;
+  x2 = bf2.b;   /* volatile write */
+  x3 = bf3.b;   /* volatile write */
+}
+
+/* There should be 6 volatile MEMs total, but scan-rtl-dump-times counts
+   the number of match variables and not the number of matches.  Since
+   the parenthesized subexpression in the regexp introduces an extra match
+   variable, we need to give a count of 12 instead of 6 here.  */
+/* { dg-final { scan-rtl-dump-times "mem/v(/.)*:SI" 12 "final" } } */
+/* { dg-final { cleanup-rtl-dump "final" } } */
+
Index: gcc/testsuite/gcc.dg/pr48784-1.c
===================================================================
--- gcc/testsuite/gcc.dg/pr48784-1.c	(revision 0)
+++ gcc/testsuite/gcc.dg/pr48784-1.c	(revision 0)
@@ -0,0 +1,18 @@
+/* { dg-do run } */
+/* { dg-options "-fstrict-volatile-bitfields" } */
+
+#include <stdlib.h>
+
+#pragma pack(1)
+volatile struct S0 {
+   signed a : 7;
+   unsigned b : 28;  /* b can't be fetched with an aligned 32-bit access, */
+                     /* but it certainly can be fetched with an unaligned access */
+} g = {0,0xfffffff};
+
+int main() {
+  unsigned b = g.b;
+  if (b != 0xfffffff)
+    abort ();
+  return 0;
+}
Index: gcc/testsuite/gcc.dg/pr48784-2.c
===================================================================
--- gcc/testsuite/gcc.dg/pr48784-2.c	(revision 0)
+++ gcc/testsuite/gcc.dg/pr48784-2.c	(revision 0)
@@ -0,0 +1,18 @@
+/* { dg-do run } */
+/* { dg-options "-fno-strict-volatile-bitfields" } */
+
+#include <stdlib.h>
+
+#pragma pack(1)
+volatile struct S0 {
+   signed a : 7;
+   unsigned b : 28;  /* b can't be fetched with an aligned 32-bit access, */
+                     /* but it certainly can be fetched with an unaligned access */
+} g = {0,0xfffffff};
+
+int main() {
+  unsigned b = g.b;
+  if (b != 0xfffffff)
+    abort ();
+  return 0;
+}
Index: gcc/testsuite/gcc.dg/pr56341-1.c
===================================================================
--- gcc/testsuite/gcc.dg/pr56341-1.c	(revision 0)
+++ gcc/testsuite/gcc.dg/pr56341-1.c	(revision 0)
@@ -0,0 +1,41 @@
+/* { dg-do run } */
+/* { dg-options "-fstrict-volatile-bitfields" } */
+
+#include <stdlib.h>
+#include <stdio.h>
+
+struct test0
+{
+  unsigned char b1[2];
+} __attribute__((packed, aligned(2)));
+
+struct test1
+{
+  volatile unsigned long a1;
+  unsigned char b1[4];
+} __attribute__((packed, aligned(2)));
+
+struct test2
+{
+  struct test0 t0;
+  struct test1 t1;
+  struct test0 t2;
+} __attribute__((packed, aligned(2)));
+
+struct test2 xx;
+struct test2 *x1 = &xx;
+
+#define MAGIC 0x12345678
+
+void test0 (struct test2* x1)
+{
+  x1->t1.a1 = MAGIC;
+}
+
+int main()
+{
+  test0 (x1);
+  if (xx.t1.a1 != MAGIC)
+    abort ();
+  return 0;
+}
Index: gcc/testsuite/gcc.dg/pr56341-2.c
===================================================================
--- gcc/testsuite/gcc.dg/pr56341-2.c	(revision 0)
+++ gcc/testsuite/gcc.dg/pr56341-2.c	(revision 0)
@@ -0,0 +1,41 @@
+/* { dg-do run } */
+/* { dg-options "-fno-strict-volatile-bitfields" } */
+
+#include <stdlib.h>
+#include <stdio.h>
+
+struct test0
+{
+  unsigned char b1[2];
+} __attribute__((packed, aligned(2)));
+
+struct test1
+{
+  volatile unsigned long a1;
+  unsigned char b1[4];
+} __attribute__((packed, aligned(2)));
+
+struct test2
+{
+  struct test0 t0;
+  struct test1 t1;
+  struct test0 t2;
+} __attribute__((packed, aligned(2)));
+
+struct test2 xx;
+struct test2 *x1 = &xx;
+
+#define MAGIC 0x12345678
+
+void test0 (struct test2* x1)
+{
+  x1->t1.a1 = MAGIC;
+}
+
+int main()
+{
+  test0 (x1);
+  if (xx.t1.a1 != MAGIC)
+    abort ();
+  return 0;
+}
Index: gcc/testsuite/gcc.dg/pr56997-1.c
===================================================================
--- gcc/testsuite/gcc.dg/pr56997-1.c	(revision 0)
+++ gcc/testsuite/gcc.dg/pr56997-1.c	(revision 0)
@@ -0,0 +1,46 @@
+/* Test volatile access to unaligned field.  */
+/* { dg-do run } */
+/* { dg-options "-fstrict-volatile-bitfields" } */
+
+#include <string.h>
+#include <stdlib.h>
+#include <stdint.h>
+
+#define test_type uint16_t
+#define MAGIC 0x102u
+
+typedef struct s{
+ unsigned char Prefix;
+ test_type Type;
+}__attribute((__packed__)) ss;
+
+volatile ss v;
+ss g;
+
+void __attribute__((noinline))
+foo (test_type u)
+{
+  v.Type = u;
+}
+
+test_type __attribute__((noinline))
+bar (void)
+{
+  return v.Type;
+}
+
+int main()
+{
+  test_type temp;
+  foo(MAGIC);
+  memcpy(&g, (void *)&v, sizeof(g));
+  if (g.Type != MAGIC)
+    abort ();
+
+  g.Type = MAGIC;
+  memcpy((void *)&v, &g, sizeof(v));
+  temp = bar();
+  if (temp != MAGIC)
+    abort ();
+  return 0;
+}
Index: gcc/testsuite/gcc.dg/pr56997-2.c
===================================================================
--- gcc/testsuite/gcc.dg/pr56997-2.c	(revision 0)
+++ gcc/testsuite/gcc.dg/pr56997-2.c	(revision 0)
@@ -0,0 +1,46 @@
+/* Test volatile access to unaligned field.  */
+/* { dg-do run } */
+/* { dg-options "-fstrict-volatile-bitfields" } */
+
+#include <string.h>
+#include <stdlib.h>
+#include <stdint.h>
+
+#define test_type uint32_t
+#define MAGIC 0x1020304u
+
+typedef struct s{
+ unsigned char Prefix;
+ test_type Type;
+}__attribute((__packed__)) ss;
+
+volatile ss v;
+ss g;
+
+void __attribute__((noinline))
+foo (test_type u)
+{
+  v.Type = u;
+}
+
+test_type __attribute__((noinline))
+bar (void)
+{
+  return v.Type;
+}
+
+int main()
+{
+  test_type temp;
+  foo(MAGIC);
+  memcpy(&g, (void *)&v, sizeof(g));
+  if (g.Type != MAGIC)
+    abort ();
+
+  g.Type = MAGIC;
+  memcpy((void *)&v, &g, sizeof(v));
+  temp = bar();
+  if (temp != MAGIC)
+    abort ();
+  return 0;
+}
Index: gcc/testsuite/gcc.dg/pr56997-3.c
===================================================================
--- gcc/testsuite/gcc.dg/pr56997-3.c	(revision 0)
+++ gcc/testsuite/gcc.dg/pr56997-3.c	(revision 0)
@@ -0,0 +1,46 @@
+/* Test volatile access to unaligned field.  */
+/* { dg-do run } */
+/* { dg-options "-fstrict-volatile-bitfields" } */
+
+#include <string.h>
+#include <stdlib.h>
+#include <stdint.h>
+
+#define test_type uint64_t
+#define MAGIC 0x102030405060708ul
+
+typedef struct s{
+ unsigned char Prefix;
+ test_type Type;
+}__attribute((__packed__)) ss;
+
+volatile ss v;
+ss g;
+
+void __attribute__((noinline))
+foo (test_type u)
+{
+  v.Type = u;
+}
+
+test_type __attribute__((noinline))
+bar (void)
+{
+  return v.Type;
+}
+
+int main()
+{
+  test_type temp;
+  foo(MAGIC);
+  memcpy(&g, (void *)&v, sizeof(g));
+  if (g.Type != MAGIC)
+    abort ();
+
+  g.Type = MAGIC;
+  memcpy((void *)&v, &g, sizeof(v));
+  temp = bar();
+  if (temp != MAGIC)
+    abort ();
+  return 0;
+}
Index: gcc/testsuite/gcc.dg/volatile-bitfields-3.c
===================================================================
--- gcc/testsuite/gcc.dg/volatile-bitfields-3.c	(revision 0)
+++ gcc/testsuite/gcc.dg/volatile-bitfields-3.c	(revision 0)
@@ -0,0 +1,42 @@
+/* { dg-do compile } */
+/* { dg-options "-fstrict-volatile-bitfields -fdump-rtl-final" } */
+
+/* On x86_64, this test case used to incorrectly generate SImode
+   accesses to the field y instead of HImode.  */
+
+#include <stdint.h>
+
+struct x
+{
+  int32_t x: 8;
+  char : 0;
+  int16_t y : 8;
+} z;
+
+int foo1 (volatile struct x *p)
+{
+  return p->x;  /* SImode read */
+}
+
+void foo2 (volatile struct x *p, int i)
+{
+  p->x = i;	/* SImode read + write */
+}
+
+int bar1 (volatile struct x *p)
+{
+  return p->y;  /* HImode read */
+}
+
+void bar2 (volatile struct x *p, int i)
+{
+  p->y = i;	/* HImode read + write */
+}
+
+/* There should be 3 SImode and 3 HImode volatile MEMs, but
+   scan-rtl-dump-times counts the number of match variables and not the
+   number of matches.  Since the parenthesized subexpression in the regexp
+   introduces an extra match variable, we need to double the count.  */
+/* { dg-final { scan-rtl-dump-times "mem/v(/.)*:SI" 6 "final" } } */
+/* { dg-final { scan-rtl-dump-times "mem/v(/.)*:HI" 6 "final" } } */
+/* { dg-final { cleanup-rtl-dump "final" } } */
