This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



[Patch ARM] Fix PR54252


Hi,

This is a fix for a pretty serious regression in GCC 4.7 onwards where
GCC is likely to emit wrong alignment specifiers for the neon
intrinsics. These specifiers appear to be much larger than the
alignment specifiers allowed by the architecture for the memory sizes
allowed by the instructions.


The part of the backend which was emitting the alignment specifiers
wasn't really wrong in what it was doing; it's just that the MEM_SIZE
information for the memory being accessed was wrong when
neon_dereference_pointer constructed these MEM_REFs in the first place.


There are 2 fundamental problems in the way in which the builtin
expanders and neon_dereference_pointer construct these memory references.

The first problem is that neon_dereference_pointer, in the case that
reg_mode and mem_mode are identical, doesn't take into account the
number of bytes that elem_type actually uses. The existing logic in
neon_dereference_pointer essentially specifies that the memory accessed
by the intrinsic is an array of type elem_type with number of elements
equal to the number of elements in the vector.
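
To make the size arithmetic concrete, here is a small standalone model
of the element count. It is only a sketch and not the code in arm.c:
the function names are made up for illustration, and it assumes (going
by the ChangeLog wording "Adjust nelems by element size") that the
unadjusted count is taken straight from the byte size of the register
block.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified model, not GCC source.  Bytes described by an array type
   of NELEMS elements, each ELEM_SIZE bytes wide.  */
static size_t
access_bytes (size_t nelems, size_t elem_size)
{
  return nelems * elem_size;
}

/* Assumed pre-fix behaviour: the count ignores the element size and is
   simply the byte size of the register block.  */
static size_t
nelems_unadjusted (size_t reg_block_bytes)
{
  return reg_block_bytes;
}

/* Count adjusted by the element size, so that the array type covers
   exactly the register block.  */
static size_t
nelems_adjusted (size_t reg_block_bytes, size_t elem_size)
{
  assert (elem_size != 0 && reg_block_bytes % elem_size == 0);
  return reg_block_bytes / elem_size;
}
```

Under these assumptions a single 16-byte Q register of 4-byte floats is
described as 16 * 4 = 64 bytes with the unadjusted count, and as
4 * 4 = 16 bytes once the count is divided by the element size. An
oversized MEM_SIZE of that kind is what would let an alignment
specifier larger than the access itself be chosen.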


The second problem, and something more fundamental in
neon_dereference_pointer, is that it attempts to figure out the
underlying type of the element being accessed by looking at the actual
parameter for the load or the store. However, this is not guaranteed
to work always, as the underlying type could itself be an array type,
causing the logic in neon_dereference_pointer to end up constructing a
multi-dimensional array of the basic type. The way I spotted this was
to construct a testcase from the original PR but using the
vld3q_lane_f32 style intrinsics. In these cases the memory reference
produced appeared to be loading a 2 dimensional array of 6 float values
instead of just 3 float values. Ouch!
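
The effect can be reproduced with plain C array types. This is only an
illustration and not the testcase from the PR: the float[2] pointee
type for the actual argument is a hypothetical choice, and the typedef
and function names are made up. The element count of 3 is the one a
vld3q_lane_f32 style lane load uses.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical: the pointee type of the actual argument is itself an
   array, e.g. the caller passes a pointer to float[2].  */
typedef float actual_elem[2];

/* The pointee type of the formal parameter is the basic element type.  */
typedef float formal_elem;

/* A vld3q_lane_f32 style lane load touches 3 elements.  */
enum { LANE_NELEMS = 3 };

/* Using the formal parameter gives the intended 3 floats.  */
static_assert (sizeof (formal_elem[LANE_NELEMS]) == 3 * sizeof (float),
	       "formal parameter type describes 3 floats");

/* Size of the access when elem_type comes from the actual argument:
   a 2 dimensional array float[3][2].  */
static size_t
bytes_from_actual (void)
{
  return sizeof (actual_elem[LANE_NELEMS]);
}

/* Size of the access when elem_type comes from the formal parameter:
   a 1 dimensional array float[3].  */
static size_t
bytes_from_formal (void)
{
  return sizeof (formal_elem[LANE_NELEMS]);
}
```

With the array-typed pointee the described access is 6 floats wide,
while the formal parameter type gives the 3 floats that are really
loaded.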


The correct method ought to be to use the underlying type from the
formal parameter which is what this patch attempts to do.

Tested cross with no regressions on arm-linux-gnueabi with the relevant
configury, tested with a number of handwritten tests and observed that
the sizes of the memory accesses look sane.

Applied on trunk and will wait for a few days before backporting to 4.7 branch.

regards,
Ramana

2012-08-29  Ramana Radhakrishnan  <ramana.radhakrishnan@arm.com>
	    Richard Earnshaw  <richard.earnshaw@arm.com>

	PR target/54252
	* config/arm/arm.c (neon_dereference_pointer): Adjust nelems by
	element size. Use elem_type from the formal parameter. New parameter fcode.
	(neon_expand_args): Adjust call to neon_dereference_pointer.









