This is the mail archive of the
gcc@gcc.gnu.org
mailing list for the GCC project.
Re: get_ref_base_and_extent() semantics
On Fri, Dec 12, 2008 at 3:05 PM, Martin Jambor <mjambor@suse.cz> wrote:
> Hi,
>
> On Fri, Dec 12, 2008 at 12:26:38PM +0100, Richard Guenther wrote:
>> On Fri, Dec 12, 2008 at 11:59 AM, Jan Hubicka <hubicka@ucw.cz> wrote:
>> >> On Fri, Dec 12, 2008 at 12:29 AM, Martin Jambor <mjambor@suse.cz> wrote:
>> >> > Hi,
>> >> >
>> >> > today I have encountered an unpleasant problem with the function
>> >> > get_ref_base_and_extent() when it claimed a known and constant offset
>> >> > for the expression insn_4(D)->u.fld[arg.82_3].rt_rtvec. (arg being a
>> >> > default_def parameter of the function, insn is an rtx). Moreover, it
>> >> > also returned constant size and max_size, all three equal to 64
>> >> > (bits).
>> >> >
>> >> > This is certainly wrong (I believe the function got confused by
>> >> > unions) but after looking at the function and what it did in debugger,
>> >> > I grew unsure what the expected behavior is. The two alternatives I
>> >> > consider possibly correct are returned offset equal to -1 or,
>> >> > alternatively, offset equal to the offset of the array (+ offset of
>> >> > rt_rtvec which is zero) and max_size equal either to the size of the
>> >> > array or -1 if it is unknown.
>> >> >
>> >> > At the moment, the function never returns offset equal to -1, the
>> >> > comment above it does not even mention such possibility and, from the
>> >> > limited research I did, its callers do not expect and check for it.
>> >> > However, at the same time, the special handling of non-constants in
>> >> > array indices after the label "done" does not trigger in this
>> >> > particular case (here I suspect the unions come to play because it is
>> >> > the last part of the conjunction that evaluates to false).
>> >> >
>> >> > Which (or what else) is the correct semantics of the function? Both
>> >> > make sense to me (I would prefer the former but I suspect other users
>> >> > rely on the latter). What would be the correct fix, when a union
>> >> > field is itself a record ending with a variable-length array? How
>> >> > much would I pessimize things if I just returned -1 max_size if both a
>> >> > non-constant index and a union was encountered?
>> >> >
>> >> > Or did I miss something else?
>> >>
>> >> The function is supposed to return -1 for max_size in case the access
>> >> accesses a variable array index of an array of unknown size.
>> >>
>> >> insn_4(D)->u.fld[arg.82_3].rt_rtvec accesses struct rtvec_def -- in which
>> >> case the access _does_ have known size (an int plus one rtx pointer).
>> >> But maybe I am missing context here?
>> >
>> > Hmm, isn't this the usual problem with arrays running past the end of
>> > sturcture?
>>
>> No, get_ref_base_and_extent handles those fine (it is supposed to set
>> max_size to -1 in this case).
>>
>> > I guess we need to special case these for needs of SRA as you can't
>> > easilly know size of the array even if it is explicitely written in the
>> > program.
>>
>> Of course, just bail out if max_size is -1. The problem Martin is seeing
>> is that it isn't -1 for some reason. Martin, are you asking
>> get_ref_base_and_extent
>> for only a part of an access?
>
> I don't know what you mean by a partial access, the gimple statement
> this expression comes from is:
>
> D.10261_5 = insn_4(D)->u.fld[arg.82_3].rt_rtvec;
>
> I simply grabbed the rhs and fed it to get_ref_base_and_extent().
>
> I agree the function should return -1 max_size in the case of a
> variable indexing into an array that might have unknown size. The
> problem is how the function identifies such arrays. The idea is
> simple, these are arrays at the end of the whole aggregate and thus
> the hitherto calculated offset + max_size give exactly that, the size
> of the whole aggregate.
>
> However, this does not work if we have unions within the structure. A
> simple example would be:
>
> struct a
> {
> union
> {
> struct
> {
> int a;
> int var_array[1];
> } x;
> struct
> {
> int a,b,c,d,e,f,g,h,i,j;
> } y;
> } z;
> } a;
>
> In this particular case, the ofset of var_array + its max_size (the
> size of its element, as the function computes it, this is something I
> also do not really understand) is not equal to the size of the whole
> structure, because the other union component is way larger. Therefore
> the final part of the following conjunction evaluates to false:
>
> if (seen_variable_array_ref
> && maxsize != -1
> && host_integerp (TYPE_SIZE (TREE_TYPE (exp)), 1)
> && bit_offset + maxsize
> == (signed)TREE_INT_CST_LOW (TYPE_SIZE (TREE_TYPE (exp))))
> maxsize = -1;
>
> Note that exp TREE_TYPE (exp) is now the type of the all-encompassing
> most-ouward aggregate. And thus maxsize is not set to -1.
>
> I believe a simple solution would be to ahve a nother flag to be set
> when processing component_ref with a union as its zeroth argument and
> always return maxsize of -1 if both seen_variable_array_ref and this
> flag is set. It is not optimal because the union field might be a
> structure with an array in it but not as the last field and so it
> cannot have variable size but well... ...if you think this case really
> matters I can try to handle it as well.
Hm, yeah - we probably need to handle all the weird cases somehow. But
we can in this case do even better and simply enlarge the max-size of the
fld[] access to its _union_ size and then come back and the end and see
the access is touching the last bit of the struct. I see you posted a patch -
lemme have a look there.
Richard.
> Thanks,
>
> Martin
>