Bug 39074 - [4.3 Regression] PTA constraint processing for *x = y is wrong
Summary: [4.3 Regression] PTA constraint processing for *x = y is wrong
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: tree-optimization (show other bugs)
Version: 4.4.0
: P2 major
Target Milestone: 4.4.0
Assignee: Richard Biener
URL:
Keywords: alias, wrong-code
Depends on:
Blocks:
 
Reported: 2009-02-02 13:47 UTC by Richard Biener
Modified: 2009-06-17 12:34 UTC (History)
3 users (show)

See Also:
Host:
Target:
Build:
Known to work: 4.0.4 4.4.0
Known to fail: 4.1.3 4.2.4 4.3.3
Last reconfirmed: 2009-02-02 14:03:04


Attachments
patch to warn about uninitialized pointer dereferences (1.04 KB, text/plain)
2009-02-02 14:07 UTC, Richard Biener
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Richard Biener 2009-02-02 13:47:12 UTC
if the solution of x includes ANYTHING we fail to add the solution of y to all
variables.  Testcase (fails on a-i branch):

int i;
void __attribute__((noinline))
foo(long b, long q)
{
  int *y;
  int **a = &y, **x;
  int ***p;
  if (b)
    p = (int ***)q;
  else
    p = &a;
  x = *p;
  *x = &i;
  *y = 0;
}
extern void abort (void);
int main()
{
  i = 1;
  foo (0, 0);
  if (i != 0)
    abort ();
  return 0;
}

I guess we should warn for dereferenced pointers whos points-to set is empty
with "warning: dereferencing uninitialized pointer".
Comment 1 Richard Biener 2009-02-02 14:03:04 UTC
We could for example warn for

/* { dg-do compile } */
/* { dg-options "-O -Wuninitialized" } */

int i;
int __attribute((const,noinline))
foo (int **p)
{
  return i;
}
int bar(int *q)
{
  int *p;
  *q = 0;
  int j = foo(&p);
  return *p + j;  /* { dg-warning "dereferencing uninitialized" } */
}
Comment 2 Richard Biener 2009-02-02 14:07:18 UTC
Created attachment 17227 [details]
patch to warn about uninitialized pointer dereferences

This patch causes a warning for both testcases, the program and the PTA bug.
Comment 3 Daniel Berlin 2009-02-02 19:42:40 UTC
Eyeballing this, I think y should not end up empty anyway.

Shouldn't it have i in it's points-to set?
Comment 4 Richard Biener 2009-02-03 09:17:25 UTC
Yes, but as the store to y is via *x and x points to { ANYTHING } (via the
non-pointer (int ***)q) only (as x already includes ANYTHING we do not add
a for the second constraint), so for *x = &i we fail to add a to y.

For reference, here are the constraints:

a = &y
p_4 = &ANYTHING
p_1 = p_4
p_1 = &a
x_6 = *p_1
derefaddrtmp.9 = &i
*x_6 = derefaddrtmp.9
y.0_7 = y

and the solutions:

a = { y }
y = same as y.0_7
p_4 = { ANYTHING }
p_1 = { ANYTHING a }
x_6 = { ANYTHING }
i = { }
derefaddrtmp.9 = { i }
y.0_7 = { }

while correct would be if everything would point to at least i (through
the effective *ANYTHING = &i constraint)
Comment 5 Daniel Berlin 2009-02-03 14:16:41 UTC
Subject: Re:  PTA constraint processing for *x = 
	y is wrong

There used to be a *ANYTHING = ANYTHING constraint + ANYTHING
containing all the variables pointing to ANYTHING that would have
taken care of this.
You certainly don't want to dynamically add all variables at solving
time yourself, since that can't be optimized.

On Tue, Feb 3, 2009 at 4:17 AM, rguenth at gcc dot gnu dot org
<gcc-bugzilla@gcc.gnu.org> wrote:
>
>
> ------- Comment #4 from rguenth at gcc dot gnu dot org  2009-02-03 09:17 -------
> Yes, but as the store to y is via *x and x points to { ANYTHING } (via the
> non-pointer (int ***)q) only (as x already includes ANYTHING we do not add
> a for the second constraint), so for *x = &i we fail to add a to y.
>
> For reference, here are the constraints:
>
> a = &y
> p_4 = &ANYTHING
> p_1 = p_4
> p_1 = &a
> x_6 = *p_1
> derefaddrtmp.9 = &i
> *x_6 = derefaddrtmp.9
> y.0_7 = y
>
> and the solutions:
>
> a = { y }
> y = same as y.0_7
> p_4 = { ANYTHING }
> p_1 = { ANYTHING a }
> x_6 = { ANYTHING }
> i = { }
> derefaddrtmp.9 = { i }
> y.0_7 = { }
>
> while correct would be if everything would point to at least i (through
> the effective *ANYTHING = &i constraint)
>
>
> --
>
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39074
>
> ------- You are receiving this mail because: -------
> You are on the CC list for the bug, or are watching someone who is.
>
Comment 6 rguenther@suse.de 2009-02-03 14:24:38 UTC
Subject: Re:  PTA constraint processing for *x
 = y is wrong

On Tue, 3 Feb 2009, dberlin at dberlin dot org wrote:

> Subject: Re:  PTA constraint processing for *x = 
>         y is wrong
> 
> There used to be a *ANYTHING = ANYTHING constraint + ANYTHING
> containing all the variables pointing to ANYTHING that would have
> taken care of this.
> You certainly don't want to dynamically add all variables at solving
> time yourself, since that can't be optimized.

This is the reason it "works" for ESCAPED, there we have an
*ESCAPED = ESCAPED constraint.  It doesn't work for CALLUSED though,
I have a simple fix (CALLUSED is not big usually, so just not using
it as a placeholder fixes the issue here).

For the ANYTHING problem I have just dealt with it in do_ds_constraint
(I'll post an updated patch soon after testing finished).

Richard.
Comment 7 Daniel Berlin 2009-02-04 00:29:46 UTC
Subject: Re:  PTA constraint processing for *x = 
	y is wrong

On Tue, Feb 3, 2009 at 9:24 AM, rguenther at suse dot de
<gcc-bugzilla@gcc.gnu.org> wrote:
>
>
> ------- Comment #6 from rguenther at suse dot de  2009-02-03 14:24 -------
> Subject: Re:  PTA constraint processing for *x
>  = y is wrong
>
> On Tue, 3 Feb 2009, dberlin at dberlin dot org wrote:
>
>> Subject: Re:  PTA constraint processing for *x =
>>         y is wrong
>>
>> There used to be a *ANYTHING = ANYTHING constraint + ANYTHING
>> containing all the variables pointing to ANYTHING that would have
>> taken care of this.
>> You certainly don't want to dynamically add all variables at solving
>> time yourself, since that can't be optimized.
>
> This is the reason it "works" for ESCAPED, there we have an
> *ESCAPED = ESCAPED constraint.  It doesn't work for CALLUSED though,
> I have a simple fix (CALLUSED is not big usually, so just not using
> it as a placeholder fixes the issue here).
>
> For the ANYTHING problem I have just dealt with it in do_ds_constraint
> (I'll post an updated patch soon after testing finished).

My onl concern is practicality.
The last time I did this solely at solving time it was ridiculously
slow on large cases, since the solver is much better at difference
propagation.
Comment 8 rguenther@suse.de 2009-02-04 09:35:24 UTC
Subject: Re:  PTA constraint processing for *x
 = y is wrong

On Wed, 4 Feb 2009, dberlin at dberlin dot org wrote:

> ------- Comment #7 from dberlin at gcc dot gnu dot org  2009-02-04 00:29 -------
> Subject: Re:  PTA constraint processing for *x = 
>         y is wrong
> 
> On Tue, Feb 3, 2009 at 9:24 AM, rguenther at suse dot de
> <gcc-bugzilla@gcc.gnu.org> wrote:
> >
> >
> > ------- Comment #6 from rguenther at suse dot de  2009-02-03 14:24 -------
> > Subject: Re:  PTA constraint processing for *x
> >  = y is wrong
> >
> > On Tue, 3 Feb 2009, dberlin at dberlin dot org wrote:
> >
> >> Subject: Re:  PTA constraint processing for *x =
> >>         y is wrong
> >>
> >> There used to be a *ANYTHING = ANYTHING constraint + ANYTHING
> >> containing all the variables pointing to ANYTHING that would have
> >> taken care of this.
> >> You certainly don't want to dynamically add all variables at solving
> >> time yourself, since that can't be optimized.
> >
> > This is the reason it "works" for ESCAPED, there we have an
> > *ESCAPED = ESCAPED constraint.  It doesn't work for CALLUSED though,
> > I have a simple fix (CALLUSED is not big usually, so just not using
> > it as a placeholder fixes the issue here).
> >
> > For the ANYTHING problem I have just dealt with it in do_ds_constraint
> > (I'll post an updated patch soon after testing finished).
> 
> My onl concern is practicality.
> The last time I did this solely at solving time it was ridiculously
> slow on large cases, since the solver is much better at difference
> propagation.

Do you remember what testcase(s) this was?  I can certainly time
removing the shortcutting against handling *ANYTHING (and I'll try
to come up with a testcase that is not fixed with just removing
the shortcutting).

Richard.
Comment 9 Richard Biener 2009-02-04 12:16:43 UTC
Testcase that is not fixed with removing the short-cutting:

int i;
long __attribute__((noinline,const)) bar(int ***p) { return (long)p; }
void __attribute__((noinline))
foo(void)
{
  int *y;
  int **a = &y, **x;
  int ***p;
  long b;
  b = bar(&a);
  p = (int ***)b;
  x = *p;
  *x = &i;
  *y = 0;
}
extern void abort (void);
int main()
{
  i = 1;
  foo ();
  if (i != 0)
    abort ();
  return 0;
}
Comment 10 Richard Biener 2009-02-04 12:26:13 UTC
This one fails on trunk (where we fall back to anything for empty points-to
sets, so just add some unrelated &j and the vops are wrong):

int i;
long __attribute__((noinline,const)) bar(int ***p) { return (long)p; }
void __attribute__((noinline))
foo(void)
{
  int j;
  int *y = &j;
  int **a = &y, **x;
  int ***p;
  long b;
  b = bar(&a);
  p = (int ***)b;
  x = *p;
  *x = &i;
  *y = 0;
}
extern void abort (void);
int main()
{
  i = 1;
  foo ();
  if (i != 0)
    abort ();
  return 0;
}
Comment 11 Richard Biener 2009-02-04 12:31:43 UTC
This one fails also on the branches that have PTA.

int i;
long __attribute__((noinline,const)) bar(int ***p) { return (long)p; }
extern void abort (void);
int main()
{
  int j;
  int *y = &j;
  int **a = &y, **x;
  int ***p;
  long b;
  b = bar(&a);
  p = (int ***)b;
  x = *p;
  *x = &i;
  i = 1;
  *y = 0;
  if (i != 0)
    abort ();
  return 0;
}
Comment 12 Richard Biener 2009-02-04 12:37:29 UTC
Which makes it a regression.
Comment 13 Daniel Berlin 2009-02-04 16:09:21 UTC
Subject: Re:  PTA constraint processing for *x = 
	y is wrong

I don't remember offhand.  At one point during 4.2 we used to compute
the anything set exactly, and it led to massive issues. Of course,
most of those were because the anything set had hundreds or thousands
of SFT's :).

I'm happy to go with your idea for fixing since fixing shortcutting
won't fix it, except for two things:
1. ANYTHING is really limited to all addressable variables (IE address
taken and escaping), instead of all variables.  It was never meant to
represent "completely unknown" (IE user has set pointer to  (char *)
0xdeadbeef).
ISTM the set you union in should be based on CALLUSED and ESCAPED or
something thereof, or at least should be computable with constraints
during solving, and unioned in when it changes.

The way off the top of my head to do this is to simply stop using
&ANYTHING, and use ANYTHING directly, and then have ANYTHING =
CALLUSED and ANYTHING = ESCAPED.

Assuming you hate that idea for some reason
2. It's probably a lot faster to make a bitmap of these and update it,
then union in the bitmap, than to iterate over all varinfos all the
time.

ISTM you are trying to avoid doing #1 by adding all variables, even
though this is going to give you worse than necessary results.

Intel actually iterates points-to solving in order to compute the set
of nonlocal locations without explicitly adding the set everywhere.

See the description of nloc in section 4.1 of  "On the Importance of
Points=To Analysis and Other
Memory Disambiguation Methods For C Programs"

On Wed, Feb 4, 2009 at 4:35 AM, rguenther at suse dot de
<gcc-bugzilla@gcc.gnu.org> wrote:
>
>
> ------- Comment #8 from rguenther at suse dot de  2009-02-04 09:35 -------
> Subject: Re:  PTA constraint processing for *x
>  = y is wrong
>
> On Wed, 4 Feb 2009, dberlin at dberlin dot org wrote:
>
>> ------- Comment #7 from dberlin at gcc dot gnu dot org  2009-02-04 00:29 -------
>> Subject: Re:  PTA constraint processing for *x =
>>         y is wrong
>>
>> On Tue, Feb 3, 2009 at 9:24 AM, rguenther at suse dot de
>> <gcc-bugzilla@gcc.gnu.org> wrote:
>> >
>> >
>> > ------- Comment #6 from rguenther at suse dot de  2009-02-03 14:24 -------
>> > Subject: Re:  PTA constraint processing for *x
>> >  = y is wrong
>> >
>> > On Tue, 3 Feb 2009, dberlin at dberlin dot org wrote:
>> >
>> >> Subject: Re:  PTA constraint processing for *x =
>> >>         y is wrong
>> >>
>> >> There used to be a *ANYTHING = ANYTHING constraint + ANYTHING
>> >> containing all the variables pointing to ANYTHING that would have
>> >> taken care of this.
>> >> You certainly don't want to dynamically add all variables at solving
>> >> time yourself, since that can't be optimized.
>> >
>> > This is the reason it "works" for ESCAPED, there we have an
>> > *ESCAPED = ESCAPED constraint.  It doesn't work for CALLUSED though,
>> > I have a simple fix (CALLUSED is not big usually, so just not using
>> > it as a placeholder fixes the issue here).
>> >
>> > For the ANYTHING problem I have just dealt with it in do_ds_constraint
>> > (I'll post an updated patch soon after testing finished).
>>
>> My onl concern is practicality.
>> The last time I did this solely at solving time it was ridiculously
>> slow on large cases, since the solver is much better at difference
>> propagation.
>
> Do you remember what testcase(s) this was?  I can certainly time
> removing the shortcutting against handling *ANYTHING (and I'll try
> to come up with a testcase that is not fixed with just removing
> the shortcutting).
>
> Richard.
>
>
> --
>
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39074
>
> ------- You are receiving this mail because: -------
> You are on the CC list for the bug, or are watching someone who is.
>
Comment 14 rguenther@suse.de 2009-02-04 16:26:25 UTC
Subject: Re:  [4.2/4.3/4.4 Regression] PTA
 constraint processing for *x = y is wrong

On Wed, 4 Feb 2009, dberlin at dberlin dot org wrote:

> ------- Comment #13 from dberlin at gcc dot gnu dot org  2009-02-04 16:09 -------
> Subject: Re:  PTA constraint processing for *x = 
>         y is wrong
> 
> I don't remember offhand.  At one point during 4.2 we used to compute
> the anything set exactly, and it led to massive issues. Of course,
> most of those were because the anything set had hundreds or thousands
> of SFT's :).
> 
> I'm happy to go with your idea for fixing since fixing shortcutting
> won't fix it, except for two things:
> 1. ANYTHING is really limited to all addressable variables (IE address
> taken and escaping), instead of all variables.  It was never meant to
> represent "completely unknown" (IE user has set pointer to  (char *)
> 0xdeadbeef).

Yes, is there a bitmap / array in the PTA graph that I can iterate
over instead of all vars?

> ISTM the set you union in should be based on CALLUSED and ESCAPED or
> something thereof, or at least should be computable with constraints
> during solving, and unioned in when it changes.

Ah, you mean whenever I see *ANYTHING = x union x into a new
STORE_TO_ANYTHING solution and have an explicit
*ANYTHING = STORE_TO_ANYTHING constraint (which I of course need
to handle properly in do_ds_constraint)?  That may be indeed faster.

> The way off the top of my head to do this is to simply stop using
> &ANYTHING, and use ANYTHING directly, and then have ANYTHING =
> CALLUSED and ANYTHING = ESCAPED.

I don't think CALLUSED or ESCAPED are related here.  You can store
non-addressables into *ANYTHING.

> Assuming you hate that idea for some reason
> 2. It's probably a lot faster to make a bitmap of these and update it,
> then union in the bitmap, than to iterate over all varinfos all the
> time.
> 
> ISTM you are trying to avoid doing #1 by adding all variables, even
> though this is going to give you worse than necessary results.
> 
> Intel actually iterates points-to solving in order to compute the set
> of nonlocal locations without explicitly adding the set everywhere.
> 
> See the description of nloc in section 4.1 of  "On the Importance of
> Points=To Analysis and Other
> Memory Disambiguation Methods For C Programs"

I'll have a look there.

Richard.
Comment 15 Daniel Berlin 2009-02-04 16:37:55 UTC
Subject: Re:  [4.2/4.3/4.4 Regression] PTA 
	constraint processing for *x = y is wrong

On Wed, Feb 4, 2009 at 11:26 AM, rguenther at suse dot de
<gcc-bugzilla@gcc.gnu.org> wrote:
>
>
> ------- Comment #14 from rguenther at suse dot de  2009-02-04 16:26 -------
> Subject: Re:  [4.2/4.3/4.4 Regression] PTA
>  constraint processing for *x = y is wrong
>
> On Wed, 4 Feb 2009, dberlin at dberlin dot org wrote:
>
>> ------- Comment #13 from dberlin at gcc dot gnu dot org  2009-02-04 16:09 -------
>> Subject: Re:  PTA constraint processing for *x =
>>         y is wrong
>>
>> I don't remember offhand.  At one point during 4.2 we used to compute
>> the anything set exactly, and it led to massive issues. Of course,
>> most of those were because the anything set had hundreds or thousands
>> of SFT's :).
>>
>> I'm happy to go with your idea for fixing since fixing shortcutting
>> won't fix it, except for two things:
>> 1. ANYTHING is really limited to all addressable variables (IE address
>> taken and escaping), instead of all variables.  It was never meant to
>> represent "completely unknown" (IE user has set pointer to  (char *)
>> 0xdeadbeef).
>
> Yes, is there a bitmap / array in the PTA graph that I can iterate
> over instead of all vars?
Not right now.

>
>> ISTM the set you union in should be based on CALLUSED and ESCAPED or
>> something thereof, or at least should be computable with constraints
>> during solving, and unioned in when it changes.
>
> Ah, you mean whenever I see *ANYTHING = x union x into a new
> STORE_TO_ANYTHING solution and have an explicit
> *ANYTHING = STORE_TO_ANYTHING constraint (which I of course need
> to handle properly in do_ds_constraint)?  That may be indeed faster.
Something like that.
It is going to be faster than doing it one by one all the time.

>> The way off the top of my head to do this is to simply stop using
>> &ANYTHING, and use ANYTHING directly, and then have ANYTHING =
>> CALLUSED and ANYTHING = ESCAPED.
>
> I don't think CALLUSED or ESCAPED are related here.  You can store
> non-addressables into *ANYTHING.
How?
If they are non-addressable, that implies they are not pointed to.
I think you are going off the rails here :)
If you really want to union ANYTHING into things, the simplest way is
to change from doing:

ANYTHING = &ANYTHING
a = &y
p_4 = &ANYTHING
p_1 = p_4
p_1 = &a
x_6 = *p_1
derefaddrtmp.11 = &i
*x_6 = derefaddrtmp.11
y.0_7 = y

to
ANYTHING = *ANYTHING
ANYTHING = <all pointers pre-built or whatever we decided on>
a = &y
p_4 = ANYTHING
p_1 = p_4
p_1 = &a
x_6 = *p_1
derefaddrtmp.11 = &i
*x_6 = derefaddrtmp.11
y.0_7 = y

Then p_4 will get the entire anything set, and it will propagate
around just like you wanted.
Comment 16 Richard Biener 2009-02-06 09:17:31 UTC
Subject: Bug 39074

Author: rguenth
Date: Fri Feb  6 09:17:19 2009
New Revision: 143983

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=143983
Log:
2009-02-06  Richard Guenther  <rguenther@suse.de>

	PR tree-optimization/39074
	* tree-ssa-structalias.c (storedanything_id, var_storedanything,
	storedanything_tree): New.
	(do_ds_constraint): Simplify ANYTHING shortcutting.  Update
	the STOREDANYTHING solution if the lhs solution contains
	ANYTHING.
	(build_pred_graph): Add edges from STOREDANYTHING to all
	non-direct nodes.
	(get_constraint_for_1): CONSTRUCTOR
	is a zero-initializer.  Generate &NOTHING for it.
	(init_base_vars): Initialize STOREDANYTHING.
	(compute_points_to_sets): Free substitution info after
	building the succ graph.
	(ipa_pta_execute): Likewise.

	* gcc.dg/torture/pr39074.c: New testcase.
	* gcc.dg/torture/pr39074-2.c: Likewise.
	* gcc.dg/torture/pr39074-3.c: Likewise.

Added:
    branches/alias-improvements/gcc/testsuite/gcc.dg/torture/pr39074-2.c
    branches/alias-improvements/gcc/testsuite/gcc.dg/torture/pr39074-3.c
    branches/alias-improvements/gcc/testsuite/gcc.dg/torture/pr39074.c
Modified:
    branches/alias-improvements/gcc/ChangeLog.alias
    branches/alias-improvements/gcc/tree-ssa-structalias.c

Comment 17 Richard Biener 2009-02-19 10:12:39 UTC
Subject: Bug 39074

Author: rguenth
Date: Thu Feb 19 10:12:25 2009
New Revision: 144292

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=144292
Log:
2009-02-19  Richard Guenther  <rguenther@suse.de>

	PR tree-optimization/39207
	PR tree-optimization/39074
	* tree-ssa-structalias.c (storedanything_id, var_storedanything,
	storedanything_tree): New.
	(do_ds_constraint): Simplify ANYTHING shortcutting.  Update
	the STOREDANYTHING solution if the lhs solution contains
	ANYTHING.
	(build_succ_graph): Add edges from STOREDANYTHING to all
	non-direct nodes.
	(init_base_vars): Initialize STOREDANYTHING.
	(compute_points_to_sets): Free substitution info after
	building the succ graph.
	(ipa_pta_execute): Likewise.

	* gcc.dg/torture/pr39074.c: New testcase.
	* gcc.dg/torture/pr39074-2.c: Likewise.
	* gcc.dg/torture/pr39074-3.c: Likewise.

	* tree-ssa-structalias.c (struct variable_info): Add may_have_pointers
	field.
	(do_ds_constraint): Do not add to special var or non-pointer
	field solutions.
	(type_could_have_pointers): Split out from ...
	(could_have_pointers): ... here.  For arrays use the element type.
	(create_variable_info_for): Initialize may_have_pointers.
	(new_var_info): Likewise.
	(handle_lhs_call): Make the HEAP variable unknown-sized.
	(intra_create_variable_infos): Use a type with pointers for
	PARM_NOALIAS, make it unknown-sized.

Added:
    trunk/gcc/testsuite/gcc.dg/torture/pr39074-2.c
    trunk/gcc/testsuite/gcc.dg/torture/pr39074-3.c
    trunk/gcc/testsuite/gcc.dg/torture/pr39074.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/tree-ssa-structalias.c

Comment 18 Richard Biener 2009-02-19 10:13:17 UTC
Fixed on the trunk as well.
Comment 19 Joseph S. Myers 2009-03-31 21:09:14 UTC
Closing 4.2 branch.
Comment 20 Richard Biener 2009-06-17 12:34:43 UTC
WONTFIX for 4.3.  Alias fixes are considered too risky at this stage.