This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



[RFC]: VUSE Bypassing


As a potential stopgap solution (i.e., until we have real partial VUSEs and defs) for 4.0 structure/pointer-to-structure aliasing and partial use stuff, I implemented VUSE bypassing. I was told (by various people) that the main thing stopping us from killing some RTL passes is that we end up with too many spurious structure field and pointer-to-structure field kills at the tree level.
The patch I've attached is only for performance testing. I would integrate it directly into the renamer and do some speedups before I actually submitted it, assuming we wanted it.
If someone could run SPEC with this on x86-64, I'd appreciate it.


For the curious, here is what this thing does. Basically, given
a_2 = V_MAY_DEF (a_1)
a.f = 5

a_3 = V_MAY_DEF (a_2)
a.g = 6

VUSE (a_3)
b = a.f

we rewrite it to

a_2 = V_MAY_DEF (a_1)
a.f = 5

a_3 = V_MAY_DEF (a_2)
a.g = 6

VUSE (a_2)
b = a.f

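The store to a.g cannot change a.f, so the load of a.f can safely use a_2, the version defined by the store to a.f, instead of a_3. We do the same thing for pointers to structures as well. This enables the optimizers to propagate and eliminate a lot more things.
The bypasser, in its current form, is a separate pass, and won't bypass PHI nodes.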
However, even in this form, it can produce significant results if you have a lot of spurious kills.
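
To make the walk concrete, here is a minimal sketch in C. It is only a toy model: struct vdef, bypass_vuse and run_example are hypothetical names, not code from the attached patch or from GCC, and it assumes fields are disjoint and identified by name. Starting from the version a load currently uses, it steps back over every V_MAY_DEF whose store writes a different field, and stops at a store to the same field, at a PHI node, or at the entry definition.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical toy model of a virtual def chain; not GCC's data structures. */
enum def_kind { DEF_ENTRY, DEF_STORE, DEF_PHI };

struct vdef {
    enum def_kind kind;
    int prev;           /* version consumed by this V_MAY_DEF, -1 if none */
    const char *field;  /* field written by a DEF_STORE, NULL otherwise */
};

/* Return the version a load of `field` can use, starting at `version`.
   Only stores to a different field are skipped; PHI nodes and the entry
   definition always end the walk. */
static int bypass_vuse(const struct vdef *defs, int version, const char *field)
{
    while (defs[version].kind == DEF_STORE
           && strcmp(defs[version].field, field) != 0) {
        version = defs[version].prev;
        assert(version >= 0);   /* every store must have a predecessor */
    }
    return version;
}

/* The three-statement example from above, indexed by SSA version number. */
static int run_example(void)
{
    const struct vdef defs[] = {
        [1] = { DEF_ENTRY, -1, NULL },  /* a_1 */
        [2] = { DEF_STORE,  1, "f"  },  /* a_2 = V_MAY_DEF (a_1); a.f = 5 */
        [3] = { DEF_STORE,  2, "g"  },  /* a_3 = V_MAY_DEF (a_2); a.g = 6 */
    };
    /* VUSE (a_3); b = a.f  is rewired to the version returned here. */
    return bypass_vuse(defs, 3, "f");
}
```

Calling run_example() from a main function returns 2, matching the VUSE (a_2) in the rewritten form. A real implementation would also have to stop at stores that may touch the whole structure, which this sketch does not model.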


An example:
#include <stdio.h>

struct a
{
	int a;
	int b;
};

int main(int argc, char **argv)
{
	struct a *foo;
	struct a a, b;
	foo = &a;
	foo->a = 5;
	printf ("%d\n", foo->a);
	foo->b = 6;
	printf ("%d\n", foo->b);
	printf ("%d\n", foo->a);
	if (argc)
	{
		foo = &b;
		foo->a = 6;
		printf ("%d\n", foo->b);
		foo->b = 7;
		printf ("%d\n", foo->b);
		printf ("%d\n", foo->a);
	}
	printf ("%d\n", foo->a);
	printf ("%d\n", foo->b);
	foo->a = 8;
	printf ("%d\n", foo->b);
	printf ("%d\n", foo->a);
	foo->b = 9;
	printf ("%d\n", foo->b);
	printf ("%d\n", foo->a);
}

Without vuse bypassing, this becomes:

;; Function main (main)

main (argc, argv)
{
  int temp.8;
  int temp.7;
  int temp.6;
  int temp.5;
  int temp.4;
  struct a b;
  struct a a;
  struct a * foo;
  int D.1127;
  int D.1126;

<bb 0>:
  a.a = 5;
  printf (&"%d\n"[0], 5);
  a.b = 6;
  printf (&"%d\n"[0], 6);
  printf (&"%d\n"[0], a.a);
  if (argc != 0) goto <L0>; else goto <L3>;

<L3>:;
  foo = &a;
  goto <bb 2> (<L1>);

<L0>:;
  b.a = 6;
  printf (&"%d\n"[0], b.b);
  b.b = 7;
  printf (&"%d\n"[0], 7);
  printf (&"%d\n"[0], b.a);
  foo = &b;

<L1>:;
  printf (&"%d\n"[0], foo->a);
  printf (&"%d\n"[0], foo->b);
  foo->a = 8;
  printf (&"%d\n"[0], foo->b);
  printf (&"%d\n"[0], 8);
  foo->b = 9;
  printf (&"%d\n"[0], 9);
  printf (&"%d\n"[0], foo->a);
  return;

}

With vuse bypassing, we get:

;; Function main (main)

main (argc, argv)
{
  int temp.4;
  struct a b;
  struct a a;
  struct a * foo;
  int D.1127;
  int D.1126;

<bb 0>:
  a.a = 5;
  printf (&"%d\n"[0], 5);
  a.b = 6;
  printf (&"%d\n"[0], 6);
  printf (&"%d\n"[0], 5);
  if (argc != 0) goto <L0>; else goto <L3>;

<L3>:;
  foo = &a;
  goto <bb 2> (<L1>);

<L0>:;
  b.a = 6;
  printf (&"%d\n"[0], b.b);
  b.b = 7;
  printf (&"%d\n"[0], 7);
  printf (&"%d\n"[0], 6);
  foo = &b;

<L1>:;
  printf (&"%d\n"[0], foo->a);
  D.1127 = foo->b;
  printf (&"%d\n"[0], D.1127);
  printf (&"%d\n"[0], D.1127);
  printf (&"%d\n"[0], 8);
  printf (&"%d\n"[0], 9);
  printf (&"%d\n"[0], 8);
  return;

}

Notice that almost *all* loads have been removed, and almost everything is now constant propagated, as it should be. (It's interesting to note that the RTL optimizers do nothing to the testcase above, so those are exactly the differences you also see at the assembly level.)

Give it a whirl, see if it helps.

Attachment: vusebypass.diff
Description: Binary data


