This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



Re: Early inlining and function references from static const struct (bug?)


PS: I framed the issue between the inline_param and fixup_cfg passes
because I was only looking at the tree passes, but the really relevant
passes are tree-einline (obviously) and ipa-inline, which happens
between tree-inline_param2 and tree-fixup_cfg2. So, restating the
problem: if early inlining does not happen, late inlining misses the
chance to inline the function referenced from the static const struct
member.
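
For anyone who wants to poke at this, here is a minimal sketch of a linkable variant of the reproducer quoted below. The body of dosomething, the seen/nseen globals and the check_h helper are my own additions for illustration and are not part of the original test.c. It only confirms that both ways of passing the struct compute the same value, so any difference between the two call sites in the dumps comes from the inliner and not from the semantics. Note that giving dosomething a visible body may itself change what the inliner does, so the original test.c is still the one to use when comparing dumps.

```c
#include <assert.h>

typedef struct F {
  int (*call)(int);
} F;

static int sq(int x) {
  return x * x;
}

/* Same shape as g in the quoted test.c: eight chained indirect calls. */
static int g(F f, int x) {
  x = f.call(x);
  x = f.call(x);
  x = f.call(x);
  x = f.call(x);
  x = f.call(x);
  x = f.call(x);
  x = f.call(x);
  x = f.call(x);
  return x;
}

static const F f = {sq};

/* Hypothetical stand-in for the external dosomething: it records its
   arguments so the two call sites in h can be compared afterwards. */
int seen[2];
int nseen;

void dosomething(int v) {
  assert(nseen < 2);
  seen[nseen++] = v;
}

void h(int x) {
  dosomething(g(f, x));        /* global static const struct */
  dosomething(g((F){sq}, x));  /* compound literal */
}

/* Hypothetical helper: runs h and checks that both paths agree.
   Keep x in {-1, 0, 1}; larger magnitudes overflow int when squared
   eight times, which is undefined behaviour. */
void check_h(int x) {
  nseen = 0;
  h(x);
  assert(nseen == 2);
  assert(seen[0] == seen[1]);
}
```

Calling check_h(1) from a main makes this a complete program. To look only at the two passes named above instead of every tree dump, -fdump-tree-einline and -fdump-ipa-inline can be used in place of -fdump-tree-all, together with the same --param early-inlining-insns values as in the quoted message.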

On Thu, Feb 4, 2016 at 1:08 PM, Carlos Pita <carlosjosepita@gmail.com> wrote:
> Hi all,
>
> I've been trying to understand some bizarre interaction between
> optimizing passes I've observed while compiling a heavily nested
> inlined numerical code of mine. I managed to reduce the issue down to
> this simple code:
>
> ``` test.c
>
> typedef struct F {
>   int (*call)(int);
> } F;
>
> static int g(F f, int x) {
>   x = f.call(x);
>   x = f.call(x);
>   x = f.call(x);
>   x = f.call(x);
>   x = f.call(x);
>   x = f.call(x);
>   x = f.call(x);
>   x = f.call(x);
>   return x;
> }
>
> static int sq(int x) {
>   return x * x;
> }
>
> static const F f = {sq};
>
> void dosomething(int);
>
> void h(int x) {
>   dosomething(g(f, x));
>   dosomething(g((F){sq}, x));
> }
>
> ```
>
> Here we have a driver function h calling the workhorse g, which
> delegates some simple task to the inline-wannabe sq. The distinctive
> aspect of the above scheme is that sq is referenced from a struct
> member. The first call to g passes a static const struct while the
> second call passes a compound literal (alternatively, a local version
> of the struct will have the same effect regarding what follows).
>
> Now, say I compile this code with:
>
> gcc -O3 -fdump-tree-all --param early-inlining-insns=10 -c test.c
>
> The einline pass will not be able to inline calls to g with such a low
> value for early-inlining-insns.
>
> The inline_param2 pass still shows:
>
> ```
> h (int x)
> {
>   struct F D.1847;
>   int _4;
>   int _8;
>
>   <bb 2>:
>   _4 = g (f, x_2(D));
>   dosomething (_4);
>   D.1847.call = sq;
>   _8 = g (D.1847, x_2(D));
>   dosomething (_8);
>   return;
>
> }
>
> ```
>
> The next tree pass is fixup_cfg4, which does the inlining, but only for
> the second call to g:
>
> ```
> h (int x)
> {
>   ....
>
>   <bb 2>:
>   f = f;
>   f$call_7 = MEM[(struct F *)&f];
>   x_19 = f$call_7 (x_2(D));
>   x_20 = f$call_7 (x_19);
>   x_21 = f$call_7 (x_20);
>   x_22 = f$call_7 (x_21);
>   x_23 = f$call_7 (x_22);
>   x_24 = f$call_7 (x_23);
>   x_25 = f$call_7 (x_24);
>   x_26 = f$call_7 (x_25);
>   _43 = x_26;
>   _4 = _43;
>   dosomething (_4);
>   D.1847.call = sq;
>   f = D.1847;
>   f$call_10 = MEM[(struct F *)&f];
>   _33 = x_2(D) * x_2(D);
>   _45 = _33;
>   x_11 = _45;
>   _32 = x_11 * x_11;
>   _46 = _32;
>   x_12 = _46;
>   _31 = x_12 * x_12;
>   _47 = _31;
>   x_13 = _47;
>   _30 = x_13 * x_13;
>   _48 = _30;
>   x_14 = _48;
>   _29 = x_14 * x_14;
>   _49 = _29;
>   x_15 = _49;
>   _28 = x_15 * x_15;
>   _50 = _28;
>   x_16 = _50;
>   _27 = x_16 * x_16;
>   _51 = _27;
>   x_17 = _51;
>   _3 = x_17 * x_17;
>   _52 = _3;
>   x_18 = _52;
>   _53 = x_18;
>   _8 = _53;
>   dosomething (_8);
>   return;
>
> }
> ```
>
> Now, say I recompile the code with a larger early-inlining-insns, so
> that einline is able to early inline both calls to g:
>
> gcc -O3 -fdump-tree-all --param early-inlining-insns=50 -c test.c
>
> After inline_param2 (that is, before fixup_cfg4), we now have:
>
> ```
> h (int x)
> {
>   int x;
>   int x;
>
>   <bb 2>:
>   x_13 = sq (x_2(D));
>   x_14 = sq (x_13);
>   x_15 = sq (x_14);
>   x_16 = sq (x_15);
>   x_17 = sq (x_16);
>   x_18 = sq (x_17);
>   x_19 = sq (x_18);
>   x_20 = sq (x_19);
>   dosomething (x_20);
>   x_5 = sq (x_2(D));
>   x_6 = sq (x_5);
>   x_7 = sq (x_6);
>   x_8 = sq (x_7);
>   x_9 = sq (x_8);
>   x_10 = sq (x_9);
>   x_11 = sq (x_10);
>   x_12 = sq (x_11);
>   dosomething (x_12);
>   return;
>
> }
> ```
>
> And fixup_cfg4 is able to do its job for both calls:
>
> ```
> h (int x)
> {
>   ....
>
>   <bb 2>:
>   _36 = x_2(D) * x_2(D);
>   _37 = _36;
>   x_13 = _37;
>   _35 = x_13 * x_13;
>   _38 = _35;
>   x_14 = _38;
>   _34 = x_14 * x_14;
>   _39 = _34;
>   x_15 = _39;
>   _33 = x_15 * x_15;
>   _40 = _33;
>   x_16 = _40;
>   _32 = x_16 * x_16;
>   _41 = _32;
>   x_17 = _41;
>   _31 = x_17 * x_17;
>   _42 = _31;
>   x_18 = _42;
>   _30 = x_18 * x_18;
>   _43 = _30;
>   x_19 = _43;
>   _29 = x_19 * x_19;
>   _44 = _29;
>   x_20 = _44;
>   dosomething (x_20);
>   _28 = x_2(D) * x_2(D);
>   _45 = _28;
>   x_5 = _45;
>   _27 = x_5 * x_5;
>   _46 = _27;
>   x_6 = _46;
>   _26 = x_6 * x_6;
>   _47 = _26;
>   x_7 = _47;
>   _25 = x_7 * x_7;
>   _48 = _25;
>   x_8 = _48;
>   _24 = x_8 * x_8;
>   _49 = _24;
>   x_9 = _49;
>   _23 = x_9 * x_9;
>   _50 = _23;
>   x_10 = _50;
>   _22 = x_10 * x_10;
>   _51 = _22;
>   x_11 = _51;
>   _21 = x_11 * x_11;
>   _52 = _21;
>   x_12 = _52;
>   dosomething (x_12);
>   return;
>
> }
> ```
>
> The bottom line is that I get full inlining if einline manages to
> early inline both g calls, but I get incomplete inlining otherwise. I
> guess the problem is that fixup_cfg4 is not able to infer that
> f$call_7 is just sq in disguise when f is the global static const
> struct but it is able to get it when it's a local or literal one. If
> einline expands the code early, the subsequent passes make
> fixup_cfg4 see just sq in both cases, making inlining of sq a trivial
> matter. But if einline hits its hard limits, fixup_cfg4 has to
> figure out by itself that f$call is sq.
>
> I'm not sure whether this should be considered a proper bug or more of
> a quirk of the inlining system one must learn to live with. In the
> first case, I'll report it if you ask me to do it. In the second case,
> I would like to ask for some advice about the best way to cope with
> this scenario (besides blindly incrementing early-inlining-insns); I
> can provide more background regarding my real use case if necessary.
>
> Cheers
> --
> Carlos

