This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: RFC: missed loop optimizations from loop induction variable copies


On Tue, Sep 22, 2009 at 1:52 PM, Rahul Kharche <rahul@icerasemi.com> wrote:
> The following causes missed loop optimizations in O2 from creating
> unnecessary loop induction variables. Or, is a case of IV opts not
> able to coalesce copies of induction variables. A previous post related
> to this was made in PR41026 which had a type promoted loop index
> variable
> copied. I believe this example makes the problem more obvious.
>
> struct struct_t {
> ?int* data;
> };
>
> void testAutoIncStruct (struct struct_t* sp, int start, int end)
> {
> ? ?int i;
> ? ?for (i = 0; i+start < end; i++)
> ? ? ?{
> ? ? ? ?sp->data[i+start] = 0;
> ? ? ?}
> }
>
> With GCC v4.4.1 release) and gcc -O2 -fdump-tree-all on the above case
> we get the following dump from IVOpts
>
> testAutoIncStruct (struct struct_t * sp, int start, int end)
> {
> ?unsigned int D.1283;
> ?unsigned int D.1284;
> ?int D.1282;
> ?unsigned int ivtmp.32;
> ?int * pretmp.17;
> ?int i;
> ?int * D.1245;
> ?unsigned int D.1244;
> ?unsigned int D.1243;
>
> <bb 2>:
> ?if (start_3(D) < end_5(D))
> ? ?goto <bb 3>;
> ?else
> ? ?goto <bb 6>;
>
> <bb 3>:
> ?pretmp.17_22 = sp_6(D)->data;
> ?D.1282_23 = start_3(D) + 1;
> ?ivtmp.32_25 = (unsigned int) D.1282_23;
> ?D.1283_27 = (unsigned int) end_5(D);
> ?D.1284_28 = D.1283_27 + 1;
>
> <bb 4>:
> ?# start_20 = PHI <start_4(5), start_3(D)(3)>
> ?# ivtmp.32_7 = PHI <ivtmp.32_24(5), ivtmp.32_25(3)>
> ?D.1243_9 = (unsigned int) start_20;
> ?D.1244_10 = D.1243_9 * 4;
> ?D.1245_11 = pretmp.17_22 + D.1244_10;
> ?*D.1245_11 = 0;
> ?start_26 = (int) ivtmp.32_7;
> ?start_4 = start_26;
> ?ivtmp.32_24 = ivtmp.32_7 + 1;
> ?if (ivtmp.32_24 != D.1284_28)
> ? ?goto <bb 5>;
> ?else
> ? ?goto <bb 6>;
>
> <bb 5>:
> ?goto <bb 4>;
>
> <bb 6>:
> ?return;
>
> }
>
> IVOpts cannot identify start_26, start_4 and ivtmp_32_7 to be copies.
> The root cause is that expression 'i + start' is identified as a common
> expression between the test in the header and the index operation in the
> latch. This is unified by copy propagation or FRE prior to loop
> optimizations
> and creates a new induction variable.
>
> If we disable tree copy propagation and FRE with
> gcc -O2 -fno-tree-copy-prop -fno-tree-fre -fdump-tree-all we get
>
> testAutoIncStruct (struct struct_t * sp, int start, int end)
> {
> ?unsigned int D.1287;
> ?unsigned int D.1288;
> ?unsigned int D.1289;
> ?int D.1290;
> ?unsigned int D.1284;
> ?unsigned int D.1285;
> ?unsigned int D.1286;
> ?int * pretmp.17;
> ?int i;
> ?int * D.1245;
> ?unsigned int D.1244;
> ?unsigned int D.1243;
> ?int D.1242;
> ?int * D.1241;
>
> <bb 2>:
> ?if (start_3(D) < end_5(D))
> ? ?goto <bb 3>;
> ?else
> ? ?goto <bb 6>;
>
> <bb 3>:
> ?pretmp.17_23 = sp_6(D)->data;
> ?D.1287_27 = (unsigned int) end_5(D);
> ?D.1288_28 = (unsigned int) start_3(D);
> ?D.1289_29 = D.1287_27 - D.1288_28;
> ?D.1290_30 = (int) D.1289_29;
>
> <bb 4>:
> ?# i_20 = PHI <i_12(5), 0(3)>
> ?D.1241_7 = pretmp.17_23;
> ?D.1284_26 = (unsigned int) start_3(D);
> ?D.1285_25 = (unsigned int) i_20;
> ?D.1286_24 = D.1284_26 + D.1285_25;
> ?MEM[base: pretmp.17_23, index: D.1286_24, step: 4] = 0;
> ?i_12 = i_20 + 1;
> ?if (i_12 != D.1290_30)
> ? ?goto <bb 5>;
> ?else
> ? ?goto <bb 6>;
>
> <bb 5>:
> ?goto <bb 4>;
>
> <bb 6>:
> ?return;
>
> }
>
> The correct single induction variable as been identified here. This is
> not
> a loop header copying problem either. If we disable loop header copying,
> we
> still get multiple induction variables created. In fact in the above
> case
> loop header copying correctly enables post-increment mode on our port.
>
> testAutoIncStruct (struct struct_t * sp, int start, int end)
> {
> ?unsigned int D.1282;
> ?unsigned int ivtmp.31;
> ?unsigned int ivtmp.29;
> ?int i;
> ?int * D.1245;
> ?unsigned int D.1244;
> ?unsigned int D.1243;
> ?int D.1242;
> ?int * D.1241;
>
> <bb 2>:
> ?ivtmp.29_18 = (unsigned int) start_3(D);
> ?D.1282_21 = (unsigned int) start_3(D);
> ?ivtmp.31_22 = D.1282_21 * 4;
> ?goto <bb 4>;
>
> <bb 3>:
> ?D.1241_7 = sp_6(D)->data;
> ?D.1244_10 = ivtmp.31_19;
> ?D.1245_11 = D.1241_7 + D.1244_10;
> ?*D.1245_11 = 0;
> ?ivtmp.29_17 = ivtmp.29_8 + 1;
> ?ivtmp.31_20 = ivtmp.31_19 + 4;
>
> <bb 4>:
> ?# ivtmp.29_8 = PHI <ivtmp.29_18(2), ivtmp.29_17(3)>
> ?# ivtmp.31_19 = PHI <ivtmp.31_22(2), ivtmp.31_20(3)>
> ?D.1242_23 = (int) ivtmp.29_8;
> ?if (D.1242_23 < end_5(D))
> ? ?goto <bb 3>;
> ?else
> ? ?goto <bb 5>;
>
> <bb 5>:
> ?return;
>
> }
>
> Does this imply we try and not copy propagate or FRE potential induction
> variables? Or is this simply a missed case in IVOpts?

Well, the situation is more complicated.  Initially FRE removes a
full redundancy which loop-header copying turns into a loop
carried dependency.  So for example moving loop header copying
before FRE avoids this issue as well (which would be a good idea
for other reasons as well).  So, at the point of FRE there is no
way to avoid the situation with the current pass ordering.

Richard.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]