This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.
[Bug target/63679] [5.0 Regression][AArch64] Failure to constant fold.
- From: "pinskia at gcc dot gnu.org" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: Thu, 20 Nov 2014 16:49:52 +0000
- Subject: [Bug target/63679] [5.0 Regression][AArch64] Failure to constant fold.
- Auto-submitted: auto-generated
- References: <bug-63679-4 at http dot gcc dot gnu dot org/bugzilla/>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63679
--- Comment #8 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to Tejas Belagod from comment #7)
> I tried this, but it still doesn't seem to fold for aarch64.
>
> So, here is the DOM trace for aarch64:
>
> Optimizing statement a = *.LC0;
Why do we get LC0 in the first place? It seems to be happening because of a
cost-model issue with MOVECOST.
> LKUP STMT a = *.LC0 with .MEM_3(D)
> LKUP STMT *.LC0 = a with .MEM_3(D)
> Optimizing statement vectp_a.5_1 = &a;
> LKUP STMT vectp_a.5_1 = &a
> ==== ASGN vectp_a.5_1 = &a
> Optimizing statement vect__6.6_13 = MEM[(int *)vectp_a.5_1];
> Replaced 'vectp_a.5_1' with constant '&aD.2604'
> LKUP STMT vect__6.6_13 = MEM[(int *)&a] with .MEM_4
> 2>>> STMT vect__6.6_13 = MEM[(int *)&a] with .MEM_4
> Optimizing statement vect_sum_7.7_6 = vect__6.6_13;
> LKUP STMT vect_sum_7.7_6 = vect__6.6_13
> ==== ASGN vect_sum_7.7_6 = vect__6.6_13
> Optimizing statement vectp_a.4_7 = vectp_a.5_1 + 16;
> Replaced 'vectp_a.5_1' with constant '&aD.2604'
> LKUP STMT vectp_a.4_7 = &a pointer_plus_expr 16
> 2>>> STMT vectp_a.4_7 = &a pointer_plus_expr 16
> ==== ASGN vectp_a.4_7 = &MEM[(void *)&a + 16B]
> Optimizing statement ivtmp_8 = 1;
> LKUP STMT ivtmp_8 = 1
> ==== ASGN ivtmp_8 = 1
> Optimizing statement vect__6.6_10 = MEM[(int *)vectp_a.4_7];
> Replaced 'vectp_a.4_7' with constant '&MEM[(voidD.39 *)&aD.2604 + 16B]'
> Folded to: vect__6.6_10 = MEM[(int *)&a + 16B];
> LKUP STMT vect__6.6_10 = MEM[(int *)&a + 16B] with .MEM_4
> 2>>> STMT vect__6.6_10 = MEM[(int *)&a + 16B] with .MEM_4
> Optimizing statement vect_sum_7.7_17 = vect_sum_7.7_6 + vect__6.6_10;
> Replaced 'vect_sum_7.7_6' with variable 'vect__6.6_13'
> gimple_simplified to vect_sum_7.7_17 = vect__6.6_10 + vect__6.6_13;
> Folded to: vect_sum_7.7_17 = vect__6.6_10 + vect__6.6_13;
> LKUP STMT vect_sum_7.7_17 = vect__6.6_10 plus_expr vect__6.6_13
> 2>>> STMT vect_sum_7.7_17 = vect__6.6_10 plus_expr vect__6.6_13
> ...
>
> In x86's case, by this time, the constant vectors have been propagated and
> folded into a constant vector:
>
> Optimizing statement vect_cst_.12_23 = { 0, 1, 2, 3 };
> LKUP STMT vect_cst_.12_23 = { 0, 1, 2, 3 }
> ==== ASGN vect_cst_.12_23 = { 0, 1, 2, 3 }
> Optimizing statement vect_cst_.11_32 = { 4, 5, 6, 7 };
> LKUP STMT vect_cst_.11_32 = { 4, 5, 6, 7 }
> ==== ASGN vect_cst_.11_32 = { 4, 5, 6, 7 }
> Optimizing statement vectp.14_2 = &a[0];
> LKUP STMT vectp.14_2 = &a[0]
> ==== ASGN vectp.14_2 = &a[0]
> Optimizing statement MEM[(int *)vectp.14_2] = vect_cst_.12_23;
> Replaced 'vectp.14_2' with constant '&aD.1831[0]'
> Replaced 'vect_cst_.12_23' with constant '{ 0, 1, 2, 3 }'
> Folded to: MEM[(int *)&a] = { 0, 1, 2, 3 };
> LKUP STMT MEM[(int *)&a] = { 0, 1, 2, 3 } with .MEM_3(D)
> LKUP STMT { 0, 1, 2, 3 } = MEM[(int *)&a] with .MEM_3(D)
> LKUP STMT { 0, 1, 2, 3 } = MEM[(int *)&a] with .MEM_25
> 2>>> STMT { 0, 1, 2, 3 } = MEM[(int *)&a] with .MEM_25
> Optimizing statement vectp.14_21 = vectp.14_2 + 16;
> Replaced 'vectp.14_2' with constant '&aD.1831[0]'
> LKUP STMT vectp.14_21 = &a[0] pointer_plus_expr 16
> 2>>> STMT vectp.14_21 = &a[0] pointer_plus_expr 16
> ==== ASGN vectp.14_21 = &MEM[(void *)&a + 16B]
> Optimizing statement MEM[(int *)vectp.14_21] = vect_cst_.11_32;
> Replaced 'vectp.14_21' with constant '&MEM[(voidD.41 *)&aD.1831 + 16B]'
> Replaced 'vect_cst_.11_32' with constant '{ 4, 5, 6, 7 }'
> Folded to: MEM[(int *)&a + 16B] = { 4, 5, 6, 7 };
> LKUP STMT MEM[(int *)&a + 16B] = { 4, 5, 6, 7 } with .MEM_25
> LKUP STMT { 4, 5, 6, 7 } = MEM[(int *)&a + 16B] with .MEM_25
> LKUP STMT { 4, 5, 6, 7 } = MEM[(int *)&a + 16B] with .MEM_19
> 2>>> STMT { 4, 5, 6, 7 } = MEM[(int *)&a + 16B] with .MEM_19
> Optimizing statement vectp_a.5_22 = &a;
> LKUP STMT vectp_a.5_22 = &a
> ==== ASGN vectp_a.5_22 = &a
> Optimizing statement vect__13.6_20 = MEM[(int *)vectp_a.5_22];
> Replaced 'vectp_a.5_22' with constant '&aD.1831'
> LKUP STMT vect__13.6_20 = MEM[(int *)&a] with .MEM_19
> FIND: { 0, 1, 2, 3 }
> Replaced redundant expr '# VUSE <.MEM_19>
> MEM[(intD.6 *)&aD.1831]' with '{ 0, 1, 2, 3 }'
> ==== ASGN vect__13.6_20 = { 0, 1, 2, 3 }
> Optimizing statement vect_sum_14.7_13 = vect__13.6_20;
> Replaced 'vect__13.6_20' with constant '{ 0, 1, 2, 3 }'
> LKUP STMT vect_sum_14.7_13 = { 0, 1, 2, 3 }
> ==== ASGN vect_sum_14.7_13 = { 0, 1, 2, 3 }
> ....
>
> While the MEM[vect_ptr + CST] gets replaced correctly by 'a', DOM doesn't
> seem to figure out that the literal-pool load 'a = *.LC0' is nothing but
>
> vect_cst_.12_23 = { 0, 1, 2, 3 }; and vect_cst_.11_32 = { 4, 5, 6, 7 };
>
> which is the only major difference between how the constant vector is
> initialized on x86 and on aarch64. Is DOM not able to understand 'a = *.LC0'?