Greetings everyone, I have pinged the mailing list about this issue about a month ago (I am not the original poster), and heard no response since. For some addressing modes, it seems that the complexity is calculated in sub-optimal manner. More information and a patch are contained in the link below. https://gcc.gnu.org/pipermail/gcc-patches/2022-December/608895.html
It seems that commit f9f69dd changed the way complexity is calculated, so now the complexity equals zero across the board, for each candidate. Here is one testcase: void daxpy(float *vector1, float *vector2, int n, float fp_const) { for (int i = 0; i < n; ++i) vector1[i] += fp_const * vector2[i]; } void dgefa(float *vector, int m, int n, int l) { for (int i = 0; i < n - 1; ++i) { for (int j = i + 1; j < n; ++j) { float t = vector[m * j + l]; daxpy(&vector[m * i + i + 1], &vector[m * j + i + 1], n - (i + 1), t); } } } Following tables compare IV candidates. ===== Before f9f69dd ===== Group 1: cand cost compl. inv.expr. inv.vars 1 13 1 3; NIL; 2 13 2 4; NIL; 4 9 1 5; NIL; 5 1 0 NIL; NIL; 7 9 1 3; NIL; ===== Before f9f69dd ===== ===== After f9f69dd ===== Group 1: cand cost compl. inv.expr. inv.vars 1 10 0 4; NIL; 2 10 0 5; NIL; 4 6 0 6; NIL; 5 1 0 NIL; NIL; 7 6 0 4; NIL; ===== After f9f69dd ===== This leads to different candidate being selected in our test case. ===== Before f9f69dd ===== Selected IV set for loop 3 at fp_foo.c:3, 10 avg niters, 2 IVs: Candidate 4: Var befor: ivtmp.17_52 Var after: ivtmp.17_103 Incr POS: before exit test IV struct: Type: unsigned long Base: (unsigned long) (vector_27(D) + _10) Step: 4 Object: (void *) vector_27(D) Biv: N Overflowness wrto loop niter: Overflow Candidate 5: Var befor: ivtmp.18_99 Var after: ivtmp.18_98 Incr POS: before exit test IV struct: Type: unsigned long Base: (unsigned long) (vector_27(D) + _14) Step: 4 Object: (void *) vector_27(D) Biv: N Overflowness wrto loop niter: Overflow ===== Before f9f69dd ===== ===== After f9f69dd ===== Selected IV set for loop 3 at fp_foo.c:3, 10 avg niters, 1 IVs: Candidate 4: Var befor: ivtmp.17_52 Var after: ivtmp.17_103 Incr POS: before exit test IV struct: Type: unsigned long Base: (unsigned long) (vector_27(D) + _10) Step: 4 Object: (void *) vector_27(D) Biv: N Overflowness wrto loop niter: Overflow ===== After f9f69dd ===== Finally, here is the difference in assembly: ===== Before f9f69dd ===== .L83: lwc1 $f5,0($3) lwc1 $f8,0($2) lwc1 $f7,4($2) lwc1 $f6,8($2) lwc1 $f9,12($2) lwc1 $f10,16($2) maddf.s $f8,$f0,$f5 lwc1 $f11,20($2) lwc1 $f12,24($2) lwc1 $f13,28($2) ld $12,72($sp) swc1 $f8,0($2) lwc1 $f14,4($3) maddf.s $f7,$f0,$f14 swc1 $f7,4($2) lwc1 $f15,8($3) maddf.s $f6,$f0,$f15 swc1 $f6,8($2) lwc1 $f16,12($3) maddf.s $f9,$f0,$f16 swc1 $f9,12($2) lwc1 $f17,16($3) maddf.s $f10,$f0,$f17 swc1 $f10,16($2) lwc1 $f18,20($3) maddf.s $f11,$f0,$f18 swc1 $f11,20($2) lwc1 $f19,24($3) maddf.s $f12,$f0,$f19 swc1 $f12,24($2) lwc1 $f20,28($3) maddf.s $f13,$f0,$f20 swc1 $f13,28($2) daddiu $2,$2,32 bne $2,$12,.L83 daddiu $3,$3,32 ... ===== Before f9f69dd ===== ===== After f9f69dd ===== .L93: dsubu $18,$2,$4 lwc1 $f13,0($2) daddu $19,$18,$5 daddiu $16,$2,4 lwc1 $f14,0($19) dsubu $17,$16,$4 daddu $25,$17,$5 lwc1 $f15,4($2) daddiu $19,$2,12 daddiu $20,$2,8 maddf.s $f13,$f1,$f14 dsubu $16,$19,$4 daddiu $17,$2,16 dsubu $18,$20,$4 daddu $19,$16,$5 daddiu $16,$2,20 lwc1 $f10,8($2) daddu $20,$18,$5 lwc1 $f16,12($2) dsubu $18,$17,$4 lwc1 $f17,16($2) dsubu $17,$16,$4 lwc1 $f18,20($2) daddiu $16,$2,24 lwc1 $f20,24($2) daddu $18,$18,$5 swc1 $f13,0($2) daddu $17,$17,$5 lwc1 $f19,0($25) daddiu $25,$2,28 lwc1 $f11,28($2) daddiu $2,$2,32 dsubu $16,$16,$4 dsubu $25,$25,$4 maddf.s $f15,$f1,$f19 daddu $16,$16,$5 daddu $25,$25,$5 swc1 $f15,-28($2) lwc1 $f21,0($20) ld $20,48($sp) maddf.s $f10,$f1,$f21 swc1 $f10,-24($2) lwc1 $f22,0($19) maddf.s $f16,$f1,$f22 swc1 $f16,-20($2) lwc1 $f23,0($18) maddf.s $f17,$f1,$f23 swc1 $f17,-16($2) lwc1 $f0,0($17) maddf.s $f18,$f1,$f0 swc1 $f18,-12($2) lwc1 $f7,0($16) maddf.s $f20,$f1,$f7 swc1 $f20,-8($2) lwc1 $f12,0($25) maddf.s $f11,$f1,$f12 bne $2,$20,.L93 swc1 $f11,-4($2) ... ===== After f9f69dd =====
*** Bug 109430 has been marked as a duplicate of this bug. ***
Another thing that has come to attention is the register pressure costs being calculated improperly. More information and a patch are submitted at the link below. https://gcc.gnu.org/pipermail/gcc-patches/2022-December/608896.html
There is a patch regarding this issue submitted to the GCC mailing list: https://gcc.gnu.org/pipermail/gcc-patches/2024-March/647966.html. Unfortunately, there hasn't been a response to it for quite some time now, even after a few pings from the original poster.