The following testcase demonstrates where reassociation/regrouping of expressions could result in greater parallelism for processors that have multiple arithmetic execution units.

    int myfunction (int a, int b, int c, int d, int e, int f, int g, int h)
    {
      int ret;
      ret = a + b + c + d + e + f + g + h;
      return ret;
    }

Compiling with -O3 results in a series of dependent add instructions to accumulate the sum.

    add 4,3,4
    add 4,4,5
    add 4,4,6
    add 4,4,7
    add 4,4,8
    add 4,4,9
    add 4,4,10

If we regrouped to (a+b)+(c+d)+... we could do multiple adds in parallel on different execution units.
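For illustration, here is a hand-written sketch of the regrouping described above. The names sum_chain and sum_tree are hypothetical and not part of the testcase. It only shows the intended shape of the expression tree, not what any particular compiler pass emits.

```c
/* Left-to-right chain, as in the testcase: every add depends on the
   result of the previous one, so the dependency chain is 7 adds long. */
int sum_chain (int a, int b, int c, int d, int e, int f, int g, int h)
{
  return a + b + c + d + e + f + g + h;
}

/* Hand-regrouped balanced tree (hypothetical illustration): the four
   adds in the first level are independent of each other, and so are the
   two in the second level, so the dependency chain is 3 adds long.  */
int sum_tree (int a, int b, int c, int d, int e, int f, int g, int h)
{
  int ab = a + b;      /* level 1: these four adds can issue in parallel */
  int cd = c + d;
  int ef = e + f;
  int gh = g + h;
  int abcd = ab + cd;  /* level 2: these two adds can issue in parallel */
  int efgh = ef + gh;
  return abcd + efgh;  /* level 3 */
}
```

Both versions still perform seven adds. Only the length of the dependency chain changes. Note that regrouping signed int additions by hand in the source can overflow in an intermediate sum where the left-to-right chain would not, so the two functions are only interchangeable for inputs where no partial sum overflows.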
*** This bug has been marked as a duplicate of 44382 ***