Created attachment 54497
Assembly code generated by test case

Looking a bit more at the code generated for the test code of PR108839.

For the test

$ cat u2.c
void foo(double *const restrict dx, double *dy, double da, long int n)
{
  long int m = n % 4;
  for (unsigned long i = 0; i < m; i++ )
    dy[i] = dy[i] + da * dx[i];
}

a recently-ish trunk gives, with

$ gcc -S -O3 -funroll-all-loops -fno-tree-vectorize u2.c

far too much unrolling for a loop which can only be executed, at most,
four times (see attachment). The range information about m does not
appear to be propagated to the unroll passes.
(In reply to Thomas Koenig from comment #0)
> The range information about m does not appear to be propagated to
> the unroll passes.

Most likely because range information is not propagated to the RTL level
at all. In this case, even just the non-zero bits might be enough...
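To illustrate how a non-zero-bits mask alone could bound a trip count, here is a minimal sketch. Both helpers are hypothetical and are not GCC functions. It assumes the loop bound is an unsigned value reduced modulo a power of two; in the test case m is a signed `n % 4`, so the same reasoning would only apply directly where n is known to be non-negative.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not part of GCC): mask of the bits that may be
   nonzero in (x % pow2) for an unsigned x and a power-of-two divisor.  */
uint64_t nonzero_bits_umod_pow2(uint64_t pow2)
{
  /* Precondition: pow2 is a nonzero power of two.  */
  assert(pow2 != 0 && (pow2 & (pow2 - 1)) == 0);
  return pow2 - 1;
}

/* Hypothetical helper: upper bound on the trip count of
     for (i = 0; i < m; i++)
   when m is unsigned and only the bits in nonzero_bits_of_m can be set.
   The largest such value is the mask itself.  */
uint64_t max_trip_count(uint64_t nonzero_bits_of_m)
{
  return nonzero_bits_of_m;
}
```

For a divisor of 4 the mask is 0x3, so an unroller that saw only this mask could already conclude that the loop body runs at most three times.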
Confirmed. Though maybe the tree-level unroller could improve this situation so that it unrolls the loop only 4 times here.
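As a rough picture of what a bounded unroll could look like at the source level, here is a hand-written sketch. It is not what GCC emits. The names foo_ref and foo_peeled are made up for illustration, and the peeled version assumes n is non-negative, so that `n % 4` lies in 0..3 and three guarded copies of the body suffice.

```c
#include <assert.h>

/* The loop from the test case, kept as a reference.  */
void foo_ref(double *const restrict dx, double *dy, double da, long int n)
{
  long int m = n % 4;
  for (unsigned long i = 0; i < m; i++)
    dy[i] = dy[i] + da * dx[i];
}

/* Hand-peeled equivalent (illustrative only): with n >= 0, m is in
   0..3, so the body can run at most three times.  */
void foo_peeled(double *const restrict dx, double *dy, double da, long int n)
{
  assert(n >= 0);               /* assumption made by this sketch */
  long int m = n % 4;
  if (m > 0)
    dy[0] = dy[0] + da * dx[0];
  if (m > 1)
    dy[1] = dy[1] + da * dx[1];
  if (m > 2)
    dy[2] = dy[2] + da * dx[2];
}
```

Both functions update the same elements for any non-negative n; the peeled form has no loop left to unroll further.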