Take the following example:
void *memset(void *b, int c, unsigned long len)
{
  unsigned long i;
  for (i = 0; i < len; i++)
    ((unsigned char *)b)[i] = c;
  return b;
}
The zero-extension of GPR4 isn't needed, since a byte store only writes
the low 8 bits of the register, and in fact -O1 doesn't emit it
(the subf in the -O1 code is superfluous though).
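
To illustrate why the extension is redundant at the C level, here is a small sketch (not part of the original testcase) comparing a plain byte store with one that masks the value first. The helper names fill_plain, fill_masked, same_result and check_all are made up for this illustration, and the masked variant assumes two's complement ints:

```c
#include <assert.h>

/* Store c directly: the conversion to unsigned char keeps only the
   low 8 bits of the value (reduction modulo 256). */
static void fill_plain(unsigned char *b, int c, unsigned long len)
{
    unsigned long i;
    for (i = 0; i < len; i++)
        b[i] = c;
}

/* Store c after explicitly zero-extending its low byte first
   (assumes two's complement for negative c). */
static void fill_masked(unsigned char *b, int c, unsigned long len)
{
    unsigned long i;
    for (i = 0; i < len; i++)
        b[i] = (unsigned char)(c & 0xff);
}

/* Returns 1 if both variants write identical bytes for value c. */
static int same_result(int c)
{
    unsigned char x[4], y[4];
    unsigned long i;

    fill_plain(x, c, 4);
    fill_masked(y, c, 4);
    for (i = 0; i < 4; i++)
        if (x[i] != y[i])
            return 0;
    return 1;
}

/* Checks a range of values, including negatives and values above 255. */
static void check_all(void)
{
    int c;
    for (c = -256; c <= 511; c++)
        assert(same_result(c));
}
```

Since both variants store the same bytes for every input, masking the value before the store has no observable effect, which is why the extra instruction in the -O2 code can be dropped.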
Still happens on mainline: -O2 still has the superfluous sign-extend,
but now the -O1 code is perfect.