This is the mail archive of the mailing list for the GCC project.

Re: [PATCH] Fix PR25500, pessimization on SSE code caused by count_type_elements (expr.c)

     case VECTOR_TYPE:
-      return TYPE_VECTOR_SUBPARTS (type);
+      return TYPE_MODE (type) == BLKmode ? TYPE_VECTOR_SUBPARTS (type) : 1;

> While I applaud the results of this patch, this doesn't really make
> sense to me.  A vector has as many elements as it has, whether or not it
> fits in a machine register.

No, because if it fits in a machine register, we can copy N elements with a single instruction. So it makes sense, in that case, to count the vector as a whole. For example, we should not count a V16QImode vector with 15 zeros and 1 non-zero element as "mostly zero".
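
To make the V16QImode example concrete, here is a small standalone sketch. It is not GCC code: `toy_vector_type`, `toy_count_elements` and `toy_mostly_zeros` are hypothetical names, the struct only models whether the type has a vector mode (as opposed to BLKmode), and the "fewer than a quarter non-zero" threshold is an assumption made for illustration.

```c
#include <stdbool.h>

/* Toy stand-in for a vector type; NOT GCC's tree representation. */
struct toy_vector_type {
  int subparts;          /* number of elements, e.g. 16 for V16QImode */
  bool has_vector_mode;  /* false models TYPE_MODE (type) == BLKmode */
};

/* Models the patched VECTOR_TYPE case: a vector that fits in a
   register counts as one element, otherwise as its subparts.  */
int toy_count_elements(const struct toy_vector_type *t)
{
  return t->has_vector_mode ? 1 : t->subparts;
}

/* Assumed heuristic: "mostly zero" when fewer than a quarter of the
   counted elements are non-zero.  */
bool toy_mostly_zeros(int nonzero, int counted)
{
  return nonzero < counted / 4;
}
```

With per-element counting, 1 non-zero element out of 16 passes the assumed test and the vector is treated as mostly zero. Counting the register-sized vector as a single element gives 1 non-zero item out of 1, so it is not.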

To be fair, I can see now that this patch is not complete, because in cases like a 16-element SImode vector being decomposed to 4 V4SImode vectors, we would have to return 4 instead of 16. If treating that case correctly would make you change your mind (it is quite rare, but may happen indeed and there are provisions for it in tree-vect-generic.c), I can provide an updated patch.
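
As a rough illustration of the count such an updated patch would need, the sketch below computes how many register-sized pieces a wide vector splits into. This is not the updated patch; `toy_count_decomposed` and its parameters are hypothetical, and it assumes the target's widest vector mode holds `elems_per_reg` elements of the given element type (0 meaning no vector mode at all).

```c
/* Number of register-sized copies needed for a vector of SUBPARTS
   elements when one register holds ELEMS_PER_REG of them.
   Hypothetical helper, for illustration only.  */
int toy_count_decomposed(int subparts, int elems_per_reg)
{
  if (elems_per_reg <= 0)
    return subparts;  /* no vector mode: count element by element */

  /* Round up so a partial trailing piece still counts as one copy.  */
  return (subparts + elems_per_reg - 1) / elems_per_reg;
}
```

For a 16-element SImode vector split into V4SImode pieces this gives 4 rather than 16, and it gives 1 whenever the whole vector fits in one register.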

> That's a philosophical concern, but more
> practically, won't gimplify_init_constructor break?

No, because it does not invoke count_type_elements for VECTOR_TYPEs.

> Is it possible to instead teach the SRA code to handle this case specially?

I haven't seen Andrew's patch. When I tried to add a "full_count == 1" test to hard-code element-by-element copying in that case, however, it failed because count_type_elements was *not* returning 1. I guess Andrew is checking for TREE_CHAIN (TYPE_FIELDS (type)) == NULL instead.

