[Bug target/96789] x264: sub4x4_dct() improves when vectorization is disabled

rguenther at suse dot de gcc-bugzilla@gcc.gnu.org
Fri Sep 18 10:30:42 GMT 2020


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96789

--- Comment #14 from rguenther at suse dot de <rguenther at suse dot de> ---
On Fri, 18 Sep 2020, linkw at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96789
> 
> --- Comment #13 from Kewen Lin <linkw at gcc dot gnu.org> ---
> >   2) on Power, the conversion from unsigned char to unsigned short is a nop
> > conversion, but it is still counted when we compute the scalar cost, which
> > adds a total of 32 to the scalar cost. Meanwhile, the conversion from unsigned
> > short to signed short should be counted but is not (need to check why).
> 
> vect_nop_conversion_p returns true for the UH to SH conversion, so it's not
> even put into the cost vector.
> 
> The comment on tree_nop_conversion_p says:
> 
> /* Return true iff conversion from INNER_TYPE to OUTER_TYPE generates
>    no instruction.  */
> 
> I may be missing something here, but the UH to SH conversion does need one
> explicit extend instruction, *extsh*, so the precision/mode equality check
> looks wrong for this conversion.

Well, it isn't an RTL predicate, and the conversion only needs an extension
because there's never a HImode pseudo but always SImode subregs.
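
The distinction can be illustrated with a small C sketch. This is not GCC
code: the helper names (same_bits_16, reg_from_u16, sign_extend_16) are
hypothetical, and the 32-bit "register" is only a rough model of a 16-bit
value living in a wider register. It assumes GCC's documented modular
behaviour for out-of-range unsigned-to-signed conversions. At 16 bits the
unsigned-to-signed conversion leaves the bit pattern unchanged, which is the
sense in which it generates no instruction; an extension comes into play
only once the value is held in a wider register.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Returns 1 if converting uh to a signed 16-bit value leaves the 16-bit
   pattern unchanged.  The cast is implementation-defined for out-of-range
   values in C before C23; GCC reduces modulo 2^16, so the bits are kept.  */
static int same_bits_16(uint16_t uh)
{
    int16_t sh = (int16_t)uh;
    return memcmp(&uh, &sh, sizeof uh) == 0;
}

/* Hypothetical model: a 16-bit unsigned value held zero-extended in a
   32-bit register.  */
static uint32_t reg_from_u16(uint16_t v)
{
    return (uint32_t)v;
}

/* Models a sign-extend-halfword operation (the role extsh plays on Power):
   take the low 16 bits of the register and replicate bit 15 upwards.  */
static uint32_t sign_extend_16(uint32_t reg)
{
    uint32_t low = reg & 0xFFFFu;
    return (low ^ 0x8000u) - 0x8000u;
}

/* Small self-check tying the pieces together.  */
static void demo_check(void)
{
    /* Same precision: the conversion itself changes no bits.  */
    assert(same_bits_16(0xFFFEu));
    /* In a wider register the zero-extended and sign-extended forms differ.  */
    assert(reg_from_u16(0xFFFEu) != sign_extend_16(reg_from_u16(0xFFFEu)));
}
```

Calling demo_check() from a test driver exercises both halves: the 16-bit
patterns match, while the 32-bit register contents differ until the value is
sign-extended.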