[Bug target/96789] x264: sub4x4_dct() improves when vectorization is disabled
rguenther at suse dot de
gcc-bugzilla@gcc.gnu.org
Fri Sep 18 10:30:42 GMT 2020
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96789
--- Comment #14 from rguenther at suse dot de <rguenther at suse dot de> ---
On Fri, 18 Sep 2020, linkw at gcc dot gnu.org wrote:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96789
>
> --- Comment #13 from Kewen Lin <linkw at gcc dot gnu.org> ---
> > 2) On Power, the conversion from unsigned char to unsigned short is a nop
> > conversion, but it is still counted when we compute the scalar cost, which
> > adds 32 in total to the scalar cost. Meanwhile, the conversion from unsigned
> > short to signed short should be counted but is not (need to check why).
>
> vect_nop_conversion_p returns true for the UH to SH conversion, so it is
> not even put into the cost vector.
>
> The comment on tree_nop_conversion_p says:
>
> /* Return true iff conversion from INNER_TYPE to OUTER_TYPE generates
> no instruction. */
>
> I may be missing something here, but the UH to SH conversion does need one
> explicit extend instruction, *extsh*, so the precision/mode equality check
> looks wrong for this conversion.
Well, it isn't an RTL predicate, and the conversion only needs an extension
because there's never a HImode pseudo but always SImode subregs.
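
As a rough illustration of the distinction (a sketch only, not GCC code):
the C below models a halfword living in a 32-bit register. Every function
name is hypothetical, and same_precision_p is merely a stand-in for the
kind of precision check being discussed, not the real
tree_nop_conversion_p. The 16-bit pattern is unchanged by the UH to SH
conversion, but the 32-bit register image differs once the value is
interpreted as signed.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a type-level check: integer types of equal
   precision convert without changing any bits.  Not GCC's predicate.  */
static int
same_precision_p (unsigned inner_bits, unsigned outer_bits)
{
  return inner_bits == outer_bits;
}

/* Reinterpret the 16 bits of an unsigned halfword as a signed one.
   The bit pattern is copied unchanged.  */
static int16_t
uh_to_sh (uint16_t uh)
{
  int16_t sh;
  memcpy (&sh, &uh, sizeof sh);
  return sh;
}

/* Model of the value held in a 32-bit register: an unsigned halfword
   is zero-extended ...  */
static uint32_t
reg_image_uh (uint16_t uh)
{
  return (uint32_t) uh;
}

/* ... while a signed halfword is sign-extended, which is the work an
   extsh-style instruction performs.  */
static uint32_t
reg_image_sh (int16_t sh)
{
  return (uint32_t) (int32_t) sh;
}

/* Small self-check tying the pieces together.  */
static void
check_example (void)
{
  uint16_t uh = 0xFFF0;
  int16_t sh = uh_to_sh (uh);

  assert (same_precision_p (16, 16));           /* nop at the type level */
  assert (memcmp (&uh, &sh, sizeof uh) == 0);   /* same 16 bits */
  assert (reg_image_uh (uh) != reg_image_sh (sh)); /* wider images differ */
}
```

Calling check_example () from main exercises the case where the top bit of
the halfword is set; for values below 0x8000 the two register images are
identical, so no extension would be visible.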