Bug 115994 - Vectorizer failed to do vectorization for .sat_trunc when nunits_in / nunits_out > 2
Status: NEW
Alias: None
Product: gcc
Classification: Unclassified
Component: tree-optimization
Version: 15.0
Importance: P3 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords: missed-optimization
Depends on:
Blocks: vectorizer
Reported: 2024-07-19 02:11 UTC by Hongtao Liu
Modified: 2024-07-19 05:40 UTC

See Also:
Host:
Target: x86_64-*-* i?86-*-*
Build:
Known to work:
Known to fail:
Last reconfirmed: 2024-07-19 00:00:00


Description Hongtao Liu 2024-07-19 02:11:37 UTC
In vectorizable_call:

  nunits_in = TYPE_VECTOR_SUBPARTS (vectype_in);
  nunits_out = TYPE_VECTOR_SUBPARTS (vectype_out);
  if (known_eq (nunits_in * 2, nunits_out))
    modifier = NARROW;
  else if (known_eq (nunits_out, nunits_in))
    modifier = NONE;
  else if (known_eq (nunits_out * 2, nunits_in))
    modifier = WIDEN;
  else
    return false;
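
To make the failing case concrete, here is a simplified model of the check above. It is only a sketch: it assumes constant lane counts (GCC itself uses poly_uint64 values with known_eq), and the names classify_call and MOD_* are hypothetical, not GCC identifiers.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the NONE / NARROW / WIDEN modifiers. */
enum modifier { MOD_NONE, MOD_NARROW, MOD_WIDEN };

/* Simplified model of the lane-count check quoted above.
   Returns false when the lane counts differ by anything other than
   a factor of 1 or 2, which is the case this bug is about.  */
static bool
classify_call (unsigned nunits_in, unsigned nunits_out, enum modifier *mod)
{
  if (nunits_in * 2 == nunits_out)
    *mod = MOD_NARROW;
  else if (nunits_out == nunits_in)
    *mod = MOD_NONE;
  else if (nunits_out * 2 == nunits_in)
    *mod = MOD_WIDEN;
  else
    return false;
  return true;
}
```

Assuming 512-bit vectors on both sides, a 64-bit to 8-bit truncation pairs 8 input lanes with 64 output lanes, so classify_call (8, 64, &mod) takes the final else branch and returns false.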


x86 AVX512 supports vpmovusqb/vpmovusqw/vpmovusdb. Because the vectorizer currently keeps the input and output vectors the same length, nunits_in and nunits_out differ by a factor greater than 2 for these truncations, so vectorizable_call returns false and .sat_trunc is not vectorized.
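
A possible test case, as a sketch: the loop below performs an unsigned saturating truncation from 64-bit to 8-bit elements, which is the shape vpmovusqb implements. The function name is hypothetical, and whether GCC recognizes this particular source form as .sat_trunc is an assumption that may depend on the compiler revision.

```c
#include <stddef.h>
#include <stdint.h>

/* Unsigned saturating truncation of each 64-bit element to 8 bits.
   Values above UINT8_MAX clamp to UINT8_MAX.  */
void
vec_sat_trunc_u64_u8 (uint8_t *restrict out, const uint64_t *restrict in,
                      size_t n)
{
  for (size_t i = 0; i < n; i++)
    {
      uint64_t x = in[i];
      out[i] = x > UINT8_MAX ? UINT8_MAX : (uint8_t) x;
    }
}
```

Compiling with optimization and AVX512 enabled (for example -O3 -march=x86-64-v4) and inspecting the output of -fdump-tree-vect-details should show whether the loop was vectorized through .sat_trunc.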
Comment 1 Hongtao Liu 2024-07-19 04:19:59 UTC
Also in vect_recog_sat_trunc_pattern:

  tree v_itype = get_vectype_for_scalar_type (vinfo, itype);
  tree v_otype = get_vectype_for_scalar_type (vinfo, otype);
  internal_fn fn = IFN_SAT_TRUNC;

  if (v_itype != NULL_TREE && v_otype != NULL_TREE
      && direct_internal_fn_supported_p (fn, tree_pair (v_otype, v_itype),
                                         OPTIMIZE_FOR_BOTH))
    {
      gcall *call = gimple_build_call_internal (fn, 1, ops[0]);
      tree out_ssa = vect_recog_temp_ssa_var (otype, NULL);


It is supposed to check for something like sstruncv8siv8hi2, but it actually checks for sstruncv8siv16hi2, because get_vectype_for_scalar_type returns a same-size vector type, not a same-nunits vector type.
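
The lane arithmetic behind those two names can be sketched as follows. This is only an illustration: the helper names are hypothetical, and the 256-bit vector size is an assumption chosen to match the v8si mode mentioned above.

```c
#include <stdio.h>

/* Same-size policy: keep the vector width fixed, so narrower elements
   give more lanes.  */
static unsigned
lanes_same_size (unsigned vector_bits, unsigned elem_bits)
{
  return vector_bits / elem_bits;
}

/* Same-nunits policy: keep the lane count fixed, so narrower elements
   give a narrower vector.  Returns the resulting vector width in bits.  */
static unsigned
bits_same_nunits (unsigned lanes, unsigned elem_bits)
{
  return lanes * elem_bits;
}

/* Print the input/output vector shapes for a 32-bit to 16-bit
   truncation under both policies.  */
static void
print_vector_pairs (unsigned vector_bits)
{
  unsigned in_lanes = lanes_same_size (vector_bits, 32);
  unsigned out_lanes = lanes_same_size (vector_bits, 16);
  printf ("same size:   v%usi -> v%uhi\n", in_lanes, out_lanes);
  printf ("same nunits: v%usi -> v%uhi (%u bits)\n", in_lanes, in_lanes,
          bits_same_nunits (in_lanes, 16));
}
```

With 256-bit vectors, the same-size policy pairs 8 lanes of 32-bit input with 16 lanes of 16-bit output (v8si with v16hi), while the same-nunits policy pairs v8si with a 128-bit v8hi, which is the shape the optab check needs.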
Comment 2 Richard Biener 2024-07-19 05:40:16 UTC
Confirmed.  The need for this didn't arise so far.