This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.


[Bug target/82170] gcc optimizes int range-checking poorly on x86-64


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82170

--- Comment #7 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
More complete testcase:
extern inline int f1 (long long n) { return -__INT_MAX__ - 1 <= n && n <= __INT_MAX__; }
extern inline int f2 (long long n) { return n == (int) n; }
extern inline int f3 (unsigned long long n) { return n <= ~0U; }
extern inline int f4 (unsigned long long n) { return n == (unsigned int) n; }
extern inline int f5 (long long n) { return -__SHRT_MAX__ - 1 <= n && n <= __SHRT_MAX__; }
extern inline int f6 (long long n) { return n == (short) n; }
extern inline int f7 (unsigned long long n) { return n <= (unsigned short) ~0U; }
extern inline int f8 (unsigned long long n) { return n == (unsigned short) n; }
extern inline int f9 (long long n) { return -__SCHAR_MAX__ - 1 <= n && n <= __SCHAR_MAX__; }
extern inline int f10 (long long n) { return n == (signed char) n; }
extern inline int f11 (unsigned long long n) { return n <= (unsigned char) ~0U; }
extern inline int f12 (unsigned long long n) { return n == (unsigned char) n; }
extern inline int f13 (int n) { return -__SHRT_MAX__ - 1 <= n && n <= __SHRT_MAX__; }
extern inline int f14 (int n) { return n == (short) n; }
extern inline int f15 (unsigned int n) { return n <= (unsigned short) ~0U; }
extern inline int f16 (unsigned int n) { return n == (unsigned short) n; }
extern inline int f17 (int n) { return -__SCHAR_MAX__ - 1 <= n && n <= __SCHAR_MAX__; }
extern inline int f18 (int n) { return n == (signed char) n; }
extern inline int f19 (unsigned int n) { return n <= (unsigned char) ~0U; }
extern inline int f20 (unsigned int n) { return n == (unsigned char) n; }
extern inline int f21 (short int n) { return -__SCHAR_MAX__ - 1 <= n && n <= __SCHAR_MAX__; }
extern inline int f22 (short int n) { return n == (signed char) n; }
extern inline int f23 (unsigned short int n) { return n <= (unsigned char) ~0U; }
extern inline int f24 (unsigned short int n) { return n == (unsigned char) n; }
extern void foo (void);
void s1 (long long n) { if (f1 (n)) foo (); }
void s2 (long long n) { if (f2 (n)) foo (); }
void s3 (unsigned long long n) { if (f3 (n)) foo (); }
void s4 (unsigned long long n) { if (f4 (n)) foo (); }
void s5 (long long n) { if (f5 (n)) foo (); }
void s6 (long long n) { if (f6 (n)) foo (); }
void s7 (unsigned long long n) { if (f7 (n)) foo (); }
void s8 (unsigned long long n) { if (f8 (n)) foo (); }
void s9 (long long n) { if (f9 (n)) foo (); }
void s10 (long long n) { if (f10 (n)) foo (); }
void s11 (unsigned long long n) { if (f11 (n)) foo (); }
void s12 (unsigned long long n) { if (f12 (n)) foo (); }
void s13 (int n) { if (f13 (n)) foo (); }
void s14 (int n) { if (f14 (n)) foo (); }
void s15 (unsigned int n) { if (f15 (n)) foo (); }
void s16 (unsigned int n) { if (f16 (n)) foo (); }
void s17 (int n) { if (f17 (n)) foo (); }
void s18 (int n) { if (f18 (n)) foo (); }
void s19 (unsigned int n) { if (f19 (n)) foo (); }
void s20 (unsigned int n) { if (f20 (n)) foo (); }
void s21 (short int n) { if (f21 (n)) foo (); }
void s22 (short int n) { if (f22 (n)) foo (); }
void s23 (unsigned short int n) { if (f23 (n)) foo (); }
void s24 (unsigned short int n) { if (f24 (n)) foo (); }

So, this seems to be an instruction selection issue.  Comparing each pair of
functions shows that, at least in terms of instruction count, the latter (the
n == (narrower type) n form) is often, but not always, shorter.

One question is whether we want to canonicalize this during gimple fold (either
the n ==/!= (narrower type) n form, or the
n + narrower_type_min_as_unsigned_wider <=/> narrower_type_max_as_unsigned_wider
form, to the other one if single use); if we do, and the former becomes the
canonical form, we'd also need to adjust the range discovery code that reassoc
uses in the range optimization.
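
To make the forms being discussed concrete, here is a minimal sketch for the
long long -> int case (f1/f2 above).  The function names are made up for
illustration, and the constants in the third function are my reading of the
add-and-compare form, assuming 32-bit int and 64-bit long long; this is not
taken from GCC's sources.

```c
#include <assert.h>
#include <limits.h>

/* Form 1: explicit two-sided range check, as in f1 above.  */
static int in_int_range_cmp (long long n)
{
  return INT_MIN <= n && n <= INT_MAX;
}

/* Form 2: round-trip through the narrower type, as in f2 above.
   Converting an out-of-range value to int is implementation-defined
   in ISO C; GCC documents it as reduction modulo 2^N.  */
static int in_int_range_cast (long long n)
{
  return n == (int) n;
}

/* Form 3 (assumed spelling): bias by 2^31 in the unsigned wider type so
   that [INT_MIN, INT_MAX] maps onto [0, 2^32 - 1], then do one unsigned
   compare.  Assumes 32-bit int and 64-bit long long.  */
static int in_int_range_biased (long long n)
{
  return (unsigned long long) n + 0x80000000ULL <= 0xffffffffULL;
}

/* Returns nonzero if all three forms give the same answer for n.  */
static int forms_agree (long long n)
{
  int a = in_int_range_cmp (n);
  return a == in_int_range_cast (n) && a == in_int_range_biased (n);
}

/* Spot-check the values around the edges of the int range.  */
static void check_boundaries (void)
{
  assert (forms_agree ((long long) INT_MIN - 1));
  assert (forms_agree (INT_MIN));
  assert (forms_agree (INT_MAX));
  assert (forms_agree ((long long) INT_MAX + 1));
  assert (forms_agree (LLONG_MIN));
  assert (forms_agree (LLONG_MAX));
}
```

Calling check_boundaries () from a small driver should pass silently when
built with GCC on x86-64; which of the forms yields the shortest code is
exactly the open question here.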

Then there is the question of how we want to generate the optimal sequence.
Do we, e.g., want to hook into the expansion (somewhere in do_store_flag and
do_compare_and_jump), check for this pattern (using TER info), and perhaps try
to expand both sequences, compute the costs of both and pick the cheaper one?

I'm afraid hardcoding just one way is not always going to be a win.

Or shall the combiner be able to do something?
