Bug 85516 - [missed-optimization] gcc does not convert multiple compares against constants to a shift+bitmask test
Summary: [missed-optimization] gcc does not convert multiple compares against constant...
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: tree-optimization (show other bugs)
Version: 8.0
: P3 enhancement
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords: missed-optimization
Depends on:
Blocks:
 
Reported: 2018-04-24 15:03 UTC by Avi Kivity
Modified: 2021-11-29 15:29 UTC (History)
2 users (show)

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2018-04-25 00:00:00


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Avi Kivity 2018-04-24 15:03:33 UTC
Consider

=== example starts ===
enum class E { a, b, c, d, e, f, g, h, i, j };

bool check_mask(E v) {
    return v == E::b || v == E::e || v == E::f || v == E::j;
}
=== example ends ===

This can be compiled to just three instructions (x86):

  mov $0x232, %eax
  bt %edi, %eax
  setc %al

but instead gcc compiles it to:

  cmpl $1, %edi
  sete %al
  cmpl $4, %edi
  sete %dl
  orb %dl, %al
  jne .L1
  subl $5, %edi
  andl $-5, %edi
  sete %al
 .L1:

which is three times as large and contains a possibly unpredictable branch. More bits in the mask will presumably generate larger code.
Comment 1 Jakub Jelinek 2018-04-24 15:25:42 UTC
GCC can do this for switch expansion right now,
enum class E { a, b, c, d, e, f, g, h, i, j };

bool check_mask(E v) {
  switch (v) {
    case E::b: case E::e: case E::f: case E::j: return true;
    default: return false;
  }
}

	xorl	%eax, %eax
	cmpl	$9, %edi
	ja	.L1
	movl	$1, %eax
	movl	%edi, %ecx
	salq	%cl, %rax
	testl	$562, %eax
	setne	%al
.L1:
	ret
Comment 2 Avi Kivity 2018-04-24 15:29:04 UTC
Interesting. Still missing optimizations - v cannot be larger than 9 (UB), and bt is faster than mov+shift+and.
Comment 3 Jakub Jelinek 2018-04-24 15:35:11 UTC
UB only with -fstrict-enums which isn't the default.
Comment 4 Richard Biener 2018-04-25 07:26:33 UTC
I expected fold_truth_andor to eventually catch this via fold_range_test.  Probably the "bad" association in the testcase prevents this.

Anyway, this is a job for reassoc / if-conversion.
Comment 5 Andrew Pinski 2021-07-24 09:53:23 UTC
Note on the trunk we get the full optimized version with the switch now:
check_mask(E):
        cmpl    $9, %edi
        ja      .L3
        movl    $562, %eax
        btq     %rdi, %rax
        setc    %al
        ret
.L3:
        xorl    %eax, %eax
        ret
Comment 6 Avi Kivity 2021-11-29 15:28:19 UTC
gcc 11.2 produces optimized code with the original example:

check_mask(E):
  cmpl $9, %edi
  ja .L3
  movl %edi, %ecx
  movl $562, %eax
  shrq %cl, %rax
  andl $1, %eax
  ret
.L3:
  xorl %eax, %eax
  ret
Comment 7 Avi Kivity 2021-11-29 15:29:29 UTC
Marking as fixed, since at least 11.2.