RFC: PATCH: Add -m8bit-idiv for x86

Uros Bizjak ubizjak@gmail.com
Wed Sep 15 19:16:00 GMT 2010


On Tue, Sep 14, 2010 at 11:33 PM, H.J. Lu <hjl.tools@gmail.com> wrote:

>>>>> This patch generates 2 idivbs since the optimization is done at RTL
>>>>> expansion. Is there a way to delay this until later when 2 idivls are
>>>>> optimized into 1 idivl and before IRA since this optimization needs
>>>>> a scratch register.
>>>>
>>>> Splitter with && can_create_pseudo_p () split constraint will limit
>>>> splits to pre-regalloc passes, or ...
>>>
>>> try_split doesn't allow any insn of the result matches the original pattern
>>> to avoid infinite loop.
>>
>> So, switch the places of div and mod RTXes in the parallel and provide
>> another divl_1 insn pattern that matches this new parallel.
>>
>
> Here is the updated patch.  I added 2 splitters for each divmod pattern.
> It splits 32bit divmod into
>
> if (dividend and divisor are in [0-255])
>  use 8bit unsigned integer divide
> else
>  use 32bit integer divide
>
> before IRA. It works quite well.  OK for trunk if there are no regressions
> on Linux./ia32 and Linux/x86-64?

> +m8bit-idiv
> +Target Report Var(flag_8bit_idiv) Init(-1) Save
> +Expand 32bit integer divide into control flow with 8bit unsigned integer divide

Please redefine -m8bit-idiv as target mask:

Target Report Mask(USE_8BIT_IDIV) Save

Also, please do not forget to update:

  /* Flag options.  */
  static struct ix86_target_opts flag_opts[] =

in i386.c.

You will be able to use TARGET_USE_8BIT_IDIV automatically, and
hopefully it can be also used as a per file/function target attribute.

> +(define_split
> +  [(set (match_operand:SWIM248 0 "register_operand" "=a")
> +	(div:SWIM248 (match_operand:SWIM248 2 "register_operand" "0")
> +		     (match_operand:SWIM248 3 "nonimmediate_operand" "rm")))
> +   (set (match_operand:SWIM248 1 "register_operand" "=&d")
> +	(mod:SWIM248 (match_dup 2) (match_dup 3)))
> +   (clobber (reg:CC FLAGS_REG))]
> +  "<MODE>mode == SImode
> +   && flag_8bit_idiv
> +   && TARGET_QIMODE_MATH
> +   && can_create_pseudo_p ()
> +   && !optimize_insn_for_size_p ()"
> +  [(const_int 0)]
> +  "ix86_split_idivmod (DIV, <MODE>mode, operands); DONE;")

No need for mode macro, just use SImode explicitly in the splitter.
And due to previous change, flag_8but_idiv can be substituted with
TARGET_USE_8BIT_IDIV define.

> +(define_split
> +  [(set (match_operand:SWIM248 0 "register_operand" "=a")
> +	(udiv:SWIM248 (match_operand:SWIM248 2 "register_operand" "0")
> +		      (match_operand:SWIM248 3 "nonimmediate_operand" "rm")))
> +   (set (match_operand:SWIM248 1 "register_operand" "=&d")
> +	(umod:SWIM248 (match_dup 2) (match_dup 3)))
> +   (clobber (reg:CC FLAGS_REG))]
> +  "reload_completed"
> +  [(set (match_dup 1) (const_int 0))
> +   (parallel [(set (match_dup 0)
> +		   (udiv:SWIM248 (match_dup 2) (match_dup 3)))
> +	      (set (match_dup 1)
> +		   (umod:SWIM248 (match_dup 2) (match_dup 3)))
> +	      (use (match_dup 1))
> +	      (clobber (reg:CC FLAGS_REG))])]
> +  "")

Please omit empty splitter constraints.

> +void
> +ix86_split_idivmod (enum rtx_code code, enum machine_mode mode,
> +		    rtx operands[])

No need for rtx_code, just use "bool unsigned":

+void
+ix86_split_idivmod (enum machine_mode mode, rtx operands[], bool unsigned)

> +  switch (mode)
> +    {
> +    case SImode:
> +      gen_divmod4_1 = code == DIV ? gen_divmodsi4_1 : gen_udivmodsi4_1;
> +      break;
> +    default:
> +      gcc_unreachable ();
> +    }

gcc_assert (mode == SImode);

gen_divmod4_1 = unsigned ? gen_udivmodsi...

Hm.... no DImode?

> +  if (code == DIV)
> +    {
> +      div = gen_rtx_DIV (SImode, operands[2], operands[3]);
> +      mod = gen_rtx_MOD (SImode, operands[2], operands[3]);
> +    }
> +  else
> +    {
> +      div = gen_rtx_UDIV (SImode, operands[2], operands[3]);
> +      mod = gen_rtx_UMOD (SImode, operands[2], operands[3]);
> +    }

if (unsigned)
...

> +This option will enable GCC to expand 32bit integer divide into control
> +flow with 8bit unsigned integer divide.

IMO, you should expand this comment a bit, at least explaining the
reason for this (non-obvious) option and describing some more "control
flow with 8bit ...". If you provide a thorough explanation and a good
reasoning for this option, then it will be used much more.

> 2010-09-14  H.J. Lu  <hongjiu.lu@intel.com>
>
>        * config/i386/i386-protos.h (ix86_split_idivmod): New.

New prototype.

Also, I agree with Andi, this conversion should be also triggered from
profile information.

Thanks,
Uros.



More information about the Gcc-patches mailing list