RFC: PATCH: Add -m8bit-idiv for x86
Uros Bizjak
ubizjak@gmail.com
Wed Sep 15 19:16:00 GMT 2010
On Tue, Sep 14, 2010 at 11:33 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
>>>>> This patch generates 2 idivbs since the optimization is done at RTL
>>>>> expansion. Is there a way to delay this until later when 2 idivls are
>>>>> optimized into 1 idivl and before IRA since this optimization needs
>>>>> a scratch register.
>>>>
>>>> Splitter with && can_create_pseudo_p () split constraint will limit
>>>> splits to pre-regalloc passes, or ...
>>>
>>> try_split doesn't allow any insn of the result matches the original pattern
>>> to avoid infinite loop.
>>
>> So, switch the places of div and mod RTXes in the parallel and provide
>> another divl_1 insn pattern that matches this new parallel.
>>
>
> Here is the updated patch. I added 2 splitters for each divmod pattern.
> It splits 32bit divmod into
>
> if (dividend and divisor are in [0-255])
> use 8bit unsigned integer divide
> else
> use 32bit integer divide
>
> before IRA. It works quite well. OK for trunk if there are no regressions
> on Linux./ia32 and Linux/x86-64?
> +m8bit-idiv
> +Target Report Var(flag_8bit_idiv) Init(-1) Save
> +Expand 32bit integer divide into control flow with 8bit unsigned integer divide
Please redefine -m8bit-idiv as target mask:
Target Report Mask(USE_8BIT_IDIV) Save
Also, please do not forget to update:
/* Flag options. */
static struct ix86_target_opts flag_opts[] =
in i386.c.
You will be able to use TARGET_USE_8BIT_IDIV automatically, and
hopefully it can be also used as a per file/function target attribute.
> +(define_split
> + [(set (match_operand:SWIM248 0 "register_operand" "=a")
> + (div:SWIM248 (match_operand:SWIM248 2 "register_operand" "0")
> + (match_operand:SWIM248 3 "nonimmediate_operand" "rm")))
> + (set (match_operand:SWIM248 1 "register_operand" "=&d")
> + (mod:SWIM248 (match_dup 2) (match_dup 3)))
> + (clobber (reg:CC FLAGS_REG))]
> + "<MODE>mode == SImode
> + && flag_8bit_idiv
> + && TARGET_QIMODE_MATH
> + && can_create_pseudo_p ()
> + && !optimize_insn_for_size_p ()"
> + [(const_int 0)]
> + "ix86_split_idivmod (DIV, <MODE>mode, operands); DONE;")
No need for mode macro, just use SImode explicitly in the splitter.
And due to previous change, flag_8but_idiv can be substituted with
TARGET_USE_8BIT_IDIV define.
> +(define_split
> + [(set (match_operand:SWIM248 0 "register_operand" "=a")
> + (udiv:SWIM248 (match_operand:SWIM248 2 "register_operand" "0")
> + (match_operand:SWIM248 3 "nonimmediate_operand" "rm")))
> + (set (match_operand:SWIM248 1 "register_operand" "=&d")
> + (umod:SWIM248 (match_dup 2) (match_dup 3)))
> + (clobber (reg:CC FLAGS_REG))]
> + "reload_completed"
> + [(set (match_dup 1) (const_int 0))
> + (parallel [(set (match_dup 0)
> + (udiv:SWIM248 (match_dup 2) (match_dup 3)))
> + (set (match_dup 1)
> + (umod:SWIM248 (match_dup 2) (match_dup 3)))
> + (use (match_dup 1))
> + (clobber (reg:CC FLAGS_REG))])]
> + "")
Please omit empty splitter constraints.
> +void
> +ix86_split_idivmod (enum rtx_code code, enum machine_mode mode,
> + rtx operands[])
No need for rtx_code, just use "bool unsigned":
+void
+ix86_split_idivmod (enum machine_mode mode, rtx operands[], bool unsigned)
> + switch (mode)
> + {
> + case SImode:
> + gen_divmod4_1 = code == DIV ? gen_divmodsi4_1 : gen_udivmodsi4_1;
> + break;
> + default:
> + gcc_unreachable ();
> + }
gcc_assert (mode == SImode);
gen_divmod4_1 = unsigned ? gen_udivmodsi...
Hm.... no DImode?
> + if (code == DIV)
> + {
> + div = gen_rtx_DIV (SImode, operands[2], operands[3]);
> + mod = gen_rtx_MOD (SImode, operands[2], operands[3]);
> + }
> + else
> + {
> + div = gen_rtx_UDIV (SImode, operands[2], operands[3]);
> + mod = gen_rtx_UMOD (SImode, operands[2], operands[3]);
> + }
if (unsigned)
...
> +This option will enable GCC to expand 32bit integer divide into control
> +flow with 8bit unsigned integer divide.
IMO, you should expand this comment a bit, at least explaining the
reason for this (non-obvious) option and describing some more "control
flow with 8bit ...". If you provide a thorough explanation and a good
reasoning for this option, then it will be used much more.
> 2010-09-14 H.J. Lu <hongjiu.lu@intel.com>
>
> * config/i386/i386-protos.h (ix86_split_idivmod): New.
New prototype.
Also, I agree with Andi, this conversion should be also triggered from
profile information.
Thanks,
Uros.
More information about the Gcc-patches
mailing list