[PATCH] Simple optimization for MASK_STORE.

Yuri Rumyantsev ysrumyan@gmail.com
Thu Nov 5 15:49:00 GMT 2015


Hi All!

I prepared another patch which performs insertion additional check on
zero mask for masked stores if only parameter
PARAM_ZERO_TEST_FOR_MASK_STORE has non-zero value. My attempt to use
approach proposed by Richard with simpler alternative for comparison -
use scalar type for 256-bit was not successful and I returned to
vectori comparison with scalar Boolean result.

ChangeLog:
2015-11-05  Yuri Rumyantsev  <ysrumyan@gmail.com>

* config/i386/i386.c: Add conditional initialization of
PARAM_ZERO_TEST_FOR_MASK_STORE.
(ix86_expand_branch): Implement vector comparison with boolean result.
* config/i386/i386.h: New macros TARGET_OPTIMIZE_MASK_STORE.
* config/i386/sse.md (define_expand "cbranch<mode>4): Add define-expand
for vector comparion with eq/ne only.
* config/i386/x86-tune.def: New macros X86_TUNE_OPTIMIZE_MASK_STORE.
* fold-const.c (fold_relational_const): Add handling of vector
comparison with boolean result.
* params.def (PARAM_ZERO_TEST_FOR_MASK_STORE): New DEFPARAM.
* params.h (ENABLE_ZERO_TEST_FOR_MASK_STORE): New macros.
* tree-cfg.c (verify_gimple_comparison): Add argument CODE, allow
comparison of vector operands with boolean result for EQ/NE only.
(verify_gimple_assign_binary): Adjust call for verify_gimple_comparison.
(verify_gimple_cond): Likewise.
* tree-ssa-forwprop.c (forward_propagate_into_comparison_1): Do not
combine vector comparison with boolean result and VEC_COND_EXPR that
has vector result.
* tree-vect-loop.c (is_valid_sink): New function.
(optimize_mask_stores): Likewise.
* tree-vect-stmts.c (vectorizable_mask_load_store): Initialize
has_mask_store field of vect_info.
* tree-vectorizer.c (vectorize_loops): Invoke optimaze_mask_stores for
vectorized loops having masked stores.
* tree-vectorizer.h (loop_vec_info): Add new has_mask_store field and
correspondent macros.
(optimize_mask_stores): Add prototype.

gcc/testsuite/ChangeLog:
* gcc.target/i386/avx2-vect-mask-store-move1.c: New test.
* gcc.target/i386/avx2-vect-mask-store-move2.c: Likewise.

2015-11-02 18:24 GMT+03:00 Yuri Rumyantsev <ysrumyan@gmail.com>:
> Hi Richard,
>
> I've come back to this optimization and try to implement your proposal
> for comparison:
>> Btw, you didn't try the simpler alternative of
>>
>> tree type = type_for_mode (int_mode_for_mode (TYPE_MODE (vectype)));
>> build2 (EQ_EXPR, boolean_type_node,
>>  build1 (VIEW_CONVERT, type, op0), build1 (VIEW_CONVERT, type, op1));
>>
>> ?  That is, use the GIMPLE level equivalent of
>>  (cmp (subreg:TI reg:V4SI) (subreg:TI reg:V4SI))
>
> using the following code:
>
>       vectype = TREE_TYPE (mask);
>       ext_mode = mode_for_size (GET_MODE_BITSIZE (TYPE_MODE (vectype)),
> MODE_INT, 0);
>       ext_type = lang_hooks.types.type_for_mode (ext_mode , 1);
>
> but I've got zero type for it. Should I miss something?
>
> Any help will be appreciated.
> Yuri.
>
>
> 2015-08-13 14:40 GMT+03:00 Richard Biener <richard.guenther@gmail.com>:
>> On Thu, Aug 13, 2015 at 1:32 PM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote:
>>> Hi Richard,
>>>
>>> Did you have a chance to look at updated patch?
>>
>> Having a quick look now.  Btw, you didn't try the simpler alternative of
>>
>>  tree type = type_for_mode (int_mode_for_mode (TYPE_MODE (vectype)));
>>  build2 (EQ_EXPR, boolean_type_node,
>>    build1 (VIEW_CONVERT, type, op0), build1 (VIEW_CONVERT, type, op1));
>>
>> ?  That is, use the GIMPLE level equivalent of
>>
>>  (cmp (subreg:TI reg:V4SI) (subreg:TI reg:V4SI))
>>
>> ?  That should be supported by the expander already, though again not sure if
>> the target(s) have compares that match this.
>>
>> Btw, the tree-cfg.c hook wasn't what was agreed on - the restriction
>> on EQ/NE_EXPR
>> is missing.  Operand type equality is tested anyway.
>>
>> Why do you need to restrict forward_propagate_into_comparison_1?
>>
>> Otherwise this looks better, but can you try with the VIEW_CONVERT as well?
>>
>> Thanks,
>> Richard.
>>
>>
>>> Thanks.
>>> Yuri.
>>>
>>> 2015-08-06 14:07 GMT+03:00 Yuri Rumyantsev <ysrumyan@gmail.com>:
>>>> HI All,
>>>>
>>>> Here is updated patch which implements Richard proposal to use vector
>>>> comparison with boolean result instead of target hook. Support for it
>>>> was added to ix86_expand_branch.
>>>>
>>>> Any comments will be appreciated.
>>>>
>>>> Bootstrap and regression testing did not show any new failures.
>>>>
>>>> ChangeLog:
>>>> 2015-08-06  Yuri Rumyantsev  <ysrumyan@gmail.com>
>>>>
>>>> * config/i386/i386.c (ix86_expand_branch): Implement vector
>>>> comparison with boolean result.
>>>> * config/i386/sse.md (define_expand "cbranch<mode>4): Add define
>>>> for vector comparion.
>>>> * fold-const.c (fold_relational_const): Add handling of vector
>>>> comparison with boolean result.
>>>> * params.def (PARAM_ZERO_TEST_FOR_STORE_MASK): New DEFPARAM.
>>>> * params.h (ENABLE_ZERO_TEST_FOR_STORE_MASK): new macros.
>>>> * tree-cfg.c (verify_gimple_comparison): Add test for vector
>>>> comparion with boolean result.
>>>> * tree-ssa-forwprop.c (forward_propagate_into_comparison_1): Do not
>>>> propagate vector comparion with boolean result.
>>>> * tree-vect-stmts.c (vectorizable_mask_load_store): Initialize
>>>> has_mask_store field of vect_info.
>>>> * tree-vectorizer.c: Include files ssa.h, cfghooks.h and params.h.
>>>> (is_valid_sink): New function.
>>>> (optimize_mask_stores): New function.
>>>> (vectorize_loops): Invoke optimaze_mask_stores for loops having masked
>>>> stores.
>>>> * tree-vectorizer.h (loop_vec_info): Add new has_mask_store field and
>>>> correspondent macros.
>>>>
>>>> gcc/testsuite/ChangeLog:
>>>> * gcc.target/i386/avx2-vect-mask-store-move1.c: New test.
>>>>
>>>>
>>>> 2015-07-27 11:48 GMT+03:00 Richard Biener <richard.guenther@gmail.com>:
>>>>> On Fri, Jul 24, 2015 at 9:11 PM, Jeff Law <law@redhat.com> wrote:
>>>>>> On 07/24/2015 03:16 AM, Richard Biener wrote:
>>>>>>>>
>>>>>>>> Is there any rationale given anywhere for the transformation into
>>>>>>>> conditional expressions?  ie, is there any reason why we can't have a
>>>>>>>> GIMPLE_COND where the expression is a vector condition?
>>>>>>>
>>>>>>>
>>>>>>> No rationale for equality compare which would have the semantic of
>>>>>>> having all elements equal or not equal.  But you can't define a sensible
>>>>>>> ordering (that HW implements) for other compare operators and you
>>>>>>> obviously need a single boolean result, not a vector of element comparison
>>>>>>> results.
>>>>>>
>>>>>> Right.  EQ/NE only as others just don't have any real meaning.
>>>>>>
>>>>>>
>>>>>>> I've already replied that I'm fine allowing ==/!= whole-vector compares.
>>>>>>> But one needs to check whether expansion does anything sensible
>>>>>>> with them (either expand to integer subreg compares or add optabs
>>>>>>> for the compares).
>>>>>>
>>>>>> Agreed, EQ/NE for whole vector compares only would be fine for me too under
>>>>>> the same conditions.
>>>>>
>>>>> Btw, you can already do this on GIMPLE by doing
>>>>>
>>>>>   TImode vec_as_int = VIEW_CONVERT_EXPR <TImode> (vec_2);
>>>>>   if (vec_as_int == 0)
>>>>> ...
>>>>>
>>>>> which is what the RTL will look like in the end.  So not sure if making this
>>>>> higher-level in GIMPLE is good or required.
>>>>>
>>>>> Richard.
>>>>>
>>>>>> jeff
-------------- next part --------------
A non-text attachment was scrubbed...
Name: patch.5
Type: application/octet-stream
Size: 19246 bytes
Desc: not available
URL: <http://gcc.gnu.org/pipermail/gcc-patches/attachments/20151105/3c6111b2/attachment.obj>


More information about the Gcc-patches mailing list