This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH] PR target/70155: Use SSE for TImode load/store
- From: Uros Bizjak <ubizjak at gmail dot com>
- To: "H.J. Lu" <hjl dot tools at gmail dot com>
- Cc: "gcc-patches at gcc dot gnu dot org" <gcc-patches at gcc dot gnu dot org>, Ilya Enkovich <enkovich dot gnu at gmail dot com>, Richard Biener <rguenther at suse dot de>
- Date: Mon, 25 Apr 2016 17:10:20 +0200
- Subject: Re: [PATCH] PR target/70155: Use SSE for TImode load/store
- Authentication-results: sourceware.org; auth=none
- References: <20160425125145 dot GA5326 at intel dot com> <CAFULd4ZFMuMR0UVtxYL6teuO-CABSJv6QbqXhkU5DX5F_aGdYQ at mail dot gmail dot com> <CAMe9rOrzOZYGSwq9g9aj3xSjSHnGuBJh_rmFTai0V7eUPkaSTw at mail dot gmail dot com>
On Mon, Apr 25, 2016 at 4:47 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
> On Mon, Apr 25, 2016 at 7:18 AM, Uros Bizjak <ubizjak@gmail.com> wrote:
>> On Mon, Apr 25, 2016 at 2:51 PM, H.J. Lu <hongjiu.lu@intel.com> wrote:
>>> Tested on Linux/x86-64. OK for trunk?
>>
>>> + /* FIXME: Since the CSE pass may change dominance info, which isn't
>>> + expected by the fwprop pass, call free_dominance_info to
>>> + invalidate dominance info. Otherwise, the fwprop pass may crash
>>> + when dominance info is changed. */
>>> + if (TARGET_64BIT)
>>> + free_dominance_info (CDI_DOMINATORS);
>>> +
>>
>> Please resolve the above problem first, target-dependent sources are
>> not the place to apply band-aids for middle-end problems. The thread
>> with the proposed fix died in [1].
>>
>> [1] https://gcc.gnu.org/ml/gcc/2016-03/msg00143.html
>
> free_dominance_info (CDI_DOMINATORS) has been called in other
> places to avoid this middle-end issue. I don't know when the middle-end
> will be fixed. I don't think this target optimization should be penalized by
> the middle-end issue.
Let's ask Richard if he is OK with the workaround...
One more thing:
@@ -3551,6 +3874,7 @@ convert_scalars_to_vector ()
basic_block bb;
bitmap candidates;
int converted_insns = 0;
+ rtx zero, minus_one;
bitmap_obstack_initialize (NULL);
candidates = BITMAP_ALLOC (NULL);
@@ -3585,22 +3909,40 @@ convert_scalars_to_vector ()
if (dump_file)
fprintf (dump_file, "There are no candidates for optimization.\n");
+ if (TARGET_64BIT)
+ {
+ zero = gen_reg_rtx (V1TImode);
+ minus_one = gen_reg_rtx (V1TImode);
+ }
+ else
+ {
+ zero = NULL_RTX;
+ minus_one = NULL_RTX;
+ }
+
Do we *really* need to crate registers here? They are only used as a
temporary in scalar_chain_64::convert_insn, and I think that any CSE
worth its name should find out when the same immediate is loaded to
different temporaries.
BTW: I'd really like if Ilya can review the functionality of the
patch, he is an expert in this conversion stuff.
Uros.