This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Re: [PATCH i386 1/8] [AVX512] Adjust register classes.


On Wed, Jul 24, 2013 at 8:27 PM, Kirill Yukhin <kirill.yukhin@gmail.com> wrote:
> Hello,
> By this patch I am starting series of patches toward Intel (R) AVX-512 and SHA (see [1])
> extensions enabling in GCC.
> I've already submitted corresponding patches to BinUtils (see [2],[3]).
>
> This patch adds comand-line options for avx512* use and relevant cpuid bits
> detection. Vector registers are now 512-bit wide, so support for new modes
> (e.g. V16SF) is added. AVX512F introduce new 16 registers zmm16-zmm31. Some
> instructions are now have EVEX encoding and can now use those new registers while
> old instructions can't. We introduce new register class for them. We also add
> new constraint "v" which allows zmm0-zmm31. We can't extend "x" constraint
> because it's exposed in inline asm, and so may break some inline asm if we
> assign e. g. xmm21 to non-evex encodable instruction.  Idea is to replace all
> uses of "x" for evex-encodable instructions with "v". And allow only scalar and
> 512-bit modes for registers 16+ in ix86_hard_regno_mode_ok. We update move
> instructions to use evex-encodable versions of instructions for AVX512F to
> allow usage of new registers. Main problem is with vector mov<mode>_internal
> in sse.md. In AVX512F we have some instructions reading/writing e. g. ymm16+
> (for exmaple vinsert64x4/vextract64x4),but there in no ymm mov instruction
> with evex encoding, so we have to use insert/extract instead.

Some comments while going through the mega-patch:

+(define_insn "*movxi_internal_avx512f"
+  [(set (match_operand:XI 0 "nonimmediate_operand" "=x,x ,m")
+ (match_operand:XI 1 "vector_move_operand"  "C ,xm,x"))]
+  "TARGET_AVX512F && !(MEM_P (operands[0]) && MEM_P (operands[1]))"
+{
+  switch (which_alternative)
+    {
+    case 0:
+      return standard_sse_constant_opcode (insn, operands[1]);
+    case 1:
+    case 2:
+      return "vmovdqu32\t{%1, %0|%0, %1}";
+    default:
+      gcc_unreachable ();
+    }
+}
+  [(set_attr "type" "sselog1,ssemov,ssemov")
+   (set_attr "prefix" "evex")
+   (set_attr "mode" "XI")])

Even though the cost of an unaligned move is equal to that of an aligned
one, I would rather see these moves split into aligned and unaligned
cases, in the same way as movoi_internal_avx.
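Concretely, the split being asked for would mirror the OImode handling; a sketch of what the memory alternatives of the XImode pattern above might become (assuming the usual misaligned_operand test, as in *movoi_internal_avx):

```
    case 1:
    case 2:
      /* Sketch: emit the aligned opcode unless either operand is a
	 misaligned memory reference, as *movoi_internal_avx does.  */
      if (misaligned_operand (operands[0], XImode)
	  || misaligned_operand (operands[1], XImode))
	return "vmovdqu32\t{%1, %0|%0, %1}";
      else
	return "vmovdqa32\t{%1, %0|%0, %1}";
```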

 (define_insn "*mov<mode>_internal"
-  [(set (match_operand:V16 0 "nonimmediate_operand"               "=x,x ,m")
- (match_operand:V16 1 "nonimmediate_or_sse_const_operand"  "C ,xm,x"))]
+  [(set (match_operand:V16 0 "nonimmediate_operand"               "=v,v ,m")
+ (match_operand:V16 1 "nonimmediate_or_sse_const_operand"  "C ,vm,v"))]

...

+    /* Reg -> reg move is always aligned.  Just use wider move.  */
+    switch (mode)
+      {
+      case MODE_V8SF:
+      case MODE_V4SF:
+ return "vmovaps\t{%g1, %g0|%g0, %g1}";
+      case MODE_V4DF:
+      case MODE_V2DF:
+ return "%vmovapd\t{%g1, %g0|%g0, %g1}";

No need for the % prefix; this is an AVX-only insn.

+      case MODE_OI:
+      case MODE_TI:
+ return "vmovdqu64\t{%g1, %g0|%g0, %g1}";

vmovdqa64; we are operating on registers only.

+    case MODE_XI:
+      if (<MODE>mode == V8DImode)
+ return "vmovdqu64\t{%1, %0|%0, %1}";
+      else
+ return "vmovdqu32\t{%1, %0|%0, %1}";

Please put the mode calculation into the mode attribute. It looks like
sseinsnmode is not working correctly here, and needs to be updated to
correctly select OI and XI modes.
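One way to do that (a sketch; the exact mode list in sse.md may differ) is to extend the sseinsnmode mode attribute so the 512-bit vector modes map to XI, and then let the pattern use it instead of open-coded <MODE>mode tests:

```
;; Sketch: map 512-bit vector modes to XI, 256-bit to OI, 128-bit to TI,
;; so patterns can simply say (set_attr "mode" "<sseinsnmode>").
(define_mode_attr sseinsnmode
  [(V16SF "XI") (V8DF "XI") (V16SI "XI") (V8DI "XI") (XI "XI")
   (V8SF "OI") (V4DF "OI") (V8SI "OI") (V4DI "OI") (OI "OI")
   (V4SF "TI") (V2DF "TI") (V4SI "TI") (V2DI "TI") (TI "TI")])
```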

Uros.

