[PATCH] [AVX512] [PR87767] Optimize memory broadcast for constant vector under AVX512

Jan Hubicka hubicka@ucw.cz
Thu Aug 27 11:09:24 GMT 2020


> Under what circumstances are we seeing a SEQUENCE in the x86 backend?  I'm
> surprised we need to handle that case.
> 
> So your pass modifies the insn in place, which is fine.  But do we actually
> remove the original constant pool entry if it's no longer used?  If not, does
> this patch actually save anything (memory bandwidth perhaps?)

Constant pool entries are output only if they are actually referenced by
the asm output, so an entry that is no longer used should simply not be
emitted and this could just work.
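
As a toy illustration of that point (a minimal sketch, not GCC's actual
code; pool_entry, mark_used and count_emitted are made-up names used
only for this example): if each pool entry carries a "used" flag that is
set only when a surviving insn references it, then rewriting an insn so
that it no longer refers to an entry is enough to keep that entry out of
the output.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: this is NOT GCC's real constant pool.  */
struct pool_entry
{
  double value;  /* the constant stored in the pool */
  bool used;     /* true once some emitted insn references the entry */
};

/* Hypothetical helper: record that an insn references entry IDX.  */
void
mark_used (struct pool_entry *pool, size_t idx)
{
  assert (pool != NULL);
  pool[idx].used = true;
}

/* Hypothetical helper: count the entries that would be written out,
   i.e. only those that were marked as referenced.  */
size_t
count_emitted (const struct pool_entry *pool, size_t n)
{
  size_t emitted = 0;

  assert (pool != NULL);
  for (size_t i = 0; i < n; i++)
    if (pool[i].used)
      emitted++;
  return emitted;
}
```

In this model an entry whose only reference was rewritten away is never
marked, so it costs nothing in the output.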
> 
> Is there an existing pass over the RTL chain where this would work so that it's
> more compile-time efficient?

I was also concerned about adding yet another pass and wanted to look a
bit more into the possibility of making this part of the peephole pass.
While it is true that the usual way to write it (adding an extra pattern
for every instruction) is a lot of work, I was wondering whether we could
perhaps just add a quite generic define_peephole which will match
everything containing a broadcast via a predicate and call into an
expander that will try to build the matching instruction and fail
otherwise.  While it is still a bit of a hack, I think it may be less
intrusive than yet another machine-specific pass.

Honza
> 
> jeff
> 

