[PATCH] [AVX512] [PR87767] Optimize memory broadcast for constant vector under AVX512
Thu Aug 27 11:09:24 GMT 2020
> Under what circumstances are we seeing a SEQUENCE in the x86 backend? I'm
> surprised we need to handle that case.
> So your pass modifies the insn in place, which is fine. But do we actually
> remove the original constant pool entry if it's no longer used? If not, does
> this patch actually save anything (memory bandwidth perhaps?)
Constant pool entries are output only if actually used by asm output, so
this could just work.
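
For illustration, here is a minimal sketch of the kind of source where this matters. The function name add_three and the compiler flags mentioned in the comment are assumptions for the example, not taken from the patch or the PR testcase. When the loop is vectorized for AVX512, the constant 3 becomes a vector whose lanes are all equal, so the operand could in principle be a broadcast of one 4-byte element instead of a load of the full 64-byte vector. If nothing else references the full-width constant pool entry after the insn is rewritten, it would not be emitted.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical example: adds the same constant to every element.
   With vectorization enabled for AVX512 (for example -O3 -mavx512f,
   which is an assumption here), the compiler may materialize the
   constant 3 as a 16-lane vector with identical elements. */
void add_three(int32_t *data, size_t n)
{
    for (size_t i = 0; i < n; i++)
        data[i] += 3;
}
```

Whether a given compiler version actually picks the broadcast form for this loop depends on the target options and the vectorizer, so the generated assembly should be checked rather than assumed.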
> Is there an existing pass over the RTL chain where this would work so that it's
> more compile-time efficient?
I was also concerned about adding yet another pass and wanted to look a
bit more into the possibility of making this a part of the peephole pass.
While it is true that the usual way to write it (adding an extra pattern
for every instruction) is a lot of work, I was wondering whether we could
perhaps just add a quite generic define_peephole which will match
everything containing a broadcast via a predicate and call into an
expander that will try to build the matching instruction and fail
otherwise. While it is still a bit of a hack, I think it may be less
intrusive than yet another machine specific