This is the mail archive of the
mailing list for the GCC project.
lvx versus lxvd2x on power8
- From: Igor Henrique Soares Nunes <igor dot nunes at eldorado dot org dot br>
- To: "gcc at gcc dot gnu dot org" <gcc at gcc dot gnu dot org>
- Cc: "wschmidt at linux dot vnet dot ibm dot com" <wschmidt at linux dot vnet dot ibm dot com>
- Date: Mon, 10 Apr 2017 18:36:31 +0000
- Subject: lvx versus lxvd2x on power8
I recently read this old discussion about when and why to use lxvd2x instead of the lvsl/lvx/vperm/lvx sequence to load elements from memory into a vector register: https://gcc.gnu.org/ml/gcc/2015-03/msg00135.html
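For readers unfamiliar with the two approaches, here is a minimal portable C sketch that emulates what each one does with an unaligned address. It is not POWER8 code and uses no intrinsics: `vec128` and the three `load_*` helpers are hypothetical names, `memcpy` stands in for the hardware loads, and the element reordering that lxvd2x performs on little-endian targets is ignored. The sketch relies on lvx ignoring the low four bits of the effective address, and it assumes the two aligned 16-byte blocks around the address are readable.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for a 128-bit vector register (hypothetical type). */
typedef struct { uint8_t b[16]; } vec128;

/* lvx-like load: the low four address bits are ignored, so the data
   always comes from the enclosing 16-byte-aligned block. */
static vec128 load_aligned_block(const uint8_t *p)
{
    vec128 v;
    const uint8_t *base = (const uint8_t *)((uintptr_t)p & ~(uintptr_t)15);
    memcpy(v.b, base, 16);
    return v;
}

/* lvsl/lvx/vperm/lvx-like sequence: two aligned loads, then a byte
   selection driven by the misalignment of the address. */
static vec128 load_unaligned_permute(const uint8_t *p)
{
    unsigned shift = (unsigned)((uintptr_t)p & 15); /* what lvsl turns into a permute mask */
    vec128 lo = load_aligned_block(p);
    vec128 hi = load_aligned_block(p + 15); /* same block as lo when p is aligned */
    vec128 out;
    for (unsigned i = 0; i < 16; i++) {
        unsigned idx = shift + i; /* index into the 32-byte pair lo||hi */
        out.b[i] = idx < 16 ? lo.b[idx] : hi.b[idx - 16];
    }
    return out;
}

/* lxvd2x-like load: a single load from an arbitrary address. */
static vec128 load_unaligned_direct(const uint8_t *p)
{
    vec128 v;
    memcpy(v.b, p, 16);
    return v;
}

/* Sanity check: both strategies must yield the same 16 bytes. */
static void check_loads_agree(const uint8_t *p)
{
    vec128 a = load_unaligned_permute(p);
    vec128 b = load_unaligned_direct(p);
    assert(memcmp(a.b, b.b, 16) == 0);
}
```

The point of the emulation is only to show the shape of the work: the permute-based path always issues two aligned loads plus a byte shuffle, while the direct path issues one load that may cross a 16-byte boundary.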
I had the same question, and I was also concerned about how the two approaches compare in performance. So I created the following project to check which one is faster and how memory alignment influences the results:
The code is simple: it executes many loads (using both approaches) in a loop in order to measure which implementation is slower. The project also takes alignment into account.
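A measurement loop of this kind could be sketched roughly as follows. This is not the code from the project above; it is a portable stand-in in which `load_fn`, `load_direct` and `time_loads` are hypothetical names, `memcpy` takes the place of the real vector load, and `clock()` is used as a coarse timer. On POWER8 the load routine would instead wrap the actual instruction sequences being compared.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <time.h>

/* Hypothetical signature for a load routine: read 16 bytes at p into out. */
typedef void (*load_fn)(const uint8_t *p, uint8_t out[16]);

/* Placeholder for the load under test (here just a 16-byte copy). */
static void load_direct(const uint8_t *p, uint8_t out[16])
{
    memcpy(out, p, 16);
}

/* Run `iters` loads from buf + offset and return the elapsed CPU time
   in seconds. One byte of every loaded vector is accumulated into
   *sink so that the result of each load is actually used. */
static double time_loads(load_fn fn, const uint8_t *buf, size_t offset,
                         long iters, uint64_t *sink)
{
    uint8_t v[16];
    uint64_t sum = 0;

    assert(iters > 0);
    clock_t t0 = clock();
    for (long i = 0; i < iters; i++) {
        fn(buf + offset, v);
        sum += v[i & 15];
    }
    clock_t t1 = clock();

    *sink = sum;
    return (double)(t1 - t0) / CLOCKS_PER_SEC;
}
```

Calling `time_loads` once with `offset` equal to 0 and once with an offset that is not a multiple of 16, for each load routine, gives the aligned versus unaligned comparison. A real benchmark would also need enough iterations to dominate timer resolution and some care that the compiler does not hoist the load out of the loop.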
As can be seen in this plot (https://raw.githubusercontent.com/igorsnunes/load_vec_cmp/master/doc/LoadVecCompare.png), an unaligned load using lxvd2x takes more time.
The previous discussion (as far as I could see) suggests that lxvd2x performs better than lvsl/lvx/vperm/lvx in all cases. Is that correct? Is my analysis wrong?
This issue concerns me, since lxvd2x is heavily used in compiled code.