PING^8 [PATCH 0/9] rs6000: Rework rs6000_emit_vector_compare

Kewen.Lin linkw@linux.ibm.com
Tue Dec 12 06:08:31 GMT 2023


Hi,

Gentle ping for this series:

https://gcc.gnu.org/pipermail/gcc-patches/2022-November/607146.html

BR,
Kewen

>>>>>>> on 2022/11/24 17:15, Kewen Lin wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> Following Segher's suggestion, this patch series reworks the
>>>>>>>> function rs6000_emit_vector_compare for vector float and int
>>>>>>>> in multiple steps.  It's based on the previous attempts [1][2].
>>>>>>>> As mentioned in [1], the rework for float is needed to provide
>>>>>>>> a centralized place for vector float comparison handling,
>>>>>>>> instead of supporting it by swapping operands, reversing the
>>>>>>>> code, etc. in scattered places.  It also prepares for a
>>>>>>>> subsequent patch to handle comparison operators with or without
>>>>>>>> trapping math (PR105480).  With the vector float handling
>>>>>>>> reworked, we can further simplify the vector int handling, as
>>>>>>>> shown.
>>>>>>>>
>>>>>>>> For Segher's concern about whether this rework causes any
>>>>>>>> assembly change, I previously constructed two testcases, for
>>>>>>>> vector float [3] and int [4] respectively.  They showed that
>>>>>>>> most cases are unchanged, except for a difference on LE and
>>>>>>>> UNGT, which is an improvement since it uses GE instead of
>>>>>>>> GT ior EQ.  The associated test case in patch 3/9 is a good
>>>>>>>> example.
>>>>>>>>
>>>>>>>> Besides, with and without the whole patch series, I built all
>>>>>>>> of SPEC2017 at -O3 and -Ofast separately and checked the
>>>>>>>> differences in the object assembly.  The result showed that
>>>>>>>> most object files are unchanged, except for:
>>>>>>>>
>>>>>>>>   * at -O3, 521.wrf_r has 9 object files and 526.blender_r has
>>>>>>>>     9 object files with differences.
>>>>>>>>
>>>>>>>>   * at -Ofast, 521.wrf_r has 12 object files, 526.blender_r has
>>>>>>>>     one and 527.cam4_r has 4 object files with differences.
>>>>>>>>
>>>>>>>> Looking into these differences, all the significant ones are
>>>>>>>> caused by the known improvement mentioned above, transforming
>>>>>>>> GT ior EQ to GE, which can also affect unrolling decisions due
>>>>>>>> to the insn count.  The other, trivial differences are branch
>>>>>>>> target offset differences, nop differences for alignment, vsx
>>>>>>>> register number differences, etc.
>>>>>>>>
>>>>>>>> I also evaluated the runtime performance of these changed
>>>>>>>> benchmarks; the result is neutral.
>>>>>>>>
>>>>>>>> These patches are bootstrapped and regress-tested
>>>>>>>> incrementally on powerpc64-linux-gnu P7 & P8, and
>>>>>>>> powerpc64le-linux-gnu P9 & P10.
>>>>>>>>
>>>>>>>> Is it ok for trunk?
>>>>>>>>
>>>>>>>> BR,
>>>>>>>> Kewen
>>>>>>>> -----
>>>>>>>> [1] https://gcc.gnu.org/pipermail/gcc-patches/2022-November/606375.html
>>>>>>>> [2] https://gcc.gnu.org/pipermail/gcc-patches/2022-November/606376.html
>>>>>>>> [3] https://gcc.gnu.org/pipermail/gcc-patches/2022-November/606504.html
>>>>>>>> [4] https://gcc.gnu.org/pipermail/gcc-patches/2022-November/606506.html
>>>>>>>>
>>>>>>>> Kewen Lin (9):
>>>>>>>>   rs6000: Rework vector float comparison in rs6000_emit_vector_compare - p1
>>>>>>>>   rs6000: Rework vector float comparison in rs6000_emit_vector_compare - p2
>>>>>>>>   rs6000: Rework vector float comparison in rs6000_emit_vector_compare - p3
>>>>>>>>   rs6000: Rework vector float comparison in rs6000_emit_vector_compare - p4
>>>>>>>>   rs6000: Rework vector integer comparison in rs6000_emit_vector_compare - p1
>>>>>>>>   rs6000: Rework vector integer comparison in rs6000_emit_vector_compare - p2
>>>>>>>>   rs6000: Rework vector integer comparison in rs6000_emit_vector_compare - p3
>>>>>>>>   rs6000: Rework vector integer comparison in rs6000_emit_vector_compare - p4
>>>>>>>>   rs6000: Rework vector integer comparison in rs6000_emit_vector_compare - p5
>>>>>>>>
>>>>>>>>  gcc/config/rs6000/rs6000.cc                 | 180 ++++++--------------
>>>>>>>>  gcc/testsuite/gcc.target/powerpc/vcond-fp.c |  25 +++
>>>>>>>>  2 files changed, 74 insertions(+), 131 deletions(-)
>>>>>>>>  create mode 100644 gcc/testsuite/gcc.target/powerpc/vcond-fp.c
>>>>>>>>
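
For readers following the quoted cover letter, the LE/UNGT difference
it describes can be sketched in scalar C.  This is only an
illustration, not code from the patch series: all function names below
are hypothetical, each function models a single vector lane (1 stands
for an all-ones mask lane), and default IEEE semantics are assumed
(no -ffast-math).

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Hypothetical per-lane models; 1 stands for an all-ones mask lane.
   Assumes default IEEE semantics (built without -ffast-math).  */

/* a <= b built from two compares whose results are ORed together.  */
int le_via_gt_ior_eq(float a, float b) { return (b > a) | (b == a); }

/* a <= b as a single GE compare with swapped operands.  */
int le_via_ge(float a, float b) { return b >= a; }

/* UNGT (unordered or greater than) as the inverse of that GE compare.  */
int ungt_via_ge(float a, float b) { return !(b >= a); }

/* Count input pairs where any form disagrees with the reference result.  */
int count_mismatches(void)
{
  const float vals[] = { -1.0f, 0.0f, 1.0f, INFINITY, NAN };
  const size_t n = sizeof vals / sizeof vals[0];
  int bad = 0;

  for (size_t i = 0; i < n; i++)
    for (size_t j = 0; j < n; j++)
      {
        float a = vals[i], b = vals[j];
        int le_ref = a <= b;
        int ungt_ref = isunordered(a, b) || a > b;

        bad += le_via_gt_ior_eq(a, b) != le_ref;
        bad += le_via_ge(a, b) != le_ref;
        bad += ungt_via_ge(a, b) != ungt_ref;
      }
  return bad;
}

/* Aborts if the single-compare forms ever differ from the reference.  */
void verify_equivalence(void) { assert(count_mismatches() == 0); }
```

Calling verify_equivalence() from a small main() is expected to pass
silently, including for the NaN inputs, which is why one GE compare can
stand in for the GT ior EQ pair.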

