[Bug target/87832] AMD pipeline models are very costly size-wise

cvs-commit at gcc dot gnu.org gcc-bugzilla@gcc.gnu.org
Wed Nov 16 13:41:54 GMT 2022


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87832

--- Comment #4 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Alexander Monakov <amonakov@gcc.gnu.org>:

https://gcc.gnu.org/g:dd744f06c9952f92738b0860630085f0f0b99574

commit r13-4092-gdd744f06c9952f92738b0860630085f0f0b99574
Author: Alexander Monakov <amonakov@ispras.ru>
Date:   Tue Nov 1 17:04:25 2022 +0300

    i386: correct x87&SSE division modeling in znver.md

    Correct modeling of division instructions in the SIMD/FP domain for
    AMD Zen architectures and avoid combinatorial explosion of automaton
    tables by modeling the separate floating-point division unit and
    correcting reservations to reflect reciprocal throughput of the
    corresponding instructions, similar to earlier commit
    5cee5f94000 ("i386: correct integer division modeling in znver.md").

    Division is partially pipelined and some instructions have fractional
    reciprocal throughput (e.g. Zen 3 can issue divss and divsd once every
    3.5 and 4.5 cycles on average, respectively). Since these CPUs implement
    out-of-order execution, the model doesn't need to be exact to the last
    cycle, so simplify it by using 4/5 cycles for SF/DF modes, and by not
    modeling the fact that the FP3 pipe is occupied for one cycle.
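
The effect of reserving a dedicated divider unit for a whole number of cycles can be illustrated with a toy model. This is only a sketch, not the znver.md reservation syntax or the scheduler's actual algorithm: the function name `issue_cycles` and the constants are hypothetical, and the 4/5 cycle values are the rounded SF/DF figures quoted above.

```python
# Toy model (hypothetical, not GCC code): each division reserves a single
# divider unit for a fixed number of cycles, so back-to-back divisions can
# only issue once the previous reservation has expired.

SF_RESERVE = 4  # modeled cycles for SF mode (3.5 measured on Zen 3, rounded up)
DF_RESERVE = 5  # modeled cycles for DF mode (4.5 measured on Zen 3, rounded up)


def issue_cycles(n_insns, reserve_cycles):
    """Return the cycle at which each of n_insns dependent-free divisions issues."""
    cycles = []
    unit_free_at = 0
    for _ in range(n_insns):
        start = unit_free_at              # wait until the divider is free
        cycles.append(start)
        unit_free_at = start + reserve_cycles
    return cycles


if __name__ == "__main__":
    print(issue_cycles(4, SF_RESERVE))
    print(issue_cycles(4, DF_RESERVE))
```

In this model the spacing between consecutive issue cycles equals the reservation length, which is why a reservation that matches the reciprocal throughput is enough to approximate the hardware on an out-of-order core.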

    Top znver table sizes in insn-automata.o:

    Before:

    428108 r znver1_fp_min_issue_delay
    856216 r znver1_fp_transitions

    After:

    30056 r znver1_fp_min_issue_delay
    120224 r znver1_fp_transitions
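
The size reduction implied by the figures above can be worked out directly. The snippet below is just arithmetic on the quoted numbers; the helper name `summarize` is hypothetical, and it assumes the sizes are in bytes, which the message does not state explicitly.

```python
# Table sizes quoted in the commit message (assumed to be bytes).
BEFORE = {"znver1_fp_min_issue_delay": 428108, "znver1_fp_transitions": 856216}
AFTER = {"znver1_fp_min_issue_delay": 30056, "znver1_fp_transitions": 120224}


def summarize(before, after):
    """Return per-table shrink factors and the total size saved."""
    factors = {name: before[name] / after[name] for name in before}
    total_saved = sum(before.values()) - sum(after.values())
    return factors, total_saved


if __name__ == "__main__":
    factors, total_saved = summarize(BEFORE, AFTER)
    for name, factor in factors.items():
        print(f"{name}: {factor:.1f}x smaller")
    print(f"total saved: {total_saved}")
```

Under that assumption the two tables shrink by roughly 14x and 7x, for a combined saving of a little over 1.1 million bytes.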

    gcc/ChangeLog:

            PR target/87832
            * config/i386/znver.md (znver1_fdiv): New automaton.
            (znver1-fdiv): New unit.
            (znver1_fp_op_div): Correct unit and cycles in the reservation.
            (znver1_fp_op_div_load): Ditto.
            (znver1_fp_op_idiv_load): Ditto.
            (znver2_fp_op_idiv_load): Ditto.
            (znver1_ssediv_ss_ps): Ditto.
            (znver1_ssediv_ss_ps_load): Ditto.
            (znver1_ssediv_sd_pd): Ditto.
            (znver1_ssediv_sd_pd_load): Ditto.
            (znver1_ssediv_avx256_ps): Ditto.
            (znver1_ssediv_avx256_ps_load): Ditto.
            (znver1_ssediv_avx256_pd): Ditto.
            (znver1_ssediv_avx256_pd_load): Ditto.

