This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Index Nav: | [Date Index] [Subject Index] [Author Index] [Thread Index] | |
---|---|---|
Message Nav: | [Date Prev] [Date Next] | [Thread Prev] [Thread Next] |
Other format: | [Raw text] |
On Thu, Apr 12, 2012 at 5:01 PM, Andrey Belevantsev<abel@ispras.ru> wrote:On 12.04.2012 16:38, Richard Guenther wrote:
On Thu, Apr 12, 2012 at 2:36 PM, Igor Zamyatin<izamyatin@gmail.com> wrote:
On Thu, Apr 12, 2012 at 4:24 PM, Richard Guenther <richard.guenther@gmail.com> wrote:
On Thu, Apr 12, 2012 at 2:00 PM, Alexander Monakov<amonakov@ispras.ru> wrote:
Can atom execute two IMUL in parallel? Or what exactly is the pipeline behavior?
As I understand from Intel's optimization reference manual, the behavior is as follows: if the instruction immediately following IMUL has shorter latency, execution is stalled for 4 cycles (which is IMUL's latency); otherwise, a 4-or-more cycles latency instruction can be issued after IMUL without a stall. In other words, IMUL is pipelined with respect to other long-latency instructions, but not to short-latency instructions.
It seems to be modeled in the pipeline description though:
;;; imul insn has 5 cycles latency (define_reservation "atom-imul-32" "atom-imul-1, atom-imul-2, atom-imul-3, atom-imul-4, atom-port-0")
;;; imul instruction excludes other non-FP instructions. (exclusion_set "atom-eu-0, atom-eu-1" "atom-imul-1, atom-imul-2, atom-imul-3, atom-imul-4")
The main idea is quite simple:
If we are going to schedule IMUL instruction (it is on the top of ready list) we try to find out producer of other (independent) IMUL instruction that is in ready list too. The goal is try to schedule such a producer to get another IMUL in ready list and get scheduling of 2 successive IMUL instructions.
Why does that not happen without your patch? Does it never happen without your patch or does it merely not happen for one EEMBC benchmark (can you provide a testcase?)?
It does not happen because the scheduler by itself does not do such specific reordering. That said, it is easy to imagine the cases where this patch will make things worse rather than better.
Igor, why not try different subtler mechanisms like adjust_priority, which is get called when an insn is added to the ready list? E.g. increase the producer's priority.
The patch as is misses checks for NONDEBUG_INSN_P. Also, why bail out when you have more than one imul in the ready list? Don't you want to bump the priority of the other imul found?
Could you provide some examples when this patch would harm the performance?
Sched_reorder was chosen since it is used in other ports and looks most suitable for such case, e.g. it provides access to the whole ready list. BTW, just increasing producer's priority seems to be more risky in performance sense - we can incorrectly start delaying some instructions.
Thought ready list doesn't contain DEBUG_INSN... Is it so? If it contains them - this could be added easily
The case with more than one imul in the ready list wasn't considered just because the goal was to catch the particular case when there is a risk to get the following picture: "imul-producer-imul" which is less effective than "producer-imul-imul" for Atom
Andrey
And MD allows us only prefer scheduling of successive IMUL instructions, i.e. If IMUL was just scheduled and ready list contains another IMUL instruction then it will be chosen as the best candidate for scheduling.
at least from my very limited guessing of what the above does. So, did you analyze why the scheduler does not properly schedule things for you?
Richard.
From reading the patch, I could not understand the link between pipeline behavior and what the patch appears to do.
Hope that helps.
Alexander
Thanks, Igor
Index Nav: | [Date Index] [Subject Index] [Author Index] [Thread Index] | |
---|---|---|
Message Nav: | [Date Prev] [Date Next] | [Thread Prev] [Thread Next] |