An automatic thread decomposition approach for pipelined multithreading Online publication date: Wed, 30-Jul-2014
by Yuanming Zhang; Kanemitsu Ootsu; Takashi Yokota; Takanobu Baba
International Journal of High Performance Computing and Networking (IJHPCN), Vol. 7, No. 3, 2013
Abstract: Thread decomposition is critical for pipelined multithreading (PMT) to gain higher performance on target multi-core processors. This paper presents an automatic thread decomposition approach, which maps the decomposition problem onto a graph-theoretic framework to construct an optimised directed acyclic graph (DAG) with minimal bottleneck node size and balanced node size. In this approach, control dependence is treated as special data dependence and then an effective approach is proposed to remove redundant control dependences. A weighted DAG is constructed by assigning appropriate weights to all nodes and all dependences according to profile information. An automatic thread decomposition algorithm is given to generate an optimised pipeline based on the weighted DAG. The algorithm has been evaluated on a commodity multi-core processor, and experimental results show that it has achieved speedup ranging from 113% to 174% on some SPEC CPU 2000 benchmark programs.
Existing subscribers:
Go to Inderscience Online Journals to access the Full Text of this article.
If you are not a subscriber and you just want to read the full contents of this article, buy online access here.Complimentary Subscribers, Editors or Members of the Editorial Board of the International Journal of High Performance Computing and Networking (IJHPCN):
Login with your Inderscience username and password:
Want to subscribe?
A subscription gives you complete access to all articles in the current issue, as well as to all articles in the previous three years (where applicable). See our Orders page to subscribe.
If you still need assistance, please email subs@inderscience.com