Manycore performance analysis using timed configuration graphs
Högskolan i Halmstad, Sektionen för Informationsvetenskap, Data– och Elektroteknik (IDE), Halmstad Embedded and Intelligent Systems Research (EIS), Centrum för forskning om inbyggda system (CERES).
Högskolan i Halmstad, Sektionen för Informationsvetenskap, Data– och Elektroteknik (IDE), Halmstad Embedded and Intelligent Systems Research (EIS), Centrum för forskning om inbyggda system (CERES). ORCID iD: 0000-0001-6625-6533
2009 (English). In: International Symposium on Systems, Architectures, Modeling, and Simulation, 2009. SAMOS '09 / [ed] Michael Joseph Schulte and Walid Najjar, Piscataway, N.J.: IEEE Press, 2009, pp. 108-117. Conference paper, Published paper (Refereed)
Abstract [en]

The programming complexity of increasingly parallel processors calls for new tools to assist programmers in utilising the parallel hardware resources. In this paper we present a set of models that we have developed to form part of a tool which is intended for iteratively tuning the mapping of dataflow graphs onto manycores. One of the models is used for capturing the essentials of manycores that are identified as suitable for signal processing and which we use as target architectures. Another model is the intermediate representation in the form of a timed configuration graph, describing the mapping of a dataflow graph onto a machine model. Moreover, this IR can be used for performance evaluation using abstract interpretation. We demonstrate how the models can be configured and applied in order to map applications on the Raw processor. Furthermore, we report promising results on the accuracy of performance predictions produced by our tool. It is also demonstrated that the tool can be used to rank different mappings with respect to optimisation on throughput and end-to-end latency.
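
As a rough illustration of what ranking mappings by throughput and end-to-end latency involves, here is a minimal Python sketch. It is not the paper's timed configuration graph or its abstract interpretation. The `Machine` class, the cost model (a fixed cycle cost per hop on a 2D mesh, charged to the sending core), the actor cycle counts and the three mappings are all hypothetical, and the latency estimate ignores contention between actors sharing a core.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Machine:
    """Hypothetical 2D-mesh machine model (an assumption, not the paper's model)."""
    cols: int        # number of cores per mesh row
    hop_cycles: int  # assumed cycles to move one token across one mesh hop

    def hops(self, a: int, b: int) -> int:
        """Manhattan distance between core ids a and b."""
        a_row, a_col = divmod(a, self.cols)
        b_row, b_col = divmod(b, self.cols)
        return abs(a_row - b_row) + abs(a_col - b_col)


def evaluate(actors, edges, mapping, machine):
    """Return (period, latency) in cycles for one mapping.

    actors:  dict actor -> compute cycles per firing, listed in topological order
    edges:   list of (producer, consumer) pairs
    mapping: dict actor -> core id
    """
    # Period: the busiest core (compute plus outgoing communication)
    # bounds steady-state throughput; a smaller period means higher throughput.
    load = {}
    for name, cycles in actors.items():
        load[mapping[name]] = load.get(mapping[name], 0) + cycles
    for src, dst in edges:
        load[mapping[src]] += machine.hop_cycles * machine.hops(mapping[src], mapping[dst])
    period = max(load.values())

    # Latency: longest path through the graph, including communication delays.
    finish = {}
    for name, cycles in actors.items():
        start = 0
        for src, dst in edges:
            if dst == name:
                comm = machine.hop_cycles * machine.hops(mapping[src], mapping[dst])
                start = max(start, finish[src] + comm)
        finish[name] = start + cycles
    return period, max(finish.values())


def rank(mappings, actors, edges, machine, objective):
    """Order mapping names from best to worst for 'throughput' or 'latency'."""
    index = {"throughput": 0, "latency": 1}[objective]
    scores = {name: evaluate(actors, edges, m, machine) for name, m in mappings.items()}
    return sorted(scores, key=lambda name: scores[name][index])


# Hypothetical three-stage pipeline a -> b -> c on a 2x2 mesh (cores 0..3).
ACTORS = {"a": 10, "b": 20, "c": 10}
EDGES = [("a", "b"), ("b", "c")]
MESH = Machine(cols=2, hop_cycles=2)
MAPPINGS = {
    "spread": {"a": 0, "b": 1, "c": 3},
    "packed": {"a": 0, "b": 0, "c": 0},
    "far": {"a": 0, "b": 3, "c": 0},
}

if __name__ == "__main__":
    for objective in ("throughput", "latency"):
        print(objective, rank(MAPPINGS, ACTORS, EDGES, MESH, objective))
```

In this toy setup the two objectives disagree: spreading the actors over several cores gives the shortest period, while packing them onto one core avoids communication and gives the shortest end-to-end latency. A tool that ranks mappings under both objectives makes that kind of trade-off visible.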

Place, publisher, year, edition, pages
Piscataway, N.J.: IEEE Press, 2009. pp. 108-117
Keywords [en]
graphs, microcomputers, parallel architectures, parallel programming, program compilers, software performance evaluation, task analysis
Identifiers
URN: urn:nbn:se:hh:diva-5987
DOI: 10.1109/ICSAMOS.2009.5289221
ISI: 000276377000014
Scopus ID: 2-s2.0-71949094275
ISBN: 978-1-4244-4502-8
OAI: oai:DiVA.org:hh-5987
DiVA, id: diva2:353074
Conference
2009 International Conference on Embedded Computer Systems: Architectures, Modeling and Simulation, IC-SAMOS 2009, Samos, 20 - 23 July, 2009
Note

©2009 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.

Available from: 2010-09-23 Created: 2010-09-23 Last updated: 2018-03-23. Bibliographically approved
Part of thesis
1. Models and Methods for Development of DSP Applications on Manycore Processors
2009 (English). Doctoral thesis, with articles (Other academic)
Abstract [en]

Advanced digital signal processing systems require specialized high-performance embedded computer architectures. The term high-performance translates to large amounts of data and computations per time unit. The term embedded further implies requirements on physical size and power efficiency. Thus the requirements are of both functional and non-functional nature. This thesis addresses the development of high-performance digital signal processing systems relying on manycore technology. We propose building two-level hierarchical computer architectures for this domain of applications. Further, we outline a tool flow based on methods and analysis techniques for automated, multi-objective mapping of such applications on distributed memory manycore processors. In particular, the focus is put on how to provide a means for tunable strategies for mapping of task graphs on array structured distributed memory manycores, with respect to given application constraints. We argue for code mapping strategies based on predicted execution performance, which can be used in an auto-tuning feedback loop or to guide manual tuning directed by the programmer. Automated parallelization, optimisation and mapping to a manycore processor benefit from the use of a concurrent programming model as the starting point. Such a model allows the programmer to express different types and granularities of parallelism as well as computation characteristics of importance in the addressed class of applications. The programming model should also abstract away machine dependent hardware details. The analytical study of WCDMA baseband processing in radio base stations, presented in this thesis, suggests dataflow models as a good match to the characteristics of the application and as an execution model abstracting computations on a manycore. Construction of portable tools further requires a manycore machine model and an intermediate representation.
The models are needed in order to decouple algorithms, used to transform and map application software, from hardware. We propose a manycore machine model that captures common hardware resources, as well as resource dependent performance metrics for parallel computation and communication. Further, we have developed a multifunctional intermediate representation, which can be used as source for code generation and for dynamic execution analysis. Finally, we demonstrate how we can dynamically analyse execution using abstract interpretation on the intermediate representation. It is shown that the performance predictions can be used to accurately rank different mappings by best throughput or shortest end-to-end computation latency.

Place, publisher, year, edition, pages
Göteborg: Chalmers University of Technology, 2009. 173 p.
Series
Doktorsavhandlingar vid Chalmers tekniska högskola. Ny serie, ISSN 0346-718X ; 2969
Keywords
parallel processing, manycore processors, high-performance digital signal processing, dataflow, concurrent models of computation, parallel code mapping, parallel machine model, dynamic performance analysis
Identifiers
URN: urn:nbn:se:hh:diva-14706
ISBN: 978-91-7385-288-3
Public defence
2009-06-10, Wigforssalen, house Visionen, Halmstad University, Kristian IV:s väg 3, Halmstad, 13:15 (English)
Opponent
Supervisor
Available from: 2011-04-20 Created: 2011-04-04 Last updated: 2018-03-23. Bibliographically approved

Open Access in DiVA

Full text (726 kB), 344 downloads
File information
File: FULLTEXT01.pdf. File size: 726 kB. Checksum (SHA-512):
ab0ffbb60bbcf61529f7ce8c0a0813010de7ee7ad03c04e09d66646d3bcd89e54b7d0ae7aceefdd4b44f15270e048ee294626a014a6cff2da6b8f5f29ac36733
Type: fulltext. Mimetype: application/pdf


Person records

Bengtsson, Jerker; Svensson, Bertil
