Compiling Concurrent Programs for Manycores
Halmstad University, School of Information Technology, Halmstad Embedded and Intelligent Systems Research (EIS), Centre for Research on Embedded Systems (CERES).
2015 (English). Licentiate thesis, comprehensive summary (Other academic)
Abstract [en]

The arrival of manycore systems enforces new approaches for developing applications in order to exploit the available hardware resources. Developing applications for manycores requires programmers to partition the application into subtasks, consider the dependence between the subtasks, understand the underlying hardware and select an appropriate programming model. This is complex, time-consuming and prone to error.

In this thesis, we identify and implement abstraction layers in compilation tools to decrease the burden of the programmer, increase programming productivity and program portability for manycores and to analyze their impact on performance and efficiency. We present compilation frameworks for two concurrent programming languages, occam-pi and CAL Actor Language, and demonstrate the applicability of the approach with application case-studies targeting these different manycore architectures: STHorm, Epiphany and Ambric.

For occam-pi, we have extended the Tock compiler and added a backend for STHorm. We evaluate the approach using a fault tolerance model for a four stage 1D-DCT algorithm implemented by using occam-pi’s constructs for dynamic reconfiguration, and the FAST corner detection algorithm which demonstrates the suitability of occam-pi and the compilation framework for data-intensive applications. We also present a new CAL compilation framework which has a front end, two intermediate representations and three backends: for a uniprocessor, Epiphany, and Ambric. We show the feasibility of our approach by compiling a CAL implementation of the 2D-IDCT for the three backends. We also present an evaluation and optimization of code generation for Epiphany by comparing the code generated from CAL with a hand-written C code implementation of 2D-IDCT.

Place, publisher, year, edition, pages
Halmstad: Halmstad University Press, 2015, p. 35
Series
Halmstad University Dissertations ; 11
National subject category
Embedded Systems
Identifiers
URN: urn:nbn:se:hh:diva-27789; ISBN: 978-91-87045-25-7; ISBN: 978-91-87045-24-0; OAI: oai:DiVA.org:hh-27789; DiVA id: diva2:788338
Presentation
2015-03-20, Haldasalen, House Visionen, Halmstad University, Halmstad, 10:15 (English)
Available from: 2015-02-16. Created: 2015-02-13. Last updated: 2018-03-22. Bibliographically approved.
List of papers
1. Managing Dynamic Reconfiguration for Fault-tolerance on a Manycore Architecture
2012 (English). In: Proceedings of the 2012 IEEE 26th International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2012, New York, USA: IEEE Computer Society, 2012, p. 312-319, article id 6270657. Conference paper, Published paper (Refereed)
Abstract [en]

With the advent of manycore architectures comprising hundreds of processing elements, fault management has become a major challenge. We present an approach that uses the occam-pi language to manage the fault recovery mechanism on a new manycore architecture, the Platform 2012 (P2012). The approach is made possible by extending our previously developed compiler framework to compile occam-pi implementations to the P2012 architecture. We describe the techniques used to translate the salient features of the occam-pi language to the native programming model of the P2012 architecture. We demonstrate the applicability of the approach by an experimental case study, in which the DCT algorithm is implemented on a set of four processing elements. During run-time, some of the tasks are then relocated from assumed faulty processing elements to the faultless ones by means of dynamic reconfiguration of the hardware. The working of the demonstrator and the simulation results illustrate not only the feasibility of the approach but also how the use of higher-level abstractions simplifies the fault handling. © 2012 IEEE.
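
The relocation idea in this abstract can be illustrated with a minimal Python sketch. It is not the paper's occam-pi code and does not use the P2012 programming model: the `Pipeline` class, the `mark_faulty` method, the toy stage functions and the least-loaded placement policy are all hypothetical, chosen only to show tasks moving from an assumed-faulty processing element (PE) to the remaining healthy ones without changing the computed result.

```python
from collections import Counter

class Pipeline:
    """Toy model: pipeline stages placed on processing elements (PEs)."""

    def __init__(self, stages, num_pes):
        self.stages = list(stages)
        self.healthy = set(range(num_pes))
        # Assumed initial mapping: stage i runs on PE i (modulo the PE count).
        self.placement = {i: i % num_pes for i in range(len(self.stages))}

    def mark_faulty(self, pe):
        """Treat `pe` as faulty and relocate its stages to healthy PEs."""
        self.healthy.discard(pe)
        if not self.healthy:
            raise RuntimeError("no healthy processing element left")
        # Count how many stages each healthy PE currently hosts.
        load = Counter(p for p in self.placement.values() if p in self.healthy)
        for stage_id in sorted(self.placement):
            if self.placement[stage_id] == pe:
                # Hypothetical policy: pick the least-loaded healthy PE.
                target = min(sorted(self.healthy), key=lambda p: load[p])
                self.placement[stage_id] = target
                load[target] += 1

    def run(self, value):
        """Push one value through all stages in order."""
        for stage_id, stage in enumerate(self.stages):
            if self.placement[stage_id] not in self.healthy:
                raise RuntimeError("stage mapped to a faulty PE")
            value = stage(value)
        return value

# Four toy stages standing in for the four-stage DCT pipeline.
stages = [lambda x: x + 1, lambda x: x * 2, lambda x: x - 3, lambda x: x * x]
pipe = Pipeline(stages, num_pes=4)
before = pipe.run(5)
pipe.mark_faulty(2)      # assume PE 2 fails at run time
after = pipe.run(5)      # same result, stage 2 now hosted elsewhere
```

In this sketch the relocation changes only the placement table, which is the point of keeping the fault handling at a higher abstraction level than the stage code itself.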

Place, publisher, year, edition, pages
New York, USA: IEEE Computer Society, 2012
National subject category
Embedded Systems
Identifiers
urn:nbn:se:hh:diva-17336 (URN); 10.1109/IPDPSW.2012.38 (DOI); 000309409400035 (); 2-s2.0-84867429212 (Scopus ID)
Conference
26th IEEE International Parallel & Distributed Processing Symposium, May 21-25, Regal Shanghai East Asia Hotel Shanghai, China, 2012
Project
SMECY
Note

The research leading to these results has received funding from the ARTEMIS Joint Undertaking under grant agreement number 100230 and from the national programmes / funding authorities.

Available from: 2012-04-12. Created: 2012-03-01. Last updated: 2018-03-22. Bibliographically approved.
2. Programming Real-time Image Processing for Manycores in a High-level Language
2013 (English). In: Advanced Parallel Processing Technology / [ed] Wu, Chenggang and Cohen, Albert, Berlin Heidelberg: Springer Berlin/Heidelberg, 2013, p. 381-395. Conference paper, Published paper (Refereed)
Abstract [en]

Manycore architectures are gaining attention as a means to meet the performance and power demands of high-performance embedded systems. However, their widespread adoption is sometimes constrained by the need for mastering proprietary programming languages that are low-level and hinder portability. We propose the use of the concurrent programming language occam-pi as a high-level language for programming an emerging class of manycore architectures. We show how to map occam-pi programs to the manycore architecture Platform 2012 (P2012). We describe the techniques used to translate the salient features of the language to the native programming model of the P2012. We present the results from a case study on a representative algorithm in the domain of real-time image processing: a complex algorithm for corner detection called Features from Accelerated Segment Test (FAST). Our results show that the occam-pi program is much shorter, is easier to adapt and has a competitive performance when compared to versions programmed in the native programming model of P2012 and in OpenCL.
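
For readers unfamiliar with FAST, the segment test at its core can be sketched in a few lines of Python. This is only an illustration of the general technique, not the occam-pi program evaluated in the paper. The function name `is_fast_corner`, the threshold of 20 and the arc length of 9 (the common FAST-9 variant) are assumptions, since the abstract does not state which parameters were used.

```python
# Offsets (dx, dy) of the 16 pixels on a Bresenham circle of radius 3,
# listed in order around the circle.
CIRCLE = [(0, -3), (1, -3), (2, -2), (3, -1), (3, 0), (3, 1), (2, 2), (1, 3),
          (0, 3), (-1, 3), (-2, 2), (-3, 1), (-3, 0), (-3, -1), (-2, -2), (-1, -3)]

def is_fast_corner(img, x, y, threshold=20, arc=9):
    """Segment test: (x, y) is a corner if `arc` contiguous circle pixels are
    all brighter than centre + threshold or all darker than centre - threshold.
    The caller must keep (x, y) at least 3 pixels away from the image border.
    """
    centre = img[y][x]
    ring = [img[y + dy][x + dx] for dx, dy in CIRCLE]
    for sign in (1, -1):              # +1: brighter arc, -1: darker arc
        flags = [sign * (v - centre) > threshold for v in ring]
        run = 0
        # Append the first arc-1 flags so that runs wrapping around the
        # end of the circle are also detected.
        for flag in flags + flags[:arc - 1]:
            run = run + 1 if flag else 0
            if run >= arc:
                return True
    return False

# Toy 7x7 images (hypothetical test data).
flat = [[50] * 7 for _ in range(7)]                    # uniform: no corner
spot = [[0] * 7 for _ in range(7)]
spot[3][3] = 100                                       # bright isolated pixel
edge = [[0 if col < 3 else 100 for col in range(7)]    # straight vertical edge
        for _ in range(7)]
```

Each pixel can be tested independently of the others, which is what makes the algorithm a natural fit for data-parallel mapping onto many processing elements.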

Place, publisher, year, edition, pages
Berlin Heidelberg: Springer Berlin/Heidelberg, 2013
Series
Lecture Notes in Computer Science, ISSN 0302-9743 ; 8299
Keywords
Parallel programming, occam-pi, Manycore architectures, Realtime image processing
National subject category
Embedded Systems
Identifiers
urn:nbn:se:hh:diva-24018 (URN); 10.1007/978-3-642-45293-2_29 (DOI); 2-s2.0-84893040633 (Scopus ID); 978-3-642-45292-5 (ISBN)
Conference
10th International Conference on Advanced Parallel Processing Technology, APPT 2013, Stockholm, August
Available from: 2013-11-27. Created: 2013-11-27. Last updated: 2018-03-22. Bibliographically approved.
3. Realizing Efficient Execution of Dataflow Actors on Manycores
2014 (English). In: Proceedings: 2014 International Conference on Embedded and Ubiquitous Computing: EUC 2014: August 2014, Milano, Italy / [ed] Randall Bilof, Los Alamitos, CA: IEEE Computer Society, 2014, p. 321-328, article id 6962305. Conference paper, Published paper (Refereed)
Abstract [en]

Embedded DSP computing is currently shifting towards manycore architectures in order to cope with the ever growing computational demands. Actor based dataflow languages are being considered as a programming model. In this paper we present a code generator for CAL, one such dataflow language. We propose to use a compilation tool with two intermediate representations. We start from a machine model of the actors that provides an ordering for testing of conditions and firing of actions. We then generate an Action Execution Intermediate Representation that is closer to a sequential imperative language like C and Java. We describe our two intermediate representations and show the feasibility and portability of our approach by compiling a CAL implementation of the Two-Dimensional Inverse Discrete Cosine Transform on a general purpose processor, on the Epiphany manycore architecture and on the Ambric massively parallel processor array. © 2014 IEEE.
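
The notion of a machine model that orders the testing of conditions and the firing of actions can be illustrated with a toy Python actor. This is a hand-written sketch, not output of the code generator described in the paper and not either of its intermediate representations. The `Actor` class, its condition and action names, and the even/odd splitting behaviour are hypothetical.

```python
from collections import deque

class Actor:
    """Toy dataflow actor: routes input tokens to an even or an odd output."""

    def __init__(self):
        self.inp = deque()      # input port (FIFO of tokens)
        self.even = []          # output port for even tokens
        self.odd = []           # output port for odd tokens

    # Conditions that the controller may test.
    def has_token(self):
        return len(self.inp) > 0

    def head_is_even(self):
        return self.inp[0] % 2 == 0

    # Actions that the controller may fire.
    def emit_even(self):
        self.even.append(self.inp.popleft())

    def emit_odd(self):
        self.odd.append(self.inp.popleft())

    def step(self):
        """One controller step: test conditions in a fixed order and fire
        at most one action. Returns False when no action can fire."""
        if not self.has_token():        # test 1: input availability
            return False
        if self.head_is_even():         # test 2: guard on the head token
            self.emit_even()
        else:
            self.emit_odd()
        return True

    def run(self):
        """Fire actions until the actor has to wait for more input."""
        while self.step():
            pass

actor = Actor()
actor.inp.extend([1, 2, 3, 4])
actor.run()
```

The guard is only evaluated after token availability has been established, which is the kind of ordering such a controller makes explicit before sequential code is generated.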

Place, publisher, year, edition, pages
Los Alamitos, CA: IEEE Computer Society, 2014
Keywords
dataflow languages, compilation framework, code generation, manycore, CAL
National subject category
Embedded Systems
Identifiers
urn:nbn:se:hh:diva-26991 (URN); 10.1109/EUC.2014.55 (DOI); 000358149800046 (); 2-s2.0-84908625634 (Scopus ID); 978-0-7695-5249-1 (ISBN); 978-1-4799-7609-6 (ISBN)
Conference
The 12th IEEE International Conference on Embedded and Ubiquitous Computing (EUC 2014), Milan, Italy, Aug. 26-28, 2014
Project
HiPEC
Research funder
Knowledge Foundation; Swedish Foundation for Strategic Research (SSF)
Available from: 2014-11-05. Created: 2014-11-05. Last updated: 2018-03-22. Bibliographically approved.
4. An Evaluation of Code Generation of Dataflow Languages on Manycore Architectures
2014 (English). In: RTCSA 2014: 2014 IEEE 20th International Conference on Embedded and Real-Time Computing Systems and Applications, Piscataway, NJ: IEEE Press, 2014, article id 6910501. Conference paper, Published paper (Refereed)
Abstract [en]

Today computer architectures are shifting from single core to manycores for several reasons, such as performance demands and power and heat limitations. However, shifting to manycores results in additional complexities, especially with regard to efficient development of applications. Hence there is a need to raise the abstraction level of development techniques for the manycores while exposing the inherent parallelism in the applications. One promising class of programming languages is dataflow languages, and in this paper we evaluate and optimize the code generation for one such language, CAL. We have also developed a communication library to support the inter-core communication. The code generation can target multiple architectures, but the results presented in this paper are focused on Adapteva's manycore architecture Epiphany. We use the two-dimensional inverse discrete cosine transform (2D-IDCT) as our benchmark and compare our code generation from CAL with a hand-written implementation developed in C. Several optimizations in the code generation as well as in the communication library are described, and we have observed that the most critical optimization is reducing the number of external memory accesses. Combining all optimizations we have been able to reduce the difference in execution time between auto-generated and hand-written implementations from a factor of 4.3x down to a factor of only 1.3x. ©2014 IEEE.
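
As background on the benchmark, a 2D-IDCT is commonly computed by applying a one-dimensional IDCT to every row and then to every column. The Python sketch below shows that row-column decomposition with the orthonormal scaling. It is a plain reference formulation for illustration only. It is neither the CAL actor network nor the hand-written C implementation compared in the paper, and the function names are hypothetical.

```python
import math

def idct_1d(coeffs):
    """Orthonormal 1D inverse DCT (inverse of the DCT-II) of a list of floats."""
    n_points = len(coeffs)
    out = []
    for n in range(n_points):
        total = 0.0
        for k in range(n_points):
            # Scaling factor: sqrt(1/N) for the DC term, sqrt(2/N) otherwise.
            scale = math.sqrt(1.0 / n_points) if k == 0 else math.sqrt(2.0 / n_points)
            angle = (2 * n + 1) * k * math.pi / (2 * n_points)
            total += scale * coeffs[k] * math.cos(angle)
        out.append(total)
    return out

def transpose(matrix):
    """Swap rows and columns of a list-of-lists matrix."""
    return [list(row) for row in zip(*matrix)]

def idct_2d(block):
    """2D inverse DCT via row-column decomposition."""
    rows_done = [idct_1d(row) for row in block]              # pass 1: rows
    cols_done = [idct_1d(col) for col in transpose(rows_done)]  # pass 2: columns
    return transpose(cols_done)

# An 8x8 block holding only a DC coefficient of 8 decodes to a flat block.
dc_only = [[0.0] * 8 for _ in range(8)]
dc_only[0][0] = 8.0
pixels = idct_2d(dc_only)
```

The two passes are independent per row and per column, which is one reason the transform is often used as a benchmark for mapping work onto several cores.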

Place, publisher, year, edition, pages
Piscataway, NJ: IEEE Press, 2014
Keywords
Manycore, Dataflow Languages, code generation, Actor Machine, 2D-IDCT, Epiphany, evaluation
National subject category
Embedded Systems
Identifiers
urn:nbn:se:hh:diva-25649 (URN); 10.1109/RTCSA.2014.6910501 (DOI); 000352610400005 (); 2-s2.0-84908637354 (Scopus ID)
Conference
RTCSA 2014, 20th IEEE International Conference on Embedded and Real-Time Computing Systems and Applications, Chongqing, China, August 20-22, 2014
Project
HiPEC project
Research funder
Knowledge Foundation; Swedish Foundation for Strategic Research (SSF)
Note

The authors would like to thank Adapteva Inc. for giving access to their software development suite and hardware board. This research is part of the CERES research program funded by the Knowledge Foundation and HiPEC project funded by Swedish Foundation for Strategic Research (SSF).

Available from: 2014-06-16. Created: 2014-06-16. Last updated: 2019-05-07. Bibliographically approved.

Open Access in DiVA

LicEssa (1207 kB), 716 downloads
File information
File name: FULLTEXT01.pdf. File size: 1207 kB. Checksum: SHA-512
b1ab03e5b49222e4d0d1292e480044c7c7900cb009969a431759a3f52a75c3954e12e89cb2e11d1007bce0ec34ef401283045a321946e3c84aebf73922221c05
Type: fulltext. Mimetype: application/pdf

Author

Gebrewahid, Essayas