Streaming Tiles: Flexible Implementation of Convolution Neural Networks Inference on Manycore Architectures
Högskolan i Halmstad, Akademin för informationsteknologi, Halmstad Embedded and Intelligent Systems Research (EIS), Centrum för forskning om inbyggda system (CERES).
Amrita University, Bengaluru, India.
Högskolan i Halmstad, Akademin för informationsteknologi, Halmstad Embedded and Intelligent Systems Research (EIS), Centrum för forskning om inbyggda system (CERES). ORCID iD: 0000-0002-4932-4036
2018 (English). In: 2018 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), Los Alamitos: IEEE Computer Society, 2018, pp. 867-876. Conference paper, Published paper (Refereed)
Abstract [en]

Convolution neural networks (CNN) are extensively used for deep learning applications such as image recognition and computer vision. The convolution module of these networks is highly compute-intensive. Having an efficient implementation of the convolution module enables realizing the inference part of the neural network on embedded platforms. Low precision parameters require less memory, less computation time, and less power consumption while achieving high classification accuracy. Furthermore, streaming the data over parallelized processing units saves a considerable amount of memory, which is a key concern in memory-constrained embedded platforms. In this paper, we explore the design space for streamed CNN on the Epiphany manycore architecture using varying precisions for weights (ranging from binary to 32-bit). Both AlexNet and GoogleNet are explored for two different memory sizes of Epiphany cores. We are able to achieve competitive performance for both AlexNet and GoogleNet with respect to emerging manycores. Furthermore, the effects of different design choices in terms of precision, memory size, and the number of cores are evaluated by applying the proposed method.

Place, publisher, year, edition, pages
Los Alamitos: IEEE Computer Society, 2018. pp. 867-876
Keywords [en]
manycores, CNN, stream processing, embedded systems
Identifiers
URN: urn:nbn:se:hh:diva-36887
DOI: 10.1109/IPDPSW.2018.00138
ISBN: 978-1-5386-5555-9 (digital)
ISBN: 978-1-5386-5556-6 (print)
OAI: oai:DiVA.org:hh-36887
DiVA, id: diva2:1212121
Conference
The 7th International Workshop on Parallel and Distributed Computing for Large Scale Machine Learning and Big Data Analytics, Vancouver, British Columbia, Canada, May 21, 2018
Projects
NGES (Towards Next Generation Embedded Systems: Utilizing Parallelism and Reconfigurability)
Funding agency
VINNOVA
Note

Funding: VINNOVA Strategic Innovation grant and the Department of Science and Technology, Government of India. ©2018 IEEE

Available from: 2018-06-01. Created: 2018-06-01. Last updated: 2018-08-20. Bibliographically approved.

Open Access in DiVA

Full text not available in DiVA

Authority records

Rezk, Nesma; Ul-Abdin, Zain
