hh.sePublikationer
Ändra sökning
RefereraExporteraLänk till posten
Permanent länk

Direktlänk
Referera
Referensformat
  • apa
  • harvard1
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • Annat format
Fler format
Språk
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Annat språk
Fler språk
Utmatningsformat
  • html
  • text
  • asciidoc
  • rtf
Algorithms for implementing roots, inverse and inverse roots in hardware
Högskolan i Halmstad, Akademin för informationsteknologi, Halmstad Embedded and Intelligent Systems Research (EIS), Centrum för forskning om inbyggda system (CERES).ORCID-id: 0000-0003-4828-7488
Lund University, Lund, Sweden.
Lund University, Lund, Sweden.
Högskolan i Halmstad, Akademin för informationsteknologi, Halmstad Embedded and Intelligent Systems Research (EIS), Centrum för forskning om inbyggda system (CERES).ORCID-id: 0000-0001-6625-6533
Visa övriga samt affilieringar
2016 (Engelska)Manuskript (preprint) (Övrigt vetenskapligt)
Abstract [en]

In applications as in future MIMO communication systems a massive computation of complex matrix operations, such as QR decomposition, is performed. In these matrix operations, the functions roots, inverse and inverse roots are computed in large quantities. Therefore, to obtain high enough performance in such applications, efficient algorithms are highly important. Since these algorithms need to be realized in hardware it must also be ensured that they meet high requirements in terms of small chip area, low computation time and low power consumption. Power consumption is particularly important since many applications are battery powered.For most unary functions, directly applying an approximation methodology in a straightforward way will not lead to an efficient implementation. Instead, a dedicated algorithm often has to be developed. The functions roots, inverse and inverse roots are in this category. The developed approaches are founded on working in a floating-point format. For the roots functions also a change of number base is used. These procedures not only enable simpler solutions but also increased accuracy, since the approximation algorithm is performed on a mantissa of limited range.As a summarizing example the inverse square root is chosen. For comparison, the inverse square root is implemented using two methodologies: Harmonized Parabolic Synthesis and Newton-Raphson method. The novel methodology, Harmonized Parabolic Synthesis (HPS), is chosen since it has been demonstrated to provide very efficient approximations. The Newton-Raphson (NR) method is chosen since it is known for providing a very efficient implementation of the inverse square root. It is also commonly used in signal processing applications for computing approximations on fixed-point numbers of a limited range. Four implementations are made; HPS with 32 and 512 interpolation intervals and NR with 1 and 2 iterations. Summarizing the comparisons of the hardware performance, the implementations HPS 32, HPS 512 and NR 1 are comparable when it comes to hardware performance, while NR 2 is much worse. However, HPS 32 stands out in terms of better performance when it comes to the distribution of the error.

Ort, förlag, år, upplaga, sidor
2016. , s. 26
Nyckelord [en]
Approximation, unary functions, elementary functions, arithmetic computation, root, inverse, inverse roots, harmonized parabolic synthesis, Newton-Raphson method
Nationell ämneskategori
Inbäddad systemteknik
Identifikatorer
URN: urn:nbn:se:hh:diva-30860OAI: oai:DiVA.org:hh-30860DiVA, id: diva2:927014
Anmärkning

Som manuskript i avhandling. As manuscript in dissertation.

Tillgänglig från: 2016-05-10 Skapad: 2016-05-10 Senast uppdaterad: 2018-03-22Bibliografiskt granskad
Ingår i avhandling
1. Methodologies for Approximation of Unary Functions and Their Implementation in Hardware
Öppna denna publikation i ny flik eller fönster >>Methodologies for Approximation of Unary Functions and Their Implementation in Hardware
2016 (Engelska)Doktorsavhandling, sammanläggning (Övrigt vetenskapligt)
Abstract [en]

Applications in computer graphics, digital signal processing, communication systems, robotics, astrophysics, fluid physics and many other areas have evolved to become very computation intensive. Algorithms are becoming increasingly complex and require higher accuracy in the computations. In addition, software solutions for these applications are in many cases not sufficient in terms of performance. A hardware implementation is therefore needed. A recurring bottleneck in the algorithms is the performance of the approximations of unary functions, such as trigonometric functions, logarithms and the square root, as well as binary functions such as division. The challenge is therefore to develop a methodology for the implementation of approximations of unary functions in hardware that can cope with the growing requirements. The methodology is required to result in fast execution time, low complexity basic operations that are simple to implement in hardware, and – sincemany applications are battery powered – low power consumption. To ensure appropriate performance of the entire computation in which the approximation is a part, the characteristics and distribution of the approximation error are also things that must be possible to manage. The new approximation methodologies presented in this thesis are of the type that aims to reduce the sizes of the look-up tables by the use of auxiliary functions. They are founded on a synthesis of parabolic functions by multiplication – instead of addition, which is the most common. Three approximation methodologies have been developed; the two last being further developments of the first. For some functions, such as roots, inverse and inverse roots, a straightforward solution with an approximation is not manageable. Since these functions are frequent in many computation intensive algorithms, it is necessary to find very efficient implementations of these functions. New methods for this are also presented in this thesis. They are all founded on working in a floating-point format, and, for the roots functions, a change of number base is also used. The transformations not only enable simpler solutions but also increased accuracy, since the approximation algorithm is performed on a mantissa of limited range. Tools for error analysis have been developed as well. The characteristics and distribution of the approximation error in the new methodologies are presented and compared with existing state-of-the-art methods such as CORDIC. The verification and evaluation of the solutions have to a large extent been made as comparative ASIC implementations with other approximation methods, separately or embedded in algorithms. As an example, an implementation of the logarithm made using the third methodology developed, Harmonized Parabolic Synthesis (HPS), is compared with an implementation using the CORDIC algorithm. Both implementations are designed to provide 15-bit resolution. The design implemented using HPS performs 12 times better than the CORDIC implementation in terms of throughput. In terms of energy consumption, the new methodology consumes 96% less. The chip area is 60% smaller than for the CORDIC algorithm. In summary, the new approximation methodologies presented are found to well meet the demanding requirements that exist in this area.

Ort, förlag, år, upplaga, sidor
Halmstad: Halmstad University Press, 2016. s. 76
Serie
Halmstad University Dissertations ; 21
Nationell ämneskategori
Inbäddad systemteknik
Identifikatorer
urn:nbn:se:hh:diva-30983 (URN)978-91-87045-45-5 (ISBN)978-91-87045-44-8 (ISBN)
Disputation
2016-09-02, Wigforssalen, Halmstad, 13:00 (Engelska)
Opponent
Handledare
Tillgänglig från: 2016-06-08 Skapad: 2016-05-31 Senast uppdaterad: 2018-03-22Bibliografiskt granskad

Open Access i DiVA

fulltext(1057 kB)764 nedladdningar
Filinformation
Filnamn FULLTEXT02.pdfFilstorlek 1057 kBChecksumma SHA-512
ae08cd19cdcf040c02635ca538611d3d3ec2d3cb4fb8fd7abc44e271cb747d55591c259826bf5e6df4aef90a7a4d6a2f3192b5adb1789b49a5430e71978bf25f
Typ fulltextMimetyp application/pdf

Personposter BETA

Hertz, ErikSvensson, Bertil

Sök vidare i DiVA

Av författaren/redaktören
Hertz, ErikSvensson, Bertil
Av organisationen
Centrum för forskning om inbyggda system (CERES)
Inbäddad systemteknik

Sök vidare utanför DiVA

GoogleGoogle Scholar
Totalt: 764 nedladdningar
Antalet nedladdningar är summan av nedladdningar för alla fulltexter. Det kan inkludera t.ex tidigare versioner som nu inte längre är tillgängliga.

urn-nbn

Altmetricpoäng

urn-nbn
Totalt: 505 träffar
RefereraExporteraLänk till posten
Permanent länk

Direktlänk
Referera
Referensformat
  • apa
  • harvard1
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • Annat format
Fler format
Språk
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Annat språk
Fler språk
Utmatningsformat
  • html
  • text
  • asciidoc
  • rtf