Exploring Architectures for Accelerating Advanced Massive MIMO Algorithms and Applications
Halmstad University, School of Information Technology. ORCID iD: 0009-0007-6933-1608
2025 (English). Licentiate thesis, comprehensive summary (Other academic)
Abstract [en]

The increasing demand for high-speed, reliable wireless communication has driven the adoption of massive multiple-input multiple-output (MIMO) systems, which leverage a large number of antennas to enhance performance. Despite their potential, massive MIMO systems introduce substantial computational challenges, for instance in uplink detection. This thesis addresses these challenges by exploring software acceleration techniques to optimize the performance of massive MIMO systems.

In the context of uplink detection, the study focuses on linear detection algorithms, such as zero-forcing (ZF) and minimum mean square error (MMSE), and employs graphics processing units (GPUs) to accelerate matrix operations. Techniques such as block Cholesky and QR decompositions were implemented to reduce computational overhead. The results demonstrate significant reductions in execution time and improvements in scalability, achieving notable speedups while balancing precision and performance trade-offs.
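As a rough illustration of the two linear detectors named above, the following NumPy sketch solves the ZF and MMSE equalization problems with QR and Cholesky factorizations. The function names, the 128 × 8 example size, and the use of NumPy on a CPU instead of a GPU library are assumptions made for illustration; this is not the thesis implementation.

```python
import numpy as np

def mmse_detect_cholesky(H, y, noise_var):
    """Estimate x from y = H x + n via (H^H H + noise_var I) x = H^H y.

    Setting noise_var to 0 gives the zero-forcing (ZF) solution.
    """
    n_users = H.shape[1]
    A = H.conj().T @ H + noise_var * np.eye(n_users)  # Hermitian positive definite
    b = H.conj().T @ y                                # matched-filter output
    L = np.linalg.cholesky(A)                         # A = L L^H, L lower triangular
    # Generic solves are used for brevity; a dedicated triangular
    # solver would normally exploit the structure of L.
    z = np.linalg.solve(L, b)
    return np.linalg.solve(L.conj().T, z)

def zf_detect_qr(H, y):
    """ZF estimate via the reduced QR decomposition H = Q R."""
    Q, R = np.linalg.qr(H)                            # Q: M x K, R: K x K upper triangular
    return np.linalg.solve(R, Q.conj().T @ y)

# Illustrative use: 128 base-station antennas, 8 single-antenna users.
rng = np.random.default_rng(0)
H = (rng.standard_normal((128, 8)) + 1j * rng.standard_normal((128, 8))) / np.sqrt(2)
x = (rng.choice([-1.0, 1.0], 8) + 1j * rng.choice([-1.0, 1.0], 8)) / np.sqrt(2)
y = H @ x                                             # noiseless received vector
x_zf = zf_detect_qr(H, y)
x_mmse = mmse_detect_cholesky(H, y, noise_var=0.01)
```

Both routines reduce detection to a factorization followed by triangular solves, which is the part a GPU implementation would parallelize.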

Additionally, the thesis investigates the application of convolutional neural networks (CNNs) for channel state information (CSI)-based positioning. By optimizing CNN architectures and employing pruning techniques, the study enhances localization accuracy while minimizing computational requirements. These advancements enable precise positioning in resource-constrained environments, supporting advanced applications in 5G and beyond.

The thesis also proposes future work directions, emphasizing the potential of hardware-based implementations using the dataflow model of computation within the compute abstraction layer (CAL) framework. By modelling algorithms as actor networks with explicit data dependencies, CAL facilitates efficient and scalable hardware designs, particularly for field-programmable gate arrays (FPGAs). This approach offers a promising pathway to address the increasing computational demands of next-generation massive MIMO systems.
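The idea of an actor network with explicit data dependencies can be sketched in a few lines of plain Python. This is a toy model written for illustration only: the `Actor` class, the firing rule, and the scheduler are hypothetical and do not reproduce CAL syntax or any CAL toolchain.

```python
from collections import deque

class Actor:
    """A node that fires only when every input FIFO holds a token."""

    def __init__(self, name, fn, n_inputs):
        self.name = name
        self.fn = fn
        self.inputs = [deque() for _ in range(n_inputs)]
        self.outputs = []                      # list of (target actor, input port)

    def connect(self, target, port):
        self.outputs.append((target, port))

    def can_fire(self):
        return all(len(q) > 0 for q in self.inputs)

    def fire(self):
        tokens = [q.popleft() for q in self.inputs]
        result = self.fn(*tokens)
        for target, port in self.outputs:
            target.inputs[port].append(result)

def run(actors):
    """Fire any enabled actor until the whole network is idle."""
    fired = True
    while fired:
        fired = False
        for actor in actors:
            if actor.can_fire():
                actor.fire()
                fired = True

def demo():
    collected = []
    mul = Actor("mul", lambda a, b: a * b, n_inputs=2)
    add = Actor("add", lambda p, c: p + c, n_inputs=2)
    sink = Actor("sink", collected.append, n_inputs=1)
    mul.connect(add, 0)                        # mul output feeds add port 0
    add.connect(sink, 0)
    # Tokens are injected from outside the network.
    mul.inputs[0].extend([1, 2, 3])
    mul.inputs[1].extend([4, 5, 6])
    add.inputs[1].extend([10, 20, 30])
    run([mul, add, sink])
    return collected
```

Because each actor communicates only through its FIFOs, the actors could in principle run concurrently, which is the property that makes the model attractive for hardware mapping.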

Place, publisher, year, edition, pages
Halmstad: Halmstad University Press, 2025. p. 24
Series
Halmstad University Dissertations ; 131
Keywords [en]
High-performance computing, parallel computing, massive MIMO, software acceleration, convolutional neural networks
National Category
Embedded Systems; Communication Systems
Identifiers
URN: urn:nbn:se:hh:diva-55976; ISBN: 978-91-89587-79-3 (print); ISBN: 978-91-89587-78-6 (electronic); OAI: oai:DiVA.org:hh-55976; DiVA, id: diva2:1957787
Presentation
2025-06-05, J102 Wigforss, Kristian IV:s väg 3, Halmstad, 13:00 (English)
Funder
ELLIIT - The Linköping‐Lund Initiative on IT and Mobile Communications, B02
Available from: 2025-05-15 Created: 2025-05-12 Last updated: 2025-10-01. Bibliographically approved
List of papers
1. Enhancing the Accuracy of CSI-Based Positioning in Massive MIMO Systems
2023 (English). In: 2023 IEEE International Black Sea Conference on Communications and Networking (BlackSeaCom), IEEE, 2023, p. 90-95. Conference paper, Published paper (Refereed)
Abstract [en]

Massive Multiple-Input Multiple-Output (MIMO) communication systems are being investigated intensively for positioning services. Improving the accuracy with which these services position users is an important goal for related future applications. Convolutional Neural Networks (CNNs) have been proposed to infer the position of a user from the Channel State Information (CSI) of a massive MIMO system. This paper investigates different CNN architectures to enhance the accuracy of a fingerprint-based positioning system. Three new CNNs are proposed in which the Convolutional Layers (CLs) and the Fully Connected (FC) layers are re-dimensioned. A Batch Normalization (BN) layer is introduced into the layer structure of the newly proposed CNNs. The CNNs were trained and their mean error was measured. The first re-constructed CNN, composed of 13 CLs, 7 BNs, and 3 FC layers, achieved the best accuracy of the three models, with a mean error of 10.09 mm, which outperforms a similar work by 82% in terms of positioning accuracy. Pruning was also added to the newly proposed CNNs. It reduced the model size significantly, by approximately 65% compared to a similar model from previous work.
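The abstract does not state which pruning method was used. As one common possibility, the sketch below shows unstructured magnitude pruning of a single weight tensor in NumPy; the function name `magnitude_prune`, the 64 × 64 tensor, and the 65% sparsity target are assumptions chosen for illustration.

```python
import numpy as np

def magnitude_prune(weights, sparsity):
    """Zero out the smallest-magnitude fraction `sparsity` of the weights.

    Returns the pruned weights and the boolean mask of surviving entries.
    """
    flat = np.abs(weights).ravel()
    k = int(round(sparsity * flat.size))       # number of weights to remove
    if k == 0:
        return weights.copy(), np.ones(weights.shape, dtype=bool)
    # k-th smallest magnitude; everything at or below it is pruned.
    threshold = np.partition(flat, k - 1)[k - 1]
    mask = np.abs(weights) > threshold
    return weights * mask, mask

# Illustrative use on a random weight matrix.
rng = np.random.default_rng(1)
w = rng.standard_normal((64, 64))
w_pruned, keep = magnitude_prune(w, sparsity=0.65)
remaining_fraction = keep.mean()               # fraction of weights kept
```

A smaller stored model only follows if the zeroed weights are saved in a sparse or compressed format; the mask alone does not shrink a dense tensor.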

Place, publisher, year, edition, pages
IEEE, 2023
Keywords
Convolutional Neural Networks, Pruning, Batch Normalization, Positioning Accuracy, Model Size
National Category
Communication Systems; Computer Sciences
Identifiers
urn:nbn:se:hh:diva-51986 (URN); 10.1109/BlackSeaCom58138.2023.10299742 (DOI); 979-8-3503-3782-2 (ISBN); 979-8-3503-3783-9 (ISBN)
Conference
2023 IEEE International Black Sea Conference on Communications and Networking (BlackSeaCom), Istanbul, Turkey, July 4-7, 2023
Funder
ELLIIT - The Linköping‐Lund Initiative on IT and Mobile Communications, B02
Available from: 2023-11-13 Created: 2023-11-13 Last updated: 2025-10-01. Bibliographically approved
2. Software Acceleration of Multi-user MIMO Uplink Detection on GPU
2025 (English). In: Parallel Computing, ISSN 0167-8191, E-ISSN 1872-7336, Vol. 125, p. 1-15, article id 103150. Article in journal (Refereed). Published
Abstract [en]

This paper presents the exploration of GPU-accelerated block-wise decompositions for zero-forcing (ZF) based QR and Cholesky methods applied to massive multiple-input multiple-output (MIMO) uplink detection algorithms. Three algorithms are evaluated: ZF with block Cholesky decomposition, ZF with block QR decomposition (QRD), and minimum mean square error (MMSE) with block Cholesky decomposition. The latter was the only one previously explored, but it used standard Cholesky decomposition. Our approach achieves an 11% improvement over the previous GPU-accelerated MMSE study.
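To make the block-wise idea concrete, here is a minimal right-looking blocked Cholesky factorization in NumPy. It is a CPU sketch under stated assumptions (the function name `block_cholesky`, the block size, and the test matrix are illustrative) and is not the GPU kernel evaluated in the paper.

```python
import numpy as np

def block_cholesky(A, block=8):
    """Return lower-triangular L with A = L L^H, processed block by block.

    A must be Hermitian positive definite with a float or complex dtype.
    """
    A = np.array(A)                            # working copy, updated in place
    n = A.shape[0]
    L = np.zeros_like(A)
    for k in range(0, n, block):
        e = min(k + block, n)
        # 1. Factor the diagonal block: A11 = L11 L11^H.
        L11 = np.linalg.cholesky(A[k:e, k:e])
        L[k:e, k:e] = L11
        if e < n:
            # 2. Panel solve: L21 = A21 L11^{-H}, written as a solve with L11.
            L21 = np.linalg.solve(L11, A[e:, k:e].conj().T).conj().T
            L[e:, k:e] = L21
            # 3. Trailing update: A22 <- A22 - L21 L21^H.
            A[e:, e:] -= L21 @ L21.conj().T
    return L

# Illustrative use on a regularized Gram matrix for 32 users.
rng = np.random.default_rng(2)
H = rng.standard_normal((256, 32)) + 1j * rng.standard_normal((256, 32))
G = H.conj().T @ H + np.eye(32)
L = block_cholesky(G, block=8)
```

Steps 2 and 3 are matrix-matrix operations, which is why blocked variants tend to map well onto hardware with high arithmetic throughput.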

Through performance analysis, we observe a trade-off between precision and execution time. Reducing precision from FP64 to FP32 improves execution time but increases bit error rate (BER), with ZF-based QRD reducing execution time from 2.04 μs to 1.24 μs for a 128 × 8 MIMO size. The study also highlights that larger MIMO sizes, particularly 2048 × 32, require GPUs to fully utilize their computational and memory capabilities, especially under FP64 precision. In contrast, smaller matrices are compute-bound.
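The numerical side of the precision trade-off can be sketched by running the same ZF solve in double and single precision and comparing the reconstruction error. This measures relative numerical error on a noiseless example, not bit error rate, and the helper name `zf_relative_error` and the 128 × 8 size are assumptions for illustration.

```python
import numpy as np

def zf_relative_error(H, x, dtype):
    """Relative error of a Cholesky-based ZF solve carried out in `dtype`."""
    Hd = H.astype(dtype)
    y = (H @ x).astype(dtype)                  # noiseless received vector
    G = Hd.conj().T @ Hd                       # Gram matrix in the chosen precision
    L = np.linalg.cholesky(G)
    z = np.linalg.solve(L, Hd.conj().T @ y)
    x_hat = np.linalg.solve(L.conj().T, z)
    return float(np.linalg.norm(x_hat - x) / np.linalg.norm(x))

rng = np.random.default_rng(3)
H = (rng.standard_normal((128, 8)) + 1j * rng.standard_normal((128, 8))) / np.sqrt(2)
x = (rng.choice([-1.0, 1.0], 8) + 1j * rng.choice([-1.0, 1.0], 8)) / np.sqrt(2)

err_fp64 = zf_relative_error(H, x, np.complex128)   # complex double precision
err_fp32 = zf_relative_error(H, x, np.complex64)    # complex single precision
```

In such an experiment the single-precision error is larger by several orders of magnitude, and whether that matters for detection depends on the noise level and the conditioning of the channel matrix.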

Our results recommend GPUs for larger MIMO sizes, as they offer the parallelism and memory resources necessary to efficiently handle the computational demands of next-generation networks. This work paves the way for scalable, GPU-based massive MIMO uplink detection systems. © 2025 The Authors. Published by Elsevier B.V.

Place, publisher, year, edition, pages
Amsterdam: Elsevier, 2025
Keywords
High-performance computing, parallel computing, massive MIMO, uplink detection, matrix decomposition
National Category
Computer Systems; Communication Systems
Identifiers
urn:nbn:se:hh:diva-56024 (URN)10.1016/j.parco.2025.103150 (DOI)
Funder
ELLIIT - The Linköping‐Lund Initiative on IT and Mobile Communications, B02
Available from: 2025-05-14 Created: 2025-05-14 Last updated: 2025-10-28. Bibliographically approved

Authority records

Nada, Ali