Depth- and semantics-aware multi-modal domain translation: Generating 3D panoramic color images from LiDAR point clouds
Halmstad University, School of Information Technology. ORCID iD: 0000-0002-8067-9521
Halmstad University, School of Information Technology. ORCID iD: 0000-0002-5712-6777
2024 (English). In: Robotics and Autonomous Systems, ISSN 0921-8890, E-ISSN 1872-793X, Vol. 171, p. 1-9, article id 104583. Article in journal (Refereed). Published.
Abstract [en]

This work presents a new depth- and semantics-aware conditional generative model, named TITAN-Next, for cross-domain image-to-image translation in a multi-modal setup between LiDAR and camera sensors. The proposed model leverages scene semantics as a mid-level representation and is able to translate raw LiDAR point clouds to RGB-D camera images by relying solely on semantic scene segments. We claim that this is the first framework of its kind, and it has practical applications in autonomous vehicles, such as providing a fail-safe mechanism and augmenting available data in the target image domain. The proposed model is evaluated on the large-scale and challenging Semantic-KITTI dataset, and experimental findings show that it considerably outperforms the original TITAN-Net and other strong baselines by a 23.7% margin in terms of IoU. © 2023 The Author(s).
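The abstract does not spell out the model internals, so the following is only a minimal sketch of the pipeline it describes: project the LiDAR scan into a panoramic range image, predict a semantic segment map over it, and condition an image generator on that map to synthesize RGB and depth. All module names (SegmentNet, RGBDGenerator), layer choices, and the spherical projection parameters are hypothetical placeholders, not the TITAN-Next architecture.

```python
# Hypothetical sketch of a semantics-mediated LiDAR -> RGB-D pipeline.
# SegmentNet and RGBDGenerator are placeholders, not the actual TITAN-Next model.
import torch
import torch.nn as nn


def lidar_to_range_image(points: torch.Tensor, h: int = 64, w: int = 2048) -> torch.Tensor:
    """Spherically project an (N, 4) LiDAR cloud (x, y, z, intensity)
    into a (5, H, W) panoramic range image holding range, x, y, z, intensity."""
    x, y, z, intensity = points.unbind(dim=1)
    r = torch.sqrt(x ** 2 + y ** 2 + z ** 2).clamp(min=1e-6)
    yaw = torch.atan2(y, x)                       # azimuth in [-pi, pi]
    pitch = torch.asin(z / r)                     # elevation
    fov_up, fov_down = 0.052, -0.436              # ~ +3 / -25 degrees, assumed values
    u = ((yaw / torch.pi + 1.0) / 2.0 * w).long().clamp(0, w - 1)
    v = ((1.0 - (pitch - fov_down) / (fov_up - fov_down)) * h).long().clamp(0, h - 1)
    img = torch.zeros(5, h, w)
    img[:, v, u] = torch.stack([r, x, y, z, intensity])
    return img


class SegmentNet(nn.Module):
    """Placeholder semantic segmentation backbone over the range image."""
    def __init__(self, n_classes: int = 20):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(5, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, n_classes, 1),
        )

    def forward(self, range_img):                 # (B, 5, H, W) -> (B, C, H, W) logits
        return self.net(range_img)


class RGBDGenerator(nn.Module):
    """Placeholder conditional generator: segment map -> RGB + depth."""
    def __init__(self, n_classes: int = 20):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(n_classes, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 4, 1),                  # 3 RGB channels + 1 depth channel
        )

    def forward(self, seg_logits):
        seg = torch.softmax(seg_logits, dim=1)    # soft segment map as condition
        out = self.net(seg)
        return out[:, :3], out[:, 3:]             # (rgb, depth)


# Usage: one dummy scan -> synthesized panoramic RGB-D output.
points = torch.rand(100_000, 4)
range_img = lidar_to_range_image(points).unsqueeze(0)
rgb, depth = RGBDGenerator()(SegmentNet()(range_img))
```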

Place, publisher, year, edition, pages
Amsterdam: Elsevier, 2024. Vol. 171, p. 1-9, article id 104583
Keywords [en]
Multi-modal domain translation, Semantic perception, LiDAR
National Category
Computer graphics and computer vision; Robotics and automation
Identifiers
URN: urn:nbn:se:hh:diva-52943
DOI: 10.1016/j.robot.2023.104583
ISI: 001125648600001
OAI: oai:DiVA.org:hh-52943
DiVA, id: diva2:1846429
Funder
European Commission, 10106 9576
Available from: 2024-03-22 Created: 2024-03-22 Last updated: 2025-02-05 Bibliographically approved
In thesis
1. Semantics-aware Multi-modal Scene Perception for Autonomous Vehicles
2024 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Autonomous vehicles represent the pinnacle of modern technological innovation, navigating complex and unpredictable environments. To do so effectively, they rely on a sophisticated array of sensors. This thesis explores two of the most crucial sensors: LiDARs, known for their accuracy in generating detailed 3D maps of the environment, and RGB cameras, essential for processing visual cues critical for navigation. Together, these sensors form a comprehensive perception system that enables autonomous vehicles to operate safely and efficiently.

However, the reliability of these vehicles has yet to be tested when key sensors fail. The abrupt failure of a camera, for instance, disrupts the vehicle’s perception system, creating a significant gap in sensory input. This thesis addresses this challenge by introducing a novel multi-modal domain translation framework that integrates LiDAR and RGB camera data while ensuring continuous functionality despite sensor failures. At the core of this framework is an innovative model capable of synthesizing RGB images and their corresponding segment maps from raw LiDAR data by exploiting scene semantics. The proposed framework stands out as the first of its kind, demonstrating that scene semantics can bridge the gap across domains with distinct data structures, such as unorganized sparse 3D LiDAR point clouds and structured 2D camera data. Thus, this thesis represents a significant leap forward in the field, offering a robust solution to the challenge of RGB data recovery without camera sensors.

The practical application of this model is thoroughly explored in the thesis. It involves testing the model’s capability to generate pseudo point clouds from RGB depth estimates, which, when combined with LiDAR data, create an enriched perception dataset. This enriched dataset is pivotal in enhancing object detection capabilities, a fundamental aspect of autonomous vehicle navigation. The quantitative and qualitative evidence reported in this thesis demonstrates that the synthetic generation of data not only compensates for the loss of sensory input but also considerably improves the performance of object detection systems compared to using raw LiDAR data only.
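As a rough illustration of the pseudo-point-cloud step described above, the sketch below back-projects a predicted depth map through a pinhole camera model and fuses the result with a raw LiDAR scan. The intrinsics, array shapes, and the direct concatenation (leaving out the camera-to-LiDAR extrinsic transform) are illustrative assumptions, not details taken from the thesis.

```python
# Hypothetical sketch: back-project a predicted depth map into a pseudo point
# cloud and fuse it with a raw LiDAR scan before object detection.
import numpy as np


def depth_to_pseudo_points(depth: np.ndarray, fx: float, fy: float,
                           cx: float, cy: float) -> np.ndarray:
    """Back-project an (H, W) metric depth map into an (N, 3) point cloud
    in the camera frame using the pinhole model."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))   # pixel grids, shape (H, W)
    z = depth.reshape(-1)
    x = (u.reshape(-1) - cx) * z / fx
    y = (v.reshape(-1) - cy) * z / fy
    pts = np.stack([x, y, z], axis=1)
    return pts[z > 0]                                 # keep only pixels with valid depth


# Usage: enrich a LiDAR scan with pseudo points (intrinsics are assumed values).
depth_pred = np.random.rand(376, 1241).astype(np.float32) * 80.0   # placeholder depth map
pseudo = depth_to_pseudo_points(depth_pred, fx=721.5, fy=721.5, cx=609.6, cy=172.9)
lidar_scan = np.random.rand(120_000, 3).astype(np.float32)         # placeholder LiDAR points
# In practice the pseudo points (camera frame) would first be mapped into the
# LiDAR frame with the camera-to-LiDAR extrinsic transform before fusing.
enriched_cloud = np.concatenate([lidar_scan, pseudo], axis=0)
```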

By addressing the critical issue of sensor failure and presenting viable solutions, this thesis contributes to enhancing the safety, reliability, and efficiency of autonomous vehicles. It paves the way for further research and development, setting a new standard for autonomous vehicle technology in scenarios of sensor malfunctions or adverse environmental conditions.

Place, publisher, year, edition, pages
Halmstad: Halmstad University Press, 2024. p. 40
Series
Halmstad University Dissertations ; 117
National Category
Computer graphics and computer vision
Identifiers
urn:nbn:se:hh:diva-53115 (URN)
978-91-89587-50-2 (ISBN)
978-91-89587-51-9 (ISBN)
Public defence
2024-06-13, Wigforss, hus J, Kristian IV:s väg 3, Halmstad, 09:00 (English)
Available from: 2024-05-07 Created: 2024-04-08 Last updated: 2025-02-07 Bibliographically approved

Open Access in DiVA

No full text in DiVA

Other links

Publisher's full text

Authority records

Cortinhal, Tiago; Aksoy, Eren
