Autonomous vehicles are among the most demanding applications of modern technology, required to navigate complex and unpredictable environments. To do so effectively, they rely on a sophisticated array of sensors. This thesis focuses on two of the most crucial: LiDAR, which produces accurate, detailed 3D maps of the surroundings, and RGB cameras, which supply the visual cues essential for navigation. Together, these sensors form a comprehensive perception system that enables autonomous vehicles to operate safely and efficiently.
However, the reliability of these vehicles when a key sensor fails remains largely untested. The abrupt failure of a camera, for instance, disrupts the vehicle’s perception system and leaves a significant gap in its sensory input. This thesis addresses this challenge by introducing a novel multi-modal domain translation framework that integrates LiDAR and RGB camera data while ensuring continuous functionality despite sensor failures. At the core of this framework is a model capable of synthesizing RGB images and their corresponding segmentation maps from raw LiDAR data by exploiting scene semantics. The proposed framework is the first of its kind, demonstrating that scene semantics can bridge the gap between domains with fundamentally different data structures, such as unorganized, sparse 3D LiDAR point clouds and structured 2D camera images. This thesis thus represents a significant step forward in the field, offering a robust solution to the challenge of recovering RGB data without a functioning camera sensor.
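The full architecture is described in the body of the thesis; purely as an illustration, the sketch below shows one way a semantics-conditioned LiDAR-to-camera translator could be organized in PyTorch. All module names, channel sizes, and the assumed five-channel spherical LiDAR projection are hypothetical choices for this example, not the thesis model.

```python
# Minimal sketch (not the thesis implementation): translate a projected LiDAR
# sweep into a semantic segmentation map and an RGB image, with the predicted
# semantics conditioning the RGB head. Names and shapes are hypothetical.
import torch
import torch.nn as nn

class LidarToRgbTranslator(nn.Module):
    def __init__(self, in_channels: int = 5, num_classes: int = 20):
        super().__init__()
        # Encoder over a 2D LiDAR projection (e.g. range, x, y, z, intensity).
        self.encoder = nn.Sequential(
            nn.Conv2d(in_channels, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
        )
        # Decoder back to full resolution.
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
        )
        # Semantic head: per-pixel class logits (the scene semantics).
        self.semantic_head = nn.Conv2d(32, num_classes, 1)
        # RGB head: conditioned on decoded features and predicted semantics,
        # so the semantic map acts as the bridge between the two domains.
        self.rgb_head = nn.Conv2d(32 + num_classes, 3, 1)

    def forward(self, lidar_projection):
        feats = self.decoder(self.encoder(lidar_projection))
        sem_logits = self.semantic_head(feats)
        rgb = torch.sigmoid(self.rgb_head(torch.cat([feats, sem_logits], dim=1)))
        return rgb, sem_logits

# Usage with a dummy 64x512 spherical projection of a LiDAR sweep:
model = LidarToRgbTranslator()
rgb, sem = model(torch.randn(1, 5, 64, 512))  # rgb: (1, 3, 64, 512), sem: (1, 20, 64, 512)
```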
The practical application of this model is thoroughly explored in the thesis. It involves testing the model’s capability to generate pseudo point clouds from RGB depth estimates, which, when combined with LiDAR data, create an enriched perception dataset. This enriched dataset is pivotal in enhancing object detection, a fundamental aspect of autonomous vehicle navigation. The quantitative and qualitative evidence reported in this thesis demonstrates that synthetically generated data not only compensates for the loss of sensory input but also considerably improves the performance of object detection systems compared to using raw LiDAR data alone.
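As a hedged illustration of the pseudo point cloud step, the snippet below back-projects a dense depth estimate into 3D with a standard pinhole camera model and concatenates the result with a LiDAR sweep. The intrinsics, array shapes, and function name are assumptions for this example and are not taken from the thesis.

```python
# Minimal sketch (hypothetical intrinsics and names): lift a dense depth map
# into a pseudo point cloud and merge it with a LiDAR sweep in the same frame.
import numpy as np

def depth_to_pseudo_points(depth, fx, fy, cx, cy):
    """Back-project an (H, W) metric depth map into (N, 3) camera-frame points."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))  # pixel coordinates
    z = depth.reshape(-1)
    x = (u.reshape(-1) - cx) * z / fx               # pinhole back-projection
    y = (v.reshape(-1) - cy) * z / fy
    points = np.stack([x, y, z], axis=1)
    return points[z > 0]                            # drop invalid (zero-depth) pixels

# Example: build an enriched point cloud from dummy depth and LiDAR data.
depth = np.random.uniform(0.0, 80.0, size=(96, 320)).astype(np.float32)
pseudo = depth_to_pseudo_points(depth, fx=720.0, fy=720.0, cx=160.0, cy=48.0)
lidar = np.random.randn(2048, 3).astype(np.float32)
enriched = np.concatenate([lidar, pseudo], axis=0)  # input to the object detector
```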
By addressing the critical issue of sensor failure and presenting viable solutions, this thesis contributes to enhancing the safety, reliability, and efficiency of autonomous vehicles. It paves the way for further research and development, setting a new standard for autonomous vehicle technology in scenarios involving sensor malfunction or adverse environmental conditions.