| Main Author: | |
|---|---|
| Format: | Thesis |
| Published: | AUC Knowledge Fountain, 2022 |
| Subjects: | |
| Summary: | Semantic segmentation is an essential technique for scene understanding across many domains and applications, and it is of crucial importance in autonomous driving. Autonomous vehicles usually rely on cameras and light detection and ranging (LiDAR) sensors to gain contextual information from the environment. Semantic segmentation has been employed to process the images and point clouds captured by cameras and LiDAR sensors, respectively. One important research direction is investigating the impact of temporal information on semantic segmentation. Many contributions exist on utilizing temporal information for semantic segmentation of 2D images. However, few studies have tackled the use of temporal information for semantic segmentation of 3D point clouds. Recent studies experimented with scan clustering and Bayes filters; however, none were conducted using recurrent networks. We explore various techniques for semantic segmentation of 3D point clouds and select SqueezeSeg V2 as the best fit to serve as a baseline. In this work, we introduce a Convolutional-LSTM layer into the model and adjust the “skip” connections in the architecture, resulting in a mean Intersection over Union (mIoU) of 36%, which improves on the baseline by almost 3%. We then repeated the same experiment on SqueezeSeg V3, a recently published network, and achieved an mIoU of 45.3%, improving on its baseline by 2.13%. These results were obtained using sequences 00 to 10 of the SemanticKITTI dataset. |
|---|
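
The summary reports its results as mean Intersection over Union (mIoU). As a reading aid, the following is a minimal sketch of how that metric is commonly computed from per-point labels. It is not the thesis's evaluation code: the function name `mean_iou`, the toy labels, and the choice to skip classes absent from both prediction and ground truth are assumptions made for illustration.

```python
from collections import Counter


def mean_iou(pred, target, num_classes):
    """Mean Intersection over Union across classes (illustrative sketch).

    pred, target: equal-length sequences of integer class labels, one per point.
    Classes that appear in neither sequence are skipped (an assumption here;
    benchmark tools may treat them differently).
    """
    tp, fp, fn = Counter(), Counter(), Counter()
    for p, t in zip(pred, target):
        if p == t:
            tp[t] += 1   # point correctly assigned to class t
        else:
            fp[p] += 1   # class p was predicted but is wrong
            fn[t] += 1   # class t was missed

    ious = []
    for c in range(num_classes):
        union = tp[c] + fp[c] + fn[c]
        if union == 0:
            continue     # class c absent from both prediction and ground truth
        ious.append(tp[c] / union)
    return sum(ious) / len(ious) if ious else 0.0


# Hypothetical toy labels for six points and three classes.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
score = mean_iou(pred, target, num_classes=3)
print(f"mIoU = {score:.3f}")  # per-class IoUs are 1/3, 2/3 and 1/2, so mIoU = 0.500
```

In this sketch, per-class IoU is TP / (TP + FP + FN), and the mean is taken over classes; an evaluation over a full dataset would typically accumulate these counts across all scans before dividing.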