Semantic Segmentation with Street View Images for Patrol Robots
We introduce the street-view segmentation task as follows:
- Requirements: Given an RGB street image as input, the model should output masks of vehicles and pedestrians, which provide obstacle information for the patrol robot.
- Deployment: We trained a state-of-the-art model, BiSeNet, on a mixture of the Cityscapes dataset and a street-view dataset we collected ourselves. The refined model was then integrated into ROS on an NVIDIA Jetson TX2 (an ARM-based computing device).
- Performance: The refined model achieves 99.07% pixel accuracy and 94.54% mIoU on the validation set. Inference runs on the TX2 at an average of 12.53 fps, with 69.1% utilization of a single CPU core and 1.968 GB of memory usage, which meets the business requirements.
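As a sketch of how the model's output can be turned into the obstacle masks the robot consumes, the snippet below converts per-pixel class logits into boolean vehicle and pedestrian masks. The class IDs follow the Cityscapes trainId convention (11 person, 12 rider, 13 car, 14 truck, 15 bus); whether the refined model keeps exactly these labels is an assumption, and `masks_from_logits` is an illustrative helper, not part of the deployed code.

```python
import numpy as np

# Assumption: the model uses Cityscapes trainIds for its output classes.
VEHICLE_IDS = {13, 14, 15}     # car, truck, bus
PEDESTRIAN_IDS = {11, 12}      # person, rider


def masks_from_logits(logits: np.ndarray):
    """Turn per-pixel class logits of shape (C, H, W) into boolean masks.

    Returns (vehicle_mask, pedestrian_mask), each of shape (H, W).
    """
    pred = logits.argmax(axis=0)                      # (H, W) class-ID map
    vehicle_mask = np.isin(pred, list(VEHICLE_IDS))
    pedestrian_mask = np.isin(pred, list(PEDESTRIAN_IDS))
    return vehicle_mask, pedestrian_mask


if __name__ == "__main__":
    # Tiny synthetic example: 19 Cityscapes classes, a 2x2 image.
    logits = np.zeros((19, 2, 2), dtype=np.float32)
    logits[13, 0, 0] = 5.0    # top-left pixel -> car
    logits[11, 0, 1] = 5.0    # top-right pixel -> person
    veh, ped = masks_from_logits(logits)
    print(veh[0, 0], ped[0, 1])   # both True
```

In the actual ROS node, such masks would be published per frame so the navigation stack can treat the labeled pixels as obstacles.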
The following video shows segmentation results on street views in day and night, indoor and outdoor scenes.