Deep Learning Model

For object detection, we used the standard YOLOv5 architecture, implemented in PyTorch. Our model was trained on the 15,065 frames from our augmented dataset together with the UAVVaste aerial dataset (https://github.com/UAVVaste), which contributed a further 772 annotated images. Qualitative results can be seen in one of our demonstration videos below.
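As a rough illustration of how two image sources can be combined for YOLOv5 training, the sketch below builds a YOLOv5-style dataset configuration that lists both training folders. The folder names, the single `litter` class, and the helper `build_dataset_yaml` are hypothetical placeholders for illustration and are not taken from our actual setup.

```python
def build_dataset_yaml(root, train_dirs, val_dir, class_names):
    """Return YOLOv5-style dataset YAML text listing several training folders."""
    lines = [f"path: {root}", "train:"]
    # YOLOv5 accepts a list of image directories under the `train` key.
    lines += [f"  - {d}" for d in train_dirs]
    lines.append(f"val: {val_dir}")
    lines.append(f"nc: {len(class_names)}")
    lines.append("names:")
    lines += [f"  {i}: {name}" for i, name in enumerate(class_names)]
    return "\n".join(lines) + "\n"


# Hypothetical layout: one folder per data source (assumed, not from the project).
config_text = build_dataset_yaml(
    root="datasets/waste",
    train_dirs=["augmented/images/train", "uavvaste/images/train"],
    val_dir="augmented/images/val",
    class_names=["litter"],  # assumed class name
)
print(config_text)
```

Saved to a file such as `waste.yaml`, a configuration like this could be passed to the YOLOv5 training script, for example `python train.py --data waste.yaml --weights yolov5s.pt --img 640`, where the choice of the `yolov5s` checkpoint and the 640-pixel image size are likewise assumptions.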

Demonstration Video