AerialMPT: A Dataset for Pedestrian Tracking in Aerial Imagery

AerialMPT is a dataset for pedestrian tracking in aerial image sequences. It presents real-world challenges for multi-object tracking (MOT) algorithms, such as low frame rates, small moving objects, and complex backgrounds.

AerialMPT consists of 14 sequences totaling 307 frames, with an average frame size of 425 × 358 pixels. The images were acquired by DLR's 4K camera system from altitudes ranging from 600 m to 1400 m, resulting in spatial resolutions (ground sampling distances, GSDs) ranging from 8 cm/pixel to 13 cm/pixel. In a post-processing step, the images were co-registered, geo-referenced, and cropped to each region of interest, yielding sequences with a frame rate of 2 fps. The images were acquired during different flight campaigns between 2016 and 2017, over different scenes containing pedestrians, with varying crowd densities and movement complexities.
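The figures above imply a few derived quantities that help when sizing a tracking experiment. The following is a minimal sketch that uses only the numbers quoted in this description. The function and variable names are illustrative and not part of any official AerialMPT tooling, and the ground footprint is a rough estimate that assumes the GSD is uniform across the image.

```python
# Figures quoted in the dataset description.
NUM_SEQUENCES = 14
NUM_FRAMES = 307
AVG_WIDTH_PX, AVG_HEIGHT_PX = 425, 358
GSD_RANGE_M = (0.08, 0.13)  # 8 cm/pixel to 13 cm/pixel, in metres per pixel
FRAME_RATE_FPS = 2.0


def ground_footprint_m(width_px, height_px, gsd_m):
    """Approximate ground extent (width, height) in metres of one image,
    assuming a uniform ground sampling distance."""
    return width_px * gsd_m, height_px * gsd_m


# Average number of frames per sequence.
avg_frames_per_sequence = NUM_FRAMES / NUM_SEQUENCES

# Time elapsed between the first and last frame of an average sequence.
avg_span_s = (avg_frames_per_sequence - 1) / FRAME_RATE_FPS

print(f"average frames per sequence: {avg_frames_per_sequence:.1f}")
print(f"average time span per sequence: {avg_span_s:.1f} s")
for gsd in GSD_RANGE_M:
    w, h = ground_footprint_m(AVG_WIDTH_PX, AVG_HEIGHT_PX, gsd)
    print(f"GSD {gsd * 100:.0f} cm/pixel: about {w:.1f} m x {h:.1f} m on the ground")
```

On these numbers an average sequence has about 22 frames, so a tracker sees each scene for only a few seconds at 2 fps.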

Sample images of the AerialMPT dataset from different locations and with various crowd and movement complexities

The dataset was manually labeled with point annotations on individual pedestrians, with each individual assigned a unique ID across the entire sequence. This process resulted in 2,528 pedestrians annotated with 44,740 annotation points. The dataset is divided into 8 training and 6 test sequences.
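Point annotations with a per-sequence unique ID can be grouped into tracks in a few lines. The sketch below is an illustration only: the on-disk annotation format is not specified here, so the `PointAnnotation` record and its fields (`frame`, `ped_id`, `x`, `y`) are hypothetical, and the toy sample is made up. Only the two dataset-wide totals come from the description above.

```python
from collections import defaultdict
from typing import NamedTuple


class PointAnnotation(NamedTuple):
    """Hypothetical record for one annotated point (not the official format)."""
    frame: int    # frame index within the sequence
    ped_id: int   # pedestrian ID, unique across the sequence
    x: float      # pixel column
    y: float      # pixel row


def build_tracks(annotations):
    """Group point annotations by pedestrian ID and sort each track by frame."""
    tracks = defaultdict(list)
    for ann in annotations:
        tracks[ann.ped_id].append(ann)
    for points in tracks.values():
        points.sort(key=lambda ann: ann.frame)
    return dict(tracks)


def mean_track_length(tracks):
    """Average number of annotated points per pedestrian."""
    return sum(len(points) for points in tracks.values()) / len(tracks)


# Made-up toy sample: two pedestrians over three frames.
sample = [
    PointAnnotation(frame=1, ped_id=7, x=120.0, y=88.5),
    PointAnnotation(frame=0, ped_id=7, x=118.0, y=90.0),
    PointAnnotation(frame=0, ped_id=9, x=301.5, y=40.0),
    PointAnnotation(frame=2, ped_id=7, x=122.5, y=87.0),
]
tracks = build_tracks(sample)
print(mean_track_length(tracks))

# Dataset-wide average from the totals quoted in the description.
TOTAL_POINTS = 44_740
TOTAL_PEDESTRIANS = 2_528
print(f"{TOTAL_POINTS / TOTAL_PEDESTRIANS:.1f} annotation points per pedestrian")
```

Dividing the quoted totals gives roughly 17.7 annotation points per pedestrian on average.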

Sample aerial image with its overlaid annotations from the AerialMPT dataset, taken over the BAUMA 2016 trade fair

When the AerialMPT dataset is used, the following publication must be cited:

Kraus, M.; Azimi, S.; Ercelik, E.; Bahmanyar, R.; Reinartz, P.; Knoll, A. (2021): AerialMPTNet: Multi-Pedestrian Tracking in Aerial Imagery Using Temporal and Graphical Features. 25th International Conference on Pattern Recognition (ICPR).
