VETRA: A Dataset for Vehicle Tracking in Aerial Imagery
VETRA is a dataset for vehicle tracking in aerial image sequences. It presents unique challenges such as low frame rates, small and fast-moving objects, and strong camera motion. These characteristics allow for extended tracking of numerous vehicles with varying motion behaviors over large areas and pose new challenges for multi-object tracking (MOT) algorithms.
In Figure 1, mosaics of three sample image sequences are shown. By overlaying the images in chronological order and considering their geolocation, the heterogeneity of camera motion and spatial resolution becomes visible.
Figure 2 displays the vehicle annotations in greater detail: VETRA offers polygon, oriented bounding box (OBB), and horizontal bounding box (HBB) annotations for 8 vehicle classes. Furthermore, the image crops illustrate the variety of spatial structures depicted under various viewing settings.
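To illustrate how the three annotation types relate, the following minimal Python sketch derives a horizontal bounding box from a polygon outline by taking the extremes of its vertex coordinates. The function name `polygon_to_hbb`, the `(x, y)` pixel-coordinate convention, and the sample vertices are assumptions made for illustration only; they do not reflect the actual VETRA annotation file format, which is not described here.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def polygon_to_hbb(polygon: List[Point]) -> Tuple[float, float, float, float]:
    """Return (x_min, y_min, x_max, y_max) of the axis-aligned box enclosing the polygon."""
    if not polygon:
        raise ValueError("polygon must contain at least one vertex")
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    return min(xs), min(ys), max(xs), max(ys)


# Hypothetical vehicle outline in pixel coordinates (not taken from VETRA).
vehicle_polygon = [(10.0, 5.0), (18.0, 9.0), (16.0, 13.0), (8.0, 9.0)]
hbb = polygon_to_hbb(vehicle_polygon)
print(hbb)  # (8.0, 5.0, 18.0, 13.0)
```

For a rotated vehicle such as this one, the HBB covers noticeably more background than the polygon itself, which is one reason an OBB or polygon annotation can be the tighter choice for small objects.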
VETRA consists of 52 image sequences captured by airplanes and helicopters using DLR’s 3k and 4k camera systems. The spatial distribution is shown in Figure 3; the acquisition sites are located in Germany and Austria. In addition to the classical training, validation and test sets, VETRA offers a second test set specifically designed for the application of large area monitoring (LAM). The LAM sequences are recorded over 7 rural roads and motorways with a fixed camera speed and configuration. Each road section is captured at 4 different times of the day, enabling the performance of MOT algorithms to be evaluated under different traffic loads in a static environment. Furthermore, the features extracted from the LAM sequences can be utilized in transport research applications.
When the VETRA dataset is used, it is mandatory to cite the following publication:
Hellekes, J.; Mühlhaus, M.; Bahmanyar, R.; Azimi, S.; Kurz, F. (2024): VETRA: A Dataset for Vehicle Tracking in Aerial Imagery – New Challenges for Multi-Object Tracking. Accepted for European Conference on Computer Vision (ECCV).
Download