Trials of self-driving cars have been run in a number of cities to help researchers and regulators collect data on the challenges of autonomous driving on public roads. To date, there are at least nine well-known open datasets on autonomous vehicles (AVs), the earliest being KITTI from the Karlsruhe Institute of Technology and the latest being the Waymo Open Dataset, released in August 2019. Other organizations that released open datasets in 2019 include Aptiv (previously nuTonomy) with the nuScenes dataset, Argo with the Argoverse dataset and Lyft with its Level 5 Dataset.
Examples of the nuScenes dataset
In 2018, Berkeley A.I. Research (B.A.I.R.) released the BDD100K dataset, which is the largest to date in terms of monocular video frames (120 million frames). Baidu's Apollo program released the ApolloScape dataset, which features 146,997 frames. Hesai & Scale are expected to release their full dataset in the coming months.
Types of data released
Earlier datasets such as BDD100K and ApolloScape consisted primarily of annotated frames from monocular camera video. The datasets released in 2019 offer a richer variety of data, including LiDAR, radar and stereo camera recordings. Most of these datasets cover different city scenarios, weather conditions, times of day and scene types, so that researchers can improve their autonomous driving models and algorithms to perform well across a range of situations.
The bulk of the data was collected in U.S. cities, including San Francisco, Phoenix, Pittsburgh and New York, as well as overseas in Singapore, Germany (Karlsruhe) and China.