Mapillary has launched a traffic sign recognition dataset designed to help autonomous vehicles understand road signage.

The dataset holds 100,000 labelled images from around the world. Mapillary says the images cover diverse traffic sign classes across a number of countries and vary widely in weather conditions, time of day, camera sensors and viewpoints.

One of the biggest concerns surrounding driverless vehicles is teaching them to see and understand traffic signs. Mapillary’s dataset has been designed to aid this.

Mapillary automotive vice-president Emil Dautovic said: “Carmakers typically go out and get their own data to train their algorithms, but that means that they have low levels of variability in their training data.

“When it comes to teaching cars to see, more diverse input data means better results. There hasn’t been anything like the Mapillary Traffic Sign Dataset available on the market before, and that’s why we built it.”

The dataset can be licensed by anyone to train their own traffic sign recognition systems.

Creating such a large dataset was made possible by Mapillary’s collaborative approach of gathering images from people and companies worldwide. Mapillary selected the 100,000 images for the dataset from the 570 million images uploaded by individuals and organisations.

At least 300 traffic sign classes have been verified, resulting in more than 320,000 labelled traffic signs across the dataset.

More than 52,000 images have been verified by people across the world, while the remaining images have only been partially annotated with the help of the firm’s computer vision technology.

According to recent research, cameras could potentially replace LiDAR in helping autonomous vehicles understand the environment they operate in. This would cut the cost of autonomous vehicles by tens of thousands of dollars.

Mapillary claimed that, by addressing traffic sign recognition, its new dataset tackles a different aspect of the perception problem.

According to Dautovic, a diverse dataset is essential for advancing towards camera-based solutions.

Dautovic added: “The strength in the Mapillary Traffic Sign Dataset really lies in the diversity of the input data. There are only a few other datasets on the market, and none of them has imagery from all over the world, simply because it would take too much effort for one, single player to get images from such a diverse set of locations on a global scale.

“We don’t actually need to do that at Mapillary, since all the images have been uploaded to the Mapillary platform by people across the entire world. With this dataset, we’re hoping to come closer to solving the problem of self-driving from cameras only, one step at a time.”

The company previously released the Mapillary Vistas Dataset, a street-level imagery dataset with pixel-accurate, instance-specific human annotations designed to teach machines to understand street scenes at scale.