Google releases a huge AI training data set with over 5 million photos of 200,000 locations worldwide

Designing AI systems that can accurately recognize individual places in the world at the instance level (that is, distinguishing between places in the same category, for example Niagara Falls from any other waterfall) and retrieve images (matching the objects in an image against other instances of those objects in a catalog) is one of the long-term goals of Google's artificial intelligence research division. Last year, the company released Google-Landmarks, a data set of landmarks around the globe that Google said was the largest in the world at the time, and also organized two competitions (Landmark Recognition 2018 and Landmark Retrieval 2018) that attracted more than 500 machine learning researchers and leading artificial intelligence practitioners from around the world.

Following last year's success, yesterday (5/5) Google officially released Google-Landmarks-v2 as an open data set for AI training, an important step toward computer vision models that can identify world landmarks more quickly, more accurately, and more robustly. Google-Landmarks-v2 is much larger than the previous version, with more than 5 million images (twice as many as the previous version) of 200,000 locations (seven times as many) around the world.

In addition, Google is running two new challenges this year on the Kaggle machine learning community, Landmark Recognition 2019 and Landmark Retrieval 2019, and has released the source code and model for Detect-to-Retrieve, a framework that makes region-based image retrieval more effective.

'Both image recognition and image retrieval generally require larger training data sets, in both the number of images and the variety of places, in order to train better and more robust systems. We hope that this data set will help improve the recognition and retrieval capabilities of modern AI models,' said Bingyi Cao and Tobias Weyand, two software engineers at Google AI.

According to the two engineers, the 5 million photos of more than 200,000 places in Google-Landmarks-v2 were contributed by photographers around the world. Each photo is labeled with the name of the place and its author; the landmarks include Neuschwanstein Castle, the Golden Gate Bridge, Kiyomizu-dera, the Burj Khalifa, the Great Sphinx of Giza, Machu Picchu, and many other famous attractions. Google researchers then supplemented these with photos of lesser-known landmarks collected from Wikimedia Commons, the Wikimedia Foundation's online archive of images, audio, and other media.

So what problem does the Detect-to-Retrieve framework solve? According to Bingyi Cao and Tobias Weyand, the model Google released (trained on a subset of 80,000 photos taken from the first Google-Landmarks data set) uses the bounding boxes produced by an object detection model to give extra weight to the image regions that contain the objects of interest, which greatly improves retrieval accuracy.
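The idea of giving extra weight to detected regions can be illustrated with a small Python sketch. This is not Google's Detect-to-Retrieve code: the function names, the `box_weight` parameter, and the simple weighted-sum aggregation are all assumptions made for illustration, and the released framework's aggregation is more sophisticated. The sketch only shows how local features that fall inside a detector's bounding boxes could count for more when an image is summarized as a single descriptor.

```python
import math


def in_box(point, box):
    """Return True if an (x, y) point lies inside an (x_min, y_min, x_max, y_max) box."""
    x, y = point
    x_min, y_min, x_max, y_max = box
    return x_min <= x <= x_max and y_min <= y <= y_max


def weighted_descriptor(features, boxes, box_weight=3.0):
    """Aggregate local features into one L2-normalized image descriptor.

    features:   list of ((x, y), descriptor) pairs, descriptor being a list of floats
    boxes:      bounding boxes from a (hypothetical) object detector
    box_weight: illustrative weight for features inside any box; others get 1.0
    """
    if not features:
        raise ValueError("at least one local feature is required")
    dim = len(features[0][1])
    total = [0.0] * dim
    for point, descriptor in features:
        weight = box_weight if any(in_box(point, b) for b in boxes) else 1.0
        for i, value in enumerate(descriptor):
            total[i] += weight * value
    norm = math.sqrt(sum(v * v for v in total))
    return [v / norm for v in total] if norm > 0 else total


def cosine_similarity(a, b):
    """Dot product of two L2-normalized descriptors."""
    return sum(x * y for x, y in zip(a, b))
```

In a retrieval setting, a query image and every image in the index would each be reduced to such a descriptor, and the index images would be ranked by `cosine_similarity` against the query. With the weighting, background features outside the detected landmark contribute less to that ranking.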

Meanwhile, Landmark Recognition 2019 (in which participating teams design AI models that identify landmarks) and Landmark Retrieval 2019 (in which teams use an AI system to find the images that show a designated place) opened for registration today. Both competitions include cash prizes worth $50,000, and the winning teams will be invited by Google to present the details of their methods at the Conference on Computer Vision and Pattern Recognition (CVPR), held in Long Beach, California later this year.