Facebook engineers have unveiled a new AI training method that helps a system visually perceive videos and photos. The method speeds up analysis and makes it less mechanical.
The researchers explained that AI can perform dozens of operations, but only on the basis of data it already has. So the engineers at Facebook decided to add “common sense” to the learning process. With this approach, a machine learning model does not need to be shown 500 photos of a cat before it can start identifying that animal. The social network’s new research aims to avoid this learning step.
The scientists described how they improved and scaled advanced computer vision algorithms. One of the interesting areas of Facebook’s development is “semi-supervised learning”.
Facebook researchers have shown by example that this kind of learning can be challenging but very effective. The DINO system (DIstillation of knowledge with NO labels) is able to find objects of interest in a video without labeled data.
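The article gives no implementation details, so the following is only a minimal sketch of the general idea behind distillation without labels: a "student" is trained to match the output of a "teacher" on the same input, and the teacher is a slowly moving average of the student, so no human-written labels are involved. All function names, the temperature values, and the momentum value are illustrative assumptions, and plain Python lists stand in for the outputs and weights of real neural networks.

```python
import math

def softmax(logits, temperature):
    """Turn raw scores into probabilities; a lower temperature gives a sharper result."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, center,
                      student_temp=0.1, teacher_temp=0.04):
    """Cross-entropy between teacher and student outputs (hypothetical helper).

    The teacher output is centered and sharpened before comparison.
    No label appears anywhere: the teacher's output is the target.
    """
    centered = [t - c for t, c in zip(teacher_logits, center)]
    teacher_probs = softmax(centered, teacher_temp)
    student_probs = softmax(student_logits, student_temp)
    return -sum(t * math.log(s + 1e-12)
                for t, s in zip(teacher_probs, student_probs))

def ema_update(teacher_weights, student_weights, momentum=0.996):
    """Move the teacher weights slightly toward the student weights."""
    return [momentum * t + (1.0 - momentum) * s
            for t, s in zip(teacher_weights, student_weights)]

# Illustrative values only: two "views" of the same image should agree.
teacher_out = [2.0, 0.5, -1.0]
center = [0.0, 0.0, 0.0]
loss_agree = distillation_loss([2.0, 0.5, -1.0], teacher_out, center)
loss_disagree = distillation_loss([-1.0, 0.5, 2.0], teacher_out, center)
print(loss_agree < loss_disagree)
```

In this sketch the loss is small when the student agrees with the teacher and large when it does not, which is the signal that would drive training in place of tagged data.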
To do this, the system treats the video not as a sequence of images that need to be analyzed in order, but as a complex, interconnected set of data. By paying attention to the middle and the end of a video, the AI can get an idea of things like “an object of such and such a shape moves from left to right.” This information is then used for further analysis. The scientists note that the system does not work mechanically, but develops a basic sense of visual meaning without a huge amount of training.
As a result, the system performs well compared to traditionally trained systems. The researchers have shown that an AI trained on 500 photographs of dogs and 500 photographs of cats recognizes both, but cannot understand how they are similar. Facebook’s algorithm, by contrast, is able to tell them apart thanks to “common sense” and visual perception of the pictures.