MIPT has created a system that automatically tracks an object across multiple cameras, Andrey Leus, a leading researcher at the Laboratory of Special Purpose Digital Systems of the Moscow Institute of Physics and Technology (MIPT), told socialbites.ca.
Modern computer algorithms can recognize objects in photos and videos, including human faces. In most cases, however, such programs can only register that a target object is present in the frame of one of the cameras; observing a person (or a vehicle) over a long period and reconstructing their route still requires operator intervention.
“The system we created makes it possible to build tracks [path records] that link together images of observed objects from different cameras. Most modern surveillance cameras with image recognition treat each detected object as a ‘newly found object’, and the system does not see the link between detections, even if another camera is observing the same object at that moment. Imagine a car driving through the city whose license plate can be read by only one camera. Our algorithm lets us match the image of the car with a visible license plate to images of the same car from other cameras where the plate is not visible,” Leus said.
In this way, the surveillance system can automatically track the target object and “hand it over” from one camera to the next. Human perception offers a crude analogy: if in one close-up we pick out the person we need in a crowd, wearing a hat and a long coat, then in other frames we can recognize them from the silhouette alone, a hat and coat with no visible face. Moreover, each object is associated not with a single image but with a whole series of images taken at different distances and from different angles.
“When recording a track, each camera captures a series of shots. So the object is associated not with a single picture but with a whole set of hundreds or thousands. Having memorized this dataset, we can easily recognize the object on the next camera and continue watching it,” Leus explained.
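The idea of matching a whole set of shots, rather than a single picture, can be illustrated with a minimal Python sketch. It is not the MIPT algorithm, which the article does not describe in detail. It assumes that each shot has already been converted into a numeric feature vector and that two tracks belong to the same object when their best pair of shots is similar enough. The cosine similarity measure, the 0.9 threshold, the function names, and the object IDs are all hypothetical choices made for illustration.

```python
import math

def cosine(a, b):
    """Cosine similarity between two feature vectors (assumed representation)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def track_similarity(track_a, track_b):
    """Best pairwise similarity between two sets of shots."""
    return max(cosine(u, v) for u in track_a for v in track_b)

def match_track(new_track, known_tracks, threshold=0.9):
    """Return the ID of the best-matching known object, or None if nothing
    reaches the (hypothetical) similarity threshold."""
    best_id, best_score = None, threshold
    for object_id, gallery in known_tracks.items():
        score = track_similarity(new_track, gallery)
        if score >= best_score:
            best_id, best_score = object_id, score
    return best_id

# Toy data: every known object keeps several shots from earlier cameras.
known = {
    "car_A123": [[1.0, 0.0, 0.1], [0.9, 0.1, 0.2]],
    "person_7": [[0.0, 1.0, 0.0], [0.1, 0.9, 0.1]],
}
# A track from the next camera: one shot resembles the car, one is unclear.
new_track = [[0.95, 0.05, 0.15], [0.2, 0.1, 0.9]]
matched = match_track(new_track, known)
unmatched = match_track([[0.0, 0.0, 1.0]], known)
```

Because the comparison uses the best pair of shots, a single clear view on the next camera is enough to continue the track, even if the other shots in the series are uninformative.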
The surveillance system can also be tied to the locations where the cameras are installed. This makes it possible, for example, to monitor a student’s attendance at lectures: to determine in which lecture hall a person was present and when.
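As a rough illustration of the attendance example, the sketch below assumes the system outputs sightings as (object ID, camera ID, time) records and that a table maps each camera to a room. The record format, camera names, room names, and the use of minutes since midnight as the time unit are all hypothetical; the article does not specify them.

```python
from collections import defaultdict

# Hypothetical mapping from camera IDs to the rooms they cover.
CAMERA_ROOM = {"cam_1": "Lecture hall 101", "cam_2": "Lecture hall 202"}

def presence_intervals(sightings, camera_room):
    """Collapse sightings into (first_seen, last_seen) per object and room.

    sightings: iterable of (object_id, camera_id, minutes_since_midnight).
    """
    intervals = defaultdict(dict)
    for object_id, camera_id, t in sightings:
        room = camera_room[camera_id]
        first, last = intervals[object_id].get(room, (t, t))
        intervals[object_id][room] = (min(first, t), max(last, t))
    return dict(intervals)

sightings = [
    ("student_42", "cam_1", 540),  # 09:00
    ("student_42", "cam_1", 625),  # 10:25
    ("student_42", "cam_2", 660),  # 11:00
]
attendance = presence_intervals(sightings, CAMERA_ROOM)
```

Here `attendance["student_42"]` records that the student was seen in Lecture hall 101 between minute 540 and minute 625, and in Lecture hall 202 at minute 660.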