Video surveillance for mass events: concrete help for law enforcement agencies

30/06/18

In recent years Europe has been the scene of events that have caused numerous victims, hardship and a widespread sense of insecurity. Terrorist attacks and other criminal acts carried out during public events and in crowded places have quickly become a priority for the European Union and for the law enforcement agencies of its member states. The scenario is heterogeneous and must be addressed by striking the right balance between strengthening security measures and preserving individual freedoms.

Today there are several technologies, known as computer vision (or "artificial vision"), which in the very near future will be able to give the police concrete support in the use of video surveillance systems. Some of these technologies are being developed through the work that the Pattern Recognition and Applications Lab (PRA Lab - http://pralab.diee.unica.it) of the University of Cagliari is carrying out within the LETSCROWD project (Law Enforcement agencies human factor methods and Toolkit for the Security and protection of CROWDs in mass gatherings), which started in May 2017 and is funded by the European Commission under the Horizon 2020 program.

The general objective of LETSCROWD is to develop strategic and operational methodologies and solutions for monitoring and protecting crowds during gatherings in public places, as concrete legislative and executive support for the definition of a European Security Model in the context of mass gatherings.

Various technological tools will be developed to support law enforcement agencies in their activities at mass gatherings. The agencies will also test these tools in practical demonstrations, and training activities will be provided for them.

This is in fact one of the main roles of the research laboratory in Cagliari, which is also involved in other areas of the project (analysis of security policies, analysis of information sources such as social networks, and dissemination of project results).

Video surveillance systems are now widely used to monitor public and private places (banks, stadiums, car parks, railways, airports, etc.), industrial areas, and urban and extra-urban road infrastructure. As a consequence, it becomes ever more difficult, if not impossible, for operators to watch in real time all the video these systems produce. This directly affects the ability to react promptly to potentially relevant or suspicious events or actions. Similarly, during an investigation carried out after the fact, analyzing all the available recordings to retrieve the relevant frames may take an excessive amount of time. Introducing advanced computer vision technologies to automate at least part of video monitoring and analysis is therefore becoming a necessity. This is indeed the direction in which the major producers of video surveillance solutions are moving, with features such as automatic detection and tracking of vehicles and people, and vehicle license plate recognition.

Consider now the scenario of interest for LETSCROWD, i.e. the monitoring by law enforcement agencies of mass events such as demonstrations, concerts and sporting events. At such events the number of cameras grows considerably (dozens of devices in some cases, and more during "critical" events). The cameras may be installed specifically by law enforcement agencies, including those mounted on aircraft (helicopters and remotely piloted aircraft systems, more commonly known as "drones"), or they may belong to existing video surveillance systems installed in public places, including privately owned ones (e.g. in banks and stadiums). The video from each camera is typically watched by one or more operators and law enforcement officers in a control room. Given the large number of video streams, each operator has to monitor images from multiple cameras in real time, communicate with officers in the field, and possibly decide how to change the settings of PTZ (pan-tilt-zoom) cameras according to operational needs (for example by changing the framing or the zoom). All videos are also recorded (and kept for a period defined by law) and can later be reviewed during any investigation into incidents that occurred during the event.

Let's take a closer look at some examples of the video analysis tasks that law enforcement officers may need to perform during a mass event or after it has taken place. An operator who notices an individual behaving suspiciously in one of the videos may wish to retrieve, possibly in real time, all the videos in which the same person appears, in order to analyze that person's movements and actions and then give instructions to officers in the field, for example to track the person down. Similarly, during an investigation, a forensic investigator may wish to retrieve all videos showing a subject described by one or more eyewitnesses (who may themselves be officers operating in the field) to an incident or crime that occurred during the event. Clearly, "manual" analysis of all the available videos can take too long.

If the operator has an image of the person to search for (as in the first scenario described above), it may be possible to use biometric face recognition technologies to search the available videos automatically. However, these technologies are effective only if the face is clearly visible and seen almost frontally. This rarely happens in application contexts such as those of interest to LETSCROWD: in images taken by video surveillance systems covering relatively large areas (e.g. streets, squares, concert venues), faces may not be visible, or may not be recognizable because of factors such as excessive distance from the camera and occlusion by other people or objects in the scene (in addition to the non-frontal pose already mentioned). In these cases operators rely on auxiliary characteristics to identify and recognize a person, such as gender, the appearance of the clothing, and the presence of accessories such as hats or backpacks. These characteristics are useful mainly over short periods (a few hours, or in any case within the same day), during which it is reasonable to assume that a person does not change clothes; for this reason they are also called "soft" (or "weak") biometrics, as opposed to a "strong" biometric such as the face.

For some years now, computer vision researchers have been studying re-identification techniques based on a person's appearance rather than on the face (appearance-based person re-identification). Their purpose is to automatically retrieve the videos acquired by video surveillance systems in which a given person appears, starting from an image of that person, typically provided by an operator. Similarly, for the case in which only a description of a person's appearance is available, researchers are studying techniques for retrieving images of people whose appearance matches a description given by an operator in terms of a predefined set of "attributes" related to clothing (for example, color), gender, and accessories such as those mentioned above. These techniques are called attribute-based people search.

How can appearance-based person re-identification and attribute-based people search tools concretely support law enforcement officers and investigators? Let's go back to the example of an operator who observes a suspicious person in a video and wants to retrieve other videos in which that person appears. The operator could pause the video, crop the image of that person from a frame, and launch the person re-identification software. The software compares this query image with all the images of people that it has automatically extracted, working in the background and in real time, from all the available videos. When the comparison is complete, it returns these images to the operator as a sequence ordered by similarity to the image of the person being searched for. The operator can then scroll through the sequence, access the "context" information for each image (for example the position of the corresponding camera and the time at which the image was taken), and view the corresponding video clip.
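The ranking step of this workflow can be illustrated with a minimal Python sketch. It is not the PRA Lab software: it assumes that each extracted person image has already been turned into a numeric appearance descriptor by some model, and it uses cosine similarity as one plausible similarity measure. The `Detection` record, its field names and the sample values are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Detection:
    """A person image extracted in the background from a video (hypothetical record)."""
    camera_id: str            # which camera produced the image
    timestamp: float          # seconds since the start of the recording
    descriptor: List[float]   # appearance descriptor, assumed to be precomputed


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two equal-length descriptors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_gallery(query: List[float], gallery: List[Detection]) -> List[Tuple[float, Detection]]:
    """Return (score, detection) pairs sorted from most to least similar to the query."""
    scored = [(cosine_similarity(query, det.descriptor), det) for det in gallery]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored


# Made-up descriptors standing in for images extracted from three cameras.
gallery = [
    Detection("cam-01", 12.0, [0.9, 0.1, 0.0]),
    Detection("cam-02", 47.5, [0.0, 1.0, 0.2]),
    Detection("cam-03", 63.0, [0.8, 0.3, 0.1]),
]
query = [1.0, 0.2, 0.0]  # descriptor of the image cropped by the operator

for score, det in rank_gallery(query, gallery):
    # The context information (camera, time) is shown next to each result.
    print(f"{score:.3f}  {det.camera_id}  t={det.timestamp}s")
```

With these sample values the results are listed in the order cam-01, cam-03, cam-02. The operator still decides which of the ranked images actually show the person of interest.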

Attribute-based people search software works in a similar way. Taking the example of a description of an individual provided by a witness, an investigator can enter, through an appropriate interface, the elements of this description that correspond to the predefined attributes supported by the software (for example, a man in a red shirt and black trousers). The software then retrieves the images of people previously extracted (automatically) from all the available videos, and shows them to the user as a sequence ordered by how closely they match the description. In this case too, the user can access the context information and the video clip for each retrieved image.
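A minimal Python sketch of this kind of search is shown below. It rests on simplifying assumptions: every extracted image already carries attribute labels produced by some automatic classifier, and the score is simply the fraction of described attributes that match exactly. A real system would more likely work with classifier confidence values. The attribute names, the `PersonImage` record and the sample data are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class PersonImage:
    """A person image with automatically assigned attributes (hypothetical record)."""
    camera_id: str
    timestamp: float
    attributes: Dict[str, str]  # assumed output of an attribute classifier


def match_score(description: Dict[str, str], attributes: Dict[str, str]) -> float:
    """Fraction of the described attributes that the image matches exactly."""
    if not description:
        return 0.0
    hits = sum(1 for name, value in description.items() if attributes.get(name) == value)
    return hits / len(description)


def search(description: Dict[str, str], images: List[PersonImage]) -> List[Tuple[float, PersonImage]]:
    """Return (score, image) pairs with at least one matching attribute, best match first."""
    scored = [(match_score(description, img.attributes), img) for img in images]
    scored = [pair for pair in scored if pair[0] > 0.0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored


# Witness description: a man in a red shirt and black trousers.
description = {"gender": "male", "upper_color": "red", "lower_color": "black"}

images = [
    PersonImage("cam-01", 10.0, {"gender": "male", "upper_color": "red", "lower_color": "black"}),
    PersonImage("cam-02", 25.0, {"gender": "male", "upper_color": "red", "lower_color": "blue"}),
    PersonImage("cam-03", 31.0, {"gender": "female", "upper_color": "green", "lower_color": "white"}),
    PersonImage("cam-04", 58.0, {"gender": "female", "upper_color": "white", "lower_color": "black"}),
]

for score, img in search(description, images):
    print(f"{score:.2f}  {img.camera_id}  t={img.timestamp}s")
```

Partial matches are kept rather than discarded, because a witness description or an automatic attribute label may be wrong on one item; the image from cam-03, which matches nothing, is the only one left out.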

The two tools described above therefore make it possible to reduce the manual search time on the available videos, and can also retrieve images of people of interest that would have escaped an operator. One of the activities of the PRA Lab in LETSCROWD consists precisely in the development of prototypes of these tools, and in their validation in realistic use cases by law enforcement agencies involved in the project.

Another set of activities carried out by law enforcement during mass events concerns the monitoring of a crowd; typical examples are the estimate of the number of people present in a given area and the detection of potentially dangerous or suspicious behavior, such as the presence of one or more people running in the middle of a slowly moving crowd. The development of techniques able to automatically monitor a crowd is a goal pursued in the field of artificial vision research for more than twenty years; however, this requires an ability to analyze and interpret the content of images and videos that is not yet within the reach of current technologies, except through ad hoc solutions in very limited and well-defined applications. In this context, the objective of the PRA Lab in LETSCROWD consists in the analysis of the state of the art of artificial vision technologies for the monitoring of crowds, and in the development of prototypes of systems able to support the operators in the following tasks:

  • estimation of the density or number of people in a given area framed by a video camera;
  • detection of the main directions and speeds of movement within a crowd;
  • detection of "abnormal" behavior in a crowd due to:
      1. sudden changes in density (for example due to a panic-induced escape);
      2. a predefined maximum value of the density or number of people in a given area being exceeded;
      3. people or groups moving in directions or at speeds different from the "normal" ones in a given context.

In particular, given how difficult these tasks are to automate, the prototypes developed by the PRA Lab will be semi-automatic: they will interact with operators, reducing their workload but leaving them the final decision on how to interpret a given scene. To give a concrete example, the tool for detecting abnormal behavior can draw the operator's attention to a scene in which it has detected a sudden decrease in the density of people, while leaving the operator to assess whether the crowd's behavior calls for action, such as an intervention by officers in the field, or whether the situation poses no real danger, thus avoiding potential false alarms.
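The three kinds of "abnormal" behavior listed above, and the idea of raising alerts for an operator to judge, can be sketched in Python as simple rule checks. This is only an illustration under stated assumptions, not the project's method: it supposes that per-frame people counts and per-track velocity vectors are already produced by some upstream vision component, and the function names and thresholds are hypothetical.

```python
import math
from typing import List, Tuple


def density_alerts(counts: List[int], max_count: int,
                   max_relative_change: float) -> List[Tuple[int, str]]:
    """Flag frames whose people count exceeds a limit or jumps relative to the previous frame."""
    alerts = []
    for i, count in enumerate(counts):
        if count > max_count:
            alerts.append((i, "over_capacity"))
        if i > 0 and counts[i - 1] > 0:
            change = abs(count - counts[i - 1]) / counts[i - 1]
            if change > max_relative_change:
                alerts.append((i, "sudden_change"))
    return alerts


def motion_outliers(velocities: List[Tuple[float, float]],
                    max_angle_deg: float, max_speed_ratio: float) -> List[int]:
    """Return indices of tracks whose direction or speed departs from the crowd's mean motion."""
    if not velocities:
        return []
    n = len(velocities)
    mean_vx = sum(vx for vx, _ in velocities) / n
    mean_vy = sum(vy for _, vy in velocities) / n
    mean_angle = math.atan2(mean_vy, mean_vx)
    mean_speed = sum(math.hypot(vx, vy) for vx, vy in velocities) / n
    outliers = []
    for i, (vx, vy) in enumerate(velocities):
        # Smallest absolute angle between this track and the mean direction.
        diff = abs(math.atan2(vy, vx) - mean_angle)
        diff = min(diff, 2 * math.pi - diff)
        too_fast = math.hypot(vx, vy) > max_speed_ratio * mean_speed
        if math.degrees(diff) > max_angle_deg or too_fast:
            outliers.append(i)
    return outliers


# Made-up people counts for one camera area, one value per sampled frame.
counts = [120, 125, 130, 60, 58, 210]
for frame, reason in density_alerts(counts, max_count=200, max_relative_change=0.4):
    print(f"frame {frame}: {reason} -> shown to the operator for assessment")

# Made-up velocity vectors: a slow crowd, one runner, one person going the opposite way.
tracks = [(1.0, 0.0), (1.1, 0.1), (0.9, -0.1), (1.0, 0.05), (4.0, 0.2), (-1.0, 0.0)]
print("tracks to review:", motion_outliers(tracks, max_angle_deg=45.0, max_speed_ratio=2.0))
```

The code only produces candidates for review; whether a drop in density or a running person is a real danger remains the operator's decision. Using the mean as the reference is a simplification, since the outliers themselves pull it; a median would be more robust.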

The LETSCROWD project is coordinated by ETRA Investigación y Desarrollo SA (Spain) and involves sixteen partners from eight EU countries (private and public research institutes, universities, law enforcement agencies and public authorities) operating in critical areas of government, security, energy, finance, transport and services. In addition to the PRA Lab, the Italian partners include the consulting firm Deep Blue; Pluribus One, an academic spin-off of the PRA Lab operating in the computer security sector; and the Ministry of the Interior - State Police, Department of Public Safety. The other European law enforcement agencies in the consortium include leading bodies: Policía Municipal de Madrid - Ayuntamiento de Madrid (Spain), University of Applied Sciences - Police Affairs (Germany), Local Police Voorkempen (LEA-Belgium), Ministerio da Administracao Interna - Polícia de Segurança Pública (Portugal) and Ministry of Internal Affairs (Romania).

The project therefore sets ambitious goals, with considerable impact on the lives of European citizens and on the work of public security authorities. Further details on the project can be found at https://letscrowd.eu (the site also links to the Twitter and LinkedIn channels dedicated to project activities). LETSCROWD started a year ago, has achieved satisfactory interim results, and will be completed in October 2019, leaving an interesting legacy to research institutes and operational bodies.

Authors / co-authors

Prof. Giorgio Fumera, Associate Professor of Information Processing Systems at the Department of Electrical and Electronic Engineering of the University of Cagliari.

Dr. Rita Delussu, PhD student, Department of Electrical and Electronic Engineering of the University of Cagliari.

Dr. Matteo Mauri, head of scientific dissemination, Pattern Recognition and Applications Lab, Department of Electrical and Electronic Engineering of the University of Cagliari.

(photo: web)

Useful links:

http://pralab.diee.unica.it

https://letscrowd.eu