In a complex environment, a camera may be required to perform different functions, or to run with different configurations, at different times. For instance, from 9am till noon the system may be used for counting, but from then on, except on public holidays, it may be required for intrusion detection. The ability to schedule different functions in a robust yet flexible manner is critical to the effectiveness of a good system.
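The scheduling idea above can be sketched as a simple time-based dispatcher. This is a minimal illustration only: the function names, hours, holiday list and holiday behavior are all assumptions for the example, not part of any real product's configuration.

```python
from datetime import datetime, time

# Hypothetical holiday list as (month, day) pairs -- illustrative only.
PUBLIC_HOLIDAYS = {(1, 1), (12, 25)}

def active_function(now: datetime) -> str:
    """Return the analytics function a camera should run at `now`."""
    if (now.month, now.day) in PUBLIC_HOLIDAYS:
        return "counting"                      # assumed holiday behavior
    if time(9, 0) <= now.time() < time(12, 0):
        return "counting"                      # 9am till noon: counting
    return "intrusion_detection"               # the rest of the day

print(active_function(datetime(2024, 6, 3, 14, 0)))  # a weekday afternoon
```

A real scheduler would also handle days of the week and per-camera profiles, but the principle is the same: the active function is a pure lookup on the current time.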
Images from a video are invariably a two-dimensional representation of a three-dimensional space. Humans understand perspective because their binocular vision lets them see the world around them in 3D. Systems have to be sufficiently intelligent to understand perspective, e.g., to know that an object of a given physical size appears much smaller in the distance than it does in the foreground of the image.
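The size-versus-distance relationship can be expressed with a simple pinhole-camera model: apparent size in pixels scales inversely with distance. The focal length used here is an assumed, illustrative value, not a calibration from any real camera.

```python
def apparent_height_px(real_height_m: float, distance_m: float,
                       focal_px: float = 1000.0) -> float:
    """Pinhole-camera model: apparent size scales inversely with distance."""
    return focal_px * real_height_m / distance_m

# A 1.8 m person: ten times closer means ten times taller in the image.
print(apparent_height_px(1.8, 5.0), apparent_height_px(1.8, 50.0))
```

An analytics system uses this kind of calibration in reverse: given an object's pixel size and position in the frame, it infers whether the object is plausibly a person, a vehicle, or something else.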
Normally, if an event occurs, it should be brought to the attention of the operator, who can decide whether to archive that particular footage for later review. If no operator is available, it should be possible to set the system up to archive any event footage automatically. The amount of time that elapses before the system concludes that no human is available to intervene must be configurable by the user.
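The fall-back logic can be sketched as follows. All names here are hypothetical: `wait_for_ack` stands in for whatever mechanism prompts the operator, and the timeout constant represents the user-configurable waiting period described above.

```python
ACK_TIMEOUT_S = 30  # user-configurable wait for a human decision (assumed name)

def handle_event(event, wait_for_ack, archive):
    """Auto-archive the footage if no operator responds within the timeout."""
    decision = wait_for_ack(event, timeout=ACK_TIMEOUT_S)  # None on timeout
    if decision is None:
        archive(event)                 # no human available: keep the footage
        return "auto-archived"
    return decision                    # the operator's choice prevails
```

The key design point is that the automatic path is a default, not a replacement: the operator's decision, when one arrives in time, always takes precedence.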
The users of a system may not always be close to events, or even near the system used to monitor and analyze them. They may not have the resources to manage and maintain their systems locally. For such situations, the iOmniscient system has been designed to allow implementation, diagnostics and maintenance to be performed remotely. The remote access capability can be used for more than diagnosing problems: it can also be used for configuration when the system is being implemented.
In many situations, it may not be possible to monitor the system continuously; there may be insufficient staff to man a command and control center. To ensure that those in charge receive information on events immediately, all iOmniscient systems come with Mobile Client systems that operate on Android-based smartphones. Operators, supervisors and senior management can monitor events and manage their system even when they are not inside a control room.
In a busy city, it is not sufficient to be advised when an incident occurs. One also needs to know where it occurred and what resources are available to address it. For this reason, a good system knows the GPS location of all cameras. Beyond this, it is aware of the GPS locations of the police and emergency vehicles nearest to the incident. If there is a fire, the system knows the location of the nearest fire station and of every fire engine. If there is an accident, it knows the location of the nearest ambulance.
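Finding the nearest resource to an incident reduces to a distance computation over GPS coordinates. A minimal sketch, using the standard haversine formula for great-circle distance; the unit names and coordinates are invented for illustration.

```python
from math import radians, sin, cos, asin, sqrt

def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(radians, (*a, *b))
    h = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * asin(sqrt(h))

def nearest_resource(incident, resources):
    """resources: mapping of name -> (lat, lon); return the closest name."""
    return min(resources, key=lambda name: haversine_km(incident, resources[name]))

# Illustrative coordinates only.
ambulances = {"unit_7": (51.51, -0.12), "unit_3": (51.60, -0.30)}
print(nearest_resource((51.50, -0.10), ambulances))
```

A production system would of course use road-network travel time rather than straight-line distance, but straight-line distance is the natural first approximation.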
In certain jurisdictions, various emergency authorities may not want to initiate an automated response but, at least, the system can provide vital information to the person co-ordinating the response.
So far, we have only talked about intelligence as it relates to the images from a particular camera.
The next generation of video management goes beyond the analysis of single images. It involves using the information from multiple cameras to provide the whole network with intelligence.
The best way to visualize Network Intelligence is through an example. Consider a large theme park. At such a venue, the operators have to manage long queues for their various rides and activities. Some of these queues are extremely long: one could, for example, start at the entrance of a building, wind through underground corridors and emerge at a different building. No single camera can see every part of the queue. There are often many entrances to the queue, and there may be many points where people leave it, possibly out of frustration.
In order to manage these queues, management requires information like the average waiting time for a person who enters the queue. They also need to know the length of the queue, or where the queue ends, especially when not all parts of the queue are visible. Some of this information may also be made visible to the public to help them understand how long they may have to wait.
This is a very good example of an application that requires intelligence that goes beyond the use of a single camera. This type of application requires cameras to be placed strategically at all the entrance and exit points for the queue. Every camera is then used to count the number of people that pass that point. The cameras are also used to determine where the queue ends. The information from all the cameras is pooled together to provide the information that the park management requires. This allows management to open up more service points and to put up electronic signage advising customers of the expected waiting time for their ride.
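The pooling of counts described above can be sketched in a few lines. The current queue length is the total entries minus the total exits across all cameras, and the expected wait follows from Little's law (number in queue divided by service throughput). All numbers here are illustrative.

```python
def queue_length(entry_counts, exit_counts):
    """Pool per-camera counts: people in minus people out, never negative."""
    return max(sum(entry_counts) - sum(exit_counts), 0)

def expected_wait_min(length, service_rate_per_min):
    """Little's law: average wait = number in queue / service throughput."""
    return length / service_rate_per_min

# Three entrance cameras and two exit cameras (illustrative counts).
n = queue_length([60, 45, 15], [50, 30])   # 40 people currently queuing
print(expected_wait_min(n, 2.0))           # minutes, at 2 people served/min
```

No individual camera can produce either figure on its own; only the pooled totals are meaningful, which is precisely what makes this a network-level rather than a camera-level computation.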
Traditional suppliers of Video Analysis systems are still focused on analyzing the information from a single camera. Network Intelligence is only available from the most sophisticated providers of Video Analysis.
All analytics (video or otherwise) use explicit or implicit rules to determine whether certain incidents have occurred. For instance, the system can raise an alarm if a person falls down or if a car exceeds the speed limit.
In real life, many incidents may occur in some combination and the way in which they are combined can provide more information on the situation. A person falling down may have slipped. However, if a gunshot is heard at the same time, it is possible that the person has been shot and a different type of response may be appropriate.
The ability to combine rules using Boolean AND and OR operators can provide greater insight into a situation and can help the stakeholder mount a more appropriate response.
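The gunshot example can be written as a small rule table. The event names and alarm labels are hypothetical, chosen only to show how AND and OR combinations change the response.

```python
def combined_alarm(events):
    """Hypothetical Boolean rule combination; event names are illustrative."""
    if "person_down" in events and "gunshot" in events:
        return "possible_shooting"   # fall AND gunshot: armed response
    if "person_down" in events or "gunshot" in events:
        return "single_event_alarm"  # either alone: standard response
    return "no_alarm"
```

The same two detections produce different alarms depending on whether they occur together, which is the whole point of combining rules rather than evaluating each in isolation.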
The information generated by intelligent systems ultimately needs to be communicated to humans. Humans are known to have short attention spans and a limited ability to pull out key information from huge masses of data. Therefore, it is important that information is presented to human operators in a manner that is easy for them to absorb and use.
This is the primary challenge for VMS and Command and Control systems. Showing the video image from different cameras is the trivial part of such a system's capability. For a Video Management System (VMS) or Physical Security Information Management (PSIM) system to present information effectively, it must be able to accept information about the events that occur and then present it. This means the system must have a sophisticated interface for accepting such information from an advanced analytics system.
Unfortunately, many systems, while quite capable of showing the video that comes from cameras, have little ability to show what is actually happening in the video. Often, they provide only text messages about alarms. Only those systems that have been fully integrated with an advanced Video Analytics system can provide all the information that is available from that system.
Many existing VMS systems maintain a proprietary interface limiting their own ability to interface with advanced Video Analytics systems. The analogy would be two people speaking in different languages. Let us assume that A speaks Japanese and B speaks English. If A says something to B in Japanese, B can look it up in a dictionary and translate the words to English to understand him. However, there may be some words that just cannot be translated because the concepts do not occur in English. This would mean that B cannot understand that particular concept precisely.
Similarly, if the designers of the VMS system have not understood a particular Video Analytics concept, their product would have no ability to display the appropriate information. This creates a dilemma. The Video Analytics system can provide increasingly advanced intelligence but this is of little use if it cannot be displayed and communicated to the user by the VMS system.
iOmniscient, with its commitment to openness, provides ALL metadata about ALL events in a simple, standard format. Unfortunately, this is not usable by those VMS systems that have proprietary interfaces.
To ensure that all information can be effectively displayed, iOmniscient does offer its own VMS and Command and Control systems, which have been specifically designed to understand and display ALL the intelligent information available from the Video Analytics system.
To provide the operator with context, the display can also be integrated into drawings, plans or maps of the site or into Geographical Information Systems (GIS) that have other information available about the environment. The icons for the sensors can be embedded on the maps or images. These icons are dynamic and they can indicate if there is an alarm on that particular camera or if it is not working effectively.
As a further advance on this concept, some vendors have developed a 3D rendering capability of the type used in video games. Essentially, an artist is required to convert a 2D image into a 3D simulation of the environment. This can be quite effective but very expensive to implement for a large network of cameras.
The ultimate requirement of a good display system is to be able to extract all the important information that is available from the analytics system and to display it in a way that is meaningful for the user.