I was recently looking at the different failover mechanisms offered in the security industry and discovered that, even though the term failover is everywhere, the actual implementations vary widely. Some failover solutions do not even fail over automatically and require manual intervention. The IT industry has quite clear definitions of cold, warm and hot standby. So for the purpose of this blog post, let's explore them:
- Cold standby, or cold server, is a server whose sole purpose is to replace the main server in case of a failure. That server has all the software installed and configured but isn't powered on.
- Warm standby is a server that receives periodic updates of the information and can be turned on at any time to take over from the primary server. The data on the primary and secondary servers isn't synchronized at all times.
- Hot standby is a server that is running and receives updates continuously; it automatically detects failures on the primary server and takes over immediately.
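To make the differences easier to compare, the three definitions above can be written down as a small lookup table. This is only an illustrative sketch: the names `StandbyMode` and `STANDBY_MODES` are made up for this post, and the `automatic_takeover=False` values for cold and warm standby are an assumption, since the definitions only state automatic takeover for hot standby.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class StandbyMode:
    updates: str              # how the standby receives data from the primary
    automatic_takeover: bool  # whether it detects a failure and takes over by itself

# Hypothetical summary of the definitions above.
STANDBY_MODES = {
    # Installed and configured, but powered off, so it receives no updates.
    "cold": StandbyMode(updates="none", automatic_takeover=False),
    # Periodic updates; assumed here to need someone to turn it on.
    "warm": StandbyMode(updates="periodic", automatic_takeover=False),
    # Continuous updates, automatic failure detection, immediate takeover.
    "hot": StandbyMode(updates="continuous", automatic_takeover=True),
}

for name, mode in STANDBY_MODES.items():
    print(f"{name}: updates={mode.updates}, automatic takeover={mode.automatic_takeover}")
```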
If we apply the definition of hot standby to a recording server, it means having a secondary server that is continuously updated with the latest video configurations and that can immediately start recording and transmitting video if the primary server fails.
Now, let's look at a typical "hot standby" implementation for a recording server with 150 IP cameras:
- Detect a failure of the primary server within a few seconds.
- Start a new recording process on the secondary server.
- Connect to all the 150 IP cameras.
- Start transmitting and recording video.
A few years ago, when designing the Failover Archiver, we tested this implementation and realized it is impossible to execute a complete failover in under one minute. Launching a new process and connecting to all the cameras simply takes too long, and the time varies between camera manufacturers. By complete failover, I mean the entire time it takes for the transmission and recording functions to be fully operational on all cameras, not just the time it takes to detect a failure.
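A rough back-of-the-envelope model shows why the four steps above add up. The sketch below is not how any product measures failover: the function name and every number passed to it (detection time, process launch time, per-batch connection time, number of parallel connections) are hypothetical placeholders chosen only to illustrate the arithmetic.

```python
import math

def restart_failover_seconds(cameras, detect_s, launch_s, connect_s, parallel):
    """Estimate seconds until every camera is recording again
    when the secondary has to start from scratch."""
    # Cameras are connected in batches of `parallel`; each batch costs `connect_s`.
    batches = math.ceil(cameras / parallel)
    return detect_s + launch_s + batches * connect_s

# Hypothetical inputs, not measurements.
estimate = restart_failover_seconds(
    cameras=150, detect_s=5.0, launch_s=20.0, connect_s=4.0, parallel=10
)
print(f"estimated complete failover: {estimate:.0f} s")
```

With these made-up inputs, the connection phase alone accounts for 15 batches of 4 seconds, which is already a full minute before detection and process launch are counted.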
To ensure a constant and acceptable failover time, we designed something different. With Omnicast and Security Center, the secondary Archiver process is always running and already connected to the cameras, but the recording and transmitting functions are idle and waiting for a failure to occur.
Omnicast and Security Center Failover Archiver:
- Detect a failure of the primary server within a few seconds.
- Start transmitting and recording video.
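The idea of a secondary that is already connected and simply waits for a failure can be sketched in a few lines. This is a toy model under stated assumptions, not the actual Archiver code: the class `StandbyArchiver`, the heartbeat mechanism, the 5-second timeout and the camera names are all hypothetical, and "connecting" is reduced to adding an ID to a set.

```python
from dataclasses import dataclass, field

@dataclass
class StandbyArchiver:
    camera_ids: list
    heartbeat_timeout_s: float = 5.0   # hypothetical detection threshold
    connected: set = field(default_factory=set)
    recording: bool = False
    last_heartbeat: float = 0.0

    def start(self, now):
        # Connect to every camera up front, while the primary is still healthy.
        for cam in self.camera_ids:
            self.connected.add(cam)    # stand-in for opening a real camera session
        self.last_heartbeat = now

    def heartbeat(self, now):
        # Called whenever the primary reports that it is alive.
        self.last_heartbeat = now

    def tick(self, now):
        # On failure there is nothing to launch or connect: just start recording.
        if not self.recording and now - self.last_heartbeat > self.heartbeat_timeout_s:
            self.recording = True
        return self.recording

standby = StandbyArchiver(camera_ids=[f"cam-{i}" for i in range(150)])
standby.start(now=0.0)
standby.heartbeat(now=3.0)
standby.tick(now=6.0)   # last heartbeat was 3 s ago, standby stays idle
standby.tick(now=9.0)   # no heartbeat for more than 5 s, recording starts
```

The point of the design is visible in `tick`: the expensive work happens in `start`, long before any failure, so the takeover itself is only a state change.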
This simple difference in the implementation has a big impact on the total failover time of the system, which goes from a few minutes to a few seconds. This is how Genetec achieves a real hot standby scenario for its recording servers.