Video Database Replication

Video Surveillance Databases are special. They are written to constantly, they are rarely read from, and the index is very simple (just a timestamp as the key). There’s no reason – really – to use anything fancy, certainly not SQL server.

I recently saw a marketing blurb for an expensive and cumbersome storage system that integrated to a VMS. It touted that the VMS had a “proprietary database highly optimized for video storage”. I guess “it uses the file system” did not sound fancy enough.

The entertaining puffery was uncovered as I was looking into the feasibility of geo-redundancy for a partner. Basically, they were looking for a fully mirrored backup system: If the primary site was to vanish, the backup site would take over, with all recorded data being readily available.

Database replication is nothing new; but typical database replication systems assume that you have a much higher outbound throughput than inbound. You may have a database with 2 million records, and if you add 1000 records per day, you’ll need those new records to propagate to the replication sets in your cluster – challenging, but a problem that has been solved a thousand times.

Video data is very different; its a constant torrent of data streaming into the system, and once in a while someone pulls out a few records to look at an incident. If the database uses the file system for its blocks, it’s almost trivial to provide replication. Just make sure the directory on the backup site looks identical to the one on the primary. This can be done with a simple rsync on Linux.

Another option is to use theĀ Distributed Replicated Block Device (DRBD). This (Linux) tool allows you to create a drive that is mirrored 1:1 across a network. In other words, as files are written or changed, the exact same thing will happen on the backup drive. A Windows version appears to exist as well.

Surely, a better solution is to have the VMS be able to determine what files are most valuable, and push them to the remote site first. It might even chose to not mirror files that provide no value (zero motion files for example), or send a pruned version of the files to the backup system.

Depending on the sensitivity of the data, a customer might chose to extend/replicate their storage to the cloud. The problem here is that the upstream bandwidth is often limited, and thus in those cases a prioritization of the data is certainly needed.

Happy replicating…