RAM Buffers

About 13 years ago, we had a roundtable discussion about using RAM for pre-buffering of surveillance video. I was against it. Coding-wise, it would make things more complicated (we’d essentially have two databases), and the desire to support everything, everywhere, at any time made this a giant can of worms that I was not too happy to open. At the time, physical RAM was limited, and chances were that the OS would decide to push your RAM buffer to the swap file, causing severe degradation of performance. Worst of all, when things got swapped out was not deterministic, so all things considered, I said Nay.

As systems grew from the 25 cameras that were the maximum supported on the flagship platform (called XXV) to 64 and above, we started seeing severe bottlenecks in disk IO. Basically, since pre-buffering was enabled by default, every single camera’s video would pass through the disk IO subsystem only to be deleted 5 or 10 seconds later. A quick fix was to disable pre-buffering entirely, which helped enormously if the system only recorded on event and the events were not correlated across many cameras.

However, RAM buffering was recently added to the Milestone recorders, which makes sense now that you have 64-bit OSes with massive amounts of RAM.

I always considered “record on event” a bit of a compromise. It came about because people were annoyed with the way the system triggered when someone passed through a door. Instead of the door being closed at the start of the clip, it would usually be 20% open by the time motion detection triggered, so the beginning of the door opening would be missing.

A pre-buffer was a simple fix, but some caveats came up: systems that were set up to record on motion would often record all through the night due to noise in the images. If the system also triggered notifications, the user would often turn down the motion detection sensitivity until the false alarms disappeared. This had the unfortunate side effect of making the system too dull to properly detect motion in daylight, so you’d get missing video, people and cars “teleporting” all over the place, and so on. Quite often the user would not realize the mistake until an incident actually did occur, and by then it’s too late.
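To see why desensitizing kills daytime detection, consider a toy frame-differencing detector. This is a deliberately naive sketch (not how any particular vendor implements motion detection, and the thresholds and frame data are made up): a pixel threshold raised high enough to suppress night-time sensor noise also suppresses a genuine low-contrast daytime event.

```python
import numpy as np

def motion_detected(prev, curr, pixel_thresh, area_thresh):
    """Naive frame differencing: flag motion when enough pixels change
    by more than pixel_thresh (0-255 grayscale)."""
    diff = np.abs(curr.astype(int) - prev.astype(int))
    changed = np.count_nonzero(diff > pixel_thresh)
    return changed / diff.size > area_thresh

rng = np.random.default_rng(0)

# Night: two static frames differing only by heavy sensor noise.
night_prev = rng.integers(0, 40, (120, 160)).astype(np.uint8)
noise = rng.integers(-25, 26, (120, 160))
night_curr = np.clip(night_prev.astype(int) + noise, 0, 255).astype(np.uint8)

# Day: a small, low-contrast object appears in an otherwise static frame.
day_prev = np.full((120, 160), 128, dtype=np.uint8)
day_curr = day_prev.copy()
day_curr[50:70, 50:70] = 148   # object only 20 gray levels brighter

# Sensitive settings: night noise triggers constantly (false alarms).
print(motion_detected(night_prev, night_curr, pixel_thresh=10, area_thresh=0.01))  # True
# Desensitized until the night alarms disappear...
print(motion_detected(night_prev, night_curr, pixel_thresh=30, area_thresh=0.05))  # False
# ...and now the genuine daytime event is missed too.
print(motion_detected(day_prev, day_curr, pixel_thresh=30, area_thresh=0.05))      # False
```

The numbers are arbitrary, but the shape of the problem is not: one global sensitivity knob is being asked to separate two very different signal regimes.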

Another issue is that the video requires a lot more bandwidth when there is a lot of noise in the scene. This meant that at night, all the cameras would trigger motion at the same time, and the video would take up the maximum bandwidth allocated.

[Graph: video bandwidth over several days; the high-bandwidth areas are night-time recordings]
[Graph: 24-hour zoom]

Notice that the graph above reaches the bandwidth limit set in the camera’s configuration and then seemingly drops through the night. This is because the camera switches to black-and-white mode, which requires less bandwidth. Then, in the morning, you see a spike as the camera switches back to color mode, after which the bandwidth drops off dramatically during the day.

Sticking this in a RAM-based pre-buffer won’t help. You’ll be recording noise all through the night, from just about every camera in your system, completely bypassing the RAM buffer. So you’ll see a large number of channels trying to record high-bandwidth video at the same time – which is the worst-case scenario.

Now you may have the best server-side motion detection available in the industry, but what good does it do if the video is so grainy you can’t identify anyone in it (it’s a human – sure – but which human?).

During the day (or in well-lit areas), the RAM buffer will help. Most of the time, the video will be sent over the network, reside in RAM for 5, 10, or 30 seconds, and then be deleted, never to be seen again – ever. This puts zero load on disk IO and is basically the way you should do this kind of thing.
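The mechanics of that RAM pre-buffer can be sketched as a simple time-bounded ring: frames live in memory for the pre-buffer window and are silently dropped, and only an event flushes them to disk. This is my own minimal sketch, not Milestone’s actual implementation; the class and parameter names are made up for illustration.

```python
from collections import deque

class RamPreBuffer:
    """Minimal RAM pre-buffer sketch: frames live in memory for
    `seconds` and are dropped unless an event flushes them.
    Disk IO only happens on flush, so idle cameras never touch disk."""

    def __init__(self, seconds=10):
        self.seconds = seconds
        self.frames = deque()          # (timestamp, frame_bytes)

    def push(self, timestamp, frame):
        self.frames.append((timestamp, frame))
        # Expire frames older than the pre-buffer window.
        while self.frames and timestamp - self.frames[0][0] > self.seconds:
            self.frames.popleft()      # deleted, never to be seen again

    def flush_on_event(self, write_to_disk):
        """On an event, hand the buffered pre-roll to the recorder."""
        for ts, frame in self.frames:
            write_to_disk(ts, frame)
        self.frames.clear()

buf = RamPreBuffer(seconds=10)
for t in range(30):                    # 30 s of 1 fps video, no events
    buf.push(t, b"frame")
print(len(buf.frames))                 # only the last ~10 s are held

recorded = []
buf.flush_on_event(lambda ts, frame: recorded.append(ts))
print(recorded[0], recorded[-1])       # pre-roll ends at the event time
```

The point of the design is visible in `push`: expiry is a cheap in-memory `popleft`, so uneventful video costs no IO at all, while an event still yields the seconds of context that “record on event” alone would miss.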

But this raises the question – do you really want to do that? You are putting a lot of faith in the system’s ability to determine what might be interesting now, and possibly later, and in your own ability to configure the system correctly. It’s very easy to see when the system is creating false alarms; it is something entirely different to determine whether it missed something. The first problem is annoying; the latter makes your system useless.

My preference is to record everything for 1 to 3 days and rely on external sensors for detection, which then also determine what to keep in long-term storage. This way, I have a nice window to go back and review the video if something did happen, and then manually mark the pre-buffer for long-term storage.

“Motion detection” does provide some meta-information that can be used later when I manually review the video, but relying 100% on it to determine when and what to record makes me a little uneasy.

But all else being equal, it is an improvement…


When the UI Masks “Real” Issues

Let’s be clear: UI issues are real issues and should be treated as such, but they are different in the sense that a poor UI can be usable, given enough practice and training, whereas technical issues (such as crashes) will render a product useless.

For the NVR writer, the test is pretty simple. The NVR must acquire video and audio when instructed to do so, and it should be able to deliver the acquired data to the end user whenever they feel compelled to gaze at it. Sure, there are plenty more things the NVR needs to do reliably, but the crux is that the testing is pretty static and deterministic.

Enter the grey area of analytics – it is much more difficult to test an analytics solution. Sure, we could set up a very wide range of reference tests that might be used to measure the quality of the algorithms (assuming the thing does not crash), but these tests might be “gamed” by the vendor, or the number of required tests would make the testing prohibitively expensive. I am sure there would be a market for an analytics test generator, simulating fog, rain, reflections, shadows, vibration and so on ad infinitum at almost no cost, but that’s beside the point.

My concern is that sometimes the actual, real performance is masked by a poor UI; the operator blames the poor performance not on the algorithms but on themselves, for not configuring, calibrating and nursing the system enough. When the vendor’s techies come on site, they are always able to tweak the system to capture the events – “see, you just have to mask out this area, set up this rule, disable this other rule” and so on. The problem is that two days later, when it’s raining AND there is fog, the analytics come up short again. You tweak the rules, and now you’ve broken the setup for reflections and shadows, and so it goes on and on.

This dark merry-go-round of tweaking is sometimes masked by the poor (crap?) UI. So this brings us to my argument: if the UI were stellar, logical, easy to use, intuitive and all that, then the operator would quickly realize that the system is flawed and should be tossed. But if the UI is complex, weird and counterintuitive, it takes longer for the end user to realize that something is fundamentally wrong, and subsequently the vendor might stay in business for longer.

Sure, at times things are just inherently complex (the complexities of a program can never be removed by replacing text entry with drag-and-drop). NAT setups and port forwarding are always a little difficult to handle in an elegant way, and naturally analytics do require a fair bit of massaging before they work entirely as intended – if you have realistic expectations! Which reminds me of a book I read recently on the financial meltdown: if people demand snake oil, snake oil will be provided.

End rant.


Video surveillance is now being deployed in NYC’s subway system. Far from being the first subway system to be retrofitted with surveillance gear, the MTA (NYC’s subway administration) wants to do a pilot. A sensible choice, but why not run 5 or even 10 pilots at once? Let the vendors install their systems on 10 different sets of cars; perhaps the MTA could stage 10 or 20 “incidents” that the operators would need to find in the database, or respond to if real-time deployment is the purpose.

Read more about the project here

Video Analytics

ObjectVideo has a blog, which I find refreshing (Genetec has one too, and so does Exacq, and I am sure plenty of other vendors have one as well). In this post they talk about legislation being a hindrance to the application of new technology; the post was picked up by John Honovich, who asks some questions about the 98% figure in this article.

John raises a good point – what does a 98% success rate mean exactly? If the system fails miserably in the last 2% (1.3% according to OV’s own numbers), then 98.7% might justifiably be totally unacceptable. Another argument could be that a benchmark was set, and OV flat out failed to beat that number.
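A bit of back-of-the-envelope arithmetic shows why a single headline percentage says so little. The numbers below are entirely hypothetical (they are not OV’s operating figures): when genuine events are rare and decisions are frequent, a 1.3% error rate on the negatives can swamp operators with false alarms even while the “success rate” sounds excellent.

```python
# Hypothetical numbers, for illustration only: a site with rare genuine
# events but many automated decisions per day.
events_per_day = 5                  # genuine incidents across the site
detection_rate = 0.987              # the headline "98.7% success"
false_alarm_rate = 0.013            # 1.3% of negative decisions raise an alarm
decisions_per_day = 64 * 24 * 60    # 64 cameras, one decision per minute

missed_per_day = events_per_day * (1 - detection_rate)
false_alarms_per_day = (decisions_per_day - events_per_day) * false_alarm_rate

print(f"missed events/day: {missed_per_day:.3f}")        # sounds negligible
print(f"false alarms/day:  {false_alarms_per_day:.0f}")  # ~1200 - operators tune it out
```

And this still says nothing about *which* 1.3% fails: if the errors all cluster in fog, rain or night-time footage, the system is near-useless exactly when it matters, regardless of the average.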

In the medical industry, I was stunned when I realized how benchmarks were met: a reference study was done, and the algorithms were tweaked so that the numbers met the criteria – but on the same set of samples! How the system would perform on a new set of samples was largely unknown. The same thing seems to apply to some analytics systems – the system is tweaked to beat the numbers in the trial phase, only to fail miserably when the conditions change. Test in the summer, and be prepared for failure when the leaves start to fall; test during low season, only to fail miserably when the tourists start flooding the entrances.

Frequently, the performance of analytics systems is wildly oversold (remember facial recognition?), so the cards are stacked against the integrator even before the system is deployed. Once in place, there are a plethora of parameters that affect the performance – horsepower, frame rate, compression, lighting, weather, color calibration, network performance and so on. Raise the frame rate to make object tracking more reliable, and the bandwidth just went up; place the analysis on the edge, and you can discard all those old cameras already installed. The truth is that the system has to be designed around analytics from the get-go, and more often than not, it isn’t.

ObjectVideo has a good product; it’s just that the expectations have to be adjusted a little bit – not necessarily the benchmarks.