NVR Appliances Will Change The Landscape

In the good old days, if you were a VMS vendor, you’d offer 2 things : a little database engine, and a client to view video (live and recorded). In return, the camera guys would fret over sensor sizes and optics, but not worry about how video was stored and retrieved.

The core competency of the VMS guys was writing code that ran on a PC, while the camera guys would get their code to run on a small embedded platform.

What we are seeing now is that the lines are getting blurry. Milestone offers code that the camera guys can embed in their systems, and the camera guys now offer small databases embedded directly in the cameras, and obviously there’s a client to view the video too.

From an architectural point of view, the advances in camera hardware capabilities open up a whole host of interesting configurations. In a sense, the cameras are participating in a giant BYOB party, only, instead of beer, the cameras bring processing power and storage. Previously, the host would have to buy more “beer” (storage and processing) as more “guests” (cameras) arrived. Now, the host only needs enough physical room (switches and power).

Cameras now have a sufficiently high level of sophistication that they can act almost completely autonomously. While my private setup does require a regular server and storage – somewhere – the camera pretty much cares for itself. It does motion detection, and decides on its own when to store footage. When it does store something, it notifies me that it did, and I can check it out from anywhere in the world. The amount of processing needed on the PC is minuscule.

So how do NVR appliances fit in?

A few years ago, the difference between a regular PC and an “appliance” was that the appliance came with Windows XP Embedded. That was pretty much it. The hardware and software were almost 100% identical to the traditional PC platform, but the cost was usually substantially higher (even though an XP Embedded license is actually cheaper).

Now, though, we are seeing storage manufacturers offer NAS boxes with little *nix kernels that can connect to IP cameras, and store video. There’s usually a pretty cumbersome and somewhat slow web interface to access the video, but no enterprise level management tool to determine who gets to see what, and when.

The advanced DIY user can buy one of these boxes, hook up a bunch of cheap Dahua cameras, and sleep like a baby, not worrying about Windows Update wreaking havoc during the night. As a cheapskate, I can live without the plethora of features and configuration options that a traditional VMS offers. It’s OK that it is a little bit slow and cumbersome to extract video from the system, because I only very, very rarely need to do so. I am not going for the very cheap (and total crap) solutions that Home Depot and Best Buy offer. Basically, if it doesn’t work, it’s too expensive.

Further up the pyramid though, we have the medium businesses. Retail stores, factories, offices, places where you don’t DIY, but instead call a specialist. And this, I think, is where the NAS boxes might make a dent in the incumbent VMS vendors’ revenue stream.

The interests of a DIY’er and the specialist align, in the sense that both want a hassle-free solution. The awesome specialist is perfectly capable of buying the parts and building and configuring a PC, but to be honest, after tallying the cost and the frustration if things don’t work as expected, I think I’d pick the slightly more expensive, but hassle-free solution, even if it didn’t support as many different cameras as the hyper-flexible solution that I am used to. I do not see an advantage to being able to play Hearts on my video surveillance system.

The question, then, is this: for how long will the specialist also sell a traditional VMS to go along with the NAS? How long before the UI of the NAS becomes so good that you really don’t need/want a VMS to go along with it? Some of the folks I’ve spoken to show the fancy VMS, but advise and prod the customer in the direction of a less open solution. Any of their employees can install and maintain (replace) the appliance, but managing the VMS takes training, in some cases certification, and generally takes longer to deploy.

We are not there yet, but will we get there? I think so.


Thoughts on Low Light Cameras

For security applications, I don’t rely on a camera’s low light performance to provide great footage. I recently installed a camera in my basement, and naturally I ran a few tests to ensure that I would get good footage if anyone were to break in. After some experimentation, I would get either

  • Black frames, no motion detected
  • Grainy, noisy footage triggering motion recording all night
  • Ghosts moving around

The grainy images were caused by the AGC (Automatic Gain Control). It simply multiplies the pixel values by some factor, which amplifies everything – including noise. Such imagery does not compress well – so the frames take up a lot of space in the DB, and may falsely trigger motion detection. Cranking up the compression to solve the size problem is a bad solution, as it will simply make the video completely useless. I suppose that some motion detection algorithms can take the noise into consideration, and provide a more robust detection, but you may not have that option on your equipment.

Black frames are due to the shutter time being too short, and AGC being turned off. On the other hand, if I lower the shutter speed, I get blurry people (ghosts) unless the burglar is standing still, facing the camera, for some duration (which I don’t want to rely upon).

A word of caution

Often you’ll see demonstrations of AGC, where you have a fairly dark image on one side, and a much better image on the other, usually accompanied by some verbiage about how Automatic Gain Control works its “magic”. Notice that in many cases the subject in the scene is stationary. The problem here is that you don’t know the settings for the reference frame. It might be that the shutter speed is set to 1/5th of a second, and a 1/5th of a second shutter speed is way too slow for most security applications – leading to motion blur.
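To put a rough number on that motion blur, here is a back-of-the-envelope sketch. The walking speed, scene width and image width are my own assumed values, not measurements from any particular camera, and the function name is made up for the illustration.

```python
def motion_blur_pixels(speed_m_per_s, exposure_s, scene_width_m, image_width_px):
    """Approximate smear, in pixels, of a subject moving across the frame.

    The subject travels speed * exposure metres while the shutter is open;
    that distance is converted to pixels using the scene width covered by the image.
    """
    pixels_per_meter = image_width_px / scene_width_m
    distance_m = speed_m_per_s * exposure_s
    return distance_m * pixels_per_meter


# Assumed values: a person walking at 1.5 m/s across a 4 m wide scene,
# captured by a 1280 pixel wide image.
slow_shutter = motion_blur_pixels(1.5, 1 / 5, 4.0, 1280)
fast_shutter = motion_blur_pixels(1.5, 1 / 60, 4.0, 1280)
print(f"1/5 s shutter: about {slow_shutter:.0f} px of smear")
print(f"1/60 s shutter: about {fast_shutter:.0f} px of smear")
```

Under these assumptions the slow shutter smears a walking person across roughly a hundred pixels, which is more than enough to wipe out facial features, while the stationary subject in the demo would look perfect.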

During installation of the cameras, there are two common pitfalls

  1. Leave shutter and gain settings to “Auto”
  2. Manually set the shutter speed with someone standing (still) in front of the camera

#1 will almost certainly cause problems in low light conditions, as the camera turns the shutter waaay down, leading to adverse ghosting. #2 is better, but your pal should be moving around, and you should look at single frames from the video when determining the quality. This applies to live demos too : always make sure that the camera is capturing a scene with some motion, and request an exported still frame, to make sure you can make out the features properly.

A low tech solution

What I prefer is for visible light to turn on when something is amiss. This alerts my neighbours, and hopefully it causes the burglar to abort his mission. I also have quite a few PIR (Passive InfraRed) sensors in my house. They detect motion like a champ, in total darkness (they are – in a sense – a 1 pixel FLIR camera), and they don’t even trigger when our cat is on the prowl.

So, if the PIR sensors trigger, I turn on the light. Hopefully, that scares away the burglar. And since the light is on, I get great shots, without worrying about AGC or buying an expensive camera.

The cheapest DIY PIR sensors are around $10. You’ll then need some additional gear to wire it all together, but if you are nerdy enough to read this blog, I am pretty sure you already have some wiring in the house to control the lighting too.

So – it’s useless – right?

Well, it depends on the application. I am trying to protect my belongings and my house from intruders, that’s the primary goal. I’d prefer if the cameras never, ever recorded a single frame. But there are many other types of applications where low light cameras come in handy. If you can’t use visible light, and the camera is good enough that it can save you from using an infrared source, then that might be your ROI. All else being equal, I’d certainly choose a camera with great low light capabilities over one that is worse, but rarely are things equal. The camera is more expensive, perhaps it has lower resolution and so on.

Finally a word on 3D noise reduction

Traditionally noise reduction was done by looking at adjacent pixels. Photoshop has a filter called “despeckle” which will remove some of the noise in a frame. It does so by looking at other pixels in the frame, and so we get 2 dimensions (vertical and horizontal). By looking at frames in the past, we can add a 3rd dimension – time – to the mix (hence 3D). If the camera is stationary, the algorithm tries to determine if a change of pixel value between frames, is caused by true change, or because of noise. Depending on a threshold, the change is either discarded as noise, or kept as a true change. You might also hear terms such as spatiotemporal, meaning “space and time” – basically, another way of expressing the same as 3D noise reduction.
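The thresholding step described above can be sketched in a few lines. This is a deliberately naive illustration, not how any specific camera does it: the function name, the threshold value and the tiny grayscale frames are all made up, and real implementations typically blend values rather than making a hard keep/discard decision.

```python
def temporal_denoise(previous, current, threshold):
    """Filter the current frame using the previous (already filtered) frame.

    Frames are lists of rows of integer grayscale values. A pixel whose value
    changed by no more than `threshold` is treated as noise and keeps its old
    value; a larger change is treated as a true change and passes through.
    """
    output = []
    for prev_row, cur_row in zip(previous, current):
        row = []
        for prev_px, cur_px in zip(prev_row, cur_row):
            if abs(cur_px - prev_px) <= threshold:
                row.append(prev_px)  # small flicker: discard as noise
            else:
                row.append(cur_px)   # big jump: keep as a real change
        output.append(row)
    return output


# Made-up 2x3 frames: most pixels only flicker, one pixel really changes.
previous = [[100, 100, 100], [100, 100, 100]]
current = [[103, 98, 100], [100, 180, 101]]
filtered = temporal_denoise(previous, current, threshold=8)
print(filtered)
```

The trade-off sits in the threshold: set it too low and the noise survives, set it too high and slow or low-contrast movement gets filtered away (more ghosts).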

 

Default Passwords and ONVIF

Update:
Before you judge me borderline insane, in this post, I am talking about FACTORY DEFAULT passwords, for example : mobotix

My darling ONVIF, you’ve come of age and I tried to woo you. But you turned me down. Again.

A while back I decided to take the plunge, and have some fun with ONVIF. Axis knows I’ve been very happy with ONVIF’s older, and more mature, sister VAPIX. VAPIX is damn nice. But there’s a certain allure to ONVIF, the young, promiscuous rebel, and I wanted to see if I could tame her too.

ONVIF provides a handy mechanism for detecting ONVIF cameras on the local subnet. Easy peasy.  Got all the cameras in a jiffy. Next step was to get some attributes about each camera. And suddenly the approachable darling turned out to be an outright bitch.

Usually, using a web service is a one-two-three step process. Very simple, which is important if you want any sort of penetration. Unfortunately, the camera in question decided that I wasn’t worthy of a response. Usually, I would have given up, but I was in a fighting mood, so after a few hours of searching high and low, I found a piece of code that would allow me to authenticate properly with the camera. That, in my opinion, is fail #1. I doubt that there would be any way for me to figure out what the hell was wrong by looking at the authentication failure error code, and it’s not as if the ONVIF site makes it clear either. Now that I spent a day looking for it, I am going to be an asshole and not share the solution until my own thing is good and done.

A small part of my problems is that I used the root account to access the camera. The root user (built in to Axis cameras) is not an “ONVIF user”. I can – apparently – create an ONVIF user by using the root credentials and some ONVIF WSDL, but I haven’t tried that yet. My workflow would then be: detect cameras, then connect to the camera to get capabilities using some user-supplied credentials (say onvif_user:1234). Now that may fail, because the user hasn’t been created yet, so I will now have to use the VAPIX root account (which the user also has to supply the password for) to create the onvif_user account. THEN I will be able to finally do ONVIF. But it’s a damn mess from the user perspective. Especially because it’s a really bad idea to have the same root password on all the cameras.

It seems to me that the lack of an ONVIF default user is a problem.

Ideally, you’d plug in your ONVIF cameras, and the DHCP server gives them an IP with a long lease. We then find the cameras on the network using the default credentials. Once you decide to import a camera, the NVR server should change the camera’s password and store that in an encrypted file. This way the cameras are easy to install, and you maintain security.
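As a rough sketch of that import flow, assuming a default user existed: everything below is hypothetical. FakeCamera stands in for a real camera client, its methods are not ONVIF or VAPIX calls, the “ExampleCam” defaults are invented, and a plain dictionary takes the place of the encrypted file.

```python
import secrets

# Hypothetical factory defaults; real values differ per vendor and model.
FACTORY_DEFAULTS = {"ExampleCam": ("admin", "admin")}


class FakeCamera:
    """Stand-in for a real camera client. Every method here is made up."""

    def __init__(self, vendor, serial, username, password):
        self.vendor = vendor
        self.serial = serial
        self._username = username
        self._password = password

    def login(self, username, password):
        return (username, password) == (self._username, self._password)

    def set_password(self, username, old_password, new_password):
        if not self.login(username, old_password):
            raise PermissionError("bad credentials")
        self._password = new_password


def import_camera(camera, credential_store):
    """Log in with the factory default, rotate to a random password, remember it."""
    username, default_password = FACTORY_DEFAULTS[camera.vendor]
    if not camera.login(username, default_password):
        raise PermissionError("camera is not using its factory default")
    new_password = secrets.token_urlsafe(16)
    camera.set_password(username, default_password, new_password)
    # A real NVR would persist this in an encrypted file, not a dict in memory.
    credential_store[camera.serial] = (username, new_password)
    return new_password


camera = FakeCamera("ExampleCam", "SN-0001", "admin", "admin")
store = {}
import_camera(camera, store)
print("default still works:", camera.login("admin", "admin"))
```

The point of the sketch is the ordering: the default credentials are only ever used once, at import time, and every camera ends up with its own random password that the installer never has to type into a spreadsheet.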

The way it works now is too cumbersome and error prone, and it doesn’t scale too well. I don’t want my users to fiddle with spreadsheets for each installation.

I’ve created a small page where you can, if you like, see and add default credentials for various cameras.

List of default usernames and passwords

Let’s work together and make ONVIF viable.

HD vs PTZ

I have to agree with the Luddites. Analog PTZ is far superior to IP MP Video. Especially if you need to really zoom in on tiny details, and you have a manned system…

…But that’s like judging a fish by its ability to climb trees (as Einstein supposedly said once).

Let’s flip it around, and ask how well an analog PTZ camera can look in two directions at once. Let’s ask if analog PTZ can do a tour at 90 degrees per second, 24-7-365 without breaking. Let’s try and do analog PTZ in Houston, from an office in New York on a shoestring. Let’s ask if we can change our minds and zoom in on a different area after the video was recorded.

Now, if an IP based optomechanical PTZ camera was given the same budget as the cost of wiring an analog one, then you would not be able to tell them apart at all. And I am guessing you don’t really need to spend the full budget to get equal performance – you can probably get good (perhaps not as good) performance a lot cheaper.

But what if you really wanted to replace a mechanical PTZ with a MP camera?

I guess a lot of installations were getting mechanical PTZ’s in the past because there was no other choice. Now there is. It’s fairly cheap to install 3 fixed cameras vs. 1 PTZ (simply because you don’t need to pull 3 cables all the way back to the recorder – you can pop in a POE switch and cluster the 3 cameras). If you then put in 3 decent cameras you are golden. You even get to see things from 3 vantage points – something PTZ will never do. Even if a vandal breaks one, you still have 2 others that are recording.

It is true that right now, the cost of 2 additional camera licenses is a burden, but I think that cost will come down dramatically over the next 24 months.

But a mechanical PTZ camera is really equivalent to a Gigapixel camera. If you do the maximum zoom level, and do a full pan-tilt of the area you get a huge resolution. If you were monitoring highways it would make sense to have an optical PTZ at the intersections which would allow you to zoom in much more than the MP would ever let you.
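Here is the arithmetic behind the “gigapixel” claim, as a rough sketch. All the numbers are assumptions picked for illustration (a 360 by 90 degree sweep, a 2 by 1.5 degree field of view at full zoom, a 704×480 analog frame), and the flat tiling ignores lens distortion and overlap between neighbouring views.

```python
def ptz_equivalent_pixels(pan_range_deg, tilt_range_deg,
                          hfov_deg, vfov_deg, width_px, height_px):
    """Pixels needed by one fixed camera to match a full PTZ sweep at max zoom.

    The sweep is treated as a flat grid of non-overlapping views, each one
    hfov_deg x vfov_deg in size and width_px x height_px in resolution.
    """
    views_across = pan_range_deg / hfov_deg
    views_down = tilt_range_deg / vfov_deg
    return views_across * views_down * width_px * height_px


# Assumed values, not taken from any datasheet.
total = ptz_equivalent_pixels(360, 90, 2.0, 1.5, 704, 480)
print(f"about {total / 1e9:.2f} gigapixels")
```

With these assumed numbers a humble 704×480 PTZ covers the scene at the equivalent of several gigapixels, just not all at the same time, which is exactly the catch.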

Another disadvantage to HD cameras is that they take up a lot of space and they require a lot of processing power to decode. This is mostly an issue for the client side developer (as we need to decode the frames to show them to you 😉 ), but an issue nonetheless. Some people will just compress the crap out of those feeds, but that totally negates the purpose. You might as well use a lower resolution camera then. Sometimes the framerate gets lowered to the point where you might as well be looking at a slideshow – but that might just be good enough for the user.

So I think Todd Rockoff is correct. HD and PTZ are complementary.

Motion Detection on the Edge

When we design a surveillance system, we need to carefully consider how we allocate resources and distribute workloads. When you add a camera to an NVR, the most common use is to reduce the camera to a fairly dumb “video transmitter” and then let the server do the heavy lifting.

But even if the server is much, much more powerful than your humble IP camera, it is usually taxed with a lot of work. One of the tasks the server routinely carries out is to do what some folks call “motion detection”. The term is usually misleading as the NVR is not really detecting motion at all. It is detecting “changes in the frame”, which could be noise, lighting changes, a transition from color to B/W and so on, none of which is related to what we understand as “motion”. Analytics engines look at differences too, but they are truly looking for “motion” and not JUST changes.

Looking for changes is usually “good enough”, and does not need to be any more than that. And if looking for “change” is what you need, then you really should let your camera do the work and free up the NVR to do more important things.
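To make the distinction concrete, here is a minimal sketch of the kind of “change detection” described above. It is my own naive illustration, not the algorithm of any particular NVR or camera; the function names, thresholds and tiny grayscale frames are all assumptions.

```python
def changed_fraction(previous, current, pixel_threshold):
    """Fraction of pixels whose value differs by more than pixel_threshold."""
    changed = 0
    total = 0
    for prev_row, cur_row in zip(previous, current):
        for prev_px, cur_px in zip(prev_row, cur_row):
            total += 1
            if abs(cur_px - prev_px) > pixel_threshold:
                changed += 1
    return changed / total if total else 0.0


def change_detected(previous, current, pixel_threshold=20, area_threshold=0.05):
    """Flag the frame when enough of its pixels have changed."""
    return changed_fraction(previous, current, pixel_threshold) >= area_threshold


# Made-up 4x4 frames: a dark room, then the same room with the light on.
dark_room = [[10] * 4 for _ in range(4)]
light_on = [[90] * 4 for _ in range(4)]
print(change_detected(dark_room, light_on))   # nothing moved, yet it triggers
print(change_detected(dark_room, dark_room))  # identical frames, no trigger
```

Nothing in the scene moved when the light came on, but the detector fires anyway. That is fine if “something changed” is all you need to know, and it is cheap enough to run in the camera.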

The reason we initially decided to analyze the frames for changes was really motivated by storage problems. A common HDD in those days was 200-300 MB, the 640×480 frames were considered “high resolution” and the format was always MJPEG. Naturally, the Axis 200+ could not deliver these crisp HD feeds at anywhere near 30 FPS. 3-5 FPS was usually all you could get. But storing this massive amount of data became a problem, so we decided to discard frames that were almost identical.

Naturally, as time passed we got higher resolutions and higher framerates, and we were suddenly able to do MPEG4 encoding on a consumer device – in real time!!! MPEG4 and H.264 actually look at two successive frames in much the same way we do on the NVR. The codec simply “throws away” the redundant information just as we do. Except the codec is throwing away just the parts of the frame that are similar to the previous one, while preserving only the changes – a much, much better way of doing things.

For the codec to figure out what to throw away, it must look at two successive frames. If they are very similar, it can throw away a lot; if they are very different, it needs to send almost all the pixels. On top of that, H.264 does a lot of other things before the video is sent across the network. These involve, among other things, discrete cosine transformation, quantization and entropy coding (Huffman-style variable-length codes).

It does not seem like a far stretch that the codec implementation could provide a number that tells the camera how much 2 frames are alike. And in a primitive way it actually does – if the frame is large in terms of bytes, then we can deduce that the frames are very different, if the frames are small, then they are very similar. Naturally this is too crude and would not work on CBR feeds, and there is no windowing etc.
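The crude frame-size heuristic can be sketched like this. It assumes a VBR stream and a list of sizes for the difference (P) frames only, with keyframes already filtered out; the sizes, window length and ratio are made-up values, and the function name is my own.

```python
def flag_large_frames(frame_sizes, window=5, ratio=2.0):
    """Return indices of frames much larger than the recent average.

    Each frame size (in bytes) is compared with the mean size of the
    `window` frames before it; a frame is flagged when it exceeds that
    mean by more than `ratio` times.
    """
    flagged = []
    for i in range(window, len(frame_sizes)):
        baseline = sum(frame_sizes[i - window:i]) / window
        if frame_sizes[i] > ratio * baseline:
            flagged.append(i)
    return flagged


# Made-up P-frame sizes in bytes: a quiet scene with a burst of change.
sizes = [1200, 1100, 1250, 1180, 1220, 1210, 5400, 5100, 1300, 1190]
print(flag_large_frames(sizes))
```

With these numbers the two large frames stand out clearly. On a CBR feed the encoder evens out the sizes, so the signal disappears, and a single number per frame says nothing about where in the image the change happened.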

Nor does it seem totally unreasonable that the codec implementation could give the “difference parameter” for each macroblock (a small 16×16 pixel block). It is important to understand that the codec already is doing the computation, we are just asking to get to peek at the result. Furthermore, the codec is also working on the crisp uncompressed frames that have the highest level of fidelity, and no information has been thrown away.

In naive implementations like the one I describe here, there is not a lot to be gained from working on the raw frames in the camera, but ask any analytics vendor if they would prefer to work on the video BEFORE or AFTER compression and the answer will uniformly be the same : BEFORE compression. So while the benefit is not huge, it is not completely without merit.

To do the detection on the NVR, the NVR will have to completely reverse the process: take the entropy-coded symbols, expand them into frequency coefficients, go from the frequency to the spatial domain, and only then can you start to think about examining the frames. You can then play all sorts of tricks – perhaps you only look at every Nth pixel, perhaps you don’t look at every frame, perhaps you get a lot of noise from too heavy compression, perhaps you don’t. Every single trick lowers the “quality” of the detection. Perhaps the client doesn’t care, even with severe degradation of the quality, and that’s fine by me. I am focused on improving and providing better, more efficient solutions and offering them to the ones who appreciate such things.

The point is this – spending a lot of resources decoding an H.264 stream, to get information that could have been gathered almost for free in the camera, is not my idea of efficient allocation of the resources. It is like rejecting a free apple, only to ride 30 miles to the store to buy the same, exact apple, only now it is slightly bruised from transporting it to the store – AND it takes a lot of effort to unwrap the apple.

In time, an NVR will not need to do much, in fact, I expect an NVR to be very similar to a NAS. Cheap, easy to replace, and very scalable. This will require that the cameras become a little more advanced, but my experience tells me that progress doesn’t just stop. We were amazed by 640 x 480 at 4 fps when I started, and just as we laugh at that today, we will laugh at NVR side change detection 10 years from now.

I suspect that a lot of cameras do not have the fine grained control over the encoding process that is needed here. I assume that they are picking off-the-shelf H.264 encoders or reference designs offered by the chip manufacturers. For such cameras, there might not be a simple way to do on-board processing, and doing so may jeopardize the performance of the camera – for those, you will have to spring for the expensive PC’s.

Start preparing for the change 🙂

Pros and Cons of Web Interfaces in Video Surveillance Applications

Wow – longest headline ever.

A very common request is a web-based interface to the video surveillance system. An often used argument is that the end user won’t have to install anything, and that the system is readily available from a variety of platforms, after all – google.com works on macs, PCs, my 5 year old cell phone and my wife’s spanking new iPhone*

Most people are probably familiar with ActiveX controls that are needed when streaming various video formats from a camera to a web browser. While you may not think that you are “installing” anything (since the ActiveX or plugin does not necessarily appear in the Add/Remove programs window), you actually DID. A piece of executable code was downloaded and written to your hard drive, not unlike downloading and running a regular installer. ActiveX controls may require numerous support DLLs, which will be downloaded on demand. So even if the installation method is a little different for ActiveX, you are technically installing something on the machine.

The ActiveX controls are platform dependent (you can’t use a windows control on a mac), and they present a security risk unless managed carefully, but then there are Java Applets. These are sort of platform independent,  but can be (always are) a little slower than ActiveX. Adobe Flash is another option, but it won’t work on my wife’s iPhone, the same goes for Silverlight.

Although the second part of the argument is technically true, there are some costs to bear; although getting text and static images on the screen using baseline HTML is trivial, interactivity and streaming video are a different beast altogether. A commonly used technique is AJAX, which pretty much boils down to issuing requests asynchronously to a server using an XML request object (XMLHttpRequest), but that object differs from browser to browser, so you need to write two different pieces of code to accomplish the same feat – on the same OS! Granted, the handling of the different browsers is well documented, and libraries exist that help the developer overcome these annoyances, but for all intents and purposes, we have just doubled the number of platforms (IE and “everybody else”). The same applies to CSS, and even PNG handling.

Some companies will happily put together a “web solution”. But if you are still pretty much locked into Windows, IE, and you STILL need to install a bunch of ActiveX controls, what’s the point? Often the web solution is a little less useful than the traditional Windows application since the developers are limited to the browser’s capabilities, whereas the old-skool application can pull all the levers in the system.

Recently Adobe added GPU accelerated video playback to Flash, and HTML5 is supposed to support H.264. Javascript is now very fast on a wide range of browsers (IE 9 was shown at MIX10 and looks promising, Chrome has always had fast JS). So perhaps a viable solution for desktop PC’s and macs will be available before too long.

*actually she has a Nokia phone, but I needed to add the iPhone in there somehow.