Lambs to the Slaughter

When lambs are loaded onto trucks bound for the slaughterhouse, I doubt the driver, the farmer, or anyone else tells them where they’re going.

I wonder what the lambs are thinking.

If they could talk, some would probably say “we’re doomed”, and others would stomp on them and say “shut the hell up, can’t you see everyone is getting anxious?” or “why can’t you be more positive?”

Maybe anxiety is nature’s way of telling you that something bad may be coming your way. If you’re in a rustling truck, driving far away from the farm, it’s appropriate to feel anxious. It’s telling you to be aware of what’s going on, and to think of an escape route.

The lambs see the butcher, but they don’t know what they’re looking at. The guy is not going to scream “I’M HERE TO KILL YOU ALL”, he’ll whisper reassuringly “come here little friend, follow me”.

Don’t listen to him.

Run away.


Open Systems and Integration

Yesterday I took a break from my regular schedule and added a simple, generic HTTP event source to Ocularis. We’ve had the ability to integrate with IFTTT via the Maker Channel for quite some time, which lets you trigger IFTTT actions whenever an event occurs in Ocularis. Soon, you will be able to trigger alerts in Ocularis via IFTTT triggers as well.

For example, IFTTT has a geofence trigger, so when someone enters an area, you can pop the appropriate camera via a blank screen. The response time of IFTTT is too slow, and I don’t consider it reliable enough for serious surveillance applications, but it’s a good illustration of the possibilities of an open system. Because I am lazy, I made a trigger based on Twitter; that way I did not have to leave the house.

Making an HTTP event source did not require any changes to Ocularis itself; it could be trivially added to previous versions if one wanted to. But even though we have a completely open system, it doesn’t mean that people will utilize it.



TileMill and Ocularis

A long, long time ago, I discovered TileMill. It’s an app that lets you import GIS data, style the map and create a tile-pyramid, much like the tile pyramids used in Ocularis for maps.


There are two ways to export the map:

  • Huge JPEG or PNG
  • MBTiles format

So far, the only supported mechanism of getting maps into Ocularis is via a huge image, which is then decomposed into a tile pyramid.

Ocularis reads the map tiles the same way Google Maps (and most other mapping apps) reads the tiles. It simply asks for the tile at x,y,z and the server then returns the tile at that location.
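The addressing arithmetic is easy to sketch. The helper below shows it; the URL pattern is hypothetical, since every tile server has its own scheme.

```javascript
// At zoom level z, a tile pyramid has 2^z tiles along each axis
// (so 4^z tiles in total at that level).
function tilesPerAxis(z) {
  return 2 ** z;
}

// Build a request URL for the tile at (x, y, z).
// The pattern here is made up for illustration.
function tileUrl(base, z, x, y) {
  return `${base}/${z}/${x}/${y}.png`;
}
```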

We’ve been able to import Google Map tiles since 2010, but we never released it for a few reasons:

  • Buildings with multiple levels
  • Maps that are not geospatially accurate (subway maps for example)
  • Most maps in Ocularis are floor plans; going through Google Maps is an unnecessary extra step
  • Reliance on an external server
  • Licensing
  • Feature creep

If the app relies on Google’s servers to provide the tiles, and your internet connection is slow or goes offline, then you lose your mapping capability. To avoid this, we cache a lot of the tiles. This is very close to bulk downloading, which is not allowed. In fact, at one point I downloaded many thousands of tiles, which got our IP blocked on Google Maps for 24 hours.

Using MBTiles

Over the weekend I brought back TileMill, and decided to take a look at the MBTiles format. It’s basically a SQLite database file, with each tile stored as a BLOB. Very simple stuff, but how do I serve the individual tiles over HTTP so that Ocularis can use them?

Turns out, Node.js is the perfect tool for this sort of thing.

Creating an HTTP server is trivial, and opening a SQLite database file is just a couple of lines. So with less than 50 lines of code, I had made myself an MBTiles server that would work with Ocularis.


A few caveats: Ocularis has the Y axis pointing down, while MBTiles has it pointing up. Flipping the Y axis is simple. Ocularis also has the highest-resolution layer at layer 0, while MBTiles has that inverted, so the “world tile” is always level 0.
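The two conversions can be sketched like this (the `maxZoom` parameter is my shorthand for the deepest zoom level present in the MBTiles file):

```javascript
// MBTiles uses the TMS scheme: y grows upward, and zoom level 0 is the
// single "world tile". Ocularis expects y growing downward and the
// highest-resolution layer at index 0, so both need converting.

// Flip the y coordinate between the two schemes at zoom level z.
function flipY(z, y) {
  return (2 ** z - 1) - y;
}

// Map an Ocularis layer index to an MBTiles zoom level.
function toMbtilesZoom(layer, maxZoom) {
  return maxZoom - layer;
}

// With the SQLite file open (e.g. via the sqlite3 npm module), the
// lookup itself is a single query against the tiles table:
//   SELECT tile_data FROM tiles
//   WHERE zoom_level = ? AND tile_column = ? AND tile_row = ?
```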

So with a few minor changes, this is what I have.


I think it would be trivial to add support for ESRI tile servers, but I don’t really know if this belongs in a VMS client. The question is whether the time would not be better spent making it easy for the GIS guys to add video capabilities to their apps, rather than having the VMS client attempt to be a GIS planning tool.


One Auga Per Ocularis Base*

In the Ocularis ecosystem, Heimdall is the component that takes care of receiving, decoding and displaying video on the screen. The functionality of Heimdall is offered through a class called Auga. So, to render video, you need to create an Auga object.

Ocularis was designed with the intent of making it easy for a developer to get video into their own application. Initially it was pretty simple: instantiate an Auga instance, pass in a URL, and voilà, you had video. But as we added support for a wider range of NVRs, things became a little more complex. Now you need to instantiate an OCAdapter and log into an Ocularis Base Server, then pass the cameras to Auga via SetCameraIDispatch before you can get video. The OCAdapter, in turn, depends on a few NVR drivers, so deployment became more complex too.

One of the most common problems that I see today is that people instantiate one OCAdapter and one Auga instance per camera. This causes all sorts of problems: each instance counts as one login (which is a problem on login-restricted systems), every instance consumes memory, and memory for fonts and other graphics is not shared between instances. In many ways, I should have anticipated this type of use, but on the other hand, the entire Ocularis Client uses Heimdall/Auga as if it were a 3rd party component, and that seems to work pretty well (getting a little dated to look at, but hey…)

Heimdall also offers a headless mode. We call it video-hooks, and it allows you to instantiate an Auga object and get decoded frames via a callback, or a DLL, instead of having Auga draw them on the screen. The uses for this are legion: I’ve used the video-hooks to create a web interface, and until recently we used them for OMS too. Video analytics can use the hooks to get live video in less than 75 lines of code. Initially the hooks only supported live video, but they now support playback of recorded video too. But even when using Auga for hooks, you should only ever create one Auga instance per Ocularis Base. One Auga instance can easily stream from multiple cameras.

However, while Heimdall is named after a god, it does not have magical capabilities. Streaming 16 cameras at 5 MP and 30 fps will tax the system enormously, even on a beefy machine. One might be tempted to say, “Well, the NVR can record it, so why can’t Auga show it?” Well, the NVR does not have to decode every single frame completely to determine the motion level, but Auga has to decode everything, fully, all the way to the pixel format you specify when you set up the hook. If you specify BGR as your expected pixel format, we will give you every frame as a fully decoded BGR frame at 5 MP. Unfortunately, there is no way to decode only every second or third frame. You could go to I-frame-only decoding (we do not support that right now), but that lowers the framerate to whatever the I-frame interval allows, typically 1 fps.
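To put numbers on that: decoding 16 streams of 5 MP at 30 fps into 3-byte BGR pixels works out to roughly 7.2 GB/s of raw pixel data. A quick back-of-the-envelope sketch:

```javascript
// Raw decoded output for N streams at M megapixels, F fps, B bytes/pixel.
// Back-of-the-envelope only; real decoders also burn cycles before this point.
function decodedBytesPerSecond(streams, megapixels, fps, bytesPerPixel) {
  return streams * megapixels * 1e6 * fps * bytesPerPixel;
}

const total = decodedBytesPerSecond(16, 5, 30, 3);
console.log(`${(total / 1e9).toFixed(1)} GB/s of raw pixels`); // 7.2 GB/s
```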

If you are running Auga in regular mode, you can show multiple cameras by using the LoadLayoutFromString function. It allows you to create pretty much any layout that you can think of, as you define the viewports via a short piece of text. Using LoadLayoutFromString (account required), you do not have to handle maximizing of viewports and so on; all of that is baked into Auga already. Using video hooks, you can set up (almost) any number of feeds via one Auga instance.


Granted, there are scenarios where making multiple Augas makes sense: you will certainly need one per window (hence the asterisk in the headline), and clearly, if you have multiple processes running, you’d make one instance per process.

I’ll talk about the Direct3D requirement in another post.

Wildcard Matching in Ocularis

Before you read this: yes, I am repeating myself.

Some folks might remember when Yahoo! had a folder-like structure: a hierarchy where everything was sorted according to a category, a subcategory, and a sub-subcategory. Naturally, they had a fallback wildcard search too. As it turns out, the wildcard search became quite popular. At some point, I believe the search was actually powered by Google.

Now, the filtering in Ocularis is certainly not Google, but the point is that sometimes a linear search gets you where you want to be faster than going through folders (I have folders on dh0:, but I often resort to search anyway).

Here’s a contrived example:


Camera Thumbnails

In the previous version of the administrator tool, we relied heavily on camera thumbnails. In the newest version, we have opted for a more compact tree control. We experimented with thumbnails in the tree, but the UI started to look more like an abstract painting. Sometimes you need a little visual reminder though, so we added a thumbnail panel.

In the lower left corner of any camera picker control, you will see a little triangle. The triangle pops the panel that shows a thumbnail, the camera label, and any comment you may have associated with the camera.

Here’s how it works.

Small Improvements to Ocularis Administrator

You can skip this semi-commercial plug, you have been warned!

In the view designer we added a control that allows you to create on-the-fly search filters. The way it works is that you enter a wildcard, and then store that wildcard for future use. Sounds lame, but it works pretty well.

Imagine you have a list of cameras named like so :

  • Courtyard South
  • Courtyard North
  • Entrance South
  • Entrance North

As you type – say, Courtyard – the list is filtered to just the cameras that match that string. Now, you don’t want to enter that string over and over again, so you can click “store”, and you’ve instantly created a filter group. If you later rename or add a camera and its name contains the word “courtyard”, you will be able to find it in your filter group.
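A minimal matcher in the spirit of that filter might look like the sketch below; the exact matching semantics in Ocularis may differ.

```javascript
// '*' matches any run of characters; matching is case-insensitive,
// and a plain term behaves like a substring match, as in the client.
function wildcardToRegex(pattern) {
  const escaped = pattern.replace(/[.*+?^${}()|[\]\\]/g, (c) =>
    c === "*" ? ".*" : "\\" + c
  );
  return new RegExp(".*" + escaped + ".*", "i");
}

function filterCameras(cameras, pattern) {
  const re = wildcardToRegex(pattern);
  return cameras.filter((name) => re.test(name));
}
```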

Here’s a little video of the feature in use (in a very small system)


The match is a simple wildcard match. One might think that regex would be a much better offering; regex not only offers matching, but also parsing of strings into match groups. We use regex internally to parse video URIs and the like, because we need the different parts of the string (http://, localhost, 80, /Server1, for example). But for most (not all) users, regex would simply make the experience frustrating and error-prone.

As an example of a regular expression, here’s how you might match a dotted IP.
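One common way to write it (the exact expression we use internally may differ): each octet must be 0–255, and the octet pattern is repeated four times.

```javascript
// One octet: 250-255, 200-249, 100-199, or 0-99 (no leading zeros).
const octet = "(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9]?[0-9])";
const ipv4 = new RegExp(`^${octet}(\\.${octet}){3}$`);

ipv4.test("192.168.0.1"); // true
ipv4.test("256.1.1.1");   // false
```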


ISC West 2012

65 days since my last post. Well. We have been busy (we still are), but I need to spend a little time updating my blog every once in a while. ISC West is over, and I think now is as good a time as any.

We released Ocularis-X at the show. We think it is pretty cool, although I was more than a little worried that people would just look at it and go “yeah, it’s a web client – big deal”. Behind the covers we are doing some fancy footwork to offer the client high-framerate, high-resolution video over limited bandwidth. But to the end user it looks a lot like what everyone else is showing – except we demoed it over a tethered cellphone, and not on a LAN (at the booth we had to use a wired connection because the wireless signals are just too unreliable).

What you can’t see is the flexibility that we’ve built into the server. Changing the protocol to suit Android devices took less than 5 minutes of coding. I should also mention that the transcoder is a big bowl of dogfood too – dogfood in the sense that the transcoder uses the Ocularis SDK components, so the web team was pretty much like a 3rd party developer, except they were within striking distance of yours truly, and could kick my behind when things were missing in the SDK.

Enough with the self-praise.

I spent a fair amount of time wandering aimlessly around the many booths at the show. Some booths were painfully vacant, and sometimes the presentations had nothing to do with the products at all. One company had a magician pull in the crowds. I wonder how many relevant leads that will yield. Everybody and their dog got scanned when they were watching the spectacle. Axis’s presentation was right across from ours (Brad is a freakin’ genius on stage), so there were pretty much non-stop presentations going on.

I love the Axis lightpath stuff. I think the lightpath idea is much, much more interesting than just high-megapixel cameras. One company had a banner saying 1 camera replaces 96 VGA cameras. I’d take 96 VGAs over one high-megapixel camera any day. People keep educating me on this, but I probably will never learn. If that camera goes down, by the same logic, you are losing 96 (VGA) cameras. I am not against megapixel at all; I just don’t think it is a meaningful statement. Megapixel is part of a natural evolution toward higher fidelity (although not all pixels are created equal), but one high-quality vantage point can never replace multiple vantage points of lower quality. It’s apples and oranges. The 1-billion-pixel camera reminded me of an old TED presentation about something called “Seadragon” – today it is called “deep zoom”, I believe. Pretty cool stuff.

The best thing about going to the show was getting to meet a lot of folks who use the software we wrote. Their feedback is a great source of inspiration to me – after a couple of beers, I usually get plenty of ideas, but talking to real users and dealers puts things a little more in perspective.

It was good to finally meet you all in person. Hopefully we will meet again.

Live Layouts vs. Forensic Layouts

In our client, if you hit browse, you get to see the exact same view, only in browse mode. The assumption is that the layout of the live view is probably the same as the one you want when you are browsing through video.

I am now starting to question if that was a wise decision.

When clients ask for a 100-camera view, I used to wonder why. No one can look at 100 cameras at the same time. But then I realized that they don’t actually look at them the way I thought. They use the 100-camera view as a “selection panel”: they scan across the video, and if they see something out of place, they maximize that feed.

I am guessing here, but I suspect that in live view, you want to see “a little bit of everything” to get a good sense of the general state of things. When something happens, you need a more focused view – suddenly you go from 100 cameras to 4 or 9 cameras in a certain area.

Finally, when you go back and do forensic work, the use pattern is completely different. You might be looking at one or two cameras at a time, zooming in and navigating to neighboring cameras quite often.

Hmmm… I think we can improve this area.

Manual Gain Control on the Client

Another day, another experiment.

A question was brought up on one of the LinkedIn forums that I follow. I am no fan of Automatic Gain Control (AGC) as it is implemented today (I’m no fan of a lot of things, it seems). The reason is that a lot of AGC implementations seem pretty naive: they just multiply each pixel by a value determined by averaging the frame. I have not come across a system that will allow you to apply a different gain value to different parts of the frame.

AGC introduces a lot of noise into the frame. This in turn a) causes the compression to go to shit, and b) wreaks havoc on the motion detection. A lot of customers will pretty much be recording noise all night.

So why not do it on the client application?

Here’s a quick example I cooked up. I am using a couple of soft-buttons, and since we are using the GPU for rendering the video, the cost of doing the multiplication is almost zero (the GPU was built for this sort of thing, so why not use it?).
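Conceptually, the shader work amounts to a multiply and a clamp per channel. The sketch below shows that math on the CPU for illustration; in the client, the same multiply runs in the pixel shader.

```javascript
// Apply a manual gain to an array of 8-bit channel values:
// multiply, round, and clamp to the 0-255 range.
function applyGain(pixels, gain) {
  return pixels.map((v) => Math.min(255, Math.round(v * gain)));
}
```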

Nothing fancy.