Category Archives: Surveillance Software Design

Technical Innovation

This might be wonky, so if you’re not into that sort of thing, you can save yourself 5 minutes and skip this one.

Enterprise installations of software usually consist of several semi-autonomous clusters (or sites) arranged in some sort of hierarchy, usually a tree-like structure, though it could also be a mesh or a graph.

Each cluster may have hundreds or even thousands of sensors (not just cameras), all of which can be described by their state. A camera may be online, it may be recording, a PIR might be triggered or not, and so on. Basically, sensor properties come in three types: binary, scalar and complex (e.g. an LPR reader that returns the tag and color of a car).

The bandwidth inside each cluster is usually decent (~GBit), and if it isn't, upgrading the network is usually a one-time cost. Sure, there might be sensors at remote locations where the bandwidth is low, but in most cases bandwidth is not an issue inside the cluster. Unfortunately, the bandwidth from the cluster to the outside world is usually not as ample.

It's not uncommon that a user at the top of the hierarchy will see a number of clusters, and it seems reasonable that this user also wants to see the state of each of the sensors inside those clusters. This, then, requires that the state information of each sensor is relayed from the cluster to the client. The client may have a 100 Mbit downstream connection, but if the upstream path from the cluster to the client is a mere 256 Kbit/s, then getting the status data across in real time is a real problem.

In the coding world there's an obsession with statelessness. Relying on states is hard, so is deleting memory after you've allocated it, and debugging binary protocols isn't a cakewalk either, so programmers (like me) try to avoid these things. We have garbage collection (which I hate), we have wildly verbose and wasteful containers such as SOAP (which I hate) and we have polling (which I hate).

So, let's consider a stateless design with a verbose container (though one a lot more terse than SOAP): the client will periodically request the status.

client:
  <request>
    <type>status</type>
    <match>all</match>
  </request>

server:
  <response>
     <sensor id="long_winded_guid_as_text" type="redundant_type_info">
        <state>ON</state>
     </sensor>
  </response>

The client request can be optimized, but it's the server response that has the biggest problem. If we look at the actual info in the packet, we are told that a sensor, with a GUID, is ON.

Let’s examine the entropy of this message.

The amount of data needed to specify the sensor depends heavily on the number of sensors in the system. If there are just 2 sensors in total, the identifier requires 1 bit of information (0 or 1); if there are 4 sensors, you'll need 2 bits (00, 01, 10 and 11); and so, if there are 65,000 sensors, you'll need 16 bits (2 bytes).
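As a quick sanity check, the number of identifier bits is just the ceiling of log2 of the sensor count; a throwaway Python snippet:

import math

def id_bits(sensor_count: int) -> int:
    """Bits needed to uniquely address sensor_count sensors."""
    return max(1, math.ceil(math.log2(sensor_count)))

print(id_bits(2), id_bits(4), id_bits(65000))   # 1 2 16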

The state is binary, so you’ll need just a single bit to encode the state (0 for off, 1 for on).

Since the state information can be of 3 different types, you'll need a status type identifier as well. Strictly speaking, this only needs to be encoded when the type actually changes; if it stays the same, it can be cached on the receiving side. But in order to remain stateless, we're going to need 2 bits to encode the type (00 for binary, 01 for scalar, 10 for complex).

So, for a system with 60,000 sensors, a message might look like this:

[ sensor id          ] [ type ] [ value ]
  0000 0000 0000 0001    00       1

19 bits in total

The wasteful message is 142 bytes long, or 1136 bits, or 59 times larger, and it is actually pretty terse compared to some of the stuff you’ll see in the real world. In terms of bandwidth, the compact and “archaic” binary protocol can push 59 times as many messages through the same pipe!
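As a sketch of what the compact encoding could look like on the wire, here's a Python version that packs the fields from the layout above. This assumes the 19 bits are left-aligned and padded out to 3 whole bytes, big-endian; the field widths are the ones discussed above.

def pack_status(sensor_id: int, status_type: int, value: int) -> bytes:
    """Pack a status message: 16-bit sensor id, 2-bit type, 1-bit value.
    The 19 bits are left-aligned in 3 bytes; the last 5 bits are padding."""
    bits = (sensor_id << 3) | (status_type << 1) | value   # 19 significant bits
    return (bits << 5).to_bytes(3, "big")                  # pad to 24 bits

def unpack_status(data: bytes):
    bits = int.from_bytes(data, "big") >> 5
    return (bits >> 3) & 0xFFFF, (bits >> 1) & 0x3, bits & 0x1

msg = pack_status(sensor_id=1, status_type=0b00, value=1)
print(len(msg), unpack_status(msg))   # 3 (1, 0, 1)

Even padded out to 3 whole bytes, each message is still roughly 47 times smaller than the 142-byte XML response.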

Now, what happens if we remove the statelessness? We could cache the status type on the receiving side, so we’d get down to 17 bits (a decent improvement), but we could also decide to not poll, and instead let the server push data only when things change.

The savings here are entirely dependent on the configuration of the system, and sensors with scalar values probably have to send their value continuously. A technique used when recording GPS info is to only let the device send its position when it has changed more than a certain threshold or when a predefined duration has passed. That means that if you're stationary, we only have location samples at, say, 1 sample per minute, but as you move, the sample rate increases to 1 per second. The same could be used for temperature, speed and other scalar values.
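A minimal sketch of that report-on-change-or-timeout idea; read_value() and send() are hypothetical stand-ins for the actual sensor driver and the uplink:

import time

def report_scalar(read_value, send, threshold=0.5, max_interval=60.0):
    """Push a scalar reading only when it moves more than `threshold`,
    or when `max_interval` seconds have passed since the last report."""
    last_value = None
    last_sent = 0.0
    while True:
        value = read_value()
        now = time.monotonic()
        if last_value is None or abs(value - last_value) > threshold \
                or now - last_sent >= max_interval:
            send(value)
            last_value, last_sent = value, now
        time.sleep(1.0)   # sample once per second; reporting rate is much lower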

So, how often does a camera transition from detecting some motion to not detecting any? Once per second? Once per minute? Once per hour? Probably all three over a 24-hour period, but on average, let's say you have one transition every 5 minutes. If you're running a 1 Hz polling rate vs. a sensor that reports state changes only, you're looking at a 300x reduction in bandwidth.

In theory we’re now about 17000 times more efficient using states and binary protocols.

In practice, the TCP overhead makes it quite a bit less. We might also add a bit of goo to the binary protocol to make it easier to parse, and that's the other area where things improve: processing speed.

Let’s loosen up the binary protocol a bit by byte-aligning each element. This increases the payload to 32 bits (~8000x improvement)

[ sensor id       ] [ type    ] [ value ]
0000 0000 0000 0001  0000 0001  0000 0001

It is now trivial, and extremely fast to update the sensor value on the receiving side. Scalar values might be encoded as 32 bit floats, while complex statuses will take up an arbitrary amount of data depending on the sensor.
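With byte alignment, the fixed part of the message maps directly onto a packed struct, so parsing is a couple of reads rather than bit twiddling. A sketch using Python's struct module, with the layout assumed as above (big-endian), plus an assumed scalar variant where the value is a 32-bit float as the text suggests:

import struct

HEADER = struct.Struct(">HBB")   # 16-bit sensor id, 8-bit type, 8-bit value
SCALAR = struct.Struct(">HBf")   # assumed scalar variant: value is a 32-bit float

def pack_binary(sensor_id: int, value: int) -> bytes:
    return HEADER.pack(sensor_id, 0x00, value)

def pack_scalar(sensor_id: int, value: float) -> bytes:
    return SCALAR.pack(sensor_id, 0x01, value)

print(pack_binary(1, 1).hex())      # 00010001
print(pack_scalar(1, 21.5).hex())   # 000101 followed by the IEEE-754 float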

The point is that sometimes, going “backwards” is actually the low hanging fruit, and sometimes there’s a lot of it to pluck.

The design of a system should take the environment into consideration as well. If you’re writing a module for a client where you have ample bandwidth, and time to market matters, it would be most appropriate to use wasteful, but fast tools. This means that as part of a development contract, the baseline environment must be described. Failure to do so will eventually lead to problems when the client comes back, angry that the system you wrote in 2 days doesn’t work via a dial-up connection.


CamStreamer

Axis does not support RTMP as a transmission protocol in their firmware (correct me if I am wrong about this). This seems like an omission, as RTMP offers a pretty good way for the camera to initiate an outbound connection to an RTMP service somewhere outside the network. I believe this is how DropCam works (again, an assumption).

As most probably know (or should know), an Axis camera is basically a small Linux computer with a camera attached. Axis will let you download a development environment, which you can then use to compile applications for the camera. This is called ACAP.

Before compiling an RTMP library for ACAP, I decided to check if someone else had done this. Turns out CamStreamer had done so, and I decided to give it a whirl.

Setup is extremely simple, and the UI looks modern and clean. It has support for some of the most popular streaming services, but I needed the generic RTMP ingest option (Universal), as I have my own RTMP servers in the cloud.


Setting this stuff up was trivial.


The system pretty much worked like a charm, with a single exception. During configuration there was an error message telling me that the camera could not reach an outside host on port 1935 (the RTMP port); that message was helpful and accurate. But later I lost my internet connection completely (a screw-up at my ISP, where they switched me back to DHCP), and I was unable to enter the RTMP configuration page on the camera at all. I guess this is because the HTML/configuration is served from a host in the cloud. I did not dig into it, but it struck me as a bit odd.

The ACAP application is pretty expensive: $299 for a 1-camera license (at the time of writing). This is pretty steep, as you can hook up a Raspberry Pi type device that will do the same for several cameras for less than half that.

I am not sure why RTMP support is not built into more cameras (again, correct me if I am wrong). The “push” model of RTMP is almost perfect for cloud-based solutions, so why this is not standard fare for IP cameras is beyond me.

While Flash Video uses RTMP, RTMP can be used w/o Flash being involved at all. For example, if I publish a stream via RTMP to EvoStream, I can view the stream via RTSP instead (and what can you do with RTSP streams and your favorite VMS?). It's true that the reliance on TCP for RTMP introduces a bit of latency, but in practice this is hardly a problem (if you are going to use a cloud solution, latencies are going to be pretty high compared to a local one).


The Parts of an IP Camera

To understand where the IP camera market is headed, I think it's important to understand how one of these things is put together.

Like most high-tech devices, each product is really an amalgamation of parts from different manufacturers. In fact, many products are the result of tight, but perhaps unappreciated, collaboration between several (sometimes competing) companies. I'd recommend listening to Freakonomics' rundown of the "I, Pencil" essay (starts 7 minutes in).

An IP camera is not a pencil, but the analogy holds: pencil manufacturers don't make every single part of the pencil themselves. Instead, they purchase the parts (graphite, brass, paint and so on), and every manufacturer puts the pencil together following roughly the same pattern.

And so, IP cameras too are composed of parts that are available to anyone who wants to start making cameras.

You’ll need a couple of things: A lens, a sensor, some circuitry and some code.

You're not going to start making your own lenses or sensors, are you? Probably not, so you'll get the lenses from a lens maker (which may outsource its manufacturing even further), and the sensor from either Sony or Canon.

You're not going to design your own CPU either (unless you're Axis). Today, you'd be better off grabbing an ARM platform and using that to drive the sensor and interface. The other advantage is that ARM is well supported in the software world, so you're already halfway there.

Now that you have the basics, you need to write some code to get it all working together. If you went the ARM route, it's pretty simple to get a Linux kernel running. Well… "simple" depends on your level of skill, but finding a few geeks who can do this shouldn't take long. So you grab the Linux kernel, add Apache or perhaps GoAhead, and you can add GStreamer too (do check the link, it is a great presentation by Axis). The next thing you know, you have a jumble of cables and breadboards, burns on your fingers from the soldering iron, you haven't seen your kids in 4 days and the smell is getting a little hard to stomach.

On top of that, you need to wrap this in an enclosure. There are regulations to follow, tests that need to be carried out and so on. Then you have the nightmare of maintaining all those pieces of code, and trust me: if you wrote everything yourself, it would take even longer and be much harder to test and maintain.

What if there was a company that could do all of the above, and just stick my name on the box? After all, my company would pick the same lens, the same sensor, the same board and the same software, so why not?

I have no intention of starting production of a Raspberry Pi Zero based IP camera, but I know that I can make one for ~$40 (and that's buying all the parts retail). Not only will this thing work as an IP camera, it can work as a full-fledged stand-alone VMS.

In other words: if some washed-up coder in Copenhagen can build a fully functional "IP camera" for $40, you're going to face a tough time if you've based your entire organization around selling your cheapest cameras for $250+ (they may be "even more good enough", but who cares?).

Obviously, my camera is not going to be materially different from the other guys' cameras. We're all going to use the same bits and pieces, including software; even the damn protocols are going to be the same.

So, I think we're going to see a race to the bottom in terms of prices. The cameras will look and perform almost identically across brands, use the same protocols, and be completely interchangeable, much to the chagrin of the incumbents, so the USP for the brands in this realm will have to be something else.

VMS Software, perhaps…


Radars


A while back I got fed up with people parking their cars right in front of my driveway, and I decided to find a solution.

A camera could work, but since I am cheap, I decided to look for something a bit more… economical. A PIR sensor wouldn’t work because it triggers when there is motion, and cars and people pass by all the time, so I looked into ultrasonic sensors and eventually radars. If the distance measured drops below a pre-defined threshold and stays there, I know to run into the street, yelling and screaming.
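A sketch of that threshold-and-hold logic in plain Python; read_distance_cm() and alarm() are hypothetical helpers standing in for the actual sensor driver and the notification:

import time

BLOCKED_CM = 150      # closer than this means something is parked in front of the driveway
HOLD_SECONDS = 30     # must stay blocked this long before we alarm

def watch_driveway(read_distance_cm, alarm):
    blocked_since = None
    while True:
        if read_distance_cm() < BLOCKED_CM:
            if blocked_since is None:
                blocked_since = time.monotonic()
            elif time.monotonic() - blocked_since >= HOLD_SECONDS:
                alarm()                  # time to run into the street
                blocked_since = None     # re-arm after alarming
        else:
            blocked_since = None         # passing traffic resets the timer
        time.sleep(0.5)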

The inspiration came from Adafruit and Andreas Spiess, who has a great YouTube channel where you can get more information about ultrasonic sensors and radars (and about 1,000 other things).


Basically, you can get an Arduino-capable board. I hope the WiFi-capable ESP8266 will work (since I have one lying around). Then get some cheap sensors from China via Alibaba and you're ready to experiment. At the very least, it should give you some idea of the base cost of such a device.

Both Axis and Avigilon have launched commercial versions of miniature radars that can interface with your favorite VMS. Combined with a PTZ camera, this could offer a bit more smartness than the good old PIR/PTZ combo.


P2P

As with IP cameras, one of the IoT challenges is how to get your controlling device (typically a phone) to talk to the IoT device in a way that does not require opening up inbound ports on your firewall.

All communication is peer to peer in some sense, so the term, when used in the context of IoT devices, is perhaps a little misleading; after all, an exposed camera sending a video stream to a phone somewhere is also "peer to peer". Instead, P2P might be translated to "send data from A to B, even if both A and B are behind firewalls, using a middleman C" (what the hell is up with all the A, B, C these days).

On a technical level, P2P cameras use something called UDP hole punching, which sounds a bit ominous, but there's really nothing sneaky about it. What happens is that A connects to C, so that C now knows the external IP address of A. Likewise, B also connects to C, and now C knows the external IP addresses of both A and B.

The middleman now passes the address of A to B, and the address of B to A. The next step is for A to fire a volley of UDP packets towards B, while B does the same towards A.

The firewall on A's side sees a bunch of packets travel to B's address, and when B's packets arrive, the firewall thinks those UDP packets are replies to the packets that were sent from A and lets them through.

You could accomplish much the same thing by having A go to "whatsmyip.com" and email the result to B, having B do the same, and then running scripts that fire UDP packets at each other's addresses; a STUN server simply automates this process.
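Stripped of the STUN machinery, the punch itself is just a few lines. In this sketch both peers run the same code once they have learned each other's external address and port (the address below is a placeholder, and the NAT is assumed to keep the port mapping stable):

import socket

LOCAL_PORT = 40000
PEER = ("203.0.113.7", 40000)   # the peer's external address, learned via the middleman

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.bind(("0.0.0.0", LOCAL_PORT))
sock.settimeout(1.0)

# Fire a volley of packets at the peer. Our firewall now expects "replies"
# from that address, so the peer's packets are let through.
for _ in range(10):
    sock.sendto(b"punch", PEER)
    try:
        data, addr = sock.recvfrom(1024)
        print("hole open, got", data, "from", addr)
        break
    except socket.timeout:
        continue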

But who controls this “middle man”? Ideally, you’d be in charge of it; you’d be able to specify your own STUN-type server in the camera interface, so that you have full control of all links in the chain. In time, perhaps the camera vendors will release a protocol description and open source modules so that you can host your own middle-man.

The problem might be that you bought a nice cheap camera on the grey market. The camera is intended for the Chinese market, but comes with a "modded" firmware that enables English menus and so on. This is obviously risky: updating a modded firmware may be impossible, or may brick the camera, and the manufacturer may be less inclined to support devices that have been modded. You get what you pay for, so to speak (and this blog is free!).

The modder is selling the cameras in western markets, but the STUN server is still pointing to a server in China. This makes sense if you are a Chinese user, but it may seem very strange that your camera "calls home" to a server in China. A non-modded camera might do the same, simply because running the STUN service there is cheaper, and because it allows the government to eavesdrop on the traffic. If you are Chinese (I am not), you could argue that you don't trust Amazon, Microsoft or Google because they might work with the NSA. Either way, using your own server would be preferable.

Apart from the STUN functionality, the camera may follow directions that are sent from B through C to A. This puts a lot of responsibility in the hands of the people maintaining that server; if it is breached, a lot of cameras become vulnerable.

Depending on the end user, P2P may not be appropriate at all. For some users, though, the cost of a breach is small compared to the hassle of installing a fully secure system, and the trade-off might be worth it.

While yours truly has abandoned all attempts to appear professional over the years, the truth is that most big installations have their shit together. Unfortunately the volume of DIYers and amateurish installers who don’t really know what they are doing is much bigger (in terms of headcount, not commercial volume), and if there’s one thing we all want to do, it’s to blame someone else.

Caveat Emptor.


Codecs and Client Applications

4K and H.265 are coming along nicely, but they are also part of the problems we as developers are facing these days. Users want low latency, fluid video, low bandwidth and high resolution, and they want these things on 3 types of platforms: traditional PC applications, apps for mobile devices and tablets, and a web interface. In this article, I'd like to provide some insights from a developer's perspective.

Fluid or Low Latency

Fluid and low latency at the same time is highly problematic. To the HLS guys, 1 second is "low latency", but we, and the WebRTC hackers, are looking for latencies in the sub-100 ms area. Video surveillance doesn't always need super low latency: if a fixed camera has 2 seconds of latency, that is rarely a problem (in the real world). But as soon as any kind of interaction takes place (PTZ or 2-way audio), you need low latency. Optical PTZ can "survive" high latency if you only support PTZ presets or simple movements, but tracking objects will be virtually impossible on a high-latency feed.

Why high latency?

A lot of the tech developed for video distribution is intended for recorded video, not for low-latency live video. The idea is to download a chunk of video and, while that plays, download the next chunk in the background; this happens over and over, but playback does not commence until at least the entire first chunk has been read off the wire. The chunks are usually 5-10 seconds in duration, which is not a problem when you're watching Bob's Burgers on Netflix.

The lowest latency you can get is to simply draw the image as soon as you receive it, but due to packetization and variable network latency, you're not going to get the frames at a steady pace, which leads to stuttering, which is NOT acceptable when Bob's Burgers is being streamed.

How about WebRTC?

If you've ever used Google Hangouts, then you've probably used WebRTC. It works great when you have a good, low-latency peer-to-peer connection. The peer-to-peer part is important, because part of the design is that the recipient can tell the sender to adjust its quality on demand. This is rarely feasible on a traditional IP camera, but it could eventually be implemented. WebRTC is built into some web browsers, and it supports H.264 by default, but not H.265 (AFAIK) or any of the legacy formats.

Transcoding

Yes, and no. Transcoding comes at a rather steep price if you expect to use your system as if it ran w/o transcoding. The server has to decode every frame and then re-encode it in the proper format. Some vendors transcode to JPEG, which makes it easier for the client to handle, but puts a tremendous amount of stress on the server: not so much on the encoding side, but the decoding of all those streams is pretty costly. To limit the impact on the transcoding server, you may have to alter the UI to reflect the limitation in server-side resources.

Installed Applications

The trivial case is an installed application on a PC or a mobile device. Large install files are pretty annoying (and often unnecessary), but you can package all the application dependencies, and the developer can do pretty much anything they want. There’s usually a lot of disk-space available and fairly large amounts of RAM.

On a mobile device you struggle with OS fragmentation (in the case of Android), but since you are writing an installed application, you are able to include all dependencies. The limitations are in computing power, storage, RAM and physical dimensions. The UI that works for a PC with a mouse is useless on a 5″ screen with a touch interface. The CPUs/GPUs are pretty powerful (for their size), but they are nowhere near the processing power of a halfway decent PC. The UI has to take this into consideration as well.

“Pure” Web Clients

One issue that I have come across a few times is that some people think the native app "uses a lot of resources", while a web-based client would somehow, magically, use fewer resources to do the same job. The native app uses 90.0% of the CPU resources to decode video, and it does so a LOT more efficiently than a web client would ever be able to. So if you have a low-end PC, the solution is not to demand a web client, but to lower the number of cameras on-screen.

Let me make it clear: any web client that relies on an ActiveX component being downloaded and installed might as well have been an installed application. ActiveX controls are compiled binaries that only run on the intended platform (IE, Windows, x86 or x64). They are usually implicitly (sometimes explicitly) left behind on the machine, and can be instantiated and used as an attack vector if you can trick a user into visiting a malicious site (which is quite easy to accomplish).

The purpose of a web client is to allow someone to access the system from a random machine on the network, w/o having to install anything. Another advantage is that since there is no installer, there's no need to constantly download and install upgrades every time you access your system: when you log in, you get the latest version of the "client". Forget all about "better for low end" and "better performance".

Technology

Java applets can be installed, but often setting up Java for a browser is a pain in the ass (security issues), and performance can be pretty bad.

Flash apps are problematic too, and suffer the same issues as Java applets. Flash has a decent H.264 decoder for .flv-formatted streams, but no support for H.265 or legacy formats (unless you write them from scratch… and good luck with that 🙂 ). Furthermore, everyone with a web browser in their portfolio is trying to kill Flash due to its many problems.

NPAPI and other native plugin frameworks (NaCl, Pepper) did offer decent performance, but usually only worked on one or two browsers (Chrome or Firefox), and Chrome later removed support for NPAPI.

HTML5 offers the <video> tag, which can be used for “live” video. Just not low latency, and codec support is limited.

JavaScript performance is now at a point (for the leading browsers) where you can write a full decoder for just about any format you can think of and get reasonable performance for 1 or 2 720p streams, if you have a modern PC.

Conclusion

To get broad client side support (that truly works), you have to make compromises on the supported devices side. You cannot support every legacy codec and device and expect to get a decent client side experience on every platform.

As a general rule, I disregard arguments that "if it doesn't work with X, then it is useless". Too often, this argument gains traction, and to satisfy X, we sacrifice Y. I would rather support Y 100% if Y makes more sense. I'd rather serve 3 good dishes than 10 bad ones. But in this industry, it seems that 6,000 shitty dishes at an expensive "restaurant" is what people want. I don't.


Marketing Technology

I recently saw a fun post on LinkedIn. Milestone Systems was bragging about how they have added GPU acceleration to their VMS, but the accompanying picture was from a different VMS vendor. My curiosity got the better of me, and I decided to look for the original press release. The image was gone, but the text is bad enough.

Let's examine:

Pioneering Hardware Acceleration
In the latest XProtect releases, Milestone has harvested the advantages of the close relationships with Intel and Microsoft by implementing hardware acceleration. The processor-intensive task of decoding (rendering) video is offloaded to the dedicated graphics system (GPU) inside the processer [sic], leaving the main processor free to take on other tasks. The GPU is optimized to handle computer graphics and video, meaning these tasks will be greatly accelerated. Using the technology in servers can save even more expensive computer muscle.

“Pioneering” means that you do something before other people. OnSSI did GPU acceleration in version 1.0 of Ocularis, which is now 8 or 9 years old. Even the very old NetSwitcher app used DirectX for fast YUV conversion. Avigilon has been doing GPU acceleration for a while too, and I suspect others have as well. The only “pioneering” here is how far you can stretch the bullshit.

Furthermore, Milestone apparently needs a "close relationship" with Microsoft and Intel to use standard and publicly available Quick Sync tech. They could also have used FFmpeg.

We experimented with CUDA on a high-end NVIDIA card years ago, but came to the conclusion that the scalability was problematic: while the CPU would show 5% load, the GPU was being saturated, causing stuttering video when we pushed a lot of frames.

Using Quick Sync is the right thing to do, but trying to pass it off as "pioneering" and suggesting that you have some special access to Microsoft and Intel to do trivial things is taking marketing too far.

The takeaway is that I need to remind myself to make my first gen client slow as hell, so that I can claim 100% performance improvement in v2.0.


Open Systems and Integration

Yesterday I took a break from my regular schedule and added a simple, generic HTTP event source to Ocularis. We've had the ability to integrate with IFTTT via the Maker Channel for quite some time; this lets you trigger IFTTT actions whenever an event occurs in Ocularis. Soon, you will be able to trigger alerts in Ocularis via IFTTT triggers as well.

For example, IFTTT has a geofence trigger, so when someone enters an area, you can pop the appropriate camera on a blank screen. The response time of IFTTT is too slow, and I don't consider it reliable enough for serious surveillance applications, but it's a good illustration of the possibilities of an open system. Because I am lazy, I made a trigger based on Twitter; that way I did not have to leave the house.
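Firing a Maker Channel event is just an HTTP POST to maker.ifttt.com; a sketch of what an event source could send (the event name and key below are placeholders, not anything Ocularis-specific):

import requests

IFTTT_KEY = "YOUR_MAKER_KEY"     # from the Maker Channel settings
EVENT = "vms_alarm"              # hypothetical event name

def send_ifttt_event(camera_name: str, description: str) -> None:
    """Trigger an IFTTT Maker event with a couple of detail values."""
    url = f"https://maker.ifttt.com/trigger/{EVENT}/with/key/{IFTTT_KEY}"
    requests.post(url, json={"value1": camera_name, "value2": description}, timeout=5)

send_ifttt_event("Parking Lot", "Motion detected")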

Making an HTTP event source did not require any changes to Ocularis itself. It could trivially be added to previous versions if one wanted to, but even if you have a completely open system, it doesn't mean that people will utilize it.


Camera Proxy

There’s a lot of paranoia in the industry right now, some warranted, some not. The primary issue is that when you plug something into your network you basically have to trust the vendor to not spy on you “by design” and to not provide a trivial attack vector to 3rd parties.

First things first. Perhaps you remember that CCTV means Closed Circuit Television. Pay attention to those first two words. I am pretty sure 50% or more of all "CCTV" installations are not closed at all. If your CCTV system is truly closed, there's no way for the camera to "call home", and it is impossible for hackers to exploit any attack vectors because there's no access from the outside world to the camera. There are plenty of PCs running terrible and vulnerable software out there, but as long as these systems are closed, there's no problem. Granted, it also limits the flexibility of the system, but that's the price you pay for security.

At the opposite end of the spectrum are cameras that are directly exposed to the internet. This is a very bad idea, and most professionals probably don't do that. Well… some clearly do, because a quick scan of the usual sites reveals plenty of seemingly professional installations where cameras are directly accessible from the internet.

To expose a camera directly to the internet you usually have to alter the NAT tables in your router/firewall. This can be a pain in the ass for most people, so another approach, called hole punching, is used. This requires a STUN server between the client sitting outside the LAN (perhaps on an LTE connection via AT&T) and the camera inside the LAN. The camera registers with the STUN server via an outbound connection, and almost all routers/firewalls allow outbound connections. The way STUN servers work probably confuses some people, and they freak out when they see the camera making a connection to a "suspicious" IP, but that's simply how things work, and not a cause for alarm.

Now, say you want to record the cameras in your LAN on a machine outside your LAN; perhaps you want an Azure VM to record the video. How will the recorder on Azure (outside your LAN) get access to the cameras inside your LAN, unless you set up NAT and thus expose your cameras directly to the internet?

This is where the $10 camera proxy comes in (the actual cost is higher because you’ll need an SD card and a PSU as well).

So, here’s a rough sketch of how you can do things.

  • On Azure you install your favorite VMS
  • Install Wowza or EvoStream as well

EvoStream can receive an incoming RTMP stream and make it available via RTSP; it basically changes the protocol but passes on the same video packets (no transcoding). So, if you were to publish a stream at, say, rtmp://evostreamserver/live/mycamera, that stream would be available at rtsp://evostreamserver/mycamera. You can then add a generic RTSP camera that reads from rtsp://evostreamserver/mycamera to your VMS.

The next step is to install the proxy, you can use a very cheap Pi clone, or a regular PC.

  • Determine the RTSP address of the camera in question
  • Download FFMpeg
  • Set up FFMpeg so that it publishes the camera to EvoStream (or Wowza) on Azure

Say you have a camera that streams via rtsp://192.168.0.100/video/channels/1; the command looks something like this (all on one line):

ffmpeg -i rtsp://username:password@192.168.0.100/video/channels/1 
-vcodec copy -f flv rtmp://evostreamserver/live/mycamera

This will make your PC grab the AV from the camera and publish it to the evostream server on Azure, but the camera is not directly exposed to the internet. The PC acts as a gateway, and it only creates an outbound connection to another PC that you control as well.
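On the proxy box you would probably want that command to survive camera reboots and network hiccups; a minimal restart loop around FFmpeg could look like this (the URLs are the placeholders from above):

import subprocess
import time

CAMERA = "rtsp://username:password@192.168.0.100/video/channels/1"
TARGET = "rtmp://evostreamserver/live/mycamera"

CMD = ["ffmpeg", "-i", CAMERA, "-vcodec", "copy", "-f", "flv", TARGET]

while True:
    # Run FFmpeg; if the camera or the uplink drops, it exits and we retry.
    result = subprocess.run(CMD)
    print("ffmpeg exited with", result.returncode, "- restarting in 5 s")
    time.sleep(5)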

You can now access the video from the VMS on Azure, and your cameras are not exposed at all, so regardless of how vulnerable they are, they will not expose any attack vectors to the outside world.

Using Azure is just an example, the point is that you want to isolate the cameras from the outside world, and this can be trivially accomplished by using a proxy.

As a side note: if cameras were deliberately spying on their users, by design, it would quickly be discovered and published. That there are bugs and vulnerabilities in firmware is just a fact of life and not proof of anything nefarious, so calm down, but take the necessary precautions.

TileMill and Ocularis

A long, long time ago, I discovered TileMill. It's an app that lets you import GIS data, style the map and create a tile pyramid, much like the tile pyramids used for maps in Ocularis.


There are 2 ways to export the map:

  • Huge JPEG or PNG
  • MBTiles format

So far, the only supported mechanism of getting maps into Ocularis is via a huge image, which is then decomposed into a tile pyramid.

Ocularis reads the map tiles the same way Google Maps (and most other mapping apps) reads the tiles. It simply asks for the tile at x,y,z and the server then returns the tile at that location.

We’ve been able to import Google Map tiles since 2010, but we never released it for a few reasons:

  • Buildings with multiple levels
  • Maps that are not geospatially accurate (subway maps for example)
  • Most maps in Ocularis are floor plans, going through google maps is an unnecessary extra step
  • Reliance on an external server
  • Licensing
  • Feature creep

If the app relies on Google's servers to provide the tiles, and your internet connection is slow or perhaps goes offline, then you lose your mapping capability. To avoid this, we cached a lot of the tiles, which is very close to bulk downloading, and that is not allowed. In fact, at one point I downloaded many thousands of tiles, which caused our IP to get blocked on Google Maps for 24 hours.

Using MBTiles

Over the weekend I brought back TileMill and decided to take a look at the MBTiles format. It's basically a SQLite DB file, with each tile stored as a BLOB. Very simple stuff, but how do I serve the individual tiles over HTTP so that Ocularis can use them?

Turns out, Node.js is the perfect tool for this sort of thing.

Creating an HTTP server is trivial, and opening a SQLite database file is just a couple of lines. So with less than 50 lines of code, I had made myself an MBTiles server that would work with Ocularis.


A few caveats: Ocularis has the Y axis pointing down, while MBTiles has it pointing up; flipping the Y axis is simple. Ocularis also has the highest-resolution layer at layer 0, while MBTiles has that inverted, so the "world tile" is always layer 0.
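My actual server is written in Node.js and isn't shown here; purely as an illustration of how little code this takes, here is a comparable sketch in Python (standard library only), including the Y-axis flip. The tiles table layout (zoom_level, tile_column, tile_row, tile_data) is part of the MBTiles spec; the /z/x/y.png URL layout and the file name are assumptions, and the Ocularis-specific zoom inversion is left out.

import sqlite3
from http.server import BaseHTTPRequestHandler, HTTPServer

DB = sqlite3.connect("map.mbtiles", check_same_thread=False)

class TileHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Assumed URL layout: /z/x/y.png
        try:
            z, x, y = (int(p.split(".")[0]) for p in self.path.strip("/").split("/"))
        except ValueError:
            self.send_error(400)
            return
        flipped_y = (2 ** z - 1) - y   # MBTiles has the Y axis pointing up
        row = DB.execute(
            "SELECT tile_data FROM tiles WHERE zoom_level=? AND tile_column=? AND tile_row=?",
            (z, x, flipped_y)).fetchone()
        if row is None:
            self.send_error(404)
            return
        self.send_response(200)
        self.send_header("Content-Type", "image/png")
        self.end_headers()
        self.wfile.write(row[0])

HTTPServer(("0.0.0.0", 8080), TileHandler).serve_forever()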

So, with those few minor changes handled, it all works.


I think it would be trivial to add support for ESRI tile servers, but I don't really know if this belongs in a VMS client. The question is whether the time would not be better spent making it easy for the GIS guys to add video capabilities to their apps, rather than having the VMS client attempt to be a GIS planning tool.
