Category Archives: Surveillance Software Design

Codecs and Client Applications

4K and H.265 are coming along nicely, but they are also part of the problems we as developers are facing these days. Users want low latency, fluid video, low bandwidth and high resolution, and they want these things on 3 types of platforms – traditional PC applications, apps for mobile devices and tablets, and a web interface. In this article, I’d like to provide some insights from a developer’s perspective.

Fluid or Low Latency

Fluid and low latency at the same time is highly problematic. To the HLS guys, 1 second is “low latency”, but we, like the WebRTC hackers, are looking for latencies in the sub-100 ms range. Video surveillance doesn’t always need super low latency – if a fixed camera has 2 seconds of latency, that is rarely a problem (in the real world). But as soon as any kind of interaction takes place (PTZ or 2-way audio), you need low latency. Optical PTZ can “survive” high latency if you only support PTZ presets or simple movements, but tracking objects will be virtually impossible on a high-latency feed.

Why high latency?

A lot of the tech developed for video distribution is intended for recorded video, not for low-latency live video. The intent is to download a chunk of video, and while that plays, you download the next chunk in the background. This happens over and over, but playback does not commence until at least the entire first chunk has been read off the wire. The chunks are usually 5-10 seconds in duration, which is not a problem when you’re watching Bob’s Burgers on Netflix.
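To make the mechanism concrete, here is a rough sketch of the chunked model using the browser’s Media Source Extensions. The segment URLs, the codec string, and the missing initialization segment are simplifications of my own; real HLS/DASH players add manifest parsing and buffering heuristics on top of this.

const video = document.querySelector('video');
const mediaSource = new MediaSource();
video.src = URL.createObjectURL(mediaSource);

mediaSource.addEventListener('sourceopen', async () => {
  const buffer = mediaSource.addSourceBuffer('video/mp4; codecs="avc1.64001f"');
  for (let i = 0; ; i++) {
    // Fetch the next 5-10 second chunk while the previous one plays.
    const response = await fetch(`/segments/chunk${i}.m4s`);   // hypothetical segment URL
    if (!response.ok) break;
    buffer.appendBuffer(await response.arrayBuffer());
    // Playback cannot start (or continue) until the chunk has been fully appended.
    await new Promise(resolve => buffer.addEventListener('updateend', resolve, { once: true }));
  }
  mediaSource.endOfStream();
});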

The lowest latency you can get is to simply draw the image when you receive it, but due to packetization and network latency, you’re not going to get the frames at a steady pace, which leads to stuttering – and stuttering is NOT acceptable when Bob’s Burgers is being streamed.
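If you want fluid playback anyway, the usual trick is to buy smoothness with latency: hold each frame in a small playout buffer and draw on a steady clock instead of on arrival. A minimal sketch of that idea; the WebSocket URL and decodeToImageData are placeholders, not real APIs:

const ctx = document.querySelector('canvas').getContext('2d');
const socket = new WebSocket('wss://example.com/camera1');   // hypothetical frame source
socket.binaryType = 'arraybuffer';

const BUFFER_MS = 300;   // smoothness bought with 300 ms of extra latency
const queue = [];

socket.onmessage = async (msg) => {
  // Frames arrive whenever the network delivers them, not at a steady pace.
  const image = await decodeToImageData(msg.data);             // placeholder decoder
  queue.push({ displayAt: performance.now() + BUFFER_MS, image });
};

function render(now) {
  if (queue.length && queue[0].displayAt <= now) {
    ctx.putImageData(queue.shift().image, 0, 0);               // draw on a steady clock
  }
  requestAnimationFrame(render);
}
requestAnimationFrame(render);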

How about WebRTC?

If you’ve ever used Google Hangouts, then you’ve probably used WebRTC. It works great when you have a peer-to-peer connection with a good, low-latency link. The peer-to-peer part is important, because part of the design is that the recipient can tell the sender to adjust its quality on demand. This is rarely feasible on a traditional IP camera, but it could eventually be implemented. WebRTC is built into some web browsers, and it supports H.264 by default, but not H.265 (AFAIK) or any of the legacy formats.

Transcoding

Yes, and no. Transcoding comes at a rather steep price if you expect to use your system as if it ran w/o transcoding. The server has to decode every frame and then re-encode it in the proper format. Some vendors transcode to JPEG, which makes it easier for the client to handle, but puts a tremendous amount of stress on the server – not so much on the encoding side, but decoding all those streams is pretty costly. To limit the impact on the transcoding server, you may have to alter the UI to reflect the limited server-side resources.

Installed Applications

The trivial case is an installed application on a PC or a mobile device. Large install files are pretty annoying (and often unnecessary), but you can package all the application dependencies, and the developer can do pretty much anything they want. There’s usually a lot of disk-space available and fairly large amounts of RAM.

On a mobile device you struggle with OS fragmentation (in the case of Android), but since you are writing an installed application, you are able to include all dependencies. The limitations are in computing power, storage, RAM and physical dimensions. The UI that works for a PC with a mouse is useless on a 5″ screen with a touch interface. The CPUs/GPUs are pretty powerful (for their size), but they are nowhere near the processing power of a halfway decent PC. The UI has to take this into consideration as well.

“Pure” Web Clients

One issue that I have come across a few times is that some people think the native app “uses a lot of resources”, while a web-based client would somehow, magically, use fewer resources to do the same job. The native app uses 90.0% of the CPU resources to decode video, and it does so a LOT more efficiently than a web client would ever be able to. So if you have a low-end PC, the solution is not to demand a web client, but to lower the number of cameras on-screen.

Let me make it clear: any web client that relies on an ActiveX component being downloaded and installed might as well have been an installed application. ActiveX controls are compiled binaries that only run on the intended platform (IE, Windows, x86 or x64). They are usually implicitly (sometimes explicitly) left behind on the machine, and can be instantiated and used as an attack vector if you can trick a user into visiting a malicious site (which is quite easy to accomplish).

The purpose of a web client is to allow someone to access the system from a random machine on the network, w/o having to install anything. Another advantage is that since there is no installer, there’s no need to constantly download and install upgrades every time you access your system; when you log in, you get the latest version of the “client”. Forget all about “better for low end” and “better performance”.

Technology

Java applets can be installed, but often setting up Java for a browser is a pain in the ass (security issues), and performance can be pretty bad.

Flash apps are problematic too, and suffer the same issues as Java applets. Flash has a decent H.264 decoder for .flv formatted streams, but no support for H.265 or legacy formats (unless you write them from scratch… and good luck with that 🙂 ). Furthermore, everyone with a web browser in their portfolio is trying to kill Flash due to its many problems.

NPAPI and other native plugin frameworks (NaCl, Pepper) did offer decent performance, but usually only worked on one or two browsers (Chrome or Firefox), and Chrome later removed support for NPAPI.

HTML5 offers the <video> tag, which can be used for “live” video. Just not low latency, and codec support is limited.

JavaScript performance is now at a point (in the leading browsers) where you can write a full decoder for just about any format you can think of and get reasonable performance for 1 or 2 720p streams, if you have a modern PC.
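As a sketch of what such a client tends to look like, the decoding is typically pushed into a Web Worker so the UI thread only paints; decoder.js stands in for a hypothetical JavaScript decoder, and the WebSocket is an assumed frame source:

const ctx = document.querySelector('canvas').getContext('2d');
const worker = new Worker('decoder.js');          // hypothetical pure-JS decoder

worker.onmessage = ({ data }) => {
  // The worker posts back fully decoded RGBA frames.
  const frame = new ImageData(new Uint8ClampedArray(data.rgba), data.width, data.height);
  ctx.putImageData(frame, 0, 0);
};

const ws = new WebSocket('wss://example.com/camera1');  // assumed source of encoded frames
ws.binaryType = 'arraybuffer';
ws.onmessage = (msg) => worker.postMessage(msg.data, [msg.data]);  // transfer, don't copy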

Conclusion

To get broad client side support (that truly works), you have to make compromises on the supported devices side. You cannot support every legacy codec and device and expect to get a decent client side experience on every platform.

As a general rule, I disregard arguments that “if it doesn’t work with X, then it is useless”. Too often, this argument gains traction, and to satisfy X, we sacrifice Y. I would rather support Y 100% if Y makes more sense. I’d rather serve 3 good dishes, than 10 bad ones. But in this industry, it seems that 6000 shitty dishes at an expensive “restaurant” is what people want. I don’t.

 

 


Marketing Technology

I recently saw a fun post on LinkedIn. Milestone Systems was bragging about how they have added GPU acceleration to their VMS, but the accompanying picture was from a different VMS vendor. My curiosity got the better of me, and I decided to look for the original press release. The image was gone, but the text is bad enough.

Let’s examine:

Pioneering Hardware Acceleration
In the latest XProtect releases, Milestone has harvested the advantages of the close relationships with Intel and Microsoft by implementing hardware acceleration. The processor-intensive task of decoding (rendering) video is offloaded to the dedicated graphics system (GPU) inside the processer [sic], leaving the main processor free to take on other tasks. The GPU is optimized to handle computer graphics and video, meaning these tasks will be greatly accelerated. Using the technology in servers can save even more expensive computer muscle.

“Pioneering” means that you do something before other people. OnSSI did GPU acceleration in version 1.0 of Ocularis, which is now 8 or 9 years old. Even the very old NetSwitcher app used DirectX for fast YUV conversion. Avigilon has been doing GPU acceleration for a while too, and I suspect others have as well. The only “pioneering” here is how far you can stretch the bullshit.

Furthermore, Milestone apparently needs a “close relationship” with Microsoft and Intel to use standard and publicly available Quick Sync tech. They could also have used FFMpeg.

We experimented with CUDA on a high-end nVidia card years ago, but came to the conclusion that the scalability was problematic: while the CPU would show 5%, the GPU was being saturated, causing stuttering video when we pushed for a lot of frames.

Using Quick Sync is the right thing to do, but trying to pass it off as “pioneering” and suggesting that you have some special access to Microsoft and Intel to do trivial things is taking marketing too far.

The takeaway is that I need to remind myself to make my first gen client slow as hell, so that I can claim 100% performance improvement in v2.0.


Open Systems and Integration

Yesterday I took a break from my regular schedule and added a simple, generic HTTP event source to Ocularis. We’ve had the ability to integrate with IFTTT via the Maker Channel for quite some time; this allows you to trigger IFTTT actions whenever an event occurs in Ocularis. Soon, you will also be able to trigger alerts in Ocularis via IFTTT triggers.
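For reference, the outbound direction – firing a Maker Channel event when something happens in Ocularis – is nothing more than an HTTP request against IFTTT’s documented trigger URL. A minimal Node.js sketch; the event name, key and message are placeholders, and this is not the actual Ocularis code:

const https = require('https');

function fireIftttEvent(eventName, makerKey, value1) {
  const body = JSON.stringify({ value1 });   // IFTTT passes value1/value2/value3 on to the action
  const req = https.request({
    hostname: 'maker.ifttt.com',
    path: `/trigger/${eventName}/with/key/${makerKey}`,
    method: 'POST',
    headers: { 'Content-Type': 'application/json', 'Content-Length': Buffer.byteLength(body) },
  }, res => console.log('IFTTT responded with', res.statusCode));
  req.on('error', console.error);
  req.end(body);
}

// Hypothetical usage: fire an event when the VMS reports motion on a camera.
fireIftttEvent('ocularis_alarm', 'YOUR_MAKER_KEY', 'Motion on camera 3');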

For example, IFTTT has a geofence trigger, so when someone enters an area, you can pop the appropriate camera via a blank screen. The response time of IFTTT is too slow, and I don’t consider it reliable enough for serious surveillance applications, but it’s a good illustration of the possibilities of an open system. Because I am lazy, I made a trigger based on Twitter; that way I did not have to leave the house.

Making an HTTP event source did not require any changes to Ocularis itself; it could be trivially added to previous versions if one wanted to. But even if we have a completely open system, it doesn’t mean that people will utilize it.

 

 


Camera Proxy

There’s a lot of paranoia in the industry right now, some warranted, some not. The primary issue is that when you plug something into your network you basically have to trust the vendor to not spy on you “by design” and to not provide a trivial attack vector to 3rd parties.

First things first. Perhaps you remember that CCTV means Closed Circuit Television. Pay attention to those first two words. I am pretty sure 50% or more of all “CCTV” installations are not closed at all. If your CCTV system is truly closed, there’s no way for the camera to “call home”, and it is impossible for hackers to exploit any attack vectors because there’s no access from the outside world to the camera. There are plenty of PCs running terrible and vulnerable software out there, but as long as these systems are closed, there’s no problem. Granted, it also limits the flexibility of the system. But that’s the price you pay for security.

At the opposite end of the spectrum are cameras that are directly exposed to the internet. This is a very bad idea, and most professionals probably don’t do that. Well… some clearly do, because a quick scan of the usual sites reveals plenty of seemingly professional installations where cameras are directly accessible from the internet.

To expose a camera directly to the internet you usually have to alter the NAT tables in your router/firewall. This can be a pain in the ass for most people, so another approach, called hole-punching, is used instead. This requires a STUN server between the client sitting outside the LAN (perhaps on an LTE connection via AT&T) and the camera inside the LAN. The camera registers with the STUN server via an outbound connection; almost all routers/firewalls allow outbound connections. The way STUN servers work probably confuses some people, and they freak out when they see the camera making a connection to a “suspicious” IP, but that’s simply how things work, and not a cause for alarm.

Now, say you want to record the cameras in your LAN on a machine outside your LAN – perhaps you want an Azure VM to record the video. How will the recorder on Azure (outside your LAN) get access to the cameras inside your LAN, unless you set up NAT and thus expose your cameras directly to the internet?

This is where the $10 camera proxy comes in (the actual cost is higher because you’ll need an SD card and a PSU as well).

So, here’s a rough sketch of how you can do things.

  • On Azure you install your favorite VMS
  • Install Wowza or EvoStream as well

EvoStream can receive an incoming RTMP stream and make it available via RTSP; it basically changes the protocol, but passes the same video packets through (no transcoding). So, if you were to publish a stream at, say, rtmp://evostreamserver/live/mycamera, that stream will be available at rtsp://evostreamserver/mycamera. You can then add a generic RTSP camera that reads from rtsp://evostreamserver/mycamera to your VMS.

The next step is to install the proxy; you can use a very cheap Pi clone or a regular PC.

  • Determine the RTSP address of the camera in question
  • Download FFMpeg
  • Set up FFMpeg so that it publishes the camera to EvoStream (or Wowza) on Azure

Say you have a camera that streams via rtsp://192.168.0.100/video/channels/1; the command looks something like this (all on one line):

ffmpeg -i rtsp://username:password@192.168.0.100/video/channels/1 
-vcodec copy -f flv rtmp://evostreamserver/live/mycamera

This will make your PC grab the AV from the camera and publish it to the EvoStream server on Azure, but the camera is not directly exposed to the internet. The PC acts as a gateway, and it only creates an outbound connection to another machine that you also control.

You can now access the video from the VMS on Azure, and your cameras are not exposed at all, so regardless of how vulnerable they are, they will not expose any attack vectors to the outside world.

Using Azure is just an example; the point is that you want to isolate the cameras from the outside world, and this can be trivially accomplished by using a proxy.

As a side note: if cameras were deliberately spying on their users, by design, this would quickly be discovered and published. That there are bugs and vulnerabilities in firmware is just a fact of life and not proof of anything nefarious, so calm down, but take the necessary precautions.

TileMill and Ocularis

A long, long time ago, I discovered TileMill. It’s an app that lets you import GIS data, style the map and create a tile-pyramid, much like the tile pyramids used in Ocularis for maps.


There are 2 ways to export the map:

  • Huge JPEG or PNG
  • MBTiles format

So far, the only supported mechanism for getting maps into Ocularis is via a huge image, which is then decomposed into a tile pyramid.

Ocularis reads the map tiles the same way Google Maps (and most other mapping apps) reads the tiles. It simply asks for the tile at x,y,z and the server then returns the tile at that location.

We’ve been able to import Google Map tiles since 2010, but we never released it for a few reasons:

  • Buildings with multiple levels
  • Maps that are not geospatially accurate (subway maps for example)
  • Most maps in Ocularis are floor plans, and going through Google Maps is an unnecessary extra step
  • Reliance on an external server
  • Licensing
  • Feature creep

If the app relies on Google’s servers to provide the tiles, and your internet connection is slow, or perhaps goes offline, then you lose your mapping capability. To avoid this, we cache a lot of the tiles. This is very close to bulk downloading, which is not allowed. In fact, at one point I downloaded many thousands of tiles, which caused our IP to get blocked on Google Maps for 24 hours.

Using MBTiles

Over the weekend I brought back TileMill and decided to take a look at the MBTiles format. It’s basically a SQLite DB file, with each tile stored as a BLOB. Very simple stuff, but how do I serve the individual tiles over HTTP so that Ocularis can use them?

Turns out, Node.js is the perfect tool for this sort of thing.

Creating an HTTP server is trivial, and opening a SQLite database file is just a couple of lines. So with less than 50 lines of code, I had made myself an MBTiles server that would work with Ocularis.


A few caveats: Ocularis has the Y axis pointing down, while MBTiles has it pointing up; flipping the Y axis is simple. Ocularis has the highest-resolution layer at layer 0, while MBTiles has that inverted, so the “world tile” is always layer 0.
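For illustration only – this is not the actual implementation – a server along those lines could look roughly like this, assuming Node’s built-in http module, the better-sqlite3 package, the standard MBTiles schema, and a made-up MAX_ZOOM for the layer inversion:

const http = require('http');
const Database = require('better-sqlite3');

const db = new Database('map.mbtiles', { readonly: true });
const getTile = db.prepare(
  'SELECT tile_data FROM tiles WHERE zoom_level = ? AND tile_column = ? AND tile_row = ?');
const MAX_ZOOM = 18;   // hypothetical; depends on the exported tile set

http.createServer((req, res) => {
  // Assumed URL pattern: /level/x/y
  const [level, x, y] = req.url.split('/').filter(Boolean).map(Number);
  const z = MAX_ZOOM - level;            // Ocularis: layer 0 is the highest resolution
  const tmsY = Math.pow(2, z) - 1 - y;   // flip Y: MBTiles counts rows from the bottom
  const row = getTile.get(z, x, tmsY);
  if (!row) { res.writeHead(404); res.end(); return; }
  res.writeHead(200, { 'Content-Type': 'image/png' });
  res.end(row.tile_data);
}).listen(8080);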

So with a few minor changes, this is what I have.

 

I think it would be trivial to add support for ESRI tile servers, but I don’t really know if this belongs in a VMS client. The question is whether the time would not be better spent making it easy for the GIS guys to add video capabilities to their apps, rather than having the VMS client attempt to be a GIS planning tool.

 


Razor Software

It dawned on me that VMS software, in many cases, is equivalent to razors… or pizzas, or burgers. It’s reaching the point where it’s really hard to sell anything better than just “good enough”. Sure, there are 5-blade razors with gels to soothe the skin, and there are “gourmet” burgers and pizzas (I still prefer Wendy’s, and I’ve been to a lot of burger places). So when peddling easily fungible goods, unless you can sell some sort of “experience”, you’ll be competing on price.

I was recently on vacation in Greece, and I noticed that the cash register software was (apparently) DOS-based and hadn’t been updated since 2004. The service was as fast as anywhere else, and the software had no issue looking up the price of an item as quickly as the operator could scan the barcode or manually key in the item. A local hardware store here in Copenhagen has a similar system, and the Luddite discount supermarket chain, Aldi, didn’t have barcode scanners until recently.

In the good old days (of scurvy and gout), the store would simply write down what was sold, and at what price, in a little black book. Back then, I doubt there were price tags on items, as it was the Quakers who introduced fixed prices.


Cash register by the National Cash Register Co., Dayton, Ohio, United States, 1915.

The first register came about when saloon owner James Ritty got fed up with his employees stealing from him, so he invented the cash register – a contraption he called “Ritty’s Incorruptible Cashier”. Soon after, the invention was sold to the company that later became NCR.

NCR went on to add bookkeeping features to the cash register, so not only would it prevent the employees from stealing (its intended purpose), it would also make your life easier as a shopkeeper.

Electronic cash registers were the next step. Clearly, going from a mechanical contraption to an electronic version made sense. It was cheaper, more reliable, faster and had more features.

Then came the DOS-based POS systems, which offered yet more flexibility, were easier to use, and could interface with more printers and cash drawers.

Today, the local hipster hangout coffee shop down the street is using an iPad with a credit card scanner attached.

As I see it, the embedded electronic register is equivalent to a DVR with some NTSC cameras. The DOS-based register corresponds to the Windows-based IP video recorders that are fairly common today. And finally, a plug’n’play video surveillance solution in the DropCam vein is the peer of the iPad POS software.

My thinking is this: a tool that solves a particular, well-defined problem won’t be upgraded, at least not until the new tool offers a substantially better ROI. Adding bookkeeping to the mechanical cash register made sense, but I doubt people replaced their old, working mechanical registers with the slightly improved variant. New customers will buy the feature-rich variant if the price points are similar, but it’s very hard to charge a substantial premium for features that really aren’t needed, and even harder to convince people to replace a working system with something more expensive that only offers marginally useful features. I suspect that the reason I am seeing a lot of stores with DOS (or DOS-like) registers is that the functionality is simply good enough, and there’s no meaningful ROI in upgrading to, say, a Windows 10 or iOS version.

And that brings me to VMS systems. Clearly, the analog systems were a pain in the ass: tapes that wore out, rendering the system useless when something did happen, terrible low-light performance, and so on. The DVR systems were substantially better, causing people to upgrade. And for a while, IP-based cameras were very, very expensive, but as prices came down, they started to offer a substantially better ROI than DVRs (in many respects, but not all).

The key point here is “good enough”: I don’t think we are quite there yet, but eventually we will arrive at a point where all VMS/NVRs will be pretty much plug and play. It’ll be a piece of cake to replace a camera (why the hell is this still a challenge on some VMSs?), and hopefully no one will spend hours tweaking resolutions, messing around with VBR and CBR, or fiddling with “motion sensitivity” settings.

If I had my way, I’d let the “other guys” come up with new condiments and ways to serve their turd-sandwiches, and instead offer cupcakes. Some people don’t like cupcakes, some prefer donuts, but I bet you that most people would pick a cupcake over any turd-sandwich. There are always people who complain – Monday-morning quarterbacks (some have made a career of it) – who demand some weird condiment that comes with turd-sandwich #2, and in some cases, sinking your teeth into a turd-sandwich is simply the only way to achieve some narrow goal.

I bet that once someone offers a cupcake, it’s game over for the turd-sandwich stores.

Design is About Intent

Regular readers know that I have written about this problem before. I wholeheartedly agree with John R. Moran’s observation:

Delegating is by far the most subtle, pernicious, and widespread of the three evasions, particularly among tech companies. Under the guise of being “user-driven” or providing “choice,” delegators leave crucial design decisions up to the user

Design Is About Intent

Cloudy with a Chance of…

Yesterday I questioned the research made public via a press release from Axis. Apparently 40% of the IT respondents said they leveraged cloud-based storage. That sounded odd to me. It was a surprisingly high number, and it didn’t jibe with my experience – but I could be wrong. Perhaps sending an AVI via Gmail is considered “leveraging the cloud” 🙂

For surveillance of a private home I feel that a cloud-based solution is absolutely the way to go, and a few companies are gunning for that market (DropCam, CamCloud…). I thought Axis might grab that market, but so far they seem very reluctant to upset any of the VMS vendors by going that route, and so perhaps that market will be left open to the new kids on the block (Foscam, Dahua, Hikvision, etc.).

Setting up a service in the cloud is exceedingly simple these days – there are the traditional hosting solutions, and then there are the enterprise ones: Microsoft Azure and Amazon. Microsoft offered me (and pretty much every other developer) a chance to set up a node in their cloud, and within one day I had a (very) simple video recorder set up “in the cloud”.

Now, as I am recording my own home, I only record video “on exception”. So ideally, I would never record anything, because no exceptions occur. I might feel warm and fuzzy about my system if I capture the garbage truck and the mailman every day, but I do not intend the cameras to record me eating dinner with my family, nor record the family cat as it drags its lazy ass across the floor towards its food bowl. I don’t need to capture what tool the burglar used to pry open the windows, nor whether he entered through the front or the back of the house – those things are evident without video. What I need is one (or more) great snaps of the burglar. That footage, I’d like to store in the cloud as it happens (hopefully never).

Sending noise frames to the server all through the night makes no sense either – the camera should be smart enough to know whether the image it is sending contains “meaningful” information. For example, if you are using AGC, the noise level rises during low light, sometimes to the point where motion detection triggers again and again. Ideally, I’d like my camera to come with a PIR sensor and turn the lights on if something is amiss (and I think some of the new players are going that route).

This is totally different from recording what goes on at a gas station, a restaurant or a bank. While I consider a cloud-based solution absolutely appropriate for private homes, I find it totally inappropriate in areas where continuous recording is the norm. Perhaps a hybrid solution would make sense (cloud-based after hours, local recording during opening hours).

Default Passwords and ONVIF

Update:
Before you judge me borderline insane: in this post, I am talking about FACTORY DEFAULT passwords – for example: mobotix

My darling ONVIF, you’ve come of age and I tried to woo you. But you turned me down. Again.

A while back I decided to take the plunge, and have some fun with ONVIF. Axis knows I’ve been very happy with ONVIF’s older, and more mature, sister VAPIX. VAPIX is damn nice. But there’s a certain allure to ONVIF, the young, promiscuous rebel, and I wanted to see if I could tame her too.

ONVIF provides a handy mechanism for detecting ONVIF cameras on the local subnet. Easy peasy. Got all the cameras in a jiffy. Next step was to get some attributes about each camera. And suddenly the approachable darling turned out to be an outright bitch.

Usually, using a web service is a one-two-three step process. Very simple, which is important if you want any sort of penetration. Unfortunately, the camera in question decided that I wasn’t worthy of a response. Normally, I would have given up, but I was in a fighting mood, so after a few hours of searching high and low, I found a piece of code that allowed me to authenticate properly with the camera. That, in my opinion, is fail #1. I doubt that there would be any way for me to figure out what the hell was wrong by looking at the authentication failure error code, and it’s not as if the ONVIF site makes it clear either. Now that I have spent a day looking for it, I am going to be an asshole and not share the solution until my own thing is good and done.

A small part of my problems is that I used the root account to access the camera. The root user (built into Axis cameras) is not an “ONVIF user”. I can – apparently – create an ONVIF user by using the root credentials and some ONVIF WSDL, but I haven’t tried that yet. My workflow would then be: detect cameras, then connect to each camera to get its capabilities using some user-supplied credentials (say onvif_user:1234). That may fail because the ONVIF user hasn’t been created yet, so I will then have to use the VAPIX root account (which the user also has to supply the password for) to create the onvif_user account. THEN I will finally be able to do ONVIF. It’s a damn mess from the user’s perspective, especially because it’s a really bad idea to have the same root password on all the cameras.

It seems to me that the lack of an ONVIF default user is a problem.

Ideally, you’d plug in your ONVIF cameras, and the DHCP server would give them an IP with a long lease. We would then find the cameras on the network using the default credentials. Once you decide to import a camera, the NVR should change the camera’s password and store it in an encrypted file. This way the cameras are easy to install, and you maintain security.
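As a sketch of that last step only – none of this is existing NVR code, and the key handling is deliberately left out – generating and storing a per-camera password could look something like this in Node.js:

const crypto = require('crypto');
const fs = require('fs');

const MASTER_KEY = crypto.randomBytes(32);   // in reality this would come from a protected key store

function storeCredential(cameraId, password) {
  const iv = crypto.randomBytes(12);
  const cipher = crypto.createCipheriv('aes-256-gcm', MASTER_KEY, iv);
  const encrypted = Buffer.concat([cipher.update(password, 'utf8'), cipher.final()]);
  const record = {
    cameraId,
    iv: iv.toString('hex'),
    tag: cipher.getAuthTag().toString('hex'),
    data: encrypted.toString('hex'),
  };
  fs.appendFileSync('credentials.enc', JSON.stringify(record) + '\n');
}

// A newly imported camera gets a random password instead of keeping the factory default.
storeCredential('camera-00408C123456', crypto.randomBytes(18).toString('base64'));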

The way it works now is too cumbersome and error-prone, and it doesn’t scale well. I don’t want my users to fiddle with spreadsheets for each installation.

I’ve created a small page where you can, if you like, see and add default credentials for various cameras.

List of default usernames and passwords

Let’s work together and make ONVIF viable.


One Auga Per Ocularis Base*

In the Ocularis ecosystem, Heimdall is the component that takes care of receiving, decoding and displaying video on the screen. The functionality of Heimdall is offered through a class called Auga. So, to render video, you need to create an Auga object.

Ocularis was designed with the intent of making it easy for a developer to get video into their own application. Initially it was pretty simple – instantiate an Auga instance, pass in a URL, and voilà, you had video. But as we added support for a wider range of NVRs, things became a little more complex. Now you need to instantiate an OCAdapter, log into an Ocularis Base server, and then pass the cameras to Auga via SetCameraIDispatch before you can get video. The OCAdapter, in turn, depends on a few NVR drivers, so deployment became more complex too.

One of the most common problems I see today is that people instantiate one OCAdapter and one Auga instance per camera. This causes all sorts of problems: each instance counts as one login (which is a problem on login-restricted systems), every instance consumes memory, and memory for fonts and other graphics is not shared between the instances. In many ways I should have anticipated this type of use, but on the other hand, the entire Ocularis Client uses Heimdall/Auga as if it were a 3rd party component, and that seems to work pretty well (it’s getting a little dated to look at, but hey…).

Heimdall also offers a headless mode. We call it video hooks, and it allows you to instantiate an Auga object and get decoded frames via a callback, or a DLL, instead of having Auga draw them on the screen. The uses for this are legion: I’ve used the video hooks to create a web interface, until recently we used them for OMS too, and video analytics can use the hooks to get live video in less than 75 lines of code. Initially the hooks only supported live video, but they now support playback of recorded video too. But even when using Auga for hooks, you should only ever create one Auga instance per Ocularis Base. One Auga instance can easily stream from multiple cameras.

However, while Heimdall is named after a god, it does not have magical capabilities. Streaming 16 cameras at 5 MP and 30 fps will tax the system enormously – even on a beefy machine. One might be tempted to say, “Well, the NVR can record it, so why can’t Auga show it?” Well, the NVR does not have to decode every single frame completely to determine the motion level, but Auga has to decode everything, fully, all the way to the pixel format you specify when you set up the hook. If you specify BGR as your expected pixel format, we will give you every frame as a fully decoded BGR frame at 5 MP. Unfortunately, there is no way to decode only every second or third frame. You can go to I-frame-only decoding (we do not support that right now), but that lowers the frame rate to whatever the I-frame interval allows, typically 1 fps.
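To put rough numbers on that: 5 MP at 3 bytes per BGR pixel is about 15 MB per decoded frame, so one camera at 30 fps produces roughly 450 MB of raw pixels per second, and 16 such cameras roughly 7 GB per second – before your own code has touched a single pixel.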

If you are running Auga in regular mode, you can show multiple cameras by using the LoadLayoutFromString function. It allows you to create pretty much any layout that you can think of, as you define the viewports via a short piece of text. Using LoadLayoutFromString (account required), you do not have to handle maximizing of viewports, etc.; all that is baked into Auga already. Using video hooks, you can set up (almost) any number of feeds via one Auga instance.


Granted, there are scenarios where making multiple Augas makes sense – certainly, you will need one per window (and hence the asterisk in the headline), and clearly if you have multiple processes running, you’d make one instance per process.

I’ll talk about the Direct3D requirement in another post.