InfluxDB and Grafana


When buzzards are picking at your eyes, maybe it’s time to move a little. Do a little meandering, and you might discover that the world is larger, and more fun, than you imagined. Perhaps you realize that what was once a thriving oasis has now turned into a putrid cesspool riddled with parasites.

InfluxDB is what’s known as a “time-series database”. The idea is that it’s a database that collects samples over time. Once a sample is collected, it doesn’t change. Eventually the sample gets too old and is discarded. This is different from traditional databases, where values may change over time and the deletion of records is not normally based on age.

This sounds familiar, doesn’t it?

Now, you probably want to draw some sort of timeline, or graph, that represents the values you popped into InfluxDB. Enter Grafana. It’s a dashboard designer that can interface with InfluxDB (and other databases too) and show pretty graphs and tables in a web page w/o requiring any HTML/JavaScript coding.

If you want to test this wonderful combination of software, you’ll probably want to run Docker, and visit this link.

Now, I’ve already abandoned the idea of using InfluxDB/Grafana for the kind of stuff I mess around with. InfluxDB’s strength is that it can return a condensed dataset over a potentially large time-range. And it can make fast and semi-complex computations over the samples it returns (usually of the statistical kind). But the kind of timeline information I usually record is not complex at all, and there aren’t really any additional calculations I can do over the data. E.g. what’s the average of “failed to connect” and “retention policy set to 10 days”.

InfluxDB is also schema-less. You don’t need to do any pre-configuration (other than creating your database), so if you suddenly feel the urge to create a measurement (InfluxDB’s rough equivalent of a table) called “dunning”, you just insert some data into “dunning”. You don’t need to define columns or their types; you just insert data.

And you can do this via a standard HTTP call, so you can use curl on the command line, or use libcurl in your C++ app (which is what I did).
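For instance, creating a database and inserting a point into “dunning” is just two HTTP calls. This is a sketch against a local InfluxDB 1.x on its default port; the database name, tag, and value are made up:

# database name, tag and value below are just examples
curl -XPOST 'http://localhost:8086/query' --data-urlencode 'q=CREATE DATABASE mydb'
curl -XPOST 'http://localhost:8086/write?db=mydb' --data-binary 'dunning,source=test value=1'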

The idea that you can issue a single command to do a full install of InfluxDB and Grafana, and then have it consume data from your own little app in about the time it takes to ingest a cup of coffee says a lot about where we’re headed.

Contrast that with the “open platforms” that require you to sign an NDA, download SDKs, compile DLLs, test on 7 different versions of the server, and still have to nurse the result every time there’s a new version. Those systems will be around for a long time, but I think it’s safe to say they’re way past their prime.

 

 

The Singleton Anti-Pattern

In programming, the whole idea is to avoid re-inventing the wheel, and to re-use as much as possible. Some clever coders discovered that there were some mechanisms that were used over and over again. For example, the “producer/consumer” mechanism, whereby one or more threads are “producers” and one or more threads are “consumers”. Instead of coders figuring out how to do this properly over and over again, a group of people decided to write a book that described how to solve some of these problems. “Design Patterns: Elements of Reusable Object-Oriented Software” they called it. In the business, the authors became known as the “Gang of Four”.

One of the patterns they described is a “Singleton”: A singleton is essentially a global object that is instantiated when needed. The idea being that the user doesn’t need to know when, or how, the underlying object is created/destroyed; they can just use it, and all parts of the code then share the same object. Isn’t that cool? It’s like global variables were suddenly being endorsed in a book, and by some clever people too!

There are cases (rare, constrained) where a global variable makes sense: when the physical thing that the software is trying to model truly is a single object. E.g. a single file on a disk, or a specific camera in a network. It’s perfectly appropriate to model these objects as global, because there really is only one of them.

Let’s consider a log mechanism. There may be several things that are logging data, but if all that data goes into just one file, then it’s OK to use a singleton for the file, but certainly not for the log abstractions. If there are three or four different modules that are all logging to the same file, then those modules must have their own logger instance, and the various instances can then write to the same file via the singleton.

A primitive class diagram could look like this:

             Module A -> Log A 
Parent  ->                        -> Singleton File
             Module B -> Log B

When you are acutely aware of this composition, you should eventually realize that each logger instance must add some identifier when it writes to the disk. Otherwise you get a log file that looks like this

File Open
File Open
File Write Failed
File Write Succeeded
File Close
File Close

What you want, in the file, is this

Module A: File Open
Module B: File Open
Module B: File Write Failed
Module A: File Write Succeeded
Module B: File Close
Module A: File Close
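To make the composition concrete, here is a small sketch in JavaScript (to match the rest of this blog). Only the shared file is the singleton; the module names and the createLogger helper are made up for illustration:

// the singleton: one shared log file for the entire process
const fs = require('fs');
const logFile = fs.createWriteStream('app.log', { flags: 'a' });

// illustrative helper: each module gets its own logger instance...
function createLogger(moduleName) {
  return {
    log: function (message) {
      // ...but they all write to the same underlying file,
      // prefixed with the module's identity
      logFile.write(moduleName + ': ' + message + '\n');
    }
  };
}

const logA = createLogger('Module A');
const logB = createLogger('Module B');

logA.log('File Open');
logB.log('File Open');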

This appears to solve the problem; except there’s a caveat. Say someone writes an app that creates two instances of the parent module. Since the log file is a singleton, all log data is written to the same file. This, in turn, means that two instances of the parent will also write to the same file.

Consider this diagram

                              Module A -> Log A
                 Parent ->               
                              Module B -> Log B
Aggregator  ->                                       -> Singleton File
                              Module A -> Log A
                 Parent ->
                              Module B -> Log B

We are now in hell.

Module A: File Open
Module B: File Open
Module B: File Write Failed
Module A: File Open
Module B: File Write Failed
Module A: File Write Succeeded
Module B: File Close
Module A: File Write Succeeded
Module A: File Close

This issue is relatively easy to fix, and it’s still valid to have a requirement that there is just one log file (might be better to create one per parent, but that’s a matter of taste).

But what about cases where things like username, password, preferences, etc. are stored in a singleton that contains “user info”? In that case, when the aggregator sets the username, the change applies to ALL modules, regardless of where they reside in the aggregator tree. It’s therefore impossible for the aggregator to set a different username for Parent 1 and Parent 2. The aggregator, therefore, breaks.

Essentially, the coder might as well have said “let’s make the username a global variable”. 99% of all coders will object when they hear that (or “goto”). But 50% of all coders remain silent when the same pattern is described using the “singleton” moniker.

The moral of the story: don’t use singletons. Not even if you think you know what you are doing. Because if you think you know what you are doing, then you almost certainly do not.

 

World’s Shittiest NVR pt. 4

We now have a circular list of video clips in RAM and a way to be notified when something happens; what’s left is to move the clips from RAM to permanent storage when that happens.

In part 1 we set up FFmpeg to write to files in a loop; the files were called 0.mp4, 1.mp4 … up to 9.mp4, each file representing 10 seconds of video. We can’t move the file that FFmpeg is currently writing to, so we’ll do the following instead: we will copy the previous file that FFmpeg completed, and we’ll keep doing that for a minute or so. This means that the file covering the 10 seconds before the event occurred gets copied to permanent storage. Then, when the file that was being written while the event happened is closed, we’ll copy that file over, then the next, and so on.

We’ll use a node module called “chokidar”, so, cd to your working directory (where the SMTP server code resides) and type:

npm install chokidar

Chokidar lets you monitor files or directories and gives you an event when a file has been altered (in our case, when FFmpeg has added data to the file). Naturally, if you start popping your own files into the RAM disk and edit those files, you’ll screw up this delicate/fragile system (read the title for clarification).

So, for example if my RAM disk is x:\ we can do this to determine which is the newest complete file:

const chokidar = require('chokidar');

// keep track of the file FFmpeg is currently writing to,
// and the file it most recently completed
var currentlyModifiedFile = null;
var lastFileCreate = null;

chokidar.watch('x:\\.', {ignored: /(^|[\/\\])\../}).on('all', (event, path) => {

    // we're only interested in files being written to
    if ( event != "change" )
      return;

    // are we writing to a new file?
    if ( currentlyModifiedFile != path )
    {
      // now we have the last file created
      lastFileCreate = currentlyModifiedFile;
      currentlyModifiedFile = path;
    }
});

Now, there’s a slight snag that we need to handle: Node.js’s built-in fs module can’t copy files from one device (the RAM disk) to another (the HDD), so to make things easy, we grab an extension library called “fs-extra”.

Not surprisingly

npm install fs-extra

So, when the camera tries to send an email, we’ll set a counter to some value. We’ll then periodically check if the value is greater than zero. If it is indeed greater than zero, then we’ll copy over the file that FFmpeg just completed and decrement the counter by one.

If the value reaches 0 we won’t copy any files, and just leave the counter at 0.

Assuming you have a nice large storage drive on e:\, and the directory you’re using for permanent storage is called “nvr”, we’ll set it up so that we copy from the RAM drive (x:\) to the HDD (e:\nvr). If your drive is different (it most likely is), then edit the code to reflect that change – it should be obvious what you need to change.

Here’s the complete code:

const smtp = require ( "simplesmtp");
const chokidar = require('chokidar');
const fs = require('fs-extra');

// some variables that we're going to need
var currentlyModifiedFile = null;
var lastFileCreate = null;
var lastCopiedFile = null;
var flag_counter = 0;
var file_name_counter = 0;

// fake SMTP server
smtp.createSimpleServer({SMTPBanner:"My Server"}, function(req) {
    req.accept();

    // copy files for the next 50 seconds (5 files)
    flag_counter = 10;
}).listen(6789);

// function that will be called every 5 seconds
// tests to see if we should copy files

function copyFiles ( )
{ 
  if ( flag_counter > 0 ) 
  { 
     // don't copy files we have already copied  
     // this will happen because we check the  
     // copy condition 2 x faster than files are being written 
     if ( lastCopiedFile != lastFileCreate ) 
     { 
        // copy the file to HDD 
        fs.copy (lastFileCreate, 'e:/nvr/' + file_name_counter + ".mp4", function(err) {     
           if ( err ) console.log('ERROR: ' + err); 
        });

        // files will be named 0, 1, 2 ... n 
        file_name_counter++;

        // store the name of the file we just copied 
        lastCopiedFile = lastFileCreate; 
     }
     
     // decrement so that we are not copying files  
     // forever 
     flag_counter--; 
  } 
  else 
  { 
     // we reached 0, there is no  
     // file that we copied before. 
     lastCopiedFile = null; 
  }
}

// set up a watch on the RAM drive, ignoring the . and .. files
chokidar.watch('x:\\.', {ignored: /(^|[\/\\])\../}).on('all', (event, path) => {
  // we're only interested in files being written to  
  if ( event != "change")  return;
   
  // are we writing to a new file?  
  if ( currentlyModifiedFile != path )  
  {  
     // now we have the last file created  
     lastFileCreate = currentlyModifiedFile;  
     currentlyModifiedFile = path;  
  }
});

// call the copy file check every 5 seconds from now on
setInterval ( copyFiles, 5 * 1000 );

So far, we’ve written about 70 lines of code in total, downloaded ImDrive, FFmpeg, Node.js and a few modules (simplesmtp, chokidar and fs-extra), and we now have a pre-buffer fully in RAM and a way to store things permanently. All detection is done by the camera itself, so the amount of CPU used is very, very low.

This is the UI so far:

[screenshot: folders]

In the next part, we’ll take a look at how we can get FFmpeg and nginx-rtmp to allow us to view the cameras on our phone, without exposing the camera directly to the internet.

 

 

World’s Shittiest NVR pt. 1

In this tutorial, we’ll examine what it takes to create a very shitty NVR on a Windows machine. The UI will be very difficult to use, but it will support as many cameras as you’d like, for as long as you like.

The first thing we need to do is to download FFmpeg.

Do you have it installed?

OK, then we can move on.

Create a directory on a disk that has a few gigabytes to spare. On my system, I’ve decided that the x: drive is going to hold my video. So, I’ve created a folder called “diynvr” on that drive.

Note the IP address of your camera, and the make and model too, and use Google to find the RTSP address of the camera streams. Many manufacturers (wisely) use a common format for all their cameras. Others do not. Use Google (or Bing if you’re crazy).

Axis:
rtsp://[camera-ip-address]/axis-media/media.amp

Hanwha (SUNAPI):
rtsp://[camera-ip-address]/profile[#]/media.smp

Some random Hikvision camera:
rtsp://[camera-ip-address]/Streaming/Channels/[#]

Now we’re almost ready for the world’s shittiest NVR.

The first thing we’ll do is to open a command prompt (I warned you that this was shitty). We’ll then CD into the directory where you’ve placed the FFmpeg files (just to make it a bit easier to type out the commands).

And now – with a single line, we can make a very, very shitty NVR (we’ll make it marginally better at a later time, but it will still be shit).

ffmpeg -i rtsp://[...your rtsp url from google goes here...] -c:v copy -f segment -segment_time 10 -segment_wrap 10 x:/diynvr/%d.mp4

So, what is going on here?

We tell FFmpeg to pull video from the camera’s address; that’s this part:

-i rtsp://[...your rtsp url from google goes here...]

we then tell FFmpeg not to do anything with the video format (i.e. keep it H.264, don’t mess with it):

-c:v copy

FFmpeg should save the data in segments, each with a duration of 10 seconds, and wrap around after 10 segments (100 seconds in all):

-f segment -segment_time 10 -segment_wrap 10

It should be fairly obvious how you change the segment duration and the number of segments so that you can keep something a little more useful than just 100 seconds of video (there’s an example further down).

And finally, store the files in mp4 containers at this location:

 x:/diynvr/%d.mp4

The %d part means that the segment number is used as the filename, so we’ll get files named 0.mp4, 1.mp4 … up to and including 9.mp4.

So, now we have a little circular buffer with 100 seconds of video. If anyone breaks into my house, I just need to scour through the files to find the swine. I can open the files using any video player that can play mp4 files. I use VLC, but you might prefer something else. Each file is 10 seconds long, and with thumbnails, in Explorer, you have a nice little overview of the entire “database”.
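As an example of tweaking the segment options mentioned above: if you would rather keep an hour of video in 60-second chunks than 100 seconds in 10-second chunks, the command could look like this (same placeholder URL as before, an untested sketch):

ffmpeg -i rtsp://[...your rtsp url from google goes here...] -c:v copy -f segment -segment_time 60 -segment_wrap 60 x:/diynvr/%d.mp4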

In the next part we will improve on these two things:

  • Everything gets stored
  • Constant writing to HDD

Oh, and if your camera is password protected (it should be), you can pass in the credentials in the rtsp url like so:

rtsp://[username]:[password]@[ the relevant url for your camera]
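Putting it all together, a complete command line might look something like this – the credentials here are made up and the URL path is the Axis-style one from above, so substitute your own:

ffmpeg -i rtsp://admin:secret@[camera-ip-address]/axis-media/media.amp -c:v copy -f segment -segment_time 10 -segment_wrap 10 x:/diynvr/%d.mp4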

 

Docker

I recently decided to take a closer look at Docker. Many of us have probably used virtualization at one point or other. I used Parallels on OSX some time ago to run Windows w/o having to reboot. Then there’s VirtualBox, VMware, and Hyper-V, which all allow you to run one OS inside another.

Docker does virtualization a bit differently.

The problem

When a piece of software has been created, it eventually needs to be deployed. If it’s an internal tool, the dev or a support tech will boot the machine, hunch over the keyboard and download and install a myriad of cryptic libraries and modules. Eventually, you’ll be able to start the wonderful new application the R&D dept. created.

Sometimes you don’t have that “luxury”, and you’ll need to create an automated (or at least semi-automated) installer. The end user will obtain it, usually via a download and then proceed to install it on their OS (often Windows) by running the setup.exe file.

The software often uses 3rd-party services and libraries that can cause the installer to balloon to several gigabytes. Sometimes the application needs a SQL server installation, but that, in turn, is dependent on some version of something else, and all of that junk needs to be packaged into the installer file – “just in case” the end user doesn’t have it installed.

A solution?

Docker offers something a bit more extreme: there’s no installer (other than Docker itself). Once Docker is running, you run a fully functional OS with all the dependencies installed. As absurd as this may sound, it actually leads to much smaller “installers”, and it ensures that you don’t have issues with services that don’t play nice with each other. A Debian image takes up about 100 MB (there’s a slim version at around 55 MB).

My only quibble with Docker is that it requires Hyper-V on Windows (which means you need a Windows Pro install, AFAIK). Once you install Hyper-V, it’s bye-bye VirtualBox. Fortunately, it is almost trivial to convert your old VirtualBox VM disks to Hyper-V.

There are a couple of things I wish were a bit easier to figure out. I wanted to get a hello-world type app running (a compiled app). This is relatively simple to do, once you know how.

How?

To start a container (with Windows as host), open PowerShell, and type

docker run -it [name of preferred OS]

You can find images on Docker Hub, or just google them. If this is the first time you run an instance of that OS, Docker will download the image. This can take some time (but once the image is downloaded, it will spawn a new container in a second or so). You’ll then get a shell (that’s what the -it does), and you can use your usual setup commands (apt-get install …, make folders and so on).
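For example, a first session could look something like this (Debian as the image; the package and folder names are just for illustration):

docker run -it debian

# ...and then, inside the container's shell:
apt-get update
apt-get install -y nano
mkdir /opt/myapp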

When you’re done, you can type “exit”

The next step is important, because the next time you enter “docker run -it [OS]”, all your changes will have disappeared! To keep your changes, you need to commit them. Start by entering this command:

docker ps -a

You’ll then see something like this

[screenshot: output of docker ps -a]

Take note of the Container ID, and enter

docker commit [Container ID] [image name]:[image tag]

For example

docker commit bbd096b363af debian_1/test:v1

If you take a look at the image list

docker images

You’ll see that debian_1/test:v1 is among them (along with the disk space used by that image).

You can now run the image with your changes by entering

docker run -it [image name]:[image tag]

You’ll have to do a commit every single time you want to save your changes. This can be a blessing (like a giant OS-level undo) and a curse (you forgot to commit before exiting), but it’s not something the end user would/should care about.

You can also save the docker image to a file, and copy it to another machine and run it there.
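That part is just two commands – here using the image name from the commit example above, and an arbitrary file name:

# save the image to a tar file
docker save -o debian_test_v1.tar debian_1/test:v1
# ...and load it on the other machine
docker load -i debian_test_v1.tar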

Granted, “Dockerization” and virtualization as a foundation are not for everyone, but for deployments that are handled by tech-savvy integrators, this approach might be very suitable.

It sits somewhere between an appliance and traditional installed software. It’s not as simple as the former, not as problematic as the latter.

Technical Innovation

This might be wonky, so if you’re not into that sort of thing, you can save yourself 5 minutes and skip this one.

Enterprise installations of software usually consist of several semi-autonomous clusters (or sites) that are arranged in some sort of hierarchy – usually a tree-like structure, but it could be a mesh or a graph.

Each cluster may have hundreds or even thousands of sensors (not just cameras) that can all be described by their state. A camera may be online, it may be recording, a PIR might be triggered or not, and so on. Basically, sensors have three types of properties: binary, scalar, and complex (e.g. an LPR reader that returns the tag and color of a car).

The bandwidth inside each cluster is usually decent (~Gbit), and if it isn’t, upgrading the network is usually a one-time cost. Sure, there might be sensors at remote locations where the bandwidth is low, but in most cases bandwidth is not an issue inside the cluster. Unfortunately, the bandwidth from the cluster to the outside world is usually not as ample.

It’s not uncommon that a user at the top of the hierarchy will see a number of clusters, and it seems reasonable that this user also wants to see the state of each of the sensors inside those clusters. This, then, requires that the state information of each sensor is relayed from the cluster to the client. The client may have a 100 Mbit downstream connection, but if the upstream path from the cluster to the client is a mere 256 Kbit/s, then getting the status data across in real time is a real problem.

In the coding world there’s an obsession with statelessness. Relying on states is hard, so is deleting memory after you’ve allocated it, and debugging binary protocols isn’t a cakewalk either, so programmers (like me) try to avoid these things. We have garbage collection (which I hate), we have wildly verbose and wasteful containers such as SOAP (which I hate), and we have polling (which I hate).

So, let’s consider a stateless design with a verbose container (but a lot more terse than SOAP): the client will periodically request the status.

client:
  <request>
    <type>status</type>
    <match>all</match>
  </request>

server:
  <response>
     <sensor id="long_winded_guid_as_text" type="redundant_type_info">
        <state>ON</state>
     </sensor>
  </response>

The client request can be optimized, but it’s the server response that has the biggest problem. If we look at the actual info in the packet, we are told that a sensor, identified by a GUID, is ON.

Let’s examine the entropy of this message.

The amount of data needed to identify the sensor depends heavily on the number of sensors in the system. If there are just 2 sensors in total, then the identifier requires 1 bit of information (0 or 1); if there are 4 sensors, you’ll need 2 bits (00, 01, 10 and 11); and so if there are 65,000 sensors, you’ll need 16 bits (2 bytes).

The state is binary, so you’ll need just a single bit to encode the state (0 for off, 1 for on).

Since the state information can be of 3 different types, you’ll need a status type identifier as well. Strictly speaking, this only needs to be encoded in the message when the type actually changes – if the status type stays the same, it could be cached on the receiving side – but in order to remain stateless, we’re going to spend 2 bits on the type in every message (00 for binary, 01 for scalar, 10 for complex).

So, for a system with 60,000 sensors, a message might look like this:

[ sensor id          ] [ type ] [ value ]
  0000 0000 0000 0001    00       1

19 bits in total

The wasteful message is 142 bytes long, or 1136 bits, or 59 times larger, and it is actually pretty terse compared to some of the stuff you’ll see in the real world. In terms of bandwidth, the compact and “archaic” binary protocol can push 59 times as many messages through the same pipe!

Now, what happens if we remove the statelessness? We could cache the status type on the receiving side, so we’d get down to 17 bits (a decent improvement), but we could also decide to not poll, and instead let the server push data only when things change.

The savings here are entirely dependent on the configuration of the system, and sensors with scalar values probably have to send their value continuously. A technique used when recording GPS info is to only let the device send its position when it has changed by more than a certain threshold or when a predefined duration has passed. That means that if you’re stationary, we only get location samples at, say, 1 sample per minute, but as you move, the sample rate increases to 1 per second. The same could be used for temperature, speed and other scalar values.

So, how often does a camera transition from detecting some motion, to not detecting any? Once per second? Once per minute? Once per hour? Probably all 3 over a 24 hour period, but on average, let’s say you have one transition every 5 minutes. If you’re running a 1 Hz polling rate vs. a sensor that reports state changes only, you’re looking at a 300x reduction in bandwidth.

In theory we’re now about 17000 times more efficient using states and binary protocols.

In practice, the TCP overhead makes it quite a bit less. We might also add a bit of goo to the binary protocol to make it easier to parse, and that’s the other area where things improve: processing speed.

Let’s loosen up the binary protocol a bit by byte-aligning each element. This increases the payload to 32 bits (still roughly a 10,000x improvement):

[ sensor id       ] [ type    ] [ value ]
0000 0000 0000 0001  0000 0001  0000 0001

It is now trivial, and extremely fast, to update the sensor value on the receiving side. Scalar values might be encoded as 32-bit floats, while complex statuses will take up an arbitrary amount of data, depending on the sensor.
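As a sketch of how cheap this is to produce and consume, here is what encoding and decoding the byte-aligned message could look like in Node.js (field sizes as above; the function names are made up for illustration):

// illustrative: encode a status message as 16-bit sensor id, 8-bit type, 8-bit value
function encodeStatus(sensorId, type, value) {
  const buf = Buffer.alloc(4);
  buf.writeUInt16BE(sensorId, 0); // sensor id (0 .. 65535)
  buf.writeUInt8(type, 2);        // 0 = binary, 1 = scalar, 2 = complex
  buf.writeUInt8(value, 3);       // the state itself (binary case)
  return buf;
}

// decode it again on the receiving side
function decodeStatus(buf) {
  return {
    sensorId: buf.readUInt16BE(0),
    type: buf.readUInt8(2),
    value: buf.readUInt8(3)
  };
}

// 4 bytes on the wire instead of ~142 bytes of XML
console.log(decodeStatus(encodeStatus(1, 0, 1)));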

The point is that sometimes, going “backwards” is actually the low hanging fruit, and sometimes there’s a lot of it to pluck.

The design of a system should take the environment into consideration as well. If you’re writing a module for a client where you have ample bandwidth, and time to market matters, it would be most appropriate to use wasteful but fast tools. This means that as part of a development contract, the baseline environment must be described. Failure to do so will eventually lead to problems when the client comes back, angry that the system you wrote in 2 days doesn’t work over a dial-up connection.