Debtors Prison

There’s a wonderful term called “technical debt”. It’s what you accrue when you make dumb mistakes, and instead of correcting the mistake, and taking the hit up front, you take out a small loan, patch up the crap with spittle and cardboard, and ship the product.

kid_credit
Yay! Free money!!!

Outside R&D technical debt doesn’t seem to matter. It’s like taking your family to a restaurant and racking up more debt; the kids don’t care, to them, the little credit card is a magical piece of plastic, and the kids are wondering why you don’t use it more often. If they had the card, it would be new PlayStations and drones every day.

Technical debt is a product killer; as the competition heats up, the company wants to “rev the engine”, but all the hacks and quick fixes mean that as soon as you step on the gas, the damn thing falls apart. The gunk and duct tape gave you a small lead out of the gate, but in the long run, the weight of all that debt will catch up. It’s like a car that does 0-60 in 3 seconds but then dies after 1 mile of racing. Sure, it might enter the race again and limp along for a few rounds, then go back to the garage, until it eventually gives up and drops out.

Duct Tape Car Fix - 03
Might get you home, but you won’t win the race with this fix

Why does this happen?

A company may masquerade as a software company and simply pile more and more resources into “just fix it” and “we need” tasks that ignore the real need to properly replace the intake pipe shown above. “If it works, why are you replacing it”, the suit will ask, “my customer needs a sunroof, and you’re wasting time on fixing something that already works!”.

So, it’s probably wise to look at the circumstances that caused the company to take on the debt in the first place. An actual software company might take technical debt very seriously, and very early on they will schedule time for 3 distinct tasks:

  1. Ongoing development of the existing product (warts and all),
  2. Continued re-architecting and refactoring of modules,
  3. Development of the next generation product/platform

Any given team (dependent on size, competency, motivation, and guidance) will be able to deliver some amount of work X. The company sells a solution that requires the work Y. Given that Y < X, the difference can be spent on #2 and #3. The bigger the difference, the better the quality of subsequent releases of the product. If the difference is small, then (absent team changes), the product will stagnate. If Y > X then the product will not fulfill the expectations of the customer. To bridge the gap until the team can deliver an X > Y, you might take on some “bridge debt”. But if the bridge debt is perpetual (Y always grows as fast or faster than X), then you’re in trouble. If Y > X for too long, then X might actually shrink as well, which is a really bad sign.
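The X versus Y reasoning can be played out as a toy model. This is only a sketch: the function name `simulate_debt`, the numbers, and the assumption that all spare capacity goes to paying debt down are mine, not a measurement of any real team.

```python
def simulate_debt(capacity, demand_per_release):
    """Track the accumulated shortfall ("bridge debt") release by release.

    capacity is X (work the team can deliver per release),
    demand_per_release is a list of Y values (work the sold solution needs).
    """
    debt = 0.0
    history = []
    for demand in demand_per_release:
        slack = capacity - demand  # X - Y for this release
        # Positive slack pays debt down (never below zero);
        # negative slack adds to the debt.
        debt = max(0.0, debt - slack)
        history.append(debt)
    return history


# Y creeps above X for two releases, then drops back below it.
print(simulate_debt(10, [8, 12, 12, 9]))
```

As long as Y stays above X the debt only grows; it starts shrinking only once Y falls back below X.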

Proper software architecture is designed so that when more (competent) manpower is added, X grows. Poor architecture can lead to the opposite result. And naturally, incompetent maintenance of the architecture itself (an inevitable result of a quick-fix culture) will eventually lead to the problematic situation where adding people leads to lower throughput.

A different kind of “debt” is the inability to properly value the IP you’ve developed. The cost of development is very different from the value of the outcome. E.g. a company may spend thousands of hours developing a custom log handler, but the value of such a thing is probably very low. This is hard to accept for the people involved, and it often leads to friction when someone points out that the outcome of 1000 hours of work is actually worthless (or possibly even provides a net negative value for the product). A lot of (additional) time may be spent trying to persuade ourselves that we didn’t just flush 1000 hours down the drain, as we’re more inclined to believe a soothing lie than the painful truth.

Solutions?

A company that wants to solve the debt problem must first take a good look at its core values. Not the values it pretends to have, but the actual values: what makes management smile, and how it handles the information it is given. Does management frown when a scalability issue is discovered? Do they yell and slam doors, and point out 20 times that “we will lose the customer if we don’t fix this now!”? The team lead hurries down the hallway, and the team pulls out cans of Pringles and starts ripping off pieces of tape.

The behavior might make the manager feel good. The chest-beating alpha-manager put those damn developers in their place, and got this shit done! However, over the long run, it will lead to 3 things: 1) developers will do a “quick fix”, because management wants this fixed quickly rather than correctly, 2) developers will stop providing “bad news”, and 3) developers that value correctness and quality will leave.

To the manager, the “quality developer” is not an asset at all. It’s just someone who wants to delay everything to fix an intake that is already working “perfectly”. So over time, the company will get more and more duct-tapers and hacks, and fewer craftsmen and artisans.

The only good thing about technical debt (for a coder) is that it belongs to the company, and not to the employees. Once they’re gone, they don’t have to worry about it anymore. Those that remain do, and they now have to work even harder to pay it back.


Technical Innovation

This might be wonky, so if you’re not into that sort of thing, you can save yourself 5 minutes and skip this one.

Enterprise installations of software usually consist of several semi-autonomous clusters (or sites) that are arranged in some sort of hierarchy. Usually it is a tree-like structure, but it could be a mesh or graph.

Each cluster may have hundreds or even thousands of sensors (not just cameras) that all can be described by their state. A camera may be online, it may be recording, a PIR might be triggered or not and so on. Basically, sensors have three types of properties: Binary, scalar and complex ones (an LPR reader that returns tag and color of a car).

The bandwidth inside each cluster is usually decent (~GBit), and if it isn’t, an upgrade of the network is usually a one-time fee. Sure, there might be sensors that are at remote locations where the bandwidth is low, but in most cases bandwidth is not an issue inside the cluster. Unfortunately, the bandwidth from the cluster to the outside world is usually not as ample.

It’s not uncommon that a user at the top of the hierarchy will see a number of clusters, and it seems reasonable that this user also wants to see the state of each of the sensors inside the cluster. This then requires that the state information of each sensor is relayed from the cluster to the client. The client may have a 100 Mbit downstream connection, but if the upstream path from the cluster to the client is a mere 256 Kbit/s, then getting the status data across in real time is a real problem.

In the coding world there’s an obsession with statelessness. Relying on states is hard, so is deleting memory after you’ve allocated it, and debugging binary protocols isn’t a cakewalk either, so programmers (like me) try to avoid these things. We have garbage collection (which I hate), we have wildly verbose and wasteful containers such as SOAP (which I hate) and we have polling (which I hate).

So, let’s consider a stateless design with a verbose container (but a lot more terse than SOAP), where the client periodically requests the status:

client:
  <request>
    <type>status</type>
    <match>all</match>
  </request>

server:
  <response>
     <sensor id="long_winded_guid_as_text" type="redundant_type_info">
        <state>ON</state>
     </sensor>
  </response>

The client request can be optimized, but it’s the server that is having the biggest problem. If we look at the actual info in the packet, we are told that a sensor, with a guid, is ON.

Let’s examine the entropy of this message.

The amount of data needed to specify the sensor depends heavily on the number of sensors in the system. If there are just 2 sensors in total, then the identifier requires 1 bit of information (0 or 1); if there are 4 sensors, you’ll need 2 bits (00, 01, 10 and 11), and so on. If there are 65000 sensors, you’ll need 16 bits (2 bytes).

The state is binary, so you’ll need just a single bit to encode the state (0 for off, 1 for on).

Since the state information can be of 3 different types, you’ll need a status type identifier as well. Strictly speaking, this only needs to be encoded in the message if the type actually changes; if it stays the same, it can be cached on the receiving side. But in order to remain stateless, we’re going to need 2 bits to encode the type (00 for binary, 01 for scalar, 10 for complex).

So, for a system with 60,000 sensors, a message might look like this:

[ sensor id          ] [ type ] [ value ]
  0000 0000 0000 0001    00       1

19 bits in total
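The bit counting can be sketched in a few lines of Python. The function names (`id_bits`, `pack_status`, `unpack_status`) are made up for illustration, and the layout simply follows the diagram: 16-bit id, 2-bit type, 1-bit value.

```python
def id_bits(sensor_count):
    """Bits needed to give each of sensor_count sensors a unique id."""
    return max(1, (sensor_count - 1).bit_length())


def pack_status(sensor_id, status_type, value, id_width=16):
    """Pack [ sensor id | 2-bit type | 1-bit value ] into one integer."""
    assert 0 <= sensor_id < (1 << id_width)
    assert 0 <= status_type < 4 and value in (0, 1)
    return (sensor_id << 3) | (status_type << 1) | value


def unpack_status(packed):
    """Inverse of pack_status: returns (sensor_id, status_type, value)."""
    return packed >> 3, (packed >> 1) & 0b11, packed & 1


message = pack_status(sensor_id=1, status_type=0b00, value=1)
print(format(message, "019b"))   # the 19 bits shown in the diagram
print(unpack_status(message))
```

A real wire format would still have to pad these 19 bits out to whole bytes, so the saving is theoretical at this point.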

The wasteful message is 142 bytes long, or 1136 bits, which is nearly 60 times larger, and it is actually pretty terse compared to some of the stuff you’ll see in the real world. In terms of bandwidth, the compact and “archaic” binary protocol can push nearly 60 times as many messages through the same pipe!

Now, what happens if we remove the statelessness? We could cache the status type on the receiving side, so we’d get down to 17 bits (a decent improvement), but we could also decide to not poll, and instead let the server push data only when things change.

The savings here are entirely dependent on the configuration of the system, and sensors with scalar values probably have to continuously send the value. A technique used when recording GPS info is to only let the device send its position when it has changed more than a certain threshold or when a predefined duration has passed. That means that if you’re stationary, we only have location samples at, for example, 1 sample per minute, but as you move, the sample rate increases to 1 per second. The same could be used for temperature, speed and other scalar values.
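The threshold-or-timeout idea might be sketched like this. The class name `ThresholdReporter` and its parameters are hypothetical; the thresholds would depend entirely on the sensor.

```python
class ThresholdReporter:
    """Report a scalar only when it moved enough or enough time has passed."""

    def __init__(self, min_delta, max_interval):
        self.min_delta = min_delta        # smallest change worth sending
        self.max_interval = max_interval  # seconds before we send anyway
        self.last_value = None
        self.last_time = None

    def should_send(self, value, now):
        """Return True if this sample should be pushed to the client."""
        if self.last_value is None:
            send = True  # always send the very first sample
        else:
            moved = abs(value - self.last_value) >= self.min_delta
            stale = (now - self.last_time) >= self.max_interval
            send = moved or stale
        if send:
            # Remember what the receiver last saw, and when.
            self.last_value, self.last_time = value, now
        return send


# A temperature sensor: send on a 0.5 degree change, or at least once a minute.
reporter = ThresholdReporter(min_delta=0.5, max_interval=60)
samples = [(0, 20.0), (1, 20.1), (2, 20.7), (30, 20.7), (62, 20.7)]
print([(t, v) for t, v in samples if reporter.should_send(v, t)])
```

Note that the comparison is against the last value that was sent, not the last value that was sampled, so slow drift still gets reported eventually.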

So, how often does a camera transition from detecting some motion, to not detecting any? Once per second? Once per minute? Once per hour? Probably all 3 over a 24 hour period, but on average, let’s say you have one transition every 5 minutes. If you’re running a 1 Hz polling rate vs. a sensor that reports state changes only, you’re looking at a 300x reduction in bandwidth.

In theory we’re now roughly 18,000 times more efficient using states and binary protocols: about 60x from the smaller message, multiplied by 300x from sending fewer messages.
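The back-of-the-envelope arithmetic can be reproduced as follows. The inputs are the figures quoted in the text (a 142-byte XML response, a 19-bit compact message, 1 Hz polling, one state change every 5 minutes); the variable names are mine.

```python
verbose_bits = 142 * 8           # the XML response, in bits
compact_bits = 16 + 2 + 1        # sensor id + type + value

size_ratio = verbose_bits / compact_bits          # roughly 59.8

poll_hz = 1                      # polls per second in the stateless design
seconds_between_changes = 5 * 60 # average time between state transitions
rate_ratio = poll_hz * seconds_between_changes    # 300 polls per real change

total_ratio = size_ratio * rate_ratio             # roughly 17,900

print(f"size: {size_ratio:.1f}x, rate: {rate_ratio}x, total: {total_ratio:.0f}x")
```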

In practice, the TCP overhead makes the gain quite a bit smaller. We might also add a bit of goo to the binary protocol to make it easier to parse, and that’s the other area where things improve: processing speed.

Let’s loosen up the binary protocol a bit by byte-aligning each element. This increases the payload to 32 bits (still roughly a 10,000x improvement).

[ sensor id       ] [ type    ] [ value ]
0000 0000 0000 0001  0000 0000  0000 0001

It is now trivial, and extremely fast to update the sensor value on the receiving side. Scalar values might be encoded as 32 bit floats, while complex statuses will take up an arbitrary amount of data depending on the sensor.
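A byte-aligned version could be handled with Python's `struct` module. This is a sketch under assumptions: big-endian byte order, the type codes 0/1/2 for binary/scalar/complex, and a 2-byte length prefix for complex payloads are all my choices, not part of any existing protocol.

```python
import struct

BINARY, SCALAR, COMPLEX = 0, 1, 2
HEADER = struct.Struct(">HB")  # 16-bit sensor id, 8-bit type, big-endian


def encode(sensor_id, status_type, value):
    """Build one byte-aligned status message."""
    header = HEADER.pack(sensor_id, status_type)
    if status_type == BINARY:
        return header + struct.pack(">B", 1 if value else 0)
    if status_type == SCALAR:
        return header + struct.pack(">f", value)  # 32-bit float
    # Complex: length-prefixed raw bytes (the prefix is an assumption).
    return header + struct.pack(">H", len(value)) + value


def decode(message):
    """Return (sensor_id, status_type, value) for one message."""
    sensor_id, status_type = HEADER.unpack_from(message, 0)
    body = message[HEADER.size:]
    if status_type == BINARY:
        return sensor_id, status_type, bool(body[0])
    if status_type == SCALAR:
        return sensor_id, status_type, struct.unpack(">f", body[:4])[0]
    (length,) = struct.unpack(">H", body[:2])
    return sensor_id, status_type, body[2:2 + length]


on = encode(1, BINARY, True)
print(len(on) * 8, "bits:", on.hex())
print(decode(encode(7, SCALAR, 21.5)))
print(decode(encode(9, COMPLEX, b"AB12345,red")))
```

Because every field sits on a byte boundary, the receiver can read the header at a fixed offset and dispatch on the type without any bit shifting.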

The point is that sometimes, going “backwards” is actually the low hanging fruit, and sometimes there’s a lot of it to pluck.

The design of a system should take the environment into consideration as well. If you’re writing a module for a client where you have ample bandwidth, and time to market matters, it would be most appropriate to use wasteful, but fast tools. This means that as part of a development contract, the baseline environment must be described. Failure to do so will eventually lead to problems when the client comes back, angry that the system you wrote in 2 days doesn’t work via a dial-up connection.

 

CamStreamer

Axis does not support RTMP as a transmission protocol in their firmware (correct me if I am wrong about this). This seems to be an omission, as RTMP offers a pretty good way for the camera to initiate an outbound connection to an RTMP service somewhere outside the network. I believe this is how DropCam works (again, an assumption).

As most probably know (or should know), an Axis camera is basically a small Linux computer with a camera attached. Axis will let you download a development environment, which you can then use to compile applications for the camera. This is called ACAP.

Before compiling an RTMP library for ACAP, I decided to check if someone else had done this. Turns out CamStreamer had done so, and I decided to give it a whirl.

Setup is extremely simple, and the UI looks modern and clean. It has support for some of the most popular streaming services, but I needed the RTMP ingest server (Universal)  as I have my own RTMP servers in the cloud.

Setting this stuff up was trivial.

The system pretty much worked like a charm with a single exception. During configuration, there was an error message telling me that the camera could not reach an outside host on port 1935 (the RTMP port). The error message is helpful and accurate. But later, I lost my internet connection completely (a screw-up at my ISP, where they switched me back to DHCP) and I was unable to enter the RTMP configuration page on the camera entirely. I guess this is because the HTML/configuration is served from a host in the cloud. I did not dig into it, but it struck me as a bit odd.

The ACAP application is pretty expensive: $299 for a 1-camera license (at the time of writing). This is pretty steep, as you can hook up a Raspberry Pi type device that will do the same for several cameras for less than half that.

I am not sure why RTMP support is not built into more cameras (again, correct me if I am wrong). The “push” model of RTMP is almost perfect for cloud-based solutions, so why this is not standard fare for IP cameras is beyond me.

While Flash Video uses RTMP, RTMP can be used w/o Flash being involved at all. E.g. if I publish a stream via RTMP to EvoStream, I can view the stream via RTSP instead (and what can you do with RTSP streams and your favorite VMS?). It’s true that the reliance on TCP for RTMP introduces a bit of latency, but in practice, this is hardly a problem (if you are going to use a cloud solution, latencies are going to be pretty high compared to a local one).

Are You Diffident?

It always amuses me when someone says “my personal opinion”. I find it strange, because the “personal” part is superfluous. If the person says “my opinion is that red is a nice color”, I assume that the person means what he says: To him, red is a nice color.

If I then give him a red shirt and he says “don’t like it, I hate red”, I would assume some mental illness at play…

“but.. but.. you just said…”, I stutter

“yes, but that was not my personal opinion”

“WAT?”,

Opinions don’t have to be personal, sometimes you’ll read “it is the opinion of the court” and things of that nature. But in those cases, it’s pretty clear that it is not the opinion of the person saying the words, that we are talking about. It would be exceedingly weird if the court clerk said “in my opinion, the accused is guilty”.

There are people with mental issues that have trouble with this concept. It is known as Dependent Personality Disorder. It basically means that you can’t have an opinion on anything; you constantly have to ask someone else what their opinion is and then act in accordance with that.

Someone who is deeply narcissistic (borderline?) might assume that everyone in the world, besides themselves, ought to suffer from DPD, and become upset and frustrated when people have opinions that do not align with what they are preaching.

The truth is that finding factual, verifiable information about IP cameras and software is getting easier every day (and this is an old video). Like most people, I don’t much care for what salespeople are saying if it can’t be verified or measured. If the salesperson can provide the raw data, I’ll take it. I will form my own opinion based on what I see. I don’t need some Gríma Wormtongue whispering into my ear.

With the commoditization of IP cameras and the increasing demand for true interoperability, we’re getting to a point where facts are valuable, whereas opinions are not (yep, this blog is free!!!). In some cases, though, arguments and opinions may be based not on an unbiased interpretation of facts, but on grudges and anger.

If you are paying for facts, you should definitely demand full disclosure; if you’re not, you need to ask yourself: am I reading verifiable facts, or just bullshit? You might ask: are manufacturers paying (directly or indirectly) the one stating opinions about either the manufacturer’s products or the products of the manufacturer’s competitors? If you’re being lied to in the full disclosure, you might be lied to elsewhere.

 

 

Password Security

When you’re dealing with user passwords, the site owner does not (or should not) want to know the actual password the user enters. A very good reason for not wanting to know the actual password is that if the database is leaked, in one way or the other, whoever gets hold of the database won’t know the passwords.

Hashing

What you do instead is save a hash of the password. A hash is a kind of checksum: for a given input, you get a fixed-length output value. A very simple (and useless) hash is to just add the ASCII values of each char together, taking modulo 256 of the sum.

So, when the user enters “23456” as his password, you calculate the sum of the ASCII values ( 50 + 51 + 52 + 53 + 54 ) = 260 and take 260 modulo 256, which gives “4” as the checksum. We then store the “4” in the database, instead of the “23456” string.

Username Password-Hash
Morten   4

The next time the user comes along, we do the same calculation, and compare the results. If the result is “4”, the user gets access. If the user, by mistake, entered “23457”, we’d get the hash of “5” and we’d reject the user.

But wait a minute, the astute reader yells out, banging his fist on the table: “the password “65432” would give the exact same hash!”. And the reader would be correct. This is known as a collision, and is why some hashes are good, and others are bad. In my idiot hash algorithm, it is completely trivial for someone to find a combination of letters that gives “4” as the output.

Instead of a useless hash, you might use one that is considered “good”. The SHA family of hashes is generally considered the gold standard of hashes. Bitcoin, for example, uses SHA-256 to provide Proof of Work.
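The useless hash is short enough to write out in full. The function name `toy_hash` is made up, and the sketch assumes ASCII-only passwords.

```python
def toy_hash(password):
    """Sum of the ASCII values of each character, modulo 256. Do not use!"""
    return sum(password.encode("ascii")) % 256


print(toy_hash("23456"))  # the stored value
print(toy_hash("23457"))  # a typo gives a different hash and is rejected
print(toy_hash("65432"))  # same digits, different order: a collision
```

Any reordering of the same characters collides, because addition does not care about order.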

Salting

So, let’s say we pick SHA-1, and we store the output of that. In that case we’d get

SHA1( "23456" ) = c24d0a1968e339c3786751ab16411c2c24ce8a2e

But what if two users have the same password? Consider these two rows in the db

Username    Password-Hash
Morten      c24d0a1968e339c3786751ab16411c2c24ce8a2e
John        c24d0a1968e339c3786751ab16411c2c24ce8a2e

If I know the password of “Morten”, I also now know the password of “John”.

What you do, instead, is to add a “salt”. The salt is just a random string that I append to the password before computing the hash. For example, with salts, we’d get

Username    Password-Hash                               Salt
Morten      ac5740fde13da84ba4a5266ce9e9b7d697e0622b    xyz
John        c96bbd1bed0ffa3f5a0098ce7ee568ce6e9d496c    abc

Now it’s not apparent that Morten and John both use “23456” as passwords, and we’re almost done….
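A minimal sketch of salting with Python's `hashlib`, assuming the salt is simply appended to the password before hashing and that everything is UTF-8 encoded. The helper names are hypothetical, and the exact byte layout used to produce the hashes in the table is not known, so the output is not expected to match it.

```python
import hashlib
import secrets


def sha1_hex(text):
    """SHA-1 of a string, as 40 hex characters."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


def new_salt(nbytes=8):
    """A fresh random salt, stored in the clear next to the hash."""
    return secrets.token_hex(nbytes)


def salted_hash(password, salt):
    """Hash of the password with the salt appended (an assumption)."""
    return sha1_hex(password + salt)


def verify(password, salt, stored_hash):
    """Recompute the hash at login and compare in constant time."""
    return secrets.compare_digest(salted_hash(password, salt), stored_hash)


# Same password, different salts: the two rows no longer look alike.
for user in ("Morten", "John"):
    salt = new_salt()
    print(user, salted_hash("23456", salt), salt)
```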

Brute Force Attacks

In my bad hashing algorithm, it’s pretty clear that you can’t guess my password from knowing the hash value “4”. On one hand, it means that other users might easily get access to an account on the site, but on the other hand, it also means that they can’t guess the original password.

This means that even if they could get access to an account on the site using the (false) password “65432”, they probably wouldn’t be able to access the user’s Gmail account, because it uses a different security model, and in that case the “65432” just wouldn’t work.

However, if the site uses SHA1, the chance of collision is quite low, which means that if I can find an input string that gives the same SHA1 there’s a very good chance that I found the actual original password, and if the user reused the password on other sites, I might be able to gain access to those too.

Brute force attacks rarely try every conceivable input. Instead, they start from lists of the most common passwords, or use precomputed lookup tables known as “rainbow tables” (a per-user salt defeats the precomputed tables, but not the guessing). As GPUs improve, the number of SHA1 computations we can make in a given time increases. For example, a new NVidia GTX 1080 TI will do 11374.1 MH/s, which is more than 11 billion hashes per second.

In other words, SHA1 salted passwords are not safe, and they are certainly not safe if the password occurs in a list of common passwords somewhere. There is plenty of software available to accomplish this, hashcat being one.
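To see why a salt alone does not save a weak password, here is a toy version of what such tools do, assuming the attacker has the leaked hash and salt and a list of common passwords. The function names and the tiny word list are mine; real tools run the same loop on a GPU at billions of guesses per second.

```python
import hashlib


def salted_sha1(password, salt):
    """SHA-1 of password + salt (same assumed scheme as the salting example)."""
    return hashlib.sha1((password + salt).encode("utf-8")).hexdigest()


def dictionary_attack(stored_hash, salt, candidates):
    """Try each candidate password; return the first match or None."""
    for guess in candidates:
        if salted_sha1(guess, salt) == stored_hash:
            return guess
    return None


# The salt is stored next to the hash, so a leaked database gives up both.
leaked_hash = salted_sha1("23456", "xyz")
common_passwords = ["password", "123456", "qwerty", "23456"]
print(dictionary_attack(leaked_hash, "xyz", common_passwords))
```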

What To Do?

If you’re reading this, you probably aren’t using either “password” or “123456” as passwords (or are you John?). Furthermore, if you are using “123456”, then you probably are also using the same password for multiple logins, and chances are your other accounts have already been breached.

Two-factor authentication goes a long way to remedy the situation; if the user logs in from a new IP (meaning outside a 255.255.255.0 subnet of known, verified IPs), then the 2nd stage kicks in, and you need to authenticate over email or SMS. However, this doesn’t work too well if the user uses “123456” all over the place, because the attacker might already have access to the user’s email account (especially if the email account is the username on the site).
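The “new IP” check could be sketched with the standard `ipaddress` module. The function name `needs_second_factor` is hypothetical, and a /24 prefix corresponds to the 255.255.255.0 mask mentioned above; the addresses used are documentation ranges, not real hosts.

```python
import ipaddress


def needs_second_factor(login_ip, known_ips, prefix_len=24):
    """True if login_ip is outside every /24 around a previously verified IP."""
    addr = ipaddress.ip_address(login_ip)
    for known in known_ips:
        # strict=False lets us build the network from a host address.
        network = ipaddress.ip_network(f"{known}/{prefix_len}", strict=False)
        if addr in network:
            return False
    return True


verified = ["203.0.113.10"]
print(needs_second_factor("203.0.113.77", verified))  # same /24
print(needs_second_factor("198.51.100.5", verified))  # somewhere new
```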

Enforcing “good” passwords on the site is also an improvement. You don’t have to go nuts on the requirements that it has to have special chars, upper and lower case etc. The site could actually generate the password for you. This way it would be unique and hard to guess. If the site was breached, only that site would be affected, and if the site is just a TMZ style rag, the damage would not be too bad.
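Generating the password on the site's side takes only a few lines with Python's `secrets` module. The alphabet and the default length of 16 are arbitrary choices for this sketch.

```python
import secrets
import string

ALPHABET = string.ascii_letters + string.digits


def generate_password(length=16):
    """A random password drawn from a cryptographically secure source."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


print(generate_password())
```

With 62 possible characters per position, a 16-character password is far outside the reach of any list of common passwords.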

 

 

The Parts of an IP Camera

To understand where the IP camera market is headed, I think it’s important to understand how one of these things is put together.

Like most high tech devices, each product is really an amalgamation of parts from different manufacturers. In fact many products are the result of tight, but perhaps unappreciated, collaboration of several (sometimes competing) companies. I’d recommend listening to Freakonomics’ rundown of the “I, Pencil” essay (starts 7 minutes in).

So, an IP camera is not a pencil, but the comparison holds: pencil manufacturers don’t manufacture every single part of the pencil. Instead, they purchase the parts (graphite, brass, paint and so on), and every manufacturer puts the pencil together following roughly the same pattern.

And, so, when it comes to IP cameras, they too are composed of parts that are available to everyone who wants to start making cameras.

You’ll need a couple of things: A lens, a sensor, some circuitry and some code.

You’re not going to start making your own lenses or sensors, are you? Probably not, so you’ll get the lenses from a lens maker (and they may even outsource their manufacturing process even further), and the sensor from either Sony or Canon.

You’re not going to design your own CPU either (unless you’re Axis). Today, you’d be better off grabbing an ARM platform and using that to drive the sensor and interface. The other advantage is that ARM is well supported in the software world, so you’re already halfway there.

Now that you have the basics, you need to write some code to get it all working together. If you went the ARM route, it’s pretty simple to get a Linux kernel running. Well.. “simple” depends on your level of skill, but finding a few geeks who can do this shouldn’t take long. So you grab the Linux kernel, add Apache or perhaps GoAhead, and you can add GStreamer too (do check the link, it is a great presentation by Axis). The next thing you know, you have a jumble of cables and breadboards, burns on your fingers from the soldering iron, you haven’t seen your kids in 4 days and the smell is getting a little hard to stomach.

On top of that, you need to wrap this in an enclosure. There are regulations to follow, tests that need to be carried out and so on. Then you have the nightmare of maintaining all those pieces of code, and trust me: if you wrote everything yourself, it would take even longer and be much harder to test and maintain.

What if there was a company, that could do all of the above? And just stick my name on the box? After all, my company would pick the same lens, the same sensor, the same board and the same software, so why not do it?

I have no intention of starting production of a Raspberry Pi Zero based IP camera, but I know that I can make one for ~$40 (and that’s buying all the parts retail). Not only will this thing work as an IP camera, it can work as a full fledged stand-alone VMS.

In other words: if some washed up coder in Copenhagen can build a fully functional “IP camera” for $40, I think you’re going to face a tough time if you’ve based your entire organization around selling your cheapest cameras for $250+ (they may be “even more good enough”, but who cares?).

Obviously, my camera is not going to be materially different from the other guy’s cameras. We’re all going to use the same bits and pieces, including software, even the damn protocols are going to be the same.

So, I think we’re going to see a race to the bottom in terms of prices. The cameras will look and perform almost identically across brands, use the same protocols, and be completely interchangeable, much to the chagrin of the incumbents, so the USP for the brands in this realm will have to be something else.

VMS Software, perhaps…

HDR and Low Light

In the early days, the only thing that mattered was pixel count. The more pixels you could cram onto a sensor, the better. Some people noticed that a higher pixel count would decrease the performance in low light scenes (for the same sensor size), and we got to a point where you’d have ridiculously high resolutions on extremely small sensors, but you’d need to be a few miles from the sun in order to get good, sharp footage.

There are all sorts of software tricks that you can employ to improve the appearance of the image to the casual observer, but you can’t conjure up data that just aren’t there. One trick is to use very long exposure, but that causes moving objects to get blurry, or you can do noise reduction, but that also removes actual details.

So what happened that caused the cameras to suddenly improve quite dramatically over the last couple of years? The sensor guys came up with back-illuminated sensors. Basically, the sensors got a hell of a lot better, and now we’re reaping the benefits.

Xda-Developers has a great article on the Sony IMX378 explaining BI sensors and how HDR is achieved. And of course, there’s always Wikipedia.

Competition is a wonderful thing.