Lies, Damn Lies and Video Analytics

Today, object tracking with OpenCV can be implemented in just a few hours. The same applies to face detection and YOLO. Object tracking and recognition no longer requires “magic” or custom hardware; most coders can whip something together in a day or two that will run on a laptop. Naturally, the research behind these algorithms is the work of some extremely clever people who, commendably, are sharing their knowledge with the world (the YOLO license is legendary).
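The core of a simple tracker is just frame-to-frame association: matching this frame’s detections to the previous frame’s tracks by box overlap. A toy sketch of that step (pure Python, no OpenCV; the box format and IoU threshold are my own assumptions):

```python
def iou(a, b):
    # Boxes as (x1, y1, x2, y2); intersection-over-union of two boxes.
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union else 0.0

def match_tracks(tracks, detections, threshold=0.3):
    """Greedily match existing tracks to this frame's detections by IoU."""
    matches, unmatched = {}, list(detections)
    for tid, tbox in tracks.items():
        best = max(unmatched, key=lambda d: iou(tbox, d), default=None)
        if best is not None and iou(tbox, best) >= threshold:
            matches[tid] = best
            unmatched.remove(best)
    return matches, unmatched  # unmatched detections would become new tracks
```

In practice you’d add track creation/expiry and something better than greedy matching (e.g. Hungarian assignment), but this is essentially the loop, which is why a weekend prototype is so achievable.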

But there’s a catch.

During a test of YOLO, it would show me a couple of boxes: one around a face, with YOLO about 51% certain it was a person, and another around my sock, where it was 54% sure that was a person too. Meanwhile, a second face in the frame wasn’t identified at all.

It’s surprising and very cool that an algorithm can recognize a spoon on the table. But when the algorithm thinks a sock is a person and misses an actual face, are you really going to make tactical decisions in a security system based on it?
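The underlying problem is that a confidence score near 50% is nearly a coin flip, and no threshold fixes it. A toy filter using the numbers from the anecdote (the data structure is invented for illustration, not YOLO’s actual output format):

```python
def filter_detections(detections, min_confidence):
    """Keep only detections at or above the confidence threshold."""
    return [d for d in detections if d["confidence"] >= min_confidence]

# Toy detections mirroring the anecdote: a real face scored at 51%,
# a sock misread as a person at 54%, and a second face missed entirely
# (so it never even appears in the list).
frame = [
    {"label": "person", "confidence": 0.51},  # the real face
    {"label": "person", "confidence": 0.54},  # the sock
]

# A threshold of 0.5 keeps the sock; 0.55 drops the real face too.
```

With these scores there is no setting that makes the output trustworthy: every cutoff either admits the sock or rejects the face.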

Charlatans will always make egregious claims about what the technology can do, and gullible consumers and government agencies are being sold on a dream that eventually turns out to be a nightmare.

Recently I saw a commercial where a “journalist” was interviewing a vendor about their analytics software (it wasn’t JH). Example footage showed a terrorist unpacking a gun and opening fire down the street. This took place in your typical corner store in a Middle Eastern country. The video systems in these stores are almost always pretty awful: bad cameras, heavy compression.


The claim made in the advert is that their technology would be able to identify the terrorist and determine his path through the city in a few hours. A canned demo of the photographer walking through the vendor’s offices was offered as proof of how easily and quickly this could be done.

I call bullshit!

-village fool

First of all, most of the cameras on the path are going to be recording at the same poor quality as the corner-store footage described above. This makes recognition a lot harder (useless? impossible?).

Second, if you’re not running object tracking while you are recording, you’ll need to process all the recorded video after the fact. Considering that there might be thousands of cameras, recorded on different equipment and in different formats, doing the tracking on the recorded video is going to take some time.
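A quick back-of-envelope calculation shows why. Every number below is an assumption for illustration (camera counts, frame rates and detector throughput vary wildly in practice):

```python
# Back-of-envelope: reprocessing recorded footage after the fact.
# All figures are assumptions, not measurements from any real system.
cameras = 1000          # cameras along the suspect's possible path
hours_recorded = 4      # window of footage to scan per camera
fps = 15                # typical surveillance frame rate
frames = cameras * hours_recorded * 3600 * fps

detect_fps = 30         # frames/second one GPU pushes through a detector
gpus = 10               # GPUs you can throw at the job
seconds = frames / (detect_fps * gpus)
print(f"{frames:,} frames -> {seconds / 3600:.0f} hours of processing")
```

Even with these fairly generous assumptions, that’s on the order of 200 hours of GPU time, before you’ve dealt with exporting, transcoding, and reconciling the different recorder formats. “A few hours” it is not.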

Tracking a single person walking down a well-lit hallway, with properly calibrated, high-quality cameras, is one thing. Doing it on a low-resolution camera with heavily compressed video and a bad sensor, on a street with lots of movement, overlaps, etc., is a totally different ballgame.

You don’t know anything about marketing!

-arbitrary marketing person, yelling at Morten

Sure, I understand that this sort of hyperbole is just how things are done in this business. You come up with claims that sound fantastic yet plausible to the uneducated user, and hope that it makes someone buy your stuff. And if your magical tool doesn’t work, then it’s probably too late, and who defines “works” anyway? If it can do it 20% of the time, then it “works”, doesn’t it? Like a car that can’t drive in the rain also “works”.

If you want to test this stuff, show up with real footage from your environment and demand a demo on that content (if the vendor/integrator can’t do it, they need to educate themselves!). Keep an eye on the CPU and GPU load, and ask whether this will run on 300 cameras in your mall/airport without having to buy 100 new PCs with 3 top-of-the-line GPUs in them.

I’m not saying that it doesn’t ever work. I’m saying that my definition of “works” is probably more dogmatic than a lot of people in this industry.


Managing the Manager-less Process

Fred George has quite a resume – he’s been in the software industry since the ’70s and is still active. His 2017 talk at the GOTO conference is pure gold.

His breakdown of the role of the Business Analyst at 19:20 is spot on. The role of the manager is even saltier (23:12) – “I am the God here”.

Well worth an hour of your life (mostly for coders).

As a side note, there are two characters in the Harry Potter movies called “Fred and George”, making searches for “Fred George” a pain.

Lambs to the Slaughter

When lambs are loaded onto trucks to be sent to the slaughterhouse, I doubt the driver, the farmer, or anyone else tells them where they’re going.

I wonder what the lambs are thinking.

If they could talk, some would probably say “we’re doomed”, and others would stomp on them and say “shut the hell up, can’t you see everyone is getting anxious” or “why can’t you be more positive”.

Maybe anxiety is nature’s way of telling you that something bad may be coming your way. If you’re in a rustling truck, driving far away from the farm, it’s appropriate to feel anxious. It’s telling you to be aware of what’s going on, and to think of an escape route.

The lambs see the butcher, but they don’t know what they’re looking at. The guy is not going to scream “I’M HERE TO KILL YOU ALL”, he’ll whisper reassuringly “come here little friend, follow me”.

Don’t listen to him.

Run away.



20 years ago, the NVR we wrote was a monolith. It was a single executable, and the UI ran directly on the console. Rendering the UI, doing (primitive) motion detection and storing the video were all done within the same executable. From a performance standpoint, it made sense: to do motion detection we needed to decode the video, and we needed to decode the video to render it on the screen, so decoding the video just once made sense. We’d support up to a mind-blowing 5 cameras per recorder. As hardware improved, we upped the limit to 25; in Roman numerals, 25 is XXV, hence the name XProtect XXV (people also loved X’s back then – fortunately, we did not support 30 cameras).


I’m guessing that the old monolith would be pretty fast on today’s PCs, but it’s hard, if not impossible, to scale beyond a single machine. Supporting 1000 cameras is just not feasible with the monolithic design. That said, if your system is < 50 cameras, a monolith may actually be simpler, faster and just plain better, and I guess that’s why cheap IP recorders are so popular.

You can do a distributed monolith design too; that’s where you “glue” several monoliths together. The OnSSI Ocularis system does this: it allows you to bring in many autonomous monoliths and lets the user interact with them via one unified interface. This is a fairly common approach. Instead of completely re-designing the monolith, you basically allow remote control of each monolith via a single interface. This lets the system scale to several thousand cameras across many monoliths.

One of the issues with the monolithic design is that the bigger the monolith, the more errors/bugs/flaws you’ll have. As bugs are fixed, all the monoliths must be updated. If the monolith consists of a million lines of code, chances are it will have a lot of issues, and fixes for those issues introduce new issues, and so on. Eventually, you’re in a situation where every day there’s a new release that must be deployed to every machine running the code.

The alternative to the monolith is a service-based architecture. You could argue that the distributed monolith is service-based, except the “service” does everything. Ideally, a service-based design ties together many different services, each with a tightly defined responsibility.

For example, you could have the following services: configuration, recorder, privileges, alarms, maps, health. The idea is that each of these services simply has to adhere to an interface contract. How the team actually implements the functionality is irrelevant. If a faster, lighter or more feature-rich recorder service comes along, it can be added to the service infrastructure as long as it adheres to the interface. Kinda like ONVIF?
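A sketch of what such an interface contract could look like in code. The service and method names here are invented for illustration, not any real product’s API:

```python
from abc import ABC, abstractmethod

class RecorderService(ABC):
    """The contract every recorder implementation must honor.
    How a team implements it (language, storage, codec) is its own business."""

    @abstractmethod
    def start_recording(self, camera_id: str) -> None: ...

    @abstractmethod
    def query(self, camera_id: str, start: float, end: float) -> list:
        """Return recorded clips for camera_id between two timestamps."""

class InMemoryRecorder(RecorderService):
    # A trivial implementation; a faster or more capable recorder can be
    # swapped in later, as long as it honors the same contract.
    def __init__(self):
        self.clips = {}

    def start_recording(self, camera_id):
        self.clips.setdefault(camera_id, [])

    def query(self, camera_id, start, end):
        return [c for c in self.clips.get(camera_id, [])
                if start <= c["timestamp"] <= end]
```

The point is that callers depend only on `RecorderService`, never on `InMemoryRecorder`, so swapping the implementation doesn’t ripple through the rest of the system.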

This allows for a two-tiered architectural approach: the “city planner”, who plans out what services are needed and how they communicate, and the “building architect”, who designs what goes into each service. Smaller services are easier to manage and thus, hopefully, do not require constant updates. To the end user, though, the experience may actually be the same (or even worse). Perhaps patch 221 just updates a single service, but the user still has to take some action. Whether patch 221 updates a monolith or a service doesn’t make much difference to the end user.

Just like cities evolve over time, so do code and features. 100 years ago, when this neighborhood was built, a sewer pipe was installed with the house. Later, electricity was added; it required digging a trench and plugging into the grid. Naturally, it took planning and a lot of work, but it was done once, and it very rarely fails. Services are added to the city one by one, but they all have to adhere to an interface contract. Electricity comes in at 50 Hz and 220V at the socket, and the sockets are all compatible. It would be a giant mess if some providers used 25 Hz, some 100 Hz, some gave 110V, some 360V, etc. There’s not a lot of room for interpretation here; 220V 50 Hz is 220V 50 Hz. If the spec just said “AC”, it’d be a mess. Kinda like ONVIF?


In software, the work to define service responsibilities, and to actually validate that services adhere to the interface contract, is often overlooked. One team does a proprietary interface, another uses WCF, a third uses HTTPS/JSON, and every team thinks it’s doing it right and everyone else is wrong. 3rd parties have to juggle proprietary libraries that abstract the communication with each service, or deal with several different interface protocols (never mind the actual data). So imagine a product with 20 different 3rd party libraries, each with bugs and issues, and each of those 3rd parties issuing patches every 6 months. That’s 40 times a year that someone has to decide whether to update or not: “Is there anything in patch 221 that pertains to my installation? Am I using a service that depends on any of those libraries?” and so on.

This just deals with the wiring of the application. Often the UI/UX language differs radically between teams. Do we drag/drop things, or hit a “transfer” button? Can we always filter lists? Etc. Once again, a “city planner” is needed: someone willing to be the a-hole when a team decides that deviating from the UX language is just fine, because this new design is so much better.

I suppose the problem, in many cases, is that many people think this is the fun part of the job, and everyone has an opinion about it. If you’re afraid of “stepping on toes”, then you might end up with a myriad of monoliths glued together with duct-tape communicating via a cacophony of protocols.

OK, this post is already too long, so:

Monoliths can be fine, but you should probably try to do something service-based. You’re not Netflix or Dell, but a service architecture means a more clearly defined purpose for your code, and that’s a good thing. Above all, define and stick to one means of communication, and it should not be via a library.



In 2017, IBM spent $5.4 billion on R&D (sources vary on the number). That’s a lot of money. Not as much as Amazon ($23 billion) or Alphabet ($16 billion), but still a pretty good chunk of money. Instead of saying billion, let’s say thousand-million: IBM spent five-thousand-four-hundred million dollars on R&D.

It’s roughly the same as what they spent in 2005, and their revenue is roughly the same as back then ($88 billion in 2006, $80 billion in 2017).

They’re doing a bunch of things to stay relevant, but while most people have heard about AWS or Azure, I don’t often bump into people who know or use IBM Bluemix (now “IBM Cloud”). Is it any good? I guess it must be; its revenue at the time of writing (Jan 2019) was about the same as AWS’s and Azure’s (between $7 billion and $9 billion), but AWS and Azure are growing very, very fast (~50% per year). As a developer, trying out AWS or Azure, for real, can be accomplished in a few hours. I tried Bluemix years ago. I gave up. I’m sure I could have gotten it off the ground, but why should I spend days on something that I can do in hours with the other vendors?

Most have heard about IBM’s Watson project. Watson is a project to make a computer that knows everything; it can play Jeopardy, diagnose patients and judge your wardrobe. Reading about the intended purposes of Watson, it seems as if they’re constantly trying new (random) things, only to get beaten by Amazon, Apple, Microsoft or Google in the areas that matter. Morgan Stanley asked ~100 CIOs about their interest in AI, and 43 of them were considering using it. Of those, 10 preferred IBM Watson. I don’t know what “preferred” means, and I don’t know what they plan to do, but just because IBM has plowed a lot of money into building something that runs on their mainframes doesn’t mean it’s valuable. As a side note, the survey (in the link) converted the numbers to percentages to make them seem more significant, but really, it was just 10 dudes out of 100 who said they “preferred Watson”. SO FAKE NEWS!!! (or at least, take it with a grain of salt).

IBM’s revenue did grow recently, but the growth was driven largely by mainframes (the cloud business also saw an uptick, about 20% growth), and most people are wondering if this is sustainable. Aren’t we all moving towards serverless (à la AWS Lambda), which basically means “sell me cycles as cheaply as possible”? It smells like commoditization, narrow margins and huge volume – running on the cheapest possible hardware. A game in which IBM’s expensive mainframes will probably struggle. It seems as if IBM is anticipating this, and just took a major step in that direction by paying $34 billion for Red Hat.

IBM basically went from being front and center of the PC revolution to being a large, but mostly invisible, company. It used to be that “PC software” would, by definition, work on an IBM PC, but then things got bad. For example, 20 years ago there was just one person at the dorm with an actual IBM computer, and it was not compatible with any of the clones the rest of us had. The world had taken the parts that were useful (a common platform) and moved on. IBM thought they were still in control of the platform, and could deviate from it. It turned out they were wrong.

Is there a need for IBM any longer?

Sure, banks, governments and insurance companies with huge legacy systems that can’t be moved will keep buying mainframes forever. So there’s certainly a need. But it’s the same sort of need a drug addict has when they’re coming off their drugs. It’s not a “good” need. Does IBM have anything unique and valuable to offer the “cheap, reliable cycles” world of the cloud? Do they have anything in the AI space that isn’t being beaten by the usual suspects, or at least won’t be, as soon as it becomes commercially viable?

We need competition. If we’re left with Azure and AWS, the world will be a boring place. IBM could maybe compete with AWS. I don’t see why they wouldn’t be able to. But perhaps they aren’t willing or capable.



Conway’s Law

I was re-watching this video about the (initially failed) conversion from a monolithic design of an online store, into a microservice based architecture. During the talk, Conway’s Law is mentioned. It’s one of those laws that you really should keep in mind when building software.

“organizations which design systems … are constrained to produce designs which are copies of the communication structures of these organizations.”
— M. Conway

The concept was beautifully illustrated by a conversation I had recently; I was explaining why I dislike proprietary protocols, and hate the idea of having to rely on a binary library as the interface to a server. If a server uses HTTPS/JSON as its external interface, I can use a large number of libraries – of my choice, for different platforms (*nix, Windows) – to talk to the server. I can trivially test things using a common web browser. If there is a bug in any of those libraries, I can use another library, I can fix the error in the library myself (if it is OSS), etc. Basically, I become the master of my own destiny.
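That point can be demonstrated with nothing but a standard library. A toy sketch – the endpoint and JSON payload are invented, and it uses plain HTTP locally with TLS omitted for brevity – where a stand-in server speaks HTTP/JSON and any generic client can talk to it:

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import urlopen

# A toy JSON endpoint standing in for a server with an open interface.
class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = json.dumps({"status": "recording", "camera": "cam1"}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # silence per-request logging
        pass

server = HTTPServer(("127.0.0.1", 0), Handler)  # port 0 = pick a free port
threading.Thread(target=server.serve_forever, daemon=True).start()

# Any HTTP client works: stdlib urllib here, curl or a browser just as well.
with urlopen(f"http://127.0.0.1:{server.server_port}/status") as resp:
    state = json.load(resp)
server.shutdown()
```

No vendor SDK, no binary blob: the wire format is the whole contract, so the client side is replaceable at will.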

If, on the other hand, there is a bug in the library provided to me, required to speak some bizarre proprietary protocol, then I have to wait for the vendor or organizational unit to provide a bug-fixed version of the library. In the meantime, I just have to wait. It’s also much harder to determine whether the issue is in the server or in the library, because I may not have transparency into what’s inside the library, and I can’t trivially use a different means of testing the server.

But here’s the issue: the bug in the communication library that is affecting my module might not be seen as a high-priority issue by the unit in charge of said library. It might be that the author left, and it takes considerable time to fix the issue, etc. This dramatically slows down progress and the time it takes to deliver a solution to a problem.


The strange thing is this: the idea that all communication has to pass through a single library, making the library critically important (but slowing things down), was actually 100% mirrored in the way the company communicated internally. Instead of encouraging cross-team communication, there was an insistence that all communication pass through a single point of contact.

Basically, the crux is this: if the product is weird, take a look at the organization first. It might just be the case that the product is the result of a sub-optimal organizational structure.

Crashing a Plane

Ethiopian Airlines Flight 961 crashed into the Indian Ocean after being hijacked en route from Addis Ababa to Nairobi. The hijackers wanted to go to Australia. The captain warned that the plane only had enough fuel for the scheduled flight and would never make it. The hijackers disagreed; the 767-200ER had a maximum flight endurance of 11 hours, they argued, enough to reach Australia. 125 people died when the plane finally ran out of fuel and the pilots had to attempt an emergency landing on water.

Korean Air Flight 801 was under the command of the very experienced Captain Park Yong-chul. During heavy rain, the captain erroneously thought that the glideslope of the instrument landing system was operational, when in fact it wasn’t. The captain flew the plane into the ground about 5 km from the airport, killing 228 people.

In the case of Ethiopian Airlines, there’s no question that the people in charge of the plane (the hijackers), had no idea what they were doing. Their ignorance, and distrust of the crew, ultimately caused their demise. I am certain that up until the last minute, the hijackers believed they knew what they were doing.

For Korean Air 801, the crew was undoubtedly competent. The captain had 9,000 hours logged, and during the failed approach, we can safely assume that he felt he knew what he was doing. In fact, he might have been so good that everyone else stopped second-guessing Captain Park, even though their instruments were giving them readings that said something was seriously wrong. Only the 57-year-old flight engineer, Nam Suk-hoon, with 13,000 hours logged, dared speak up.

I think there’s an analogy here; we see companies crash due to gross incompetence, inexperience and a failure to listen to experienced people, but we also see companies die (or become zombies) because they have become so experienced that they feel they can’t make any fatal mistakes. Anyone suggesting they are short on approach is ignored. The “naysayers” can then leave the plane on their own, get thrown out for not being on board with the plan, or meet their maker when the plane hits the ground.

Yahoo comes to mind; the company’s long list of bad decisions over the years is a horror show in its own right.

The people making these mistakes were not crazed hijackers with an insane plan. These were people in expensive suits, with many, many years of experience. They all had a large swarm of people doing their bidding and showing them Excel sheets and PowerPoint presentations from early morning to late evening. Yet, they managed to crash the plane into the ground.

So, I guess the moral is this: if you’re reading the instruments, and they all say that you’re going to crash into the ground, then maybe, just maybe the instruments are showing the true state of things. If the Captain refuses to acknowledge the readings and dismisses the reports, then the choices are pretty clear.

The analogy’s weakness is that in most cases, no one dies when the captain sends the product in the wrong direction. The “passengers” (customers) will just get up from their seats and step onto another plane or mode of transportation, and (strangely) in many cases the captain and crew will move on and take over the controls of another plane. We can just hope that the new plane will stay in the air until it reaches its intended destination safely.