The (Real) Problem With Cybersecurity

Having been in the sausage factory for a long time, I’d like to share some thoughts about what I think is a problem when it comes to cybersecurity.

Contrary to popular belief, programmers are human; we make stupid mistakes, some big, some small. Some days we are lazy, others we are energetic and highly motivated. So (exploitable) bugs inevitably creep in; this is just a fact of life.

The first step in writing robust and secure code is the design and architecture. The next step is to have developers with good habits and skills, and finally you run a good selection of automated tests on the modules that make up the product.

But consider that a VMS (video management system) typically runs on an OS that the vendor has very little control over, or uses a database (SQL Server, MySQL, MaxDB etc.) that is also outside the manufacturer’s reach. Furthermore, the VMS itself uses libraries from 3rd parties, and with good reason too. Open source libraries are often much better tested and under an extreme amount of scrutiny compared to a homemade concoction reviewed by just a few peers, but they too have bugs (usually just fewer than the homebrew stuff).

Inevitably, someone finds a way to break the system, and when it happens, it’s a binary event. The product is now insecure. You can argue all you want that the other windows and doors are super-secure, but if the back door is open – who cares about the lock on the window?

To be fair, if the rest of the building is locked down well, then fixing the broken door may be a smaller event.

Contrast that with a system that is insecure by design, where fixing the security issues requires changes to the architecture. We’re no longer talking about replacing a broken lock, but about upheaval of the entire foundation. An end-user doesn’t know if the cracks are due to a fundamental issue, or something that just needs a bit of plaster and paint.

And this brings me to the real issue.

Say a developer politely demands that resources be allocated to fixing these issues; what do you imagine will happen? In some companies, I assume a task force is assembled to estimate the severity of the issue, and resources are then allocated to fix it. A statement is issued so that people know to apply the patch (they’re not going to do it, but it’s the right thing to do). This is what a healthy company ought to do. A sick company would make the following statement: “no one has complained about this issue, and – actually – we have to make money”.

A good way to make yourself unpopular (as a programmer) is to respond that if the issue IS discovered, you can forget about making any money. Your market will be limited to installations that really don’t care about security. The local Jiffy Lube that replaced its VHS-based recorder with a DVR that just sits on a dusty shelf may truly not care. The system is not exposed in any way – it is a CCTV (Closed being the operative word here). They’re fine. And the root password is written on a post-it note stuck to the monitor. But what about a power plant? A bank? An airport?

You might imagine that an honest coder with integrity would resign on the spot, but this doesn’t solve the problem. Employees are often gagged by NDAs and non-disparagement clauses, and while disclosure of security flaws is arguably protected by the First Amendment, it is generally a bad idea to talk about these things. The company may suffer heavy losses, and you are putting (unsuspecting) customers at risk by making these things public. The threat of legal action and the asymmetry (a single person vs. a corporation) ensure that flaws rarely surface.

It’s also conceivable that the dumbass programmer is wrong about the risk of a bug or design issue. A developer may think that a trivial bypass of privilege checks is “dangerous”, but customers might genuinely not care.

Who knows? During the Black Hat convention in 2013, IP cameras from different manufacturers were shown to be hopelessly unsafe. Didn’t seem to make any difference.

I referenced this talk in an earlier post as well.

Four years later, cybersecurity is all the rage, and perhaps people do care – but from what I can tell, it’s just a few SJWs who crave the spotlight and pretend to care. Whether the crazy accusations have merit is irrelevant; all that matters is that viewers tune in, and the show will get increasingly grotesque to keep people entertained. And if the freak show is not bringing in the crowds, you can always turn it into a sort of “anonymous Facebook” where people can backstab each other – like the bitchiest teenage girls used to treat each other.

What the industry probably needs to do is pay professional penetration testers to go to work on the systems out there. I’m not talking about the kind of shitty automated tests that are being done today; they are far, far from sufficient. You need people like Craig Heffner in the video to go to town and get to the bottom of things.

Happy hacking.

InfluxDB and Grafana

When buzzards are picking at your eyes, maybe it’s time to move a little. Do a little meandering, and you might discover that the world is larger, and more fun, than you imagined. Perhaps you realize that what was once a thriving oasis has now turned into a putrid cesspool riddled with parasites.

InfluxDB is what’s known as a “time-series database”. The idea is that it collects samples over time. Once a sample is written, it doesn’t change. Eventually the sample gets too old, and is discarded. This is different from traditional databases, where values may change over time, and where deletion of records is not normally based on age.
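The discard-by-age idea can be sketched in a few lines of Python. This is a toy illustration of the concept only, not how InfluxDB works internally; the class and the 10-day retention window are made up for the example:

```python
import time
from collections import deque

class TimeSeries:
    """Toy time-series store: append-only samples, discarded by age."""

    def __init__(self, retention_seconds):
        self.retention = retention_seconds
        self.samples = deque()  # (timestamp, value), oldest first

    def insert(self, value, timestamp=None):
        # Samples never change once written; we only ever append.
        ts = time.time() if timestamp is None else timestamp
        self.samples.append((ts, value))

    def expire(self, now=None):
        """Drop samples that have aged out of the retention window."""
        now = time.time() if now is None else now
        while self.samples and now - self.samples[0][0] > self.retention:
            self.samples.popleft()

series = TimeSeries(retention_seconds=10 * 86400)  # "retention set to 10 days"
series.insert(42.0, timestamp=0)            # sample on day 0
series.insert(43.0, timestamp=5 * 86400)    # sample on day 5
series.expire(now=12 * 86400)               # on day 12, the day-0 sample ages out
print(len(series.samples))                  # prints 1
```

A traditional relational database would instead let you UPDATE or DELETE arbitrary rows; here, age is the only thing that ever removes data.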

This sounds familiar, doesn’t it?

Now, you probably want to draw some sort of timeline, or graph, that represents the values you popped into InfluxDB. Enter Grafana. It’s a dashboard designer that can interface with InfluxDB (and other databases too) and show pretty graphs and tables in a web page without requiring any HTML/JavaScript coding.

If you want to test this wonderful combination of software, you’ll probably want to run Docker, and visit this link.

Now, I’ve already abandoned the idea of using InfluxDB/Grafana for the kind of stuff I mess around with. InfluxDB’s strength is that it can return a condensed dataset over a potentially large time range. And it can make fast and semi-complex computations over the samples it returns (usually of the statistical kind). But the kind of timeline information I usually record is not complex at all, and there aren’t really any additional calculations I can do over the data. E.g., what’s the average of “failed to connect” and “retention policy set to 10 days”?

InfluxDB is also schema-less. You don’t need to do any pre-configuration (other than creating your database), so if you suddenly feel the urge to create a measurement (InfluxDB’s equivalent of a table) called “dunning”, you just insert some data into “dunning”. You don’t need to define columns or their types; you just insert data.

And you can do this via a standard HTTP call, so you can use curl on the command line, or libcurl in your C++ app (which is what I did).
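For illustration, here’s roughly what that write looks like using only the Python standard library instead of libcurl. The `/write?db=` endpoint and port 8086 are the InfluxDB 1.x defaults; the “dunning” measurement is the example from above, and the field/tag names are made up. Note the escaping here is naive (real line protocol requires escaping spaces and commas):

```python
import urllib.request

def line_protocol(measurement, fields, tags=None, timestamp_ns=None):
    """Build one line of InfluxDB line protocol: measurement[,tags] fields [ts]."""
    tag_part = "".join(f",{k}={v}" for k, v in (tags or {}).items())
    field_part = ",".join(
        f'{k}="{v}"' if isinstance(v, str) else f"{k}={v}"
        for k, v in fields.items()
    )
    ts_part = f" {timestamp_ns}" if timestamp_ns is not None else ""
    return f"{measurement}{tag_part} {field_part}{ts_part}"

def write_point(line, host="localhost", db="mydb"):
    # POST to InfluxDB 1.x's /write endpoint; the measurement (and its
    # "columns") spring into existence on first insert -- no schema needed.
    req = urllib.request.Request(
        f"http://{host}:8086/write?db={db}", data=line.encode(), method="POST"
    )
    return urllib.request.urlopen(req)  # InfluxDB answers HTTP 204 on success

line = line_protocol("dunning", {"value": 3.14}, tags={"host": "cam01"})
print(line)  # dunning,host=cam01 value=3.14
# write_point(line)  # uncomment if you have a running InfluxDB instance
```

The same string can of course be sent with `curl -XPOST 'http://localhost:8086/write?db=mydb' --data-binary '...'` from the command line.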

The idea that you can issue a single command to do a full install of InfluxDB and Grafana, and then have it consume data from your own little app in about the time it takes to ingest a cup of coffee, says a lot about where we’re headed.

Contrast that with the “open platforms” that require you to sign an NDA, download SDKs, compile DLLs, test on 7 different versions of the server, and still nurse it every time there’s a new version. Those systems will be around for a long time, but I think it’s safe to say they’re way past their prime.
Lies, Damn Lies and Video Analytics

Today, object tracking using OpenCV can be done in just a few hours. The same applies to face detection and YOLO. Object tracking and recognition are no longer “magic”, nor do they require custom hardware. Most coders can whip something together in a day or two that will run on a laptop. Naturally, the research behind these algorithms is the work of some extremely clever people who, commendably, are sharing their knowledge with the world (the YOLO license is legendary).

But there’s a catch.

During a test of YOLO, it showed me a couple of boxes. One around a face, which YOLO was about 51% certain was a person. Around my sock there was another, where it was 54% sure that was also a person. But there was another face in the frame that was not identified as one.

It’s surprising and very cool that an algorithm can recognize a spoon on the table. But when the algorithm thinks a sock is a person and a face isn’t one, are you going to actually make tactical decisions in a security system based on it?
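The problem is easy to see if you simulate acting on those confidence scores. The numbers below mirror the detections described above, but the code is just an illustration, not actual YOLO output:

```python
# Toy detections mimicking the test above: (label, confidence).
detections = [
    ("person", 0.51),   # actually a face
    ("person", 0.54),   # actually a sock
    ("spoon", 0.87),
]

def keep(dets, threshold):
    """Filter detections by confidence; a security system acts on what survives."""
    return [d for d in dets if d[1] >= threshold]

# At a 50% threshold, the sock comes along for the ride...
print(keep(detections, 0.50))
# ...and at a threshold high enough to drop the sock, the real person vanishes too.
print(keep(detections, 0.60))
```

Whatever threshold you pick, you’re trading false alarms (the sock) against missed detections (the face that scored 51%) – there is no setting that gives you both.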

Charlatans will always make egregious claims about what the technology can do, and gullible consumers and government agencies are being sold on a dream that eventually turns out to be a nightmare.

Recently I saw a commercial where a “journalist” interviewed a vendor about their analytics software (it wasn’t JH). Example footage was shown of a terrorist unpacking a gun and opening fire down the street. This took place in your typical corner store in a Middle Eastern country. The video systems in these stores are almost always pretty awful: bad cameras, heavy compression.

[image: bad_video]

The claim made in the advert is that their technology would be able to identify the terrorist and determine his path through the city in a few hours. A canned demo of the photographer walking through the vendor’s offices was offered as a demonstration of how easily and quickly this could be done.

I call bullshit!

-village fool

First of all, most of the cameras on the path are going to be recording at a quality similar to what you see above. This makes recognition a lot harder (useless/impossible?).

Second, if you’re not running object tracking while you are recording, you’ll need to process all the recorded video. Considering that there might be thousands of cameras, recorded on different equipment in different formats, doing the tracking on the recorded video is going to take some time.

Tracking a single person walking down a well-lit hallway with properly calibrated, high-quality cameras is one thing. Doing it on a low-resolution camera with heavily compressed video and a bad sensor, on a street with lots of movement, overlaps, etc., is a totally different ballgame.

You don’t know anything about marketing!

-arbitrary marketing person, yelling at Morten

Sure, I understand that this sort of hyperbole is just how things are done in this business. You come up with things that sound fantastic and plausible to the uneducated user, and hope that it makes someone buy your stuff. And if your magical tool doesn’t work, then it’s probably too late, and who defines “works” anyway? If it can do it 20% of the time, then it “works”, doesn’t it? Like a car that can’t drive in the rain also “works”.

If you want to test this stuff, show up with real footage from your environment, and demand a demo on that content (if the vendor/integrator can’t do it, they need to educate themselves!). Keep an eye on the CPU and GPU load, and ask if this will run on 300 cameras in your mall/airport without having to buy 100 new PCs with 3 top-of-the-line GPUs in them.

I’m not saying that it doesn’t ever work. I’m saying that my definition of “works” is probably more dogmatic than a lot of people in this industry.
Managing the Manager-less Process

Fred George has quite a résumé – he’s been in the software industry since the ’70s and is still active. His 2017 talk at the GOTO conference is pure gold.

His breakdown of the role of the Business Analyst at 19:20 is spot on. His take on the role of the manager is even saltier (23:12) – “I am the God here”.

Well worth an hour of your life (mostly for coders).

As a side note, there are two characters in the Harry Potter movies called “Fred and George”, making searches for “Fred George” a pain.

Lambs to the Slaughter

When lambs are loaded onto trucks to be sent to the slaughterhouse, I doubt the driver, the farmer, or basically anyone tells them where they’re going.

I wonder what the lambs are thinking.

If they could talk, some would probably say “we’re doomed”, and others would stomp on them and say “shut the hell up, can’t you see everyone is getting anxious” or “why can’t you be more positive”.

Maybe anxiety is nature’s way of telling you that something bad may be coming your way. If you’re in a rustling truck, driving far away from the farm, it’s appropriate to feel anxious. It’s telling you to be aware of what’s going on, and to think of an escape route.

The lambs see the butcher, but they don’t know what they’re looking at. The guy is not going to scream “I’M HERE TO KILL YOU ALL”, he’ll whisper reassuringly “come here little friend, follow me”.

Don’t listen to him.

Run away.
Monolith

20 years ago, the NVR we wrote was a monolith. It was a single executable, and the UI ran directly on the console. Rendering the UI, doing (primitive) motion detection and storing the video were all done within the same executable. From a performance standpoint, it made sense; to do motion detection we needed to decode the video, and we needed to decode the video to render it on the screen, so decoding the video just once made sense. We’d support up to a mind-blowing 5 cameras per recorder. As hardware improved, we upped the limit to 25; in Roman numerals, 25 is XXV, and hence the name XProtect XXV (people also loved X’s back then – fortunately, we did not support 30 cameras).

I’m guessing that the old monolith would be pretty fast on today’s PCs, but it’s hard/impossible to scale beyond a single machine. Supporting 1000 cameras is just not feasible with the monolithic design. That said, if your system is < 50 cameras, a monolith may actually be simpler, faster and just better, and I guess that’s why cheap IP recorders are so popular.

You can do a distributed monolith design too; that’s where you “glue” several monoliths together. The OnSSI Ocularis system does this; it allows you to bring in many autonomous monoliths and lets the user interact with them via one unified interface. This is a fairly common approach. Instead of completely redesigning the monolith, you basically allow remote control of it via a single interface. This allows the design to scale to several thousand cameras across many monoliths.

One of the issues with the monolithic design is that the bigger the monolith, the more errors/bugs/flaws you’ll have. As bugs are fixed, all the monoliths must be updated. If the monolith consists of a million lines, chances are it will have a lot of issues, and fixes for those issues introduce new issues, and so on. Eventually, you’re in a situation where every day there’s a new release that must be deployed to every machine running the code.

The alternative to the monolith is a service-based architecture. You could argue that the distributed monolith is service-based, except the “service” does everything. Ideally, a service-based design ties together many different services, each with a tightly defined responsibility.

For example, you could have the following services: configuration, recorder, privileges, alarms, maps, health. The idea being that each of these services simply has to adhere to an interface contract. How the team actually implements the functionality is irrelevant. If a faster, lighter or more feature-rich recorder service comes along, it can be added to the service infrastructure as long as it adheres to the interface. Kinda like ONVIF?
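In code, an interface contract is just an abstract interface that every implementation must honor. Here’s a minimal Python sketch; the method names are hypothetical, since the post names the services but not their APIs:

```python
from abc import ABC, abstractmethod

class RecorderService(ABC):
    """The contract for the 'recorder' service. Every implementation
    must provide these methods; nothing else about it is prescribed."""

    @abstractmethod
    def start_recording(self, camera_id: str) -> None: ...

    @abstractmethod
    def stop_recording(self, camera_id: str) -> None: ...

class DiskRecorder(RecorderService):
    """One team's implementation. A faster or lighter drop-in that
    honors the same contract can replace it without touching anything else."""

    def __init__(self):
        self.active = set()  # camera ids currently being recorded

    def start_recording(self, camera_id):
        self.active.add(camera_id)

    def stop_recording(self, camera_id):
        self.active.discard(camera_id)

# The rest of the system only ever sees the contract, not the implementation.
rec: RecorderService = DiskRecorder()
rec.start_recording("cam01")
print("cam01" in rec.active)  # prints True
```

Any team can now ship a competing recorder; as long as it implements both methods of `RecorderService`, the rest of the system doesn’t care.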

This allows for a two-tiered architectural approach: the “city planner”, who plans out what services are needed and how they communicate, and the “building architect”, who designs/plans what goes into each service. Smaller services are easier to manage and thus, hopefully, do not require constant updates. To the end-user, though, the experience may actually be the same (or even worse). Perhaps patch 221 just updates a single service, but the user still has to take some action. Whether patch 221 updates a monolith or a service doesn’t make much difference to the end-user.

Just like cities evolve over time, so do code and features. 100 years ago when this neighborhood was built, a sewer pipe was installed with the house. Later, electricity was added; it required digging a trench and plugging it into the grid. Naturally, it required planning and it was a lot of work, but it was done once, and it very rarely fails. Services are added to the city, one by one, but they all have to adhere to an interface contract. Electricity comes in at 50 Hz and 220 V at the socket, and the sockets are all compatible. It would be a giant mess if some providers used 25 Hz, some 100 Hz, some gave 110 V, some 360 V, etc. There’s not a lot of room for interpretation here; 220 V 50 Hz is 220 V 50 Hz. If the spec just said “AC”, it’d be a mess. Kinda like ONVIF?

In software, the work to define the service responsibilities, and to actually validate that services adhere to the interface contract, is often overlooked. One team does a proprietary interface, another uses WCF, a third uses HTTPS/JSON, and all teams think that they’re doing it right and everyone else is wrong. 3rd parties have to juggle proprietary libraries that abstract the communication with the service, or deal with several different interface protocols (never mind the actual data). So imagine a product that has 20 different 3rd party libraries, each with bugs and issues, and each of those 3rd parties issues patches every 6 months. That’s 40 times a year that someone has to decide whether to update or not: “Is there anything in patch 221 that pertains to my installation? Am I using a service that depends on any of those libraries?” and so on.

This just deals with the wiring of the application. Often the UI/UX language differs radically between teams. Do we drag/drop things, or hit a “transfer” button? Can we always filter lists? Etc. Once again, a “city planner” is needed. Someone willing to be the a-hole when a team decides that deviating from the UX language is just fine, because this new design is so much better.

I suppose the problem, in many cases, is that many people think this is the fun part of the job, and everyone has an opinion about it. If you’re afraid of “stepping on toes”, you might end up with a myriad of monoliths glued together with duct tape, communicating via a cacophony of protocols.

OK, this post is already too long;

Monoliths can be fine, but you should probably try to do something service-based. You’re not Netflix or Dell, but a service architecture means a more clearly defined purpose for your code, and that’s a good thing. But above all, define and stick to one means of communication, and it should not be via a library.
IBM

In 2017, IBM spent $5.4 billion on R&D (sources vary on the number). That’s a lot of money. Not as much as Amazon ($23 billion) or Alphabet ($16 billion), but still a pretty good chunk of money. Instead of saying billion, let’s say thousand-million, so IBM spent five-thousand-four-hundred million dollars on R&D.

It’s roughly the same as what they spent in 2005, and their revenue is roughly the same as back then ($88 billion in 2006, $80 billion in 2017).

They’re doing a bunch of things to stay relevant, but while most people have heard about AWS or Azure, I don’t often bump into people who know or use IBM Bluemix (now “IBM Cloud”). Is it any good? I guess it must be; the revenue was, at the time of writing (Jan 2019), about the same as AWS and Azure (between $7 billion and $9 billion), but AWS and Azure are growing very, very fast (~50% per year). As a developer, trying out AWS or Azure for real can be accomplished in a few hours. I tried Bluemix years ago. I gave up. I’m sure I could have gotten it off the ground, but why should I spend days on something that I can do in hours with the other vendors?

Most have heard about IBM’s Watson project. Watson is a project to make a computer that knows everything; it can play Jeopardy, diagnose patients and judge your wardrobe. Reading about the intended purposes of Watson, it seems as if they’re constantly trying new (random) things, only to get beaten by Amazon, Apple, Microsoft or Google in the areas that matter. Morgan Stanley asked ~100 CIOs about their interest in AI, and 43 of them were considering using AI. Of those, 10 preferred IBM Watson. I don’t know what “preferred” means, and I don’t know what they plan to do, but just because IBM has plowed a lot of money into building something that runs on their mainframes doesn’t mean it’s valuable. As a side note, the survey (in the link) converted the numbers to percentages to make them seem more significant, but really, it was just 10 dudes out of 100 who said they “preferred Watson”. SO FAKE NEWS!!! (or at least, take it with a grain of salt).

IBM’s revenue did grow recently, but the growth was driven largely by mainframes (the cloud business also saw an uptick, about 20% growth), and most people are wondering if this is sustainable. Aren’t we all moving towards serverless (à la Amazon Lambda), which basically means “sell me cycles as cheaply as possible”? It smells like commoditization, narrow margins and huge volume – running on the cheapest possible hardware. A game in which IBM’s expensive mainframes will probably struggle. It seems as if IBM is anticipating this, and just took a major step in that direction by paying $34 billion for Red Hat.

IBM basically went from being front and center of the PC revolution to being a large, but mostly invisible, company. It used to be that “PC software” by definition would work on an IBM PC, but then things got bad. For example, 20 years ago there was just one person at the dorm with an actual IBM computer, and it was not compatible with any of the clones the rest of us had. The world had taken the parts that were useful (a common platform) and moved on. IBM thought they were still in control of the platform and could deviate from it. Turned out they were wrong.

Is there a need for IBM any longer?

Sure, banks, governments and insurance companies with huge legacy systems that can’t be moved will keep buying mainframes forever. So there’s certainly a need. But it’s the same sort of need a drug addict has when coming off their drugs. It’s not a “good” need. Does IBM have anything unique and valuable to offer the “cheap reliable cycles” world of the cloud? Do they have anything in the AI space that isn’t being beaten by the usual suspects, or at least will be, as soon as it becomes commercially viable?

We need competition. If we’re left with Azure and AWS, the world will be a boring place. IBM could maybe compete with AWS; I don’t see why they wouldn’t be able to. But perhaps they aren’t willing or capable.