“It was the best decision I could have made”

A man walks into a restaurant.

The menu has many different dishes, so the man has to make a choice. After some deliberation, he picks the veal (always the veal) and soon after, a plate of veal is placed in front of him. He tastes the dish, and then utters

“It was the best decision I could have made”.

This is the sort of thing you tell yourself when you worry that you made a mistake. When you glance at the guy at the next table and watch him eat a delicious steak while you are stuck with your veal. Chances are that the guy eating the steak is wondering if he should really have picked the veal.

This is the paradox of choice. You know that there are other options, and when the options are more or less comparable, then it’s hard to shake that nagging feeling of regret when you finally do make a choice – even if it is an educated one.

Axis ZipStream

Finally…

Important forensic details like faces, tattoos or license plates are isolated and preserved, while irrelevant areas such as white walls, lawns and vegetation are sacrificed by smoothing in order to achieve better storage savings

It detects tattoos?

I think Vivotek has something similar (although relying on manual configuration of the zones), and I’ve rambled about this stuff here too.

Tips on Motion Detection

Update:

We just overhauled the motion detection engine in Cayuga; the results were promising (better than the older 4.2 recorders, and with new cameras and a good integrator, we can run a lot more cameras on the same box). I still believe that motion detection should be done on the edge. But for those who refuse, the new recorders are a lot better.

Motion Detection on the Edge

I think server side motion detection ought to be the last resort (I also think that the VMS should make the choice transparent, so that if a camera has decent motion detection the VMS will simply set up the camera to do it on the edge.)

If the VMS is not able to do this transparently, then most systems will allow you to set it up manually. Usually a royal pain in the ass, but c’est la vie.

Server side motion detection can be very taxing on the CPU. How taxing obviously depends on the accuracy required, both in terms of the number of pixels processed and how often the detection takes place. Generally, lowering the accuracy will also lower the strain on the CPU.
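To make the cost concrete, here's a minimal sketch of what a server side detector ends up doing for every frame of every camera. The RTSP URL, threshold and pixel-count parameters are all hypothetical – a real VMS engine is more elaborate, but the per-pixel grind is the same:

```python
import cv2

# Minimal server-side motion detector: grayscale frame differencing.
# Every frame of every camera runs through this loop on the recording
# server, which is where the CPU cycles go.
def has_motion(prev_gray, gray, diff_threshold=25, min_changed_pixels=500):
    diff = cv2.absdiff(prev_gray, gray)                 # per-pixel difference
    _, mask = cv2.threshold(diff, diff_threshold, 255, cv2.THRESH_BINARY)
    return cv2.countNonZero(mask) > min_changed_pixels  # the "accuracy" knob

cap = cv2.VideoCapture("rtsp://camera.example/stream")  # hypothetical URL
ok, frame = cap.read()
prev_gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    if has_motion(prev_gray, gray):
        pass  # trigger recording / alerting here
    prev_gray = gray
```

Downscaling the frame or only running the diff on every Nth frame is exactly the "lower the accuracy, lower the strain" trade-off.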

Motion detection on the edge incurs no additional CPU load on the server – regardless of the accuracy. So if your system is running hot (due to motion detection), then consider moving the detection to the edge. Edge detection may offer more options, and perform better (or worse, so test before deployment), depending on the camera type.

If you are going to do server side motion detection, consider if the motion detection can happen on a secondary stream of lower resolution (and perhaps lower frame-rate too). This obviously requires more bandwidth on the network, but it might be worth the trade-off.
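The arithmetic behind that trade-off is simple enough; the resolutions and frame rates below are just examples:

```python
# Pixels per second a server-side detector has to chew through per camera
# (illustrative numbers, substitute your own):
recorded_stream  = 1920 * 1080 * 30  # detection on the full 1080p @ 30 fps stream
secondary_stream = 320 * 240 * 8     # detection on a 320x240 @ 8 fps second stream
print(recorded_stream / secondary_stream)  # roughly 100x less work per camera
```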

Be very careful when “overbooking” a server. You may often be able to fit more cameras on one server by recording only when there is motion. Since disk IO can represent a bottleneck, the assumption is that not all cameras will have motion at the same time; if only 30% of the cameras are being recorded at any given moment, the disk IO may not be a bottleneck. Unfortunately, a lot of motion detectors are susceptible to triggering on account of noise in the image. At night, when light is sparse, the camera increases the gain (unless this is turned off). This also increases the amount of noise, which often trips the motion detector and causes the camera to be recorded to disk. Since most cameras experience the same loss of light at the same time, they all get recorded at once – triggering the disk IO bottleneck and wasting space on meaningless recordings. Consider whether post-recording image processing can replace AGC; a very, very unscientific test suggested that we might as well do the AGC on the client.
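A back-of-the-envelope sanity check shows how the overbooking breaks; every number below is an assumption, so plug in your own:

```python
# Hypothetical overbooked recording server:
cameras   = 100
mbps_each = 4    # bitrate of each recorded stream, in Mbit/s
disk_mbps = 250  # effective sustained write budget after seek overhead

daytime = cameras * mbps_each * 0.30  # assume 30% record at once -> 120 Mbit/s
night   = cameras * mbps_each * 1.00  # gain noise trips every detector -> 400 Mbit/s
print(daytime <= disk_mbps, night <= disk_mbps)  # True, False: it breaks at night
```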


If the motion detection is used for alerting, it may be difficult or even impossible to find the right parameters. If the setup is too sensitive, you will get too many false alarms. On the other hand, making the system too insensitive may cause you to miss the event (if you are using the same detection to trigger recording, you may lose the recording completely!). A setup that was appropriate in the winter may not be appropriate in the summer, when foliage, the azimuth of the sun and so on may influence your sensitivity or masking settings.

For real-time alerting, I would recommend a dedicated video analytics engine, preferably installed on the camera.

Off to work…

JPEG vs MPEG in IP Video Surveillance

I am being “called out” by a commenter on a different site. I am “duping” users, and I might not know what I am talking about (buyers beware!).

The commenter claims that JPEG (or MxPEG) is the superior choice for video surveillance. Having observed video of poor quality that had been compressed with MPEG (and I use the term MPEG for all variants here – including H.264 and HEVC), the commenter draws the logical conclusion that MPEG is bad and JPEG is good (because it doesn’t compress as much).

The discussion reminds me of discussions I’ve had with HDcctv proponents. Evangelists that get up on a pedestal never like it when someone questions their authority. Sure, I understand that being told that you are factually wrong is not fun, and I am all too well aware of the Dunning-Kruger effect and the backfire effect, so I’ve chosen not to pursue the discussion any further in that forum. A forum is a bad place to discuss politics and religion, but it’s a horrible place to discuss facts, so why even bother?

There’s an old joke about a CEO, a physicist and an engineer stranded on an island (substitute the CEO for whatever you like). A can of spam drifts ashore, and the three discuss how to open it and share the contents. The physicist suggests that they heat the can, thereby making the contents expand and break the can. The engineer suggests using a sharp rock, but then the CEO interrupts by saying “let’s assume that we have a can-opener…”

A variant of the joke would be this: An internet commenter, a customer and Morten (me) are sitting in a store. The customer says, “I have $2000, I need 7 days retention, 8 fps, and I have 16 cameras.” Morten says, “well, if you use MPEG, you will be able to record your cameras for 7 days at 8 fps. The quality will not be great, but at least you will have something.” Then the commenter says, “Let’s assume that you only needed 1 day retention, or that you only had 2 cameras, or perhaps we can pretend you have $5000 to spend.”

Some of the most hysterical people, when it comes to video quality, are the good people of Hollywood. One of the “low-cost” companies is RED. They have a proprietary codec based on JPEG2000, offering lossy compression at 3:1. But keep in mind that an uncompressed 4K frame clocks in at around 24 MB (assuming 8-bit 4:4:4 YCbCr), so at 3:1 compression, you’d still need to store 8 MB per frame. Sure, 1080p is only 1/4th of that, but it’s still a hell of a lot of data.
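You can reproduce the arithmetic yourself; the 24 fps rate is my assumption, and I’m using UHD (3840×2160) for “4K”:

```python
# Uncompressed 4K frame size and the resulting data rate at RED's 3:1:
width, height, bytes_per_pixel = 3840, 2160, 3  # 8-bit 4:4:4
frame_bytes = width * height * bytes_per_pixel  # ~24.9 MB uncompressed
red_3_to_1  = frame_bytes / 3                   # ~8.3 MB per frame at 3:1
per_second  = red_3_to_1 * 24 / 1e6             # ~199 MB/s at 24 fps
print(frame_bytes / 1e6, red_3_to_1 / 1e6, per_second)
```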

How many customers will be able/willing to afford that kind of bandwidth requirement? Not many. Perhaps zero.

10 years ago, the commenter “published” a document in which he thanks the members of the MPEG group (MPEG stands for Moving Picture Experts Group), yet the reference list simply points to “MPEG committee various published documents”. If I handed in a paper with that sort of reference, I think I would have flunked the course. But this is the Internet; you can “publish” any kind of garbage you want. Usually, when you make a scientific publication, it is peer-reviewed. I don’t know what peers reviewed this nonsense.

For example, the author states that

MPEG/H264 codecs are considered a lossy compression method. You will have no doubt seen the effects of MPEG compression in everyday life when watching normal digital television, for example sports coverage or live event coverage when the bandwidth available is restricted the image sometimes pixelates into blocks and then gradually recovers as a new key frame is transmitted.

The logical conclusion from this would be that a) JPEG isn’t lossy (which it most definitely is), or b) that you would not see blockiness if broadcasters used JPEG (which you most definitely would).

I suppose that I should try to explain why removing temporal redundancy will always lead to better compression. Or that, if you keep the bandwidth constant, MPEG will in 99.9% of all cases deliver better quality. Or try to explain that JPEG also uses macroblocks and also decimates the frequency components prior to Huffman encoding. I won’t. It’s pointless; when you have been clinging to an idea for 10 years, no amount of proof will sway you.
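That said, anyone can demonstrate the temporal-redundancy point in a few lines. In this toy sketch, zlib stands in for a real entropy coder and there are no motion vectors, but the principle survives:

```python
import zlib
import numpy as np

# Two consecutive synthetic "frames": identical except for one small moving patch.
rng = np.random.default_rng(0)
frame1 = rng.integers(0, 256, (480, 640), dtype=np.uint8)
frame2 = frame1.copy()
frame2[200:232, 300:332] = rng.integers(0, 256, (32, 32), dtype=np.uint8)

intra = len(zlib.compress(frame2.tobytes()))             # JPEG-style: code the whole frame
delta = len(zlib.compress((frame2 - frame1).tobytes()))  # MPEG-style: code the difference
print(intra, delta)  # the delta is almost all zeros and compresses vastly better
```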

The cost of making MPEG more efficient is CPU cycles. Encoding efficiently requires more memory and more CPU cycles per frame. You could argue that the cost in CPU cycles is not worth the extra compression, but that’s not what the commenter argues. No, the commenter argues that JPEG/MxPEG is a better choice for video surveillance than MPEG – period. And that’s just bollocks.

JPEG has some advantages over MPEG: it is universally supported in web-browsers, it offers direct random access to each frame, and there is no patent gang (MPEGLA) to deal with. But none of those arguments deal with compression efficiency. The primary problem with JPEG for video is that the temporal redundancy is not removed. MPEG allows you to keep every frame intact too (an I-frame-only stream), but it would be stupid to do so. Throwing away redundant information is a good thing. You can then claim that some people/integrators/systems throw away things that are not redundant, like facial features, license plates and so on, but that sure as hell has nothing to do with the codec itself.

MxPEG tries to get rid of redundant temporal information too. By doing a delta and sending only the macroblocks that changed, it gets better compression than JPEG: either you get a full macroblock, or you get nothing. Practically, it does this by inserting a comment in the EXIF header that describes which blocks have changed compared to the previous frame. But MxPEG is not natively supported by any web-browsers (if you know of one, tell me). Since there is no use of motion vectors, the compression is lower than MPEG (assuming constant quality). MxPEG does not offer random frame access either, which would have been a big advantage to Mobotix.
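To make the block-delta idea concrete, here is a sketch of the principle – all-or-nothing 16×16 blocks, no motion vectors. This is my illustration, not Mobotix’s actual format:

```python
import numpy as np

def changed_blocks(prev, cur, block=16, threshold=10):
    """Return (x, y, pixels) for every block that changed enough to be
    worth transmitting; unchanged blocks are simply never sent."""
    blocks = []
    h, w = cur.shape
    for y in range(0, h, block):
        for x in range(0, w, block):
            a = prev[y:y+block, x:x+block].astype(np.int16)
            b = cur[y:y+block, x:x+block].astype(np.int16)
            if np.abs(b - a).mean() > threshold:  # block changed: ship it whole
                blocks.append((x, y, cur[y:y+block, x:x+block]))
    return blocks
```

Without motion vectors, a panning camera changes every block and you are back to JPEG-level compression, which is exactly why MPEG wins at constant quality.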

Here’s a quick synthetic test I did (JPEG fans will be quick to point out that rarely do we record screens that just show the time, but I urge anyone to perform the same test using their own equipment).

At 600 kbps the JPEG stream looked like this:

JPEG, 160x120, 30 fps

In contrast, the terrible H.264 looked better:

H264, 30 fps

I deliberately did not try to capture right at an I-frame, but went for a frame that would show some of the damaging blocking, and placed it side by side with the JPEG frame.

JPEG on the left

It’s important to stress that this is at the same bandwidth, but also that this test was made in just 10 minutes. If you can make a side by side comparison of JPEG and H.264 at the same bandwidth, where JPEG comes out on top, I sure as shit would like to see it. It certainly will not be a common scenario.

I’ll leave it to the reader to decide if JPEG is the better choice.