Cost of Error

When I got my first computer, the language it offered was BASIC. Ask any good programmer, and they’ll tell you that BASIC is a terrible language, but it got worse: my next language was 68K assembler on the Commodore Amiga, and with blatant disregard to what Commodore was telling us, I never used the BIOS calls. Instead, I bought the Amiga Hardware Reference Manual and started hitting the metal directly. During my time in school, I was taught Pascal, C++ and eventually I picked up a range of other languages.

What I’ve learned over the years is that the cost of a tool depends on two things: The time it takes to implement something, and (often overlooked) – the time it takes to debug when something goes wrong.

Take “Garbage Collection”, for example, the idea is that you will not have memory leaks because there’s no “new/delete” or “malloc/free” pair to match up. The GC will know when you are done with something you new/malloced and call delete/free when needed. This, surely, must be the best way to do things. You can write a bunch of code and have no worries that your app will leak memory and crash. After all, the GC cleans up.

But there are some caveats. I’ve created an app that will leak memory and eventually crash.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;

namespace LeakOne
{
 class Program
 {
   class referenceEater
   {
     public delegate void Evil ( );
     public Evil onEvil;
     ~referenceEater() {
       Console.WriteLine("referenceEater finalizer");
     }
   }

   class junk
   {
     public void noShit() { }
     public void leak() {
       for (int i = 0; i < 100000; i++) {
          referenceEater re = new referenceEater();
       }
     }

     ~junk() {
     }
   }
 
   static void Main(string[] args) {
      for (int i = 0; i < 1000000; i++) {
       junk j = new junk();
       j.leak();
      }
    }
  }
}

What on earth could cause this app to leak?

The answer is the innocent looking “Console.WriteLine” statement in the referenceEater finalizer. The GC runs in its own thread, and because Console.WriteLine takes a bit of time, the main thread will create millions of referenceEater objects and the GC simply can’t keep up. In other words, a classic producer/consumer problem, leading to a leak, and eventually a crash.

Running this app, the leak is fairly apparent just by looking at the task manager. On my laptop it only takes 5-10 minutes for the app to crash (in 32 bit mode), but in 64 bit mode the app would probably run for days, slowing things down for day, until eventually causing a crash.

It’s a bitch to debug, because the memory usage over a loop is expected to rise until the GC kicks in. So you get this see-saw pattern that you need to keep running for a fairly long time to determine, without a doubt, that you have a leak. To make matters worse, the leak may show up on busy systems, but not on the development machine that may have more cores or be less busy. It’s basically a nightmare.

java-memory-usage-example

There are other ways for .NET apps to leak – a good example is forgetting to unsubscribe from a delegate, which means that instead of matching new/delete, you now have to match subscription and unsubscription from delegates. Fragmentation of the Large Object Heap (not really a leak, but will cause memory use to grow, and ultimately kill the app)

The C++ code I have I can test for leaks by just creating a loop. Run the loop a million times, and when we are done we should have exactly the same amount of memory as before the loop.

I am not proposing that we abandon garbage collection, or that everything should be written in C++, not even by a long shot. As an example, writing a stress test for our web-server (written in C++), was done using node.js. This took less than 2 hours to put together, and I don’t really care if the test app leaks (it doesn’t). There are a myriad of places where I think C# and garbage collection is appropriate. I use C# to make COM objects that get spawned and killed by IIS, and it’s a delight to write those and not having to worry about the many required macros needed if I had done the same in C++.

And with great care, and attention, C# apps would never leak, but the reality is that programmers will make mistakes, and that the cost of this is taken into consideration when trying to determine what tool is appropriate for what task.

Advertisements

Author: prescienta

Prescientas ruler

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s