.NET memory management is an impressive and complex process, but it's not flawless, and neither are we developers. There are a few common misconceptions which need to be dispelled before you can really start working out the memory kinks in your code.
One of the essential ideas behind most Garbage Collector strategies is that the majority of objects die young: programs typically generate lots of temporary objects in the course of a calculation, and most of those objects become garbage almost immediately. Because of this, the .NET Garbage Collector algorithm is designed to reclaim dead objects without processing them all individually (as the alternative is expensive).
Specifically, it only walks across the live objects, and promotes them to Gen1 to keep them safe (see the webinar & downloadable article for more details). Once the Garbage Collector knows that the live objects are safe, and everything else in Gen0 is dead, it can freely reallocate all that memory. So, assuming most objects die young, we've only had to process a very small number of objects (i.e. just the non-garbage) in order to recycle the whole of Gen0. The Garbage Collector actually doesn't care about garbage at all.
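To see this in action, here is a minimal C# sketch (the type and variable names are illustrative, not from the article) in which a batch of temporaries dies in Gen0 while a single surviving object gets promoted:

using System;

class PromotionDemo
{
    static void Main()
    {
        var survivor = new byte[1024];                  // stays reachable throughout
        Console.WriteLine(GC.GetGeneration(survivor));  // typically 0: freshly allocated

        for (int i = 0; i < 100_000; i++)
        {
            var temp = new byte[256];                   // dead as soon as the next iteration starts
        }

        GC.Collect(0);                                  // force a Gen0 collection for the demo
        Console.WriteLine(GC.GetGeneration(survivor));  // typically 1: promoted because it was live
        Console.WriteLine(GC.CollectionCount(0));       // how many Gen0 collections have happened
        GC.KeepAlive(survivor);
    }
}

Only the survivor has to be walked and moved; everything the loop allocated is recycled in bulk when Gen0 is reset.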
This is essentially an extension of Misconception #1. We've seen that the time taken by a Gen0 collection is proportional to the amount of live data in that generation, although there are some fixed overheads for bringing the threads to a safe point. Moving lots of objects around is expensive, but simply doing a Gen0 collection is not inherently a bad thing. Imagine a hypothetical situation in which Gen0 became full, but all the objects taking up the space were dead. No live objects would be moved, and so the actual cost of that collection would be minimal.
In short, doing lots of Gen0 collections is very probably not a problem, unless you're in the situation where most of the objects in Gen0 are still live each time a collection runs.
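As a rough illustration, the following C# sketch (timings are machine-dependent and only indicative) contrasts a Gen0 collection over mostly-dead objects with one where everything has been kept reachable:

using System;
using System.Collections.Generic;
using System.Diagnostics;

class Gen0CostDemo
{
    static void Main()
    {
        // Fill Gen0 with objects that are already dead by the time the collection runs.
        for (int i = 0; i < 200_000; i++) { var tmp = new byte[64]; }
        var sw = Stopwatch.StartNew();
        GC.Collect(0);
        sw.Stop();
        Console.WriteLine($"Gen0 collection, mostly dead objects: {sw.Elapsed.TotalMilliseconds:F3} ms");

        // Fill Gen0 with objects that are still reachable, so the GC has to promote them.
        var live = new List<byte[]>();
        for (int i = 0; i < 200_000; i++) { live.Add(new byte[64]); }
        sw.Restart();
        GC.Collect(0);
        sw.Stop();
        Console.WriteLine($"Gen0 collection, mostly live objects: {sw.Elapsed.TotalMilliseconds:F3} ms");

        GC.KeepAlive(live);
    }
}

The first collection has very little live data to deal with, so it should complete much faster than the second, which has to promote every surviving array.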
Windows comes with a notion of a Performance Counter: essentially a statistic that gets periodically updated by the system. Looking at these values from time to time is useful if you want to deduce what's happening inside the system. The .NET framework exposes a number of these counters, which various tools can display in pseudo-realtime.
The key terms here are "periodically" and "pseudo-realtime". For example, the memory counters only update during Garbage Collections, and are therefore prone to sudden, sharp changes, even though the underlying behavior is much smoother and more consistent. Similarly, some counters are driven by information you may not be aware of, and can therefore be misleading. Know your sources!
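For instance, a sketch along these lines (Windows-only, using the .NET Framework counter category ".NET CLR Memory"; newer runtimes expose similar data through EventCounters instead, and the instance name is assumed to match the process name) polls the managed heap size once a second:

using System;
using System.Diagnostics;
using System.Threading;

class CounterPollDemo
{
    static void Main()
    {
        string instance = Process.GetCurrentProcess().ProcessName;
        using (var heapBytes = new PerformanceCounter(
            ".NET CLR Memory", "# Bytes in all Heaps", instance, readOnly: true))
        {
            for (int i = 0; i < 10; i++)
            {
                // This value only changes when a garbage collection actually runs,
                // so consecutive readings can look flat and then jump suddenly.
                Console.WriteLine($"{DateTime.Now:T}  {heapBytes.NextValue():N0} bytes in all heaps");
                Thread.Sleep(1000);
            }
        }
    }
}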
In one sense, this statement is literally true, but there are problems in .NET which have the same symptoms. It's ultimately a question of definition:
When you used malloc and free to manage memory yourself, a leak was what happened any time you forgot to do the free part. Basically, you'd allocate some memory, do some work with it, and then forget to release it back to the runtime system.
The good news is that the .NET framework now takes care of freeing objects for you, and it is also ultra-cautious: it works out whether a particular object might still be needed while your program runs, and it will only release that object if it can completely guarantee that the object is not going to be needed again.
Unfortunately, the framework's careful approach also leaves loopholes (through library caches, user interfaces, and even the runtime and compiler themselves) which allow objects to live longer than they should.
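The classic user-interface example is an event subscription that never gets removed. In this hedged C# sketch (the type and member names are invented for illustration), a long-lived publisher's event delegate keeps every subscriber reachable, so the GC can never prove they are garbage:

using System;

class Publisher
{
    public event EventHandler SomethingHappened;   // the delegate list references every subscriber
    public void Raise() => SomethingHappened?.Invoke(this, EventArgs.Empty);
}

class Subscriber
{
    private readonly byte[] _buffer = new byte[1_000_000];  // makes the retained memory visible

    public void Attach(Publisher p) => p.SomethingHappened += OnSomethingHappened;
    public void Detach(Publisher p) => p.SomethingHappened -= OnSomethingHappened;

    private void OnSomethingHappened(object sender, EventArgs e) => Console.WriteLine(_buffer.Length);
}

class Program
{
    static void Main()
    {
        var publisher = new Publisher();   // imagine this lives for the whole application

        for (int i = 0; i < 100; i++)
        {
            var s = new Subscriber();
            s.Attach(publisher);
            // s goes out of scope here, but the event still references it, so roughly
            // 100 MB stays reachable. Calling s.Detach(publisher) first would release it.
        }

        Console.WriteLine($"{GC.GetTotalMemory(forceFullCollection: true):N0} bytes still reachable");
        GC.KeepAlive(publisher);
    }
}

Nothing here is a leak in the malloc/free sense, but the effect on your process's memory footprint is exactly the same.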
As briefly mentioned in Misconception #2, it's a bad idea to use a copy-&-promote strategy for very large objects, because it takes a very long time (and is very expensive) to copy those sorts of objects around. Copy-&-promote is fine for objects smaller than 85,000 bytes, but for larger objects the designers of the .NET memory management system opted to hybridize the GC, resorting to a technique called "mark-and-sweep".
Rather than promoting large live objects to another generation, the GC leaves them in place, keeping a record of the free areas around them, and using that record to allocate new objects. Crucially, there is no compaction. Unfortunately, because new objects are not the same size as the collected dead objects, they don't always fit into the gaps left behind, leading to fragmentation and unnecessary allocation of additional memory.
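The size boundary is easy to observe directly. In this small C# sketch (the array sizes are just illustrative, chosen to sit on either side of the commonly documented 85,000-byte threshold), the large array reports as generation 2 immediately after allocation because it lives on the large object heap rather than in Gen0:

using System;

class LargeObjectDemo
{
    static void Main()
    {
        var small = new byte[84_000];   // below the threshold: a normal Gen0 allocation
        var large = new byte[85_000];   // at or above the threshold: large object heap

        Console.WriteLine($"small array: Gen {GC.GetGeneration(small)}");  // typically 0
        Console.WriteLine($"large array: Gen {GC.GetGeneration(large)}");  // typically 2

        GC.KeepAlive(small);
        GC.KeepAlive(large);
    }
}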