Sunday, 15 November 2015

The basics of garbage collection in .NET

Time for some revision again. Every now and then I like to go back to basics and refresh my memory about some aspect of software development, and this time it’s garbage collection in .NET.

If you are looking for a more definitive explanation of garbage collection you can do no better than the Garbage Collection documentation on the MSDN site [1]. This post will only contain a few titbits that I find useful to remember. Go to the horse’s mouth for details.

What is the garbage collector?

One of the key differences between .NET development and many other programming paradigms is that you as the programmer are no longer directly responsible for memory management in your application. That’s not to say you can’t still have memory leaks and other problems – which is why having an understanding of garbage collection is important – but the business of allocating and de-allocating memory is largely taken out of your hands.

The garbage collector is the automatic memory management component of the common language runtime (CLR). It provides a number of benefits including:

  • Enabling you to develop your application without having to free memory yourself.
  • Allocating objects on the managed heap efficiently (a description of the managed heap follows).
  • Reclaiming the memory occupied by objects that are no longer in use.
  • Providing memory safety by ensuring that an object cannot use the content of another object.

This isn’t a get-out-of-jail-free card though. You can still run into a number of problems if you are careless, including running out of memory (out of memory exceptions), high CPU usage during garbage collection, and other performance problems.[2]

The managed heap

The managed heap is a segment of memory used to store managed objects in a .NET process. It is allocated by the garbage collector when the process is initialised by the CLR.

Each process has one managed heap with all threads in the process allocating memory for objects on the same managed heap.

Actually, the managed heap can be considered as two heaps: the large object heap and the small object heap. The large object heap contains very large objects – usually arrays – that are 85,000 bytes and larger.
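To see the 85,000 byte threshold for yourself, here is a minimal sketch that asks the runtime which generation it reports for a small array and for a large one. The class and method names are my own invention for this post; GC.GetGeneration is the real API. Objects on the large object heap are reported as generation 2, so that is what I would expect for the second array.

```csharp
using System;

public static class LargeObjectHeapDemo
{
    // Returns the generation the runtime reports for a freshly allocated byte array.
    public static int GenerationOfNewArray(int lengthInBytes)
    {
        var buffer = new byte[lengthInBytes];
        return GC.GetGeneration(buffer);
    }

    public static void Main()
    {
        // A small array starts life on the small object heap in generation 0.
        Console.WriteLine("1,000-byte array:   generation " + GenerationOfNewArray(1000));

        // An array well over 85,000 bytes goes on the large object heap,
        // which the runtime reports as generation 2.
        Console.WriteLine("100,000-byte array: generation " + GenerationOfNewArray(100000));
    }
}
```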

Managed heap generations

A nice performance optimisation of the garbage collector is its use of generations, of which there are three in the managed heap:

Generation 0
  • This is the youngest generation and contains short-lived objects.
  • Garbage collection occurs most frequently in this generation.
  • Newly allocated objects are implicitly generation 0 unless they are large objects, in which case they go on the large object heap, which is collected only as part of a generation 2 collection.
  • Most objects are reclaimed by a generation 0 garbage collection and do not survive to the next generation.
Generation 1
  • Contains short-lived objects.
  • Serves as a buffer between short-lived objects and long-lived objects.
Generation 2
  • This generation contains long-lived objects.

Garbage collection occurs most frequently in generation 0, and successively less frequently in generations 1 and 2. Details of how this optimisation works follow in the next section.

Generations 0 and 1 are known as the ephemeral generations because the objects they contain are short-lived.

Figure 1 – A super simplified view of the managed heap
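If you want to watch the different collection rates for each generation, something like the following sketch should do. It churns through a lot of short-lived arrays and then prints how many times each generation has been collected. The class, method and field names are hypothetical; GC.CollectionCount and GC.MaxGeneration are real. The exact numbers will vary from machine to machine, so I have not listed any.

```csharp
using System;

public static class CollectionCountDemo
{
    // Holding the latest array in a static field means each allocation really
    // happens on the managed heap and becomes garbage when it is replaced.
    static byte[] latest;

    // Allocates many short-lived arrays so the runtime has garbage to collect.
    public static void CreateGarbage(int arrayCount)
    {
        for (int i = 0; i < arrayCount; i++)
        {
            latest = new byte[1024];
        }
    }

    public static void Main()
    {
        CreateGarbage(1000000);

        // GC.CollectionCount(n) reports how many times generation n has been collected.
        for (int gen = 0; gen <= GC.MaxGeneration; gen++)
        {
            Console.WriteLine("Generation " + gen + " collections: " + GC.CollectionCount(gen));
        }
    }
}
```

A collection of an older generation also collects the younger ones, so the generation 0 count should never be lower than the counts for generations 1 and 2.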

What is garbage collection?

The garbage collector reclaims the memory occupied by dead objects. It also compacts live objects so that they are moved together and the dead space between them is removed. The overall effect is to make the managed heap smaller, making more memory available to the process.

Garbage collection in the CLR is basically a mark and sweep operation which includes compaction of the managed heap. During the marking phase, the garbage collector runs through the objects in the managed heap – actually through a generation in the heap – and marks those objects it identifies as being live, that is, those objects that can still be reached through references in the process. The remaining objects are considered dead and are therefore candidates for clean-up.

So, the phases of garbage collection are:

  • Marking – find and create a list of all live objects.
  • Relocating – update the references to the objects that will be compacted.
  • Compacting – reclaim the space occupied by the dead objects and compact the surviving objects.
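The three phases above can be sketched with a toy model. To be clear, this is not how the CLR is implemented: slots in a list stand in for memory addresses, references are stored as slot indexes, and every name here (ToyObject, ToyCollector and so on) is made up for illustration.

```csharp
using System;
using System.Collections.Generic;

// A toy model only: list slots play the part of memory addresses.
public sealed class ToyObject
{
    public string Name;
    public List<int> References = new List<int>(); // slot indexes of referenced objects
    public bool Marked;

    public ToyObject(string name) { Name = name; }
}

public static class ToyCollector
{
    // Marking: flag everything reachable from the roots as live.
    static void Mark(List<ToyObject> heap, IEnumerable<int> roots)
    {
        var pending = new Stack<int>(roots);
        while (pending.Count > 0)
        {
            ToyObject obj = heap[pending.Pop()];
            if (obj.Marked) continue;
            obj.Marked = true;
            foreach (int target in obj.References) pending.Push(target);
        }
    }

    // Runs one collection over the toy heap and returns the updated root indexes.
    public static List<int> Collect(List<ToyObject> heap, List<int> roots)
    {
        Mark(heap, roots);

        // Relocating: work out which slot each survivor will occupy after compaction...
        var newIndex = new Dictionary<int, int>();
        for (int i = 0; i < heap.Count; i++)
        {
            if (heap[i].Marked) newIndex[i] = newIndex.Count;
        }

        // ...then update every reference, including the roots, to the new slot.
        foreach (int oldIndex in newIndex.Keys)
        {
            List<int> refs = heap[oldIndex].References;
            for (int r = 0; r < refs.Count; r++) refs[r] = newIndex[refs[r]];
        }
        List<int> newRoots = roots.ConvertAll(r => newIndex[r]);

        // Compacting: drop the dead objects and slide the survivors together.
        heap.RemoveAll(o => !o.Marked);
        foreach (ToyObject o in heap) o.Marked = false; // reset for the next run
        return newRoots;
    }

    public static void Main()
    {
        // Slots: 0 = A, 1 = B, 2 = C, 3 = D. Only A is a root.
        var heap = new List<ToyObject>
        {
            new ToyObject("A"), new ToyObject("B"), new ToyObject("C"), new ToyObject("D")
        };
        heap[0].References.Add(2); // A -> C, so C is live
        heap[3].References.Add(1); // D -> B, but nothing reaches D, so both are dead
        var roots = new List<int> { 0 };

        roots = Collect(heap, roots);

        foreach (ToyObject o in heap) Console.WriteLine(o.Name);          // prints A, then C
        Console.WriteLine("A now references slot " + heap[0].References[0]); // slot 1
    }
}
```

After the run, B and D are gone, C has moved from slot 2 to slot 1, and A's reference has been updated to match, which is the relocating step doing its job.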

The survivors of a garbage collection run are promoted to the next generation. This is a neat trick because the garbage collector can perform garbage collections at a different rate for each generation (i.e. more frequently at generation 0). If objects survive to generation 2, it’s a fair assumption that they will be hanging around in the application for a while. As a result, garbage collection can be performed less frequently on generations 1 and 2, thereby saving system resources (garbage collection uses CPU).

So, the following rules describe how objects are promoted through the generations:

  • Objects that survive a generation 0 garbage collection are promoted to generation 1
  • Objects that survive a generation 1 garbage collection are promoted to generation 2
  • Objects that survive a generation 2 garbage collection remain in generation 2
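You can watch an object being promoted with a small experiment along these lines. It forces two collections and records the generation reported for an object that stays referenced throughout. PromotionDemo and TrackGenerations are hypothetical names of mine; GC.GetGeneration, GC.Collect and GC.KeepAlive are real. I would typically expect to see 0, then 1, then 2, but the collector makes its own decisions, so treat that as an expectation rather than a guarantee.

```csharp
using System;

public static class PromotionDemo
{
    // Forces two collections and records the generation of a surviving object each time.
    public static int[] TrackGenerations()
    {
        var survivor = new object();
        var generations = new int[3];

        generations[0] = GC.GetGeneration(survivor); // freshly allocated
        GC.Collect();
        generations[1] = GC.GetGeneration(survivor); // after surviving one collection
        GC.Collect();
        generations[2] = GC.GetGeneration(survivor); // after surviving two collections

        GC.KeepAlive(survivor); // keep the object reachable until this point
        return generations;
    }

    public static void Main()
    {
        int[] g = TrackGenerations();
        Console.WriteLine("{0} -> {1} -> {2}", g[0], g[1], g[2]);
    }
}
```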

Note that in the past the large object heap would not be compacted because of the performance penalty incurred by copying large objects around. From .NET Framework 4.5.1 onwards you can use the GCSettings.LargeObjectHeapCompactionMode property to compact the large object heap on demand.
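As a rough sketch of how that property is used: you set it to CompactOnce and the large object heap is compacted during the next full blocking collection, which the example triggers itself with GC.Collect. The wrapper class and method names are mine. As I understand the documentation, the setting reverts to Default once the compaction has happened, which is why the method returns the mode afterwards so you can check.

```csharp
using System;
using System.Runtime;

public static class LohCompactionDemo
{
    // Requests a one-off compaction of the large object heap, runs a full
    // collection, and returns the compaction mode in effect afterwards.
    public static GCLargeObjectHeapCompactionMode CompactLargeObjectHeapOnce()
    {
        GCSettings.LargeObjectHeapCompactionMode = GCLargeObjectHeapCompactionMode.CompactOnce;
        GC.Collect(); // the large object heap is compacted as part of this collection

        return GCSettings.LargeObjectHeapCompactionMode;
    }

    public static void Main()
    {
        Console.WriteLine("Mode after collection: " + CompactLargeObjectHeapOnce());
    }
}
```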

When garbage collection occurs

Garbage collection occurs when:

  • the system has low physical memory, or
  • the memory that is used by the managed heap is greater than a constantly adjusted threshold, or
  • the GC.Collect method is called
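The last of those triggers is the easiest to demonstrate. In this sketch an array is allocated in a separate method that hands back only a weak reference, so nothing keeps the array alive; after an explicit GC.Collect I would expect the weak reference to report that its target is gone. The class and method names are hypothetical, and the NoInlining attribute is there on the assumption that it stops the JIT from keeping a hidden reference alive in the caller.

```csharp
using System;
using System.Runtime.CompilerServices;

public static class ForcedCollectionDemo
{
    // Allocates an array in its own stack frame and returns only a weak reference,
    // so no strong reference to the array remains once this method returns.
    [MethodImpl(MethodImplOptions.NoInlining)]
    static WeakReference AllocateUnreferenced()
    {
        return new WeakReference(new byte[1024]);
    }

    // Forces a collection and reports whether the unreferenced array was reclaimed.
    public static bool IsCollectedAfterForcedGC()
    {
        WeakReference weak = AllocateUnreferenced();
        GC.Collect(); // an explicit request for a collection
        return !weak.IsAlive;
    }

    public static void Main()
    {
        Console.WriteLine("Collected after GC.Collect: " + IsCollectedAfterForcedGC());
    }
}
```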

Wrapping up

So that’s it for this quick look at garbage collection. There are a number of other concepts to look at including root objects, flavours of garbage collection (workstation or server), concurrency, and object finalization. All suitable subjects for future posts. Stay tuned.

References

[1] Garbage Collection, MSDN
[2] Garbage Collector Basics and Performance Hints, MSDN
