Garbage Collector in Java | How Garbage Collection Works?



Garbage Collection does exactly what its fancier name, “automatic dynamic memory management”, suggests. Dynamic memory is hard to manage by hand; a GC attempts to manage it automatically and relieves the programmer of that task.

A GC takes care of two basic scenarios: removing garbage and avoiding dangling pointers. The two are closely related but are different scenarios.


Garbage Collect

Consider a typical object reference graph. In the graph, every rectangle is an object and every arrow denotes an object reference, so A is an object that references B. For a program to be able to use an object, the object must be on one of these reference chains. A reference chain starts from what is called a root, which is typically a reference held in a register, in a local variable on the stack, or in a global variable.

image

Let’s assume that due to some operation, A relinquishes the reference to B and the graph becomes something like this…

image

Now B, and hence C, is not reachable from any valid root in the program, so both have become garbage (unreachable). Without a GC, the programmer must make sure to follow every such reference and free (de-allocate) the objects. One of the duties of a GC system is to automate this process by tracking down such objects (using various algorithms) and reclaiming the memory they use. So in a GC system, when the reference is broken, the collector will figure out that B and hence C are not reachable and will de-allocate them.
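The reachability idea above can be sketched in a few lines of Java. This is only an illustration of the concept, not how any real JVM collector is implemented: `Node`, `reachableFrom` and `garbage` are hypothetical names introduced here, and the "heap" is just a list of objects we happen to know about.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class ReachabilitySketch {

    // Hypothetical stand-in for a heap object: a name plus its outgoing references.
    static final class Node {
        final String name;
        final List<Node> refs = new ArrayList<>();

        Node(String name) { this.name = name; }

        @Override public String toString() { return name; }
    }

    // Follows every reference chain starting at the roots and
    // returns the set of objects that can be reached.
    static Set<Node> reachableFrom(List<Node> roots) {
        Set<Node> marked = new HashSet<>();
        Deque<Node> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Node current = pending.pop();
            if (marked.add(current)) {          // true only the first time we see it
                pending.addAll(current.refs);
            }
        }
        return marked;
    }

    // Everything on the heap that is not reachable from a root is garbage.
    static List<Node> garbage(List<Node> heap, List<Node> roots) {
        Set<Node> live = reachableFrom(roots);
        List<Node> dead = new ArrayList<>();
        for (Node n : heap) {
            if (!live.contains(n)) {
                dead.add(n);
            }
        }
        return dead;
    }

    public static void main(String[] args) {
        Node a = new Node("A"), b = new Node("B"), c = new Node("C");
        a.refs.add(b);                          // A -> B
        b.refs.add(c);                          // B -> C
        List<Node> heap = List.of(a, b, c);
        List<Node> roots = List.of(a);          // assume A is held by a root

        System.out.println(garbage(heap, roots)); // prints []
        a.refs.remove(b);                       // A relinquishes its reference to B
        System.out.println(garbage(heap, roots)); // prints [B, C]
    }
}
```

Once A drops its reference, neither B nor C can be reached from the root, so both are reported as garbage even though B still references C.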


Hanging/dangling reference

Let’s consider another object graph that is similar to the one above, but in addition to A, another object A’ also has a reference to B (in other words, B is shared between them).

image

Here too, after some operation, object A no longer needs its reference to B. The programmer does what seems right and de-allocates B and C.

image

However, A’ still has a reference to B, so that reference is now hanging or, more specifically, pointing to invalid memory, and it would typically return an unpredictable result when accessed. The key here is the unpredictable behavior. The program will not necessarily crash: until the memory location of B is re-used, it will seem to hold valid data, and de-references from A’ will appear to work fine. So the failure will show up in unexpected ways and in totally unrelated places in the program, which makes locating the root cause extremely hard.
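Java does not let you free an object by hand, so the failure above cannot be reproduced directly. The sketch below instead simulates a manually managed heap to show the "works until the memory is re-used" behavior. `ManualHeap` and its `allocate`, `free` and `read` methods are hypothetical names invented for this illustration; a "pointer" here is just a slot index.

```java
public class DanglingReferenceSketch {

    // Hypothetical manually managed heap: fixed slots, a pointer is a slot index.
    static final class ManualHeap {
        private final String[] slots;
        private final boolean[] inUse;

        ManualHeap(int size) {
            slots = new String[size];
            inUse = new boolean[size];
        }

        // Stores the value in the first free slot and returns its index.
        int allocate(String value) {
            for (int i = 0; i < slots.length; i++) {
                if (!inUse[i]) {
                    inUse[i] = true;
                    slots[i] = value;
                    return i;
                }
            }
            throw new IllegalStateException("out of memory");
        }

        // Marks the slot as free but leaves the old contents in place.
        void free(int pointer) {
            inUse[pointer] = false;
        }

        // Reads the slot without checking whether the pointer is still valid.
        String read(int pointer) {
            return slots[pointer];
        }
    }

    public static void main(String[] args) {
        ManualHeap heap = new ManualHeap(4);

        int b = heap.allocate("B's data");
        int refFromA = b;        // A references B
        int refFromAPrime = b;   // A' shares the same B

        heap.free(refFromA);     // the programmer frees B on A's behalf

        // The slot has not been re-used yet, so the stale reference looks valid.
        System.out.println(heap.read(refFromAPrime)); // prints B's data

        // An unrelated allocation re-uses the freed slot.
        heap.allocate("unrelated data");

        // A' now silently reads someone else's data.
        System.out.println(heap.read(refFromAPrime)); // prints unrelated data
    }
}
```

Nothing crashes at the point of the mistake; the wrong value only shows up later, wherever A’ happens to read through its stale reference.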

GC helps by automatically taking care of both of the above scenarios and ensuring that the system doesn’t end up in either of them.
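As a small illustration of the shared-reference case in Java itself, consider the sketch below. The `Node` class and the variable names are made up for this example. Because A’ still holds a reference to B, B (and C through it) stays reachable, and the collector will not reclaim either of them when A lets go.

```java
public class SharedReferenceSketch {

    // Hypothetical object with a name and a single outgoing reference.
    static final class Node {
        final String name;
        Node ref;

        Node(String name, Node ref) {
            this.name = name;
            this.ref = ref;
        }
    }

    public static void main(String[] args) {
        Node c = new Node("C", null);
        Node b = new Node("B", c);
        Node a = new Node("A", b);
        Node aPrime = new Node("A'", b);   // B is shared between A and A'

        // A no longer needs B: it simply drops the reference.
        // There is no explicit de-allocation step to get wrong.
        a.ref = null;

        // B and C are still reachable through A', so they are still valid.
        System.out.println(aPrime.ref.name);     // prints B
        System.out.println(aPrime.ref.ref.name); // prints C
    }
}
```

Only when A’ also drops its reference do B and C become unreachable and therefore eligible for collection.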