Memory Management in Go:

When starting out with any language I think it's essential to understand how it stores variables, functions, data structures, etc. in memory.

Go stores the variables and objects of a running program in two areas of memory:
1. Stack
2. Heap

The stack is a LIFO (Last In, First Out) data structure where, like in C++, the values that are local to a function are normally stored while that function runs. For example:

package main

import "fmt"

func AddTwoNumbers(n1, n2 int) int {
	result := n1 + n2
	return result
}

func main() {
	n1 := 2
	n2 := 4
	x := AddTwoNumbers(n1, n2)
	fmt.Println(x)
}

Here we have two functions, main() and AddTwoNumbers(), each with its own local variables. When the program runs, main() is the first function to be called, and at that point the stack holds a single frame for main() containing n1, n2 and x.

Then AddTwoNumbers(n1, n2) is called with the two arguments n1 and n2, and a new frame holding its parameters and result is pushed on top of main()'s frame.

Once the result is calculated and returned, the function's frame is popped off the stack.

And similarly, once the last statement of the main() function fmt.Println(x) is executed, main() is also popped off the stack, and the program is finished.

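In contrast, the heap holds values that are still referenced after the function that created them has returned, for example a value whose pointer is returned to the caller or stored in a longer-lived data structure. More complex objects, like Go structs, often end up here, although a struct that never leaves its function can stay on the stack. Package-level variables and constants also outlive every function call, but they are placed in the program's static data rather than allocated on the heap at run time.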

The heap can be viewed as a graph in which objects are nodes, and each node is referred to either from code or from other objects on the heap.

When we define an object that gets placed on the heap, the needed amount of memory is allocated and a pointer to it is returned. As a program runs, the heap will continue to grow as objects are added unless the heap is cleaned up.
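
A minimal sketch of the difference between the two areas, assuming a hypothetical `counter` type and the helpers `newCounter` and `sum`, none of which appear in the original example. `newCounter` returns a pointer to its local variable, so that value has to outlive the function's stack frame and cannot simply live in it. The compiler is free to optimize this further (for instance after inlining the call), and building with `go build -gcflags=-m` prints the decisions it actually made.

```go
package main

import "fmt"

// counter is a hypothetical type used only for illustration.
type counter struct {
	hits int
}

// newCounter returns a pointer to a local value. The value must outlive
// this call, so it cannot stay in newCounter's own stack frame.
func newCounter() *counter {
	c := counter{}
	return &c
}

// sum only uses its values locally, so they can stay on the stack.
func sum(a, b int) int {
	total := a + b
	return total
}

func main() {
	c := newCounter()
	c.hits++ // the counter is still valid after newCounter has returned
	fmt.Println(c.hits, sum(2, 4))
}
```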

How are objects removed from memory?

Objects need to be removed from memory when they are no longer needed. This process is called memory management.
There are two methods in general:

  1. Manual Memory Management

    In a language like C we use functions such as malloc or calloc to allocate memory for an object. These functions return a pointer to the location of that object in heap memory. When the object is not needed anymore, we call the free function to release the memory so that it can be used again. This method of memory management is called "explicit deallocation" and is quite powerful. It gives the programmer greater control over the memory in use, which makes some optimizations easier, particularly in low-memory environments. However, it leads to two types of programming errors.
    Errors:

    1. Calling free prematurely creates a "dangling pointer".
      Dangling pointers are pointers that no longer point to valid objects in memory.
      This is bad because the program expects a defined value to live at a pointer. When this pointer is later accessed there’s no guarantee of what value exists at that location in memory. There may be nothing there, or some other value entirely.

    2. Failing to free memory. If we forget to free an object we may face a "memory leak" as memory fills up with more and more objects. This can lead to the program slowing down or crashing if it runs out of memory. Unpredictable bugs can be introduced into a program when memory has to be explicitly managed.

  2. Automatic Memory Management
    Go offers automatic dynamic memory management, or more simply, garbage collection. Languages with garbage collection offer benefits like:

    - increased security

    - better portability across operating systems

    - less code to write

    - runtime verification of code

    - bounds checking of arrays

Garbage collection has a performance overhead, but it isn’t as much as is often assumed. The tradeoff is that a programmer can focus on the business logic of their program and ensure it is fit for purpose, instead of worrying about managing memory.



**Garbage Collection in Go:**

Go [prefers to allocate memory on the stack](https://groups.google.com/g/golang-nuts/c/KJiyv2mV2pU/m/wdBUH1mHCAAJ), so most memory allocations will end up there. Each goroutine has its own stack, and when possible Go allocates variables on it. The Go compiler attempts to prove that a variable is not needed outside of the function by performing **escape analysis** to see if an object “escapes” the function. If the compiler can determine a variable's [lifetime](https://www.memorymanagement.org/glossary/l.html#term-lifetime), it is allocated on the stack. However, if the variable’s lifetime is unclear, it is allocated on the heap. As a rule of thumb, an object ends up on the heap when a pointer to it may still be in use after the function that created it has returned.  
For example:
package main

type myStruct struct {
	value int
}

var testStruct = myStruct{value: 0}

func myFunction() {
	testStruct.value = 654
}

func main() {
	// some other code
	myFunction()
	// some more code
}

testStruct is a package-level variable, so it is placed in memory that outlives every function call rather than in any function's stack frame. When myFunction is executed, it is allocated a stack frame for the duration of the call.

testStruct.value = 654
The reference to testStruct is followed to its location in memory and the value field is updated.

Because testStruct stays reachable for the whole run of the program, it is never reclaimed. A value on the heap, by contrast, stays there until garbage collection finds that nothing refers to it any more.

Without analysis, the Go runtime doesn’t know whether an object on the heap is still needed. To find out, Go relies on a garbage collector. Garbage collectors have two key parts, a mutator and a collector.

  1. The collector executes garbage collection logic and finds objects that should have their memory freed.

  2. The mutator executes the application code and allocates new objects to the heap. It also updates existing objects on the heap as the program runs, which includes making some objects unreachable when they’re no longer needed.

The implementation of Go’s garbage collector:

Go’s garbage collector is a non-generational, concurrent, tri-color mark-and-sweep garbage collector. Let's see what these terms mean:

The generational hypothesis assumes that short-lived objects, like temporary variables, are reclaimed most often. Thus, a generational garbage collector focuses on recently allocated objects. However, as mentioned before, compiler optimizations allow the Go compiler to allocate objects with a known lifetime on the stack. Fewer short-lived objects therefore reach the heap, so a generational garbage collector would have less to gain in Go. So, Go uses a non-generational garbage collector.

Concurrent means that the collector runs at the same time as mutator threads. Therefore, Go uses a non-generational concurrent garbage collector.

Mark and sweep is the type of garbage collector, and tri-color is the algorithm used to implement it.

A mark-and-sweep garbage collector has two phases:
1. Mark
2. Sweep
In the mark phase, the collector traverses the heap and marks the objects that are still reachable. The follow-up sweep phase reclaims the objects that were left unmarked. Mark and sweep is an indirect algorithm: it marks live objects and removes everything else.
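
The two phases can be sketched on a toy object graph. This is only an illustration of the general technique, not Go's runtime code: the `object` type, its `marked` flag, and the `mark` and `sweep` helpers are all hypothetical names.

```go
package main

import "fmt"

// object is a hypothetical heap node: a name, outgoing references,
// and a mark bit set by the collector.
type object struct {
	name   string
	refs   []*object
	marked bool
}

// mark flags every object reachable from o as live.
func mark(o *object) {
	if o == nil || o.marked {
		return // already visited, which also makes cycles safe
	}
	o.marked = true
	for _, r := range o.refs {
		mark(r)
	}
}

// sweep returns the marked objects and clears their mark bits so the
// next cycle starts fresh. Unmarked objects are simply dropped.
func sweep(heap []*object) []*object {
	var live []*object
	for _, o := range heap {
		if o.marked {
			o.marked = false
			live = append(live, o)
		}
	}
	return live
}

func main() {
	a := &object{name: "a"}
	b := &object{name: "b"}
	c := &object{name: "c"} // nothing refers to c
	a.refs = []*object{b}
	heap := []*object{a, b, c}
	roots := []*object{a}

	for _, r := range roots {
		mark(r)
	}
	heap = sweep(heap)
	for _, o := range heap {
		fmt.Println("live:", o.name)
	}
}
```

Only `a` and `b` survive here, because `c` cannot be reached from the root.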

Go implements this in a few steps:

Go has all goroutines reach a garbage collection safe point with a process called "stop the world". This temporarily stops the program from running and turns a "write barrier" on, to maintain data integrity on the heap. The write barrier is what allows for concurrency: with it in place, goroutines and the collector can run simultaneously.

Once all goroutines have the write barrier turned on, the Go runtime starts the world and has workers perform the garbage collection work.

Marking is implemented by using a tri-color algorithm. When marking begins, all objects are white except for the root objects, which are grey. Roots are the objects from which all other live heap objects can be reached, and they are instantiated as part of running the program. The garbage collector begins marking by scanning stacks, globals, and heap pointers to understand what is in use.

When scanning a stack, the worker stops the goroutine and marks all found objects grey by traversing downwards from the roots. It then resumes the goroutine.

The grey objects are then enqueued to be turned black, which indicates that they’re still in use.

Once all grey objects have been turned black, the collector will stop the world again and clean up all the white nodes that are no longer needed. The program can now continue running until it needs to clean up more memory again.
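
The colour transitions described above can be sketched as a single-threaded worklist algorithm. This is a simplified illustration that leaves out everything that makes the real collector concurrent (write barriers, stopping goroutines, background workers); the `node` type and the `markTriColor` and `whiteNodes` helpers are hypothetical.

```go
package main

import "fmt"

type color int

const (
	white color = iota // not reached yet; garbage if still white at the end
	grey               // reached, but its references are not scanned yet
	black              // reached and fully scanned
)

// node is a hypothetical heap object.
type node struct {
	name  string
	refs  []*node
	color color
}

// markTriColor shades the roots grey, then repeatedly takes a grey node,
// shades its white children grey, and turns the node itself black.
func markTriColor(roots []*node) {
	var greyQueue []*node
	for _, r := range roots {
		if r.color == white {
			r.color = grey
			greyQueue = append(greyQueue, r)
		}
	}
	for len(greyQueue) > 0 {
		n := greyQueue[0]
		greyQueue = greyQueue[1:]
		for _, child := range n.refs {
			if child.color == white {
				child.color = grey
				greyQueue = append(greyQueue, child)
			}
		}
		n.color = black
	}
}

// whiteNodes returns the names of the nodes that were never reached.
func whiteNodes(heap []*node) []string {
	var names []string
	for _, n := range heap {
		if n.color == white {
			names = append(names, n.name)
		}
	}
	return names
}

func main() {
	root := &node{name: "root"}
	kept := &node{name: "kept"}
	orphan := &node{name: "orphan"}
	root.refs = []*node{kept}
	kept.refs = []*node{root} // a cycle does not confuse the marking
	heap := []*node{root, kept, orphan}

	markTriColor([]*node{root})
	fmt.Println("would be swept:", whiteNodes(heap)) // prints: would be swept: [orphan]
}
```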

This process is initiated again once the program has allocated extra memory proportional to the memory in use. The GOGC environment variable determines this and is set to 100 by default. The Go source code describes this as:

If GOGC=100 and we’re using 4M, we’ll GC again when we get to 8M (this mark is tracked in next_gc variable). This keeps the GC cost in linear proportion to the allocation cost. Adjusting GOGC just changes the linear constant (and also the amount of extra memory used).
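
The arithmetic in that comment can be written out as a small helper. `nextGCTarget` is a hypothetical function that only mirrors the quoted rule; the real pacer in the runtime takes more inputs into account, so treat this as a sketch. It also assumes a non-negative percentage. `debug.SetGCPercent` from the standard `runtime/debug` package changes the same setting at run time and returns the previous value.

```go
package main

import (
	"fmt"
	"runtime/debug"
)

// nextGCTarget mirrors the quoted rule: the next cycle starts once the
// heap has grown by gogc percent over the memory in use after the
// previous cycle. gogc is assumed to be non-negative.
func nextGCTarget(liveBytes uint64, gogc int) uint64 {
	return liveBytes + liveBytes*uint64(gogc)/100
}

func main() {
	const mib = 1 << 20
	fmt.Println(nextGCTarget(4*mib, 100) / mib) // 8
	fmt.Println(nextGCTarget(4*mib, 50) / mib)  // 6

	// Lower the percentage for this process, then restore the old value.
	old := debug.SetGCPercent(50)
	defer debug.SetGCPercent(old)
	fmt.Println("previous GOGC setting:", old)
}
```

A lower value makes the collector run more often and keeps the heap smaller, while a higher value trades extra memory for fewer collections.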

Go’s garbage collector improves our efficiency by abstracting memory management into the runtime and is one part of what enables Go to be so performant. Go has built-in tooling to allow us to optimize how garbage collection occurs in our program.