
.NET Memory Management 1: Stack, Heap and Strings

Hey there. This week I decided to focus on memory management after a recent interview. I had surface-level knowledge of the topic and wanted to go deeper. What is the stack, what is the heap, why does string live on the heap, what happens behind the scenes when CLR creates an object, how does Span<T> work without allocating memory, when does GC kick in, who benefits from IDisposable: I'll cover all of it in order over this series.

But first, let's start from the basics. What exactly are "stack and heap"?

Finding Space for Data

When you define a variable:

int x = 5;
string name = "Erdinc";
User user = new User { Name = "Erdinc", Age = 30 };

Where do x, name, and user live? In your computer's RAM - okay. But where in RAM? In some corner, the middle? Answer: it depends on the type.

The .NET runtime uses two main regions for memory management:

| Region | Description |
|---|---|
| Stack | A LIFO (Last-In-First-Out) structure where method calls, local variables, and parameters live. Small, fast, automatically cleaned. |
| Heap | The region where dynamically allocated, long-lived objects live. Large, managed by the GC. |

Understanding the difference between the two is the first step in grasping the performance of the code you write.

Stack: LIFO Structure

The best analogy for understanding how the stack works is a stack of plates. When you want to add a plate, you put it on top. When you want to take one, you take the top one. You can't pull one from the middle. The last plate you put on is the first one taken. That's LIFO.

[Plate 3]  ← Last placed, first taken (Last-In, First-Out)
[Plate 2]
[Plate 1]  ← First placed, last taken (First-In, Last-Out)

This is exactly what happens on the stack. Methods run stacked on top of each other, and when a method finishes, its variables are cleaned up immediately:

/*
 * When Calculate finishes, all its variables (a, b, result) are
 * removed from the stack. But what exactly is this "removal"?
 *
 * At the CPU level (x64), the stack is tracked by the RSP (Stack Pointer)
 * register. When entering a method, RSP is moved down (sub rsp, N),
 * reserving N bytes of space. When the method returns, RSP is moved
 * back up (add rsp, N), "freeing" that space.
 *
 * This is a stack pointer adjustment. ("Stack unwinding" usually refers
 * to popping frames while an exception propagates.)
 *
 * Key point: the data in the freed region is NOT physically erased.
 * The region is simply no longer in use, and the next method call
 * overwrites the same addresses with its own values. A language that
 * lets you read an uninitialized local can therefore hand you leftover
 * garbage from a previous call. C# rules this out at compile time.
 */
void Calculate()
{
    int a = 10;       // a = 10 written to stack
    int b = 20;       // b = 20 written to stack
    int result = Add(a, b);  // Add method called
}   // When Calculate returns, RSP is pushed up, frame is freed

int Add(int x, int y)
{
    int sum = x + y;  // New frame for x, y, sum opened on stack
    return sum;       // When Add returns, its own frame is freed
}

The stack grows as methods are called and shrinks as methods return. Each method creates its own stack frame. The frame contains the method's parameters, local variables, and return address. Visually:

flowchart TB
    subgraph T1["1. Calculate called"]
        direction LR
        A1["[Empty]"]
    end
    subgraph T2["2. int a = 10"]
        direction LR
        A2["a = 10"]
    end
    subgraph T3["3. int b = 20"]
        direction LR
        A3["b = 20"]
        A3b["a = 10"]
    end
    subgraph T4["4. Add(10, 20) called"]
        direction LR
        A4["sum = 30"]
        A4b["y = 20"]
        A4c["x = 10"]
        A4d["--- frame boundary ---"]
        A4e["b = 20"]
        A4f["a = 10"]
    end
    subgraph T5["5. Add returned (frame removed)"]
        direction LR
        A5["result = 30"]
        A5b["b = 20"]
        A5c["a = 10"]
    end
    subgraph T6["6. Calculate returned (all removed)"]
        direction LR
        A6["[Empty]"]
    end

    T1 --> T2 --> T3 --> T4 --> T5 --> T6

Notice: when Add returns, x, y, and sum are removed immediately, along with the frame. We don't wait for the GC to come. This is called deterministic cleanup.

Garbage Values and Memory Safety

We said earlier: the data in the freed stack region is not physically deleted. So what happens if we try to access this "unswept" data?

The C++ Side: Here's an Address, Go Wherever You Want

C++ puts up no barriers. If you declare a variable on the stack, don't initialize it, and try to read it:

#include <iostream>

void Foo()
{
    int previousData = 42;    // left a value on the stack
}

void Bar()
{
    int x;                    // not initialized!
    std::cout << x;           // undefined behavior: may print 42 or anything else
}

When Bar is called, its stack frame might land exactly on the region Foo left behind. x's value might then be 42 - but this is pure coincidence. In another run, or with another compiler or optimization level, you may get a completely different number. This is called undefined behavior. The compiler is not required to report an error (at best you get a warning), and the program usually doesn't crash; you just get wrong results without ever knowing it.

Even worse, C++ pointer arithmetic lets you step completely outside the stack and access completely different data within the same process:

int arr[3] = {1, 2, 3};
int* p = arr;
p += 100;                // 100 elements past the array - undefined region
std::cout << *p;         // reads whatever is there

On modern operating systems, thanks to virtual memory, you can't access another program's memory - the MMU won't allow it, and you'd get a segfault. But within your own process, any address that happens to be mapped can be read, whether or not it belongs to the variable you meant.

The C# Side: The Compiler Blocks You

C# is the polar opposite of C++ in this regard. C# memory operations are safe. Let's try the same scenario in C#:

void Bar()
{
    int x;
    Console.WriteLine(x);  // COMPILE ERROR: Use of unassigned local variable 'x'
}

The C# compiler performs definite assignment analysis. It requires every local variable to be assigned a value before it is read. If it isn't, you get an error at compile time - the code never even reaches runtime.
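
To make the rule concrete, here is a minimal sketch of two cases the definite assignment analysis accepts. The class and method names (DefiniteAssignmentDemo, Sign, ParseOrFallback) are made up for illustration; only int.TryParse is a real library call.

```csharp
using System;

static class DefiniteAssignmentDemo
{
    // Compiles: every path through the if/else chain assigns 'sign'
    // before it is read by the return statement.
    public static int Sign(int n)
    {
        int sign;
        if (n < 0) { sign = -1; }
        else if (n == 0) { sign = 0; }
        else { sign = 1; }
        return sign;
    }

    // Compiles: an 'out' argument counts as definitely assigned
    // once the call has returned.
    public static int ParseOrFallback(string text, int fallback)
    {
        int value;
        bool ok = int.TryParse(text, out value);
        return ok ? value : fallback;
    }
}
```

If you delete the final else branch in Sign, the compiler reports CS0165 on the return statement, because one path now reaches the read without an assignment.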

The reason is the one described above: reading an uninitialized variable means getting whatever happens to be in memory at that moment. That arbitrary value can break your program's logic.

Consider a hospital software system:

// This code won't compile in C# - but let's assume it did
int insulinDose;                  // not assigned!
AdministerInsulin(patient, insulinDose);  // applies a random dose

If insulinDose picked up a leftover value from memory (say, 999), the patient could die. C# enforces the "assign before use" rule precisely to prevent this scenario. C++ offers no such guarantee - the program silently keeps running with the wrong value.

You can use pointers in C#'s unsafe context, but that's a deliberate choice and requires the /unsafe compiler flag. In normal C# code, you cannot accidentally read an uninitialized value.

Summary: Why Does C# Block It?

| Situation | C++ | C# |
|---|---|---|
| Reading uninitialized local variable | Compiles, returns random value (undefined behavior) | Compile error (CS0165) |
| Going outside the stack with a pointer | Compiles, segfault or random data | Can't do it without unsafe |
| Array bounds overflow | Compiles, undefined behavior | Throws IndexOutOfRangeException |
| Using a null reference | Compiles, segfault | Throws NullReferenceException |

The C# runtime performs extra checks to protect you. These checks have a performance cost (like array bounds checking), but it's worth it for safety and debugging convenience.
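
The last two rows of the table can be observed directly. The following is a small sketch (SafetyChecksDemo and its methods are hypothetical names) that triggers both runtime checks and catches the resulting exceptions:

```csharp
using System;

static class SafetyChecksDemo
{
    // Indexing past the end of an array fails the bounds check
    // instead of reading neighbouring memory.
    public static string ReadPastEnd()
    {
        int[] arr = { 1, 2, 3 };
        int index = 100;
        try
        {
            return arr[index].ToString();
        }
        catch (IndexOutOfRangeException)
        {
            return "IndexOutOfRangeException";
        }
    }

    // Elements of a new string[] are null; dereferencing one throws
    // instead of touching address zero.
    public static string UseNull()
    {
        string[] names = new string[1];
        try
        {
            return names[0].Length.ToString();
        }
        catch (NullReferenceException)
        {
            return "NullReferenceException";
        }
    }
}
```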

Stack properties:

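  • Size: Typically 1-8 MB per thread, depending on platform and configuration. It is rarely exhausted in practice, but deep recursion or large stackalloc buffers can overflow it (StackOverflowException).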
  • Speed: Allocation and deallocation is a single CPU instruction (stack pointer movement). Cache-friendly.
  • Lifetime: Limited to method scope. When the method ends, the variable dies.
  • What lives here: Local variables and parameters of value types such as int, bool, double, struct, enum. Also the references (addresses) of reference types.

The most critical rule of the stack: the size of a stack frame must be known before the method runs, so data whose size is only known at runtime generally cannot live there (stackalloc is the deliberate exception).

That's why:

byte[] buffer = new byte[4096];  // buffer reference (8 bytes) on stack, 4096 bytes on heap

The buffer variable (an 8-byte address) is on the stack, but the 4096 bytes it points to are on the heap, because the lifetime of those 4096 bytes might not end when the method finishes - someone else might still be using them.

The Real Structure of an Array on the Heap

So how exactly does byte[4096] sit on the heap? And when you write buffer[150], how does CLR find the byte at index 150? How does it know the boundary? Let's break it down.

The structure of an array object on the heap:

flowchart LR
    subgraph Stack["Stack"]
        U["buffer = 0x00E100 (starting address of byte[] object on heap)"]
    end
    subgraph Heap["Heap"]
        direction TB
        subgraph Arr["byte[] object (address: 0x00E100)"]
            direction LR
            A1["MethodTable* (0x00A000) (8 bytes, address of Byte[] MethodTable)"]
            A2["Length = 4096 (4 bytes)"]
            A3["padding (4 bytes)"]
            A4["[0] = 0 (1 byte)"]
            A5["[1] = 0 (1 byte)"]
            A6["... (4094 bytes)"]
            A7["[4095] = 0 (1 byte)"]
        end
        subgraph MT["Byte[] MethodTable (0x00A000)"]
            direction LR
            M1["BaseSize: 16"]
            M2["ComponentSize: 1"]
            M3["ElementType: System.Byte"]
            M4["Rank: 1"]
            M5["IsArray: true"]
        end
    end
    U -.->|"holds the heap address"| Arr
    A1 -.->|"holds the MethodTable address"| MT

MethodTable* (8 bytes): This is the address of the MethodTable belonging to the Byte[] type, a memory address like 0x00A000. (The diagram draws the MethodTable next to the object for simplicity; the runtime keeps MethodTables in its own memory, not among the GC-managed objects.) CLR reads this address to learn what type the object is, the size of its elements, its boundaries, and its behavior (methods). Each type has one MethodTable, and all objects of that type share the same MethodTable.

Notice: the buffer variable on the stack holds the value 0x00E100. This value is the starting address of the byte[] object on the heap. In other words, buffer is the "address card" for that 4096-byte region on the heap. When you hold this reference, CLR can do the following:

1. Learn the type: buffer.GetType() → first goes to the address 0x00E100 that buffer holds, reads the 8-byte MethodTable* value (0x00A000) there. 0x00A000 is the address of the Byte[] MethodTable. From that MethodTable, it retrieves IsArray = true, ElementType = System.Byte, Rank = 1.

2. Learn the length: buffer.Length → reads the 4-byte Length value at offset 8 from 0x00E100. It's 4096.

3. Access an element: When you write buffer[150], CLR takes three controlled steps:

1. Bounds check:    if (150 < 0 || 150 >= 4096) → throw IndexOutOfRangeException
2. Address calc:    target = 0x00E100 + 16 (header) + (150 × 1) (componentSize)
                    = 0x00E100 + 16 + 150
                    = 0x00E100 + 0xA6
                    = 0x00E1A6
3. Read/Write:      *(byte*)0x00E1A6

The formula breakdown:

| Part | Value | Source |
|---|---|---|
| Base address | 0x00E100 | buffer reference on the stack |
| Header size | 16 bytes (8 MT + 4 Length + 4 padding) | BaseSize in Array MethodTable |
| ComponentSize | 1 (for byte) | ComponentSize field in Array MethodTable |
| Length | 4096 | At offset 8 in the object |
| index | 150 | Your code |

MethodTable has two more special fields for arrays:

  • ComponentSize: The size of each element in bytes. 1 for byte[], 4 for int[], the struct size for a struct array.
  • BaseSize: The size of the array's header (MT + Length + padding). Actual object size = BaseSize + (Length × ComponentSize).

CLR gets everything it needs for buffer[150] from the MethodTable and the Length field. It doesn't need a separate "start/length" struct, because an array always starts at index 0.
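
The arithmetic is easy to replay in code. This is a sketch of the article's simplified model only (16-byte header, no object header in front of the MethodTable pointer), not of how the runtime is actually implemented; ArrayAddressModel and its members are hypothetical names, and the addresses are plain numbers rather than real pointers.

```csharp
using System;

static class ArrayAddressModel
{
    // Simplified header from the text: 8 (MethodTable*) + 4 (Length) + 4 (padding).
    public const long HeaderSize = 16;

    // Bounds check first, then base + header + index * componentSize.
    public static long ElementAddress(long baseAddress, int componentSize, int length, int index)
    {
        // A single unsigned comparison covers both index < 0 and index >= length,
        // because a negative int becomes a very large uint.
        if ((uint)index >= (uint)length)
        {
            throw new IndexOutOfRangeException();
        }
        return baseAddress + HeaderSize + (long)index * componentSize;
    }

    // Object size in this model = header + Length * ComponentSize.
    public static long ObjectSize(int componentSize, int length)
    {
        return HeaderSize + (long)length * componentSize;
    }
}
```

Calling ElementAddress(0x00E100, 1, 4096, 150) replays the buffer[150] calculation, and ObjectSize(1, 4096) gives the 4112 bytes the byte[4096] occupies in this model.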

Start/Length in Span<T>

A struct that carries a start position and a length does exist: Span<T>. Span is a view struct that works on top of an array:

byte[] fullBuffer = new byte[4096];
Span<byte> slice = fullBuffer.AsSpan(100, 50);  // start at byte 100, take 50 bytes

Span itself is a struct (value type) living on the stack and contains:

flowchart LR
    subgraph Stack["Stack"]
        direction TB
        subgraph Span["Span<byte> slice (stack)"]
            S1["_reference (8 bytes, holds heap address of element 100)"]
            S2["_length = 50 (4 bytes)"]
        end
    end
    subgraph Heap["Heap"]
        direction TB
        subgraph Arr["byte[] fullBuffer (0x00E100)"]
            direction LR
            MT["MethodTable* (8)"]
            LEN["Length = 4096 (4)"]
            PAD["padding (4)"]
            EL0["[0]...[99] (100 bytes)"]
            EL100["[100] = slice start"]
            EL149["[149] = slice end"]
            EL150["[150]...[4095]"]
        end
    end
    S1 -.->|"skips the header, points directly to element 100"| EL100

Span's _reference field does not point to the start of the array, but to the start of the slice (with the offset already added). When indexing on a Span, CLR does:

1. Slice bounds:     if (index < 0 || index >= 50) → error (Span's _length)
2. Actual address:   target = _reference + (index × 1)

As you can see, Span carries the "start and length" information. But byte[] itself does not - an array always starts at 0, and its length is embedded in the object. The start/length pair comes into play when a view is needed.
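
A short sketch shows that the slice really is a view and not a copy: writes through the Span land in the original array, and the Span enforces its own 50-element boundary. SpanSliceDemo and its method names are made up for this example; AsSpan, Fill, and the indexer are the real Span<T> API.

```csharp
using System;

static class SpanSliceDemo
{
    // Writes through the slice and returns the underlying array for inspection.
    public static byte[] FillSlice()
    {
        byte[] fullBuffer = new byte[4096];
        Span<byte> slice = fullBuffer.AsSpan(100, 50); // view over [100..149], no copy
        slice.Fill(0xFF);                              // writes into fullBuffer[100..149]
        slice[0] = 1;                                  // slice index 0 is array index 100
        return fullBuffer;
    }

    // Index 50 is valid for the array but outside the 50-element slice.
    public static bool IndexOutsideSliceThrows()
    {
        byte[] fullBuffer = new byte[4096];
        Span<byte> slice = fullBuffer.AsSpan(100, 50);
        try
        {
            slice[50] = 1;
            return false;
        }
        catch (IndexOutOfRangeException)
        {
            return true;
        }
    }
}
```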

Where Does the Element Type Come From?

When reading buffer[150], CLR knows how large each element is from the ComponentSize value, and what type it is from the element type recorded in the MethodTable. The Byte[] MethodTable has ComponentSize = 1. For Int32[] it's 4, for Double[] it's 8.

What if the element is a reference type? Consider string[]:

string[] names = new string[3];

This array's structure on the heap:

[MethodTable* → String[]]    (8 bytes)
[Length = 3]                 (4 bytes)
[padding]                    (4 bytes)
[names[0] = null]            (8 bytes - string reference)
[names[1] = null]            (8 bytes - string reference)
[names[2] = null]            (8 bytes - string reference)
Total: 16 + (3 × 8) = 40 bytes

Here, ComponentSize = 8 (reference size on 64-bit). Each element is a reference; the string itself is elsewhere on the heap. The array only holds addresses. When reading names[1]:

1. Address:   0x00E100 + 16 + (1 × 8) = 0x00E118
2. Read:      *(string*)0x00E118 → 0x00F200 (address of the string)
3. Go to string: read the string object at 0x00F200

GC also tracks these references. As long as the string[] array itself is alive, the strings pointed to by the array's references also stay alive (even if nothing else references them).

Heap and MethodTable

Think of the heap as a large plot of land. When you need space, you allocate a parcel (allocation); when you're done, you clear it (GC collects). Unlike the stack, there's no order here - each parcel is independent.

But the real question is what CLR does behind the scenes when an object is created on the heap. Let's step through what happens the moment you write new User().

What Happens When CLR Creates an Object?

var user = new User { Name = "Erdinc", Age = 30 };

When this line executes, CLR follows these steps:

  1. Look up type info. Does CLR know the User type? Where is the MethodTable for the User class? (If it's the first use, the type is loaded - type load.)

  2. Allocate memory. It looks at the BaseSize field in the MethodTable. This field tells how many bytes an object of type User will occupy on the heap. That many bytes are allocated from the GC heap.

  3. Write the MethodTable pointer. The address of the MethodTable is written to the first 8 bytes (64-bit) or 4 bytes (32-bit) of the allocated memory. This pointer (TypeHandle) is a critical reference that tells the runtime which type the object belongs to.

  4. Zero out fields. The remaining space is zeroed. int → 0, string → null, bool → false.

  5. Call constructor. The User constructor runs. With the object initializer used here, the Name and Age property setters then write the actual values to the fields.

  6. Return the reference. The starting address of this heap-allocated object is assigned to the user variable. This address is stored on the stack.

Visually:

flowchart LR
    subgraph Stack["Stack"]
        direction TB
        U["user = 0x001A40 (8 bytes)"]
    end

    subgraph Heap["Heap"]
        direction TB
        subgraph Obj["User object (address: 0x001A40)"]
            direction LR
            MT["MethodTable* (8 bytes)"]
            F1["Age = 30 (4 bytes)"]
            PAD["padding (4 bytes)"]
            F2["Name = 0x00B210 (8 bytes)"]
        end
        subgraph MTBlock["User MethodTable"]
            direction LR
            MTInfo["BaseSize: 24
            EEClass*: 0xF0A100
            Parent MethodTable*: System.Object
            Interface count: 0
            Method slots: ..."]
        end
        subgraph Str["String 'Erdinc' (0x00B210)"]
            direction LR
            SM["MethodTable* (System.String)"]
            SL["Length = 6"]
            SC["chars: E r d i n c"]
        end
    end

    U -.->|"points to"| Obj
    MT -.->|"type info"| MTBlock
    F2 -.->|"Name points to"| Str
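
Steps 4 and 5 can be observed from inside a constructor. In this sketch, ZeroInitUser is a hypothetical stand-in for the article's User class; its constructor records what the fields look like at the moment it starts running, before any initializer values have been assigned.

```csharp
using System;

// Hypothetical variant of the article's User class, with public fields
// so the zeroed state is easy to inspect.
class ZeroInitUser
{
    public string Name;
    public int Age;
    public bool IsActive;
    public string SeenByConstructor;

    public ZeroInitUser()
    {
        // At this point the memory has been zeroed but nothing has been
        // assigned yet: Name is null, Age is 0, IsActive is false.
        SeenByConstructor = $"{Name ?? "null"}/{Age}/{IsActive}";
    }
}
```

With var u = new ZeroInitUser { Name = "Erdinc", Age = 30 }; the constructor still sees the zeroed state, because the object initializer assigns Name and Age only after the constructor has returned.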

What is MethodTable and What Does It Do?

Every type has a MethodTable. CLR creates this table when the type is first loaded, and all objects of that type share the same MethodTable. Even if you create a million User objects, there is only one User MethodTable.

MethodTable contents (simplified):

| Field | Description |
|---|---|
| BaseSize | How many bytes an object of this type occupies on the heap. The allocator reads it to decide how much memory to reserve. |
| EEClass pointer | Points to the EEClass structure. Field offsets, interface list, property metadata live here. |
| Parent MethodTable | The inheritance chain. For User, the parent is System.Object's MethodTable. |
| Interface count and interface map | Which interfaces the type implements. Used for casts and is checks. |
| Method slots | Addresses of virtual methods such as ToString() and GetHashCode(). |

So when you call user.GetType(), CLR's only job is to read the MethodTable pointer from the first 8 bytes of the object and return the corresponding Type object to you.
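
The "one MethodTable per type" idea is visible from managed code. The sketch below uses a hypothetical TypeDemoUser class and checks that two instances report the very same Type object. It also compares Type.TypeHandle.Value; on current runtimes that handle is understood to correspond to the MethodTable address, but treat that as an implementation detail rather than a guarantee.

```csharp
using System;

// Hypothetical stand-in for the article's User class.
class TypeDemoUser
{
    public string Name;
    public int Age;
}

static class SharedTypeDemo
{
    // Two different objects, one shared Type object.
    public static bool SameTypeObject()
    {
        var a = new TypeDemoUser { Name = "A" };
        var b = new TypeDemoUser { Name = "B" };
        return ReferenceEquals(a.GetType(), b.GetType())
            && ReferenceEquals(a.GetType(), typeof(TypeDemoUser));
    }

    // The runtime handle behind the Type is identical as well.
    public static bool SameTypeHandle()
    {
        var a = new TypeDemoUser();
        var b = new TypeDemoUser();
        return a.GetType().TypeHandle.Value == b.GetType().TypeHandle.Value;
    }
}
```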

Namespace and MethodTable Relationship

MethodTables are created per type, not per namespace. A namespace is a logical grouping used at compile time. At runtime, there is no such thing as a "namespace MethodTable."

A type's full runtime identity consists of three parts:

Assembly + Namespace + TypeName

For example:

namespace MyApp.Models
{
    public class User { ... }     // Identity: MyApp.dll + MyApp.Models + User
}

namespace MyApp.DTOs
{
    public class User { ... }     // Identity: MyApp.dll + MyApp.DTOs + User
}

These two User classes have the same name but different namespaces. Therefore, CLR creates two separate MethodTables for them. Each has its own BaseSize, field offsets, and method slots. Models.User and DTOs.User are completely different types that just happen to share a name.

Namespace information is not stored as a separate field inside the MethodTable; it's held as part of the type's name. When you call Type.FullName, you get "MyApp.Models.User"; this string comes from the type's metadata. CLR uses this full name when loading a type:

1. Does the assembly contain a type called "MyApp.Models.User"?
2. If yes, load its MethodTable (or use it if already loaded)
3. If not, throw TypeLoadException

So, the namespace is necessary for type resolution, but the MethodTable itself is type-based, not namespace-based. 50 classes in the same namespace = 50 separate MethodTables.
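
Here is a compilable version of the two-namespace example. The field lists are invented for illustration (the original elides the class bodies), and NamespaceIdentityDemo is a hypothetical helper:

```csharp
using System;

namespace MyApp.Models
{
    public class User { public string Name; }
}

namespace MyApp.DTOs
{
    public class User { public string Name; public int Age; }
}

namespace MyApp
{
    public static class NamespaceIdentityDemo
    {
        // The namespace is part of the type's full name.
        public static string ModelName() { return typeof(Models.User).FullName; }
        public static string DtoName() { return typeof(DTOs.User).FullName; }

        // Same simple name, but two unrelated types.
        public static bool AreSameType() { return typeof(Models.User) == typeof(DTOs.User); }
    }
}
```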

Accessing Properties: The Offset Mechanism

When you write user.Name, how does CLR know which bytes in the heap object belong to the Name property? Answer: field offset.

The EEClass holds an offset value for each field:

User Class EEClass (simplified):
  Field: Age      → offset: 8   (right after the MethodTable pointer)
  Field: Name     → offset: 16  (Age 4 bytes + padding 4 bytes = 8 bytes later)
  Total size      → 24 bytes    (8 MT + 4 Age + 4 pad + 8 Name)

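Why is there padding? CPUs access data faster when it is aligned. On a 64-bit system, 8-byte references are placed at addresses that are multiples of 8, which is why 4 bytes of padding sit between Age (4 bytes) and Name (8 bytes). Keep in mind that this layout is a simplified model: the real CLR also stores an object header in front of the MethodTable pointer and is free to reorder the fields of a class, so actual offsets can differ.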

CLR uses these offset values to translate the user.Name call into:

// Conceptually, user.Name is this (pseudocode, not valid C#):
string name = *(string*)((byte*)user + 16);  // read the reference at byte 16 of the object

The advantage of offset access: There's no need to maintain a separate property address map for each object. All User objects use the same offsets. Only the base address changes: object_address + Age_offset = Age's address.

Note: Since Age is a value type, its value is inside the heap object. But since Name is a reference type, only its address is inside the heap object. The actual string data is at a different heap address.
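
Field offsets of a class are not exposed by a public API, but the same padding effect can be observed on a sequential-layout struct through the Marshal class. This is only an analogy: PaddedRecord is a hypothetical struct, the long field stands in for the 8-byte reference, and the numbers reported are the marshaled layout, which is assumed here to match the article's picture on a typical 64-bit platform.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical struct: a 4-byte field followed by an 8-byte field.
[StructLayout(LayoutKind.Sequential)]
struct PaddedRecord
{
    public int Age;     // 4 bytes
    public long Ticks;  // 8 bytes, stands in for the 8-byte Name reference
}

static class OffsetDemo
{
    public static int AgeOffset()
    {
        return (int)Marshal.OffsetOf<PaddedRecord>(nameof(PaddedRecord.Age));
    }

    // The 8-byte field is pushed to the next multiple of 8,
    // leaving 4 bytes of padding after Age.
    public static int TicksOffset()
    {
        return (int)Marshal.OffsetOf<PaddedRecord>(nameof(PaddedRecord.Ticks));
    }

    public static int TotalSize()
    {
        return Marshal.SizeOf<PaddedRecord>();
    }
}
```

The struct has no MethodTable pointer in front, so its offsets start at 0 instead of 8; the 4-byte gap between the two fields is the part that carries over.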

Offset in Inheritance

class Person
{
    public int Id { get; set; }
}

class User : Person
{
    public string Name { get; set; }
    public int Age { get; set; }
}

The User object on the heap:

[MethodTable* → User]  (8 bytes, at the very start)
[Id: int]              (offset 8, field inherited from Person)
[Name: string ref]     (offset 16, 4 bytes Id + 4 padding = 8)
[Age: int]             (offset 24)
Total: 32 bytes

The critical point here: parent class fields always come before child class fields. This way, even when you do Person person = user;, person.Id is read from the same offset: the Person type knows Id is at byte 8, and at byte 8 of the user object, Id really is there.

Value Type vs Reference Type: Summary Table

| Property | Value Type (struct, int, bool) | Reference Type (class, string) |
|---|---|---|
| Where it lives | Where its owner lives (stack or inside heap) | Always on the heap |
| Assignment behavior | Value is copied | Reference (address) is copied |
| Can be null? | No (yes with int? nullable wrapper) | Yes |
| Default value | Zeroed state (0, false, default) | null |
| Inheritance | From System.ValueType, sealed | From Object |
| GC impact | Goes away automatically when owner is collected | Requires GC collection |
| How it sits in an object | Value is inside the object | Address is inside the object |
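
The "Assignment behavior" row is the one that bites most often in practice. A minimal sketch, with PointStruct, PointClass, and CopySemanticsDemo as hypothetical names:

```csharp
struct PointStruct { public int X; }
class PointClass { public int X; }

static class CopySemanticsDemo
{
    // Value type: 'b = a' copies the data, so changing b leaves a alone.
    public static int CopyStruct()
    {
        var a = new PointStruct { X = 1 };
        var b = a;
        b.X = 99;
        return a.X; // still 1
    }

    // Reference type: 'b = a' copies the address, so both names
    // refer to the same heap object.
    public static int CopyClass()
    {
        var a = new PointClass { X = 1 };
        var b = a;
        b.X = 99;
        return a.X; // now 99
    }
}
```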

The String Matter

String has a special place in this story. It's the type that confuses people the most.

string a = "hello";
string b = a;
b = "world";
// a is still "hello", right? Yes.

String is a reference type - it lives on the heap. But it's immutable. That's why the b = "world" assignment above doesn't affect a - it doesn't change the content of the string b points to; it assigns the address of a new string to b.

In fact, the heap structure of a string also starts with a MethodTable:

String object (heap):
  [MethodTable* → System.String]  (8 bytes)
  [Length: int = 5]               (4 bytes)
  [First Char: 'h']               (2 bytes, directly after Length)
  [Second Char: 'e']              (2 bytes)
  ...

Length and characters are inside the object, accessible at fixed offsets. But safe code cannot change the content: System.String exposes no member that modifies its characters, and every "modifying" method (Replace, ToUpper, Substring, ...) returns a new string instead.
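
A quick sketch of what immutability looks like from the caller's side. StringImmutabilityDemo is a hypothetical name; ToUpperInvariant is the real String method.

```csharp
using System;

static class StringImmutabilityDemo
{
    // Returns the original string, the transformed string, and whether
    // the two variables refer to the same object.
    public static (string original, string upper, bool sameObject) Transform()
    {
        string a = "hello";
        string upper = a.ToUpperInvariant(); // builds a new string, a is untouched
        return (a, upper, ReferenceEquals(a, upper));
    }
}
```

Transform() yields ("hello", "HELLO", false): the original is unchanged and the result lives in a separate object.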

Why Immutable?

  1. Thread safety. Multiple threads can read the same string without fear.
  2. Interning. Strings with the same content can safely share a single object in memory.
  3. Security. Just because you passed a string as a parameter doesn't mean the method can modify it.

But this immutability comes at a cost:

string result = "";
for (int i = 0; i < 10000; i++)
{
    result += "x";   // New string object every loop. 10,000 allocations.
}

This code creates 10,000 new string objects, and each one copies all the characters accumulated so far, polluting the heap. The right way:

// requires: using System.Text;
var sb = new StringBuilder();
for (int i = 0; i < 10000; i++)
{
    sb.Append("x");
}
string result = sb.ToString();  // One final string; StringBuilder itself only allocates a handful of internal buffers.
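
If you want to see the difference in numbers, GC.GetAllocatedBytesForCurrentThread() reports how many bytes the current thread has allocated so far. The sketch below wraps both loops in hypothetical helper methods (ConcatAllocationDemo, MeasureConcat, MeasureBuilder) and returns the bytes allocated by each; exact figures vary by runtime version, so none are quoted here.

```csharp
using System;
using System.Text;

static class ConcatAllocationDemo
{
    // Bytes allocated while building the string with += in a loop.
    public static long MeasureConcat(int count)
    {
        long before = GC.GetAllocatedBytesForCurrentThread();
        string result = "";
        for (int i = 0; i < count; i++)
        {
            result += "x"; // allocates a new, longer string every iteration
        }
        long after = GC.GetAllocatedBytesForCurrentThread();
        GC.KeepAlive(result);
        return after - before;
    }

    // Bytes allocated while building the same string with StringBuilder.
    public static long MeasureBuilder(int count)
    {
        long before = GC.GetAllocatedBytesForCurrentThread();
        var sb = new StringBuilder();
        for (int i = 0; i < count; i++)
        {
            sb.Append("x");
        }
        string result = sb.ToString();
        long after = GC.GetAllocatedBytesForCurrentThread();
        GC.KeepAlive(result);
        return after - before;
    }
}
```

Comparing MeasureConcat(10000) with MeasureBuilder(10000) should show the concatenation loop allocating far more, since the total size of all intermediate strings grows quadratically with the iteration count.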

Even though string is a reference type, in many ways it behaves like a value type. The == operator compares content (not address), and sharing a reference is safe because the content is immutable. But in memory, it's always on the heap - it can't live on the stack because its size is unknown at compile time and its lifetime can exceed method scope.
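
The difference between content equality and reference equality can be shown with two strings that hold the same characters but are separate objects. StringEqualityDemo is a hypothetical name for this sketch:

```csharp
using System;

static class StringEqualityDemo
{
    public static (bool contentEqual, bool sameObject) Compare()
    {
        string a = "dotnet";
        // Built from a char array at runtime, so it is a distinct heap object.
        string b = new string(new[] { 'd', 'o', 't', 'n', 'e', 't' });
        return (a == b, ReferenceEquals(a, b));
    }
}
```

Compare() returns (true, false): == looks at the characters, while ReferenceEquals looks at the addresses.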

String Interning

CLR keeps frequently used strings in a pool called the intern pool:

string a = "dotnet";
string b = "dotnet";
Console.WriteLine(Object.ReferenceEquals(a, b)); // True - same object.

Strings that are constant at compile time are automatically interned. You can also do it manually at runtime:

string c = new string(new char[] { 'd', 'o', 't', 'n', 'e', 't' });
string d = string.Intern(c);

So when a method returns a string at runtime, does interning happen automatically? Two scenarios:

Scenario A: Method returns a compile-time literal - interned.

string GetHelloWorld()
{
    return "Merhaba dunya";   // compile-time literal
}

string a = "Merhaba dunya";
string b = GetHelloWorld();

Object.ReferenceEquals(a, b); // True - both are the same intern object

The "Merhaba dunya" literal in GetHelloWorld()'s body is embedded in the assembly metadata. When code that uses the literal runs, CLR resolves it through the intern pool, so every use of the same literal yields the same object. If two literals are concatenated with + and the compiler can fold them into a single literal at build time, the result is still interned.

Scenario B: Method creates the string at runtime - not interned.

string GetHelloWorld()
{
    var merhaba = "Merhaba ";
    var dunya = "dunya";
    return merhaba + dunya;   // runtime concat, new object
}

string a = "Merhaba dunya";
string b = GetHelloWorld();

Object.ReferenceEquals(a, b); // False - b was created at runtime, not in the intern pool

string c = string.Intern(b);  // Manual intern
Object.ReferenceEquals(a, c); // True - now it's the same pool object

Concatenation with + using variables, StringBuilder.ToString(), Substring(), ToUpper(), reading from a file, JSON deserialization - all of these create new heap objects at runtime. Even if the content is the same, it doesn't automatically enter the intern pool. You need to call string.Intern().

In short, the issue is not the string's content, but how it was created. We'll look into this more later.

Interning prevents the same string from occupying heap space multiple times. But excessive use can bloat memory - because interned strings are effectively never collected by the GC (they typically stay alive until the runtime shuts down).
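
To check whether a given string is already in the pool, there is string.IsInterned, which returns the pooled reference or null. The sketch below builds a string at runtime from a fresh GUID (so it is assumed not to be in the pool yet), then interns it. InternPoolDemo and its method are hypothetical names.

```csharp
using System;

static class InternPoolDemo
{
    public static (bool pooledBefore, bool pooledAfter, bool poolReturnsSame) InternRuntimeString()
    {
        // Runtime-built content that no literal in the program matches.
        string runtime = "id-" + Guid.NewGuid().ToString("N");

        bool pooledBefore = string.IsInterned(runtime) != null;

        string pooled = string.Intern(runtime);

        bool pooledAfter = string.IsInterned(runtime) != null;
        // Later lookups with the same content return the pooled object.
        bool poolReturnsSame = ReferenceEquals(pooled, string.IsInterned(runtime));

        return (pooledBefore, pooledAfter, poolReturnsSame);
    }
}
```

Keep in mind that this example also demonstrates the downside: every call adds one more string to the pool, and it stays there.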

Summary: Stack, Heap, MethodTable Relationship

The complete journey of a new User() call from start to finish:

1. CLR looks at the User MethodTable → BaseSize: 24 bytes
2. Allocates 24 bytes from the GC heap
3. Writes the User MethodTable address to the first 8 bytes of allocated space
4. Zeroes the remaining space, calls the constructor
5. Assigns the heap address to the user variable on the stack
6. user.Name call: reads the string address at offset 16 from the user address

When you understand this mechanism, you start understanding most performance problems too. Every new on a reference type is a heap allocation. GC has an allocation budget in its Gen 0 region; creating a new object consumes this budget. When the budget is used up, GC is triggered and collects unused objects. So every allocation brings you one step closer to the next GC cycle. Every unnecessary allocation is a preventable cost.
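
You can watch the budget being consumed with GC.CollectionCount(0), which returns how many Gen 0 collections have happened so far. The following sketch allocates a batch of short-lived arrays and reports the counter before and after. Gen0BudgetDemo and its members are hypothetical names, and the static field exists only so the allocations cannot be optimized away.

```csharp
using System;

static class Gen0BudgetDemo
{
    // Keeps the most recent array reachable so each allocation really happens.
    private static byte[] _sink;

    public static (int before, int after) AllocateGarbage(int objectCount)
    {
        int before = GC.CollectionCount(0);
        for (int i = 0; i < objectCount; i++)
        {
            // Each iteration allocates about 1 KB; the previous array becomes garbage.
            _sink = new byte[1024];
        }
        int after = GC.CollectionCount(0);
        return (before, after);
    }
}
```

With a large enough objectCount (say 100,000, roughly 100 MB of garbage), the second number is normally higher than the first, although how many collections occur depends on the runtime's GC configuration.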

What's Next

We've covered what stack and heap are, what CLR does when creating objects, the role of MethodTable, and why string is special. But the real question is:

"How do we avoid heap allocations when working with strings?"

The answer: Span<T>, Memory<T>, stackalloc, and the IDisposable pattern. In the next part of this series, I'll cover these with code examples. We'll also look at GC's generation mechanism, why finalizers are dangerous, and how to catch memory leaks.
