Dynamic Memory Allocation: Basic Concepts
References
- Slides adapted from CMU
Dynamic Memory Allocation
- Programmers use dynamic memory allocators (such as malloc) to acquire virtual memory (VM) at run time.
  - For data structures whose size is only known at run time
- Dynamic memory allocators manage an area of process VM known as the heap.
Dynamic Memory Allocation
The allocator maintains the heap as a collection of variable-sized blocks, which are either allocated or free.
- Types of allocators
  - Explicit allocator: application allocates and frees space (for example, malloc and free in C)
  - Implicit allocator: application allocates, but does not free space (for example, new and garbage collection in Java)
This lecture: explicit memory allocation
The malloc Package
void *malloc(size_t size)
- Success: returns a pointer to a memory block of at least size bytes, aligned to a 16-byte boundary (on x86-64); if size == 0, the result is implementation-defined (either NULL or a unique pointer that may be passed to free)
- Unsuccessful: returns NULL and sets errno to ENOMEM
void free(void *p)
- Returns the block pointed at by p to the pool of available memory
- p must come from a previous call to malloc, calloc, or realloc
- Other functions:
- calloc: version of malloc that initializes the allocated block to zero
- realloc: changes the size of a previously allocated block
- sbrk: used internally by allocators to grow or shrink the heap
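As a rough illustration of how calloc and realloc fit together, the sketch below grows an array and zeroes the new tail. The helper name grow_zeroed is hypothetical and not part of the malloc package; it only uses the standard calls realloc and memset.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: grow an array of longs from old_n to new_n elements
 * and zero the newly added tail. Returns NULL on failure, in which case
 * the original block p is still allocated and unchanged. */
long *grow_zeroed(long *p, size_t old_n, size_t new_n) {
    assert(new_n > 0 && new_n >= old_n);
    /* Guard against overflow in new_n * sizeof(long). */
    if (new_n > (size_t)-1 / sizeof(long))
        return NULL;
    long *q = realloc(p, new_n * sizeof(long));
    if (q == NULL)
        return NULL;
    /* realloc preserves the old contents but does not initialize the rest. */
    memset(q + old_n, 0, (new_n - old_n) * sizeof(long));
    return q;
}
```

A caller would typically start from calloc(n, sizeof(long)), keep the old pointer until grow_zeroed succeeds, and pass the final pointer to free.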
malloc Example
#include <stdio.h>
#include <stdlib.h>

void foo(long n) {
    long i, *p;

    /* Allocate a block of n longs */
    p = (long *) malloc(n * sizeof(long));
    if (p == NULL) {
        perror("malloc");
        exit(EXIT_FAILURE);  /* report failure with a nonzero status */
    }

    /* Initialize allocated block */
    for (i = 0; i < n; i++)
        p[i] = i;

    /* Do something with p */
    ...

    /* Return allocated block to the heap */
    free(p);
}
Sample Implementation
- Code (location: CS:APP3e Code Examples)
  - File: mm.c
  - Manages fixed-size heap
  - Functions mm_malloc and mm_free
- Features
  - Based on words of 8 bytes each
  - Pointers returned by malloc are double-word aligned
  - Compile and run tests with command interpreter
Constraints
- Applications
  - Can issue arbitrary sequence of malloc and free requests
  - free request must be to a malloc’d block
- Explicit Allocators
  - Cannot control number or size of allocated blocks
  - Must respond immediately to malloc requests
  - Must allocate blocks from free memory
  - Must align blocks to satisfy alignment requirements
  - Can manipulate and modify only free memory
  - Cannot move the allocated blocks once they are malloc’d
Performance Goal: Throughput
Given some sequence of malloc and free requests:
\[R_{0}, R_{1}, \ldots, R_{k}, \ldots, R_{n-1}\]
- Goals: maximize throughput and peak memory utilization
  - These goals are often conflicting
- Throughput:
  - Number of completed requests per unit time
  - Example:
    - 5,000 malloc calls and 5,000 free calls in 10 seconds
    - Throughput is 1,000 operations per second
Performance Goal: Minimize Overhead
Given some sequence of malloc and free requests:
\[R_{0}, R_{1}, \ldots, R_{k}, \ldots, R_{n-1}\]
- Definition: aggregate payload \(P_{k}\)
  - malloc(p) results in a block with a payload of p bytes
  - After request \(R_{k}\) has completed, the aggregate payload \(P_{k}\) is the sum of currently allocated payloads
- Definition: current heap size \(H_{k}\)
  - Assume \(H_{k}\) is monotonically nondecreasing, that is, the heap only grows when the allocator uses sbrk
- Definition: overhead \(O_{k}\) after \(k+1\) requests
  - Heap space not used for program data, relative to the peak aggregate payload
  - \(O_{k} = H_{k} / (\max_{i \leq k} P_{i}) - 1\)
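The overhead formula can be made concrete with a small sketch. The function name overhead and its array-based interface are assumptions for illustration: payloads[i] stands for the aggregate payload \(P_{i}\) after request \(R_{i}\), and heap_size for \(H_{k}\).

```c
#include <assert.h>
#include <stddef.h>

/* Overhead O_k = H_k / max_{i<=k} P_i - 1.
 * payloads[i] is the aggregate payload P_i after request R_i (in bytes),
 * count is k + 1, and heap_size is the current heap size H_k (in bytes). */
double overhead(const size_t *payloads, size_t count, size_t heap_size) {
    assert(payloads != NULL && count > 0);
    size_t peak = 0;
    for (size_t i = 0; i < count; i++) {
        if (payloads[i] > peak)
            peak = payloads[i];
    }
    assert(peak > 0);  /* the ratio is undefined with no payload at all */
    return (double)heap_size / (double)peak - 1.0;
}
```

For example, aggregate payloads of 16, 64, and 32 bytes with a 96-byte heap give a peak of 64 and an overhead of 96/64 - 1 = 0.5.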
malloc Heap Visualization Example
Fragmentation
Fragmentation causes poor memory utilization
Internal fragmentation: for a given block, internal fragmentation occurs if the payload is smaller than the block size
- Caused by
  - overhead of maintaining heap data structures
  - padding for alignment purposes
  - explicit policy decisions (for example, to return a big block to satisfy a small request)
- Depends only on the pattern of previous requests
External fragmentation: occurs when there is enough aggregate heap memory, but no single free block is large enough
- Amount of external fragmentation depends on the pattern of future requests (difficult to measure)
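The definition of external fragmentation can be stated as a predicate over a snapshot of the free blocks. This is only a sketch: the function name externally_fragmented and the array of free block sizes are hypothetical, and a real allocator would walk its own free list instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Returns true when a request of `request` bytes would fail only because
 * of external fragmentation: the free blocks together are large enough,
 * but no single free block is. */
bool externally_fragmented(const size_t *free_sizes, size_t n, size_t request) {
    assert(free_sizes != NULL || n == 0);
    size_t total = 0, largest = 0;
    for (size_t i = 0; i < n; i++) {
        total += free_sizes[i];
        if (free_sizes[i] > largest)
            largest = free_sizes[i];
    }
    return total >= request && largest < request;
}
```

With free blocks of 32, 48, and 32 bytes, a 64-byte request is blocked by external fragmentation (112 bytes free in total, largest block 48), while a 48-byte request is not.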
Implementation Issues
How do we know how much memory to free given only a pointer?
How do we keep track of the free blocks?
What do we do with the extra space when allocating a structure that is smaller than the free block it is placed in?
How do we pick a block to use for allocation – many might fit?
How do we reuse a block that has been freed?
Knowing How Much to Free
- Standard method
  - Keep the length of a block (in bytes, including the header) in the word preceding the block
  - Requires an extra word for every allocated block
Keeping Track of Free Blocks
- Method 1: Implicit list using length; links all blocks
- Need to tag each block as allocated/free
- Method 2: Explicit list among the free blocks using pointers
- Need space for pointers
- Method 3: Segregated free list
- Different free lists for different size classes
- Method 4: Blocks sorted by size
- Can use a balanced tree with pointers within each free block, and the length used as a key
Method 1: Implicit Free List
- For each block we need both size and allocation status
- Could store this information in two words (wasteful)
- Standard trick
- When blocks are aligned, some low-order address bits are always zero
- Instead of storing the always zero bit, use it as an allocated/free flag
- When reading the size word, the bit must be masked out
Detailed Implicit Free List Example
- Allocated blocks: shaded
- Free blocks: unshaded
- Headers: labeled with “size in words/allocated bit”
- Headers are at non-aligned positions
- Payloads are aligned
Implicit List: Data Structures
Block declaration
typedef uint64_t word_t;  // requires <stdint.h>

typedef struct block {
    word_t header;
    unsigned char payload[0];  // zero-length array
} block_t;
Getting payload from block pointer
return (void *) (block->payload);
Getting header from payload
return (void *) ((unsigned char *) bp - offsetof(block_t, payload));
Implicit List: Header access
Getting allocated bit from header
return header & 0x1;
Getting size from header
return header & ~0xfL;
Initializing header
block->header = size | alloc;
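The three header fragments above can be wrapped into standalone helpers. This is a minimal sketch that works on raw header words rather than on block_t; the names pack, extract_size, and extract_alloc are illustrative, and block sizes are assumed to be multiples of 16.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t word_t;

/* Combine a block size and an allocated flag into one header word. */
word_t pack(size_t size, bool alloc) {
    assert((size & 0xF) == 0);  /* low 4 bits must be free for flags */
    return (word_t)size | (word_t)alloc;
}

/* Mask out the low 4 bits to recover the size. */
size_t extract_size(word_t header) {
    return (size_t)(header & ~(word_t)0xF);
}

/* The lowest bit is the allocated/free flag. */
bool extract_alloc(word_t header) {
    return (header & 0x1) != 0;
}
```

A 48-byte allocated block is stored as the word 49 (0x31); masking recovers 48 and the flag separately.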
Implicit List: Traversing the List
Find next block
static block_t *find_next(block_t *block) {
    return (block_t *) ((unsigned char *) block + get_size(block));
}
Implicit List: Finding a Free Block
- Search list from beginning and choose first free block that fits (including space for the header)
static block_t *find_fit(size_t asize) {
    block_t *block;
    for (block = heap_start; block != heap_end;
         block = find_next(block))
    {
        if (!(get_alloc(block)) && (asize <= get_size(block)))
            return block;
    }
    return NULL;  // No fit found
}
Implicit List: Finding a Free Block
- First fit:
- Search list from the beginning and choose the first free block that fits
- Can take linear time in total number of blocks (allocated and free)
- In practice it can cause “splinters” at the beginning of the list
- Next fit:
- Like first fit, but search the list starting where the previous search finished
- Should often be faster than first fit since it avoids re-scanning unhelpful blocks
- Some research suggests that fragmentation is worse
- Best fit:
- Search the list and choose the best free block: fits with the fewest bytes left over
- Keeps fragments small; usually improves memory utilization
- Will typically run slower than first fit
- Still a greedy algorithm; no guarantee of optimality
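For comparison with the first-fit search, here is a sketch of best fit over a toy implicit list. The layout is an assumption of this example, not the mm.c layout: the heap is an array of 8-byte words, each block starts with a header word (size in bytes plus allocated bit), and a header of size 0 terminates the list.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t word_t;

#define ALLOC_BIT ((word_t)0x1)
#define SIZE_MASK (~(word_t)0xF)

/* Best fit: scan every block and remember the smallest free block that
 * is still large enough. Returns the word index of the chosen header,
 * or -1 if no free block fits. */
long best_fit(const word_t *heap, size_t asize) {
    assert(heap != NULL && asize > 0);
    long best = -1;
    size_t best_size = 0;
    size_t i = 0;
    while ((heap[i] & SIZE_MASK) != 0) {
        size_t size = (size_t)(heap[i] & SIZE_MASK);
        int is_free = (heap[i] & ALLOC_BIT) == 0;
        if (is_free && size >= asize && (best < 0 || size < best_size)) {
            best = (long)i;
            best_size = size;
            if (size == asize)
                break;  /* an exact fit cannot be improved on */
        }
        i += size / sizeof(word_t);  /* step to the next header */
    }
    return best;
}
```

On a heap with a free 64-byte block followed later by a free 32-byte block, a 32-byte request selects the second block, whereas first fit would take the 64-byte block and leave a remainder.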
Implicit List: Allocating in Free Block
- Allocating in a free block: splitting
- Since allocated space might be smaller than free space, we might want to split the block
Implicit List: Splitting Free Block
// Warning: This code is incomplete
static void split_block(block_t *block, size_t asize) {
    size_t block_size = get_size(block);

    if ((block_size - asize) >= min_block_size) {
        write_header(block, asize, true);
        block_t *block_next = find_next(block);
        write_header(block_next, block_size - asize, false);
    }
}
Implicit List: Freeing a Block
- Simplest implementation:
- Need to clear the “allocated” flag
- But, can lead to “false fragmentation”
Implicit List: Coalescing
Join (coalesce) with next/previous blocks, if they are free
- Coalesce with next block
- Simple because of forward search
- How do we coalesce with previous block?
- How do we know where it starts?
- How can we determine whether it is allocated?
Implicit List: Bidirectional Coalescing
- Boundary tags
- Replicate size/allocated word at “bottom” (end) of free blocks
- Allows us to traverse the “list” backwards, but requires extra space
- Important and general technique
Implementation with Footers
Locating footer of current block
const size_t dsize = 2 * sizeof(word_t);

static word_t *header_to_footer(block_t *block) {
    size_t asize = get_size(block);
    return (word_t *) (block->payload + asize - dsize);
}
Locating footer of previous block
static word_t *find_prev_footer(block_t *block) {
    // The previous block's footer is the word just before this header
    return &(block->header) - 1;
}
Splitting Free Block: Full Version
static void split_block(block_t *block, size_t asize) {
    size_t block_size = get_size(block);

    if ((block_size - asize) >= min_block_size) {
        write_header(block, asize, true);
        write_footer(block, asize, true);
        block_t *block_next = find_next(block);
        write_header(block_next, block_size - asize, false);
        write_footer(block_next, block_size - asize, false);
    }
}
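Constant Time Coalescing (Case 1)
- Previous and next blocks both allocated: no merging, only the freed block's tags change
Constant Time Coalescing (Case 2)
- Previous block allocated, next block free: merge with the next block
Constant Time Coalescing (Case 3)
- Previous block free, next block allocated: merge with the previous block
Constant Time Coalescing (Case 4)
- Previous and next blocks both free: merge all three blocks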
Heap Structure
- Dummy footer before first header
- Marked as allocated
- Prevents accidental coalescing when freeing first block
- Dummy header after last footer
- Prevents accidental coalescing when freeing final block
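Constant-time coalescing with boundary tags and the two dummy words can be sketched on a toy heap. Everything here is an assumption for illustration rather than the mm.c code: the heap is an array of 8-byte words, heap[0] is the dummy footer, the word after the last block is the dummy header, and blocks are addressed by the index of their header word.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t word_t;

#define WSIZE sizeof(word_t)

size_t blk_size(word_t w) { return (size_t)(w & ~(word_t)0xF); }
bool blk_alloc(word_t w) { return (w & 0x1) != 0; }

/* Write matching header and footer for the block whose header is heap[i]. */
void write_block(word_t *heap, size_t i, size_t size, bool alloc) {
    assert(size >= 2 * WSIZE && size % 16 == 0);
    word_t w = (word_t)size | (word_t)alloc;
    heap[i] = w;                     /* header */
    heap[i + size / WSIZE - 1] = w;  /* footer */
}

/* Free the block at header index i and merge it with any free neighbor.
 * Returns the header index of the resulting free block. */
size_t free_and_coalesce(word_t *heap, size_t i) {
    assert(i >= 1);  /* heap[0] is the dummy footer */
    size_t size = blk_size(heap[i]);
    word_t prev_footer = heap[i - 1];
    word_t next_header = heap[i + size / WSIZE];

    if (!blk_alloc(next_header))     /* next block is free: absorb it */
        size += blk_size(next_header);
    if (!blk_alloc(prev_footer)) {   /* previous block is free: absorb it */
        size_t prev_size = blk_size(prev_footer);
        size += prev_size;
        i -= prev_size / WSIZE;      /* result starts at the previous header */
    }
    write_block(heap, i, size, false);
    return i;
}
```

Because the dummy footer and dummy header are marked allocated, the first and last blocks need no special handling: the neighbor checks simply fail at the heap boundaries. Each call reads two neighboring words and writes two tags, independent of heap size.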
Top-Level Malloc Code
const size_t dsize = 2*sizeof(word_t);

void *mm_malloc(size_t size)
{
    size_t asize = round_up(size + dsize, dsize);

    block_t *block = find_fit(asize);
    if (block == NULL)
        return NULL;

    size_t block_size = get_size(block);
    write_header(block, block_size, true);
    write_footer(block, block_size, true);

    split_block(block, asize);

    return header_to_payload(block);
}
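mm_malloc relies on a round_up helper that the slides do not show. A plausible sketch follows; the body of round_up and the wrapper adjusted_size are assumptions for illustration, with dsize taken as 16 bytes (two 8-byte words, one for the header and one for the footer).

```c
#include <assert.h>
#include <stddef.h>

/* Round size up to the nearest multiple of n. */
size_t round_up(size_t size, size_t n) {
    assert(n > 0);
    return n * ((size + n - 1) / n);
}

/* Hypothetical wrapper: block size needed for a payload of `size` bytes,
 * adding room for header and footer and keeping 16-byte alignment. */
size_t adjusted_size(size_t size) {
    const size_t dsize = 16;  /* 2 * sizeof(word_t) with 8-byte words */
    return round_up(size + dsize, dsize);
}
```

Under these assumptions a 1-byte request needs a 32-byte block and a 24-byte request needs a 48-byte block, which shows how header, footer, and padding contribute to internal fragmentation.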
Top-Level Free Code
void mm_free(void *bp)
{
    block_t *block = payload_to_header(bp);
    size_t size = get_size(block);

    write_header(block, size, false);
    write_footer(block, size, false);

    coalesce_block(block);
}
Disadvantages of Boundary Tags
Internal fragmentation
- Can it be optimized?
- Which blocks need the footer tag?
- What does that mean?
No Boundary Tag for Allocated Blocks
Boundary tag needed only for free blocks
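When sizes are multiples of 16, the header has 4 spare low-order bits
Header: use 2 of those bits (address bits that are always zero due to alignment) to record whether the previous and current blocks are allocated: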
(prev_block) << 1 | (curr_block)
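A header that also records the previous block's status might be packed as follows. This is a sketch under the stated encoding (bit 1 for the previous block, bit 0 for the current block); the function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t word_t;

/* Header layout: size | (prev_alloc << 1) | curr_alloc. */
word_t pack_header(size_t size, bool prev_alloc, bool curr_alloc) {
    assert((size & 0xF) == 0);  /* sizes are multiples of 16 */
    return (word_t)size | ((word_t)prev_alloc << 1) | (word_t)curr_alloc;
}

bool header_prev_alloc(word_t h) { return (h & 0x2) != 0; }
bool header_curr_alloc(word_t h) { return (h & 0x1) != 0; }
size_t header_size(word_t h) { return (size_t)(h & ~(word_t)0xF); }
```

When freeing a block, the allocator checks header_prev_alloc first and reads the previous footer only if that bit is clear, so allocated blocks never need a footer. The cost is that the next block's header must be updated whenever a block changes state.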
Summary of Key Allocator Policies
- Placement policy:
- First-fit, next-fit, best-fit, etc.
- Trades off lower throughput for less fragmentation
- Interesting observation: segregated free lists (next lecture) approximate a best fit placement policy without having to search entire free list
- Splitting policy:
- When do we go ahead and split free blocks?
- How much internal fragmentation are we willing to tolerate?
- Coalescing policy:
- Immediate coalescing: coalesce each time free is called
- Deferred coalescing: try to improve performance of free by deferring coalescing until needed
Implicit Lists: Summary
Implementation: very simple
Allocate cost: linear time worst case
Free cost: constant time worst case (even with coalescing)
Memory overhead: depends on placement policy
Not used in practice for malloc/free because of linear time allocation
The concepts of splitting and boundary tag coalescing are general to all allocators