Algorithm Complexity Analysis
Disclaimer: These are my personal notes compiled for my own reference and learning. They may contain errors, incomplete information, or personal interpretations. While I strive for accuracy, these notes are not peer-reviewed and should not be considered authoritative sources. Please consult official textbooks, research papers, or other reliable sources for academic or professional purposes.
1. Introduction to Algorithm Complexity
Algorithm complexity analysis helps us understand how an algorithm's resource requirements (time and space) scale with input size. This is crucial for designing efficient algorithms and choosing appropriate solutions for different problem sizes.
2. Big O Notation
Big O notation describes an asymptotic upper bound on an algorithm's growth rate.
2.1 Formal Definition
A function $f(n)$ is $O(g(n))$ if there exist positive constants $c$ and $n_0$ such that $0 \le f(n) \le c \cdot g(n)$ for all $n \ge n_0$.
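For example, $f(n) = 3n^2 + 10n$ is $O(n^2)$: taking $c = 4$ and $n_0 = 10$, we have $3n^2 + 10n \le 4n^2$ for all $n \ge 10$.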
2.2 Common Complexity Classes
- $O(1)$ - Constant: Array access, hash table lookup (average case)
- $O(\log n)$ - Logarithmic: Binary search, balanced tree operations
- $O(n)$ - Linear: Linear search, single array traversal
- $O(n \log n)$ - Linearithmic: Efficient sorting (merge sort, heap sort)
- $O(n^2)$ - Quadratic: Bubble sort, selection sort, nested loops
- $O(n^3)$ - Cubic: Matrix multiplication (naive algorithm)
- $O(2^n)$ - Exponential: Recursive Fibonacci, subset enumeration
- $O(n!)$ - Factorial: Traveling salesman (brute force)
3. Other Asymptotic Notations
3.1 Big Omega (Ω) - Lower Bound
$f(n)$ is $\Omega(g(n))$ if there exist positive constants $c$ and $n_0$ such that $f(n) \ge c \cdot g(n)$ for all $n \ge n_0$.
3.2 Big Theta (Θ) - Tight Bound
$f(n)$ is $\Theta(g(n))$ if $f(n)$ is both $O(g(n))$ and $\Omega(g(n))$; equivalently, there exist positive constants $c_1$, $c_2$, and $n_0$ such that $c_1 \cdot g(n) \le f(n) \le c_2 \cdot g(n)$ for all $n \ge n_0$.
3.3 Little o and Little ω
Little o: $f(n) = o(g(n))$ means $\lim_{n \to \infty} \frac{f(n)}{g(n)} = 0$
Little ω: $f(n) = \omega(g(n))$ means $\lim_{n \to \infty} \frac{f(n)}{g(n)} = \infty$
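For example, $n = o(n^2)$ and $n^2 = \omega(n)$, while $2n \ne o(n)$ because the ratio tends to $2$, not $0$.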
4. Time Complexity Analysis
4.1 Basic Rules
- Sequential statements: Add complexities
- Nested loops: Multiply complexities
- Conditional statements: Take the maximum
- Recursive calls: Use recurrence relations
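A minimal sketch applying the first three rules (the function itself is illustrative, not from these notes):

def analyze_example(arr):
    n = len(arr)
    # Sequential statements: complexities add
    total = sum(arr)                  # O(n)
    # Nested loops: complexities multiply
    pair_count = 0
    for i in range(n):                # O(n) iterations...
        for j in range(n):            # ...each doing O(n) work: O(n²) total
            if arr[i] + arr[j] == total:
                pair_count += 1
    # Conditional: overall cost is the max of the branches
    if pair_count > 0:
        arr.sort()                    # O(n log n) branch dominates the O(1) branch
    # Overall: O(n) + O(n²) + O(n log n) = O(n²)
    return pair_count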
4.2 Loop Analysis
# O(n) - Single loop
def linear_search(arr, target):
    for i in range(len(arr)):  # n iterations
        if arr[i] == target:
            return i
    return -1

# O(n²) - Nested loops
def bubble_sort(arr):
    n = len(arr)
    for i in range(n):          # n iterations
        for j in range(n - 1):  # n-1 iterations each
            if arr[j] > arr[j + 1]:
                arr[j], arr[j + 1] = arr[j + 1], arr[j]

# O(log n) - Halving the search space
def binary_search(arr, target):
    left, right = 0, len(arr) - 1
    while left <= right:  # log n iterations
        mid = (left + right) // 2
        if arr[mid] == target:
            return mid
        elif arr[mid] < target:
            left = mid + 1
        else:
            right = mid - 1
    return -1

# O(n log n) - Divide and conquer
def merge_sort(arr):
    if len(arr) <= 1:
        return arr
    mid = len(arr) // 2
    left = merge_sort(arr[:mid])   # T(n/2)
    right = merge_sort(arr[mid:])  # T(n/2)
    return merge(left, right)      # O(n)
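The merge helper isn't defined in these notes; a minimal sketch consistent with the call above:

def merge(left, right):
    # Merge two sorted lists into one sorted list in O(len(left) + len(right))
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged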
5. Recurrence Relations
5.1 Master Theorem
For recurrences of the form $T(n) = aT(n/b) + f(n)$ where $a \geq 1$ and $b > 1$:
- Case 1: If $f(n) = O(n^{\log_b a - \epsilon})$ for some $\epsilon > 0$, then $T(n) = \Theta(n^{\log_b a})$
- Case 2: If $f(n) = \Theta(n^{\log_b a})$, then $T(n) = \Theta(n^{\log_b a} \log n)$
- Case 3: If $f(n) = \Omega(n^{\log_b a + \epsilon})$ for some $\epsilon > 0$ and regularity condition, then $T(n) = \Theta(f(n))$
5.2 Common Recurrence Examples
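Some frequently-seen recurrences and their solutions:
- $T(n) = T(n/2) + O(1) \Rightarrow \Theta(\log n)$ (binary search; Case 2)
- $T(n) = 2T(n/2) + O(1) \Rightarrow \Theta(n)$ (tree traversal; Case 1)
- $T(n) = 2T(n/2) + O(n) \Rightarrow \Theta(n \log n)$ (merge sort; Case 2)
- $T(n) = T(n-1) + O(1) \Rightarrow \Theta(n)$ (linear recursion; solved by unrolling, not the Master Theorem)
- $T(n) = T(n-1) + T(n-2) + O(1) \Rightarrow \Theta(\phi^n)$ where $\phi \approx 1.618$ (naive recursive Fibonacci)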
6. Space Complexity
6.1 Types of Space Complexity
- Auxiliary Space: Extra space used by the algorithm, not counting the input
- Total Space: Auxiliary space plus input space
6.2 Space Complexity Examples
# O(1) auxiliary space - iterative approach
def factorial_iterative(n):
    result = 1
    for i in range(1, n + 1):
        result *= i
    return result

# O(n) auxiliary space - recursive approach
def factorial_recursive(n):
    if n <= 1:
        return 1
    return n * factorial_recursive(n - 1)  # n frames on the call stack

# O(n) auxiliary space - explicit array
def fibonacci_dp(n):
    if n <= 1:
        return n
    dp = [0] * (n + 1)  # O(n) space
    dp[1] = 1
    for i in range(2, n + 1):
        dp[i] = dp[i - 1] + dp[i - 2]
    return dp[n]

# O(1) auxiliary space - optimized DP
def fibonacci_optimized(n):
    if n <= 1:
        return n
    a, b = 0, 1
    for _ in range(2, n + 1):
        a, b = b, a + b
    return b
7. Amortized Analysis
7.1 Aggregate Method
Analyze the total cost of a sequence of $n$ operations and divide by $n$.
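For example, $n$ appends into a doubling dynamic array cost at most $n + (1 + 2 + 4 + \cdots + n) < 3n$ in total, so each append is $O(1)$ amortized.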
7.2 Accounting Method
Assign credits to operations; expensive operations use credits from cheaper ones.
7.3 Potential Method
Define a potential function and analyze amortized cost using potential differences.
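Concretely, the amortized cost of the $i$-th operation is $\hat{c}_i = c_i + \Phi(D_i) - \Phi(D_{i-1})$, where $c_i$ is its actual cost and $D_i$ is the structure's state afterward; as long as $\Phi(D_n) \ge \Phi(D_0)$, the total amortized cost upper-bounds the total actual cost.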
7.4 Dynamic Array Example
class DynamicArray:
    def __init__(self):
        self.capacity = 1
        self.size = 0
        self.data = [None] * self.capacity

    def append(self, item):
        if self.size == self.capacity:
            self._resize()  # O(n) occasionally
        self.data[self.size] = item
        self.size += 1

    def _resize(self):
        self.capacity *= 2
        new_data = [None] * self.capacity
        for i in range(self.size):  # Copy O(n) elements
            new_data[i] = self.data[i]
        self.data = new_data

# Amortized analysis:
# - Most appends: O(1)
# - Resize operations: O(n) but infrequent
# - Total cost for n operations: O(n)
# - Amortized cost per operation: O(1)
8. Best, Average, and Worst Case Analysis
8.1 Quick Sort Analysis
def quicksort(arr, low=0, high=None):
    if high is None:
        high = len(arr) - 1
    if low < high:
        pivot_index = partition(arr, low, high)
        quicksort(arr, low, pivot_index - 1)
        quicksort(arr, pivot_index + 1, high)

def partition(arr, low, high):
    pivot = arr[high]  # Lomuto scheme: last element as pivot
    i = low - 1
    for j in range(low, high):
        if arr[j] <= pivot:
            i += 1
            arr[i], arr[j] = arr[j], arr[i]
    arr[i + 1], arr[high] = arr[high], arr[i + 1]
    return i + 1

# Time Complexity Analysis:
# Best Case: O(n log n) - pivot always splits the array roughly in half
# Average Case: O(n log n) - expected over random input orderings, or with randomized pivot selection
# Worst Case: O(n²) - pivot is always the smallest/largest element
#   (with this last-element pivot, already-sorted input triggers the worst case)
9. Algorithm Design Paradigms
9.1 Divide and Conquer
- Recurrence: $T(n) = aT(n/b) + f(n)$
- Examples: Merge sort, binary search, Strassen's matrix multiplication
9.2 Dynamic Programming
- Strategy: Store solutions to overlapping subproblems to avoid recomputation (a time-space tradeoff)
- Examples: Fibonacci, longest common subsequence, knapsack
9.3 Greedy Algorithms
- Strategy: Make locally optimal choices
- Examples: Dijkstra's algorithm, Huffman coding
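As a small illustration of the greedy pattern, a sketch of interval scheduling (the function name and intervals are illustrative):

def max_non_overlapping(intervals):
    # Greedy: repeatedly pick the compatible interval that finishes earliest.
    # Sorting dominates: O(n log n) time.
    intervals.sort(key=lambda iv: iv[1])  # Sort by finish time
    count = 0
    last_finish = float('-inf')
    for start, finish in intervals:
        if start >= last_finish:  # The locally optimal choice is globally optimal here
            count += 1
            last_finish = finish
    return count

# max_non_overlapping([(1, 3), (2, 4), (3, 5)]) -> 2  (pick (1,3) and (3,5))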
10. Advanced Complexity Topics
10.1 Complexity Classes
- P: Problems solvable in polynomial time
- NP: Problems whose solutions are verifiable in polynomial time
- NP-Complete: The hardest problems in NP; every problem in NP reduces to them in polynomial time
- NP-Hard: At least as hard as every problem in NP; not necessarily in NP themselves
10.2 Reduction and Approximation
# Example: Traveling Salesman Problem (TSP)
# Exact brute-force solution: O(n!) - factorial time
# Approximation algorithms can provide solutions within a provable factor of optimal
# (for metric instances; general TSP admits no constant-factor approximation unless P = NP)
def tsp_nearest_neighbor(distances):
    """
    Nearest neighbor heuristic for TSP.
    Time: O(n²). No constant-factor approximation guarantee in general.
    """
    n = len(distances)
    if n == 0:
        return 0, []
    visited = [False] * n
    path = [0]  # Start from city 0
    visited[0] = True
    total_distance = 0
    current = 0
    for _ in range(n - 1):
        nearest_city = -1
        nearest_distance = float('inf')
        for city in range(n):
            if not visited[city] and distances[current][city] < nearest_distance:
                nearest_distance = distances[current][city]
                nearest_city = city
        visited[nearest_city] = True
        path.append(nearest_city)
        total_distance += nearest_distance
        current = nearest_city
    # Return to starting city
    total_distance += distances[current][0]
    path.append(0)
    return total_distance, path
# Dynamic programming (Held-Karp) solution for TSP
def tsp_dp(distances):
    """
    Exact TSP solution using dynamic programming.
    Time: O(n² * 2^n), Space: O(n * 2^n)
    """
    n = len(distances)
    if n == 0:
        return 0, []
    if n == 1:
        return 0, [0]
    # dp[mask, i] = minimum cost to visit all cities in mask, ending at city i
    dp = {}
    parent = {}
    # Base case: go from city 0 directly to city i
    for i in range(1, n):
        dp[(1 << i) | 1, i] = distances[0][i]
    # Fill DP table in order of increasing mask
    for mask in range(1, 1 << n):
        if not (mask & 1):  # Every partial tour must include the starting city
            continue
        for u in range(n):
            if not (mask & (1 << u)):
                continue
            prev_mask = mask ^ (1 << u)
            if prev_mask == 0:
                continue
            min_cost = float('inf')
            best_prev = -1
            for v in range(n):
                if (prev_mask & (1 << v)) and (prev_mask, v) in dp:
                    cost = dp[prev_mask, v] + distances[v][u]
                    if cost < min_cost:
                        min_cost = cost
                        best_prev = v
            if min_cost != float('inf'):
                dp[mask, u] = min_cost
                parent[mask, u] = best_prev
    # Find the cheapest way to visit all cities and return to the start
    full_mask = (1 << n) - 1
    min_cost = float('inf')
    best_end = -1
    for i in range(1, n):
        if (full_mask, i) in dp:
            cost = dp[full_mask, i] + distances[i][0]
            if cost < min_cost:
                min_cost = cost
                best_end = i
    # Reconstruct the path by following parent pointers backward
    path = []
    mask = full_mask
    current = best_end
    while current != -1:
        path.append(current)
        if (mask, current) in parent:
            prev = parent[mask, current]
            mask ^= (1 << current)
            current = prev
        else:
            break
    path.reverse()
    path = [0] + path + [0]  # Tour starts and ends at city 0
    return min_cost, path
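A quick sanity check on a small hypothetical 4-city distance matrix (the values are made up):

D = [
    [0, 10, 15, 20],
    [10, 0, 35, 25],
    [15, 35, 0, 30],
    [20, 25, 30, 0],
]
print(tsp_nearest_neighbor(D))  # Heuristic tour; may be suboptimal in general
print(tsp_dp(D))                # Optimal tour: cost 80, e.g. 0 -> 2 -> 3 -> 1 -> 0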
11. Practical Complexity Analysis
11.1 Profiling and Benchmarking
import time
import random
import matplotlib.pyplot as plt

def benchmark_sorting_algorithms():
    """Compare different sorting algorithms empirically"""
    # selection_sort and insertion_sort are assumed to be defined
    # alongside the earlier examples in these notes
    algorithms = {
        'Bubble Sort': bubble_sort,
        'Selection Sort': selection_sort,
        'Insertion Sort': insertion_sort,
        'Merge Sort': merge_sort,
        'Quick Sort': quicksort,
        'Python Built-in': sorted,
    }
    sizes = [100, 500, 1000, 2000, 5000]
    results = {name: [] for name in algorithms}
    for size in sizes:
        print(f"Testing size {size}...")
        data = [random.randint(1, 1000) for _ in range(size)]
        for name, algorithm in algorithms.items():
            test_data = data.copy()
            start_time = time.perf_counter()  # perf_counter is better suited to timing than time.time
            algorithm(test_data)
            end_time = time.perf_counter()
            results[name].append(end_time - start_time)
    # Plot results on a log scale to compare growth rates
    plt.figure(figsize=(12, 8))
    for name, times in results.items():
        plt.plot(sizes, times, marker='o', label=name)
    plt.xlabel('Input Size')
    plt.ylabel('Time (seconds)')
    plt.title('Sorting Algorithm Performance Comparison')
    plt.legend()
    plt.yscale('log')
    plt.grid(True)
    plt.show()
    return results
# Memory profiling
import tracemalloc

def profile_memory(func, *args, **kwargs):
    """Profile memory usage of a function"""
    tracemalloc.start()
    result = func(*args, **kwargs)
    current, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    print(f"Current memory usage: {current / 1024 / 1024:.2f} MB")
    print(f"Peak memory usage: {peak / 1024 / 1024:.2f} MB")
    return result
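For example, comparing the two Fibonacci implementations from Section 6.2 (the choice of n here is arbitrary):

# fibonacci_dp allocates an O(n) table, so its peak should be noticeably higher
profile_memory(fibonacci_dp, 100000)
profile_memory(fibonacci_optimized, 100000)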
12. Common Algorithmic Patterns
12.1 Two Pointers Technique
# O(n) solution for two sum in sorted array
def two_sum_sorted(nums, target):
    left, right = 0, len(nums) - 1
    while left < right:
        current_sum = nums[left] + nums[right]
        if current_sum == target:
            return [left, right]
        elif current_sum < target:
            left += 1
        else:
            right -= 1
    return None
12.2 Sliding Window
# O(n) solution for maximum sum subarray of size k
def max_sum_subarray(nums, k):
    if len(nums) < k:
        return None
    # Calculate sum of first window
    window_sum = sum(nums[:k])
    max_sum = window_sum
    # Slide the window: drop the leftmost element, add the new rightmost one
    for i in range(k, len(nums)):
        window_sum = window_sum - nums[i - k] + nums[i]
        max_sum = max(max_sum, window_sum)
    return max_sum
13. References
- Cormen, T. H., Leiserson, C. E., Rivest, R. L., & Stein, C. (2009). Introduction to Algorithms.
- Sedgewick, R., & Wayne, K. (2011). Algorithms.
- Kleinberg, J., & Tardos, E. (2005). Algorithm Design.