Algorithm Complexity Analysis
Disclaimer: These are my personal notes compiled for my own reference and learning. They may contain errors, incomplete information, or personal interpretations. While I strive for accuracy, these notes are not peer-reviewed and should not be considered authoritative sources. Please consult official textbooks, research papers, or other reliable sources for academic or professional purposes.
1. Introduction to Algorithm Complexity
Algorithm complexity analysis helps us understand how an algorithm's resource requirements (time and space) scale with input size. This is crucial for designing efficient algorithms and choosing appropriate solutions for different problem sizes.
2. Big O Notation
Big O notation describes an asymptotic upper bound on an algorithm's growth rate.
2.1 Formal Definition
A function $f(n)$ is $O(g(n))$ if there exist positive constants $c$ and $n_0$ such that $0 \le f(n) \le c \cdot g(n)$ for all $n \ge n_0$.
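For example, $f(n) = 3n^2 + 10n$ is $O(n^2)$: taking $c = 4$ and $n_0 = 10$, we have $3n^2 + 10n \le 4n^2$ for all $n \ge 10$.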
2.2 Common Complexity Classes
- $O(1)$ - Constant: Array access, hash table lookup (average case)
- $O(\log n)$ - Logarithmic: Binary search, balanced tree operations
- $O(n)$ - Linear: Linear search, single array traversal
- $O(n \log n)$ - Linearithmic: Efficient sorting (merge sort, heap sort)
- $O(n^2)$ - Quadratic: Bubble sort, selection sort, nested loops
- $O(n^3)$ - Cubic: Matrix multiplication (naive algorithm)
- $O(2^n)$ - Exponential: Recursive Fibonacci, subset enumeration
- $O(n!)$ - Factorial: Traveling salesman (brute force)
3. Other Asymptotic Notations
3.1 Big Omega (Ω) - Lower Bound
$f(n)$ is $\Omega(g(n))$ if there exist positive constants $c$ and $n_0$ such that $f(n) \ge c \cdot g(n)$ for all $n \ge n_0$.
3.2 Big Theta (Θ) - Tight Bound
$f(n)$ is $\Theta(g(n))$ if $f(n)$ is both $O(g(n))$ and $\Omega(g(n))$; equivalently, there exist positive constants $c_1$, $c_2$, and $n_0$ such that $c_1 \cdot g(n) \le f(n) \le c_2 \cdot g(n)$ for all $n \ge n_0$.
3.3 Little o and Little ω
Little o: $f(n) = o(g(n))$ means $\lim_{n \to \infty} \frac{f(n)}{g(n)} = 0$
Little ω: $f(n) = \omega(g(n))$ means $\lim_{n \to \infty} \frac{f(n)}{g(n)} = \infty$
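For example, $n = o(n^2)$ and $n^2 = \omega(n)$, while $2n \ne o(n)$ because the ratio tends to $2$, not $0$.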
4. Time Complexity Analysis
4.1 Basic Rules
- Sequential statements: Add complexities
- Nested loops: Multiply complexities
- Conditional statements: Take the maximum
- Recursive calls: Use recurrence relations
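A minimal sketch applying the first three rules (the function itself is illustrative, not from these notes):

def analyze_example(arr):
    n = len(arr)
    # Sequential statements: complexities add
    total = sum(arr)                  # O(n)
    # Nested loops: complexities multiply
    pair_count = 0
    for i in range(n):                # O(n) iterations...
        for j in range(n):            # ...each doing O(n) work: O(n²) total
            if arr[i] + arr[j] == total:
                pair_count += 1
    # Conditional: overall cost is the max of the branches
    if pair_count > 0:
        arr.sort()                    # O(n log n) branch dominates the O(1) branch
    # Overall: O(n) + O(n²) + O(n log n) = O(n²)
    return pair_count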
4.2 Loop Analysis
# O(n) - Single loop
def linear_search(arr, target):
    for i in range(len(arr)):  # n iterations
        if arr[i] == target:
            return i
    return -1

# O(n²) - Nested loops
def bubble_sort(arr):
    n = len(arr)
    for i in range(n):          # n iterations
        for j in range(n - 1):  # n-1 iterations each
            if arr[j] > arr[j + 1]:
                arr[j], arr[j + 1] = arr[j + 1], arr[j]

# O(log n) - Halving the search space
def binary_search(arr, target):
    left, right = 0, len(arr) - 1
    while left <= right:  # log n iterations
        mid = (left + right) // 2
        if arr[mid] == target:
            return mid
        elif arr[mid] < target:
            left = mid + 1
        else:
            right = mid - 1
    return -1

# O(n log n) - Divide and conquer
def merge_sort(arr):
    if len(arr) <= 1:
        return arr
    mid = len(arr) // 2
    left = merge_sort(arr[:mid])   # T(n/2)
    right = merge_sort(arr[mid:])  # T(n/2)
    return merge(left, right)      # O(n)
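The merge helper isn't defined in these notes; a minimal sketch consistent with the call above:

def merge(left, right):
    # Merge two sorted lists into one sorted list in O(len(left) + len(right))
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged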
5. Recurrence Relations
5.1 Master Theorem
For recurrences of the form $T(n) = aT(n/b) + f(n)$ where $a \geq 1$ and $b > 1$:
- Case 1: If $f(n) = O(n^{\log_b a - \epsilon})$ for some $\epsilon > 0$, then $T(n) = \Theta(n^{\log_b a})$
- Case 2: If $f(n) = \Theta(n^{\log_b a})$, then $T(n) = \Theta(n^{\log_b a} \log n)$
- Case 3: If $f(n) = \Omega(n^{\log_b a + \epsilon})$ for some $\epsilon > 0$ and regularity condition, then $T(n) = \Theta(f(n))$
5.2 Common Recurrence Examples
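Some frequently-seen recurrences and their solutions:
- $T(n) = T(n/2) + O(1) \Rightarrow \Theta(\log n)$ (binary search; Case 2)
- $T(n) = 2T(n/2) + O(1) \Rightarrow \Theta(n)$ (tree traversal; Case 1)
- $T(n) = 2T(n/2) + O(n) \Rightarrow \Theta(n \log n)$ (merge sort; Case 2)
- $T(n) = T(n-1) + O(1) \Rightarrow \Theta(n)$ (linear recursion; solved by unrolling, not the Master Theorem)
- $T(n) = T(n-1) + T(n-2) + O(1) \Rightarrow \Theta(\phi^n)$ where $\phi \approx 1.618$ (naive recursive Fibonacci)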
6. Space Complexity
6.1 Types of Space Complexity
- Auxiliary Space: Extra space used by the algorithm, not counting the input
- Total Space: Auxiliary space plus input space
6.2 Space Complexity Examples
# O(1) auxiliary space - iterative approach
def factorial_iterative(n):
    result = 1
    for i in range(1, n + 1):
        result *= i
    return result

# O(n) auxiliary space - recursive approach
def factorial_recursive(n):
    if n <= 1:
        return 1
    return n * factorial_recursive(n - 1)  # n frames on the call stack

# O(n) auxiliary space - explicit array
def fibonacci_dp(n):
    if n <= 1:
        return n
    dp = [0] * (n + 1)  # O(n) space
    dp[1] = 1
    for i in range(2, n + 1):
        dp[i] = dp[i - 1] + dp[i - 2]
    return dp[n]

# O(1) auxiliary space - optimized DP
def fibonacci_optimized(n):
    if n <= 1:
        return n
    a, b = 0, 1
    for _ in range(2, n + 1):
        a, b = b, a + b
    return b
7. Amortized Analysis
7.1 Aggregate Method
Analyze the total cost of a sequence of $n$ operations and divide by $n$.
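For example, $n$ appends into a doubling dynamic array cost at most $n + (1 + 2 + 4 + \cdots + n) < 3n$ in total, so each append is $O(1)$ amortized.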
7.2 Accounting Method
Assign credits to operations; expensive operations use credits from cheaper ones.
7.3 Potential Method
Define a potential function and analyze amortized cost using potential differences.
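Concretely, the amortized cost of the $i$-th operation is $\hat{c}_i = c_i + \Phi(D_i) - \Phi(D_{i-1})$, where $c_i$ is its actual cost and $D_i$ is the structure's state afterward; as long as $\Phi(D_n) \ge \Phi(D_0)$, the total amortized cost upper-bounds the total actual cost.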
7.4 Dynamic Array Example
class DynamicArray:
    def __init__(self):
        self.capacity = 1
        self.size = 0
        self.data = [None] * self.capacity

    def append(self, item):
        if self.size == self.capacity:
            self._resize()  # O(n) occasionally
        self.data[self.size] = item
        self.size += 1

    def _resize(self):
        self.capacity *= 2
        new_data = [None] * self.capacity
        for i in range(self.size):  # Copy O(n) elements
            new_data[i] = self.data[i]
        self.data = new_data

# Amortized analysis:
# - Most appends: O(1)
# - Resize operations: O(n) but infrequent
# - Total cost for n operations: O(n)
# - Amortized cost per operation: O(1)
8. Best, Average, and Worst Case Analysis
8.1 Quick Sort Analysis
def quicksort(arr, low=0, high=None):
    if high is None:
        high = len(arr) - 1
    if low < high:
        pivot_index = partition(arr, low, high)
        quicksort(arr, low, pivot_index - 1)
        quicksort(arr, pivot_index + 1, high)

def partition(arr, low, high):
    pivot = arr[high]  # Lomuto scheme: last element as pivot
    i = low - 1
    for j in range(low, high):
        if arr[j] <= pivot:
            i += 1
            arr[i], arr[j] = arr[j], arr[i]
    arr[i + 1], arr[high] = arr[high], arr[i + 1]
    return i + 1

# Time Complexity Analysis:
# Best Case: O(n log n) - pivot always splits the array roughly in half
# Average Case: O(n log n) - expected over random input orderings, or with randomized pivot selection
# Worst Case: O(n²) - pivot is always the smallest/largest element
#   (with this last-element pivot, already-sorted input triggers the worst case)
9. Algorithm Design Paradigms
9.1 Divide and Conquer
- Recurrence: $T(n) = aT(n/b) + f(n)$
- Examples: Merge sort, binary search, Strassen's matrix multiplication
9.2 Dynamic Programming
- Strategy: Store solutions to overlapping subproblems to avoid recomputation (a time-space tradeoff)
- Examples: Fibonacci, longest common subsequence, knapsack
9.3 Greedy Algorithms
- Strategy: Make locally optimal choices
- Examples: Dijkstra's algorithm, Huffman coding
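As a small illustration of the greedy pattern, a sketch of interval scheduling (the function name and intervals are illustrative):

def max_non_overlapping(intervals):
    # Greedy: repeatedly pick the compatible interval that finishes earliest.
    # Sorting dominates: O(n log n) time.
    intervals.sort(key=lambda iv: iv[1])  # Sort by finish time
    count = 0
    last_finish = float('-inf')
    for start, finish in intervals:
        if start >= last_finish:  # The locally optimal choice is globally optimal here
            count += 1
            last_finish = finish
    return count

# max_non_overlapping([(1, 3), (2, 4), (3, 5)]) -> 2  (pick (1,3) and (3,5))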
10. Advanced Complexity Topics
10.1 Complexity Classes
- P: Problems solvable in polynomial time
- NP: Problems whose solutions are verifiable in polynomial time
- NP-Complete: The hardest problems in NP; every problem in NP reduces to them in polynomial time
- NP-Hard: At least as hard as every problem in NP; not necessarily in NP themselves
10.2 Reduction and Approximation
# Example: Traveling Salesman Problem (TSP)
# Exact brute-force solution: O(n!) - factorial time
# Approximation algorithms can provide solutions within a provable factor of optimal
# (for metric instances; general TSP admits no constant-factor approximation unless P = NP)
def tsp_nearest_neighbor(distances):
    """
    Nearest neighbor heuristic for TSP.
    Time: O(n²). No constant-factor approximation guarantee in general.
    """
    n = len(distances)
    if n == 0:
        return 0, []
    visited = [False] * n
    path = [0]  # Start from city 0
    visited[0] = True
    total_distance = 0
    current = 0
    for _ in range(n - 1):
        nearest_city = -1
        nearest_distance = float('inf')
        for city in range(n):
            if not visited[city] and distances[current][city] < nearest_distance:
                nearest_distance = distances[current][city]
                nearest_city = city
        visited[nearest_city] = True
        path.append(nearest_city)
        total_distance += nearest_distance
        current = nearest_city
    # Return to starting city
    total_distance += distances[current][0]
    path.append(0)
    return total_distance, path
# Dynamic programming (Held-Karp) solution for TSP
def tsp_dp(distances):
    """
    Exact TSP solution using dynamic programming.
    Time: O(n² * 2^n), Space: O(n * 2^n)
    """
    n = len(distances)
    if n == 0:
        return 0, []
    if n == 1:
        return 0, [0]
    # dp[mask, i] = minimum cost to visit all cities in mask, ending at city i
    dp = {}
    parent = {}
    # Base case: go from city 0 directly to city i
    for i in range(1, n):
        dp[(1 << i) | 1, i] = distances[0][i]
    # Fill DP table in order of increasing mask
    for mask in range(1, 1 << n):
        if not (mask & 1):  # Every partial tour must include the starting city
            continue
        for u in range(n):
            if not (mask & (1 << u)):
                continue
            prev_mask = mask ^ (1 << u)
            if prev_mask == 0:
                continue
            min_cost = float('inf')
            best_prev = -1
            for v in range(n):
                if (prev_mask & (1 << v)) and (prev_mask, v) in dp:
                    cost = dp[prev_mask, v] + distances[v][u]
                    if cost < min_cost:
                        min_cost = cost
                        best_prev = v
            if min_cost != float('inf'):
                dp[mask, u] = min_cost
                parent[mask, u] = best_prev
    # Find the cheapest way to visit all cities and return to the start
    full_mask = (1 << n) - 1
    min_cost = float('inf')
    best_end = -1
    for i in range(1, n):
        if (full_mask, i) in dp:
            cost = dp[full_mask, i] + distances[i][0]
            if cost < min_cost:
                min_cost = cost
                best_end = i
    # Reconstruct the path by following parent pointers backward
    path = []
    mask = full_mask
    current = best_end
    while current != -1:
        path.append(current)
        if (mask, current) in parent:
            prev = parent[mask, current]
            mask ^= (1 << current)
            current = prev
        else:
            break
    path.reverse()
    path = [0] + path + [0]  # Tour starts and ends at city 0
    return min_cost, path
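A quick sanity check on a small hypothetical 4-city distance matrix (the values are made up):

D = [
    [0, 10, 15, 20],
    [10, 0, 35, 25],
    [15, 35, 0, 30],
    [20, 25, 30, 0],
]
print(tsp_nearest_neighbor(D))  # Heuristic tour; may be suboptimal in general
print(tsp_dp(D))                # Optimal tour: cost 80, e.g. 0 -> 2 -> 3 -> 1 -> 0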
11. Practical Complexity Analysis
11.1 Profiling and Benchmarking
import time
import random
import matplotlib.pyplot as plt

def benchmark_sorting_algorithms():
    """Compare different sorting algorithms empirically"""
    # selection_sort and insertion_sort are assumed to be defined
    # alongside the earlier examples in these notes
    algorithms = {
        'Bubble Sort': bubble_sort,
        'Selection Sort': selection_sort,
        'Insertion Sort': insertion_sort,
        'Merge Sort': merge_sort,
        'Quick Sort': quicksort,
        'Python Built-in': sorted,
    }
    sizes = [100, 500, 1000, 2000, 5000]
    results = {name: [] for name in algorithms}
    for size in sizes:
        print(f"Testing size {size}...")
        data = [random.randint(1, 1000) for _ in range(size)]
        for name, algorithm in algorithms.items():
            test_data = data.copy()
            start_time = time.perf_counter()  # perf_counter is better suited to timing than time.time
            algorithm(test_data)
            end_time = time.perf_counter()
            results[name].append(end_time - start_time)
    # Plot results on a log scale to compare growth rates
    plt.figure(figsize=(12, 8))
    for name, times in results.items():
        plt.plot(sizes, times, marker='o', label=name)
    plt.xlabel('Input Size')
    plt.ylabel('Time (seconds)')
    plt.title('Sorting Algorithm Performance Comparison')
    plt.legend()
    plt.yscale('log')
    plt.grid(True)
    plt.show()
    return results
# Memory profiling
import tracemalloc

def profile_memory(func, *args, **kwargs):
    """Profile memory usage of a function"""
    tracemalloc.start()
    result = func(*args, **kwargs)
    current, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    print(f"Current memory usage: {current / 1024 / 1024:.2f} MB")
    print(f"Peak memory usage: {peak / 1024 / 1024:.2f} MB")
    return result
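For example, comparing the two Fibonacci implementations from Section 6.2 (the choice of n here is arbitrary):

# fibonacci_dp allocates an O(n) table, so its peak should be noticeably higher
profile_memory(fibonacci_dp, 100000)
profile_memory(fibonacci_optimized, 100000)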
12. Common Algorithmic Patterns
12.1 Two Pointers Technique
# O(n) solution for two sum in sorted array
def two_sum_sorted(nums, target):
    left, right = 0, len(nums) - 1
    while left < right:
        current_sum = nums[left] + nums[right]
        if current_sum == target:
            return [left, right]
        elif current_sum < target:
            left += 1
        else:
            right -= 1
    return None
12.2 Sliding Window
# O(n) solution for maximum sum subarray of size k
def max_sum_subarray(nums, k):
    if len(nums) < k:
        return None
    # Calculate sum of first window
    window_sum = sum(nums[:k])
    max_sum = window_sum
    # Slide the window: drop the leftmost element, add the new rightmost one
    for i in range(k, len(nums)):
        window_sum = window_sum - nums[i - k] + nums[i]
        max_sum = max(max_sum, window_sum)
    return max_sum
13. References
- Cormen, T. H., Leiserson, C. E., Rivest, R. L., & Stein, C. (2009). Introduction to Algorithms.
- Sedgewick, R., & Wayne, K. (2011). Algorithms.
- Kleinberg, J., & Tardos, E. (2005). Algorithm Design.