Aho-Corasick Algorithm for Multiple String Matching
Understanding the Aho-Corasick Algorithm
The Aho-Corasick algorithm is a powerful string-searching algorithm that can find all occurrences of a finite set of patterns within a text in linear time. It combines the data structures of a trie and a finite automaton. First, it builds a trie from all the patterns. Then, it adds failure links to the trie, transforming it into a state machine. This allows the search to proceed without backtracking, even when a mismatch occurs, by following failure links to find the next best partial match.
Example: Aho-Corasick Implementation
Here is a Python implementation of the Aho-Corasick algorithm. It involves building a trie, computing failure links, and then searching the text using the constructed automaton.
from collections import deque
class TrieNode:
def __init__(self):
self.children = {}
self.fail = None
self.output = []
def build_aho_corasick(patterns):
root = TrieNode()
# 1. Build the trie
for pattern in patterns:
node = root
for char in pattern:
node = node.children.setdefault(char, TrieNode())
node.output.append(pattern)
# 2. Build failure links using BFS
queue = deque()
for node in root.children.values():
node.fail = root
queue.append(node)
while queue:
current_node = queue.popleft()
for char, next_node in current_node.children.items():
fail_node = current_node.fail
while char not in fail_node.children and fail_node is not root:
fail_node = fail_node.fail
if char in fail_node.children:
next_node.fail = fail_node.children[char]
else:
next_node.fail = root
next_node.output.extend(next_node.fail.output)
queue.append(next_node)
return root
def search_aho_corasick(text, root):
results = []
node = root
for i, char in enumerate(text):
while char not in node.children and node is not root:
node = node.fail
if char in node.children:
node = node.children[char]
else:
node = root
if node.output:
for pattern in node.output:
results.append((i - len(pattern) + 1, pattern))
return results
# Example Usage
patterns = ["he", "she", "his", "hers"]
text = "ushers"
root = build_aho_corasick(patterns)
matches = search_aho_corasick(text, root)
print(f"Patterns found in '{text}':")
for index, pattern in matches:
print(f"- Pattern '{pattern}' found at index {index}")
# Expected Output:
# Patterns found in 'ushers':
# - Pattern 'he' found at index 2
# - Pattern 'she' found at index 1
# - Pattern 'hers' found at index 2
Related Algorithm Guides
Explore more algorithm interview guides powered by AI coaching
Mixed Reality Developer Interview Questions
AI-powered interview preparation guide
Nonprofit Management Interview Questions
AI-powered interview preparation guide
Real Time Interview Performance Feedback
AI-powered interview preparation guide
Public Speaking Interview Question Preparation
AI-powered interview preparation guide
Related Algorithm Resources
All Interview Solutions
Browse our complete collection of AI-powered interview preparation guides.
GeeksforGeeks Algorithms
Comprehensive algorithm tutorials and practice problems.
LeetCode Practice
Algorithm coding challenges and interview preparation.
Algorithm Visualizations
Interactive visualizations for understanding algorithms.