Published: 30 Apr 2025 | Reading Time: 5 min read
The Rabin-Karp algorithm is a well-known string-matching algorithm built around an effective hashing technique. Created in 1987 by Michael O. Rabin and Richard M. Karp, it is valued for its simplicity and for its efficiency at finding patterns in long texts. Applications such as search engines, DNA analysis, and plagiarism detection make extensive use of it.
Fundamentally, Rabin-Karp transforms strings into hash values so that comparisons can be made more quickly. A rolling hash then updates each hash value incrementally as the search window moves, instead of recomputing it from scratch.
This blog delves into its fundamentals, step-by-step implementation, and practical applications in C++, Java, and Python.
The Rabin Karp algorithm is a method used for finding a specific pattern within a larger text. It works by using hashing, which is a way of converting text into numerical values. Instead of comparing the pattern with every substring directly (which would be slow), it first calculates hash values for the pattern and the substrings of the text.
The Rabin Karp algorithm is particularly useful when searching for multiple patterns within a large text, making it efficient for applications like plagiarism detection, search engines, and DNA sequencing.
Its use of hashing and the rolling hash technique allows for faster pattern matching compared to direct string comparison methods. Instead of checking every substring individually, it first compares hash values, reducing the number of direct character comparisons.
Hashing: This is a process of converting a string (or any data) into a fixed numerical value, called a hash. It helps in quick data retrieval and comparison.
Rolling Hash: Instead of recalculating the entire hash from scratch when shifting a window over a string (such as in substring searches), a rolling hash updates the hash value efficiently by removing the effect of the outgoing character and adding the effect of the incoming character.
Prime Number Selection: Taking hash values modulo a prime number helps reduce the chance that different strings produce the same hash (a hash collision), because a prime modulus tends to spread hash values more evenly.
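To make the rolling hash idea concrete, here is a minimal Python sketch. The constants `BASE` and `MOD` and the helper names `full_hash` and `roll` are assumptions chosen for illustration. It checks that one constant-time update gives the same value as hashing the next window from scratch.

```python
BASE = 256             # assumed base: one slot per possible byte value
MOD = 1_000_000_007    # assumed large prime modulus

def full_hash(s):
    """Hash a string from scratch using Horner's rule."""
    h = 0
    for ch in s:
        h = (h * BASE + ord(ch)) % MOD
    return h

def roll(h, outgoing, incoming, high_power):
    """Update a window hash in O(1): drop `outgoing`, append `incoming`."""
    h = (h - ord(outgoing) * high_power) % MOD   # remove the leading character
    return (h * BASE + ord(incoming)) % MOD      # shift and add the new one

text, m = "abcde", 3
high_power = pow(BASE, m - 1, MOD)   # weight of the window's first character
h = full_hash(text[:m])              # hash of "abc"
h = roll(h, text[0], text[m], high_power)
print(h == full_hash("bcd"))         # True
```

The update costs the same no matter how long the window is, which is where the speed-up over rehashing every substring comes from.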
Before starting the Rabin Karp algorithm, certain parameters need to be chosen carefully to provide efficient and accurate pattern matching.
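To compute the hash values, two important parameters are selected: a base, which weights each character position (31 and 256 in the listings in this post), and a prime modulus, which keeps hash values within a fixed range (101 and 1e9+9 in the listings).
A hash value is computed for the given pattern using a mathematical formula. This formula converts the pattern into a compact numerical representation that is cheap to compare with segments of the text. The value is not guaranteed to be unique, which is why hash matches are verified later.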
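As a rough illustration of such a formula, the sketch below hashes a short pattern with Horner's rule, which is one common choice rather than the only one. The base 31, the modulus 1,000,000,009, and the a=1 ... z=26 mapping are borrowed from the constants in the C++ listing in this post. The function name `polynomial_hash` is made up for the example.

```python
BASE = 31              # same base as the C++ listing in this post
MOD = 1_000_000_009    # same prime modulus as the C++ listing

def polynomial_hash(s):
    """Return (c1*BASE^(m-1) + ... + cm) % MOD, with a=1, b=2, ..., z=26."""
    h = 0
    for ch in s:
        value = ord(ch) - ord("a") + 1   # map 'a'..'z' to 1..26
        h = (h * BASE + value) % MOD     # Horner's rule: shift, then add
    return h

print(polynomial_hash("abc"))  # (1*31 + 2)*31 + 3 = 1026
print(polynomial_hash("acb"))  # (1*31 + 3)*31 + 2 = 1056, so order matters
```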
To optimize the hash computation process, the power terms of the base value are calculated in advance. These pre-computed values help speed up the rolling hash computation, reducing redundant calculations as the algorithm progresses.
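A small sketch of that pre-computation, assuming the same illustrative base and modulus as the C++ listing in this post (the helper name `precompute_powers` is hypothetical):

```python
BASE = 31
MOD = 1_000_000_009

def precompute_powers(count):
    """Return [BASE^0, BASE^1, ..., BASE^(count-1)], each reduced modulo MOD."""
    powers = [1] * count
    for i in range(1, count):
        powers[i] = (powers[i - 1] * BASE) % MOD
    return powers

powers = precompute_powers(4)
print(powers)                          # [1, 31, 961, 29791]
print(powers[3] == pow(BASE, 3, MOD))  # True
```

For a single pattern of length m, the rolling update only needs the one value BASE^(m-1) mod MOD, so a full table is optional. Python's built-in three-argument `pow` computes that value directly.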
Before scanning the entire text, the hash value for the first window of text is computed. This initial computation sets up the rolling hash technique, allowing efficient comparisons as the window slides across the text.
Once the initial window is set, the algorithm slides the window across the text one character at a time. The hash value for each new window is updated efficiently rather than recalculating it from scratch. This process involves three essential steps: removing the contribution of the outgoing character, multiplying the remaining value by the base to shift it one position, and adding the incoming character, with each step taken modulo the chosen prime.
When the hash values of the pattern and a text window match, the match is not guaranteed to be exact because of potential hash collisions. Therefore, a direct character-by-character comparison is performed to confirm it. If the pattern and the text segment are identical, the position of the match is recorded.
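A minimal Python sketch of the sliding update follows: drop the outgoing character, shift by the base, add the incoming character. The generator name `window_hashes` and the constants are assumptions for illustration, not part of any library.

```python
BASE = 256
MOD = 1_000_000_007

def window_hashes(text, m):
    """Yield (start_index, hash) for every length-m window of text."""
    n = len(text)
    if m == 0 or n < m:
        return
    high_power = pow(BASE, m - 1, MOD)   # weight of a window's first character
    h = 0
    for ch in text[:m]:                  # hash of the first window
        h = (h * BASE + ord(ch)) % MOD
    yield 0, h
    for i in range(1, n - m + 1):
        h = (h - ord(text[i - 1]) * high_power) % MOD   # 1. remove outgoing char
        h = (h * BASE) % MOD                            # 2. shift remaining chars
        h = (h + ord(text[i + m - 1])) % MOD            # 3. add incoming char
        yield i, h

hashes = dict(window_hashes("abcab", 2))
print(hashes[0] == hashes[3])  # True: both windows are "ab"
print(hashes[0] == hashes[1])  # False: "ab" vs "bc"
```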
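The sketch below shows why that verification step matters. It deliberately uses a tiny modulus (101, as in the Java and Python listings in this post) so that a collision is easy to produce. The helper names are hypothetical, and each window is hashed from scratch to keep the focus on verification rather than on the rolling update.

```python
BASE = 256
MOD = 101   # deliberately tiny modulus so that collisions are easy to produce

def small_hash(s):
    h = 0
    for ch in s:
        h = (h * BASE + ord(ch)) % MOD
    return h

def search_with_verification(text, pattern):
    """Return (hash_hits, confirmed): indices where hashes agree, and true matches."""
    m = len(pattern)
    target = small_hash(pattern)
    hash_hits, confirmed = [], []
    for i in range(len(text) - m + 1):
        if small_hash(text[i:i + m]) == target:
            hash_hits.append(i)
            if text[i:i + m] == pattern:   # character-by-character check
                confirmed.append(i)
    return hash_hits, confirmed

print(small_hash("ca") == small_hash("ah"))     # True: a collision
print(search_with_verification("ca ah", "ah"))  # ([0, 3], [3])
```

Index 0 is a spurious hit: "ca" and "ah" share a hash under this modulus, and only the direct comparison rules it out.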
#include <iostream>
#include <string>
#include <vector>
using namespace std;

// Returns the starting index of every occurrence of pattern in text.
// Assumes lowercase 'a'-'z' input, since characters are mapped to 1..26.
vector<int> rabin_karp(const string& pattern, const string& text) {
    const int p = 31, mod = 1e9 + 9;
    int m = pattern.size(), n = text.size();
    vector<int> occurrences;
    if (m == 0 || n < m) return occurrences;  // nothing to search for
    long long pattern_hash = 0, text_hash = 0, p_pow = 1;
    // Compute hash values for pattern and the first window of text (Horner's rule)
    for (int i = 0; i < m; i++) {
        pattern_hash = (pattern_hash * p + (pattern[i] - 'a' + 1)) % mod;
        text_hash = (text_hash * p + (text[i] - 'a' + 1)) % mod;
        if (i < m - 1) p_pow = (p_pow * p) % mod;  // ends as p^(m-1) % mod
    }
    // Check for pattern match in text using rolling hash
    for (int i = 0; i <= n - m; i++) {
        // Compare characters only when the hashes agree
        if (pattern_hash == text_hash && text.compare(i, m, pattern) == 0)
            occurrences.push_back(i);
        // Compute hash for next window
        if (i < n - m) {
            // Remove the outgoing character, whose weight is p^(m-1)
            text_hash = (text_hash - (text[i] - 'a' + 1) * p_pow % mod + mod) % mod;
            // Shift by one position and append the incoming character
            text_hash = (text_hash * p + (text[i + m] - 'a' + 1)) % mod;
        }
    }
    return occurrences;
}

int main() {
    string text = "ababcabcab";
    string pattern = "abc";
    vector<int> positions = rabin_karp(pattern, text);
    cout << "Pattern found at indices: ";
    for (int pos : positions) {
        cout << pos << " ";
    }
    cout << endl;
    return 0;
}
The Rabin Karp algorithm finds all occurrences of a pattern in a given text using hashing. First, it computes the hash values for the pattern and the first substring of the text of the same length. Then, it slides through the text while updating the hash efficiently, checking for matches. If the hash matches, it is verified by direct comparison to confirm an exact match.
Pattern found at indices: 2 5
public class RabinKarpSearch {
    private static final int PRIME = 101;  // modulus; small, so hash collisions are frequent
    private static final int BASE = 31;

    // Returns the index of the first occurrence of pattern in text, or -1 if absent.
    public static int searchPattern(String pattern, String text) {
        int patternLength = pattern.length();
        int textLength = text.length();
        if (patternLength > textLength) {
            return -1;
        }
        // BASE^(patternLength - 1) % PRIME, kept modular so long patterns cannot overflow
        long highPower = 1;
        for (int i = 1; i < patternLength; i++) {
            highPower = (highPower * BASE) % PRIME;
        }
        long patternHash = computeHash(pattern, patternLength);
        long textHash = computeHash(text, patternLength);
        for (int i = 0; i <= textLength - patternLength; i++) {
            // Compare characters only when the hashes agree
            if (patternHash == textHash && text.substring(i, i + patternLength).equals(pattern)) {
                return i;
            }
            if (i < textLength - patternLength) {
                textHash = recalculateHash(text, textHash, i, patternLength, highPower);
            }
        }
        return -1;
    }

    // Hashes the first `length` characters of str using Horner's rule.
    private static long computeHash(String str, int length) {
        long hash = 0;
        for (int i = 0; i < length; i++) {
            hash = (hash * BASE + str.charAt(i)) % PRIME;
        }
        return hash;
    }

    private static long recalculateHash(String text, long oldHash, int index, int patternLength, long highPower) {
        // Remove the outgoing character, keeping the value non-negative
        long hash = (oldHash - text.charAt(index) * highPower % PRIME + PRIME) % PRIME;
        // Shift by one position and append the incoming character
        return (hash * BASE + text.charAt(index + patternLength)) % PRIME;
    }

    public static void main(String[] args) {
        String text = "ababcabcab";
        String pattern = "abc";
        int result = searchPattern(pattern, text);
        System.out.println("Pattern found at index: " + result);
    }
}
This program implements the Rabin-Karp algorithm for pattern matching. It maintains a rolling hash of the current text window and compares characters only when that hash equals the pattern's hash. It returns the starting index of the first match, or -1 if the pattern does not occur in the text.
Pattern found at index: 2
def rabin_karp_search(text, pattern):
    base = 256   # Base value for hash computation (number of possible characters)
    prime = 101  # A prime number to minimize hash collisions
    m, n = len(pattern), len(text)
    result = []
    if m == 0 or n < m:
        return result
    hash_multiplier = pow(base, m - 1, prime)
    pattern_hash, window_hash = 0, 0
    # Compute initial hash values
    for i in range(m):
        pattern_hash = (base * pattern_hash + ord(pattern[i])) % prime
        window_hash = (base * window_hash + ord(text[i])) % prime
    # Slide through the text
    for i in range(n - m + 1):
        if pattern_hash == window_hash:
            if text[i:i + m] == pattern:
                result.append(i)
        if i < n - m:
            # Python's % never returns a negative value for a positive modulus,
            # so no extra sign correction is needed here.
            window_hash = (base * (window_hash - ord(text[i]) * hash_multiplier) + ord(text[i + m])) % prime
    return result

# Example usage
text_sample = "ABCCDABCDABCD"
pattern_sample = "ABCD"
print(rabin_karp_search(text_sample, pattern_sample))
The Rabin-Karp algorithm is a string-searching technique that uses hash values to find occurrences of a pattern in a text efficiently. It first calculates hash values for the pattern and the first window of text, then slides through the text while updating the hash dynamically. If the hash values match, a direct character comparison confirms the match.
[5, 9]
The Rabin Karp algorithm is an efficient pattern-matching technique that utilizes hashing and the rolling hash method to search for patterns in a given text. It is useful for applications like plagiarism detection, DNA sequencing, and search engines due to its ability to handle multiple pattern searches efficiently.
While its average time complexity is linear, it can slow down in cases of excessive hash collisions. Despite this limitation, its balance between simplicity and performance makes it a widely used algorithm for string-searching tasks.
When two different substrings have the same hash (a collision), the algorithm performs a direct character-by-character comparison to verify if it's a real match.
On average, it runs in O(n + m) time, where n is the text length and m is the pattern length. However, in the worst case, it can take O(nm) time.
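A rough way to see the O(nm) case is to count the characters compared during verification. The sketch below is an illustration under assumed constants (base 256, modulus 1,000,000,007), and the function name `count_char_comparisons` is hypothetical. When every window is a genuine match, each one costs m comparisons.

```python
def count_char_comparisons(text, pattern, base=256, mod=1_000_000_007):
    """Run a Rabin-Karp style search and count characters compared during verification."""
    n, m = len(text), len(pattern)
    if m == 0 or n < m:
        return 0
    high_power = pow(base, m - 1, mod)
    p_hash = w_hash = 0
    for i in range(m):
        p_hash = (p_hash * base + ord(pattern[i])) % mod
        w_hash = (w_hash * base + ord(text[i])) % mod
    comparisons = 0
    for i in range(n - m + 1):
        if p_hash == w_hash:                 # verify only on a hash hit
            for j in range(m):
                comparisons += 1
                if text[i + j] != pattern[j]:
                    break
        if i < n - m:
            w_hash = ((w_hash - ord(text[i]) * high_power) * base + ord(text[i + m])) % mod
    return comparisons

print(count_char_comparisons("a" * 20, "a" * 5))   # 80 = (20 - 5 + 1) * 5
print(count_char_comparisons("ab" * 10, "zzzzz"))  # 0: no window hash matches
```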
Instead of recalculating the hash from scratch, the rolling hash updates it efficiently when moving to the next substring. It removes the first character's contribution and adds the new character at the end.
It's commonly used in plagiarism detection, DNA sequence matching, data deduplication, and network security.
A large prime number helps spread hash values more evenly, reducing hash collisions and improving efficiency.
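By storing the hash of every pattern (for example in a hash set), Rabin-Karp can search for multiple patterns in one pass through the text, provided the patterns share a length so that one window hash can be compared against all of them. This makes it useful for large-scale searches.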
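One possible shape for such a multi-pattern search is sketched below. It assumes all patterns have the same length, which is a restriction of this sketch. The function name `multi_pattern_search` and the default constants are illustrative, not taken from any library.

```python
def multi_pattern_search(text, patterns, base=256, mod=1_000_000_007):
    """Find every occurrence of several equal-length patterns in one pass.

    Returns (index, pattern) pairs in order of index.
    """
    if not patterns:
        return []
    m = len(patterns[0])
    if any(len(p) != m for p in patterns):
        raise ValueError("this sketch needs patterns of equal length")
    n = len(text)
    if m == 0 or n < m:
        return []

    def full_hash(s):
        h = 0
        for ch in s:
            h = (h * base + ord(ch)) % mod
        return h

    # Map each hash to the patterns that produce it (two patterns may collide).
    by_hash = {}
    for p in set(patterns):
        by_hash.setdefault(full_hash(p), []).append(p)

    high_power = pow(base, m - 1, mod)
    w_hash = full_hash(text[:m])
    matches = []
    for i in range(n - m + 1):
        for p in by_hash.get(w_hash, []):
            if text[i:i + m] == p:           # verify to rule out collisions
                matches.append((i, p))
        if i < n - m:
            w_hash = ((w_hash - ord(text[i]) * high_power) * base + ord(text[i + m])) % mod
    return matches

print(multi_pattern_search("ababcabcab", ["abc", "cab", "bab"]))
# [(1, 'bab'), (2, 'abc'), (4, 'cab'), (5, 'abc'), (7, 'cab')]
```

The text is still scanned once. Each window costs one dictionary lookup instead of one comparison per pattern.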
Source: NxtWave - CCBP Blog
Original URL: https://www.ccbp.in/blog/articles/rabin-karp-algorithm