Brute Force

Brute force pattern matching involves executing a search at every point in the target string, and attempting to look over the length of the pattern.

Worst Case

T = aaaaaa....aaaah
P = aaaaaah

Complexity

O(nm), every item in T must be matched to every item in P.


Boyer-Moore

Booyer Moore looks to improve basic Brute Force Matching.

It works by lining T and P up, and checking the last character in P against T[len(P)-1]. If this character matches, the second last character is checked, and so on. Upon failure, rather than just shifting the Pattern across by one, it can be observed that if c is the character in T that we are trying to match, we can re-align P such that c lines up with the last occurrence of c in P.

Worst Case

T = aaaaaaaaaaaaa
P = baaaaa

Complexity

O(nm + s), whilst this algorithm appears to be equal in complexity to the Brute force method, it is actually significantly faster for English Text (Around 1/4 of the operations required).


KMP (Knuth-Morris-Pratt)

KMP basically calculates how far the pattern can be shifted if it fails after a particular point.

T abaab|x
P abaab|a
P    ab|aaba

If we fail after successfully matching abaab, then we can infer that we can shift the pattern 3 spaces, and know that the first ab will already match.

Complexity

O(m+n), Much better than Boyer-Moore!