Chances are that if you've worked with strings, you have also tried to locate a specific substring within a larger source string. In computer science this is called pattern matching or string matching, and there are a few classic algorithms for accomplishing it. In this two-part series we're going to discuss string searching algorithms. We will start with Knuth-Morris-Pratt.
Pattern Matching Algorithms Demystified: KMP - Taking a Break (Page 4 of 4)
We have just arrived at the end of this part. During this half, we delved into one of the classic string searching algorithms. By now you should know what string matching or pattern searching is all about. You should also know why we need efficient algorithms, since naïve solutions don't cut it most of the time.
Before we say goodbye, please glance over the attached screenshot below. That's the output in the command window when we run the application we just created with the source text string "abababac" and the pattern string "ababac". The match is reported at position 2. Since array indexing in C/C++ starts at 0, that means the pattern begins at the third character of the source string.
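For readers who want to reproduce that result without the full application, here is a minimal sketch of a KMP search. It is not the program built earlier in this article; the names `build_prefix_table` and `kmp_search` are made up for this illustration, and the function only reports the first match (or -1 if there is none).

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: fail[i] is the length of the longest proper prefix
// of pattern[0..i] that is also a suffix of pattern[0..i].
std::vector<int> build_prefix_table(const std::string& pattern) {
    std::vector<int> fail(pattern.size(), 0);
    int k = 0;  // length of the currently matched prefix
    for (std::size_t i = 1; i < pattern.size(); ++i) {
        // On a mismatch, fall back to the next shorter candidate prefix.
        while (k > 0 && pattern[i] != pattern[k]) k = fail[k - 1];
        if (pattern[i] == pattern[k]) ++k;
        fail[i] = k;
    }
    return fail;
}

// Hypothetical helper: zero-based index of the first occurrence of
// pattern in text, or -1 if the pattern does not occur.
int kmp_search(const std::string& text, const std::string& pattern) {
    if (pattern.empty()) return 0;  // convention chosen for this sketch
    const std::vector<int> fail = build_prefix_table(pattern);
    int k = 0;  // number of pattern characters matched so far
    for (std::size_t i = 0; i < text.size(); ++i) {
        // The text index i never moves backwards; only k is adjusted.
        while (k > 0 && text[i] != pattern[k]) k = fail[k - 1];
        if (text[i] == pattern[k]) ++k;
        if (k == static_cast<int>(pattern.size())) {
            return static_cast<int>(i) - k + 1;
        }
    }
    return -1;
}
```

Calling `kmp_search("abababac", "ababac")` returns 2, which agrees with the position reported above. The prefix table for "ababac" is 0, 0, 1, 2, 3, 0. When the sixth comparison fails, the search resumes with three characters already matched instead of starting over.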
Please don't forget to tune in for the second half of this series. You won't want to miss it. In the upcoming part, we'll discuss and implement the Boyer-Moore algorithm, which is one of the fastest string searching algorithms in practice and is often treated as the standard against which others are measured. But have no fear, we are going to be able to understand the "black art" behind it.
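There are strong reasons why I've chosen these two algorithms. Surely, both are classics and great to know. However, there is a significant difference between the two. KMP is based on a preprocessing stage that can be viewed as building a deterministic finite automaton (DFA), sometimes called a deterministic finite state machine (DFSM), from the pattern. In practice this is the prefix table, which records for each position how long a prefix of the pattern is also a suffix of what has been matched so far. After this stage, KMP scans the text from left to right and uses that table to decide how far to shift the pattern on a mismatch, without ever re-reading text characters it has already matched.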
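On the other hand, Boyer-Moore approaches the problem from a totally different angle. It compares the needle (the pattern) against the text starting from the end of the needle. When a mismatch occurs, it can often jump ahead, in the best case by an entire needle-length. As a result, for a text of length n and a needle of length m, the best case needs only about n/m character comparisons, that is, O(n/m). The worst case depends on the variant: the basic algorithm can degrade to O(nm), while refined versions stay linear in n. But I'm getting ahead of myself. All of this will be discussed in the next part, so tune in!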
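As a small preview of that right-to-left comparison and the long jumps, here is a sketch of the Horspool simplification. It uses only a bad-character shift table, so it is not the full Boyer-Moore algorithm that the next part covers. The name `horspool_search` is made up for this illustration, and the 256-entry table assumes single-byte characters.

```cpp
#include <array>
#include <cstddef>
#include <string>

// Hypothetical helper: zero-based index of the first occurrence of
// pattern in text, or -1. Horspool variant (bad-character shifts only).
int horspool_search(const std::string& text, const std::string& pattern) {
    const std::size_t n = text.size();
    const std::size_t m = pattern.size();
    if (m == 0) return 0;   // convention chosen for this sketch
    if (m > n) return -1;

    // shift[c] = how far the window may move when its last character is c.
    // Characters that do not occur in the pattern allow a full jump of m.
    std::array<std::size_t, 256> shift;
    shift.fill(m);
    for (std::size_t i = 0; i + 1 < m; ++i) {
        shift[static_cast<unsigned char>(pattern[i])] = m - 1 - i;
    }

    std::size_t pos = 0;  // start of the current window in the text
    while (pos + m <= n) {
        // Compare from the end of the needle towards its beginning.
        std::size_t j = m;
        while (j > 0 && text[pos + j - 1] == pattern[j - 1]) --j;
        if (j == 0) return static_cast<int>(pos);
        // Jump according to the text character under the needle's last cell.
        pos += shift[static_cast<unsigned char>(text[pos + m - 1])];
    }
    return -1;
}
```

For the same inputs as before, `horspool_search("abababac", "ababac")` also returns 2, and it gets there after inspecting just two windows. When the text contains a character that never appears in the needle, the window moves by the full needle length, which is where the O(n/m) best case comes from.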
Until then, don't hesitate to join our friendly community of tech professionals, experts, and enthusiasts at DevHardware Forums. We debate the most provocative issues and solve hardware, software, and consumer electronics mysteries. You may also join the communities at any one of our sister sites.