Development Cycles
  Home arrow Development Cycles arrow Page 3 - Pattern Matching Algorithms Demystified: K...
Dev Articles Forums 
ADO.NET  
Apache  
ASP  
ASP.NET  
C#  
C++  
ColdFusion  
COM/COM+  
Delphi-Kylix  
Design Usability  
Development Cycles  
DHTML  
Embedded Tools  
Flash  
Graphic Design  
HTML  
IIS  
Interviews  
Java  
JavaScript  
MySQL  
Oracle  
Photoshop  
PHP  
Reviews  
Ruby-on-Rails  
SQL  
SQL Server  
Style Sheets  
VB.Net  
Visual Basic  
Web Authoring  
Web Services  
Web Standards  
XML  
Mobile Linux 
App Generation ROI 
IBM® developerWorks 
Weekly Newsletter
 
Developer Updates  
Free Website Content 
 RSS  Articles
 RSS  Forums
 RSS  All Feeds
Write For Us Get Paid 
Request Media Kit
Contact Us 
Site Map 
Privacy Policy 
Support 
 USERNAME
 
 PASSWORD
 
 
  >>> SIGN UP!  
  Lost Password? 
DEVELOPMENT CYCLES

Pattern Matching Algorithms Demystified: KMP
By: Barzan "Tony" Antal
  • Search For More Articles!
  • Disclaimer
  • Author Terms
  • Rating: 4 stars4 stars4 stars4 stars4 stars / 8
    2008-07-29

    Table of Contents:
  • Pattern Matching Algorithms Demystified: KMP
  • The Theory
  • C Implementation
  • Taking a Break

  • Rate this Article: Poor Best 
      ADD THIS ARTICLE TO:
      Del.ici.ous Digg
      Blink Simpy
      Google Spurl
      Y! MyWeb Furl
    Email Me Similar Content When Posted
    Add Developer Shed Article Feed To Your Site
    Email Article To Friend
    Print Version Of Article
    PDF Version Of Article
     
     
    ADVERTISEMENT


    Pattern Matching Algorithms Demystified: KMP - C Implementation


    (Page 3 of 4 )

    First things first, I want to make it clear that there are numerous K-M-P implementations. Therefore, the solution you’re going to read here isn’t the only one. You can surely find others with a quick Google search. The coding language is C++ and it is cross-platform compatible. I’ve compiled it with Visual Studio 2005.

    I have opted for the “string” C++ objects. Moreover, we are reading from the stdin and printing out to the stdout. Check out the following code snippet. It’s the “main” section of the program. Notice the necessary header files as well as the prototypes of the function and the method we’re going to use later on. It’s straight-forward.

    #include <iostream>

    #include <stdlib.h>

    #include <string>

    using namespace std;

    int* genprefix(string);

    void kmp_match(string, string);

    int main()

    {

    string t;

    string p;

    cout << "Enter the T string: ";

    cin >> t;

    cout << "Enter the P string: ";

    cin >> p;

    kmp_match(t,p);

    return 0;

    }

    Here comes the part where the fun begins. The following function generates the prefix array. It receives the pattern string (p) as a parameter. The key parts of the code snippet are the while loop and the if section: “if (p[k] == p[q]) k++” since this is where the current substring is increased to the maximum length (if there’s a match).

    int* genprefix(string p)

    { // declaration and initialization of variables

    int *a;

    int m = (int)p.length();

    int k = 0;

    // we allocate dynamically the soon-to-be prefix array

    a = (int *)malloc(m*sizeof(int));

    a[0] = 0;

    // beginning of the prefix generation algorithm

    for(int q = 1; q < m; q++)

    {

    while (k > 0 && p[k] != p[q])

    k = a[k-1];

    if (p[k] == p[q])

    k++;

    a[q] = k;

    }

    // end of the prefix generation algorithm

    return a;

    }

    Finally, here’s where we are coding the searching algorithm. The method has two parameters: string t and string p - source text string and pattern string, respectively. The algorithm is intuitive since we are moving from left to right, taking into consideration the prefix array. Thanks to that said method we can eliminate lots of unnecessary comparisons and switches. We are jumping right to the next possible substrings.

    void kmp_match(string t, string p)

    { // declaration and initialization of variables

    int *a;

    int q = 0;

    int n = (int)t.length();

    int m = (int)p.length();

     

    // we generate the prefix array

    a = genprefix(p);

    // we print the content of the prefix array

    cout << endl << "Content of the prefix array: ";

    for (int i = 0; i < m; i++)

    cout << a[i] << " ";

    cout << endl;

     

    // beginning of the K-M-P algorithm

    for (int i = 0; i < n; i++)

    {

    while(q > 0 && p[q] != t[i])

    q = a[q-1];

    if (p[q] == t[i])

    q++;

    if (q == m)

    {

    cout << "Matches on " << i - m + 1 << " position." << endl;

    q = a[q-1];

    }

    }

    // end of the K-M-P algorithm

    }

    The above block of code also prints out the content of the prefix array. It will be useful by the time you learn how the algorithm works. All in all, if you would like to download the complete source code with the compiled executable, then just click on the button below. It is archived in ZIP format.

    More Development Cycles Articles
    More By Barzan "Tony" Antal


       · Thanks for reading this article; hope you've found it informative and...
       · thanks 4 the article. really found it informative....
     

    DEVELOPMENT CYCLES ARTICLES

    - Division of Large Numbers
    - Branch and Bound Algorithm Technique
    - Dynamic Programming Algorithm Technique
    - Genetic Algorithm Techniques
    - Greedy Strategy as an Algorithm Technique
    - Divide and Conquer Algorithm Technique
    - The Backtracking Algorithm Technique
    - More Pattern Matching Algorithms: B-M
    - Pattern Matching Algorithms Demystified: KMP
    - Coding Standards
    - A Peek into the Future: Transactional Memory
    - Learning About the Graph Construct using Gam...
    - Learning About the Graph Construct using Gam...
    - Learning About the Graph Construct using Gam...
    - How to Strike a Match







    © 2003-2009 by Developer Shed. All rights reserved. DS Cluster 4 Hosted by Hostway
    For more Enterprise Application Development news, visit eWeek