More Pattern Matching Algorithms: B-M
By: Barzan "Tony" Antal
    2008-08-05

    Table of Contents:
  • More Pattern Matching Algorithms: B-M
  • The Theory
  • Implementation
  • Final Words


    This is the second and final part of our two-part series on pattern matching, or string searching, algorithms. In the first part we covered the Knuth-Morris-Pratt (KMP) algorithm; in this segment we present the Boyer-Moore algorithm. It is widely considered one of the most efficient practical string searching algorithms, and it often serves as the benchmark against which others are measured.

    Before we begin, I suggest reading the first part of this series, published here on Dev Articles. It covers much of what you should know in order to fully grasp the Boyer-Moore exact pattern matching algorithm. This article follows the same scheme as the first one: we'll start with the theory.

    It all started back in 1977, when Bob Boyer and J Strother Moore published their work. You can find a scanned copy of the original published abstract here; kudos to the University of Texas for hosting it. The algorithm surprised most people at the time because it approached string searching differently: at each alignment it compares the pattern against the source backwards, from right to left, starting with the pattern's last character. And unlike some other algorithms, it preprocesses the pattern, not the source.

    With m the length of the pattern, n the length of the source and Σ the alphabet, the preprocessing time is Θ(m + |Σ|). The matching time is Ω(n / m) in the best case and O(n) in the worst case for non-periodic patterns, where the algorithm performs at most 3n text comparisons. For highly periodic patterns, the original algorithm can degrade to O(mn) when it has to report every occurrence. For a detailed overview of the asymptotic growth of functions and computational complexity theory, please check out this course from the Jack Baskin School of Engineering, UC Santa Cruz.
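
    To get a feel for these bounds, here is a small worked example. The values n = 1,000,000 and m = 20 are assumptions chosen purely for illustration, not figures from the original publication:

```latex
\[
\frac{n}{m} = \frac{1\,000\,000}{20} = 50\,000
\quad \text{(best case, comparisons)},
\qquad
3n = 3 \cdot 1\,000\,000 = 3\,000\,000
\quad \text{(worst-case bound, non-periodic pattern)}.
\]
```

    In other words, under these assumed sizes a favorable search may touch only about one in twenty characters of the source, while even an unfavorable search with a non-periodic pattern stays within a constant factor of n.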

    The efficiency of this algorithm lies in the fact that it usually does not have to inspect every character of the source string (the string in which we are searching for a pattern). The preprocessing phase analyzes the pattern and builds one or more shift tables based on heuristics. During matching, these tables tell the algorithm how far it can safely move the pattern after a mismatch, so it skips large parts of the source in a single jump. As a rule, the longer the pattern, the larger the jumps and the fewer comparisons are needed.
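
    The idea of jumping with a preprocessed table can be sketched in a few lines of Python. This is a minimal illustration that assumes only one of the heuristics (commonly called the bad-character rule); it is not the full Boyer-Moore algorithm and not the implementation from this series. The function names and the comparison counter are made up for the example.

```python
def bad_character_table(pattern):
    """Map each character to the index of its last occurrence in pattern."""
    return {ch: i for i, ch in enumerate(pattern)}


def search_bad_character(text, pattern):
    """Return (match positions, number of character comparisons).

    Simplified sketch: uses the bad-character rule only.
    """
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return [], 0
    last = bad_character_table(pattern)
    matches, comparisons = [], 0
    s = 0  # current alignment: pattern[0] sits under text[s]
    while s <= n - m:
        j = m - 1  # compare right to left, starting at the last character
        while j >= 0:
            comparisons += 1
            if pattern[j] != text[s + j]:
                break
            j -= 1
        if j < 0:
            # Every character matched at this alignment.
            matches.append(s)
            s += 1
        else:
            # Line up the mismatched text character with its last occurrence
            # in the pattern, or jump past it if it does not occur at all.
            s += max(1, j - last.get(text[s + j], -1))
    return matches, comparisons


if __name__ == "__main__":
    text = "x" * 20 + "needle"
    positions, comparisons = search_bad_character(text, "needle")
    print(positions, comparisons, len(text))
```

    For this 26-character source, the sketch finds the pattern at position 20 after 10 comparisons: most alignments are rejected after looking at a single character, and the pattern then jumps ahead by up to its full length.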

    Compared to the Knuth-Morris-Pratt pattern matching algorithm, which runs in linear time, Boyer-Moore is usually sub-linear: on typical inputs it examines fewer than n characters of the source. This is shown mathematically in the original publication. Knuth also pointed out that the Boyer-Moore algorithm becomes linear in the worst case. As a result, an efficient implementation tends to give the best overall results in terms of both running time and resources.

    Let’s move on to the theory.
