
The Mechanics of Expression Processing

This article, the first of a four-part series, discusses how a regular expression engine works. It is excerpted from chapter four of the book Mastering Regular Expressions, Third Edition, written by Jeffrey E.F. Friedl (O'Reilly, 2006; ISBN: 0596528124). Copyright © 2006 O'Reilly Media, Inc. All rights reserved. Used with permission from the publisher. Available from booksellers or direct from O'Reilly Media.

Author Info:
By: O'Reilly Media
December 28, 2006
  1. The Mechanics of Expression Processing
  2. Regex Engine Types
  3. Rule 1: The Match That Begins Earliest Wins
  4. Rule 2: The Standard Quantifiers Are Greedy



The previous chapter started with an analogy between cars and regular expressions. The bulk of the chapter discussed features, regex flavors, and other “glossy brochure” issues of regular expressions. This chapter continues with that analogy, talking about the all-important regular-expression engine, and how it goes about its work.

Why would you care how it works? As we’ll see, there are several types of regex engines, and the type most commonly used—the type used by Perl, Tcl, Python, the .NET languages, Ruby, PHP, all Java packages I’ve seen, and more—works in such a way that how you craft your expression can influence whether it can match a particular string, where in the string it matches, and how quickly it finds the match or reports the failure. If these issues are important to you, this chapter is for you.
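As a small illustration of that claim, here is a minimal sketch using Java's standard `java.util.regex` package. The class name `AlternationOrder`, the helper `firstMatch`, and the sample text are made up for this example and do not come from the book. It shows two expressions that list the same alternatives in a different order and end up matching different text in the same string.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AlternationOrder {
    // Hypothetical helper: returns the text of the first match of regex
    // in input, or null if the regex does not match anywhere.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String text = "three tournaments won";

        // Same three alternatives, different order. java.util.regex tries
        // alternatives left to right and stops at the first one that works.
        System.out.println(firstMatch("to|tour|tournament", text)); // prints "to"
        System.out.println(firstMatch("tournament|tour|to", text)); // prints "tournament"
    }
}
```

Both expressions start matching at the same place in the string, but the first stops after two characters while the second consumes the whole word. With this kind of engine, how the expression is written decides which of the possible matches you get.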

Start Your Engines!

Let’s see how much I can milk this engine analogy. The whole point of having an engine is so that you can get from Point A to Point B without doing much work. The engine does the work for you so you can relax and enjoy the sound system. The engine’s primary task is to turn the wheels, and how it does that isn’t really a concern of yours. Or is it?

Two Kinds of Engines

Well, what if you had an electric car? They’ve been around for a long time, but they aren’t as common as gas cars because they’re hard to design well. If you had one, though, you would have to remember not to put gas in it. If you had a gasoline engine, well, watch out for sparks! An electric engine more or less just runs, but a gas engine might need some babysitting. You can get much better performance just by changing little things like your spark plug gaps, air filter, or brand of gas. Do it wrong and the engine’s performance deteriorates, or, worse yet, it stalls.

Each engine might do its work differently, but the end result is that the wheels turn. You still have to steer properly if you want to get anywhere, but that’s an entirely different issue.

New Standards

Let’s stoke the fire by adding another variable: the California Emissions Standards. Some engines adhere to California’s strict pollution standards, and some engines don’t. These aren’t really different kinds of engines, just new variations on what’s already around. The standard regulates a result of the engine’s work, the emissions, but doesn’t say anything about how the engine should go about achieving those cleaner results. So, our two classes of engine are divided into four types: electric (adhering and non-adhering) and gasoline (adhering and non-adhering).

Come to think of it, I bet that an electric engine can qualify for the standard without much change: the standard just “blesses” the clean results that are already par for the course. The gas engine, on the other hand, needs some major tweaking and a bit of re-tooling before it can qualify. Owners of this kind of engine need to pay particular care to what they feed it. Use the wrong kind of gas and you’re in big trouble.

The impact of standards

Better pollution standards are a good thing, but they require that the driver exercise more thought and foresight (well, at least for gas engines). Frankly, however, the standard doesn’t impact most people since all the other states still do their own thing and don’t follow California’s standard.

So, you realize that these four types of engines can be classified into three groups (the two kinds for gas, and electric in general). You know about the differences, and that in the end they all still turn the wheels. What you don’t know is what the heck this has to do with regular expressions! More than you might imagine.
