
The Complete Regular Expression Guide

Thought regular expressions were too tough to master? If so, think again! In this article Jan runs you through everything you need to know, no matter which programming language you use!

Author Info:
By: Jan Borsodi
December 27, 2002
  1. · The Complete Regular Expression Guide
  2. · Usage
  3. · Assertions
  4. · Wildcards
  5. · Conclusion


The Complete Regular Expression Guide - Assertions
(Page 3 of 5 )

The next type of meta-character is the assertion. These will match if a given assertion is true. The first pair of assertions are

^ and $

...which match the beginning of the line and the end of the line. Note that some regular expression implementations allow you to change their behavior so that they instead match only the beginning and the end of the whole text. These assertions always match a zero-length string; in other words, they match a position. For instance, if you wrote this expression:

^The

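Then it would match any line that began with the characters "The" (including a line that began with "Then"). The next assertions match at the beginning and the end of a word. Some implementations write them as shown here, while others, such as Perl, use \b for both:

\< and \>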

They come in handy when you want to match a word precisely, for instance:


...would match any of the following words:


One last thing to be said is that all literal characters are in fact assertions themselves. The difference between those and the ones above is that literal characters have a size: they consume the characters they match. So for clarity's sake, we only use the word assertion for those that are zero-width.
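To make the assertions above concrete, here is a minimal sketch in Python. The choice of language and the sample text are my own assumptions, not part of the original examples. Python's re module has no \< and \> assertions, so the sketch uses \b on both sides of the word instead.

```python
import re

# Hypothetical sample text, three lines long.
text = "The cat sat.\nThen the other cat left.\nAt last, The End"

# With re.MULTILINE, ^ matches at the start of every line. Note that the
# second line begins with "Then", which also begins with "The".
line_starts = re.findall(r"^The", text, flags=re.MULTILINE)

# Without the flag, ^ only matches at the very start of the whole text.
text_starts = re.findall(r"^The", text)

# $ matches at the end of the text (or of each line with re.MULTILINE).
ends = re.findall(r"End$", text)

# \b is a zero-width word boundary; using it on both sides matches "cat"
# only when it stands alone as a whole word.
whole_words = re.findall(r"\bcat\b", "cat concat cats cat")

# Assertions are zero-width: matching ^ on its own yields an empty string.
empty = re.search(r"^", text).group(0)

print(line_starts, text_starts, ends, whole_words, repr(empty))
```

The two "cat" matches are the first and the last word of the test string; "concat" and "cats" are skipped because no word boundary sits on both sides of "cat" there.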

Groups and Alternation
One thing that you might have noticed when we explained quantifiers is that they only worked on the character to their left. Since this pretty much limits our expressions, I'll explain other uses for quantifiers. Quantifiers can also be used on meta-characters. Using them on assertions is pointless, since assertions are zero-width and matching one, two, three or more of them doesn't do us any good. However, the grouping and sequence meta-characters are perfect for being quantified. Let's start with grouping.

You can form groups (or sub expressions, as they are frequently called) by using the opening and closing parenthesis characters:

( and )

The ( character starts the sub expression and the ) character ends it. It is also possible to have one or more sub expressions inside a sub expression. The sub expression will match if the contents match. So mixing this with quantifiers and assertions you can do something like this:

( ?ho)+

...which matches all of the following lines:

ho ho
ho ho ho

Another use for sub expressions is to extract the portion of the text that a sub expression matched. This is often used in conjunction with sequences, which are discussed later.
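Both uses of sub expressions can be tried out with a short sketch, again in Python. The test strings and the second pattern are made up for illustration.

```python
import re

# The quantified group from the text: an optional space followed by "ho",
# repeated one or more times.
laugh = re.compile(r"( ?ho)+")

# fullmatch() succeeds only if the whole string fits the pattern.
candidates = ["ho", "ho ho", "ho ho ho", "ho hum"]
results = [bool(laugh.fullmatch(s)) for s in candidates]

# Extracting portions of a match: each pair of parentheses is numbered
# from the left, and group(0) is the whole match.
m = re.search(r"(ho) (hum)", "well, ho hum then")
whole, first, second = m.group(0), m.group(1), m.group(2)

print(results, whole, first, second)
```

The last candidate, "ho hum", fails because nothing in the group can match " hum".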

Next up are alternations, which allow you to match on more than one pattern. The alternation character looks like this:

|

Here's how we can use it:

Bill|Linus|Steve|Larry

The regular expression above would match either Bill, Linus, Steve or Larry, and mixing this with sub expressions and quantifiers, we can do something like this:


...which matches any of the following words but no other combinations:


I mentioned earlier in the article that not every sub expression has to match for the overall match to be successful. This can happen when you're using sub expressions together with alternations. For example:

((Donald|Dolly) Duck)|(Scrooge McDuck)

As you can see, only one of the two top-level sub expressions will match, either the left or the right, never both. This is sometimes handy when you want to run a complex pattern in one sub expression and, if it fails, try another one.
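The sketch below, in Python with made-up input strings, shows alternation on its own and then the Duck pattern from above. In Python, a sub expression that took no part in the match is reported as None, which makes it easy to see which side of the alternation matched.

```python
import re

# Plain alternation: any one of the four names.
names = re.compile(r"Bill|Linus|Steve|Larry")
found = names.findall("Linus and Larry met Steve, not Stephen.")

# Alternation between two top-level sub expressions.
ducks = re.compile(r"((Donald|Dolly) Duck)|(Scrooge McDuck)")

left = ducks.fullmatch("Dolly Duck")
right = ducks.fullmatch("Scrooge McDuck")

# groups() lists sub expressions 1, 2 and 3 in order of their opening
# parenthesis; the ones that did not take part are None.
left_groups = left.groups()
right_groups = right.groups()

print(found)
print(left_groups)
print(right_groups)
```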

Lastly, we have sequences (often called character classes), which define a set of characters, any one of which can match. Sometimes you don't want to match a word directly but would rather match something that resembles one. The sequence characters are

[ and ]

Any characters put inside the sequence brackets are treated as literal characters, even meta-characters. The main special characters are the -, which denotes a character range, and the ^, which negates the sequence when it is the first character inside the brackets (i.e. the sequence then matches any character that is not listed).

For example,

[a-z]

...will match any lowercase letter in the English alphabet (a to z). Another common sequence is

[a-zA-Z0-9]

...which matches any lowercase or uppercase letter in the English alphabet, as well as any digit. Sequences are also mixed with quantifiers and assertions to produce more elaborate searches. For example


...matches all whole words. This will match


...but will not match


Now, what if you wanted to find anything but complete words? The expression


...would find any run of characters that are not alphanumeric (remember that a ^ at the start of a sequence negates it, so the sequence matches any character that is not listed).
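Here is a small sketch of sequences in action, written in Python. The patterns that combine a sequence with the + quantifier are my own illustrations of the idea and are not necessarily the exact expressions used earlier; the sample text is also made up.

```python
import re

text = "Hello, World 42!"  # hypothetical sample text

# A single lowercase letter at a time.
lower = re.findall(r"[a-z]", "aBc1")

# One or more alphanumeric characters in a row.
alnum_runs = re.findall(r"[a-zA-Z0-9]+", text)

# A leading ^ negates the sequence: one or more characters that are
# NOT alphanumeric.
other_runs = re.findall(r"[^a-zA-Z0-9]+", text)

# Meta-characters lose their special meaning inside the brackets.
literals = re.findall(r"[.+*]", "a.b+c*d")

print(lower, alnum_runs, other_runs, literals)
```

Note how the two runs lists together cover every character of the sample text: whatever the first sequence skips, the negated one picks up.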

Some implementations of regular expressions allow you to use shorthand versions for commonly used sequences. They are:

\d, a digit [0-9]
\D, a non-digit [^0-9]
\w, a word character (alphanumeric, plus the underscore in most implementations) [a-zA-Z0-9_]
\W, a non-word character [^a-zA-Z0-9_]
\s, a whitespace [ \t\n\r\f]
\S, a non-whitespace [^ \t\n\r\f]
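The shorthand versions can be tried out the same way. This is a sketch in Python with a made-up sample string. I pass re.ASCII so that the shorthands stay close to the bracketed ASCII forms listed above; without that flag, Python also accepts non-English letters and digits. Python's \w includes the underscore.

```python
import re

sample = "Order 66 shipped\tto bay_7\n"  # hypothetical sample text

# Runs of digits.
digits = re.findall(r"\d+", sample, flags=re.ASCII)

# Runs of word characters; the underscore counts as one in Python.
words = re.findall(r"\w+", sample, flags=re.ASCII)

# Every single whitespace character, including the tab and the newline.
spaces = re.findall(r"\s", sample, flags=re.ASCII)

# The uppercase forms are the negations: runs of non-whitespace.
chunks = re.findall(r"\S+", sample, flags=re.ASCII)

print(digits, words, spaces, chunks)
```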
