JAVA

Regular Expressions
By: Apress Publishing
    2005-07-28

    Table of Contents:
  • Regular Expressions
  • Creating Patterns
  • Common and Boundary Characters
  • Character Classes
  • Back References
  • Integrating Java with Regular Expressions
  • Confirming Name Formats Example
  • Finding Duplicate Words Example
  • Regular Expression Operations
  • Search and Replace
  • Comparing Regex and Perl



    Regular Expressions - Back References


    (Page 5 of 11)

    Back references are one of the most powerful features offered by regular expressions. Unfortunately, programmers often skip over them because they’re not explained well in the regular expression literature. That’s a mistake I hope to rectify here.

    Back references allow a pattern to refer back to parts of itself. They always refer to groups, that is, to subpatterns enclosed by the “(” and “)” characters. Table 1-17 presents the syntax for back references.

    Table 1-17. Back References

    Regex    Description
    \1       The first group in the pattern
    \2       The second group in the pattern
    \n       The nth group in the pattern

    NOTE   There are some idiosyncratic behaviors associated with how back references work in Java, which I explain later in this chapter and in Chapter 3. For right now, you have enough information on back references to get started.
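
    To make the syntax in Table 1-17 concrete, here is a minimal sketch using java.util.regex. The class name BackReferenceSyntax and the helper matchesWhole are hypothetical names chosen for illustration; they aren't part of the original text. Note that each backslash is doubled inside a Java string literal.

```java
import java.util.regex.Pattern;

public class BackReferenceSyntax {

    // Hypothetical helper: true only if the entire input matches the regex.
    static boolean matchesWhole(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        // (\w)\1 : one word character, then the same character again.
        System.out.println(matchesWhole("(\\w)\\1", "aa"));            // true
        System.out.println(matchesWhole("(\\w)\\1", "ab"));            // false

        // (\w)(\w)\2\1 : two captured characters, then group 2, then group 1.
        System.out.println(matchesWhole("(\\w)(\\w)\\2\\1", "abba"));  // true
        System.out.println(matchesWhole("(\\w)(\\w)\\2\\1", "abab"));  // false
    }
}
```

    The second pattern shows that \1 and \2 refer to the text each group actually captured, not to the subpattern itself: (\w)(\w) could capture any two characters, but the back references then demand those same two characters in reverse order.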

    Back References Example

    Say you need to find matches in which a word is duplicated. That is, you don’t know what the word you’re looking for is, but you want to be alerted when the same word is repeated twice in a row. If you’ve used a word processor such as Microsoft Word, you’ll notice that the application does this automatically. Let’s explore how you might do this in Java.

    You’ll use the pattern \b(\w+) \1\b, which is dissected in Table 1-18. This pattern matches pizza pizza, Faster pussycat kill kill, or Never Never Never Never Never because each contains a word that’s immediately repeated. It won’t match 222 2222, sara sarah, or Faster pussycat kill, kill because none of these contains a word that’s immediately repeated. Specifically, 222 2222 has a lingering 2 in the second set, sara sarah has a lingering h in the second word, and in Faster pussycat kill, kill the second kill is separated from the first by a comma as well as a space.
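
    As a quick check of the claims above, the following sketch runs the pattern against each sample string. The class name DuplicateWordCheck and the method hasDuplicateWord are hypothetical names used only for this illustration. It uses Matcher.find(), which looks for the pattern anywhere in the input rather than requiring the whole input to match.

```java
import java.util.regex.Pattern;

public class DuplicateWordCheck {

    // The pattern from the text; backslashes are doubled for the Java string literal.
    private static final Pattern DUPLICATE = Pattern.compile("\\b(\\w+) \\1\\b");

    // Hypothetical helper: true if the text contains an immediately repeated word.
    static boolean hasDuplicateWord(String text) {
        return DUPLICATE.matcher(text).find();
    }

    public static void main(String[] args) {
        String[] samples = {
            "pizza pizza",                    // repeated word
            "Faster pussycat kill kill",      // repeated word at the end
            "Never Never Never Never Never",  // repeated several times
            "222 2222",                       // lingering 2
            "sara sarah",                     // lingering h
            "Faster pussycat kill, kill"      // comma between the words
        };
        for (String sample : samples) {
            System.out.println(sample + " -> " + hasDuplicateWord(sample));
        }
    }
}
```

    The first three samples should report true and the last three false. The trailing \b is what rejects 222 2222 and sara sarah: after the repeated text, the next character is still a word character, so there is no word boundary at that position.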

    Table 1-18. The Pattern \b(\w+) \1\b *

    Regex      Description
    \b         A word boundary
    (          Followed by a group consisting of
    \w         Any word character (a letter, digit, or underscore)
    +          Repeated one or more times
    )          Close group
    <space>    Followed by a space
    \1         Followed by the exact group of characters captured previously
    \b         Followed by a word boundary

    * In English: Look for a word boundary, followed by a group of alphanumeric characters, followed by a space, followed by the exact same group of alphanumeric characters found previously, followed by a word boundary. In short, look for duplicate words.
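
    The dissection above can also be written into the pattern itself. The sketch below is one way to do that, assuming Pattern.COMMENTS mode, which ignores whitespace in the pattern and treats # as the start of a comment. Because plain whitespace is ignored in that mode, the literal space is written as \x20 here. The class name AnnotatedDuplicatePattern and the method firstDuplicate are hypothetical names for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AnnotatedDuplicatePattern {

    // Same pattern as \b(\w+) \1\b, spread over lines with one comment per piece.
    static final Pattern DUPLICATE = Pattern.compile(
        "\\b      # a word boundary\n" +
        "(\\w+)   # group 1: one or more word characters\n" +
        "\\x20    # a literal space, written as hex 20\n" +
        "\\1      # the exact text captured by group 1\n" +
        "\\b      # a word boundary",
        Pattern.COMMENTS);

    // Hypothetical helper: returns the first immediately repeated word, or null if none.
    static String firstDuplicate(String text) {
        Matcher matcher = DUPLICATE.matcher(text);
        return matcher.find() ? matcher.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(firstDuplicate("Faster pussycat kill kill")); // kill
        System.out.println(firstDuplicate("sara sarah"));                // null
    }
}
```

    Calling group(1) on the matcher returns the text that the first group captured, which is the same text that \1 had to match again. That makes it easy to report which word was duplicated, not just that a duplicate exists.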

     

     

    In the next section, you’ll examine some practical examples with corresponding Java code.



     

    This article is excerpted from Java Regular Expressions: Taming the java.util.regex Engine, written by Mehran Habibi (Apress, 2004; ISBN: 1590591070).
