
Introduction to the Java.util.regex Object Model


If you have ever wanted to know all about the Pattern and Matcher classes of Java's new java.util.regex package, this article is an excellent place to start. It is taken from chapter 2 of the book Java Regular Expressions: Taming the java.util.regex Engine, written by Mehran Habibi (Apress, 2004; ISBN: 1590591070).

Author Info:
By: Apress Publishing
August 18, 2005
TABLE OF CONTENTS:
  1. · Introduction to the Java.util.regex Object Model
  2. · public static Pattern compile(String regex, int flags) Throws a PatternSyntaxException
  3. · public String[] split(CharSequence input)
  4. · The Matcher Object
  5. · public int start(int group)
  6. · public int end(int group)
  7. · public String group(int group)
  8. · public boolean find()
  9. · public Matcher appendReplacement (StringBuffer sb, String replacement)
  10. · Special Notes
  11. · New String Regex-Friendly Methods
  12. · Summary


Introduction to the Java.util.regex Object Model - public boolean find()
(Page 8 of 12 )

The find() method parses just enough of the candidate string to find a match. If such a substring is successfully found, then true is returned and find stops parsing the candidate. If no part of the candidate string matches the pattern, then find returns false.

Thus, for the pattern

Pattern p = Pattern.compile("Bond");

and candidate String My name is Bond. James Bond.

Matcher matcher = p.matcher("My name is Bond. James Bond.");

calling find() parses My name is Bond. James Bond. only as far as the end of the first Bond, that is, through the substring My name is Bond.

That substring is the part of the candidate that has been parsed; thus, it's the part that calls to the start, end, or group methods will be concerned with. Why? Because the find method only had to parse up to the d in Bond to find a match. Having accomplished that mission, the find method doesn't waste resources parsing the rest of the candidate String.

Calling find is a necessary preamble to using methods such as start, end, and group. Without first invoking find (or another match operation, such as matches or lookingAt) and getting true back, calling these methods will cause an IllegalStateException to be thrown.
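The following is a minimal sketch of that rule, using the Bond pattern and candidate from above. The class name FindStateSketch and the helpers groupThrowsBeforeFind and describe are made up for this illustration; only Pattern, Matcher, find, start, end, and group come from java.util.regex.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FindStateSketch {

    /** Hypothetical helper: true if group() throws because there is no current match. */
    public static boolean groupThrowsBeforeFind(Matcher m) {
        try {
            m.group();
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    /** Hypothetical helper: formats the current match as "start-end:text". */
    public static String describe(Matcher m) {
        return m.start() + "-" + m.end() + ":" + m.group();
    }

    public static void main(String[] args) {
        Pattern p = Pattern.compile("Bond");
        Matcher matcher = p.matcher("My name is Bond. James Bond.");

        // No match has been attempted yet, so group() is not allowed.
        System.out.println(groupThrowsBeforeFind(matcher)); // prints true

        // The first find() stops right after the d of the first Bond.
        if (matcher.find()) {
            System.out.println(describe(matcher)); // prints 11-15:Bond
        }
        // The second find() resumes from there and reaches the second Bond.
        if (matcher.find()) {
            System.out.println(describe(matcher)); // prints 23-27:Bond
        }
        // A third find() fails, and group() is off limits again.
        System.out.println(matcher.find());                 // prints false
        System.out.println(groupThrowsBeforeFind(matcher)); // prints true
    }
}
```

Note that end() returns the index just past the last matched character, which is why the first match is reported as 11-15 even though the d sits at index 14.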

One common use of this method is as the control condition in a while loop, so that the start, end, and group methods are never called when they might throw an IllegalStateException. Listing 2-17 is an example of a simple regular expression that loops through the String I love Java. Java is my favorite language. Java Java Java. and finds every occurrence of the pattern Java.

Listing 2-17. Using the find() Method

import java.util.regex.*;

/**
 * Demonstrates the usage of the
 * Matcher.find method
 */
public class MatcherFindExample {
  public static void main(String[] args) {
    test();
  }

  public static void test() {
    //create a Pattern
    Pattern p = Pattern.compile("Java");

    //create the candidate String
    String candidateString =
      "I love Java. Java is my favorite language. Java Java Java.";

    //Attempt to match the candidate String.
    Matcher matcher = p.matcher(candidateString);

    //loop through and display all matches
    while (matcher.find()) {
      System.out.println(matcher.group());
    }
  }
}

In this example, the candidate String is

String candidateString =
"I love Java. Java is my favorite language. Java Java Java.";

When the while loop is first entered, find() is immediately called on the Matcher. Each call parses only as much of the candidate as it needs: it picks up where the previous match ended and stops at the end of the next match. For this candidate, the five successful calls parse the following regions and find the following matches:

  1. Characters 0 through 10 are parsed; the match is the Java that starts at index 7.
  2. Characters 11 through 16 are parsed; the match is the Java that starts at index 13.
  3. Characters 17 through 46 are parsed; the match is the Java that starts at index 43.
  4. Characters 47 through 51 are parsed; the match is the Java that starts at index 48.
  5. Characters 52 through 56 are parsed; the match is the Java that starts at index 53.

The sixth call to find() reaches the end of the candidate without finding a match and returns false, which ends the loop.
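To check where each match falls, the loop from Listing 2-17 can record start() and end() instead of printing group(). This is only a sketch: the class name FindPositionsSketch and the helper matchSpans are invented for the example, and it uses generics, which are newer than the JDK the listing was written for.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FindPositionsSketch {

    /** Hypothetical helper: returns "start-end" for every match of regex in candidate. */
    public static List<String> matchSpans(String regex, String candidate) {
        List<String> spans = new ArrayList<>();
        Matcher matcher = Pattern.compile(regex).matcher(candidate);
        // Each find() resumes where the previous match ended.
        while (matcher.find()) {
            spans.add(matcher.start() + "-" + matcher.end());
        }
        return spans;
    }

    public static void main(String[] args) {
        String candidateString =
            "I love Java. Java is my favorite language. Java Java Java.";
        System.out.println(matchSpans("Java", candidateString));
        // prints [7-11, 13-17, 43-47, 48-52, 53-57]
    }
}
```

Each end index is also the point from which the next call to find() resumes its search.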

 

public boolean find(int start)

The find(int start) method works exactly like its overloaded counterpart, except for where it starts searching. The int parameter start simply tells the Matcher the character index at which to start its search.

Thus, for the candidate String I love Java. Java is my favorite language. Java Java Java. and the pattern Java, if you want to start searching at character index 11, you use the command find(11). The area parsed then runs from index 11 to the end of the next match, and the actual match is the second Java, which starts at index 13.

 

If the index given is negative or greater than the length of the candidate string, then this method will throw an IndexOutOfBoundsException. Thus, for the preceding candidate string, calling find(59) will cause an IndexOutOfBoundsException, because the length of the string is only 58. Calling find(58) is still legal; it simply returns false, because nothing is left to search.
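The boundary cases can be probed with a small sketch. It relies on the documented contract of Matcher.find(int), which throws when the start index is less than zero or greater than the length of the input; the candidate used here is 58 characters long. The class name FindStartBoundsSketch and the helper tryFind are assumptions made for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FindStartBoundsSketch {

    /** Hypothetical helper: reports what find(start) does for a given start index. */
    public static String tryFind(Matcher matcher, int start) {
        try {
            return matcher.find(start)
                ? "match at " + matcher.start()
                : "no match";
        } catch (IndexOutOfBoundsException e) {
            return "IndexOutOfBoundsException";
        }
    }

    public static void main(String[] args) {
        String candidateString =
            "I love Java. Java is my favorite language. Java Java Java.";
        Matcher matcher = Pattern.compile("Java").matcher(candidateString);

        System.out.println(candidateString.length()); // prints 58
        System.out.println(tryFind(matcher, 11));     // prints match at 13
        System.out.println(tryFind(matcher, 58));     // prints no match
        System.out.println(tryFind(matcher, 59));     // prints IndexOutOfBoundsException
        System.out.println(tryFind(matcher, -1));     // prints IndexOutOfBoundsException
    }
}
```

A start index equal to the length is accepted because it denotes the empty region at the very end of the input.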

You can also use this method to reset the starting point of the search. Each call to find(int) resets the Matcher before searching, so you could execute find(11) to start searching at character 11, and then use find(0) to start searching again at character 0.

Listing 2-18 provides an example for the candidate String I hate mice. I really hate MICE. and the pattern mice, in which the comparison is case insensitive. The code uses a case-insensitive comparison to demonstrate that the first match reported is, in fact, the one that occurs after character number 11.

Listing 2-18. Using the find(int) Method

import java.util.regex.*;

/**
 * Demonstrates the usage of the
 * Matcher.find(int) method
 */
public class MatcherFindParamExample {
  public static void main(String[] args) {
    test();
  }

  public static void test() {
    //create a Pattern
    Pattern p = Pattern.compile("mice", Pattern.CASE_INSENSITIVE);

    //create the candidate String
    String candidateString =
      "I hate mice. I really hate MICE.";

    //Attempt to match the candidate String.
    Matcher matcher = p.matcher(candidateString);

    //display the latter match
    System.out.println(candidateString);
    matcher.find(11);
    System.out.println(matcher.group());

    //display the earlier match
    System.out.println(candidateString);
    matcher.find(0);
    System.out.println(matcher.group());
  }
}

When you execute the find(11) method, the search region starts at character 11, so the match found is MICE, at index 27.

Next, you execute find(0), which moves the search index back to 0, so the match found is mice, at index 7.
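One detail worth seeing in code is what happens when a plain find() follows find(int): the search continues from the end of the match that find(int) located. The sketch below collects the start index of every match found that way. The class name FindResetSketch and the helper startsFrom are hypothetical, and the use of generics assumes a JDK newer than the one the listing targets.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FindResetSketch {

    /** Hypothetical helper: start indices of all matches found from index 'from' onward. */
    public static List<Integer> startsFrom(Matcher matcher, int from) {
        List<Integer> starts = new ArrayList<>();
        // find(int) resets the matcher, then searches from the given index.
        boolean found = matcher.find(from);
        while (found) {
            starts.add(matcher.start());
            // find() continues after the end of the previous match.
            found = matcher.find();
        }
        return starts;
    }

    public static void main(String[] args) {
        Pattern p = Pattern.compile("mice", Pattern.CASE_INSENSITIVE);
        Matcher matcher = p.matcher("I hate mice. I really hate MICE.");

        System.out.println(startsFrom(matcher, 11)); // prints [27]
        System.out.println(startsFrom(matcher, 0));  // prints [7, 27]
    }
}
```

The same Matcher serves both calls, which shows that find(0) really does rewind a search that had already run to the end of the candidate.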

 

public boolean lookingAt()

The lookingAt() method is a more relaxed version of the matches method. Starting at the beginning of the candidate, it compares only as much of the String against the Pattern as is necessary to achieve a match. If such a leading subsection exists, then this method returns true.

Thus, for the pattern J2SE

Pattern p = Pattern.compile("J2SE");

and the candidate J2SE is the only one for me

Matcher matcher_1 = p.matcher("J2SE is the only one for me");

the lookingAt method returns true. However, calling lookingAt() for the candidate string For me, it's J2SE, or nothing at all

Matcher matcher_2 = p.matcher("For me, it's J2SE, or nothing at all");

will return false, because the first part of For me, it's J2SE, or nothing at all doesn't match the pattern J2SE.

Like the matches method, the lookingAt method always starts looking at the candidate string at the beginning of the input sequence; unlike matches, the lookingAt method doesn't require that the entire input sequence be matched. If the match succeeds, then more information can be obtained by using the start, end, and group methods. Listing 2-19 provides an example of the lookingAt method's use.

Listing 2-19. Using the lookingAt Method

import java.util.regex.*;

/**
 * Demonstrates the usage of the
 * Matcher.lookingAt method
 */
public class MatcherLookingAtExample {
  public static void main(String[] args) {
    test();
  }

  public static void test() {
    //create a Pattern
    Pattern p = Pattern.compile("J2SE");

    //create the candidate Strings
    String candidateString_1 = "J2SE is the only one for me";
    String candidateString_2 =
      "For me, it's J2SE, or nothing at all";
    String candidateString_3 = "J2SEistheonlyoneforme";

    //Attempt to match the candidate Strings.
    Matcher matcher = p.matcher(candidateString_1);

    //display the output for the first candidate
    String msg = ":" + candidateString_1 + ": matches?: ";
    System.out.println(msg + matcher.lookingAt());

    //reuse the Matcher for the second candidate
    matcher.reset(candidateString_2);
    msg = ":" + candidateString_2 + ": matches?: ";
    System.out.println(msg + matcher.lookingAt());

    //reuse the Matcher for the third candidate
    matcher.reset(candidateString_3);
    msg = ":" + candidateString_3 + ": matches?: ";
    System.out.println(msg + matcher.lookingAt());

    /*
     *returns
     *:J2SE is the only one for me: matches?: true
     *:For me, it's J2SE, or nothing at all: matches?: false
     *:J2SEistheonlyoneforme: matches?: true
     */
  }
}
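To put lookingAt side by side with matches and find, here is a brief sketch that runs all three against the same candidate. The class name MatchModesSketch and the helper compare are invented for the comparison; the three Matcher methods and reset() are the real API.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MatchModesSketch {

    /** Hypothetical helper: reports the result of matches, lookingAt, and find. */
    public static String compare(String regex, String candidate) {
        Matcher m = Pattern.compile(regex).matcher(candidate);
        boolean whole = m.matches();     // entire input must match
        m.reset();
        boolean prefix = m.lookingAt();  // match must begin at index 0
        m.reset();
        boolean anywhere = m.find();     // match may begin anywhere
        return "matches=" + whole + " lookingAt=" + prefix + " find=" + anywhere;
    }

    public static void main(String[] args) {
        System.out.println(compare("J2SE", "J2SE"));
        // prints matches=true lookingAt=true find=true
        System.out.println(compare("J2SE", "J2SE is the only one for me"));
        // prints matches=false lookingAt=true find=true
        System.out.println(compare("J2SE", "For me, it's J2SE, or nothing at all"));
        // prints matches=false lookingAt=false find=true
    }
}
```

The calls to reset() keep the three checks independent, since find would otherwise resume from wherever the previous operation left off.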

