Introduction to the Java.util.regex Object Model


If you have ever wanted to know all about the Pattern and Matcher classes of Java's new java.util.regex package, this article is an excellent place to start. It is taken from chapter 2 of the book Java Regular Expressions: Taming the java.util.regex Engine, written by Mehran Habibi (Apress, 2004; ISBN: 1590591070).

Author Info:
By: Apress Publishing
August 18, 2005
TABLE OF CONTENTS:
  1. · Introduction to the Java.util.regex Object Model
  2. · public static Pattern compile(String regex, int flags) Throws a PatternSyntaxException
  3. · public String[] split(CharSequence input)
  4. · The Matcher Object
  5. · public int start(int group)
  6. · public int end(int group)
  7. · public String group(int group)
  8. · public boolean find()
  9. · public Matcher appendReplacement (StringBuffer sb, String replacement)
  10. · Special Notes
  11. · New String Regex-Friendly Methods
  12. · Summary

Introduction to the Java.util.regex Object Model - public String group(int group)
(Page 7 of 12 )

This method is a more powerful counterpart to the group() method. It allows you to extract parts of a candidate String that match a subgroup within your pattern. The use of the group(int) method is demonstrated shortly in Listing 2-14.

In the following example, the regex pattern is again B(ond), which means you have a subgroup within the pattern. The portion of the candidate String parsed when find() is called for the first time is shown here:

 

Thus, when you call the group(0) method, you're implicitly calling it only for the region that has already been parsed, which is boxed in the preceding image. As far as the Matcher is currently concerned, this boxed region is the only one we can discuss.

Calling group(0) returns Bond because group 0 always refers to the entire match found in the region of the candidate String currently under inspection. Again, that area is shown in the box in the preceding image. The actual matching group is shown in the following image:

 

Similarly, when you call group(1), you're calling it only for the region that has already been parsed (again, the boxed area). This time, you're asking for the second grouping in the parsed region, that is, the text captured by the subgroup (ond). group(1) is circled in the following image:

 

Next, you call matcher.find() again, which results in a new region of the candidate String coming under inspection, as shown here:

 

Calling the group(0) method implicitly calls it only for the new region that has already been parsed, which is boxed in the preceding image. The group(0) method returns the String Bond. group(0) is circled in the following image:

 

Calling group(1) only considers the new region that has been parsed (again, the boxed region). Within that region, group(1) refers to ond. group(1) is circled in the following image:
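
The walkthrough above can also be followed in code. The following is a minimal sketch, not part of the original listings: the class name GroupSpanExample and the helper describeMatches are hypothetical, chosen only for illustration. It reports, for every successful find(), the text and the character offsets of group(0) and group(1), using the start(int) and end(int) methods described earlier in this article.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GroupSpanExample {

    // Hypothetical helper: builds one description line per successful find().
    static List<String> describeMatches(String regex, String candidate) {
        List<String> lines = new ArrayList<>();
        Matcher matcher = Pattern.compile(regex).matcher(candidate);
        while (matcher.find()) {
            // group(0) is the whole match; group(1) is the first subgroup.
            // end(int) is exclusive, hence the half-open interval notation.
            lines.add("group(0)=" + matcher.group(0)
                    + " [" + matcher.start(0) + "," + matcher.end(0) + ")"
                    + " group(1)=" + matcher.group(1)
                    + " [" + matcher.start(1) + "," + matcher.end(1) + ")");
        }
        return lines;
    }

    public static void main(String[] args) {
        String candidate = "My name is Bond. James Bond.";
        for (String line : describeMatches("B(ond)", candidate)) {
            System.out.println(line);
        }
        // prints:
        // group(0)=Bond [11,15) group(1)=ond [12,15)
        // group(0)=Bond [23,27) group(1)=ond [24,27)
    }
}
```

Each line corresponds to one of the regions discussed above: the first find() yields the Bond that starts at index 11, and the second yields the Bond that starts at index 23.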

 

Listing 2-14 presents an example using the group(int) method, and Output 2-8 shows the output of this example.

Listing 2-14. Matcher.group(int) Method Example

import java.util.regex.*;

/**
 * Demonstrates the usage of the
 * Matcher.group(int) method
 */
public class MatcherGroupParamExample{
  public static void main(String args[]){
    test();
  }
  public static void test(){
    //create a Pattern
    Pattern p = Pattern.compile("B(ond)");

    //create a Matcher and use the Matcher.group(int) method
    String candidateString = "My name is Bond. James Bond.";
    Matcher matcher = p.matcher(candidateString);

    //Find groups 0 and 1 of the first find
    matcher.find();
    String group_0 = matcher.group(0);
    String group_1 = matcher.group(1);
    System.out.println("Group 0 " + group_0);
    System.out.println("Group 1 " + group_1);
    System.out.println(candidateString);

    //Find groups 0 and 1 of the second find
    matcher.find();
    group_0 = matcher.group(0);
    group_1 = matcher.group(1);
    System.out.println("Group 0 " + group_0);
    System.out.println("Group 1 " + group_1);
    System.out.println(candidateString);
  }
}

Output 2-8. Output of the Matcher.group(int) Example

Group 0 Bond
Group 1 ond
My name is Bond. James Bond.
Group 0 Bond
Group 1 ond
My name is Bond. James Bond.

If you execute another find() method

matcher.find();

and then execute group(0)

String tmp = matcher.group(0); //throws IllegalStateException

the group(0) method will throw an IllegalStateException because the find method call wasn't successful. Similarly, it will throw an IllegalStateException if find hadn't been called at all. If you try to refer to a group number that doesn't exist, it will throw an IndexOutOfBoundsException.
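
One way to stay clear of these exceptions is to check the boolean that find() returns before asking for a group. The sketch below is an illustration only: SafeGroupExample and its nextGroup helper are hypothetical names, not part of the original listings. It also triggers both exceptions deliberately so you can see when each one occurs.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SafeGroupExample {

    // Hypothetical helper: advances to the next match and returns the
    // requested group, or null when find() reports no further match.
    static String nextGroup(Matcher matcher, int group) {
        if (!matcher.find()) {
            return null; // calling group() here would throw IllegalStateException
        }
        return matcher.group(group);
    }

    public static void main(String[] args) {
        Pattern p = Pattern.compile("B(ond)");
        Matcher matcher = p.matcher("My name is Bond. James Bond.");

        System.out.println(nextGroup(matcher, 1)); // prints ond
        System.out.println(nextGroup(matcher, 1)); // prints ond
        System.out.println(nextGroup(matcher, 1)); // prints null (third find fails)

        try {
            matcher.group(0); // the last find() failed, so there is no current match
        } catch (IllegalStateException e) {
            System.out.println("IllegalStateException after a failed find()");
        }

        matcher.reset();
        matcher.find(); // succeeds again after the reset
        try {
            matcher.group(2); // the pattern defines only group 1
        } catch (IndexOutOfBoundsException e) {
            System.out.println("IndexOutOfBoundsException for group 2");
        }
    }
}
```

Returning null is just one possible convention for "no match"; the point is that the result of find() is inspected before group(int) is called.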

public int groupCount()

This method simply returns the number of groups that the Pattern defined. In Listing 2-15, the groupCount method displays the number of possible groups a given pattern might have.

Listing 2-15. MatcherGroupCountExample Example

import java.util.regex.*;

/**
 * Demonstrates the usage of the
 * Matcher.groupCount() method
 */
public class MatcherGroupCountExample{
  public static void main(String args[]){
    test();
  }
  public static void test(){
    //create a Pattern
    Pattern p = Pattern.compile("B(ond)");

    //create a Matcher and use the Matcher.groupCount() method
    String candidateString = "My name is Bond. James Bond.";
    Matcher matcher = p.matcher(candidateString);

    //extract the possible number of groups.
    //It's important to be aware that this
    //represents only the number of groups that
    //are possible: not the actual number of groups
    //found in the candidate string
    int numberOfGroups = matcher.groupCount();

    System.out.println("numberOfGroups =" + numberOfGroups);
  }
}

There's a very important, and somewhat counterintuitive, subtlety to notice about this method. It returns the number of possible groups based on the original Pattern, without even considering the candidate String. Thus, it's not really information about the Matcher object; rather, it's information about the Pattern that helped spawn it. This can be tricky, because the fact that this method lives on the Matcher object could be interpreted to mean that it's providing feedback about the state of the Matcher. It just isn't. It's telling you how many subgroups the given Pattern defines, and therefore how many groups a match could theoretically contain. Group 0, the entire match, is not included in this count.
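
To see that the result depends only on the Pattern, consider the following short sketch. The class name GroupCountExample and the helper countGroups are hypothetical and exist only for this illustration. Notice that find() is never called, and that the candidate String has no influence on the number returned.

```java
import java.util.regex.Pattern;

public class GroupCountExample {

    // Hypothetical helper: asks a fresh Matcher for its group count
    // without ever attempting a match.
    static int countGroups(String regex, String candidate) {
        return Pattern.compile(regex).matcher(candidate).groupCount();
    }

    public static void main(String[] args) {
        // Same pattern, different candidates: the count does not change.
        System.out.println(countGroups("B(ond)", "My name is Bond. James Bond.")); // prints 1
        System.out.println(countGroups("B(ond)", "no match in here"));             // prints 1

        // Nested groups are counted by their opening parentheses.
        System.out.println(countGroups("(J)(ames) (B(ond))", ""));                 // prints 4

        // No parentheses means no subgroups; group 0 is not counted.
        System.out.println(countGroups("Bond", "Bond"));                           // prints 0
    }
}
```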

public boolean matches()

This method is designed to help you match a candidate String against the Matcher's Pattern. It returns true if, and only if, the entire candidate String under consideration matches the pattern exactly.

Listing 2-16 demonstrates how you might use this method. Three strings, "j2se", "J2SE " (notice the space after the E), and "J2SE", are compared to the Pattern J2SE.

Listing 2-16. Matcher.matches Example

import java.util.regex.*;

/**
 * Demonstrates the usage of the
 * Matcher.matches method
 */
public class MatcherMatchesExample{
  public static void main(String args[]){
    test();
  }
  public static void test(){
    //create a Pattern
    Pattern p = Pattern.compile("J2SE");

    //create the candidate Strings
    String candidateString_1 = "j2se";
    String candidateString_2 = "J2SE ";
    String candidateString_3 = "J2SE";

    //Attempt to match the candidate Strings.
    Matcher matcher_1 = p.matcher(candidateString_1);
    Matcher matcher_2 = p.matcher(candidateString_2);
    Matcher matcher_3 = p.matcher(candidateString_3);

    //display the output for first candidate
    String msg = ":" + candidateString_1 + ": matches?: ";
    System.out.println(msg + matcher_1.matches());

    //display the output for second candidate
    msg = ":" + candidateString_2 + ": matches?: ";
    System.out.println(msg + matcher_2.matches());

    //display the output for third candidate
    msg = ":" + candidateString_3 + ": matches?: ";
    System.out.println(msg + matcher_3.matches());
  }
}

Only one of the three candidates successfully matches here. "j2se" is rejected because it is in the wrong case. "J2SE " is also rejected because it contains a space character after the E, which means that it isn't a perfect match. The only perfect match is "J2SE".
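
It can help to compare matches() with find() on the same three candidates. The following sketch is an illustration rather than one of the book's listings; the class MatchesVersusFindExample and its helper methods are hypothetical names. It also shows, as an aside, that the Pattern.CASE_INSENSITIVE flag passed to Pattern.compile(String, int) lets "j2se" match.

```java
import java.util.regex.Pattern;

public class MatchesVersusFindExample {

    // Hypothetical helper: true only if the whole candidate matches the regex.
    static boolean matchesWhole(String regex, String candidate) {
        return Pattern.compile(regex).matcher(candidate).matches();
    }

    // Hypothetical helper: true if the regex matches anywhere in the candidate.
    static boolean findsAnywhere(String regex, String candidate) {
        return Pattern.compile(regex).matcher(candidate).find();
    }

    // Hypothetical helper: whole-candidate match that ignores case.
    static boolean matchesIgnoringCase(String regex, String candidate) {
        return Pattern.compile(regex, Pattern.CASE_INSENSITIVE)
                      .matcher(candidate).matches();
    }

    public static void main(String[] args) {
        String[] candidates = {"j2se", "J2SE ", "J2SE"};
        for (String candidate : candidates) {
            System.out.println(":" + candidate + ": matches()=" + matchesWhole("J2SE", candidate)
                    + " find()=" + findsAnywhere("J2SE", candidate));
        }
        // prints:
        // :j2se: matches()=false find()=false
        // :J2SE : matches()=false find()=true
        // :J2SE: matches()=true find()=true

        System.out.println(matchesIgnoringCase("J2SE", "j2se")); // prints true
    }
}
```

The second candidate is the interesting one: find() succeeds because J2SE occurs inside it, whereas matches() fails because the trailing space is left over.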

