Introduction to the Java.util.regex Object Model
By: Apress Publishing
    2005-08-18

    Table of Contents:
  • Introduction to the Java.util.regex Object Model
  • public static Pattern compile(String regex, int flags) Throws a PatternSyntaxException
  • public String[] split(CharSequence input)
  • The Matcher Object
  • public int start(int group)
  • public int end(int group)
  • public String group(int group)
  • public boolean find()
  • public Matcher appendReplacement (StringBuffer sb, String replacement)
  • Special Notes
  • New String Regex-Friendly Methods
  • Summary



    Introduction to the Java.util.regex Object Model - public int end(int group)


    (Page 6 of 12 )

    Like the start(int) method, this method allows you to specify which subgroup within a match you’re interested in. It returns the index of the last character of that group’s matching character sequence, plus 1. Listing 2-12, which appears later in this section, demonstrates the usage of the end(int) method.

    In the following example, the regex pattern is B(on)d, which means you have a subgroup within the pattern. The area that has been examined by the Matcher after find() is initially called is highlighted in the box shown in the following image:

     

    By calling the end(0) method, you’re implicitly calling it only for the region that has already been parsed, which is boxed in the preceding image. As far as the Matcher is currently concerned, this boxed region is the only one we can discuss at present.

    The end(0) method returns the index of the last character in group(0) plus 1. Remember that group(0) is the text matched by the entire expression B(on)d. In this region, the last character is the d in Bond, which is at position 14. Because end(int) adds 1 to that last index, 15 is returned. group(0) is circled in the following image:

     

    Similarly, when you call end(1), you’re calling it only for the region that has already been parsed, which is again the boxed region. This time, you’re asking for the first subgroup in that region, group(1), which is the second group if you count group(0). The end(1) method returns the index of the last character in group(1) plus 1. The last character in group(1) is the n in Bond, because the pattern is B(on)d, and the index of that n is 13. Because end(int) adds 1 to the index, 14 is returned. group(1) is circled in the following image:

     

    Next, you call matcher.find() again, which results in a new region of the candidate String coming under consideration, as shown here:

     

    Calling the end(0) method implicitly calls it only for the new region that has already been parsed, which is boxed in the preceding image. The end(0) method returns the index of the last character in group(0) plus 1. That last character is the d in the second Bond. The index of that d is 26, and because end(int) adds 1 to that number, 27 is returned. group(0) is circled in the following image:

     

    Calling end(1) only considers the new region that has been parsed, which is again the boxed region. This time, you’re asking for the first subgroup, group(1), in the parsed region. The end(1) method returns the index of the last character in group(1) plus 1. That last character is the n in the second Bond, which is at index 25. Because end(int) adds 1 to that number, 26 is returned. group(1) is circled in the following image:

     

    Please refer back to the preceding images as necessary when you read Listing 2-12. The listing is simply a fully working example of the steps you just went through.

    Listing 2-12. Matcher.end(int) Example

    import java.util.regex.*;

    /**
     * Demonstrates the usage of the
     * Matcher.end(int) method
     */
    public class MatcherEndParamExample {
      public static void main(String args[]) {
        test();
      }

      public static void test() {
        // create a Pattern
        Pattern p = Pattern.compile("B(on)d");
        // create a Matcher and use the Matcher.end(int) method
        String candidateString = "My name is Bond. James Bond.";
        // create a helpful index for the sake of output:
        // each caret sits at the column that end(int) returns
        String matchHelper[] =
            {"               ^",
             "              ^",
             "                           ^",
             "                          ^"};
        Matcher matcher = p.matcher(candidateString);

        // find the end point of the first 'B(on)d'
        matcher.find();
        int endIndex = matcher.end(0);
        System.out.println(candidateString);
        System.out.println(matchHelper[0] + endIndex);

        // find the end point of the subgroup (on) in the first match
        int nextIndex = matcher.end(1);
        System.out.println(candidateString);
        System.out.println(matchHelper[1] + nextIndex);

        // find the end point of the second 'B(on)d'
        matcher.find();
        endIndex = matcher.end(0);
        System.out.println(candidateString);
        System.out.println(matchHelper[2] + endIndex);

        // find the end point of the subgroup (on) in the second match
        nextIndex = matcher.end(1);
        System.out.println(candidateString);
        System.out.println(matchHelper[3] + nextIndex);
      }
    }

    Output 2-7 shows the output of running Listing 2-12.

    Output 2-7. Output for the Matcher.end(int) Example

    My name is Bond. James Bond.
                   ^15
    My name is Bond. James Bond.
                  ^14
    My name is Bond. James Bond.
                               ^27
    My name is Bond. James Bond.
                              ^26
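The same four indexes can also be gathered in a loop that checks the result of find() before asking for any index. The following is only a sketch: the class name EndIndexSketch and the helper endIndexes are hypothetical names introduced for illustration, and the code assumes Java 7 or later for the diamond operator. It also checks that end(g) is always start(g) plus the length of group(g), which is where the "plus 1" in the walkthrough above comes from.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original listing.
public class EndIndexSketch {

    // Collects {end(0), end(1)} for every match of B(on)d in the input.
    static List<int[]> endIndexes(String input) {
        Matcher m = Pattern.compile("B(on)d").matcher(input);
        List<int[]> result = new ArrayList<>();
        while (m.find()) { // only query indexes after a successful find()
            // end(g) equals start(g) plus the length of the group's text
            if (m.end(1) != m.start(1) + m.group(1).length()) {
                throw new AssertionError("end(1) should be start(1) + length");
            }
            result.add(new int[] {m.end(0), m.end(1)});
        }
        return result;
    }

    public static void main(String[] args) {
        for (int[] ends : endIndexes("My name is Bond. James Bond.")) {
            System.out.println("end(0)=" + ends[0] + ", end(1)=" + ends[1]);
        }
        // prints:
        // end(0)=15, end(1)=14
        // end(0)=27, end(1)=26
    }
}
```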

    If you execute another find() method

    matcher.find();

    and then call end(0)

    int nonIndex = matcher.end(0); //throws IllegalStateException

    the call throws an IllegalStateException, because the candidate String contains no third match and that find() returns false. In general, the end(int) method throws an IllegalStateException if the most recent find() wasn’t successful or if no match was attempted in the first place. Similarly, it throws an IndexOutOfBoundsException if you try to refer to a group number that doesn’t exist in the pattern.
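A short sketch of both failure modes follows. The class name EndExceptionSketch and the helper probe are hypothetical and exist only for this illustration; the multi-catch syntax assumes Java 7 or later.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original listing.
public class EndExceptionSketch {

    // Returns the simple name of the exception thrown by end(group), or "none".
    static String probe(Matcher m, int group) {
        try {
            m.end(group);
            return "none";
        } catch (IllegalStateException | IndexOutOfBoundsException e) {
            return e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        Matcher m = Pattern.compile("B(on)d").matcher("My name is Bond. James Bond.");

        System.out.println(probe(m, 0)); // no find() yet: IllegalStateException

        m.find();                        // first match succeeds
        System.out.println(probe(m, 1)); // group 1 exists: none
        System.out.println(probe(m, 2)); // pattern has no group 2: IndexOutOfBoundsException

        m.find();                        // second match succeeds
        m.find();                        // third find() fails, no match is available
        System.out.println(probe(m, 0)); // IllegalStateException
    }
}
```

Checking the boolean returned by find() before calling end(int) avoids the IllegalStateException entirely.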

    public String group()

    The group method can be a powerful and convenient tool in the war against jumbled code. It simply returns the substring of the candidate String that matches the original regex pattern. For example, say you want to extract occurrences of the pattern Bond

    Pattern p = Pattern.compile("Bond");

    from the candidate String "My name is Bond. James Bond." You obtain the Matcher

    Matcher matcher = p.matcher("My name is Bond. James Bond.");

    and call find() on it.

    matcher.find();

    Now the boxed region in the following image is ready to be scrutinized by the Matcher:

     

    You can now extract the part of the candidate String that matches your criteria by using the group() method:

    String tmp = matcher.group(); // returns "Bond"

    This method extracts the matching part of the region under consideration. That area is circled in the following image:

     

    A clumsier way of achieving the same result is to use the start and end methods to find the starting and ending indexes of the group within the candidate String, and use a String.substring method to extract that text.
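To make the comparison concrete, here is a minimal sketch of both routes side by side. The class name GroupVersusSubstring and its two helper methods are hypothetical names chosen for this illustration; both helpers return null when the pattern isn't found.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original listing.
public class GroupVersusSubstring {
    private static final Pattern BOND = Pattern.compile("Bond");

    // Direct route: ask the Matcher for the matched text.
    static String viaGroup(String candidate) {
        Matcher m = BOND.matcher(candidate);
        return m.find() ? m.group() : null;
    }

    // Clumsier route: slice the candidate String using the match boundaries.
    static String viaSubstring(String candidate) {
        Matcher m = BOND.matcher(candidate);
        return m.find() ? candidate.substring(m.start(), m.end()) : null;
    }

    public static void main(String[] args) {
        String candidate = "My name is Bond. James Bond.";
        System.out.println(viaGroup(candidate));     // Bond
        System.out.println(viaSubstring(candidate)); // Bond
    }
}
```

The substring version works because end() is exclusive (last index plus 1), which is exactly the convention String.substring expects for its second argument.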

    The group() method will throw an IllegalStateException if the find() method is unsuccessful or if it’s never initially called. Listing 2-13 presents a complete working example of this method and the algorithm discussed.

    Listing 2-13. The Matcher.group() Method

    import java.util.regex.*;

    /**
     * Demonstrates the usage of the
     * Matcher.group() method
     */
    public class MatcherGroupExample {
      public static void main(String args[]) {
        test();
      }

      public static void test() {
        // create a Pattern
        Pattern p = Pattern.compile("Bond");

        // create a Matcher and use the Matcher.group() method
        String candidateString = "My name is Bond. James Bond.";
        Matcher matcher = p.matcher(candidateString);

        // extract the group
        matcher.find();
        System.out.println(matcher.group());
      }
    }
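Listing 2-13 extracts only the first match. As a rough sketch of how the same calls extend to every occurrence, the loop below keeps calling find() until it returns false and collects each group() result. The class name AllGroupsSketch and the helper allMatches are hypothetical, and the code assumes Java 7 or later.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original listing.
public class AllGroupsSketch {

    // Returns the text of every match, in order of appearance.
    static List<String> allMatches(String regex, String candidate) {
        Matcher m = Pattern.compile(regex).matcher(candidate);
        List<String> found = new ArrayList<>();
        while (m.find()) {        // find() returns false once no match remains
            found.add(m.group()); // safe: only reached after a successful find()
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(allMatches("Bond", "My name is Bond. James Bond.")); // [Bond, Bond]
        System.out.println(allMatches("Bond", "No match here."));               // []
    }
}
```

Because group() is called only inside the loop body, it can never throw the IllegalStateException described above.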



     

    This article is excerpted from chapter three of Java Regular Expressions: Taming the java.util.regex Engine, written by Mehran Habibi (Apress, 2004; ISBN: 1590591070).








    © 2003-2008 by Developer Shed. All rights reserved. DS Cluster 3 hosted by Hostway