Introduction to the Java.util.regex Object Model - public int start(int group)
(Page 5 of 12 )
This method allows you to specify which subgroup within a match you’re interested in. If there are no matches, or if no matches have been attempted, this method throws an IllegalStateException. Listing 2-10 demonstrates the use of the start(int) method shortly. But before examining the code, let’s take a step back and consider what the code is actually trying to demonstrate.
In the following example, the regex pattern is B(ond), which means that you have a subgroup within the pattern (the parentheses indicate a subgroup). The following is the portion of the candidate parsed when find() is called for the first time:

Thus, when you call the start(0) method, you’re implicitly calling it only for the region that has already been parsed, which is outlined in the box. As far as the Matcher is concerned, this boxed region is the only one we can currently discuss. This is simply the nature of the find method, and it has nothing to do with the start(int) method yet.
The start(0) method returns the index of the first character in group(0), which is the B in Bond. group(0) is circled in the following image.

Similarly, when you call start(1), you’re calling it only for the region that has already been parsed—again, the boxed region in the preceding image. This time, you’re asking for the second grouping in the parsed region. The start(1) method returns the index of the first character in group(1), which is the o in Bond. group(1) is circled in the following image:

Next, you call matcher.find() again, which results in a new region of the candidate string coming under consideration, as shown in the following image:

Calling the start(0) method here implicitly calls it only for the new region that has already been parsed, which appears in the box in the preceding image. This is the only region the associated Matcher will consider. start(0) returns the index of first character in group(0), which is the B in Bond. group(0) is circled in the following image:

Again, calling start(1) asks the Matcher to consider only the new region that has been parsed—again, the boxed region. This time, you’re asking for the second grouping in the parsed region. start(1) returns the index of first character in group(1), which is the o in Bond. group(1) is circled in the boxed region.

When you consider the process visually, it’s easy to understand how the start(int) method interacts with groups, group numbers, and the find() method. find() parses just enough of the candidate string for all groups to match and works in that limited region. Keep this in mind as you read through Listing 2-10. Listing 2-10 is a fully working example of the algorithm discussed in this section. Please refer back to the preceding images as necessary when you read the example.
Listing 2-10. Matcher.start(int) Example
import java.util.regex.*;
/**
* Demonstrates the usage of the
* Matcher.start(int) method
*/
public class MatcherStartParamExample{
public static void main(String args[]){
test();
}
public static void test(){
//create a Pattern
Pattern p = Pattern.compile("B(ond)");
//create a Matcher and use the Matcher.start(int) method
String candidateString = "My name is Bond. James Bond.";
//create a helpful index for the sake of output
String matchHelper[] =
{" ^",
" ^",
" ^",
" ^"};
Matcher matcher = p.matcher(candidateString);
//Find the starting point of the first 'B(ond)'
matcher.find();
int startIndex = matcher.start(0);
System.out.println(candidateString);
System.out.println(matchHelper[0] + startIndex);
//find the starting point of the first subgroup (ond)
int nextIndex = matcher.start(1);
System.out.println(candidateString);
System.out.println(matchHelper[1] + nextIndex);
//Find the starting point of the second 'B(ond)'
matcher.find();
startIndex = matcher.start(0);
System.out.println(candidateString);
System.out.println(matchHelper[2] + startIndex);
//find the starting point of the second subgroup (ond)
nextIndex = matcher.start(1);
System.out.println(candidateString);
System.out.println(matchHelper[3] + nextIndex);
}
}
Output 2-5 shows the output of running the start() method.
Output 2-5. Output for the Matcher.start(int) Example
--------------------------------------------------------------------My name is Bond. James Bond.
^11
My name is Bond. James Bond.
^12
My name is Bond. James Bond.
^23
My name is Bond. James Bond.
^24
If you execute another find() method
matcher.find();
and then execute start()
int nonIndex = matcher.start(0); //throws IllegalStateException
the start(int) method will throw an IllegalStateException because the find() method wasn’t successful. Similarly, it will throw an IndexOutOfBoundsException if you try to refer to a group number that doesn’t exist.
public int end() The end method returns the ending index of the last successful match the Matcher object had plus 1. If no matches exist, or if no matches have been attempted, this method throws an IllegalStateException. Listing 2-11 demonstrates the use of the end method.
Listing 2-11. Matcher.end() Example
/**
* Demonstrates the usage of the
* Matcher.end() method
*/
public class MatcherEndExample{
public static void main(String args[]){
test();
}
public static void test(){
//create a Matcher and use the Matcher.end() method
String candidateString = "My name is Bond. James Bond.";
String matchHelper[] =
{" ^"," ^"};
Pattern p = Pattern.compile("Bond");
Matcher matcher = p.matcher(candidateString);
//Find the end point of the first 'Bond'
matcher.find();
int endIndex= matcher.end();
System.out.println(candidateString);
System.out.println(matchHelper[0] + endIndex);
//Find the end point of the second 'Bond'
matcher.find();
int nextIndex = matcher.end();
System.out.println(candidateString);
System.out.println(matchHelper[1] + nextIndex);
}
}
Output 2-6 shows the output of running the end method.
Output 2-6. Output for the Matcher.end() Example
-------------------------------------------------------------------My name is Bond. James Bond.
^15
My name is Bond. James Bond.
^27
If you execute another find method
matcher.find();
and then execute end
int nonIndex = matcher.end(); //throws IllegalStateException
the end method will throw an IllegalStateException, because there isn’t a valid group to find the end of.
Next: public int end(int group) >>
More Java Articles
More By Apress Publishing
|
This article is excerpted from chapter three of Java Regular Expressions Taming the Java.util.regex Engine, written by Mehran Habibi (Apress, 2004; ISBN: 1590591070). Check it out at your favorite bookstore. Buy this book now.
|
|