Introduction to the Java.util.regex Object Model - public static Pattern compile(String regex, int flags) Throws a PatternSyntaxException
The compile(String regex, int flags) method is a more powerful form of the compile(String) method. Its first parameter, regex, is a String that represents a regular expression, exactly as in the Pattern.compile(String regex) method presented earlier. For details on how you must format the String parameter, please see the “public static Pattern compile(String regex) Throws a PatternSyntaxException” section.
The flexibility of this compile method is fully realized by using the second parameter, int flags. The int flags parameter can consist of the following flags or a bit mask created by OR-ing combinations thereof:
- CANON_EQ
- CASE_INSENSITIVE
- COMMENTS
- DOTALL
- MULTILINE
- UNICODE_CASE
- UNIX_LINES
For example, if you want a match to be successful regardless of the case of the candidate String, then your pattern might look like the following:
Pattern p = Pattern.compile(regex, Pattern.CASE_INSENSITIVE);
You can combine the flags by using the | operator. For example, to achieve case-insensitive Unicode matches that include a comment, you might use the following:
Pattern p =
    Pattern.compile("t # a compound flag example",
        Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE |
        Pattern.COMMENTS);
The compile(String regex, int flags) method returns a Pattern object.
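As a rough sketch of how a combined bit mask changes matching, the example below compiles the same regex twice: once with all three flags and once with COMMENTS alone. The class name FlagCombinationSketch, the helper matchesWithFlags, and the sample input are illustrative assumptions, not part of the original text.

```java
import java.util.regex.Pattern;

class FlagCombinationSketch {
    // Hypothetical helper: compiles the regex with the given flags and
    // tests whether the entire input matches it.
    static boolean matchesWithFlags(String regex, int flags, String input) {
        return Pattern.compile(regex, flags).matcher(input).matches();
    }

    public static void main(String[] args) {
        // Under COMMENTS, whitespace and everything from # to the end of
        // the line are ignored, so this regex is effectively just "t".
        String regex = "t # a compound flag example";
        int combined = Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE | Pattern.COMMENTS;

        // Case-insensitive: "T" matches.
        System.out.println(matchesWithFlags(regex, combined, "T"));         // true
        // COMMENTS only: matching is case-sensitive again.
        System.out.println(matchesWithFlags(regex, Pattern.COMMENTS, "T")); // false
    }
}
```

Because each flag is a distinct bit, OR-ing them simply switches on each behavior independently; dropping one flag from the mask removes only that behavior.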
public String pattern()
This method returns a simple String representation of the regex that was compiled. It can sometimes be misleading in two ways. First, the string that’s returned doesn’t reflect any flags that were set when the pattern was compiled. Second, the regex you type in your source code isn’t always the pattern String you get back out. Specifically, the escaping that Java requires inside a string literal isn’t shown. Thus, if your original code was this:
Pattern p = Pattern.compile("\\d");
you should expect your output to be \d, with a single \ character.
A question naturally arises here: If this method strips out the original escaping, can you use the resulting String as a regular expression to compile another pattern? For example, does Listing 2-2 work?
Listing 2-2. Pattern Matching Example
import java.util.regex.*;
public class PatternMethodExample{
    public static void main(String args[]){
        reusePatternMethodExample();
    }
    public static void reusePatternMethodExample(){
        //match a single digit
        Pattern p = Pattern.compile("\\d");
        Matcher matcher = p.matcher("5");
        boolean isOk = matcher.matches();
        System.out.println("original pattern matches " + isOk);
        //recycle the pattern: compile a second Pattern from the first one's String
        String tmp = p.pattern();
        Pattern p2 = Pattern.compile(tmp);
        //test the recycled pattern, not the original
        matcher = p2.matcher("5");
        isOk = matcher.matches();
        System.out.println("second pattern matches " + isOk);
    }
}
Will this method throw a RuntimeException? After all, the pattern() method returns \d, and a string literal written as "\d" in Java source code will fail to compile.
The answer is no, it won’t throw an exception. Remember that doubling the \ character is a requirement of Java’s string literal syntax; it has nothing to do with the regex pattern that the String represents. Once the String exists in memory, it holds a single backslash followed by d, and the conflict disappears.
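To make both caveats concrete, here is a small sketch that inspects what pattern() hands back. The class name PatternStringSketch and the helper sourceOf are hypothetical names introduced only for this illustration.

```java
import java.util.regex.Pattern;

class PatternStringSketch {
    // Hypothetical helper: compiles the regex with the given flags and
    // returns the String reported by pattern().
    static String sourceOf(String regex, int flags) {
        return Pattern.compile(regex, flags).pattern();
    }

    public static void main(String[] args) {
        // The literal "\\d" is a two-character String: a backslash and the letter d.
        String source = sourceOf("\\d", 0);
        System.out.println(source.length()); // 2
        System.out.println(source);          // \d

        // Flags passed to compile() leave no trace in pattern().
        String withFlags = sourceOf("\\d", Pattern.CASE_INSENSITIVE);
        System.out.println(source.equals(withFlags)); // true
    }
}
```

One practical consequence: if you recompile from pattern() alone, you also need to carry the flags across yourself, for example with Pattern.compile(p.pattern(), p.flags()).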
public Matcher matcher(CharSequence input)
Remember that you create a Pattern object by compiling a description of what you’re looking for. A Pattern is a bit like a personal ad: It lists the features of the thing you’re looking for. Speaking purely conceptually, your patterns might look like the following:
Pattern p = Pattern.compile("She must have red hair, and a temper");
Correspondingly, you’ll need to compare that description against candidates. That is, you’ll want to examine a given String to see if it matches the description you provided.
The Matcher object is designed specifically to help you do this sort of interrogation. I discuss Matcher in detail in the next major section of this chapter, but for now you should know that the Pattern.matcher(CharSequence input) method returns the Matcher that will help get details about how your candidate String compares with the description you passed in.
Pattern.matcher(CharSequence input) takes a CharSequence as its input parameter. CharSequence is an interface introduced in J2SE 1.4 and retroactively implemented by the String class. Because String implements CharSequence, you can simply pass a String object as the parameter to the Pattern.matcher(CharSequence input) method. I discuss the CharSequence parameter in detail shortly.
In the preceding example, again speaking purely conceptually, you might get your Matcher object as follows:
Matcher m = p.matcher("Anna");
In J2SE, this Matcher object’s matches() would return true. In real life, YMMV.
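Leaving the conceptual example aside, a minimal working sketch of Pattern.matcher might look like the following. The class name MatcherSketch, the helper isAllDigits, and the sample inputs are assumptions made for illustration; the point is only that any CharSequence, such as a String or a StringBuilder, can be handed to matcher().

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MatcherSketch {
    // One or more digits; compiled once and shared.
    private static final Pattern DIGITS = Pattern.compile("\\d+");

    // Hypothetical helper: true when the whole candidate consists of digits.
    // The parameter is a CharSequence, so String and StringBuilder both work.
    static boolean isAllDigits(CharSequence candidate) {
        Matcher m = DIGITS.matcher(candidate);
        return m.matches();
    }

    public static void main(String[] args) {
        System.out.println(isAllDigits("2004"));                          // true
        System.out.println(isAllDigits(new StringBuilder("1590591070"))); // true
        System.out.println(isAllDigits("J2SE"));                          // false
    }
}
```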
public int flags()
Earlier I discussed the constant flags that you can use in compiling your regex pattern. The flags method simply returns an int that represents those flags. For example, to see whether your Pattern object was compiled with a given flag (say, the Pattern.COMMENTS flag), first retrieve the flags:
int flgs = myPattern.flags();
then “and” (&) that flag to the Pattern.COMMENTS flag:
boolean isUsingCommentFlag =( Pattern.COMMENTS == (Pattern.COMMENTS & flgs)) ;
Similarly, to see if you’re using the CASE_INSENSITIVE flag, use the following code:
boolean isUsingCaseInsensitiveFlag =
    (Pattern.CASE_INSENSITIVE == (Pattern.CASE_INSENSITIVE & flgs));
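The same bit test can be wrapped in a small helper so that it isn’t repeated for every flag. This is only a sketch: the class name FlagCheckSketch and the method hasFlag are hypothetical names, not part of the java.util.regex API.

```java
import java.util.regex.Pattern;

class FlagCheckSketch {
    // Hypothetical helper: true when every bit of 'flag' is set
    // in the flags the pattern reports.
    static boolean hasFlag(Pattern pattern, int flag) {
        return (pattern.flags() & flag) == flag;
    }

    public static void main(String[] args) {
        Pattern p = Pattern.compile("t", Pattern.CASE_INSENSITIVE | Pattern.COMMENTS);
        System.out.println(hasFlag(p, Pattern.COMMENTS));         // true
        System.out.println(hasFlag(p, Pattern.CASE_INSENSITIVE)); // true
        System.out.println(hasFlag(p, Pattern.DOTALL));           // false
    }
}
```

Comparing the masked value against the flag itself, rather than against zero, also gives the right answer if you pass in a combination of several flags.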
public static boolean matches(String regex,CharSequence input)
Very often, you’ll find that all you need to know about a String is whether it matches a given regular expression exactly. You don’t want to have to create a Pattern object, extract its Matcher object, and interrogate that Matcher.
This static utility method is designed to do exactly that. Internally, it creates the Pattern and Matcher objects you need, compares the regex to the input String, and returns a boolean that tells you whether the two match exactly. Listing 2-3 presents an example of its use.
Listing 2-3. Matches Example
import java.util.regex.*;
public class PatternMatchesTest{
    public static void main(String args[]){
        //an "a" followed by zero or more "d" characters
        String regex = "ad*";
        String input = "add";
        boolean isMatch = Pattern.matches(regex,input);
        System.out.println(isMatch);//prints true
    }
}
If you’re going to do a lot of comparisons, then it’s more efficient to explicitly create a Pattern object and do your matches manually. However, if you aren’t going to do a lot of comparisons, then matches is a handy utility method.
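For the many-comparisons case, one possible shape is to compile the Pattern once and reuse it for each input, as sketched below. The class name ReusePatternSketch, the method countMatches, and the sample inputs are assumptions for illustration only.

```java
import java.util.regex.Pattern;

class ReusePatternSketch {
    // Compiled once, then reused for every comparison.
    private static final Pattern A_THEN_DS = Pattern.compile("ad*");

    // Hypothetical helper: counts the inputs that match the pattern in full.
    static int countMatches(String[] inputs) {
        int count = 0;
        for (String input : inputs) {
            if (A_THEN_DS.matcher(input).matches()) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        String[] inputs = {"a", "add", "adda", "dad"};
        // "a" and "add" match; "adda" and "dad" do not.
        System.out.println(countMatches(inputs)); // 2
    }
}
```

Calling Pattern.matches("ad*", input) inside the loop would give the same count, but it would recompile the regex on every iteration.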
The Pattern.matches(String regex, CharSequence input) method is also used internally by the String class. As of J2SE 1.4, String has a new method called matches that internally defers to the Pattern.matches method. You might already be using this method without being aware of it.
Of course, this method can throw a PatternSyntaxException if the regex pattern under consideration isn’t well formed.
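The sketch below shows one way to guard against a malformed regex, alongside the equivalent String.matches call. The class name SafeMatchSketch and the method safeMatches are hypothetical, and returning false for a bad regex is just one possible policy; you may prefer to let the exception propagate.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

class SafeMatchSketch {
    // Hypothetical helper: returns false instead of propagating the
    // exception when the regex isn't well formed.
    static boolean safeMatches(String regex, CharSequence input) {
        try {
            return Pattern.matches(regex, input);
        } catch (PatternSyntaxException e) {
            System.err.println("Bad regex: " + e.getDescription());
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(safeMatches("ad*", "add"));  // true
        // String.matches defers to Pattern.matches, so the result is the same.
        System.out.println("add".matches("ad*"));       // true
        // An unclosed group is a syntax error, so the helper reports false.
        System.out.println(safeMatches("a(d*", "add")); // false
    }
}
```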
This article is excerpted from chapter three of Java Regular Expressions: Taming the java.util.regex Engine, written by Mehran Habibi (Apress, 2004; ISBN: 1590591070).