Regular expressions in Java are used for pattern matching. The built-in java.util.regex package supports them, and they are commonly used to validate user input, parse logs, and transform strings. This guide covers the key concepts of Java regular expressions, the core classes and interfaces involved, common constructs such as character classes and quantifiers, and practical use cases with code examples.
What is a Regular Expression in Java?
A Java regular expression (Java regex) is a sequence of characters that defines a search pattern, typically used for string matching or string manipulation. Support is provided by the java.util.regex package. Java regex allows you to:
- Validate inputs (like email, phone, date, etc.)
- Search within strings
- Replace substrings
- Split strings based on complex patterns
Rules of Writing Java Regex
Below are the rules for writing a regular expression in Java.
- Some characters (such as ., *, +, ?, ^, and $) have a special meaning in regex. To match them literally, escape them with a backslash, which is written as a double backslash inside a Java string literal.
- Combine building blocks such as literals, character classes, anchors, and quantifiers so that the pattern describes exactly the text you expect.
- Use the anchors ^ and $ when the match must start or end at a specific position; this prevents the pattern from matching only part of the input by accident.
- An empty pattern compiles without error, but it matches the empty string at every position, which is rarely what you want, so avoid it.
Regex Classes and Interfaces in Java
The Java regex package includes the following classes and interfaces.
Regex Classes in Java
The java.util.regex package includes the following main classes:
1. Pattern class: The Pattern class represents a compiled regular expression. It is used to define and compile a regex and to match it against character sequences such as a String. A Pattern is immutable and thread-safe, and it is used together with the Matcher class to perform the actual matching.
Methods of the Pattern class in Java regex are as follows:
| Method | Description |
| --- | --- |
| static Pattern compile(String regex) | This method compiles the given regular expression into a pattern. |
| static Pattern compile(String regex, int flags) | This method compiles the given regular expression into a pattern with the specified match flags. |
| int flags() | This method returns the match flags specified when this pattern was compiled. |
| Matcher matcher(CharSequence input) | It creates a matcher that can be used to match the given input against this pattern. |
| static boolean matches(String regex, CharSequence input) | Compiles the given regex and matches it against the given input. Returns true if the entire input matches. |
| String pattern() | It returns the regex string from which this pattern was compiled. |
| static String quote(String s) | This method returns a literal pattern String for the given input s. It escapes all regex meta characters. |
| String[] split(CharSequence input) | It splits the input sequence around matches of this pattern. |
| String[] split(CharSequence input, int limit) | It splits the input around matches of this pattern, with control over the number of splits via limit. |
| String toString() | This method returns the string representation of the pattern (same as pattern()). |
Example:
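A minimal sketch of such a program, assuming the input "Order 12345". The class name PatternDemo and the helper firstNumber are illustrative names, not part of the java.util.regex API.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PatternDemo {
    // Compiled once; "\\d+" in Java source is the regex \d+ (one or more digits).
    private static final Pattern DIGITS = Pattern.compile("\\d+");

    // Returns the first run of digits in the input, or null if there is none.
    static String firstNumber(String input) {
        Matcher matcher = DIGITS.matcher(input);
        return matcher.find() ? matcher.group() : null;
    }

    public static void main(String[] args) {
        String number = firstNumber("Order 12345");
        if (number != null) {
            System.out.println("Found number: " + number); // prints "Found number: 12345"
        }
    }
}
```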
Explanation: The above Java program first compiles a regular expression using Pattern.compile("\\d+"), where \d+ is a regex that matches one or more digits (the backslash is doubled inside a Java string literal). Then, a Matcher object is created with the matcher() method to apply this pattern to the input string "Order 12345". If find() locates a run of digits, the matched text is printed.
2. Matcher class: The Matcher class in Java is used to perform matching operations such as searching, replacing, and verifying patterns on character sequences using a compiled Pattern. It works with the Pattern class and provides methods to find, match, and replace text using regex.
Methods of the Matcher class in Java regex are as follows:
| Method | Description |
| --- | --- |
| find() | Searches for the next occurrence of the pattern in the input. Useful for finding multiple matches. |
| find(int start) | Resets the matcher and searches for the next occurrence starting from the given index. |
| start() | Returns the start index of the previous match. |
| end() | Returns the index just after the last character of the previous match. |
| group() | Returns the exact substring matched by the previous match operation. |
| groupCount() | Returns the number of capturing groups in the pattern (excluding group 0, which is the entire match). |
| matches() | Returns true if the entire input string matches the pattern. It does not search for partial matches like find() does. |
Example:
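A minimal sketch of a Matcher in use, again assuming the input "Order 12345". The class MatcherDemo, the helper digitRuns, and the "text@start-end" output format are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MatcherDemo {
    // Collects every run of digits with its position, formatted as "text@start-end".
    static List<String> digitRuns(String input) {
        Matcher matcher = Pattern.compile("\\d+").matcher(input);
        List<String> runs = new ArrayList<>();
        while (matcher.find()) { // each call moves on to the next match
            runs.add(matcher.group() + "@" + matcher.start() + "-" + matcher.end());
        }
        return runs;
    }

    public static void main(String[] args) {
        System.out.println(digitRuns("Order 12345")); // prints [12345@6-11]
    }
}
```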
Explanation: In the above example, a Matcher object applies the compiled regular expression to a specific input string, "Order 12345". The find() method searches the input for the next substring that matches the pattern, i.e., one or more digits.
Regex Interface in Java
Java regex has the MatchResult interface, which provides read-only access to the results of a match performed by the Matcher class. It gives safe access to the match groups and allows the matched substrings and their start and end indices to be read without modifying the matcher.
Methods of the MatchResult interface in Java regex are as follows:
| Method | Description |
| --- | --- |
| group() | Returns the matched substring. |
| group(int i) | Returns the substring captured by the i-th group (group 0 is the entire match). |
| start() | Returns the start index of the match. |
| end() | Returns the index just after the last character of the match. |
| groupCount() | Returns the number of capturing groups in the pattern. |
Example:
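A minimal sketch of reading a match through MatchResult. The pattern (\d+)-(\d+), the input "Range 10-20", the class MatchResultDemo, and the helper firstMatch are assumptions chosen for illustration; toMatchResult() is the Matcher method that returns a snapshot of the current match.

```java
import java.util.regex.MatchResult;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MatchResultDemo {
    // Returns a snapshot of the first match, or null if nothing matches.
    static MatchResult firstMatch(String regex, String input) {
        Matcher matcher = Pattern.compile(regex).matcher(input);
        return matcher.find() ? matcher.toMatchResult() : null;
    }

    public static void main(String[] args) {
        MatchResult result = firstMatch("(\\d+)-(\\d+)", "Range 10-20");
        if (result != null) {
            System.out.println("Matched: " + result.group());      // 10-20
            System.out.println("Start:   " + result.start());      // 6
            System.out.println("End:     " + result.end());        // 11
            System.out.println("Group 1: " + result.group(1));     // 10
            System.out.println("Groups:  " + result.groupCount()); // 2
        }
    }
}
```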
Explanation: In the above example, the MatchResult interface is used to read the matched string, its start index, and its end index. The Matcher class implements MatchResult, so these values can be read from the matcher itself or from a snapshot returned by toMatchResult().
Basic Regular Expression in Java
Below are the basic building blocks of Java regex:
1. Literals
A literal matches the exact character or sequence of characters as written. For example, when the whole input must match (as with matches()), the regex cat matches only the string "cat", not "Cat", "catalog", or "scat". When searching with find(), cat is also found inside "catalog" and "scat", but still not in "Cat", because matching is case-sensitive by default.
Example:
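A minimal sketch of literal matching with the regex cat. It contrasts find(), which searches anywhere in the input, with matches(), which requires the whole input to match. The class LiteralDemo and both helper names are illustrative.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LiteralDemo {
    private static final Pattern CAT = Pattern.compile("cat");

    // True only when the whole input is exactly "cat".
    static boolean isExactly(String input) {
        return CAT.matcher(input).matches();
    }

    // Reports whether "cat" occurs anywhere in the input.
    static String describe(String input) {
        Matcher matcher = CAT.matcher(input);
        return matcher.find() ? "Match found: " + matcher.group() : "No match found";
    }

    public static void main(String[] args) {
        for (String input : new String[] {"cat", "Cat", "catalog", "scat"}) {
            System.out.println(input + " -> " + describe(input)
                    + ", whole input matches: " + isExactly(input));
        }
    }
}
```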
Explanation: In the above example, the input string is checked against a literal pattern. If the literal is found, the output shows that a match was found along with the matched text; otherwise, it shows that no match was found.
2. Character Classes
A character class is used to match exactly one character from a set of characters. It is defined using square brackets [ ]. For example,
| Pattern | Matches |
| --- | --- |
| [aeiou] | Any single vowel |
| [0-9] | Any single digit |
| [A-Z] | Any uppercase letter |
| [a-zA-Z] | Any uppercase or lowercase letter |
| [abc123] | Either a, b, c, 1, 2, or 3 |
| [^abc] | Any character except a, b, or c |
Example:
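A minimal sketch that uses the character class [0-9] to pick out individual digits. The input "Room 42B", the class CharClassDemo, and the helper digits are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CharClassDemo {
    // [0-9] matches exactly one digit per match, so each digit is reported separately.
    static List<String> digits(String input) {
        Matcher matcher = Pattern.compile("[0-9]").matcher(input);
        List<String> found = new ArrayList<>();
        while (matcher.find()) {
            found.add(matcher.group());
        }
        return found;
    }

    public static void main(String[] args) {
        for (String digit : digits("Room 42B")) {
            System.out.println("Digit found: " + digit); // prints 4, then 2
        }
    }
}
```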
Explanation: In the above example, a character class is used to match the digits in the input string. Each digit that is found is printed.
3. Predefined Character Classes
The predefined character classes are the shorthand notations in regular expressions that match common character sets like digits, white spaces, or word characters. They are the shortcuts to avoid writing long character classes like those shown above.
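| Pattern | Matches | Equivalent To |
| --- | --- | --- |
| \d | A digit | [0-9] |
| \D | A non-digit | [^0-9] |
| \s | A whitespace character | [ \t\n\x0B\f\r] |
| \S | A non-whitespace character | [^\s] |
| \w | A word character (letters, digits, underscore) | [a-zA-Z0-9_] |
| \W | A non-word character | [^a-zA-Z0-9_] |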
Example:
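A minimal sketch using the predefined classes \s (whitespace) and \D (non-digit) with String.replaceAll. The sample inputs, the class PredefinedDemo, and the helper names are assumptions for illustration.

```java
class PredefinedDemo {
    // "\\s" matches one whitespace character; replacing each with "" removes them all.
    static String stripWhitespace(String input) {
        return input.replaceAll("\\s", "");
    }

    // "\\D" matches any non-digit, so removing those leaves only the digits.
    static String digitsOnly(String input) {
        return input.replaceAll("\\D", "");
    }

    public static void main(String[] args) {
        System.out.println(stripWhitespace("Java  Regex\tGuide")); // JavaRegexGuide
        System.out.println(digitsOnly("Call 555-0199"));            // 5550199
    }
}
```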
Explanation: In the above example, the input string is matched against a predefined character class, and the output is the input with all whitespace removed.
4. Quantifiers
Quantifiers specify how many times a character, group, or character class must occur in a match, and are used to express repetition.
| Quantifier | Description | Example |
| --- | --- | --- |
| * | 0 or more times | a* |
| + | 1 or more times | a+ |
| ? | 0 or 1 time | a? |
| {n} | Exactly n times | a{3} |
| {n,} | At least n times | a{2,} |
| {n,m} | Between n and m times (inclusive) | a{2,4} |
Example:
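A minimal sketch of the {n,m} quantifier, using [a-z]{2,4} to find runs of 2 to 4 lowercase letters. The input "a bc defgh", the class QuantifierDemo, and the helper runs are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuantifierDemo {
    // {2,4} is greedy: it takes up to 4 letters, and needs at least 2 to match at all.
    static List<String> runs(String input) {
        Matcher matcher = Pattern.compile("[a-z]{2,4}").matcher(input);
        List<String> found = new ArrayList<>();
        while (matcher.find()) {
            found.add(matcher.group());
        }
        return found;
    }

    public static void main(String[] args) {
        // "a" is too short, "bc" matches, "defgh" yields "defg" and leaves "h" unmatched.
        System.out.println(runs("a bc defgh")); // prints [bc, defg]
    }
}
```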
Explanation: In the above example, the input string is searched with a quantified pattern that matches runs of 2 to 4 consecutive lowercase letters.
5. Anchors
Anchors are special constructs in regex that match positions in the input rather than characters.
| Anchor | Description | Matches |
| --- | --- | --- |
| ^ | Start of a line or string | The match is at the beginning |
| $ | End of a line or string | The match is at the end |
| \b | Word boundary | The match is at the edge of a word |
| \B | Non-word boundary | The match is inside a word (not at the edge) |
Example:
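A minimal sketch of the $ and \b anchors. The pattern end$ checks for "end" at the end of the input, and \bcat\b looks for "cat" as a whole word. The class AnchorDemo, both helper names, and the sample inputs are assumptions for illustration.

```java
import java.util.regex.Pattern;

class AnchorDemo {
    private static final Pattern ENDS_WITH_END = Pattern.compile("end$");
    private static final Pattern WHOLE_WORD_CAT = Pattern.compile("\\bcat\\b");

    // True when "end" sits at the very end of the input.
    static boolean endsWithEnd(String input) {
        return ENDS_WITH_END.matcher(input).find();
    }

    // True when "cat" appears as a separate word, not inside a longer word.
    static boolean hasWordCat(String input) {
        return WHOLE_WORD_CAT.matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(endsWithEnd("This is the end")); // true
        System.out.println(endsWithEnd("The end is near")); // false
        System.out.println(hasWordCat("a cat sat"));        // true
        System.out.println(hasWordCat("catalog"));          // false
    }
}
```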
Explanation: In the above example, an anchored pattern is used to check whether the word end appears at the end of the input.
6. Special Symbols
In regular expressions, certain characters have special meanings; they are meta-characters used to control pattern matching. If you want to match them as normal characters, you must escape them using a backslash (\).
Note: In Java, since \ is also an escape character in string literals, you need to double it as \\.
| Symbol | Meaning in Regex | To match it literally, use |
| --- | --- | --- |
| . | Matches any character except newline | \. |
| * | 0 or more repetitions | \* |
| + | 1 or more repetitions | \+ |
| ? | 0 or 1 repetition | \? |
| ^ | Start of string or line | \^ |
| $ | End of string or line | \$ |
Note: In Java code, use \\. to match a literal dot because backslash must be escaped in strings.
Example:
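A minimal sketch of matching a literal $ by escaping it, with Pattern.quote shown as an alternative for escaping a whole string. The class LiteralSymbolDemo, the helper names, and the sample inputs are assumptions for illustration.

```java
import java.util.regex.Pattern;

class LiteralSymbolDemo {
    // "\\$" in Java source is the regex \$, which matches a literal dollar sign.
    private static final Pattern DOLLAR = Pattern.compile("\\$");

    static boolean containsDollar(String input) {
        return DOLLAR.matcher(input).find();
    }

    // Pattern.quote escapes an arbitrary string so every character is matched literally.
    static boolean containsLiteral(String input, String literal) {
        return Pattern.compile(Pattern.quote(literal)).matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(containsDollar("Price: $100"));   // true
        System.out.println(containsDollar("Price: 100"));    // false
        System.out.println(containsLiteral("1+1=2", "1+1")); // true
        System.out.println(containsLiteral("111", "1+1"));   // false, + is not a quantifier here
    }
}
```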
Explanation: In the above example, the input string is searched for a literal $ by escaping the special symbol in the regular expression.
7. Flags
Flags are optional parameters passed to Pattern.compile() that modify the behaviour of pattern matching, such as making it case-insensitive or enabling multiline matching.
| Flag | Description |
| --- | --- |
| Pattern.CASE_INSENSITIVE | Makes the pattern case-insensitive (e.g., a matches A) |
| Pattern.MULTILINE | ^ and $ match start/end of each line, not just entire input |
| Pattern.DOTALL | . matches any character, including line breaks (\n) |
| Pattern.UNICODE_CASE | Makes case-insensitive matching Unicode-aware when combined with CASE_INSENSITIVE |
| Pattern.COMMENTS | Allows whitespace and comments in regex (ignored during matching) |
| Pattern.LITERAL | Treats the entire pattern as literal text (disables all metacharacters) |
Example:
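A minimal sketch of two flags passed to Pattern.compile(): CASE_INSENSITIVE and MULTILINE. The class FlagDemo, the helper names, and the sample inputs are assumptions for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FlagDemo {
    private static final Pattern JAVA_ANY_CASE =
            Pattern.compile("java", Pattern.CASE_INSENSITIVE);
    // With MULTILINE, ^ matches at the start of every line, not only at the start of the input.
    private static final Pattern ERROR_LINE =
            Pattern.compile("^ERROR", Pattern.MULTILINE);

    static boolean mentionsJava(String input) {
        return JAVA_ANY_CASE.matcher(input).find();
    }

    static int countErrorLines(String text) {
        Matcher matcher = ERROR_LINE.matcher(text);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(mentionsJava("I love JAVA"));                          // true
        System.out.println(countErrorLines("INFO ok\nERROR one\nERROR two"));     // 2
    }
}
```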
Explanation: In the above example, the input string is matched against the pattern regardless of letter case, by using the Pattern.CASE_INSENSITIVE flag.
Common Use Cases of Regular Expressions in Java
Below are some common use cases of regular expressions in Java.
1. Validating an Email Address
Below is a simple program to validate an email address in Java.
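A minimal sketch of such a validator. The pattern is a deliberately simplified assumption (it does not implement the full email address grammar), and the class EmailDemo, the helper isValid, and the sample addresses are illustrative.

```java
import java.util.regex.Pattern;

class EmailDemo {
    // Simplified shape: local part, "@", domain, ".", and a top-level domain of 2+ letters.
    private static final Pattern EMAIL =
            Pattern.compile("^[A-Za-z0-9+_.-]+@[A-Za-z0-9.-]+\\.[A-Za-z]{2,}$");

    static boolean isValid(String email) {
        return EMAIL.matcher(email).matches();
    }

    public static void main(String[] args) {
        String[] emails = {"user@example.com", "invalid.email", "user@domain"};
        for (String email : emails) {
            // Prints "valid" for the first address and "invalid" for the other two.
            System.out.println(email + " -> " + (isValid(email) ? "valid" : "invalid"));
        }
    }
}
```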
Explanation: In the above Java program, an array of strings named emails holds the candidate addresses. The regex is compiled once with Pattern.compile(), and each element of the array is then matched against it.
2. Splitting Text Based on a Pattern
Below is a simple program to split a text based on a comma in Java.
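A minimal sketch of splitting on commas with Pattern.split. As an assumption beyond a plain comma, the pattern \s*,\s* also consumes optional whitespace around each comma; the class SplitDemo and the sample input are illustrative.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class SplitDemo {
    // A comma with optional whitespace on either side.
    private static final Pattern COMMA = Pattern.compile("\\s*,\\s*");

    static String[] parts(String input) {
        return COMMA.split(input);
    }

    // With a limit of 2, the pattern is applied at most once.
    static String[] firstAndRest(String input) {
        return COMMA.split(input, 2);
    }

    public static void main(String[] args) {
        String input = "apple, banana ,cherry";
        System.out.println(Arrays.toString(parts(input)));        // [apple, banana, cherry]
        System.out.println(Arrays.toString(firstAndRest(input))); // [apple, banana ,cherry]
    }
}
```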
Explanation: In the above Java program, the input string is split into parts at each comma with the help of the split() method.
3. Replacing Text with Regex
Java has built-in regex methods to replace substrings based on a pattern. These are:
- String.replaceAll(String regex, String replacement): This method is used to replace all the occurrences of the regex pattern with the replacement string.
- String.replaceFirst(String regex, String replacement): This method is used to replace only the first occurrence of the regex pattern with the replacement string.
Example:
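A minimal sketch of both methods, replacing apple with orange. The sample sentence, the class ReplaceDemo, and the helper names are assumptions for illustration.

```java
class ReplaceDemo {
    static String replaceEvery(String input) {
        return input.replaceAll("apple", "orange");
    }

    static String replaceFirstOnly(String input) {
        return input.replaceFirst("apple", "orange");
    }

    public static void main(String[] args) {
        String input = "I like apple. apple pie is great.";
        System.out.println(replaceEvery(input));     // I like orange. orange pie is great.
        System.out.println(replaceFirstOnly(input)); // I like orange. apple pie is great.
    }
}
```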
Explanation: In the above example, the string input is first used with the .replaceAll() method that replaces all the occurrences of apple with orange. Then the method replaceFirst() is used to replace the first occurrence of apple with orange.
Tips for Working with Regular Expressions in Java
Here are some tips for working with Java regex:
- To improve performance, compile the regex with Pattern.compile() only once and reuse the resulting Pattern. Every call to this method parses and compiles the whole regular expression again, which wastes resources.
- Keep the regex pattern simple and readable, so that it can be quickly understood and debugged by the people who maintain it, which also reduces the risk of errors.
- Prefer the predefined character classes (such as \d, \w, and \s) over hand-written character sets; they make patterns shorter, more readable, and easier to maintain.
- Use quantifiers to state how many times an element may occur, so that the pattern matches single, repeated, or optional occurrences as your data requires.
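The first tip can be sketched as follows: the Pattern is stored in a static final field and reused on every call instead of being recompiled. The class ReuseDemo and the helper countWords are hypothetical names chosen for illustration.

```java
import java.util.regex.Pattern;

class ReuseDemo {
    // Compiled once when the class is loaded, then shared by every call.
    private static final Pattern WHITESPACE = Pattern.compile("\\s+");

    // Counts whitespace-separated words without recompiling the pattern.
    static int countWords(String line) {
        String trimmed = line.trim();
        return trimmed.isEmpty() ? 0 : WHITESPACE.split(trimmed).length;
    }

    public static void main(String[] args) {
        System.out.println(countWords("  compile  once, reuse often ")); // 4
        System.out.println(countWords("   "));                           // 0
    }
}
```

Because a Pattern is immutable and thread-safe, a shared static instance like this can be used from multiple threads; each call creates its own matching state.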
Conclusion
Java regex is a practical tool for text processing and data manipulation. The java.util.regex package provides the Pattern and Matcher classes and the MatchResult interface, with built-in methods such as find(), start(), and end(), to work with regular expressions. By combining literals, character classes, quantifiers, anchors, flags, and escaped special symbols, you can solve many problems, such as validating an email address.
Java Regex – FAQs
Q1. What are Regular Expressions in Java?
Regular expressions in Java are patterns used to match and work with text. They help search, validate, and manipulate strings using specific rules.
Q2. How do I escape special characters in a regex?
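To use special characters like . or * as normal text in a regex, put a backslash before them. In a Java string literal the backslash itself must be doubled, so a literal dot is written as \\. and a literal asterisk as \\*.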
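A minimal sketch of escaped metacharacters in Java string literals. The class FaqEscapeDemo, the helper names, and the sample inputs are assumptions for illustration.

```java
import java.util.regex.Pattern;

class FaqEscapeDemo {
    // "\\." is a literal dot: digits, a dot, then digits.
    static boolean isDecimal(String input) {
        return Pattern.matches("\\d+\\.\\d+", input);
    }

    // "\\*" is a literal asterisk: word characters followed by one "*".
    static boolean isStarred(String input) {
        return Pattern.matches("\\w+\\*", input);
    }

    public static void main(String[] args) {
        System.out.println(isDecimal("3.14"));  // true
        System.out.println(isDecimal("3x14"));  // false, an unescaped . would have accepted this
        System.out.println(isStarred("note*")); // true
        System.out.println(isStarred("note"));  // false
    }
}
```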
Q3. How do I handle PatternSyntaxException?
If your regex has a mistake, Java throws a PatternSyntaxException. You can catch it using a try-catch block to handle errors safely.
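A minimal sketch of catching the exception. The class SyntaxErrorDemo and the helper check are hypothetical; getDescription() and getIndex() are methods of PatternSyntaxException.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

class SyntaxErrorDemo {
    // Returns null when the regex compiles, otherwise a short error description.
    static String check(String regex) {
        try {
            Pattern.compile(regex);
            return null;
        } catch (PatternSyntaxException e) {
            return e.getDescription() + " near index " + e.getIndex();
        }
    }

    public static void main(String[] args) {
        System.out.println(check("\\d+"));  // null, the pattern is valid
        System.out.println(check("[a-z"));  // an error message about the unclosed character class
    }
}
```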
Q4. What is the difference between matches(), find(), and lookingAt()?
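The matches() method succeeds only if the whole string matches. find() looks for the next part of the string that matches, anywhere in the input. lookingAt() requires the match to begin at the start of the string, but it does not have to reach the end.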
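A minimal sketch comparing the three methods with the pattern \d+. A fresh Matcher is created for each call so that one method's state does not affect the next. The class MatchModesDemo and the helper modes are illustrative names.

```java
import java.util.regex.Pattern;

class MatchModesDemo {
    private static final Pattern DIGITS = Pattern.compile("\\d+");

    static String modes(String input) {
        boolean whole = DIGITS.matcher(input).matches();     // entire input must match
        boolean prefix = DIGITS.matcher(input).lookingAt();  // match must start at index 0
        boolean anywhere = DIGITS.matcher(input).find();     // match may occur anywhere
        return "matches=" + whole + " lookingAt=" + prefix + " find=" + anywhere;
    }

    public static void main(String[] args) {
        System.out.println(modes("123"));    // matches=true lookingAt=true find=true
        System.out.println(modes("123abc")); // matches=false lookingAt=true find=true
        System.out.println(modes("abc123")); // matches=false lookingAt=false find=true
    }
}
```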
Q5. Can I make a regex case-insensitive in Java?
Yes, you can ignore letter case in a regex by adding (?i) at the start of the pattern or using Pattern.CASE_INSENSITIVE.
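A minimal sketch of the inline (?i) form, which needs no flags argument. The class InlineFlagDemo and the helper names are illustrative.

```java
import java.util.regex.Pattern;

class InlineFlagDemo {
    // (?i) at the start turns on case-insensitive matching for the rest of the pattern.
    static boolean isHelloAnyCase(String input) {
        return Pattern.matches("(?i)hello", input);
    }

    // Without the flag, matching is case-sensitive.
    static boolean isHelloExactCase(String input) {
        return Pattern.matches("hello", input);
    }

    public static void main(String[] args) {
        System.out.println(isHelloAnyCase("HeLLo"));   // true
        System.out.println(isHelloExactCase("HeLLo")); // false
    }
}
```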