Regular Expressions in Java (Java Regex)

Regular expressions in Java are used for pattern matching on text. The built-in java.util.regex package provides this support, and it is commonly used to validate user input, parse logs, and transform strings. In this guide, we will learn the key concepts of Java regular expressions, the core classes and interfaces involved, common patterns like character classes and quantifiers, and practical use cases with code examples.

What is a Regular Expression in Java?

A Java regular expression (Java regex) is a sequence of characters that defines a search pattern, usually used for string matching or string manipulation. Java provides built-in support for regular expressions through the java.util.regex package. Java regex allows you to:

  • Validate inputs (like email, phone, date, etc.)
  • Search within strings
  • Replace substrings
  • Split strings based on complex patterns

Rules of Writing Java Regex

Below are the rules for writing a regular expression in Java.

  • Characters such as ., *, +, ?, ^, and $ have special meaning in a regex. To match one of them literally, escape it with a backslash, which is written as a double backslash (\\) inside a Java string literal.
  • Combine building blocks such as literals, character classes, anchors, and quantifiers so that the pattern describes exactly the text you expect and nothing more.
  • Use the anchors ^ and $ when the pattern must cover the whole input or line; without them, find() can also match in the middle of a longer string.
  • An empty pattern is legal in Java, but it matches the empty string at every position, which is rarely intended, so avoid it.

Regex Classes and Interfaces in Java

The Java regex package includes the following classes and interfaces.

Regex Classes in Java

The java.util.regex package includes the following main classes:

1. Pattern class: The Pattern class in Java represents a compiled regular expression. You compile a regex string into a Pattern and then use it to create Matcher objects that match the pattern against character sequences such as a String. A Pattern is immutable and thread-safe.

Methods of the Pattern class in Java regex are as follows:

Method | Description
static Pattern compile(String regex) | This method compiles the given regular expression into a pattern.
static Pattern compile(String regex, int flags) | This method compiles the given regular expression into a pattern with the specified match flags.
int flags() | This method returns the match flags specified when this pattern was compiled.
Matcher matcher(CharSequence input) | It creates a matcher that can be used to match the given input against this pattern.
static boolean matches(String regex, CharSequence input) | Compiles the given regex and matches it against the given input. Returns true if the entire input matches.
String pattern() | It returns the regex string from which this pattern was compiled.
static String quote(String s) | This method returns a literal pattern String for the given input s. It escapes all regex meta characters.
String[] split(CharSequence input) | It splits the input sequence around matches of this pattern.
String[] split(CharSequence input, int limit) | It splits the input around matches of this pattern, with control over the number of splits via limit.
String toString() | This method returns the string representation of the pattern (same as pattern()).

Explanation: The example compiles a regular expression with Pattern.compile("\\d+"), where \d+ is a regex that matches one or more digits (the backslash is doubled because it sits inside a Java string literal). A Matcher object is then created with the matcher() method to apply this pattern to the input string "Order 12345". If find() locates a run of digits, group() returns it and it is printed.
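A minimal sketch consistent with the explanation above. The class name PatternDemo and the helper firstDigits are hypothetical names chosen for illustration; only the regex \d+ and the input "Order 12345" come from the text.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PatternDemo {
    // Returns the first run of digits in the input, or null if there is none.
    static String firstDigits(String input) {
        Pattern pattern = Pattern.compile("\\d+"); // "\\d+" in source is the regex \d+
        Matcher matcher = pattern.matcher(input);
        return matcher.find() ? matcher.group() : null;
    }

    public static void main(String[] args) {
        System.out.println("Found: " + firstDigits("Order 12345")); // Found: 12345
    }
}
```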

2. Matcher class: The Matcher class in Java performs matching operations such as searching, replacing, and verifying patterns on character sequences using a compiled Pattern. A Matcher is created by calling matcher() on a Pattern and provides methods to find, match, and replace text.

Methods of the Matcher class in Java regex are as follows:

Method | Description
find() | This method searches for the next occurrence of the pattern in the input. Useful for finding multiple matches.
find(int start) | This method searches for the next occurrence starting from the given index.
start() | It returns the start index of the last match found using find().
end() | It returns the index after the last character of the last match found.
group() | This method returns the exact substring that was matched by the previous find() operation.
groupCount() | This method returns the number of capturing groups in the pattern (excluding group 0, which is the entire match).
matches() | It returns true if the entire input string matches the pattern. It does not search for partial matches like find() does.

Explanation: In the example, a Matcher applies the compiled regular expression to a specific input string, "Order 12345". The find() method searches the input for the next substring that matches the pattern, i.e., one or more digits.
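The sketch below shows how the Matcher methods from the table might fit together on the same input. MatcherDemo and digitRuns are hypothetical names, and the "text@start-end" format is only an illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MatcherDemo {
    // Collects "text@start-end" for every run of digits found in the input.
    static List<String> digitRuns(String input) {
        Matcher matcher = Pattern.compile("\\d+").matcher(input);
        List<String> runs = new ArrayList<>();
        while (matcher.find()) { // each call moves on to the next match
            runs.add(matcher.group() + "@" + matcher.start() + "-" + matcher.end());
        }
        return runs;
    }

    public static void main(String[] args) {
        System.out.println(digitRuns("Order 12345")); // [12345@6-11]
        // matches() requires the whole input to be digits, so this prints false.
        System.out.println(Pattern.compile("\\d+").matcher("Order 12345").matches());
    }
}
```

Note that end() is exclusive: the digits occupy indices 6 to 10, and end() returns 11.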

Regex Interface in Java

The java.util.regex package has the MatchResult interface, which provides read-only access to the results of a regular expression match performed by the Matcher class. Through it you can read the matched substring, the captured groups, and their start and end indices without modifying the matcher.

Methods of the MatchResult interface in Java regex are as follows:

Method | Description
group() | This method returns the matched substring.
group(int i) | This method returns the i-th matched group.
start() | It returns the start index of the last match.
end() | It returns the end index of the last match.
groupCount() | This method returns the number of capturing groups in the pattern.

Explanation: The example uses the MatchResult interface to read the matched string, its start index, and its end index. The Matcher class implements MatchResult, and Matcher.toMatchResult() returns a snapshot of the current match that later find() calls do not change.
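One way to obtain a MatchResult is sketched here. The class MatchResultDemo, the helper firstMatch, the year-month regex, and the input string are all assumptions made for illustration.

```java
import java.util.regex.MatchResult;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MatchResultDemo {
    // Returns a snapshot of the first match, or null if nothing matches.
    static MatchResult firstMatch(String regex, String input) {
        Matcher matcher = Pattern.compile(regex).matcher(input);
        return matcher.find() ? matcher.toMatchResult() : null;
    }

    public static void main(String[] args) {
        MatchResult r = firstMatch("(\\d{4})-(\\d{2})", "Date: 2024-05");
        System.out.println(r.group() + " at " + r.start() + "-" + r.end()); // 2024-05 at 6-13
        System.out.println(r.group(1) + ", " + r.group(2)
                + ", groups=" + r.groupCount());                            // 2024, 05, groups=2
    }
}
```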

Basic Regular Expression in Java

Below are the basic building blocks of a Java regex:

1. Literals

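A literal matches the exact character or sequence of characters as written, and matching is case-sensitive by default. For example, with matches(), which requires the whole input to match, the regex cat matches only the string “cat”, not “Cat”, “catalog”, or “scat”. With find(), which searches for a substring, it also locates cat inside “catalog” and “scat”, but still not in “Cat”.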

Explanation: In the example, the input string is checked against the literal pattern. If the literal is found, the program prints that a match was found along with the matched text; otherwise, it prints that no match was found.
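The difference between a whole-input match and a substring search on a literal can be sketched as follows. LiteralDemo, wholeMatch, and contains are hypothetical names; the test strings are the ones mentioned in the text.

```java
import java.util.regex.Pattern;

class LiteralDemo {
    private static final Pattern CAT = Pattern.compile("cat");

    // True only when the entire input is exactly "cat".
    static boolean wholeMatch(String input) {
        return CAT.matcher(input).matches();
    }

    // True when "cat" occurs anywhere in the input.
    static boolean contains(String input) {
        return CAT.matcher(input).find();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"cat", "Cat", "catalog", "scat"}) {
            System.out.println(s + " -> matches=" + wholeMatch(s) + ", find=" + contains(s));
        }
    }
}
```

Only "cat" passes both checks. "catalog" and "scat" pass find() alone, and "Cat" passes neither.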

2. Character Classes

A character class is used to match exactly one character from a set of characters. It is defined using square brackets [ ]. For example,

Pattern | Matches
[aeiou] | Any single vowel
[0-9] | Any single digit
[A-Z] | Any uppercase letter
[a-zA-Z] | Any uppercase or lowercase letter
[abc123] | Either a, b, c, 1, 2, or 3
[^abc] | Any character except a, b, or c

Explanation: In the example, a character class is used to find the digits in the input string. For each digit found, the program prints a "digit found" message together with the digit.
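A small sketch of that idea, assuming the class [0-9] and a made-up input; CharClassDemo and digits are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CharClassDemo {
    // Returns every single digit in the input, in order of appearance.
    static List<String> digits(String input) {
        Matcher matcher = Pattern.compile("[0-9]").matcher(input);
        List<String> found = new ArrayList<>();
        while (matcher.find()) {
            found.add(matcher.group()); // a character class matches exactly one character
        }
        return found;
    }

    public static void main(String[] args) {
        for (String d : digits("Room 42B")) {
            System.out.println("Digit found: " + d);
        }
    }
}
```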

3. Predefined Character Classes

The predefined character classes are the shorthand notations in regular expressions that match common character sets like digits, white spaces, or word characters. They are the shortcuts to avoid writing long character classes like those shown above.

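Pattern | Matches | Equivalent To
\d | A digit | [0-9]
\D | A non-digit | [^0-9]
\w | A word character (letters, digits, underscore) | [a-zA-Z0-9_]
\W | A non-word character | [^a-zA-Z0-9_]
\s | A whitespace character | [ \t\n\x0B\f\r]
\S | A non-whitespace character | [^\s]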

Explanation: In the example, a predefined character class is used to match the whitespace in the input string, and the output is the input with its whitespace removed.
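Removing whitespace with a predefined class might look like the sketch below. It assumes the class \s is the one used; PredefinedClassDemo and stripWhitespace are hypothetical names.

```java
import java.util.regex.Pattern;

class PredefinedClassDemo {
    // \s+ matches one or more whitespace characters (space, tab, newline, ...).
    private static final Pattern WHITESPACE = Pattern.compile("\\s+");

    static String stripWhitespace(String input) {
        return WHITESPACE.matcher(input).replaceAll("");
    }

    public static void main(String[] args) {
        System.out.println(stripWhitespace("Java  Regex\tGuide")); // JavaRegexGuide
    }
}
```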

4. Quantifiers

Quantifiers in regex specify how many times a character, group, or character class must occur in a match, and are used for repeating patterns.

Quantifier | Description | Example
* | 0 or more times | a*
+ | 1 or more times | a+
? | 0 or 1 time | a?
{n} | Exactly n times | a{3}
{n,} | At least n times | a{2,}
{n,m} | Between n and m times (inclusive) | a{2,4}

Explanation: In the example, the input string is searched with a regex whose quantifier {2,4} matches 2 to 4 consecutive lowercase letters.
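A sketch of the {2,4} quantifier on a made-up input, assuming the regex [a-z]{2,4}; QuantifierDemo and letterRuns are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuantifierDemo {
    // Finds runs of 2 to 4 consecutive lowercase letters.
    static List<String> letterRuns(String input) {
        Matcher matcher = Pattern.compile("[a-z]{2,4}").matcher(input);
        List<String> runs = new ArrayList<>();
        while (matcher.find()) {
            runs.add(matcher.group());
        }
        return runs;
    }

    public static void main(String[] args) {
        System.out.println(letterRuns("a bc defgh xyz")); // [bc, defg, xyz]
    }
}
```

The single letter a is too short to match. The quantifier is greedy, so defgh yields defg, and the leftover h is again too short.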

5. Anchors

Anchors are special constructs in regex that match a position in the input, such as the start or end of a line or the edge of a word, rather than a character.

Anchor | Description | Matches
^ | Start of a line or string | The match is at the beginning
$ | End of a line or string | The match is at the end
\b | Word boundary | The match is at the edge of a word
\B | Non-word boundary | The match is inside a word (not at the edge)

Explanation: In the example, the $ anchor is used so that the word end is matched only when it appears at the end of the input.
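The following sketch assumes the regex end$ for the case described, and adds a word-boundary check as a second illustration. AnchorDemo and both helper names are hypothetical.

```java
import java.util.regex.Pattern;

class AnchorDemo {
    // True when the input finishes with the text "end".
    static boolean endsWithEnd(String input) {
        return Pattern.compile("end$").matcher(input).find();
    }

    // True when "cat" appears as a whole word, not inside a longer word.
    static boolean hasWordCat(String input) {
        return Pattern.compile("\\bcat\\b").matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(endsWithEnd("This is the end")); // true
        System.out.println(endsWithEnd("end of story"));    // false
        System.out.println(hasWordCat("a cat sat"));        // true
        System.out.println(hasWordCat("catalog"));          // false
    }
}
```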

6. Special Symbols

In regular expressions, certain characters have special meanings; they are meta-characters used to control pattern matching. If you want to match them as normal characters, you must escape them using a backslash (\).

Note: In Java, since \ is also the escape character in string literals, you need to double it as \\.

Symbol | Meaning in Regex | To match it literally, use
. | Matches any character except newline | \.
* | 0 or more repetitions | \*
+ | 1 or more repetitions | \+
? | 0 or 1 repetition | \?
^ | Start of string or line | \^
$ | End of string or line | \$

Note: In Java code, use \\. to match a literal dot because backslash must be escaped in strings.

Explanation: In the example, the $ symbol is escaped in the regex so that it is matched as a literal dollar sign in the input string instead of acting as the end-of-line anchor.
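A sketch of matching $ and . literally. The price format, the input strings, and the names EscapeDemo and firstPrice are assumptions for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EscapeDemo {
    // \$ is a literal dollar sign and \. a literal dot; both are doubled in Java source.
    private static final Pattern PRICE = Pattern.compile("\\$\\d+\\.\\d{2}");

    // Returns the first price such as $19.99, or null if none is present.
    static String firstPrice(String input) {
        Matcher matcher = PRICE.matcher(input);
        return matcher.find() ? matcher.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(firstPrice("Total: $19.99 due")); // $19.99
    }
}
```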

7. Flags

Flags are optional parameters that can be passed to the Pattern.compile() method to modify the behaviour of pattern matching, such as making it case-insensitive or allowing multiline matching.

Flag | Description
Pattern.CASE_INSENSITIVE | Makes the pattern case-insensitive (e.g., a matches A)
Pattern.MULTILINE | ^ and $ match start/end of each line, not just entire input
Pattern.DOTALL | . matches any character, including line breaks (\n)
Pattern.UNICODE_CASE | Enables Unicode-aware case-insensitive matching (when combined with CASE_INSENSITIVE)
Pattern.COMMENTS | Allows whitespace and comments in regex (ignored during matching)
Pattern.LITERAL | Treats the entire pattern as literal text (disables all metacharacters)

Explanation: In the example, the Pattern.CASE_INSENSITIVE flag makes the regex match the input string regardless of the letter case of its characters.
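The effect of two of the flags can be sketched as below. FlagDemo, containsJava, and countMatches are hypothetical names, and the inputs are made up.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FlagDemo {
    // Case-insensitive search for the word "java".
    static boolean containsJava(String input) {
        return Pattern.compile("java", Pattern.CASE_INSENSITIVE).matcher(input).find();
    }

    // Counts how many times the pattern is found in the input.
    static int countMatches(Pattern pattern, String input) {
        Matcher matcher = pattern.matcher(input);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(containsJava("I love JAVA")); // true
        String lines = "one\ntwo\nthree";
        // Without MULTILINE, ^ matches only at the very start of the input.
        System.out.println(countMatches(Pattern.compile("^\\w+"), lines));                    // 1
        System.out.println(countMatches(Pattern.compile("^\\w+", Pattern.MULTILINE), lines)); // 3
    }
}
```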

Common Use Cases of Regular Expressions in Java

Below are some common use cases of regular expressions in Java.

1. Validating an Email Address

A common task is checking whether strings look like valid email addresses.

Explanation: The program keeps the candidate addresses in an array of strings named emails. The email regex is compiled with the Pattern.compile() method, and each element of the array is then matched against it.
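A minimal sketch of such a check. The regex here is a deliberately simplified assumption and does not cover every address allowed by the email standards; EmailCheckDemo and looksLikeEmail are hypothetical names.

```java
import java.util.regex.Pattern;

class EmailCheckDemo {
    // Simplified shape: local part, @, then at least two dot-separated domain labels.
    private static final Pattern EMAIL =
            Pattern.compile("^[\\w.+-]+@[\\w-]+(\\.[\\w-]+)+$");

    static boolean looksLikeEmail(String candidate) {
        return EMAIL.matcher(candidate).matches();
    }

    public static void main(String[] args) {
        String[] emails = {"user@example.com", "user@@example.com", "user.example.com"};
        for (String email : emails) {
            System.out.println(email + " -> " + (looksLikeEmail(email) ? "valid" : "invalid"));
        }
    }
}
```

Only the first address passes. The second has a doubled @ and the third has none.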

2. Splitting Text Based on a Pattern

Another common task is splitting text around a delimiter such as a comma.

Explanation: The program splits the input string into separate parts at each comma with the help of the split() method.
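Splitting on a comma could be sketched like this. Allowing optional whitespace around the comma is an added assumption, as are the names SplitDemo and splitOnCommas.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class SplitDemo {
    // A comma with optional surrounding whitespace acts as the delimiter.
    private static final Pattern COMMA = Pattern.compile("\\s*,\\s*");

    static String[] splitOnCommas(String input) {
        return COMMA.split(input);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitOnCommas("apple, banana ,cherry")));
        // [apple, banana, cherry]
    }
}
```

String.split(String regex) accepts the same regex syntax when a one-off split is enough.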

3. Replacing Text with Regex

The String class has built-in methods that replace substrings based on a regex pattern. These are:

  • String.replaceAll(String regex, String replacement): This method is used to replace all the occurrences of the regex pattern with the replacement string.
  • String.replaceFirst(String regex, String replacement): This method is used to replace only the first occurrence of the regex pattern with the replacement string.

Explanation: In the example, the replaceAll() method is first called on the input string to replace every occurrence of apple with orange. Then the replaceFirst() method is called to replace only the first occurrence of apple with orange.
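A sketch of both calls on a made-up sentence; ReplaceDemo and its helper names are hypothetical.

```java
class ReplaceDemo {
    // Replaces every occurrence of "apple".
    static String replaceEvery(String input) {
        return input.replaceAll("apple", "orange");
    }

    // Replaces only the first occurrence of "apple".
    static String replaceOnce(String input) {
        return input.replaceFirst("apple", "orange");
    }

    public static void main(String[] args) {
        String input = "apple pie and apple juice";
        System.out.println(replaceEvery(input)); // orange pie and orange juice
        System.out.println(replaceOnce(input));  // orange pie and apple juice
    }
}
```

In the replacement string, $ and \ have special meaning. Matcher.quoteReplacement() can be used when the replacement text should be taken literally.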

Tips for Working with Regular Expressions in Java

Some tips for using Java regex are as follows:

  • To improve performance, compile a regex with Pattern.compile() once and reuse the resulting Pattern, because every call has to parse and compile the whole regular expression again.
  • Keep the regex pattern simple and readable, so that people maintaining the code can understand and debug it quickly, which also reduces the risk of errors.
  • Use the predefined character classes such as \d, \w, and \s instead of writing long custom classes; they make patterns shorter, more readable, and easier to maintain.
  • Use quantifiers to state how many times an element may repeat, which lets you match single, multiple, or optional occurrences depending on your data.
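The first tip is often applied by storing the compiled pattern in a constant, as in this sketch. The five-digit code format and the name ZipValidator are hypothetical.

```java
import java.util.regex.Pattern;

class ZipValidator {
    // Compiled once when the class is loaded; a Pattern is immutable and thread-safe.
    private static final Pattern FIVE_DIGITS = Pattern.compile("\\d{5}");

    // Each call creates only a lightweight Matcher, not a new Pattern.
    static boolean isValid(String code) {
        return FIVE_DIGITS.matcher(code).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValid("12345")); // true
        System.out.println(isValid("1234"));  // false
    }
}
```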

Conclusion

Java regex is a practical tool for text processing and data manipulation. The java.util.regex package provides the Pattern and Matcher classes and the MatchResult interface, with built-in methods such as find(), start(), and end() for applying a regular expression. Combining literals, character classes, quantifiers, anchors, escaped special symbols, and flags lets you solve many problems, like validating an email address.

Java Regex – FAQs

Q1. What are Regular Expressions in Java?

Regular expressions in Java are patterns used to match and work with text. They help search, validate, and manipulate strings using specific rules.

Q2. How do I escape special characters in a regex?

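To use special characters like . or * as normal text in a regex, put a backslash before them. In a Java string literal the backslash itself must be doubled, like this: "\\." for a literal dot and "\\*" for a literal asterisk. To escape an entire string at once, use Pattern.quote().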
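As a sketch of escaping a whole string, the helper below counts literal occurrences of some text using Pattern.quote(). QuoteDemo, countLiteral, and the inputs are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuoteDemo {
    // Counts occurrences of needle in haystack, treating needle as plain text.
    static int countLiteral(String needle, String haystack) {
        Matcher matcher = Pattern.compile(Pattern.quote(needle)).matcher(haystack);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // Without quoting, the dot in "a.b" would also match the x in "axb".
        System.out.println(countLiteral("a.b", "a.b axb a.b")); // 2
    }
}
```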

Q3. How do I handle PatternSyntaxException?

If your regex has a mistake, Java throws a PatternSyntaxException. You can catch it using a try-catch block to handle errors safely.
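One possible shape for that try-catch is sketched here. Returning null on failure is just one design choice, and SafeCompileDemo and tryCompile are hypothetical names.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

class SafeCompileDemo {
    // Returns the compiled pattern, or null when the regex is malformed.
    static Pattern tryCompile(String regex) {
        try {
            return Pattern.compile(regex);
        } catch (PatternSyntaxException e) {
            System.err.println("Invalid regex '" + e.getPattern() + "': "
                    + e.getDescription() + " near index " + e.getIndex());
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(tryCompile("[a-z]+") != null); // true
        System.out.println(tryCompile("[a-z") != null);   // false, the class is never closed
    }
}
```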

Q4. What is the difference between matches(), find(), and lookingAt()?

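The matches() method succeeds only if the entire input matches the pattern. find() searches for the next substring that matches, anywhere in the input. lookingAt() requires the match to begin at the start of the input but, unlike matches(), does not require it to reach the end.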
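The three methods can be compared side by side with a small helper. MatchModesDemo and modes are hypothetical names, and the inputs are made up.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class MatchModesDemo {
    // Returns { matches(), lookingAt(), find() } for the given regex and input.
    static boolean[] modes(String regex, String input) {
        Pattern pattern = Pattern.compile(regex);
        return new boolean[] {
            pattern.matcher(input).matches(),   // whole input must match
            pattern.matcher(input).lookingAt(), // match must begin at index 0
            pattern.matcher(input).find()       // match may occur anywhere
        };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(modes("\\d+", "123")));    // [true, true, true]
        System.out.println(Arrays.toString(modes("\\d+", "123abc"))); // [false, true, true]
        System.out.println(Arrays.toString(modes("\\d+", "abc123"))); // [false, false, true]
    }
}
```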

Q5. Can I make a regex case-insensitive in Java?

Yes, you can ignore letter case in a regex by adding (?i) at the start of the pattern or using Pattern.CASE_INSENSITIVE.
