What Is a Regular Expression in Python?

Python Regular Expressions or Python RegEx are patterns that permit us to ‘match’ various string values in a variety of ways.

A pattern is simply one or more characters that represent a set of possible match characters. In regular expression matching, we use a character (or a set of characters) to represent the strings we want to match in the text.

Let us get started then.

All About Regular Expression Characters in Python

Symbol             Meaning
. (period)         Matches any single character except the newline character
^ (caret)          Matches at the start of the string
$ (dollar sign)    Matches at the end of the string (or just before a trailing newline)
* (asterisk)       Matches zero or more repetitions of the preceding expression
? (question mark)  Matches zero or one occurrence of the preceding expression
{}                 Used either as {m}, to match exactly m repetitions of the preceding expression, or as {m,n} with n > m, to match between m and n repetitions
\ (backslash)      Either escapes a special character so that it is matched literally (e.g., \* matches an asterisk) or introduces one of the special regular expression sequences
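
As a rough illustration of the symbols in the table, the sketch below runs each one through re.findall() or re.search(). The sample strings and variable names are made up for this example.

```python
import re

# . matches any single character except a newline
dot_matches = re.findall(r".", "ab\n1")            # ['a', 'b', '1']

# ^ anchors the pattern at the start, $ anchors it at the end
starts_with_abc = re.search(r"^abc", "abcdef")
ends_with_xyz = re.search(r"xyz$", "123xyz")

# * means zero or more, ? means zero or one of the preceding item
star_matches = re.findall(r"ab*", "a ab abbb")     # ['a', 'ab', 'abbb']
optional_matches = re.findall(r"colou?r", "color colour")

# {m,n} bounds the repetition count; it is greedy, so it takes up to n
bounded = re.search(r"a{2,4}", "aaaaabbb")

# \ escapes a special character so that it is matched literally
literal_dots = re.findall(r"\.", "a.b.c")

print(dot_matches)
print(starts_with_abc.group(), ends_with_xyz.group())   # abc xyz
print(star_matches, optional_matches)
print(bounded.group())                                  # aaaa
print(literal_dots)                                     # ['.', '.']
```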

How to Use the Match Function of RegEx in Python?

The match function checks whether a regular expression pattern matches at the beginning of a string, with optional flags.

Character  Input string  Pattern     Output
.          "abc123"      r"."        ['a', 'b', 'c', '1', '2', '3']
^          "abcdef"      r"^abc"     abc
$          "123xyz"      r"xyz$"     xyz
*          "aaabbb"      r"a*"       ['aaa', '', '', '', '']
?          "baab"        r"ba?"      ['ba', 'b']
{}         "aaaaabbb"    r"a{2,4}"   aaaa
\          "a.b.c"       r"\."       ['.', '.']

Outputs shown as lists are what re.findall() returns; single values are the text of the first match.

Syntax:

re.match(pattern, string, flags=0)

Here, ‘pattern’ is the regular expression to be matched, and ‘string’ is the Python string that is checked for a match at its beginning.

Example of Python regex match:

import re
print(re.match("i", "intellipaat"))
Output:
<re.Match object; span=(0, 1), match='i'>

When the pattern matches, re.match() returns a re.Match object. Its printed form shows the span (the start and end indices of the match) and the matched text. If the pattern does not match at the beginning of the string, re.match() returns None:

import re
print(re.match("b", "intellipaat"))
Output:
None
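
In practice the result of re.match() is usually tested for None before it is used. The following is a minimal sketch of that pattern; the helper name leading_word and the sample strings are hypothetical and chosen only for illustration.

```python
import re

def leading_word(text):
    """Return the run of letters at the start of text, or None if there is none."""
    m = re.match(r"[A-Za-z]+", text)
    if m is None:        # re.match() returns None when nothing matches at index 0
        return None
    return m.group()     # the matched substring

m = re.match(r"i", "intellipaat")
print(m.group())                          # i
print(m.span())                           # (0, 1)
print(leading_word("intellipaat 2025"))   # intellipaat
print(leading_word("2025 intellipaat"))   # None
```

The Match object also exposes start() and end(), which return the two halves of the span separately.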

How do you use the ‘findall()’ function of RegEx in Python?

The findall() function finds all non-overlapping occurrences of a pattern in a string and returns them as a list.

Syntax:

re.findall(pattern, string, flags=0)

Where ‘pattern’ is the regular expression to search for and ‘string’ is the input string to search in.

Example of Python regex findall:

import re
print(re.findall(r"[aeiou]", "hello world"))
Output:
['e', 'o', 'o']
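
What findall() returns depends on whether the pattern contains groups. The sketch below, using an invented sample sentence, shows the three common cases: no groups, several groups, and the optional flags argument.

```python
import re

text = "Order 12 shipped on 2024-03-15, order 7 on 2024-04-02"

# No groups: the list holds the full text of each match
numbers = re.findall(r"\d+", text)

# Several groups: the list holds one tuple of group contents per match
dates = re.findall(r"(\d{4})-(\d{2})-(\d{2})", text)

# The flags argument works as it does for re.match() and re.search()
orders = re.findall(r"order", text, flags=re.I)

print(numbers)  # ['12', '2024', '03', '15', '7', '2024', '04', '02']
print(dates)    # [('2024', '03', '15'), ('2024', '04', '02')]
print(orders)   # ['Order', 'order']
```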

Special Sequence Characters of RegEx in Python

The six most important sequence characters are:

  • \d: Matches any decimal digit. For ASCII text this is the same as writing [0-9], but it is used so often that it has its own shortcut sequence.
  • \D: Matches any character that is not a decimal digit. This is the set of all characters that are not in [0-9] and can be written as [^0-9].
  • \s: Matches any whitespace character: space, tab, newline, carriage return, form feed, or vertical tab. Basically, whitespace is what separates words in a given sentence.
  • \S: Matches any non-whitespace character. This is simply the inverse of the \s sequence mentioned above.
  • \w: Matches any alphanumeric character. This is the set of all letters and numbers in both lower and uppercase, plus the underscore.
  • \W: Matches any non-alphanumeric character. This is the inverse of the \w sequence mentioned above.
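
In a pattern, each of these sequences is written with a leading backslash (\d, \s, \w, and so on), normally inside a raw string. The following sketch applies them to an invented sample string to show what each one picks out.

```python
import re

# Sample text made up for illustration; \t is a tab character
sample = "Room 42, Floor_3\tWing B"

digits = re.findall(r"\d", sample)           # each decimal digit
whitespace = re.findall(r"\s", sample)       # spaces and the tab
words = re.findall(r"\w+", sample)           # runs of letters, digits, underscore
non_word = re.findall(r"\W", sample)         # everything \w does not match
non_space_runs = re.findall(r"\S+", sample)  # runs of non-whitespace

print(digits)          # ['4', '2', '3']
print(whitespace)      # [' ', ' ', '\t', ' ']
print(words)           # ['Room', '42', 'Floor_3', 'Wing', 'B']
print(non_word)        # [' ', ',', ' ', '\t', ' ']
print(non_space_runs)  # ['Room', '42,', 'Floor_3', 'Wing', 'B']
```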

Search Function of RegEx in Python

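The search function scans through a string and returns the first occurrence of a regular expression pattern, with optional flags. Unlike match, it does not require the pattern to match at the beginning of the string.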

Syntax:

re.search(pattern, string, flags=0)

Example of Python regex search:

import re
m = re.search('\bopen\b', 'please open the door')
print(m)
Output:
None

The search fails because, in an ordinary string literal, Python interprets '\b' as the backspace character, so the regex engine never receives the word-boundary sequence \b. Writing the pattern as a raw string (prefix r) keeps the backslashes intact:

import re
m = re.search(r'\bopen\b', 'please open the door')
print(m)
Output:
<re.Match object; span=(7, 11), match='open'>
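
The difference between match and search, and the effect of a raw string on \b, can be sketched side by side. The second sample sentence is invented for this example.

```python
import re

sentence = "please open the door"

# re.match() only tries at index 0, so "open" is not found there
at_start = re.match(r"open", sentence)

# re.search() scans forward and returns the first match anywhere
anywhere = re.search(r"open", sentence)

# In a raw string \b is two characters (backslash, b) and means "word boundary";
# in an ordinary string it collapses to a single backspace character
raw_length = len(r"\b")
plain_length = len("\b")

# \b keeps the pattern from matching inside a longer word such as "reopened"
whole_word = re.search(r"\bopen\b", "reopened, then open again")

print(at_start)                  # None
print(anywhere.span())           # (7, 11)
print(raw_length, plain_length)  # 2 1
print(whole_word.span())         # (15, 19)
```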

RegEx Replace Function in Python

The re.sub(pattern, repl, string) method replaces every match of ‘pattern’ in ‘string’ with ‘repl’ and returns the new string. The basic form uses only these first three arguments.

import re

def substitutor():
    # Replace each match of the pattern with the replacement text
    sen1 = "It is sunny outside."
    print(re.sub(r"sunny", "raining", sen1))
    sen2 = "Intellipaat Python Course"
    print(re.sub(r"Course", "Tutorial", sen2))

substitutor()
Output:
It is raining outside.
Intellipaat Python Tutorial
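
Beyond the three-argument form, re.sub() also accepts a count argument, group references in the replacement, and a function as the replacement. The sketch below shows each; the sample strings are invented for illustration.

```python
import re

text = "cat bat cat rat cat"

# count limits how many matches are replaced (0, the default, means all)
first_only = re.sub(r"cat", "dog", text, count=1)

# \1 and \2 in the replacement refer to the captured groups
swapped = re.sub(r"(\w+)@(\w+)", r"\2@\1", "user@example")

# The replacement can also be a function that receives each Match object
doubled = re.sub(r"\d+", lambda m: str(int(m.group()) * 2), "3 apples and 10 pears")

print(first_only)  # dog bat cat rat cat
print(swapped)     # example@user
print(doubled)     # 6 apples and 20 pears
```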

Python Regular Expression Modifiers (Option Flags)

The following table lists the most commonly used Python Regular Expression or Python RegEx modifiers, along with their descriptions.

Modifier  Description
re.I      Performs case-insensitive matching
re.L      Interprets words according to the current locale. This affects the alphabetic group (\w and \W), as well as the word boundary behavior (\b and \B). In Python 3 it can be used only with bytes patterns
re.M      Makes $ match the end of a line (not just the end of the string) and makes ^ match the start of any line (not just the start of the string)
re.S      Makes a period (dot) match any character, including a newline
re.U      Interprets letters according to the Unicode character set. This flag affects the behavior of \w, \W, \b, and \B. In Python 3 this is already the default for str patterns
re.X      Allows more readable (‘verbose’) regular expression syntax: whitespace inside the pattern is ignored and # starts a comment

This brings us to the end of this tutorial on regular expressions in Python. We covered what a Python regular expression is, the regular expression characters, the match, findall, search, and replace (re.sub) functions, the special sequence characters, and the Python RegEx modifiers.

About the Author

Technical Research Analyst - Full Stack Development

Kislay is a Technical Research Analyst and Full Stack Developer with expertise in building mobile applications from inception to deployment. Proficient in Android development, iOS development, HTML, CSS, JavaScript, React, Angular, MySQL, and MongoDB, he is committed to enhancing user experiences through intuitive websites and advanced mobile applications.