bing
Flat 10% & upto 50% off + 10% Cashback + Free additional Courses. Hurry up
×
UPTO
50%
OFF!
Intellipaat
Intellipaat
  • Live Instructor-led Classes
  • Expert Education
  • 24*7 Support
  • Flexible Schedule

Regular Expressions in Python

Python Regular Expressions are patterns that permit you to “match” various string values in a variety of ways. The module re provides regular expressions in Python.

A pattern is simply one or more characters that represent a set of possible match characters. In regular expression matching, you use a character (or set of characters) to represent the strings you want to match in the text.

In this module, we will learn about RegEx and cover the following topics

Learn Python in 16 hrs from experts

Regular Expression Characters In Python

Symbol Meaning
. (period)Matches any character except the newline character.
^ (caret sign)Matches the start of any string.
$ (dollar sign)Matches the end of any string.
* (asterisk)Matches zero or more repetitions of a given regular expression.
?Matches zero or one of the previous regular expressions.
{}Used as either {m} where m means to match exactly “m” instances of the previous regular expression or {m,n} where n > m, meaning to match between m and n instances of the previous regular expression.
\ (backslash)Either a special character, such as one of the other regular expression characters(i.e., \* matches an asterisk) or one of the special regular expression sequences

Wish to Learn Python? Click Here

The match Function

The match function matches Regular expression pattern to string with optional flags.

Syntax:

re.match(pattern, string, flags=0)

Where pattern is a regular expression to be matched, 2nd parameter is a string that will be searched to match pattern at the starting of the string.

Example:

import re
print re.match(“i”, “intellipaat”)

Output:

<-sre.SRE-Match object at 0x7f9cac95d78>

Python then outputs a line signifying that a new object i.e. sre.SRE type has been created. The hex number following it is the address at which it was created.

import reprint re.match(“b”, “intellipaat”)

Output:
None

Become Python Certified in 16 hrs.
CLICK HERE

Special Sequence Characters

The six most important sequence characters are:

  • \d: Matches any decimal digit. This is really the same as writing [0-9], but is done so often that it has its own shortcut sequence.
  • \D: Matches any non-decimal digit. This is the set of all characters that are not in [0-9] and can be written as [^0-9]
  • \s: Matches any white space character. White space is normally defined as a space, carriage return, tab, and non-printable character. Basically, white space is what separates words in a given sentence.
  • \S: Matches any non white space character. This is simply the inverse of the \s sequence above.
  • \w: Matches any alphanumeric character. This is the set of all letters and numbers in both lower- and uppercase.
  • \W: Matches any non-alphanumeric character. This is the inverse of the \w sequence above.

Search Function

It searches for primary occurrence of RE pattern within string with optional flags.

Syntax:

re.search(pattern, string, flags=0)

Example:

m = re.search(‘\bopen\b’, ‘please open the door’)
print m

Output
None

This output is occurred because the ‘\b’ escape sequence is treated as a special backspace character. Meta characters are those characters which include /.

import re
m = re.search(‘\\bopen\\b’, “please open the door”)
print m

Output

<-sre.SRE-Match object at 0x00A3F058>

Python Regular Expression Modifiers (Option Flags)

Following table contains the list of all Python regular expression modifiers along with their respective description.

ModifierDescription
re.IPerforms case-insensitive matching.
re.LInterprets words according to the current locale. This interpretation affects the alphabetic group (\w and \W), as well as word boundary behavior (\b and \B).
re.MMakes $ match the end of a line (not just the end of the string) and makes ^ match the start of any line (not just the start of the string).
re.SMakes a period (dot) match any character, including a newline.
re.UInterprets letters according to the Unicode character set. This flag affects the behavior of \w, \W, \b, \B.
re.XAllows “cuter” regular expression syntax.

This brings us to the end of this module. The next module highlights Python PIP. See you there!

Previous Next

Download Interview Questions asked by top MNCs in 2018?

"0 Responses on Python RegEx"

    Leave a Message

    100% Secure Payments. All major credit & debit cards accepted Or Pay by Paypal.
    top

    Sales Offer

    Sign Up or Login to view the Free Python RegEx.