I need some help writing a regex. My inputs look like the following:

this is a paragraph with<[1> in between</[1> and then there are cases ... where the<[99> number ranges from 1-100</[99>. and there are many other lines in the txt files with<[3> such tags </[3>

The required output is:

this is a paragraph with in between and then there are cases ... where the number ranges from 1-100. and there are many other lines in the txt files with such tags

I've tried this:

#!/usr/bin/python

import os, sys, re, glob

for infile in glob.glob(os.path.join(os.getcwd(), '*.txt')):
    with open(infile) as reader:
        for line in reader:
            # Remove tag number 1 only, with and without a trailing space
            line2 = line.replace('<[1> ', '')
            line = line2.replace('</[1> ', '')
            line2 = line.replace('<[1>', '')
            line = line2.replace('</[1>', '')
            print(line)

I've also tried this (but it seems like I'm using the wrong regex syntax):

line2 = line.replace('<[*> ', '')
line = line2.replace('</[*> ', '')
line2 = line.replace('<[*>', '')
line = line2.replace('</[*>', '')

I don't want to hard-code a separate replace call for every number from 1 to 99.

1 Answer

str.replace only matches literal text, so it cannot take a regex. Use re.sub instead, which removes every numbered tag in one call:

import re

line = re.sub(r"</?\[\d+>", "", line)
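
For reference, here is a minimal sketch of the same pattern applied to a shortened version of the sample text from the question. The names TAG_PATTERN and strip_tags are made up for this illustration and are not part of the original answer; the comments break the pattern down piece by piece.

```python
import re

# Hypothetical helper names, for illustration only.
# The pattern matches an opening or closing tag such as <[1> or </[99>:
#   <     literal opening angle bracket
#   /?    optional slash, so closing tags match too
#   \[    literal square bracket (escaped, because [ normally starts a character class)
#   \d+   one or more digits, so any tag number is covered
#   >     literal closing angle bracket
TAG_PATTERN = re.compile(r"</?\[\d+>")


def strip_tags(text):
    """Return text with every <[N> and </[N> tag removed."""
    return TAG_PATTERN.sub("", text)


sample = (
    "this is a paragraph with<[1> in between</[1> and then there are "
    "cases ... where the<[99> number ranges from 1-100</[99>."
)
print(strip_tags(sample))
# this is a paragraph with in between and then there are cases ... where the number ranges from 1-100.
```

Only the tags themselves are removed, so the spaces around them stay as they were in the input. Inside the file loop from the question, this would presumably replace the four replace calls with a single line = strip_tags(line).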
