
I am having difficulty getting the following regex pattern to work. For the sample text strings below, REF1 matches the entire line and the optional REF2 group is ignored, even though REF2 should match whenever "//[text]" appears in the line.

At the moment, the regex does not recognize the //[text] part and incorrectly captures the whole line as REF1. I assume this is a consequence of greedy matching. However, I could not get a non-greedy pattern to work, and lookahead/lookbehind did not appear to work either.

Any help or guidance would be greatly appreciated. I am not sure what I am missing, as I would expect my current pattern to work without issue.

^(?P<ID>[A-Z][A-Z0-9]{3})?(?P<REF1>.+)(//(?P<REF2>.+))?(\n?(?P<EXTRA>.+))?$

TEX1CNS0P5-AA//CAT-523-VID-00EOS-0

XUX PETER LAB RANDOM TEXT DM5.

TEX2BFTBSH9999SBRT2L

RATRACE201

TEX3GWS0P2-AA//D-14839048-99-3

THERE were 200 COALS IN HIS STOCKING.

Expected Matches:

  • String 1:
    • id: TEX1
    • ref1: CNS0P5-AA
    • ref2: CAT-523-VID-00EOS-0
    • extra: XUX PETER LAB RANDOM TEXT DM5.
  • String 2:
    • id: TEX2
    • ref1: BFTBSH9999SBRT2L
    • ref2: (no match, since "//" does not appear in this text)
    • extra: RATRACE201
  • String 3:
    • id: TEX3
    • ref1: GWS0P2-AA
    • ref2: D-14839048-99-3
    • extra: THERE were 200 COALS IN HIS STOCKING.
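
The behaviour described above can be reproduced with a short Python sketch. It assumes the pattern is used with Python's re module (the (?P<name>...) syntax suggests this, but it is not stated), with no flags, and that each sample is a single two-line record; the variable names are illustrative only.

```python
import re

# The pattern from the question, unchanged.
ORIGINAL = re.compile(
    r"^(?P<ID>[A-Z][A-Z0-9]{3})?(?P<REF1>.+)(//(?P<REF2>.+))?(\n?(?P<EXTRA>.+))?$"
)

# Assumption: a record is two consecutive lines joined by a single newline.
record = "TEX1CNS0P5-AA//CAT-523-VID-00EOS-0\nXUX PETER LAB RANDOM TEXT DM5."

match = ORIGINAL.match(record)
groups = match.groupdict()

# REF1 swallows the "//..." part and REF2 stays unmatched (None).
for name, value in groups.items():
    print(f"{name}: {value!r}")
```

Under these assumptions REF1 comes back as 'CNS0P5-AA//CAT-523-VID-00EOS-0' and REF2 as None, which is the unwanted result.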

1 Answer


^(?P<ID>[A-Z][A-Z0-9]{3})?(?P<REF1>[^/\n]+)(//(?P<REF2>.+))?(\n?(?P<EXTRA>.+))?$

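I have updated the pattern above. It now passes the required cases.

The issue with the original pattern is that REF1 is .+, which matches every character except line terminators, so it consumes the "//" and everything after it on that line. Because the REF2 group is optional, the engine has no reason to backtrack and give those characters up. The updated pattern restricts REF1 to [^/\n]+, so it stops at the first "/" and leaves the "//" for the REF2 group.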
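
As a quick check, here is a minimal Python sketch that runs the updated pattern against the three samples from the question. It assumes Python's re module with no flags and treats each sample as one two-line record; the parse helper and the records list are hypothetical names introduced for illustration.

```python
import re

# The updated pattern: REF1 may not contain "/" or a newline.
FIXED = re.compile(
    r"^(?P<ID>[A-Z][A-Z0-9]{3})?(?P<REF1>[^/\n]+)(//(?P<REF2>.+))?(\n?(?P<EXTRA>.+))?$"
)

# Assumption: each record is two consecutive lines joined by one newline.
records = [
    "TEX1CNS0P5-AA//CAT-523-VID-00EOS-0\nXUX PETER LAB RANDOM TEXT DM5.",
    "TEX2BFTBSH9999SBRT2L\nRATRACE201",
    "TEX3GWS0P2-AA//D-14839048-99-3\nTHERE were 200 COALS IN HIS STOCKING.",
]


def parse(record):
    """Return the named groups as a dict, or None if the record does not match."""
    match = FIXED.match(record)
    return match.groupdict() if match else None


results = [parse(record) for record in records]
for result in results:
    print(result)
```

One design consequence worth noting: because REF1 excludes every "/", a reference that legitimately contains a single slash would no longer match as a whole. If that can occur in the data, a different separator test would be needed.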
