0 votes
1 view
in RPA by (5.3k points)

I have captured the full text of a PDF file in a string called pdfText.

Next, I loop through an array of substrings to search for in the pdfText string.

One of the substrings is Invoice.

Both pdfText and the substrings I am searching for are converted to lower case.

If at least one of the substrings is found in pdfText, a boolean is set to true.

Now, I have an example where pdfText contains '...Net amount to be invoiced...'. This is the only variant of 'invoice' in the text. This of course returns true if I use

substring = "Invoice" ... pdfText.contains(substring.ToLower).

But in this case I need it to return false. I need to find only exact matches.

Another example: if pdfText contains '...This is an invoice. Please pay....Net amount to be invoiced...', the boolean should be set to true because of the first match (invoice), not because of the second non-match (invoiced).

So what I am looking for is a way to find the substring Invoice in the string pdfText and make sure that the substring is not part of a longer word such as invoiced or invoice-process. Note that invoice. (with a trailing period) should return True.

I believe this should be possible, but cannot wrap my head around it currently. I might need to use regex?

1 Answer

0 votes
by (9.5k points)

Let's implement it as follows:

  • Go to Programming >> String

  • Use the "Is Match" activity

  • Use it inside the loop

  • Refer to these settings

(image of the activity settings not available)

The RegEx is: substring+"[^a-zA-Z]" (that is, the substring followed by one character that is not a letter).
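To see roughly how this pattern behaves on the examples from the question, here is a minimal sketch. It is written in Python purely for illustration, not as the UiPath workflow itself; the helper name `is_match_answer` is hypothetical, and the `re.escape` call is an added safeguard that is not part of the original answer.

```python
import re

def is_match_answer(pdf_text: str, substring: str) -> bool:
    """Return True if the substring is followed by one non-letter character."""
    # Same construction as the answer: substring + "[^a-zA-Z]".
    # re.escape (an assumption, not in the answer) protects against
    # substrings that contain regex metacharacters such as "." or "+".
    pattern = re.escape(substring.lower()) + r"[^a-zA-Z]"
    return re.search(pattern, pdf_text.lower()) is not None

print(is_match_answer("Net amount to be invoiced", "Invoice"))        # False
print(is_match_answer("This is an invoice. Please pay.", "Invoice"))  # True
print(is_match_answer("See the invoice-process", "Invoice"))          # True
print(is_match_answer("Please pay this invoice", "Invoice"))          # False
```

Two things are worth noticing. A hyphen counts as a non-letter, so invoice-process still matches, and a substring at the very end of the text does not match because the pattern requires one more character after it. Whether either case matters depends on the PDFs being processed.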

I have declared the following variables:

(image of the variable declarations not available)
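If the stricter behaviour described in the question is needed (invoice-process must not match, while invoice at the end of the text should), one possible variant uses lookarounds instead of consuming the next character. The sketch below is in Python for illustration only; the function names `contains_whole_word` and `any_substring_found` are hypothetical, and treating the hyphen as part of a word is an assumption taken from the question's invoice-process example.

```python
import re

def contains_whole_word(pdf_text: str, substring: str) -> bool:
    """Return True if substring occurs as a standalone word in pdf_text."""
    # (?<![\w-]) : not preceded by a letter, digit, underscore, or hyphen
    # (?![\w-])  : not followed by one either; end of text is allowed
    pattern = r"(?<![\w-])" + re.escape(substring) + r"(?![\w-])"
    # IGNORECASE replaces the explicit lower-casing of both strings.
    return re.search(pattern, pdf_text, flags=re.IGNORECASE) is not None

def any_substring_found(pdf_text: str, substrings: list[str]) -> bool:
    """Mirror the loop in the question: True if at least one substring matches."""
    return any(contains_whole_word(pdf_text, s) for s in substrings)

text = "This is an invoice. Please pay....Net amount to be invoiced"
print(contains_whole_word(text, "Invoice"))                         # True
print(contains_whole_word("Net amount to be invoiced", "Invoice"))  # False
print(contains_whole_word("See the invoice-process", "Invoice"))    # False
print(contains_whole_word("Please pay this invoice", "Invoice"))    # True
```

The lookaround syntax used here is also part of .NET regular expressions, so the same pattern string should be usable in the Is Match activity, although that has not been verified against the workflow in this thread.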
