0 votes
in RPA by (5.3k points)

I have captured the full text of a PDF file in a string called pdfText.

Next I am looping through an array containing substrings to be found/searched for in the pdfText-string.

One of the substrings is Invoice.

Both pdfText and the substrings I am searching for are converted to lower case.

If at least one of the substrings is found in pdfText, a boolean is set to true.

Now, I have an example where pdfText contains '...Net amount to be invoiced...'. This is the only variant of 'invoice' in the text. This of course returns true if I use

substring = "Invoice" ... pdfText.contains(substring.ToLower).

But in this case I need it to return false. I need to find only exact matches.
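To make the problem concrete, here is a minimal sketch in Python (not the VB.NET expression used in the workflow; the variable names are only illustrative) showing why a plain substring test reports a match inside the longer word:

```python
# Illustrative values taken from the example in the question.
pdf_text = "Net amount to be invoiced".lower()
substring = "Invoice"

# A plain substring test also succeeds inside the longer word "invoiced",
# which is exactly the false positive described above.
plain_match = substring.lower() in pdf_text
print(plain_match)
```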

Another example, if the pdfText contains '...This is an invoice. Please pay....Net amount to be invoiced...' the boolean should be set to true because of the first invoice-match, but not the second invoiced-(non)match.

So what I am looking for is a way to find the substring Invoice in the string pdfText while making sure that the substring is not part of a longer word such as invoiced or invoice-process. Note that invoice. (followed by a period) should return True.

I believe this should be possible, but cannot wrap my head around it currently. I might need to use regex?

1 Answer

0 votes
by (9.5k points)

Let's implement it as follows:

  • Go to Programming > String

  • Use the “Is Match” activity

  • Use it inside the loop

  • Refer to these settings


The RegEx is: substring+"[^a-zA-Z]"
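As a rough illustration of how this pattern behaves, here is a sketch in Python rather than in the Is Match activity itself. The function names are hypothetical. The second function is not part of the answer: it is an assumed stricter variant that also treats a hyphen, a digit, or the end of the text as a boundary, which seems closer to what the question asks for (invoice-process should not match).

```python
import re

def matches_answer_pattern(text, substring):
    # The pattern from the answer: the substring followed by one non-letter.
    pattern = re.escape(substring.lower()) + "[^a-zA-Z]"
    return re.search(pattern, text.lower()) is not None

def matches_whole_word(text, substring):
    # Assumed stricter variant: no word character or hyphen directly
    # before or after the substring. Lookarounds also succeed at the
    # very start or end of the text.
    pattern = r"(?<![\w-])" + re.escape(substring.lower()) + r"(?![\w-])"
    return re.search(pattern, text.lower()) is not None

samples = [
    "This is an invoice. Please pay. Net amount to be invoiced",
    "Net amount to be invoiced",
    "Start the invoice-process",
    "Please send the invoice",
]
for sample in samples:
    print(sample, matches_answer_pattern(sample, "Invoice"),
          matches_whole_word(sample, "Invoice"))
```

With the answer's pattern, 'invoiced' is rejected and 'invoice.' is accepted, but 'invoice-process' is still accepted and 'invoice' at the very end of the text is missed, because the pattern requires one more character. If those cases matter, a lookaround-based pattern like the second one may be a better fit; the same lookaround syntax is also supported by .NET regular expressions.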

I have declared the following variables:

[Image: screenshot of the declared variables]
