KAPOW

Can Kofax Kapow perform OCR on PDF file?


knights

Posts : 6
Points : 100
Join date : 2018-09-17

Can Kofax Kapow perform OCR on PDF file?

Post by knights on Fri Dec 07, 2018 1:28 pm

I have a PDF file that is basically a scanned document containing text and a table.

I would like to ask whether Kofax Kapow can perform OCR on the image in the PDF file, so that I can use a regular expression to extract specific text (in tabular form).

Thanks in advance for your answer.

I am using Design Studio 10.3.0.2.
pavel.vraj

Posts : 86
Points : 866
Join date : 2016-11-04
Location : Prague, Czech Republic

Re: Can Kofax Kapow perform OCR on PDF file?

Post by pavel.vraj on Mon Dec 10, 2018 9:10 pm

Hi,

If the text is stored in the PDF as actual text, you don't need any OCR: just load the PDF and extract the text into a variable. Then you can use, for example, a RegEx to get what you need.
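As a rough illustration of the RegEx step (outside Kapow, in plain Python), the sketch below pulls "label: value" pairs out of text that has already been extracted from a PDF. The sample text, the field names and the `extract_fields` helper are all made up for the example; they are not taken from the original document or from any Kapow API.

```python
import re

# Hypothetical sample, standing in for text extracted from a text-based PDF.
extracted_text = """Invoice No: INV-1042
Date: 2018-12-07
Total: 199.50
"""

def extract_fields(text):
    """Return a dict of label -> value for every 'label: value' line."""
    # MULTILINE makes ^ and $ match at the start and end of each line.
    pattern = re.compile(
        r"^(?P<label>[A-Za-z ]+):\s*(?P<value>.+)$", re.MULTILINE
    )
    return {
        m.group("label").strip(): m.group("value").strip()
        for m in pattern.finditer(text)
    }

fields = extract_fields(extracted_text)
print(fields)
```

The same pattern idea should carry over to a RegEx used inside a robot, although the exact syntax supported there may differ from Python's.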

If the PDF is in image format, then with your version you can still do OCR, but only via Device Automation: you open the PDF in Reader and do OCR from the screen. This has many limitations.

There's also a third option. From version 10.3.2.x, Kapow contains full Kofax Transformation, which enables you to do OCR on any type of document and extract any data you need. Via the integrated Transformation you can then do any extraction even if the source file (and not only PDF) is in image format. This is of course a matter of licensing, but with the current 1-year free trial of Kofax RPA (the new name for Kapow) you have all features free of charge for one year.

Best regards,
Pavel Vraj

knights

Posts : 6
Points : 100
Join date : 2018-09-17

Re: Can Kofax Kapow perform OCR on PDF file?

Post by knights on Wed Dec 12, 2018 7:11 am

Dear Pavel,

Thank you for the information.

Unfortunately, I do not have access to version 10.3.2.x of Kapow.

In this case I managed to use Device Automation to extract the information I wanted, because my PDF is a scanned copy of a document (legacy equipment, so it only outputs the document via a printer instead of saving it as a PDF!).

The information that I want to extract is actually a table inside the scanned image.

Using Device Automation I managed to extract the text inside the table. The output of my Device Automation looks like this:

x x x x x x x x x x x x x x x
x x x x x x x x x x x x x x x
x x x x x x x x x x x x x x x
x x x x x x x x x x x x x x x

where each x is a string of letters and numbers.

Do you know how I can convert this text into table form and output it as Excel?
pavel.vraj

Posts : 86
Points : 866
Join date : 2016-11-04
Location : Prague, Czech Republic

Re: Can Kofax Kapow perform OCR on PDF file?

Post by pavel.vraj on Thu Dec 13, 2018 11:07 pm

Hi,

If you have a legal license for 10.3.0.2, you should be able to download the latest version, 10.3.2.1, from Kofax as well. Check it.

The other option is to register for the one-year free trial, which is open to everyone, and you will get the latest version too.

To your question: if the output contains strings separated by spaces and new lines, then you just do a loop with "For Each Text Part" and use the new line "\n" as the separator. It will then loop through all the lines, and you can separate the values in each line by space. Of course, there could be spaces inside the values, so you should find something unique, like two spaces together or similar, and do another "For Each Text Part" loop with this unique string as the separator. Saving into Excel is then easy.
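To make the two nested loops concrete, here is a minimal sketch in plain Python rather than in a robot. It assumes the values are separated by two or more spaces (the "something unique" mentioned above); the sample data and the helper names `parse_table` and `to_csv` are invented for the illustration. It writes CSV, which Excel can open; producing a real .xlsx file would need an extra library and is not shown.

```python
import csv
import io
import re

def parse_table(raw_text, separator_pattern=r" {2,}"):
    """Split raw OCR text into rows (by newline) and cells (by separator)."""
    rows = []
    # Outer loop: one iteration per line, like "For Each Text Part" on "\n".
    for line in raw_text.split("\n"):
        line = line.strip()
        if not line:
            continue  # skip empty lines
        # Inner split: two or more spaces mark a cell boundary, so a single
        # space inside a value (e.g. "Part A") is preserved.
        rows.append(re.split(separator_pattern, line))
    return rows

def to_csv(rows):
    """Serialize the rows as CSV text."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()

# Hypothetical OCR output: cells separated by two spaces.
sample = "Part A  12  OK\nPart B  7  LOW\n"
rows = parse_table(sample)
csv_text = to_csv(rows)
print(csv_text)
```

If the OCR output only ever has single spaces between cells, this separator will not work and a different unique marker has to be found, exactly as described above.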

Best regards,
Pavel

knights

Posts : 6
Points : 100
Join date : 2018-09-17

Re: Can Kofax Kapow perform OCR on PDF file?

Post by knights on Fri Dec 14, 2018 7:17 am

Dear Pavel,

Thank you for your explanation.
My Kofax Kapow license currently comes from another company (and is actually due to expire soon), and the latest version available there is 10.3.0.2 (I downloaded the installation file for Kapow from that company's website rather than from Kofax).
Yes, I will consider the one-year free trial from Kofax directly.

I will try your "For Each Text Part" loop method.

