5 posters

PDF Looping

Tejaswini H R: Posts : 17
Points : 2230
Join date : 2018-04-26

Post n°1

PDF Looping

by Tejaswini H R Wed Jul 18, 2018 7:57 pm

Hi,

The PDF has tables embedded in it. When the PDF is loaded into Kapow, the tables are not aligned and I am not able to loop through them.
Each table contains Product, Title, Quantity and Price fields.
I have used "Loop for Each Tag" to loop through these fields and fetch the next line item.
But Kapow is not able to identify the next line item.
Can somebody help me resolve this issue?

Thanks
Tejaswini H R

[Image: PDF Looping F9d82f10]


pavel.vraj: Posts : 86
Points : 2823
Join date : 2016-11-04
Location : Prague, Czech Republic

Post n°2

Re: PDF Looping

by pavel.vraj Thu Jul 19, 2018 4:24 pm

Hi,
The problem with PDF is that in most cases it doesn't have structure as we see it, because it uses streams, and these can be mixed throughout the whole file. So this is something you will never be able to handle 100% with Kapow. In most cases you are able to prepare the robot for one PDF, but then another PDF comes along whose streams are laid out a different way, and the robot won't work.
For preparing the robot, I suggest trying advanced extraction (I can't be more specific because the screenshot is not visible).
Best regards,
Pavel Vraj


leedle: Posts : 18
Points : 2122
Join date : 2018-07-31

Post n°3

Re: PDF Looping

by leedle Wed Aug 01, 2018 6:04 am

Your best bet is to extract all of the text and use JavaScript/regex to get what you need.
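
A minimal sketch of the "extract all the text, then regex" idea, written in Python rather than JavaScript so it can be run outside Kapow. The sample text, the column order (Product, Title, Quantity, Price) and the pattern are assumptions based on the fields named in the first post; a real PDF's extracted text will almost certainly need a different pattern.

```python
import re

# Hypothetical text as it might come out of a PDF; the real layout will differ.
SAMPLE_TEXT = """\
Handsets
Product Title Quantity Price
P-100 Basic Phone 2 49.99
P-200 Smart Phone Plus 1 299.00
Handsets
Product Title Quantity Price
P-300 Tablet 3 150.50
"""

# Assumed shape of one line item: product code, free-text title,
# integer quantity, price with two decimals, all separated by spaces or tabs.
LINE_ITEM = re.compile(
    r"^(?P<product>\S+)[ \t]+(?P<title>.+?)[ \t]+"
    r"(?P<quantity>\d+)[ \t]+(?P<price>\d+\.\d{2})$",
    re.MULTILINE,
)


def extract_line_items(text):
    """Return one dict per line that matches the assumed line-item shape."""
    items = []
    for match in LINE_ITEM.finditer(text):
        items.append({
            "product": match.group("product"),
            "title": match.group("title"),
            "quantity": int(match.group("quantity")),
            "price": float(match.group("price")),
        })
    return items


if __name__ == "__main__":
    for item in extract_line_items(SAMPLE_TEXT):
        print(item)
```

Heading lines such as "Handsets" and the column header row do not match the pattern, so they are skipped without any special handling.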


Shyam Kumar: Posts : 113
Points : 4142
Join date : 2013-07-05
Location : Kerala, India

Post n°4

Re: PDF Looping

by Shyam Kumar Wed Aug 01, 2018 11:10 am

Hi Tejaswini H R,

PDF extraction is a little bit difficult in general, but I think the content of the PDF mentioned above can be extracted easily.

You can implement various kinds of logic here.

What you have done is right, but it needs a few changes. In the Loop for Each Tag step mentioned above, I think you didn't use any tag pattern.
Here you can use a tag pattern. I think the heading "Handsets" is common to all the tables, so we can set this in the tag finder, and if the condition is not satisfied, try the next iteration. At the moment you have set the span tag with 36 as the first tag number. That may skip some records, because in a PDF almost everything is in a span tag.

Another method is to extract the whole HTML into a variable and use For Each Text Part to split out each table. You need to give a common word as the delimiter; here you can use "Handsets", and store the output in another variable. That variable then holds only the table content. From this output you can easily split the values, or you can load it with a Create Page action step and extract the content.
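
Outside Kapow, the delimiter-splitting idea can be sketched roughly as follows. This is plain Python, not the For Each Text Part step itself, and the sample HTML is a made-up stand-in for what a PDF page might look like once rendered as span tags.

```python
import re
from html import unescape

# Hypothetical HTML, loosely modelled on a PDF page rendered as span tags.
SAMPLE_HTML = (
    "<div><span>Invoice 42</span>"
    "<span>Handsets</span><span>P-100</span><span>Basic Phone</span>"
    "<span>2</span><span>49.99</span>"
    "<span>Handsets</span><span>P-300</span><span>Tablet</span>"
    "<span>3</span><span>150.50</span></div>"
)


def split_tables(html, delimiter="Handsets"):
    """Split the HTML on the delimiter word and return the span texts of each part."""
    # Everything before the first delimiter is not part of any table.
    parts = html.split(delimiter)[1:]
    tables = []
    for part in parts:
        # Keep the text of every complete <span>...</span> inside this part.
        cells = re.findall(r"<span[^>]*>(.*?)</span>", part, re.DOTALL)
        cleaned = [unescape(cell).strip() for cell in cells]
        tables.append([cell for cell in cleaned if cell])
    return tables


if __name__ == "__main__":
    for table in split_tables(SAMPLE_HTML):
        print(table)
```

Each inner list corresponds to one table, so the values can then be picked out by position. This only works when the delimiter word does not also occur inside the table content.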

Please try this, and if you don't get any results, please share your source/PDF.

Thank you.

Regards,
Shyam kumar


Tejaswini H R: Posts : 17
Points : 2230
Join date : 2018-04-26

Post n°5

Re: PDF Looping

by Tejaswini H R Thu Aug 09, 2018 6:05 pm

Hi Shyam,

Thanks for your suggestions.
I tried the second method and it is working fine :)
But I want to explore the first method too.

We have another requirement where the entire row needs to be extracted and merged into another PDF.
We are able to extract the contents (using the second method), and we are appending them to an existing file using the "Write File" option.
But when we open the merged PDF, we get a "corrupted PDF (51KB) file" error.

Thanks!
Tejaswini H R

[Image: PDF Looping Screen11]

Last edited by Tejaswini H R on Mon Aug 13, 2018 12:05 pm; edited 1 time in total


chrismemo: Posts : 15
Points : 2216
Join date : 2018-04-24
Location : Singapore

Post n°6

Re: PDF Looping

by chrismemo Mon Aug 13, 2018 10:33 am

Hi Teja,
One other option is to use an enterprise capture solution, such as a combination of Kofax KTA/KC + KTM, OpenText Captiva, or ABBYY FlexiCapture, which will make life easier ^^
These can OCR the document and extract the tables even when they span multiple pages.
Regards,
Chris
