How we can avoid CAPTCHA in websites

Post by Shyam Kumar on Tue Oct 27, 2015 11:53 am

Hi all,

Recently, some of the websites we work with have been blocked by captchas. All of the captchas are images. How can we overcome this problem using a Kapow robot?

These types of websites need manual effort. Some of these captchas can be overcome. How can we avoid this manual effort? Is there any possibility of avoiding this issue?

Thank you.

Shyam kumar

Re: How we can avoid CAPTCHA in websites

Post by jking on Tue Oct 27, 2015 9:06 pm

PLEASE NOTE: It may be a felony to violate a website’s terms of service and fool a CAPTCHA. The following comments are intended to be used only on websites where the designer has agreed to the site's Terms of Service and is using a Kapow Robot as an extension to the agreed-upon T.O.S.

One cannot configure a robot to work directly with a captcha image, but there is a way to work indirectly. The indirect process requires 3 steps: extract the captcha image, send the captcha image to a human to interpret and provide a response, and have the robot pick up and process the response.

The website generally passes a new image each time the page refreshes. Therefore, the robot must be configured to load the image and keep it locally, so that it is not requested a second time.

To start out, you need to get an idea of where the image is being generated from, that is, the unique url that generates the captcha values. In Design Studio, click on the image and find the tag where the image appears. Look for the source attribute on that tag, which should indicate the server from which the image came. You will need to tell the robot that any image that comes from that source should be stored in the browser.

To do this, go to the Robot Configuration Default Options. Under Page Loading, change Images to Load to Depends on URL. For URL Conditions, enter a pattern that will identify the source url mentioned earlier. The image will then load into the browser engine itself, so that when you work with the captcha image the engine will not call out to the url again, which would produce a new captcha image.
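As a rough illustration of what such a pattern might look like, here is a small Python sketch. The pattern and the URLs are made up for the example (use whatever the source attribute actually shows in Design Studio), and it assumes the URL condition behaves like a regular expression that has to match the whole url, so please verify that against your version of the product.

```python
import re

# Hypothetical pattern: any url with a "/captcha/" path segment.
# Replace it with one built from the src value seen in Design Studio.
CAPTCHA_URL_PATTERN = r".*/captcha/.*"


def is_captcha_image(url: str) -> bool:
    """Return True when the whole url matches the pattern."""
    return re.fullmatch(CAPTCHA_URL_PATTERN, url) is not None


# Made-up urls, only to show which ones the pattern would select.
for url in (
    "https://example.com/captcha/image?id=42",
    "https://example.com/static/logo.png",
):
    print(url, is_captcha_image(url))
```

Trying the pattern against a few real image urls from the page is a quick way to check that only the captcha image gets loaded.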

After the robot has been configured, when you load the page that contains a captcha image, you will be able to extract the captcha image to a variable.  Once you have extracted the captcha image, you will need to use a Save Session step to keep the current state.

The next step is to send the saved image to a human user who can provide the correct response. The robot will send the image and wait for a response from a human user. The user will provide the correct response (via email, html, website, etc.), which the robot will pick up. The robot will then use a Restore Session step, enter the correct response, and continue from there.
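The "wait for a response" part can be sketched outside of Kapow like this. It is only a sketch of the hand-off idea in Python, not Kapow functionality: it assumes the human types the answer into a plain text file, and the function name, file name, and timings are all hypothetical.

```python
import time
from pathlib import Path
from typing import Optional


def wait_for_response(
    response_file: Path,
    timeout_s: float = 300.0,
    poll_s: float = 1.0,
) -> Optional[str]:
    """Poll until the human has written an answer, or give up.

    Returns the stripped answer text, or None if nothing arrived
    before the timeout.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if response_file.exists():
            text = response_file.read_text(encoding="utf-8").strip()
            if text:  # ignore an empty or half-created file
                return text
        if time.monotonic() >= deadline:
            return None
        time.sleep(poll_s)
```

For example, `wait_for_response(Path("captcha_answer.txt"), timeout_s=600)` would wait up to ten minutes for the answer. A timeout matters here because a captcha usually expires, so the robot should stop waiting rather than submit a stale response.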

Another option would be to have a user work with the robot in Design Studio. This would involve putting a breakpoint in the robot so that the user can see the captcha image. The user would run the robot in debug to the breakpoint. When the robot gets to the breakpoint, the user would look at the State to determine the correct response and pass the response to the robot to pick up (save the response in a file the robot can open, or use a second robot to store the correct response in a database, etc.).

Once the correct response has been provided, the user would then continue running the robot in debug, and the robot would pick up the response provided and then move on.

Both solutions require manual effort, and there is really no way to avoid this. The strategy described results in a 'hybrid' robot that will automate most steps, while allowing for the manual input needed to comply with the Captcha/T.O.S.
