by jking Tue Oct 27, 2015 9:06 pm
PLEASE NOTE: It may be a felony to violate a website’s terms of service and fool a CAPTCHA. The following comments are intended to be used only on websites where the designer has agreed to the site's Terms of Service and is using a Kapow Robot as an extension to the agreed upon T.O.S.
One cannot configure a robot to work directly with a captcha image, but there is a way to work with it indirectly. The indirect process requires three steps: extract the captcha image, send the image to a human to interpret and provide a response, and have the robot pick up and process the response.
The website generally serves a new image each time the page refreshes. Therefore, the robot must be configured to load the image and keep it locally, rather than requesting it again.
To start, you need to find out where the image is being generated, that is, the unique URL that produces the captcha images. In Design Studio, click on the image and find the tag where the image appears. Look for the source (src) attribute, which should indicate the server from which the image came. You will need to tell the robot that any image that comes from that source should be stored in the browser.
To do this, go to the Robot Configuration Default Options. Under Page Loading, change Images to Load to Depends on URL. For URL Conditions, enter a pattern that will identify the source URL mentioned earlier. The image will then load into the browser engine itself, so that when you work with the captcha image the engine does not request the URL again, which would produce a new captcha image.
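As a rough illustration of what such a URL pattern does, the sketch below checks a pattern against sample URLs in Python. The host name, path, and pattern are all hypothetical, and Python's re module only approximates how Design Studio evaluates patterns, so the real pattern should still be tested in Design Studio.

```python
import re

# Hypothetical pattern: matches any URL whose path contains "/captcha/".
# Replace it with a pattern built from the src attribute you found.
CAPTCHA_URL_PATTERN = re.compile(r".*/captcha/.*")


def is_captcha_url(url: str) -> bool:
    """Return True if the whole URL matches the pattern."""
    return CAPTCHA_URL_PATTERN.fullmatch(url) is not None


# Made-up sample URLs for a quick check.
samples = [
    "https://www.example.com/captcha/image.php?id=123",
    "https://www.example.com/images/logo.png",
]
for url in samples:
    print(url, is_captcha_url(url))
```

The pattern is kept narrow on purpose: it should match the captcha source and nothing else, so that other images are not loaded unnecessarily.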
After the robot has been configured, when you load the page that contains a captcha image, you will be able to extract the captcha image to a variable. Once you have extracted the captcha image, you will need to use a Save Session step to keep the current state.
The next step is to send the saved image to a human user who can provide the correct response. The robot will send the image and wait for a response from the user. The user will provide the correct response (via email, an HTML form, a website, etc.), which the robot will pick up. The robot will then use a Restore Session step, enter the correct response, and continue from there.
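The "wait for a response" part can be pictured as a simple polling loop. The sketch below is plain Python, not a Kapow step, and it assumes the human's answer arrives as text in a file; the function name, file location, timeout, and polling interval are all hypothetical.

```python
import time
from pathlib import Path
from typing import Optional


def wait_for_response(path: Path, timeout_s: float = 300.0,
                      poll_s: float = 2.0) -> Optional[str]:
    """Poll `path` until it holds non-empty text or the timeout passes.

    Returns the stripped text, or None if nothing arrived in time.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if path.exists():
            text = path.read_text(encoding="utf-8").strip()
            if text:
                return text
        if time.monotonic() >= deadline:
            return None
        time.sleep(poll_s)
```

A caller might use wait_for_response(Path("response.txt")) and treat a None result as a signal to give up or ask again. The timeout matters because the website may expire the captcha if the answer takes too long.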
Another option would be to have a user work with the robot in Design Studio. This would involve setting a breakpoint in the robot so that the user can see the captcha image. The user would run the robot in debug mode up to the breakpoint. When the robot reaches the breakpoint, the user would look at the State to determine the correct response and pass the response to the robot to pick up (save the response in a file the robot can open, use a second robot to store the correct response in a database, etc.).
Once the correct response has been provided, the user would continue running the robot in debug mode, and the robot would pick up the response and move on.
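If the response is passed through a file, it helps to write that file in one step so the robot never reads a half-written answer. The sketch below is one way to do that in plain Python; the function name and file layout are hypothetical and not part of Kapow.

```python
import os
import tempfile
from pathlib import Path


def write_response(path: Path, answer: str) -> None:
    """Write the answer to a temp file, then rename it into place."""
    path.parent.mkdir(parents=True, exist_ok=True)
    # Create the temp file in the target folder so the rename stays
    # on the same filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(answer.strip() + "\n")
        os.replace(tmp_name, path)  # replaces any older response file
    except BaseException:
        os.unlink(tmp_name)  # clean up the temp file on failure
        raise
```

The user would call something like write_response(Path("response.txt"), "the text read from the image") before resuming the robot from the breakpoint.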
Both solutions require manual effort, and there is really no way to avoid this. The strategy described results in a 'hybrid' robot that automates most steps while allowing for the manual input needed to comply with the captcha and the T.O.S.