Do some CAPTCHAs go too far?
CAPTCHA stands for Completely Automated Public Turing test to tell Computers and Humans Apart.
In other words, an attempt at verification that a human is filling out a web form as opposed to an automated agent/bot.
Or, in other other words, a test that has become almost impossible for humans to even pass due to the increased levels of obfuscation being put into the tests themselves.
Usually a CAPTCHA is an image whose contents the user types into a text box at the end of a web form. If the user's guess is correct, the form is submitted and whatever follow-up action is supposed to happen is performed (e.g. successful signup to a mailing list, a comment posted to a blog, etc.).
The problem is that, in an effort to make these CAPTCHA images harder and harder for software to break, they have also been made very difficult for humans, the very people who are supposed to be able to read them.
Take the following image that I was presented with on Facebook, a popular social networking site, this morning:
Are you kidding me?
Obviously the second word is "mountains", but I challenge even the most competent forensic experts to tell me what the first word is supposed to be.
Despite its fallibilities, I can understand, as a technical person, the need to have technologies like this in place. But as a technical community, we need to make sure that we aren't making our products and systems impossible to use "in the name of security." Users will only accept a certain amount of inconvenience before they go find solutions that are simpler to use while still providing acceptable levels of security.
Regarding your example though, there is a project that uses CAPTCHAs to help with OCR document scanning. The name of the project escapes me, but to my knowledge it is the only one that uses the two-word method you show in this post. They take one word the OCR program is sure about (in this case likely "mountains") and one word the OCR program has no clue about (in this case... I don't blame the program. Clearly a bad scan). If a user gets the control word right ("mountains"), it "assumes" that the user also got the other word right. So you could type anything really, as long as the word after the space says "mountains".
The project then does fault checking, of course: a certain number of users have to agree on the spelling of a word before it is treated as probably correct. So even if you guess on a case like this, you will not pollute their database.
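For anyone curious, the control-word-plus-voting idea described above could be sketched roughly like this. This is only an illustration of the general scheme, not the project's actual code: the class name `TwoWordCaptcha`, its methods, the `"scan-1"` identifier, and the thresholds (`min_votes`, `min_agreement`) are all made up for the example.

```python
from collections import Counter, defaultdict


class TwoWordCaptcha:
    """Hypothetical sketch: one known control word, one unknown word."""

    def __init__(self, min_votes=3, min_agreement=0.75):
        # Assumed thresholds; a real system would choose its own.
        self.min_votes = min_votes
        self.min_agreement = min_agreement
        # unknown word id -> tally of the spellings users have typed
        self.votes = defaultdict(Counter)

    def submit(self, control_answer, control_guess, unknown_id, unknown_guess):
        """Pass the user only if the control word matches.

        The guess for the unknown word is recorded as a vote, but it is
        never checked directly, because nobody knows the answer yet.
        """
        if control_guess.strip().lower() != control_answer.strip().lower():
            return False
        self.votes[unknown_id][unknown_guess.strip().lower()] += 1
        return True

    def accepted_spelling(self, unknown_id):
        """Return the agreed spelling, or None if there is no consensus yet."""
        tally = self.votes.get(unknown_id)
        if not tally:
            return None
        total = sum(tally.values())
        if total < self.min_votes:
            return None
        word, count = tally.most_common(1)[0]
        return word if count / total >= self.min_agreement else None


captcha = TwoWordCaptcha()
# A wrong control word fails the test and casts no vote.
rejected = captcha.submit("mountains", "fountains", "scan-1", "junk")
# Correct control word: the user passes whatever they typed for the other word.
for guess in ["peas", "peas", "pear", "peas"]:
    captcha.submit("mountains", "mountains", "scan-1", guess)
consensus = captcha.accepted_spelling("scan-1")
```

With these made-up thresholds, three of the four recorded votes agree on "peas", so that spelling is accepted, and the lone "pear" guess is simply outvoted.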
Oh, also, I am pretty sure the word is peas.
