Scribblings of a TechnoBuff

Exchange IIS ASP.NET OCS Sharepoint Windows

Archive for the ‘Internet’ Category

Google in Telugu

Posted by Sujeeth on September 19, 2008

I am very happy with what Google is doing to bring new services and features to the web community. I just noticed that they have a search facility for Telugu that works very cleverly. Since Telugu is my mother tongue, I am very excited about this.

Google search in Telugu enables users to start typing in English and automatically get query suggestions in Telugu.

If you want to search for “pustakam” in Telugu, just start typing it in English, e.g. “pustak”, and Google will show the Telugu suggestions.

This makes it very easy to enter any Telugu word with an existing keyboard. Google offers the same feature for other Indian languages, converting English text to its phonetic equivalent in the target script.
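To get a feel for the idea, here is a toy sketch in Python. It is not how Google implements the feature; it only assumes a small hand-made dictionary (the hypothetical `SUGGESTIONS` table) that maps romanized spellings to Telugu script, and looks up whatever prefix has been typed so far. The `suggest` function and its `limit` parameter are likewise made up for illustration.

```python
# Hypothetical mini-dictionary: romanized spelling -> Telugu script.
# A real service would use a far larger vocabulary and a phonetic model.
SUGGESTIONS = {
    "pustakam": "పుస్తకం",
    "pustakalu": "పుస్తకాలు",
    "telugu": "తెలుగు",
}


def suggest(prefix, limit=5):
    """Return Telugu suggestions whose romanized form starts with `prefix`."""
    prefix = prefix.strip().lower()
    if not prefix:
        return []
    # Sort the matching romanized keys so the result order is stable.
    matches = sorted(key for key in SUGGESTIONS if key.startswith(prefix))
    return [SUGGESTIONS[key] for key in matches[:limit]]


if __name__ == "__main__":
    print(suggest("pustak"))
```

Typing “pustak” matches both romanized entries that begin with it, so the Telugu word for “pustakam” shows up before the word is fully typed.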

Posted in Internet | 5 Comments

Digitizing Books using CAPTCHA

Posted by Sujeeth on August 21, 2008

This morning I came across a very interesting service that prevents spam using CAPTCHAs and uses the answers to digitize books.

What is a CAPTCHA?
A CAPTCHA is a program that can generate and grade tests that humans can pass but current computer programs cannot. The term CAPTCHA (for Completely Automated Public Turing test to tell Computers and Humans Apart) was coined in 2000 by Luis von Ahn, Manuel Blum, Nicholas Hopper and John Langford of Carnegie Mellon University. At the time, they developed the first CAPTCHA to be used by Yahoo.

reCAPTCHA is a free CAPTCHA service that helps to digitize books.

reCAPTCHA is a system developed at Carnegie Mellon University which utilizes CAPTCHA to assist in the process of digitizing the text of books, while protecting websites from bots attempting to access restricted areas. More specifically, each word that cannot be read correctly by OCR is placed on an image and used as a CAPTCHA.

But if a computer can’t read such a CAPTCHA, how does the system know the correct answer to the puzzle? Here’s how: Each new word that cannot be read correctly by OCR is given to a user in conjunction with another word for which the answer is already known. The user is then asked to read both words. If they solve the one for which the answer is known, the system assumes their answer is correct for the new one. The system then gives the new image to a number of other people to determine, with higher confidence, whether the original answer was correct.
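The two-word check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not reCAPTCHA's actual code: the `UnknownWord` class, the `votes_needed` threshold and the simple lowercase comparison are all hypothetical choices made for the example.

```python
from collections import Counter


def normalize(text):
    """Compare answers case-insensitively and ignore surrounding spaces."""
    return text.strip().lower()


class UnknownWord:
    """Toy model of one word that OCR could not read (hypothetical design)."""

    def __init__(self, votes_needed=3):
        # Assumed threshold: how many agreeing users we want before trusting a reading.
        self.votes_needed = votes_needed
        self.votes = Counter()

    def submit(self, control_answer, control_guess, unknown_guess):
        """Grade one user. The user passes only if the known word is right."""
        if normalize(control_guess) != normalize(control_answer):
            return False  # failed the known word, so ignore the other reading
        self.votes[normalize(unknown_guess)] += 1
        return True

    def accepted_reading(self):
        """Return the most common reading once enough users agree, else None."""
        if not self.votes:
            return None
        word, count = self.votes.most_common(1)[0]
        return word if count >= self.votes_needed else None


if __name__ == "__main__":
    word = UnknownWord(votes_needed=2)
    word.submit("morning", "morning", "upon")   # passes, vote counted
    word.submit("between", "betwen", "open")    # fails the known word, ignored
    word.submit("window", "Window", "upon")     # passes, vote counted
    print(word.accepted_reading())
```

Only users who get the known word right contribute a vote for the unknown one, and the reading is accepted once enough of them agree.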

More info: http://recaptcha.net/learnmore.html

Get it for free.

Posted in Internet

 