
Spell check with Google

Sometimes a project needs to check data for spelling errors without relying on the user's knowledge of the language. Google can help here: the spell-checking service used by the Google Toolbar does exactly this. Unfortunately, Google does not provide an open API for it.

So, a brief description:
To check a text, send it in a POST request to https://google.com/tbproxy/spell?lang=en . To change the language, replace the value of the lang parameter with the corresponding two-letter ISO 639-1 language code. The text is wrapped in XML:
<?xml version="1.0" encoding="UTF-8"?>
<spellrequest textalreadyclipped="0" ignoredups="0" ignoredigits="1" ignoreallcaps="1">
<text></text>
</spellrequest>
ignoredups - ignore repeated words (with 0, repeats are flagged)
ignoredigits - ignore words containing digits (with 0, they are counted as errors)
ignoreallcaps - do not check words written in all caps (suggested by pointum)
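
The request described above can be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the function name build_spell_request and the Content-Type header are my own choices, not part of the service, and since the endpoint is unofficial it may not be reachable. The function only builds the request; it does not send it.

```python
import urllib.request
import xml.etree.ElementTree as ET

# Endpoint as given in the article; unofficial and possibly unavailable.
SPELL_URL = "https://google.com/tbproxy/spell"


def build_spell_request(text, lang="en"):
    """Build (but do not send) the POST request for the spell service.

    build_spell_request is a hypothetical helper name used for illustration.
    """
    root = ET.Element("spellrequest", {
        "textalreadyclipped": "0",
        "ignoredups": "0",
        "ignoredigits": "1",
        "ignoreallcaps": "1",
    })
    # ElementTree escapes &, < and > in the text for us.
    ET.SubElement(root, "text").text = text

    declaration = '<?xml version="1.0" encoding="UTF-8"?>'
    body = (declaration + ET.tostring(root, encoding="unicode")).encode("utf-8")

    return urllib.request.Request(
        f"{SPELL_URL}?lang={lang}",
        data=body,
        # The Content-Type value is an assumption; the article does not name one.
        headers={"Content-Type": "text/xml; charset=utf-8"},
        method="POST",
    )
```

To actually send it, one would pass the returned object to urllib.request.urlopen and read the response body.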

If everything is successful, we get a response like
<?xml version="1.0" encoding="UTF-8"?>
<spellresult error="0" clipped="0" charschecked="272">
<c o="27" l="13" s="0"></c>
<c o="73" l="11" s="1"></c>
<c o="190" l="11" s="1"></c>
<c o="226" l="13" s="0">-</c>
</spellresult>
Attributes of the spellresult tag:
error - whether an error occurred
charschecked - the number of characters checked
The errors found are listed as c tags, with the following attributes:
o - the offset of the start of the word in the source text
l - the length of the word
s - the confidence of the result

The c tag itself contains the suggested spellings of the word, separated by the tab character (\t).
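
As a rough sketch, a response of this shape could be parsed in Python like this. The helper name parse_spell_result and the sample response in the usage note are made up for illustration, and the code assumes that the o and l values index characters of the original string, which the article does not state explicitly.

```python
import xml.etree.ElementTree as ET


def parse_spell_result(xml_data, source_text):
    """Return a list of dicts, one per <c> tag in a spellresult document.

    parse_spell_result is a hypothetical helper name used for illustration.
    """
    root = ET.fromstring(xml_data)
    if root.get("error") != "0":
        raise ValueError("the service reported an error")

    results = []
    for c in root.findall("c"):
        offset = int(c.get("o"))
        length = int(c.get("l"))
        # Suggestions are tab-separated; the tag may also be empty.
        suggestions = [s for s in (c.text or "").split("\t") if s.strip()]
        results.append({
            # Assumes offsets count characters of source_text.
            "word": source_text[offset:offset + length],
            "offset": offset,
            "length": length,
            "confidence": int(c.get("s")),
            "suggestions": suggestions,
        })
    return results
```

For example, given the text "Ths is a tset" and an invented response containing `<c o="0" l="3" s="1">This\tThus</c>` (with a real tab between the words), the first entry would describe the word "Ths" with the suggestions "This" and "Thus".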

P.S. After this text was already written, I accidentally stumbled upon an article in Paul Welter's blog that describes essentially the same thing...


UPD: User wayly wrote a PHP class for checking text with this service; you can download it from proxysoft.ru/files/spellchecker.zip ( mirror ).

UPD2: mezhevikin suggested an AJAX solution that uses this service: orangoo.com/labs/?page_id=3

UPD3: List of Supported Languages


Source: https://habr.com/ru/post/50137/

