
Which words does Google Instant blacklist?

The authors of the publication “2600: The Hacker Quarterly” decided to compile a list of the words that Google's live search (Google Instant) refuses to handle.

With the exception of a few very specific cases, Google can be suspected of many things, but not of censorship. Yet, as we have said, there are a number of words that the search giant refuses to deal with.

We understand Google's intentions perfectly well. Its team is trying to make sure that no one sees links to pornographic or violent resources that could be upsetting (unless you really are looking for them). Asked about this a couple of weeks ago, Joanna Wright of Google replied that the restrictions were imposed to protect children.

But it is easy to see that by doing so Google is putting its own image at considerable risk: at best these gaps lead to a dead end, and at worst they will offend a particular category of scrupulous (and advanced) users who cannot work out the rules that govern Google Instant.
For example, the words "bisexual" and "lesbian" are among the forbidden ones. Type them into Google, and Google Instant will immediately stop showing suggestions in the drop-down box. You will have to press “Enter” to confirm: yes, I really do want to find something related to bisexual or lesbian love.

Why does Google block these words?

Of course, you can still search for these words in Google and find results. The only catch is that as soon as you type them, Google Instant stops showing its list of suggestions, and you have to press “Enter” yourself to see the links you need.

This is because Google Instant does not use what you have typed in the search bar to display the results. It simply reads the data collected over the years from all previous user searches, trying to predict ahead of time what you intend to type into the empty field. Exactly the same algorithm underlies autocomplete in Google Suggest, the technology used in the old, not so “instant” Google search engine.
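The prediction step described above can be sketched roughly as follows. This is a minimal illustration, not Google's implementation: the function names, the in-memory query log, and the ranking by raw frequency are all assumptions made for the example.

```python
from collections import Counter


def build_suggester(past_queries):
    """Return a suggest() function backed by a log of earlier searches.

    Hypothetical helper: real systems use far larger logs and indexes.
    """
    # Count how often each normalized query appeared in the log.
    counts = Counter(q.strip().lower() for q in past_queries)

    def suggest(prefix, limit=3):
        prefix = prefix.strip().lower()
        if not prefix:
            return []
        # Keep past queries that start with what the user has typed so far.
        matches = [(q, n) for q, n in counts.items() if q.startswith(prefix)]
        # Most frequent first; ties are broken alphabetically.
        matches.sort(key=lambda item: (-item[1], item[0]))
        return [q for q, _ in matches[:limit]]

    return suggest


# Made-up query log, for illustration only.
log = [
    "weather today", "weather tomorrow", "weather today",
    "web mail", "weather today", "web mail",
]
suggest = build_suggester(log)
print(suggest("we"))
```

Note that the suggestions depend only on what other users searched for earlier, which is why popular but unwanted completions can surface unless they are filtered.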

And if no ready-made words and phrases appear in the drop-down box after you have typed “lesbian” or “asshole”, this is not because the results themselves are blocked by internal censorship. Google is simply trying to avoid showing you offensive text that other users have searched for and found in the past, in case you are actually looking for something completely harmless. (This topic has been debated here before.)

Countless users associate the word "lesbian" with "porn" and enter phrases that children should not see. That is why Google's algorithm decides not to show you 20 links to lesbian porn sites straight away, even if those links are the most frequent ones in the algorithm's database.

When we put this question to Google itself, we received the following comment from a spokesperson:

“There are a number of reasons why you may not see suggestions for the words you are looking for on a particular topic. For example, we apply a strict filter to pornography, violence and discriminatory content. I would like to note that removing a query from autocomplete is a technologically difficult task, far from being as simple as keeping a blacklist of forbidden words and phrases.

We receive more than a billion queries daily, and therefore we take an algorithmic approach to filtering and removals, which is, of course, far from perfect (as is the search algorithm itself). But we continue to work hard to improve it, carefully reading all your wishes and objections.

Our algorithm looks not only at specific words, but also at compound queries based on them, in all of the languages represented in Google. For example, if there is a bad word in Russian, we also remove compound queries that include it, including its transliteration into Latin script. In addition, we look at the search results themselves. For example, if the results appear to be pornographic, our algorithm may remove the query from autocomplete even if the query itself does not violate the rules above. Today our system is, of course, not ideal and is not as fast as we would like, but we are constantly working to improve it.”
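The three checks mentioned in the comment (blocked words inside compound queries, transliterated forms, and queries whose results look explicit) can be sketched as below. Everything here is an assumption for illustration: the placeholder blocked word, the tiny transliteration table, and the `flagged_queries` set standing in for a result-classification step are hypothetical, and the real system is not public.

```python
import re

# Hypothetical, deliberately tiny Cyrillic-to-Latin table (illustration only).
CYR_TO_LAT = {"п": "p", "л": "l", "о": "o", "х": "h"}


def transliterate(word):
    """Map each known Cyrillic letter to Latin; leave other characters as-is."""
    return "".join(CYR_TO_LAT.get(ch, ch) for ch in word)


def blocked_forms(terms):
    """Expand each blocked term with its transliterated spelling."""
    forms = set()
    for term in terms:
        term = term.lower()
        forms.add(term)
        forms.add(transliterate(term))
    return forms


def allow_suggestion(query, forms, flagged_queries=frozenset()):
    """Return False if the query should be kept out of autocomplete."""
    q = query.lower()
    # Compound queries: reject if any single word is a blocked form.
    if any(token in forms for token in re.findall(r"\w+", q)):
        return False
    # Result-based check: flagged_queries stands in for queries whose
    # results were judged explicit by some separate (assumed) classifier.
    if q in flagged_queries:
        return False
    return True


# "плохо" is a harmless placeholder standing in for a blocked word.
forms = blocked_forms({"плохо"})
print(allow_suggestion("ploho video", forms))
print(allow_suggestion("weather today", forms))
```

Even this toy version shows why ordinary words end up suppressed: a single matching token is enough to remove the whole compound query from the suggestions.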

The highly efficient SafeSearch algorithm is still active in Google Instant. It filters out, quite effectively, potentially offensive content that may appear after the user presses “Enter”. For example, the first page of results for the query “lesbian” at the moderate protection level turned out to be completely harmless.

Yes, Google's current implementation is far from perfect, and company representatives confirm this. At the very least, we would like to be able to manually adjust the settings for some common concepts and words that are banned only because they are sometimes associated with sexual, violent or discriminatory content.

Google representatives say that they are constantly working to improve their system, but they do not give the slightest indication of what changes we can expect in the future. So for now, if you need to, you can check the full list of censored words on the “2600: The Hacker Quarterly” website.

Source: https://habr.com/ru/post/105513/
