
Kana-captcha for PHP - it's easy!


In this post I will briefly explain what a kana-captcha is, how it works, and how to build one in PHP.



Katakana


Japanese writing uses two syllabaries (katakana and hiragana) and characters of Chinese origin (kanji). By the way, the pronunciation of kanji has little in common with that of Chinese characters: they were brought to Japan at the beginning of our era and developed in their own way (man'yōgana). There are a great many kanji; knowing 2-4 thousand of them is considered the upper limit of education for a Japanese person. The syllabaries are simpler: hiragana (which can be used to write anything) has 46 basic characters in modern use, and katakana (used mainly for loanwords, for example モニタ (mo-ni-ta), "monitor") has the same number.
A syllabary is so called because each character stands for a syllable made of one consonant and one vowel (the exceptions are the bare vowels and the consonant "n", which have characters of their own). Hiragana is considered the main one (in some cases a hiragana reading is printed above kanji for the convenience of less educated readers), and katakana is secondary. Katakana is what we will look at today.

We will look at it because katakana characters are the simplest to write, so it will be easier for us, 外人 (foreigners), to recognize and enter them. By the way, it is katakana that Japanese children study first of all, for the same reason.

A small addition: the katakana in the picture at the beginning of the post reads "チューリングテスト" (chu-u-ri-n-gu-te-su-to), which means "Turing test". "Turing" is written in katakana because it is a foreign proper name; "test" is written in katakana because it is a loanword.

Development


In the meantime, let us recall what a kana-captcha is for us, 馬鹿外人. It is an image containing katakana or hiragana characters, and their transcription has to be typed into the input field, simply because not every 外人 has support for entering these characters on their system.

I will take my own script as a basis and modify it only slightly. To begin with, let us define what our kana-captcha will look like: two or three black katakana characters on a white background, plus intersecting black lines.
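
As a rough sketch of that design (this is not the script from this post; the function name, canvas size and number of lines are assumptions made purely for illustration), a GD canvas with a white background and crossing black lines could be prepared roughly like this:

```php
<?php
// Illustrative sketch only: make_blank_captcha() and its defaults are
// hypothetical and do not come from the original script.
function make_blank_captcha(int $width = 150, int $height = 80, int $lines = 5)
{
    // A truecolor canvas starts out black, so paint it white first.
    $im = imagecreatetruecolor($width, $height);
    $white = imagecolorallocate($im, 255, 255, 255);
    $black = imagecolorallocate($im, 0, 0, 0);
    imagefilledrectangle($im, 0, 0, $width - 1, $height - 1, $white);

    // Each line runs from the left edge to the right edge at random heights,
    // so the lines tend to cross each other and the text drawn later.
    for ($i = 0; $i < $lines; $i++) {
        imageline($im, 0, rand(0, $height - 1), $width - 1, rand(0, $height - 1), $black);
    }

    return $im;
}
```

The returned image can then be handed to imagettftext() for the characters and to imagepng() for output. The GD extension has to be enabled for any of this to work.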

First, I want to strip everything unnecessary out of my script: I remove the eye-straining background and make the text and lines black. We get this:


Now it is time to put our katakana into the picture. First of all, let's choose a font. For the Japanese syllabaries I think the best option is MS Gothic, the nicest-looking font. To my surprise, PHP handled the .TTC font without any trouble; honestly, I expected a lot of pain with this kind of TrueType font.

Next, the script that generates the captcha code needs to be modified. First, let's feed it new characters, namely: ア イ ウ エ オ カ キ ク ケ コ サ シ ス セ ソ タ チ ツ テ ト ナ ニ ネ ヌ ノ ハ ヒ フ ヘ ホ マ ミ ム メ モ ヤ ユ ヨ ラ リ ル レ ロ ワ ン. These are all basic katakana characters. The generation function will look like this:

    function generate_code() {
        // Basic katakana characters the captcha can be built from.
        $chars = 'アイウエオカキクケコサシスセソタチツテトナニネヌノハヒフヘホマミムメモヤユヨラリルレロワン';
        $length = rand(2, 3); // two or three characters per captcha
        $numChars = mb_strlen($chars, "UTF-8");
        $str = '';
        for ($i = 0; $i < $length; $i++) {
            // Pick one random character; mb_substr() offsets are zero-based.
            $str .= mb_substr($chars, rand(1, $numChars) - 1, 1, "UTF-8");
        }
        return $str;
    }

Note! I use multi-byte variants of the strlen and substr functions.
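
To show why the multibyte variants matter, here is a small sketch (the sample string and variable names are only an illustration, and the file is assumed to be saved as UTF-8). Each katakana character takes three bytes in UTF-8, so the byte-oriented functions miscount the length and cut characters apart:

```php
<?php
$sample = 'アイウ'; // three katakana characters, illustrative only

// strlen() counts bytes: 3 characters x 3 bytes each.
$byteLength = strlen($sample);

// mb_strlen() counts characters when the encoding is given.
$charLength = mb_strlen($sample, 'UTF-8');

// substr() returns a single byte from the middle of the first character,
// which is not valid UTF-8 on its own.
$brokenPiece = substr($sample, 1, 1);

// mb_substr() returns the whole second character.
$secondChar = mb_substr($sample, 1, 1, 'UTF-8');

echo $byteLength, ' ', $charLength, ' ', $secondChar, PHP_EOL; // prints "9 3 イ"
```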

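Next, let's slightly randomize the character positions in the drawing script and switch it to the multibyte functions as well:

    // imagettftext() expects a color identifier, not the string "000000".
    $black = imagecolorallocate($im, 0, 0, 0);
    $codeLength = mb_strlen($code, "UTF-8");
    $x = rand(0, 35); // random horizontal offset of the first character
    for ($i = 0; $i < $codeLength; $i++) {
        $x += 27; // step between characters
        $letter = mb_substr($code, $i, 1, "UTF-8");
        // Slight random tilt (3-4 degrees) and baseline (54-55 px) for each character.
        imagettftext($im, $font_arr[$n]["size"], rand(3, 4), $x, rand(54, 55), $black, img_dir.$font_arr[$n]["fname"], $letter);
    }
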
As a result, we get this:


Almost done. I checked the captcha by entering the katakana characters exactly as shown. But what should 馬鹿外人 do if they have no support for entering Japanese characters? Validation needs to pass with the transcription as well. For that we need a whole function:

    function kanatoroma($str) {
        // Katakana characters and, at the same positions, their romaji transcriptions.
        $replace_of = array('ア','イ','ウ','エ','オ','カ','キ','ク','ケ','コ',
            'サ','シ','ス','セ','ソ','タ','チ','ツ','テ','ト','ナ','ニ','ネ','ヌ',
            'ノ','ハ','ヒ','フ','ヘ','ホ','マ','ミ','ム','メ','モ','ヤ','ユ','ヨ',
            'ラ','リ','ル','レ','ロ','ワ','ン');
        $replace_by = array('a','i','u','e','o','ka','ki','ku',
            'ke','ko','sa','shi','su','se','so','ta','chi','tsu','te',
            'to','na','ni','ne','nu','no','ha','hi','fu','he','ho','ma',
            'mi','mu','me','mo','ya','yu','yo','ra','ri','ru','re','ro','wa','n');
        $_result = str_replace($replace_of, $replace_by, $str);
        return $_result;
    }

With this wonderful function, we save the correct captcha solution to the session in romaji form.
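
The check on form submission could then look something like the sketch below. It is only an illustration under assumptions: the helper names, the shortened kana table and the 'captcha_answer' key are hypothetical, and a plain array stands in for $_SESSION so that the snippet runs from the command line.

```php
<?php
// Hypothetical helper with a shortened table; the full table would list all kana.
function demo_kana_to_romaji(string $str): string
{
    $map = ['ア' => 'a', 'カ' => 'ka', 'シ' => 'shi', 'ツ' => 'tsu', 'ン' => 'n'];
    return strtr($str, $map);
}

// Accepts either the kana themselves or their transcription,
// ignoring letter case and surrounding whitespace.
function demo_check_answer(string $input, string $expectedRomaji): bool
{
    $normalized = strtolower(trim(demo_kana_to_romaji($input)));
    return $normalized !== '' && $normalized === $expectedRomaji;
}

// When the image is generated: remember the romaji form of the code.
$session = []; // stands in for $_SESSION in this sketch
$session['captcha_answer'] = demo_kana_to_romaji('カシ');

// When the form is submitted: compare the input with the stored answer.
var_dump(demo_check_answer(' KaShi ', $session['captcha_answer'])); // bool(true)
var_dump(demo_check_answer('カシ', $session['captcha_answer']));     // bool(true)
var_dump(demo_check_answer('kasu', $session['captcha_answer']));    // bool(false)
```

Converting the input through the same table means a visitor who can type kana and one who types romaji both pass the same comparison.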
That's all :) Kana-Captcha is ready!
Test your katakana knowledge
Cheat sheet

By the way, I registered this domain specifically so that in the future I could host all kinds of goodies on it, which I will publish on Habr :)

And yes, this is not my last post about captchas; the most interesting part is yet to come :)

Source: https://habr.com/ru/post/121029/

