
We translate books using Yandex.API

Why you might need it


I once decided to read a book in its author's native language, English. After a few pages it became clear that this would take a long time: for every unfamiliar word I had to reach for a dictionary, find the right page, and strain my eyes on many other words similar to the one I was looking for... and, on top of that, carry an extra five hundred pages around with me. So I decided to compile a small dictionary just for this book, one I could consult in any situation, even without Internet access.

What will come of it


The output is an ordinary .txt file in which the rare words from the book are listed in alphabetical order, one per line, each followed by its translation. Such a dictionary can easily be embedded, for example, in a MIDlet for a mobile phone, or directly on a website.
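As an illustration of embedding the dictionary on a site, here is a minimal JavaScript sketch that parses the file into a lookup table. It assumes the `word|translation` line format that handler.php writes; the helper name `parseDictionary` and the sample entries are made up for this example.

```javascript
// Parse dictionary text ("word|translation" per line) into a Map.
// parseDictionary is a hypothetical helper, not part of the original scripts.
function parseDictionary(text) {
  const dict = new Map();
  for (const line of text.split("\n")) {
    const sep = line.indexOf("|");
    if (sep === -1) continue; // skip blank or malformed lines
    dict.set(line.slice(0, sep), line.slice(sep + 1).trim());
  }
  return dict;
}

// Made-up sample entries, only to show the lookup.
const sample = "abyss|бездна\nsailor|моряк\n";
const dict = parseDictionary(sample);
console.log(dict.get("sailor"));
```

A Map keeps lookups cheap even for several thousand entries, and the whole file can be loaded once and then used offline.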

Why Yandex.API


With Yandex everything is simple: send a word, get the translation back. There is no need to register a unique key, as with Google Translate, and it does not complain about a large number of requests.
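The request and response can be sketched in JavaScript as follows. The endpoint, the `lang` and `text` parameters, and the `text` array in the response are taken from handler.php in this article; the helper names `buildTranslateUrl` and `extractTranslation` are hypothetical, and the sketch makes no network call, so it does not check how the service actually answers.

```javascript
// Endpoint copied from handler.php in this article.
const ENDPOINT = "http://translate.yandex.net/api/v1/tr.json/translate";

// Build the GET URL for one word; lang is the direction, e.g. "en-ru".
function buildTranslateUrl(word, lang = "en-ru") {
  return ENDPOINT +
    "?lang=" + encodeURIComponent(lang) +
    "&text=" + encodeURIComponent(word);
}

// Pull the translation out of a decoded JSON response.
// Assumes the shape handler.php relies on: { text: ["..."] }.
function extractTranslation(response) {
  if (!response || !Array.isArray(response.text) || response.text.length === 0) {
    return null; // nothing usable came back
  }
  return response.text[0];
}

console.log(buildTranslateUrl("forecastle"));
```

Encoding the word before appending it to the URL matters for anything containing spaces or non-ASCII characters.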

Implementation


To accomplish this, the following files will be created:

handler.php

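<?php
set_time_limit(0);         # no time limit: the run takes many minutes
ignore_user_abort(true);   # keep running after the browser disconnects
fopen('flag', 'x');        # create the flag file; translation runs while it exists
@unlink('translated.txt'); # remove the result of a previous run, if any

$text = file_get_contents('martin_eden.txt'); # read the book (martin_eden.txt)
$symbols = array('!', ',', '.', '\'', '"', '-', ':', ';', '?', "\r", '(', ')');
$text = str_replace($symbols, '', $text); # strip punctuation
$text = str_replace("\n", ' ', $text);    # turn line breaks into spaces
$text_array = explode(' ', $text);        # split the text into words

$words = array();
foreach ($text_array as $val) { # count how often each word occurs
    if ($val == '') { continue; }
    $val = strtolower($val);
    if (array_key_exists($val, $words)) { # seen before: increment the counter
        $words[$val]++;
    } else { # first occurrence
        $words[$val] = 1;
    }
}
ksort($words); # sort the words alphabetically

$rare_words = array();
foreach ($words as $w => $v) { # keep only the rare words (1-5 occurrences)
    if ($v <= 5) {
        $rare_words[$w] = $v;
    }
}

$w_total = sizeof($rare_words); # number of words to translate, for the progress display
$src = fopen('total.txt', 'w');
fwrite($src, $w_total);
fclose($src);

$src_trns = fopen('translated.txt', 'a'); # translations are appended to this file
$cnt = 0;
foreach ($rare_words as $w => $v) { # translate the rare words one by one
    if (!file_exists('flag')) {
        die(); # stop.php has removed the flag: stop translating
    }
    /* Query translate.yandex with a GET request:
       lang is the translation direction, text is the word to translate. */
    $arr = json_decode(file_get_contents('http://translate.yandex.net/api/v1/tr.json/translate?lang=en-ru&text=' . urlencode($w)), true);
    if ($w != $arr['text'][0]) { # skip words that came back untranslated
        fwrite($src_trns, $w . '|' . $arr['text'][0] . "\n"); # write "word|translation"
    }
    $cnt++; # one more word processed
    $src = fopen('current.txt', 'w'); # record the progress for index.html
    fwrite($src, $cnt);
    fclose($src);
}
fclose($src_trns);
unlink('flag'); # all done: remove the flag
?>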
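The word-selection step of handler.php can be tried in isolation. Below is a rough JavaScript re-statement of it, not the author's code: it strips the same punctuation characters, splits on spaces, lowercases, counts, and keeps the words that occur at most five times. The function name `rareWords` and its `maxCount` parameter are hypothetical, and JavaScript's default string sort may order some keys differently from PHP's `ksort`.

```javascript
// Return the words occurring at most maxCount times, sorted alphabetically.
// Follows the same steps as handler.php; rareWords is a hypothetical helper.
function rareWords(text, maxCount = 5) {
  const counts = new Map();
  const cleaned = text
    .replace(/[!,.'"\-:;?\r()]/g, "") // strip the same punctuation as $symbols
    .replace(/\n/g, " ");             // line breaks become spaces
  for (const raw of cleaned.split(" ")) {
    if (raw === "") continue;         // skip empty pieces from double spaces
    const word = raw.toLowerCase();
    counts.set(word, (counts.get(word) || 0) + 1);
  }
  return [...counts.entries()]
    .filter(([, n]) => n <= maxCount)
    .map(([word]) => word)
    .sort();
}

console.log(rareWords("The cat. The dog!\nthe bird", 2));
```

With the threshold lowered to 2, the sample call drops "the" (three occurrences) and keeps the three words that appear once.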


stop.php

<?php
@unlink('flag'); # remove the flag so that handler.php stops before its next request
?>


index.html

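<!DOCTYPE HTML>
<html>
<head>
<!-- Page encoding (UTF-8) -->
<meta http-equiv="content-type" content="text/html; charset=UTF-8">
<!-- Load jQuery (used for the AJAX requests and for updating the page) -->
<script src="http://ajax.googleapis.com/ajax/libs/jquery/1.4.2/jquery.min.js" type="text/javascript"></script>
<script>
var current = 0; /* number of words processed so far */

/* Refresh the progress display */
function refresh(){
    /* Read the number of words already processed */
    $.post("current.txt", function(data) { current = data; });
    /* Read the total and show "current / total" in #status */
    $.post("total.txt", function(data) { $('#status').html(current+' / '+data); });
    /* Repeat in one second */
    setTimeout(function(){ refresh(); }, 1000);
}
/* Stop the translation */
function stop(){
    $.post("stop.php"); /* request stop.php, which removes the flag */
}
/* Start the translation */
function start(){
    $.post("handler.php"); /* request handler.php, which runs the translation */
}
$(document).ready(function(){
    refresh(); /* start polling as soon as the page has loaded */
});
</script>
</head>
<body>
<input type="button" value="Stop" onClick="stop();">
<input type="button" value="Start" onClick="start();">
<div id="status"></div>
</body>
</html>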


Results


With this toy reinvention of the wheel I translated 5982 words in 1033 seconds (an average of about 5.8 words per second). That is relatively slow, partly because I did not devise any way to speed up the translation (ideally, several requests could be sent at the same time, but we don't want to offend Yandex).
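For the curious, sending several requests at once could look roughly like the JavaScript sketch below. It is not something the original scripts do: words are split into small batches, the requests inside a batch run concurrently, and batches run one after another. `chunk`, `translateAll`, the `batchSize` default, and the caller-supplied `translateOne` function are all hypothetical names for this illustration.

```javascript
// Split a list into consecutive batches of at most `size` items.
function chunk(items, size) {
  const batches = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Translate words batch by batch. translateOne is a caller-supplied async
// function (hypothetical here) that resolves to the translation of one word.
// Returns [word, translation] pairs in the original word order.
async function translateAll(words, translateOne, batchSize = 4) {
  const result = [];
  for (const batch of chunk(words, batchSize)) {
    // All requests of this batch are in flight at the same time.
    const translated = await Promise.all(batch.map((w) => translateOne(w)));
    batch.forEach((w, i) => result.push([w, translated[i]]));
  }
  return result;
}
```

Keeping `batchSize` small caps the number of simultaneous requests, which is one way to stay polite toward the service.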

This approach can be used to translate the rare words from a book or article when what matters is the resulting translation, not the process. If you run it on a remote server, the translation will continue even when your own computer is turned off.

Source: https://habr.com/ru/post/155675/

