
More about Google encodings

An earlier post here already raised the issue of encodings in Google services, but it dealt with garbled text in the user agreements. In one of my projects, I ran into encoding problems while working with one of the Google APIs. The spicy part is that the problem arose with an undocumented API, so I did not really want to draw attention to myself by contacting support. Searching online turned up no solutions (suggestions along the lines of “retry until it works” I did not take seriously). So how did I manage to find a way out and solve everything myself?
First about the project:
In my spare time I develop translator apps for mobile phones on the J2ME, BlackBerry, and Android platforms. At some point, on the forums where the program is discussed, users started complaining about a puzzling bug. At random, instead of the translated text, they received some kind of "hieroglyphs". This happened in roughly one out of every 5-10 translations, and then might not bother a person at all for several days. There was no clear geographic pattern (complaints came from the CIS countries, Latin America, Asia, and Europe). The only thing the complaints had in common was the phone model. The application has a built-in logger, and its contents can be emailed to me with one click of a button. I made small changes so that the translation results were written to the log as well. Users occasionally sent their logs, but these did not help me understand what was going on.
Getting acquainted with the bug:
The problem stayed unsolved until a Samsung C3510 Corby fell into my hands. Having installed the application on it, I discovered that on this phone the translation came back as “hieroglyphs” in 100 cases out of 100. OK, problems with Cyrillic are well known. Imagine my surprise when even a translation from English to French led to the same result. Now that was unusual.
So what the hell is going on there:
After thoroughly tormenting the translator, I emailed myself the log and went on to study it on the PC.
A few points turned out to be interesting:
- special characters (colon, parentheses, and so on) arrived correctly;
- Cyrillic arrived garbled;
- Latin text arrived garbled as well;
- setting the User-Agent header had no effect on the result;
- specifying UTF-8 as the encoding in the body of the POST request helped only partially: English text became readable.
The conclusion was that the service was responding in a non-standard encoding, and not an ASCII-based one at that, since otherwise the English text would have arrived intact. In addition, the bug was somehow tied to a specific phone model.
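The POST body mentioned in the last observation is an ordinary form-encoded string. The sketch below shows one way such a body could be assembled, percent-encoding the UTF-8 bytes of the text by hand (CLDC has no `URLEncoder`). The parameter names `sl`, `tl`, `ie`, `client`, and `text` are taken from the listing further down; the class and method names are hypothetical, and this is not the application's own `encodeSequence`, which is not shown in the article.

```java
import java.io.UnsupportedEncodingException;

class FormBody {
    private static final String HEX = "0123456789ABCDEF";

    /** Percent-encodes the UTF-8 bytes of text (hypothetical helper). */
    static String percentEncodeUtf8(String text) {
        byte[] utf8;
        try {
            utf8 = text.getBytes("UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new RuntimeException("UTF-8 is not supported");
        }
        StringBuffer out = new StringBuffer();
        for (int i = 0; i < utf8.length; i++) {
            int b = utf8[i] & 0xFF; // treat the byte as unsigned
            boolean unreserved = (b >= 'a' && b <= 'z') || (b >= 'A' && b <= 'Z')
                    || (b >= '0' && b <= '9')
                    || b == '-' || b == '_' || b == '.' || b == '~';
            if (unreserved) {
                out.append((char) b);
            } else {
                out.append('%').append(HEX.charAt(b >> 4)).append(HEX.charAt(b & 0x0F));
            }
        }
        return out.toString();
    }

    /** Builds the form body with the input encoding declared as UTF-8. */
    static String buildBody(String from, String to, String text) {
        return "sl=" + from + "&tl=" + to + "&ie=UTF-8&client=t&text=" + percentEncodeUtf8(text);
    }

    public static void main(String[] args) {
        // "privet mir" written in Cyrillic, given as Unicode escapes
        String phrase = "\u043F\u0440\u0438\u0432\u0435\u0442 \u043C\u0438\u0440";
        System.out.println(buildBody("ru", "en", phrase));
    }
}
```

Every Cyrillic letter occupies two bytes in UTF-8, so each one turns into two `%XX` groups in the body, while the space becomes `%20`.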
// Response as first received (Cyrillic and the transliteration are both garbled):
[[["R\u0457S\u0402ReR\u0406R\u03BCS , R\u0458ReS\u0402","ò\u0457ó\u0402ò£ò\u0406ò\u00B5ó\u201A ò\u0458ò£ó\u0402","","R\u00ED\u0308S\u0110R\u00EBR\u00CDR\u00B5S\u201A R\u01F0R\u00EBS\u0110"]],,"ru",,[["R\u0457S\u0402ReR\u0406R\u03BCS",[5],1,0,1000,0,1,0],[",",[6],0,0,1000,1,2,0],["R\u0458ReS\u0402",[7],1,0,1000,2,3,0]],[["ò\u0457ó\u0402òÅò\u0406ò\u03BCó",5,[["R\u0457S\u0402ReR\u0406R\u03BCS",1000,1,0]],[[0,11]],"ò\u0457ó\u0402ò£ò\u0406ò\u00B5ó\u201A ò\u0458ò£ó\u0402"],[",",6,[[",",1000,0,0]],[[11,12]],""],["ò\u0458òÅó\u0402",7,[["R\u0458ReS\u0402",1000,1,0]],[[13,19]],""]],,,[["uk","ru"]],3]
// With UTF-8 specified in the POST body (English is readable, Cyrillic is not):
[[["hello world","ÐÒÉ×ÅÔ ÍÉÒ","","privet mir"]],,"ru",,[["hello world",[5],1,0,954,0,2,0]],[["ÐÒÉ×ÅÔ ÍÉÒ",5,[["hello world",954,1,0],["a hello world",0,1,0]],[[0,10]],"ÐÒÉ×ÅÔ ÍÉÒ"]],,,[["ru"]],23]
// Correctly decoded response:
[[["hello world","привет мир","","privet mir"]],,"ru",,[["hello world",[5],1,0,954,0,2,0]],[["привет мир",5,[["hello world",954,1,0],["a hello world",0,1,0]],[[0,10]],"привет мир"]],,,[["ru"]],1]
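The string "ÐÒÉ×ÅÔ ÍÉÒ" in the response above is what KOI8-R bytes look like when they are read as ISO-8859-1. The sketch below reproduces the effect on a desktop JVM, assuming the translated phrase was "привет мир" (inferred from the transliteration "privet mir" in the same response) and that the JVM ships the KOI8-R charset, as common desktop JDKs do. The class and method names are hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class MojibakeDemo {
    /** Encodes text with one charset, then decodes the same bytes with another. */
    static String reinterpret(String text, Charset encodeWith, Charset decodeWith) {
        return new String(text.getBytes(encodeWith), decodeWith);
    }

    public static void main(String[] args) {
        Charset koi8r = Charset.forName("KOI8-R");
        // "privet mir" written in Cyrillic, given as Unicode escapes
        String original = "\u043F\u0440\u0438\u0432\u0435\u0442 \u043C\u0438\u0440";

        // KOI8-R bytes displayed as if they were Latin-1
        String garbled = reinterpret(original, koi8r, StandardCharsets.ISO_8859_1);
        // The same ten characters that appear in the response dump
        String fromDump = "\u00D0\u00D2\u00C9\u00D7\u00C5\u00D4 \u00CD\u00C9\u00D2";
        System.out.println(garbled.equals(fromDump)); // prints "true"

        // Reversing the misreading restores the text
        String repaired = reinterpret(garbled, StandardCharsets.ISO_8859_1, koi8r);
        System.out.println(repaired.equals(original)); // prints "true"
    }
}
```

Because both KOI8-R and ISO-8859-1 are single-byte encodings, the misreading loses nothing and can be undone, which is what makes a lookup-table decoder sufficient on the phone.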


How to solve:
The phones supported only a handful of encodings out of the box (UTF-8, ISO 8859-1, and a couple more if you were lucky), so I had to write “manual” decoding of a byte array into text for each encoding I needed. The test application translated "Hello World" and looped through all the encodings, printing the decoded text to the console. CP1251, ISO-8859-7, and the rest predictably did not live up to expectations, but KOI8-R produced the correct text (as it turned out, this comment was prophetic). On the other phones, standard UTF-8 is detected.
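On a desktop JVM, where the charsets are built in, the same trial-and-error loop can be sketched in a few lines. This is only an illustration of the idea, not the phone code: the class name, the candidate list, and the sample bytes are assumptions made for the example, and charsets missing from the JVM are simply skipped.

```java
import java.nio.charset.Charset;

class CharsetProbe {
    // Candidate encodings to try, in order (an assumed list for illustration)
    static final String[] CANDIDATES = {
        "windows-1251", "KOI8-R", "windows-1257", "ISO-8859-1", "ISO-8859-2", "UTF-8"
    };

    /**
     * Decodes the bytes with each candidate and returns the first charset name
     * under which the decoded text contains the known sample, or null if none does.
     */
    static String probe(byte[] response, String sample) {
        for (String name : CANDIDATES) {
            if (!Charset.isSupported(name)) {
                continue; // this JVM does not provide the charset
            }
            String decoded = new String(response, Charset.forName(name));
            System.out.println(name + " -> " + decoded);
            if (decoded.contains(sample)) {
                return name;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // "privet mir" written in Cyrillic, given as Unicode escapes
        String sample = "\u043F\u0440\u0438\u0432\u0435\u0442 \u043C\u0438\u0440";
        // Simulated server reply: the sample wrapped in brackets, encoded as KOI8-R
        byte[] captured = ("[[[\"" + sample + "\"]]]").getBytes(Charset.forName("KOI8-R"));
        System.out.println("detected: " + probe(captured, sample));
    }
}
```

The trick only works because the phrase sent for translation is known in advance and is echoed back in the response, so a plain substring check is enough to recognize the right decoding.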

For those who love technical details.
import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UTFDataFormatException;
import java.io.UnsupportedEncodingException;
import javax.microedition.io.Connector;
import javax.microedition.io.HttpConnection;

/**
 * J2ME: sends a known phrase for translation and checks which charset
 * decodes the reply so that the phrase is found in it.
 * query, resp, resp2, encodeSequence() and Utils.toLowerCase() are defined
 * elsewhere in the application.
 */
public static String detectEncoding() {
    try {
        // "привет мир" (inferred from the "privet mir" transliteration in the sample response)
        String sentence = "\u043F\u0440\u0438\u0432\u0435\u0442 \u043C\u0438\u0440";
        String qq = encodeSequence(sentence);
        HttpConnection net = (HttpConnection) Connector.open(query, Connector.READ_WRITE, true);
        net.setRequestMethod(HttpConnection.POST);
        net.setRequestProperty("Host", "translate.google.com");
        net.setRequestProperty("User-Agent", "Opera/9.64");
        net.setRequestProperty("Referer", "translate.google.com");
        net.setRequestProperty("Content-Type", "application/x-www-form-urlencoded");
        net.setRequestProperty("Accept", "*/*");
        net.setRequestProperty("Proxy-Connection", "close");
        net.setRequestProperty("Connection", "Keep-Alive");
        net.setRequestProperty("Accept-Charset", "utf-8");

        // Derive Accept-Language from the device locale, e.g. "ru-RU" -> "ru".
        String locale = System.getProperty("microedition.locale");
        String l = "en";
        if (locale != null) {
            // Some devices report "ru_RU"; the original applied this replace to l by mistake.
            locale = locale.replace('_', '-');
            if (!locale.startsWith("zh-")) {
                int dash = locale.indexOf('-');
                l = (dash == -1) ? locale : locale.substring(0, dash);
                l = Utils.toLowerCase(l).trim();
            } else {
                l = locale; // keep the region for Chinese variants
            }
        }
        net.setRequestProperty("Accept-Language", l);

        OutputStream output = net.openOutputStream();
        output.write(("sl=" + "ru" + "&tl=" + "en" + "&ie=UTF-8&client=t&text=" + qq).getBytes());
        output.close();
        resp = net.getResponseCode();
        resp2 = net.getResponseMessage();
        if (resp == HttpConnection.HTTP_OK) {
            // Read the raw response bytes without decoding them.
            InputStream is = net.openInputStream();
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            int b;
            while ((b = is.read()) >= 0) {
                out.write(b);
            }
            out.flush();
            is.close();
            net.close();
            byte[] buff = out.toByteArray();
            String enc = detectEncoding(buff, sentence);
            if (!enc.equals("")) {
                return enc;
            }
        } else {
            net.close();
            throw new Exception("Invalid ResponseCode " + resp + " " + resp2);
        }
    } catch (Exception e) {
        System.out.println("#### " + e.toString());
    }
    return "UTF-8"; // default when detection fails
}

public static String[] charsets = new String[]{
    "WINDOWS-1251", "KOI8-R", "WINDOWS-1257", "ISO-8859-1", "ISO-8859-2", "UTF-8", "UNICODE"};

// Each map holds the 128 characters for byte values 0x80..0xFF, 16 per row.
protected static char[] cp1251map = (
    "\u0402\u0403\u201A\u0453\u201E\u2026\u2020\u2021\u20AC\u2030\u0409\u2039\u040A\u040C\u040B\u040F"
    + "\u0452\u2018\u2019\u201C\u201D\u2022\u2013\u2014\uFFFD\u2122\u0459\u203A\u045A\u045C\u045B\u045F"
    + "\u00A0\u040E\u045E\u0408\u00A4\u0490\u00A6\u00A7\u0401\u00A9\u0404\u00AB\u00AC\u00AD\u00AE\u0407"
    + "\u00B0\u00B1\u0406\u0456\u0491\u00B5\u00B6\u00B7\u0451\u2116\u0454\u00BB\u0458\u0405\u0455\u0457"
    + "\u0410\u0411\u0412\u0413\u0414\u0415\u0416\u0417\u0418\u0419\u041A\u041B\u041C\u041D\u041E\u041F"
    + "\u0420\u0421\u0422\u0423\u0424\u0425\u0426\u0427\u0428\u0429\u042A\u042B\u042C\u042D\u042E\u042F"
    + "\u0430\u0431\u0432\u0433\u0434\u0435\u0436\u0437\u0438\u0439\u043A\u043B\u043C\u043D\u043E\u043F"
    + "\u0440\u0441\u0442\u0443\u0444\u0445\u0446\u0447\u0448\u0449\u044A\u044B\u044C\u044D\u044E\u044F"
    ).toCharArray();

protected static char[] cp1257map = (
    "\u20AC\0\u201A\0\u201E\u2026\u2020\u2021\0\u2030\0\u2039\0\250\u02C7\270"
    + "\0\u2018\u2019\u201C\u201D\u2022\u2013\u2014\0\u2122\0\u203A\0\257\u02DB\0"
    + "\240\0\242\243\244\0\246\247\330\251\u0156\253\254\255\256\306"
    + "\260\261\262\263\264\265\266\267\370\271\u0157\273\274\275\276\346"
    + "\u0104\u012E\u0100\u0106\304\305\u0118\u0112\u010C\311\u0179\u0116\u0122\u0136\u012A\u013B"
    + "\u0160\u0143\u0145\323\u014C\325\326\327\u0172\u0141\u015A\u016A\334\u017B\u017D\337"
    + "\u0105\u012F\u0101\u0107\344\345\u0119\u0113\u010D\351\u017A\u0117\u0123\u0137\u012B\u013C"
    + "\u0161\u0144\u0146\363\u014D\365\366\367\u0173\u0142\u015B\u016B\374\u017C\u017E\u02D9"
    ).toCharArray();

protected static char[] iso8859_2map = (
    "\200\201\202\203\204\205\206\207\210\211\212\213\214\215\216\217"
    + "\220\221\222\223\224\225\226\227\230\231\232\233\234\235\236\237"
    + "\240\u0104\u02D8\u0141\244\u013D\u015A\247\250\u0160\u015E\u0164\u0179\255\u017D\u017B"
    + "\260\u0105\u02DB\u0142\264\u013E\u015B\u02C7\270\u0161\u015F\u0165\u017A\u02DD\u017E\u017C"
    + "\u0154\301\302\u0102\304\u0139\u0106\307\u010C\311\u0118\313\u011A\315\316\u010E"
    + "\u0110\u0143\u0147\323\324\u0150\326\327\u0158\u016E\332\u0170\334\335\u0162\337"
    + "\u0155\341\342\u0103\344\u013A\u0107\347\u010D\351\u0119\353\u011B\355\356\u010F"
    + "\u0111\u0144\u0148\363\364\u0151\366\367\u0159\u016F\372\u0171\374\375\u0163\u02D9"
    ).toCharArray();

protected static char[] koi8rmap = (
    "\u2500\u2502\u250C\u2510\u2514\u2518\u251C\u2524\u252C\u2534\u253C\u2580\u2584\u2588\u258C\u2590"
    + "\u2591\u2592\u2593\u2320\u25A0\u2219\u221A\u2248\u2264\u2265\u00A0\u2321\u00B0\u00B2\u00B7\u00F7"
    + "\u2550\u2551\u2552\u0451\u2553\u2554\u2555\u2556\u2557\u2558\u2559\u255A\u255B\u255C\u255D\u255E"
    + "\u255F\u2560\u2561\u0401\u2562\u2563\u2564\u2565\u2566\u2567\u2568\u2569\u256A\u256B\u256C\u00A9"
    + "\u044E\u0430\u0431\u0446\u0434\u0435\u0444\u0433\u0445\u0438\u0439\u043A\u043B\u043C\u043D\u043E"
    + "\u043F\u044F\u0440\u0441\u0442\u0443\u0436\u0432\u044C\u044B\u0437\u0448\u044D\u0449\u0447\u044A"
    + "\u042E\u0410\u0411\u0426\u0414\u0415\u0424\u0413\u0425\u0418\u0419\u041A\u041B\u041C\u041D\u041E"
    + "\u041F\u042F\u0420\u0421\u0422\u0423\u0416\u0412\u042C\u042B\u0417\u0428\u042D\u0429\u0427\u042A"
    ).toCharArray();

/** Returns the first charset under which the decoded bytes contain the example text, or "". */
public static String detectEncoding(byte[] bytes, String example) {
    for (int i = 0; i < charsets.length; i++) {
        String ss = byteArrayToString(bytes, charsets[i]);
        if (ss.indexOf(example) != -1) {
            return charsets[i];
        }
    }
    return "";
}

/** Decodes bytes with the named charset, using the lookup tables where the platform may lack it. */
public static String byteArrayToString(byte[] bytes, String charSet) {
    String output;
    char[] map = null;
    if (charSet.equalsIgnoreCase("WINDOWS-1251") || charSet.equalsIgnoreCase("WINDOWS1251")
            || charSet.equalsIgnoreCase("WIN1251") || charSet.equalsIgnoreCase("CP1251")) {
        map = cp1251map;
    } else if (charSet.equalsIgnoreCase("KOI8-R")) {
        map = koi8rmap;
    } else if (charSet.equalsIgnoreCase("WINDOWS-1257")) {
        map = cp1257map;
    } else if (charSet.equalsIgnoreCase("ISO-8859-1")) {
        // ISO-8859-1 maps every byte to the code point with the same value, so no
        // table is needed (the table in the original listing held Cyrillic by mistake).
        char[] latin1 = new char[bytes.length];
        for (int i = 0; i < bytes.length; i++) {
            latin1[i] = (char) (bytes[i] & 0xFF);
        }
        return new String(latin1);
    } else if (charSet.equalsIgnoreCase("ISO-8859-2")) {
        map = iso8859_2map;
    } else if (charSet.equalsIgnoreCase("UTF-8")) {
        try {
            return decodeUTF8(bytes, false);
        } catch (Exception udfe) {
            // Not valid UTF-8: fall back to CP1251 below.
        }
        map = cp1251map;
    }
    if (map != null) {
        char[] chars = new char[bytes.length];
        for (int i = 0; i < bytes.length; i++) {
            byte b = bytes[i];
            // Non-negative bytes are ASCII; negative ones (0x80..0xFF) index the table.
            chars[i] = (b >= 0) ? (char) b : map[b + 128];
        }
        output = new String(chars);
    } else {
        try {
            output = new String(bytes, charSet);
        } catch (UnsupportedEncodingException e) {
            output = new String(bytes);
        }
    }
    return output;
}

/**
 * Decodes 1- to 3-byte UTF-8 sequences; longer sequences are rejected.
 * When gracious is true, malformed input yields "?" instead of an exception.
 */
private static String decodeUTF8(byte[] data, boolean gracious) throws UTFDataFormatException {
    byte a, b, c;
    StringBuffer ret = new StringBuffer();
    for (int i = 0; i < data.length; i++) {
        try {
            a = data[i];
            if ((a & 0x80) == 0) {
                ret.append((char) a);
            } else if ((a & 0xe0) == 0xc0) {
                b = data[i + 1];
                if ((b & 0xc0) == 0x80) {
                    ret.append((char) (((a & 0x1F) << 6) | (b & 0x3F)));
                    i++;
                } else {
                    throw new UTFDataFormatException("Illegal 2-byte group");
                }
            } else if ((a & 0xf0) == 0xe0) {
                b = data[i + 1];
                c = data[i + 2];
                if (((b & 0xc0) == 0x80) && ((c & 0xc0) == 0x80)) {
                    ret.append((char) (((a & 0x0F) << 12) | ((b & 0x3F) << 6) | (c & 0x3F)));
                    i += 2;
                } else {
                    throw new UTFDataFormatException("Illegal 3-byte group");
                }
            } else if (((a & 0xf0) == 0xf0) || ((a & 0xc0) == 0x80)) {
                throw new UTFDataFormatException("Illegal first byte of a group");
            }
        } catch (UTFDataFormatException udfe) {
            if (gracious) {
                ret.append("?");
            } else {
                throw udfe;
            }
        } catch (ArrayIndexOutOfBoundsException aioobe) {
            if (gracious) {
                ret.append("?");
            } else {
                throw new UTFDataFormatException("Unexpected EOF");
            }
        }
    }
    return ret.toString();
}
(char) b : map[b + 128]; } output = new String(chars); } else { try { output = new String(bytes, charSet); } catch (UnsupportedEncodingException e) { output = new String(bytes); } } return output; } private static String decodeUTF8(byte[] data, boolean gracious) throws UTFDataFormatException { byte a, b, c; StringBuffer ret = new StringBuffer(); for (int i = 0; i < data.length; i++) { try { a = data[i]; if ((a & 0x80) == 0) { ret.append((char) a); } else if ((a & 0xe0) == 0xc0) { b = data[i + 1]; if ((b & 0xc0) == 0x80) { ret.append((char) (((a & 0x1F) << 6) | (b & 0x3F))); i++; } else { throw new UTFDataFormatException("Illegal 2-byte group"); } } else if ((a & 0xf0) == 0xe0) { b = data[i + 1]; c = data[i + 2]; if (((b & 0xc0) == 0x80) && ((c & 0xc0) == 0x80)) { ret.append((char) (((a & 0x0F) << 12) | ((b & 0x3F) << 6) | (c & 0x3F))); i += 2; } else { throw new UTFDataFormatException("Illegal 3-byte group"); } } else if (((a & 0xf0) == 0xf0) || ((a & 0xc0) == 0x80)) { throw new UTFDataFormatException( "Illegal first byte of a group"); } } catch (UTFDataFormatException udfe) { if (gracious) { ret.append("?"); } else { throw udfe; } } catch (ArrayIndexOutOfBoundsException aioobe) { if (gracious) { ret.append("?"); } else { throw new UTFDataFormatException("Unexpected EOF"); } } } data = null; return ret.toString(); } /** * * * */ /** *****j2me ****** **/ public static String detectEncoding() { try { String sentence = " "; String qq = encodeSequence(sentence); HttpConnection net = (HttpConnection) Connector.open(query , Connector.READ_WRITE, true); net.setRequestMethod(HttpConnection.POST); net.setRequestProperty("Host", "translate.google.com"); net.setRequestProperty("User-Agent", "Opera/9.64"); net.setRequestProperty("Referer", "translate.google.com"); net.setRequestProperty("Content-Type", "application/x-www-form-urlencoded"); net.setRequestProperty("Accept", "*/*"); net.setRequestProperty("Proxy-Connection", "close"); net.setRequestProperty("Connection",
"Keep-Alive"); net.setRequestProperty("Accept-Charset", "utf-8"); String locale = System.getProperty("microedition.locale"); String l = "en"; if (!locale.startsWith("zh-")) { if (locale.indexOf('-') == -1) { l = locale; } else { l = l.replace('_', '-'); l = locale.substring(0, locale.indexOf('-')); } l = Utils.toLowerCase(l).trim(); } else { l = locale; } net.setRequestProperty("Accept-Language", l); OutputStream output = net.openOutputStream(); output.write(("sl=" + "ru" + "&tl=" + "en" + "&ie=UTF-8&client=t&text=" + qq) .getBytes()); output.close(); resp = net.getResponseCode(); resp2 = net.getResponseMessage(); if (resp == HttpConnection.HTTP_OK) { InputStream is = net.openInputStream(); ByteArrayOutputStream out = new ByteArrayOutputStream(); int b = 1; while ((b = is.read()) >= 0) { out.write(b); } out.flush(); is.close(); net.close(); byte[] buff = out.toByteArray(); String enc = detectEncoding(buff, sentence); if (!enc.equals("")) { return (enc); } } else { net.close(); throw new Exception("Invalid ResponseCode " + resp + " " + resp2); } } catch (Exception e) { System.out.println("#### " + e.toString()); } return ("UTF-8"); } public static String[] charsets = new String[]{"WINDOWS-1251", "KOI8-R", "WINDOWS-1257", "ISO-8859-1", "ISO-8859-2", "UTF-8", "UNICODE"}; protected static char[] iso8859_1map = "\u0402\u0403\u201a\u201e\u201e\u2026\u2020\u2021\u20ac\u2030\u0409\u2039\u040a\u040c\u040b\u040f\u0452\u2018\u2019\u201c\u201d\u2022\u2013\u2014\u2122\u0459\u203a\u045a\u045c\u045b\u045f 
\u040e\u045e\u0408\u00a4\u0490\u00a6\u00a7\u0401\u00a9\u0404\u00ab\u00ac\u00ad\u00ae\u0407\u00b0Z\u00b1\u0406\u0456\u0491\u00b5\u00b6\u00b7\u0451\u2116\u0454\u00bb\u0458\u0405\u0455\u0457\u0410\u0411\u0412\u0413\u0414\u0415\u0416\u0417\u0418\u0419\u041a\u041b\u041c\u041d\u041e\u041f\u0420\u0421\u0422\u0423\u0424\u0425\u0426\u0427\u0428\u0429\u042c\u042b\u042a\u042d\u042e\u042f\u0430\u0431\u0432\u0433\u0434\u0435\u0436\u0437\u0438\u0439\u043a\u043b\u043c\u043d\u043e\u043f\u0440\u0441\u0442\u0443\u0444\u0445\u0446\u0447\u0448\u0449\u044a\u044b\u044c\u044d\u044e\u044f".toCharArray(); protected static char[] cp1251map = "\u0402\u0403\u201A\u0453\u201E\u2026\u2020\u2021\u20AC\u2030\u0409\u2039\u040A\u040C\u040B\u040F\u0452\u2018\u2019\u201C\u201D\u2022\u2013\u2014\uFFFD\u2122\u0459\u203A\u045A\u045C\u045B\u045F\u00A0\u040E\u045E\u0408\u00A4\u0490\u00A6\u00A7\u0401\u00A9\u0404\u00AB\u00AC\u00AD\u00AE\u0407\u00B0\u00B1\u0406\u0456\u0491\u00B5\u00B6\u00B7\u0451\u2116\u0454\u00BB\u0458\u0405\u0455\u0457\u0410\u0411\u0412\u0413\u0414\u0415\u0416\u0417\u0418\u0419\u041A\u041B\u041C\u041D\u041E\u041F\u0420\u0421\u0422\u0423\u0424\u0425\u0426\u0427\u0428\u0429\u042A\u042B\u042C\u042D\u042E\u042F\u0430\u0431\u0432\u0433\u0434\u0435\u0436\u0437\u0438\u0439\u043A\u043B\u043C\u043D\u043E\u043F\u0440\u0441\u0442\u0443\u0444\u0445\u0446\u0447\u0448\u0449\u044A\u044B\u044C\u044D\u044E\u044F" .toCharArray(); protected static char[] cp1257map = 
"\u20AC\0\u201A\0\u201E\u2026\u2020\u2021\0\u2030\0\u2039\0\250\u02C7\270\0\u2018\u2019\u201C\u201D\u2022\u2013\u2014\0\u2122\0\u203A\0\257\u02DB\0\240\0\242\243\244\0\246\247\330\251\u0156\253\254\255\256\306\260\261\262\263\264\265\266\267\370\271\u0157\273\274\275\276\346\u0104\u012E\u0100\u0106\304\305\u0118\u0112\u010C\311\u0179\u0116\u0122\u0136\u012A\u013B\u0160\u0143\u0145\323\u014C\325\326\327\u0172\u0141\u015A\u016A\334\u017B\u017D\337\u0105\u012F\u0101\u0107\344\345\u0119\u0113\u010D\351\u017A\u0117\u0123\u0137\u012B\u013C\u0161\u0144\u0146\363\u014D\365\366\367\u0173\u0142\u015B\u016B\374\u017C\u017E\u02D9" .toCharArray(); protected static char[] iso8859_2map = "\200\201\202\203\204\205\206\207\210\211\212\213\214\215\216\217\220\221\222\223\224\225\226\227\230\231\232\233\234\235\236\237\240\u0104\u02D8\u0141\244\u013D\u015A\247\250\u0160\u015E\u0164\u0179\255\u017D\u017B\260\u0105\u02DB\u0142\264\u013E\u015B\u02C7\270\u0161\u015F\u0165\u017A\u02DD\u017E\u017C\u0154\301\302\u0102\304\u0139\u0106\307\u010C\311\u0118\313\u011A\315\316\u010E\u0110\u0143\u0147\323\324\u0150\326\327\u0158\u016E\332\u0170\334\335\u0162\337\u0155\341\342\u0103\344\u013A\u0107\347\u010D\351\u0119\353\u011B\355\356\u010F\u0111\u0144\u0148\363\364\u0151\366\367\u0159\u016F\372\u0171\374\375\u0163\u02D9" .toCharArray(); protected static char[] koi8rmap = 
            "\u2500\u2502\u250C\u2510\u2514\u2518\u251C\u2524\u252C\u2534\u253C\u2580\u2584\u2588\u258C\u2590\u2591\u2592\u2593\u2320\u25A0\u2219\u221A\u2248\u2264\u2265\u00A0\u2321\u00B0\u00B2\u00B7\u00F7\u2550\u2551\u2552\u0451\u2553\u2554\u2555\u2556\u2557\u2558\u2559\u255A\u255B\u255C\u255D\u255E\u255F\u2560\u2561\u0401\u2562\u2563\u2564\u2565\u2566\u2567\u2568\u2569\u256A\u256B\u256C\u00A9\u044E\u0430\u0431\u0446\u0434\u0435\u0444\u0433\u0445\u0438\u0439\u043A\u043B\u043C\u043D\u043E\u043F\u044F\u0440\u0441\u0442\u0443\u0436\u0432\u044C\u044B\u0437\u0448\u044D\u0449\u0447\u044A\u042E\u0410\u0411\u0426\u0414\u0415\u0424\u0413\u0425\u0418\u0419\u041A\u041B\u041C\u041D\u041E\u041F\u042F\u0420\u0421\u0422\u0423\u0416\u0412\u042C\u042B\u0417\u0428\u042D\u0429\u0427\u042A"
            .toCharArray();

    public static String detectEncoding(byte[] bytes, String exemple) {
        // Decode the response with each candidate charset and return the first
        // one in which the known sample text appears.
        for (int i = 0; i < charsets.length; i++) {
            String ss = byteArrayToString(bytes, charsets[i]);
            if (ss.indexOf(exemple) != -1) {
                return charsets[i];
            }
        }
        return "";
    }

    public static String byteArrayToString(byte[] bytes, String charSet) {
        String output;
        char[] map = null;
        // Pick a lookup table for the single-byte charsets handled by hand.
        if (charSet.equalsIgnoreCase("WINDOWS-1251")
                || charSet.equalsIgnoreCase("WINDOWS1251")
                || charSet.equalsIgnoreCase("WIN1251")
                || charSet.equalsIgnoreCase("CP1251")) {
            map = cp1251map;
        } else if (charSet.equalsIgnoreCase("KOI8-R")) {
            map = koi8rmap;
        } else if (charSet.equalsIgnoreCase("WINDOWS-1257")) {
            map = cp1257map;
        } else if (charSet.equalsIgnoreCase("ISO-8859-1")) {
            map = iso8859_1map;
        } else if (charSet.equalsIgnoreCase("ISO-8859-2")) {
            map = iso8859_2map;
        } else if (charSet.equalsIgnoreCase("UTF-8")) {
            try {
                return (decodeUTF8(bytes, false));
            } catch (Exception udfe) {
                // Malformed UTF-8: fall through and decode as Windows-1251.
            }
            map = cp1251map;
        }
        if (map != null) {
            // Bytes 0x00-0x7F are ASCII; bytes 0x80-0xFF index into the table.
            char[] chars = new char[bytes.length];
            for (int i = 0; i < bytes.length; i++) {
                byte b = bytes[i];
                chars[i] = (b >= 0) ?
                        (char) b : map[b + 128];
            }
            output = new String(chars);
        } else {
            try {
                output = new String(bytes, charSet);
            } catch (UnsupportedEncodingException e) {
                output = new String(bytes);
            }
        }
        return output;
    }

    private static String decodeUTF8(byte[] data, boolean gracious) throws UTFDataFormatException {
        byte a, b, c;
        StringBuffer ret = new StringBuffer();
        for (int i = 0; i < data.length; i++) {
            try {
                a = data[i];
                if ((a & 0x80) == 0) {
                    ret.append((char) a);
                } else if ((a & 0xe0) == 0xc0) {
                    b = data[i + 1];
                    if ((b & 0xc0) == 0x80) {
                        ret.append((char) (((a & 0x1F) << 6) | (b & 0x3F)));
                        i++;
                    } else {
                        throw new UTFDataFormatException("Illegal 2-byte group");
                    }
                } else if ((a & 0xf0) == 0xe0) {
                    b = data[i + 1];
                    c = data[i + 2];
                    if (((b & 0xc0) == 0x80) && ((c & 0xc0) == 0x80)) {
                        ret.append((char) (((a & 0x0F) << 12) | ((b & 0x3F) << 6) | (c & 0x3F)));
                        i += 2;
                    } else {
                        throw new UTFDataFormatException("Illegal 3-byte group");
                    }
                } else if (((a & 0xf0) == 0xf0) || ((a & 0xc0) == 0x80)) {
                    throw new UTFDataFormatException(
                            "Illegal first byte of a group");
                }
            } catch (UTFDataFormatException udfe) {
                if (gracious) {
                    ret.append("?");
                } else {
                    throw udfe;
                }
            } catch (ArrayIndexOutOfBoundsException aioobe) {
                if (gracious) {
                    ret.append("?");
                } else {
                    throw new UTFDataFormatException("Unexpected EOF");
                }
            }
        }
        data = null;
        return ret.toString();
    }

    /** * * * */

    /** *****j2me ****** **/
    public static String detectEncoding() {
        try {
            String sentence = " ";
            String qq = encodeSequence(sentence);
            HttpConnection net = (HttpConnection) Connector.open(query , Connector.READ_WRITE, true);
            net.setRequestMethod(HttpConnection.POST);
            net.setRequestProperty("Host", "translate.google.com");
            net.setRequestProperty("User-Agent", "Opera/9.64");
            net.setRequestProperty("Referer", "translate.google.com");
            net.setRequestProperty("Content-Type",
                    "application/x-www-form-urlencoded");
            net.setRequestProperty("Accept", "*/*");
            net.setRequestProperty("Proxy-Connection", "close");
            net.setRequestProperty("Connection", "Keep-Alive");
            net.setRequestProperty("Accept-Charset", "utf-8");
            // microedition.locale may be null on some devices; fall back to "en".
            String locale = System.getProperty("microedition.locale");
            String l = "en";
            if (locale != null) {
                // Normalize "en_US" to "en-US" before inspecting the locale.
                // (The original called replace() on l, which had no effect.)
                locale = locale.replace('_', '-');
                if (!locale.startsWith("zh-")) {
                    // Keep only the language part, e.g. "en-US" -> "en".
                    if (locale.indexOf('-') == -1) {
                        l = locale;
                    } else {
                        l = locale.substring(0, locale.indexOf('-'));
                    }
                    l = Utils.toLowerCase(l).trim();
                } else {
                    // Chinese variants are sent in full.
                    l = locale;
                }
            }
            net.setRequestProperty("Accept-Language", l);
            OutputStream output = net.openOutputStream();
            output.write(("sl=" + "ru" + "&tl=" + "en" + "&ie=UTF-8&client=t&text=" + qq)
                    .getBytes());
            output.close();
            resp = net.getResponseCode();
            resp2 = net.getResponseMessage();
            if (resp == HttpConnection.HTTP_OK) {
                // Read the raw response bytes; the charset is detected afterwards.
                InputStream is = net.openInputStream();
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                int b = 1;
                while ((b = is.read()) >= 0) {
                    out.write(b);
                }
                out.flush();
                is.close();
                net.close();
                byte[] buff = out.toByteArray();
                String enc = detectEncoding(buff, sentence);
                if (!enc.equals("")) {
                    return (enc);
                }
            } else {
                net.close();
                throw new Exception("Invalid ResponseCode " + resp + " " + resp2);
            }
        } catch (Exception e) {
            System.out.println("#### " + e.toString());
        }
        return ("UTF-8");
    }

    public static String[] charsets = new String[]{"WINDOWS-1251", "KOI8-R", "WINDOWS-1257", "ISO-8859-1", "ISO-8859-2", "UTF-8", "UNICODE"};

    protected static char[] iso8859_1map = "\u0402\u0403\u201a\u201e\u201e\u2026\u2020\u2021\u20ac\u2030\u0409\u2039\u040a\u040c\u040b\u040f\u0452\u2018\u2019\u201c\u201d\u2022\u2013\u2014\u2122\u0459\u203a\u045a\u045c\u045b\u045f 
\u040e\u045e\u0408\u00a4\u0490\u00a6\u00a7\u0401\u00a9\u0404\u00ab\u00ac\u00ad\u00ae\u0407\u00b0Z\u00b1\u0406\u0456\u0491\u00b5\u00b6\u00b7\u0451\u2116\u0454\u00bb\u0458\u0405\u0455\u0457\u0410\u0411\u0412\u0413\u0414\u0415\u0416\u0417\u0418\u0419\u041a\u041b\u041c\u041d\u041e\u041f\u0420\u0421\u0422\u0423\u0424\u0425\u0426\u0427\u0428\u0429\u042c\u042b\u042a\u042d\u042e\u042f\u0430\u0431\u0432\u0433\u0434\u0435\u0436\u0437\u0438\u0439\u043a\u043b\u043c\u043d\u043e\u043f\u0440\u0441\u0442\u0443\u0444\u0445\u0446\u0447\u0448\u0449\u044a\u044b\u044c\u044d\u044e\u044f".toCharArray(); protected static char[] cp1251map = "\u0402\u0403\u201A\u0453\u201E\u2026\u2020\u2021\u20AC\u2030\u0409\u2039\u040A\u040C\u040B\u040F\u0452\u2018\u2019\u201C\u201D\u2022\u2013\u2014\uFFFD\u2122\u0459\u203A\u045A\u045C\u045B\u045F\u00A0\u040E\u045E\u0408\u00A4\u0490\u00A6\u00A7\u0401\u00A9\u0404\u00AB\u00AC\u00AD\u00AE\u0407\u00B0\u00B1\u0406\u0456\u0491\u00B5\u00B6\u00B7\u0451\u2116\u0454\u00BB\u0458\u0405\u0455\u0457\u0410\u0411\u0412\u0413\u0414\u0415\u0416\u0417\u0418\u0419\u041A\u041B\u041C\u041D\u041E\u041F\u0420\u0421\u0422\u0423\u0424\u0425\u0426\u0427\u0428\u0429\u042A\u042B\u042C\u042D\u042E\u042F\u0430\u0431\u0432\u0433\u0434\u0435\u0436\u0437\u0438\u0439\u043A\u043B\u043C\u043D\u043E\u043F\u0440\u0441\u0442\u0443\u0444\u0445\u0446\u0447\u0448\u0449\u044A\u044B\u044C\u044D\u044E\u044F" .toCharArray(); protected static char[] cp1257map = 
"\u20AC\0\u201A\0\u201E\u2026\u2020\u2021\0\u2030\0\u2039\0\250\u02C7\270\0\u2018\u2019\u201C\u201D\u2022\u2013\u2014\0\u2122\0\u203A\0\257\u02DB\0\240\0\242\243\244\0\246\247\330\251\u0156\253\254\255\256\306\260\261\262\263\264\265\266\267\370\271\u0157\273\274\275\276\346\u0104\u012E\u0100\u0106\304\305\u0118\u0112\u010C\311\u0179\u0116\u0122\u0136\u012A\u013B\u0160\u0143\u0145\323\u014C\325\326\327\u0172\u0141\u015A\u016A\334\u017B\u017D\337\u0105\u012F\u0101\u0107\344\345\u0119\u0113\u010D\351\u017A\u0117\u0123\u0137\u012B\u013C\u0161\u0144\u0146\363\u014D\365\366\367\u0173\u0142\u015B\u016B\374\u017C\u017E\u02D9" .toCharArray(); protected static char[] iso8859_2map = "\200\201\202\203\204\205\206\207\210\211\212\213\214\215\216\217\220\221\222\223\224\225\226\227\230\231\232\233\234\235\236\237\240\u0104\u02D8\u0141\244\u013D\u015A\247\250\u0160\u015E\u0164\u0179\255\u017D\u017B\260\u0105\u02DB\u0142\264\u013E\u015B\u02C7\270\u0161\u015F\u0165\u017A\u02DD\u017E\u017C\u0154\301\302\u0102\304\u0139\u0106\307\u010C\311\u0118\313\u011A\315\316\u010E\u0110\u0143\u0147\323\324\u0150\326\327\u0158\u016E\332\u0170\334\335\u0162\337\u0155\341\342\u0103\344\u013A\u0107\347\u010D\351\u0119\353\u011B\355\356\u010F\u0111\u0144\u0148\363\364\u0151\366\367\u0159\u016F\372\u0171\374\375\u0163\u02D9" .toCharArray(); protected static char[] koi8rmap = 
"\u2500\u2502\u250C\u2510\u2514\u2518\u251C\u2524\u252C\u2534\u253C\u2580\u2584\u2588\u258C\u2590\u2591\u2592\u2593\u2320\u25A0\u2219\u221A\u2248\u2264\u2265\u00A0\u2321\u00B0\u00B2\u00B7\u00F7\u2550\u2551\u2552\u0451\u2553\u2554\u2555\u2556\u2557\u2558\u2559\u255A\u255B\u255C\u255D\u255E\u255F\u2560\u2561\u0401\u2562\u2563\u2564\u2565\u2566\u2567\u2568\u2569\u256A\u256B\u256C\u00A9\u044E\u0430\u0431\u0446\u0434\u0435\u0444\u0433\u0445\u0438\u0439\u043A\u043B\u043C\u043D\u043E\u043F\u044F\u0440\u0441\u0442\u0443\u0436\u0432\u044C\u044B\u0437\u0448\u044D\u0449\u0447\u044A\u042E\u0410\u0411\u0426\u0414\u0415\u0424\u0413\u0425\u0418\u0419\u041A\u041B\u041C\u041D\u041E\u041F\u042F\u0420\u0421\u0422\u0423\u0416\u0412\u042C\u042B\u0417\u0428\u042D\u0429\u0427\u042A" .toCharArray(); public static String detectEncoding(byte[] bytes, String exemple) { for (int i = 0; i < charsets.length; i++) { String ss = byteArrayToString(bytes, charsets[i]); if (ss.indexOf(exemple) != -1) { return charsets[i]; } } return ""; } public static String byteArrayToString(byte[] bytes, String charSet) { String output; char[] map = null; if (charSet.equalsIgnoreCase("WINDOWS-1251") || charSet.equalsIgnoreCase("WINDOWS1251") || charSet.equalsIgnoreCase("WIN1251") || charSet.equalsIgnoreCase("CP1251")) { map = cp1251map; } else if (charSet.equalsIgnoreCase("KOI8-R")) { map = koi8rmap; } else if (charSet.equalsIgnoreCase("WINDOWS-1257")) { map = cp1257map; } else if (charSet.equalsIgnoreCase("ISO-8859-1")) { map = iso8859_1map; } else if (charSet.equalsIgnoreCase("ISO-8859-2")) { map = iso8859_2map; } else if (charSet.equalsIgnoreCase("UTF-8")) { try { return (decodeUTF8(bytes, false)); } catch (Exception udfe) { } map = cp1251map; } if (map != null) { char[] chars = new char[bytes.length]; for (int i = 0; i < bytes.length; i++) { byte b = bytes[i]; chars[i] = (b >= 0) ? 
                        (char) b : map[b + 128];
            }
            output = new String(chars);
        } else {
            try {
                output = new String(bytes, charSet);
            } catch (UnsupportedEncodingException e) {
                output = new String(bytes);
            }
        }
        return output;
    }

    private static String decodeUTF8(byte[] data, boolean gracious) throws UTFDataFormatException {
        byte a, b, c;
        StringBuffer ret = new StringBuffer();
        for (int i = 0; i < data.length; i++) {
            try {
                a = data[i];
                if ((a & 0x80) == 0) {
                    ret.append((char) a);
                } else if ((a & 0xe0) == 0xc0) {
                    b = data[i + 1];
                    if ((b & 0xc0) == 0x80) {
                        ret.append((char) (((a & 0x1F) << 6) | (b & 0x3F)));
                        i++;
                    } else {
                        throw new UTFDataFormatException("Illegal 2-byte group");
                    }
                } else if ((a & 0xf0) == 0xe0) {
                    b = data[i + 1];
                    c = data[i + 2];
                    if (((b & 0xc0) == 0x80) && ((c & 0xc0) == 0x80)) {
                        ret.append((char) (((a & 0x0F) << 12) | ((b & 0x3F) << 6) | (c & 0x3F)));
                        i += 2;
                    } else {
                        throw new UTFDataFormatException("Illegal 3-byte group");
                    }
                } else if (((a & 0xf0) == 0xf0) || ((a & 0xc0) == 0x80)) {
                    throw new UTFDataFormatException("Illegal first byte of a group");
                }
            } catch (UTFDataFormatException udfe) {
                if (gracious) {
                    ret.append("?");
                } else {
                    throw udfe;
                }
            } catch (ArrayIndexOutOfBoundsException aioobe) {
                if (gracious) {
                    ret.append("?");
                } else {
                    throw new UTFDataFormatException("Unexpected EOF");
                }
            }
        }
        data = null;
        return ret.toString();
    }
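
The probing idea behind detectEncoding can be tried on a desktop JVM without the hand-written tables. The following is only a minimal sketch, assuming standard Java with java.nio.charset available (CLDC devices do not have it, which is presumably why the listing carries its own maps); the class name EncodingProbe and the sample word are made up for the illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingProbe {

    /**
     * Decodes the bytes with every candidate charset in order and returns the
     * first one whose output contains the expected sample, or "" if none does.
     */
    public static String detect(byte[] bytes, String expected, String[] candidates) {
        for (String name : candidates) {
            if (!Charset.isSupported(name)) {
                continue; // this runtime does not ship the charset, skip it
            }
            String decoded = new String(bytes, Charset.forName(name));
            if (decoded.contains(expected)) {
                return name;
            }
        }
        return "";
    }

    public static void main(String[] args) {
        // Hypothetical probe word (Cyrillic "privet"), written with escapes
        // so that the source file encoding does not matter.
        String sample = "\u043f\u0440\u0438\u0432\u0435\u0442";
        byte[] wire = sample.getBytes(StandardCharsets.UTF_8);
        String[] candidates = {"ISO-8859-1", "UTF-8"};
        System.out.println(detect(wire, sample, candidates)); // prints "UTF-8"
    }
}
```

The probe text has to contain non-ASCII characters: an ASCII-only sample decodes identically in almost every charset in the list, so the first candidate would always win.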

Summary:
The code was rewritten so that on first launch the application detects the encoding and saves it to persistent storage. Every subsequent translation is decoded with the saved encoding and the result is checked for correctness; if the check fails, the encodings are searched again. This implementation has been working successfully for several months, and I have finally forgotten about the problem.
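
The workflow described above (detect once, persist, verify, re-detect on mismatch) could look roughly like this. It is a sketch under assumptions, not the application's actual code: Store, Detector, MemoryStore and looksValid are hypothetical names, the in-memory map stands in for real persistent storage (on J2ME that would probably be an RMS record store), and the article does not say how correctness is checked, so the sketch simply treats a U+FFFD replacement character as a sign of a wrong decode.

```java
import java.nio.charset.Charset;
import java.util.HashMap;
import java.util.Map;

public class CachedEncoding {

    /** Hypothetical persistence layer. */
    public interface Store {
        String load(String key);
        void save(String key, String value);
    }

    /** Hypothetical detector: probes the service and returns a charset name. */
    public interface Detector {
        String detect();
    }

    /** In-memory stand-in for real persistent storage. */
    public static class MemoryStore implements Store {
        private final Map<String, String> data = new HashMap<String, String>();
        public String load(String key) { return data.get(key); }
        public void save(String key, String value) { data.put(key, value); }
    }

    public static final String KEY = "encoding";

    private final Store store;
    private final Detector detector;

    public CachedEncoding(Store store, Detector detector) {
        this.store = store;
        this.detector = detector;
    }

    /** Returns the saved encoding, detecting and saving it on first use. */
    public String current() {
        String enc = store.load(KEY);
        return (enc != null) ? enc : refresh();
    }

    /** Runs detection again and overwrites the saved value. */
    public String refresh() {
        String enc = detector.detect();
        store.save(KEY, enc);
        return enc;
    }

    /** Assumed check: a replacement character means the decode went wrong. */
    public static boolean looksValid(String text) {
        return text.indexOf('\uFFFD') == -1;
    }

    /** Decodes with the saved encoding; on a failed check, re-detects once and retries. */
    public String decode(byte[] response) {
        String text = new String(response, Charset.forName(current()));
        if (!looksValid(text)) {
            text = new String(response, Charset.forName(refresh()));
        }
        return text;
    }

    public static void main(String[] args) {
        MemoryStore store = new MemoryStore();
        store.save(KEY, "UTF-8"); // pretend a stale value was saved earlier
        CachedEncoding cache = new CachedEncoding(store, new Detector() {
            public String detect() { return "ISO-8859-1"; }
        });
        byte[] response = "caf\u00e9".getBytes(Charset.forName("ISO-8859-1"));
        System.out.println(cache.decode(response));
        System.out.println(store.load(KEY));
    }
}
```

The U+FFFD check only catches decoders that can report malformed input, such as UTF-8; a single-byte decoder accepts any byte sequence, so a real check would need something stronger, for example comparing against a known sample as the probe does.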
Thanks to Stanislav Mayantsev for making me get off my ass and redo all the code manipulations.


Source: https://habr.com/ru/post/149792/
