This article is aimed at interested readers and beginners; however, it does not spell out simple things that can be looked up in the documentation or in specialized articles. The most useful resources, a link to the source code (distributed under the BSD license) and a link to the working version are given at the end of the article.
And why not simply use the source code of the Glow project mentioned above? First, it is quite specific to the amount of data Mozilla had to handle (remember the number of Firefox installations on launch day) and to the fact that their logging system is decentralized. In our case, about 100 records per second are written to a single log file at peak, and only part of them needs to be visualized. Second, the map in Glow is not the most pleasant in appearance. And third, this is a test task :)
The log has to be read as it grows (in the manner of tail -f). In addition, once a day the log file is closed and carefully archived, and a new file takes its place, so we need to track these actions and switch to the current log. There is the Tailer class from the well-known Apache Commons library, but we will go our own way, partly a similar one.

The TailReader class is initialized with the directory in which the log is located, a regular expression that describes the name of the log file (since it can change), and the update period, that is, the interval at which we check for new entries in the log. The class interface resembles working with standard I/O streams, but a call to nextRecord() blocks the calling thread if no new entries have appeared in the log. To check for new entries without blocking, use the hasNext() method. Since the log is monitored in a separate thread (a thread of execution, not to be confused with an I/O stream), there are start() and stop() methods to control it. If the file stream is closed (the log has been sent for archiving), then after a set number of read attempts the object decides that it is time to open a new log. The log is located according to the rules specified in getLogFile():

    /**
     * Finds the current log file.
     *
     * @return the most recently modified matching file, or null if none is found
     */
    private File getLogFile() {
        File logCatalog = new File(logFileCatalog);
        File[] files = logCatalog.listFiles(new FileFilter() {
            @Override
            public boolean accept(File pathname) {
                return pathname.canRead()
                        && pathname.isFile()
                        && pathname.getName().matches(logFileNamePattern);
            }
        });
        // listFiles() returns null when the directory is missing or unreadable
        if (null == files || 0 == files.length)
            return null;
        if (files.length > 1)
            Arrays.sort(files, new Comparator<File>() {
                @Override
                public int compare(File o1, File o2) {
                    // Long.compare avoids the overflow of casting a long difference to int
                    return Long.compare(o1.lastModified(), o2.lastModified());
                }
            });
        return files[files.length - 1];
    }
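To make the interface described above more concrete, here is a minimal sketch of a polling tail reader. It is not the article's TailReader: the class names SimpleTailReader and TailDemo are hypothetical, the sketch follows a single fixed file, and switching to a new log after rotation is left out. It only illustrates the blocking nextRecord(), the non-blocking hasNext() and the start()/stop() pair.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/** Hypothetical, simplified tail reader: polls one file and queues complete lines. */
class SimpleTailReader {
    private final Path file;
    private final long periodMillis;
    private final BlockingQueue<String> records = new LinkedBlockingQueue<>();
    private volatile boolean running;
    private Thread worker;

    SimpleTailReader(Path file, long periodMillis) {
        this.file = file;
        this.periodMillis = periodMillis;
    }

    /** Starts the background polling thread. */
    void start() {
        running = true;
        worker = new Thread(this::poll, "tail-reader");
        worker.setDaemon(true);
        worker.start();
    }

    /** Asks the polling thread to finish. */
    void stop() {
        running = false;
        if (worker != null) worker.interrupt();
    }

    /** Non-blocking check for queued records. */
    boolean hasNext() {
        return !records.isEmpty();
    }

    /** Blocks until a record is available. */
    String nextRecord() throws InterruptedException {
        return records.take();
    }

    private void poll() {
        long position = 0; // offset of the first byte not yet turned into records
        while (running) {
            try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
                long length = raf.length();
                if (length < position) position = 0; // the file was truncated: start over
                if (length > position) {
                    byte[] chunk = new byte[(int) (length - position)];
                    raf.seek(position);
                    raf.readFully(chunk);
                    int end = lastNewline(chunk); // emit complete lines only
                    if (end >= 0) {
                        String text = new String(chunk, 0, end, StandardCharsets.UTF_8);
                        for (String line : text.split("\n")) records.add(line.replace("\r", ""));
                        position += end + 1;
                    }
                }
            } catch (IOException e) {
                // the file may be missing for a moment; try again on the next tick
            }
            try {
                Thread.sleep(periodMillis);
            } catch (InterruptedException e) {
                return; // stop() was called
            }
        }
    }

    private static int lastNewline(byte[] bytes) {
        for (int i = bytes.length - 1; i >= 0; i--) {
            if (bytes[i] == '\n') return i;
        }
        return -1;
    }
}

class TailDemo {
    /** Appends two lines to a temporary file and returns what the reader saw. */
    static String run() {
        try {
            Path log = Files.createTempFile("geowid", ".log");
            log.toFile().deleteOnExit();
            SimpleTailReader reader = new SimpleTailReader(log, 20);
            reader.start();
            Files.write(log, "first\nsecond\n".getBytes(StandardCharsets.UTF_8),
                    StandardOpenOption.APPEND);
            String result = reader.nextRecord() + "," + reader.nextRecord();
            reader.stop();
            return result;
        } catch (IOException | InterruptedException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

The queue between the polling thread and the caller is what gives nextRecord() its blocking behaviour; a real implementation would also have to reopen the file chosen by getLogFile() once the old one stops growing.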
The RecordParser class, as is not difficult to guess, parses the lines of the log file using regular expressions. The LogEvent parse(String record) method returns a simple object that encapsulates the event type and the IP address, or null if we are not interested in the given log record (this, by the way, is not the best practice in the world of Java development; it is better to use the Null Object pattern). Requests from search robots are also filtered out here (they are not exactly users of the store, right?).
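For readers who have not met the Null Object pattern mentioned above, here is a small sketch of how a parser could avoid returning null. The names ParsedEvent and NullObjectParser are hypothetical (the article's class is LogEvent and its parser does return null), and the sample log line format is an assumption made only for the illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical stand-in for the article's LogEvent: an event type plus an IP address. */
class ParsedEvent {
    /** Shared "nothing of interest" instance that is returned instead of null. */
    static final ParsedEvent NONE = new ParsedEvent("none", "") {
        @Override
        boolean isInteresting() {
            return false;
        }
    };

    final String type;
    final String ip;

    ParsedEvent(String type, String ip) {
        this.type = type;
        this.ip = ip;
    }

    boolean isInteresting() {
        return true;
    }
}

class NullObjectParser {
    // Assumed record shape: an IPv4 address followed by whitespace and the "api:" marker
    private static final Pattern API_CALL =
            Pattern.compile("\\b((?:\\d{1,3}\\.){3}\\d{1,3})\\b\\s+api:");

    /** Never returns null: uninteresting records map to ParsedEvent.NONE. */
    static ParsedEvent parse(String record) {
        Matcher matcher = API_CALL.matcher(record);
        return matcher.find() ? new ParsedEvent("api", matcher.group(1)) : ParsedEvent.NONE;
    }

    public static void main(String[] args) {
        System.out.println(parse("10.0.0.1 api: GET /orders").ip);
        System.out.println(parse("unrelated line").isInteresting());
    }
}
```

Callers can then write event.isInteresting() instead of a null check, and forgetting the check no longer ends in a NullPointerException.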
The IpToLocationConverter class deals with resolving IP addresses to their geo-coordinates using the Maxmind service (via its Java API) and the IpGeoBase service (accessed through its XML API, which is encapsulated in the com.ecwid.geowid.daemon.resolvers package). Maxmind resolves Russian addresses rather poorly, so IpGeoBase is used in addition. The Maxmind API is trivial: resolving is done through a database file stored locally. For IpGeoBase, a resolver was written that caches calls to the service, for obvious reasons.
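The caching idea can be sketched in a few lines. This is not the article's resolver: CachingResolver is a hypothetical name, the remote call is passed in as a plain function, and the cache is unbounded, which a long-running daemon would probably want to limit.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical caching wrapper around any slow IP-to-location lookup. */
class CachingResolver {
    private final Map<String, double[]> cache = new ConcurrentHashMap<>();
    private final Function<String, double[]> remote;

    CachingResolver(Function<String, double[]> remote) {
        this.remote = remote;
    }

    /** Returns {latitude, longitude}; the remote lookup runs at most once per IP. */
    double[] resolve(String ip) {
        // computeIfAbsent stores nothing when the lookup returns null,
        // so failed lookups are retried on the next call
        return cache.computeIfAbsent(ip, remote);
    }

    public static void main(String[] args) {
        // The coordinates are made-up sample values, not real service output
        CachingResolver resolver = new CachingResolver(ip -> new double[] {55.75, 37.62});
        System.out.println(resolver.resolve("192.0.2.1")[0]);
    }
}
```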
The resolved points (Point objects) are stored in a buffer, an object of the PointsBuffer class, and flushed to the server in JSON format when the buffer fills up (the objects are serialized with Gson).
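A buffer that flushes itself when full might look roughly like this. FlushingBuffer is a hypothetical name, and the sink is an arbitrary callback; in the article the flush step serializes the points with Gson and sends them to the server, which is deliberately left out here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical fill-and-flush buffer; the sink decides what "sending" means. */
class FlushingBuffer<T> {
    private final int capacity;
    private final Consumer<List<T>> sink;
    private final List<T> items = new ArrayList<>();

    FlushingBuffer(int capacity, Consumer<List<T>> sink) {
        this.capacity = capacity;
        this.sink = sink;
    }

    /** Adds an item and hands a copy of the whole batch to the sink once the buffer is full. */
    synchronized void add(T item) {
        items.add(item);
        if (items.size() >= capacity) {
            sink.accept(new ArrayList<>(items));
            items.clear();
        }
    }

    /** Number of items waiting for the next flush. */
    synchronized int size() {
        return items.size();
    }

    public static void main(String[] args) {
        FlushingBuffer<String> buffer =
                new FlushingBuffer<>(2, batch -> System.out.println("flush " + batch));
        buffer.add("a");
        buffer.add("b");
        buffer.add("c");
    }
}
```

Batching keeps the number of requests to the server low at the cost of a small delay before a point reaches the map.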
The daemon's main class is GeowidDaemon. The daemon settings are stored in XML (an indulgence on my part: properties or YAML files would have been enough, but I wanted to try XML-to-object mapping). Pay attention to the events section:

    <events>
        <event>
            <type>def</type>
            <pattern>\b((?:\d{1,3}\.){3}\d{1,3})\b\s+script\.js</pattern>
        </event>
        <event>
            <type>mob</type>
            <pattern>\b((?:\d{1,3}\.){3}\d{1,3})\b\s+mobile:</pattern>
        </event>
        <event>
            <type>api</type>
            <pattern>\b((?:\d{1,3}\.){3}\d{1,3})\b\s+api:</pattern>
        </event>
    </events>

The event types are:

def - the opening of the "regular" storefront;
mob - the opening of the mobile storefront;
api - a call to the service API.

The type is determined by finding in the log record a substring that matches the corresponding regular expression, in which the IP address is captured as a group.
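As an illustration of how such a configuration can drive the parsing, the sketch below applies the three patterns from the events section in order and reports the first one that matches. EventClassifier is a hypothetical name, the patterns are hard-coded instead of being loaded from XML, and the sample log line is made up to fit the patterns.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical classifier built from the three patterns of the events section. */
class EventClassifier {
    // LinkedHashMap keeps the configuration order, so the first matching type wins
    private final Map<String, Pattern> patterns = new LinkedHashMap<>();

    EventClassifier() {
        String ip = "\\b((?:\\d{1,3}\\.){3}\\d{1,3})\\b\\s+";
        patterns.put("def", Pattern.compile(ip + "script\\.js"));
        patterns.put("mob", Pattern.compile(ip + "mobile:"));
        patterns.put("api", Pattern.compile(ip + "api:"));
    }

    /** Returns {type, ip} for the first pattern found in the record, or empty if none is. */
    Optional<String[]> classify(String record) {
        for (Map.Entry<String, Pattern> entry : patterns.entrySet()) {
            Matcher matcher = entry.getValue().matcher(record);
            if (matcher.find()) {
                // group(1) is the IP address captured by the parenthesised part of the pattern
                return Optional.of(new String[] {entry.getKey(), matcher.group(1)});
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        EventClassifier classifier = new EventClassifier();
        classifier.classify("2012-11-12 93.184.216.34 mobile: open")
                .ifPresent(r -> System.out.println(r[0] + " " + r[1]));
    }
}
```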
The GeowidServlet servlet is minimalist: it can receive data from the daemon and hand it over to clients. The most interesting part in this respect is the following code:

    @Override
    protected void doPost(HttpServletRequest req, HttpServletResponse resp)
            throws ServletException, IOException {
        // The daemon posts fresh data: attach it to every suspended request and wake it up
        synchronized (continuations) {
            for (Continuation continuation : continuations.values()) {
                continuation.setAttribute(resultAttribute, req.getParameter(requestKey));
                try {
                    continuation.resume();
                } catch (IllegalStateException e) {
                    // ok: this continuation is no longer suspended
                }
            }
            continuations.clear();
            resp.setStatus(HttpServletResponse.SC_OK);
        }
    }

    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp)
            throws ServletException, IOException {
        String reqId = req.getParameter(idParameterName);
        if (null == reqId) {
            resp.sendError(HttpServletResponse.SC_BAD_REQUEST, "Request ID needed");
            logger.info("Request without ID rejected [{}]", req.getRequestURI());
            return;
        }
        Object result = req.getAttribute(resultAttribute);
        if (null == result) {
            Continuation continuation = ContinuationSupport.getContinuation(req);
            synchronized (continuations) {
                if (!continuations.containsKey(reqId)) {
                    continuation.setTimeout(timeOut);
                    try {
                        continuation.suspend();
                        continuations.put(reqId, continuation);
                    } catch (IllegalStateException e) {
                        logger.warn("Continuation with reqID={} can't be suspended", reqId);
                        resp.sendError(HttpServletResponse.SC_INTERNAL_SERVER_ERROR);
                    }
                } else if (continuation.isExpired()) {
                    // the lock on continuations is already held here
                    continuations.remove(reqId);
                    resp.setContentType(contentType);
                    resp.getWriter().println(emptyResult);
                } else {
                    resp.sendError(HttpServletResponse.SC_BAD_REQUEST, "Request ID conflict");
                }
            }
        } else {
            resp.setContentType(contentType);
            resp.getWriter().println((String) result);
        }
    }
Every client request must carry an ID (generated by the getPseudoGUID() function); if the ID is not present, we turn the client away. The ID is needed to correctly identify the continuation associated with a particular client. Next, we check whether the attribute containing the data is set for this request. Naturally, if the client has come to us for the first time, there can be no data yet. Therefore, we create a continuation for it with a given timeout, suspend it and put it into a hash table for storage. However, there are situations where the continuation timeout has expired and there is still no data. In this case the if (continuation.isExpired()) check helps us: when it passes, the servlet gives the client an empty JSON array and removes the corresponding continuation from the table as no longer needed. When the daemon posts new data, every stored continuation is resumed, and the doGet() method is re-entered for each continuation, this time with the data the user needs.

You can, for example, measure the mysterious power of these continuations with a profiler under load. For this, the author used VisualVM and Siege. The author is a mediocre tester, so the test looked very artificial. The JVM "warmed up" for about an hour, settling at about 15 MB of heap space. After that, Siege loaded the server with 3000 parallel requests per second (I did not want to dig into the system to raise the limits for open files and so on) for 5 minutes. The JVM consumed about 250 MB of heap space and loaded a processor core by about 10-15%. I think this is a good result for a start.
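The suspend-and-resume flow of the servlet can be tried out without a servlet container. The sketch below is not Jetty continuations: it swaps in java.util.concurrent.CompletableFuture (completeOnTimeout requires Java 9 or newer), and the class name LongPollHub and its methods are hypothetical. The idea is the same, though: a request is parked under its ID, released with data when the daemon publishes, or released with an empty result on timeout.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

/** Hypothetical long-polling hub: parks requests by ID and releases them when data arrives. */
class LongPollHub {
    private final Map<String, CompletableFuture<String>> parked = new HashMap<>();
    private final String emptyResult;
    private final long timeoutMillis;

    LongPollHub(String emptyResult, long timeoutMillis) {
        this.emptyResult = emptyResult;
        this.timeoutMillis = timeoutMillis;
    }

    /** Client side: parks the request until data is published or the timeout expires. */
    CompletableFuture<String> poll(String requestId) {
        CompletableFuture<String> pending = new CompletableFuture<>();
        synchronized (parked) {
            if (parked.containsKey(requestId)) {
                throw new IllegalStateException("Request ID conflict");
            }
            parked.put(requestId, pending);
        }
        // On timeout the client receives the empty result instead of waiting forever
        pending.completeOnTimeout(emptyResult, timeoutMillis, TimeUnit.MILLISECONDS);
        // However the request completes, drop it from the table afterwards
        pending.whenComplete((value, error) -> {
            synchronized (parked) {
                parked.remove(requestId, pending);
            }
        });
        return pending;
    }

    /** Daemon side: hands the data to every parked request. */
    void publish(String data) {
        List<CompletableFuture<String>> toRelease;
        synchronized (parked) {
            toRelease = new ArrayList<>(parked.values());
            parked.clear();
        }
        // Completing outside the lock keeps the cleanup callbacks away from the iteration
        for (CompletableFuture<String> pending : toRelease) {
            pending.complete(data);
        }
    }

    public static void main(String[] args) {
        LongPollHub hub = new LongPollHub("[]", 100);
        CompletableFuture<String> request = hub.poll("client-1");
        hub.publish("[{\"type\":\"def\"}]");
        System.out.println(request.join());
    }
}
```

Either way no thread is blocked per waiting client, which is what lets a small heap serve thousands of open requests.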
A disclaimer right away: my JavaScript code may seem "non-canonical" from the point of view of a professional frontend developer. That is for those who dig into my code to judge :)
(www.html5canvastutorials.com, KineticJS)
Source: https://habr.com/ru/post/158333/