As part of TS Solution's corporate blog, we are starting a series of educational articles on Splunk, a product for analyzing machine data. Most of the articles will be how-to tutorials, descriptions of interesting cases, and solutions to common problems.
In this article we briefly describe the system and its purpose, and look at the options for installing it.

A few words about Splunk
Splunk is a platform for collecting, storing, processing and analyzing machine data, that is, logs. Today it is extremely popular in the USA and Europe and is gradually entering other markets, including Russia. One of the platform's main features is that it can work with data from virtually any source, so the range of possible applications is very wide.

In most cases Splunk parses the input data into fields and values (automatically or with the help of add-ons) and then processes them. Processing is done through queries in SPL (Search Processing Language, Splunk's own query language), with which you can build selections and tables, sort, filter, aggregate, generate reports, create calculated fields, look up both internal and external reference data, build dashboards with a wide range of visualizations, and set up alerts (for example, creating a Service Desk ticket based on the result of a query). All of this can be packaged in your own app.
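To make the idea of parsing raw events into fields and values more concrete, here is a minimal Python sketch. It is not Splunk's actual extraction logic: the `extract_fields` function, the regular expression and the sample log line are all assumptions made for illustration, and they only handle simple `key=value` pairs.

```python
import re

# Matches key=value pairs; a value is either double-quoted or a bare token.
KV_PATTERN = re.compile(r'(\w+)=("([^"]*)"|\S+)')


def extract_fields(raw_event: str) -> dict[str, str]:
    """Split a raw log line into a field -> value mapping (illustrative only)."""
    fields = {}
    for match in KV_PATTERN.finditer(raw_event):
        key = match.group(1)
        # Group 3 is set only when the value was quoted; otherwise use the bare token.
        value = match.group(3) if match.group(3) is not None else match.group(2)
        fields[key] = value
    return fields


# Hypothetical log line, not taken from a real system.
event = '2024-01-15 10:32:07 action=login user="alice smith" status=failed src=10.0.0.5'
fields = extract_fields(event)
print(fields)
```

Once events are reduced to fields like these, filtering, sorting and aggregating them becomes a matter of ordinary operations on key-value records, which is roughly the role SPL plays inside Splunk.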

Splunk's main differences and strengths
- Real Time Architecture : Splunk collects, searches, monitors and analyzes diverse and very large data volumes (hundreds of TB per day) in real time, all within a single system.
Why is this important? Because Splunk can collect data in real time from thousands of disparate sources, whether physical hosts, virtual hosts or the cloud. Splunk also supports searching not only in real time but over the entire period for which data has been collected. That is, we can search, monitor, alert, report and analyze over any time range (historical and real-time data in one solution). Finally, Splunk returns results quickly and keeps searches interactive even on extremely large amounts of data.
- Universal Machine Data Platform : Splunk is a universal platform for machine data that provides end-to-end data collection, processing and analysis. We can index any timestamped machine data regardless of its structure and format. Splunk can combine machine data + business data + user data, which makes it extremely versatile.
- Schema on the Fly : Splunk searches by time and applies structure at search time, so you do not need to know the data structure in advance to write a query. You can select a time range, enter a couple of keywords and quickly get acquainted with the data. There are no hard restrictions on columns, tables and so on, which greatly increases the flexibility of the system. In addition, any search can be stopped or paused, and its intermediate results can be viewed while it runs.
- Agile Reporting & Analytics : Splunk offers ample options for building analytics and reports and visualizing them. In addition to the indexed data, the system can also consult external reference sources, for example an SQL database. Splunk is also a fairly open system: although the built-in visualization options are already very diverse, you can always add your own module.
- Scales from Desktop to Enterprise : Splunk uses MapReduce technology, which provides load balancing and horizontal scalability. That is, we can start with a single Splunk server and, as the data grows, quickly add a couple of new servers and distribute the load across them. Thanks to MapReduce, Splunk can also process really large amounts of data quickly without requiring exceptional hardware.
- Fast Time to Value : Splunk lets you get results quickly. Implementation takes hours or days, not weeks or months. The same is true of scaling and day-to-day operation.
- Passionate & Vibrant Community : Splunk has a very high-quality and, most importantly, free community, which includes:
- Splunk Base - a portal with all kinds of apps and add-ons, 99% of which are free
- Splunk Answers - a forum with a large number of questions and answers and active participants
- Splunk Dev - the developer portal
- Splunk Docs - the complete product knowledge base
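The MapReduce idea mentioned under "Scales from Desktop to Enterprise" can be sketched in a few lines of Python. This is only a toy illustration of the general map and reduce pattern, not Splunk's implementation: the shard layout, the `status` field and both function names are assumptions. Each "server" counts events in its own shard (map), and the partial counts are then merged into one result (reduce).

```python
from collections import Counter
from functools import reduce


def map_count_by_status(events: list[dict]) -> Counter:
    """Map step: each server counts the events in its own shard."""
    return Counter(event["status"] for event in events)


def reduce_counts(partials: list[Counter]) -> Counter:
    """Reduce step: merge the per-server partial counts into one total."""
    return reduce(lambda left, right: left + right, partials, Counter())


# Hypothetical data split across three servers.
shards = [
    [{"status": "ok"}, {"status": "failed"}],
    [{"status": "ok"}, {"status": "ok"}],
    [{"status": "failed"}],
]

partials = [map_count_by_status(shard) for shard in shards]
totals = reduce_counts(partials)
print(totals)
```

Because each shard is processed independently, adding another server only adds one more partial result to merge, which is what makes this pattern scale horizontally.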
Where can I download it?
The free version of Splunk, limited to 500 MB of indexed data per day, is available on the company's official website; all you need to do is register.
System requirements

Splunk supports both 32-bit and 64-bit architectures. Below are tables of the platforms available for Splunk, shown separately for Unix and Microsoft Windows. The last column of each table refers to the Splunk Universal Forwarder. This is a separate distribution and a separate role in the Splunk platform: it acts as an agent and is solely responsible for collecting logs and sending them to the server.
Unix
A - version is available for download, but has no official support.
D - version is currently supported, but in future releases the company may remove it from official support
Windows
D - version is currently supported, but in future releases the company may remove it from official support
... - version is supported, but Splunk does not recommend using this architecture
Installation
After you have downloaded the installation file, simply run it; by default the system will come up in a basic configuration. Detailed step-by-step installation instructions are available here for Windows and here for Unix systems.
After installing Splunk, the web interface should be accessible on port 8000 (localhost:8000). After logging in and changing the password, you will see the main Splunk interface.
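If the page does not open, a quick way to see whether anything is listening on that port is a small check like the one below. This is a generic TCP reachability sketch using Python's standard library, not a Splunk tool; the `is_port_open` helper name and the 2-second timeout are assumptions, and the host and port come from the default `localhost:8000` mentioned above.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


# Check the default Splunk web interface address.
reachable = is_port_open("localhost", 8000)
print("Splunk web interface reachable:", reachable)
```

A `False` result only means that no TCP listener answered on that address; it does not say why, so the next step would be to check that the Splunk service is actually running.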

This concludes the introductory review. In the next article we will explain how to load data into Splunk, how to use the SPL language, and how to build charts and dashboards.
We also recently held a general webinar about Splunk; you can watch the recording on YouTube. The webinar demonstrated the basic functionality and described several case studies of the product.