Sigma rules: a craft or a new standard for SOC?

I am Sergey Rublev, head of the SOC (Security Operations Center) at Infosecurity.
In this article I will discuss in detail the ambitious Sigma Rules project, whose motto is: “Sigma for logs is like Snort for traffic and Yara for files”.

The article covers three aspects:

The applicability of the Sigma rule syntax for maintaining a knowledge base of threat detection scenarios
The capabilities of the tools that generate rules for off-the-shelf SIEM systems
The value of the current content of the public Sigma rule repositories for a SOC

Once upon a time, in a galaxy far far away

It all started a few years ago, when the trees were tall and our monitoring team was still small. We faced a lot of questions; almost any team that grows to three people goes through this.

The questions arise for different reasons:

Team growth
Staff turnover
A large number of heterogeneous systems under monitoring

If you inherit a SIEM that someone else has already tuned, the number of questions grows like an avalanche.

Use Case Library

World experience in building monitoring centers has already produced a solution for organizing this chaos, and its name is the use case library. The goal of each use case is to comprehensively describe the solution to a specific information security monitoring task.

The set of knowledge recorded in each case can vary; we start from the following set:

Objective - the task solved by the case
Threat - the threat detected by the detection rule
Stakeholders - people interested in the work of this rule: IS (information security) / IT / Business
Data Requirements - the data set required to identify the threat
Logic - the threat detection logic
Testing - an algorithm for testing the correctness of the detection rule
Priority - the priority of handling events for the case (as a rule, it is calculated from the potential damage of a successfully realized threat)
Output - a list of actions for handling an alert, a description of the valid outcomes of the handling procedure, and the data recorded in its results

Example use case for detecting communication with a botnet command-and-control server (commonly C&C, or just C2):

The example is considerably simplified; in reality, a properly described case expands into a multipage document.

When the number of cases exceeded several dozen, we began looking for ready-made tools to maintain such a knowledge base, preferably ones that offer a machine-friendly interface in addition to a human-friendly one.
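As a rough illustration of what a machine-friendly representation of the case structure listed earlier could look like, here is a minimal sketch. The class name, the field types and the sample content are assumptions made for illustration only; they are not taken from our actual knowledge base.

```python
from dataclasses import dataclass, asdict
import json


@dataclass
class UseCase:
    """One entry of a use case library; fields mirror the list in the text."""
    objective: str
    threat: str
    stakeholders: list
    data_requirements: list
    logic: str
    testing: str
    priority: str
    output: str


# Hypothetical content, invented purely to show the shape of a record.
case = UseCase(
    objective="Detect password guessing against domain accounts",
    threat="An attacker obtains valid credentials by brute force",
    stakeholders=["IS", "IT"],
    data_requirements=["Windows Security event log"],
    logic="Many failed logons for different users from one workstation",
    testing="Generate failed logons from a test workstation",
    priority="medium",
    output="Analyst confirms the attack or closes the alert as a false positive",
)

# asdict() yields a plain dict that any tool can consume, e.g. as JSON.
print(json.dumps(asdict(case), indent=2))
```

A plain record like this is easy to validate, diff and export, which is the kind of machine-friendly interface the paragraph above refers to.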

Sigma project

The Sigma project certainly deserves consideration as a knowledge base of incident detection rules. It started in 2016, and I have followed it almost from the very beginning.

In essence, the project consists of:

The Sigma rules themselves
Utilities for converting rules to queries for various SIEM systems

The list of supported SIEMs is impressive: it covers almost all popular event analysis solutions. Now for everything in detail and in order.

Rule syntax

Sigma rules are YAML documents that describe a scenario for detecting a particular attack. Syntactically, the rules consist of the following blocks:

Meta-information

A descriptive part that structures the rules and simplifies searching for the ones you need.

    title: Access to ADMIN$ Share
    description: Detects access to $ADMIN share
    author: Florian Roth
    falsepositives:
        - Legitimate administrative activity
    level: low
    tags:
        - attack.lateral_movement
        - attack.t1077
    status: experimental

Separately, I would like to note that many of the rules already link to attack techniques from the MITRE ATT&CK framework.
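Because the tags are plain strings, a rule set can be indexed by ATT&CK technique with very little code. The sketch below assumes the rule has already been parsed into a dict and that technique tags have the plain form attack.tNNNN shown in the rule above; the helper names are hypothetical.

```python
# Meta-information of the rule above, as it would look after YAML parsing.
rule_meta = {
    "title": "Access to ADMIN$ Share",
    "level": "low",
    "status": "experimental",
    "tags": ["attack.lateral_movement", "attack.t1077"],
}


def attack_techniques(rule):
    """Return technique IDs such as 'T1077' found in the rule's tags."""
    ids = []
    for tag in rule.get("tags", []):
        prefix, _, value = tag.partition(".")
        # Technique tags look like 'attack.t1077'; tactic tags such as
        # 'attack.lateral_movement' have no numeric part and are skipped.
        if prefix == "attack" and value[:1] == "t" and value[1:].isdigit():
            ids.append(value.upper())
    return ids


def by_technique(rules):
    """Map each technique ID to the titles of the rules that reference it."""
    index = {}
    for rule in rules:
        for technique in attack_techniques(rule):
            index.setdefault(technique, []).append(rule["title"])
    return index


print(by_technique([rule_meta]))
```

An index like this makes it possible to see which techniques a rule set claims to cover.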

Data source declaration

A description of the source whose events the detection logic is built on.

    logsource:
        product: windows
        service: security

Syntactically, it is possible to describe either a specific service of a specific product or a whole category of systems.

Detection logic declaration

At the detection logic level, the following are described:

Required patterns
Values of certain fields in the log
Time frame
Aggregate functions

The logic can be trivial, for example a set of conditions on field values:

    detection:
        selection:
            EventID: 5140
            ShareName: Admin$
        filter:
            SubjectUserName: '*$'
        condition: selection and not filter
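To make the semantics of this rule concrete, here is a minimal sketch of how its condition could be evaluated against a single normalized event. It is not how the Sigma tools work (they generate SIEM queries rather than match events). It assumes case-insensitive matching and uses the standard fnmatch module for the * wildcard, which also treats ? and [ ] as special characters. The sample events are invented.

```python
from fnmatch import fnmatchcase

# The detection block of the rule above, after YAML parsing.
detection = {
    "selection": {"EventID": 5140, "ShareName": "Admin$"},
    "filter": {"SubjectUserName": "*$"},
}


def matches(block, event):
    """True if every field of the block matches the event (AND semantics)."""
    for field, pattern in block.items():
        if field not in event:
            return False
        # Compare as lowercase strings so that '*' wildcards work and the
        # comparison ignores case (an assumption of this sketch).
        if not fnmatchcase(str(event[field]).lower(), str(pattern).lower()):
            return False
    return True


def rule_fires(event):
    """Hard-coded equivalent of 'condition: selection and not filter'."""
    return (matches(detection["selection"], event)
            and not matches(detection["filter"], event))


# Invented events: a user account and a machine account (name ends with '$').
user_event = {"EventID": 5140, "ShareName": "Admin$", "SubjectUserName": "jdoe"}
machine_event = {"EventID": 5140, "ShareName": "Admin$", "SubjectUserName": "WS01$"}

print(rule_fires(user_event), rule_fires(machine_event))
```

The filter block excludes machine accounts, so only access by a user account triggers the rule.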

It can also be quite complicated:

    detection:
        selection1:
            EventID:
                - 529
                - 4625
            UserName: '*'
            WorkstationName: '*'
        selection2:
            EventID: 4776
            UserName: '*'
            Workstation: '*'
        timeframe: 24h
        condition:
            - selection1 | count(UserName) by WorkstationName > 3
            - selection2 | count(UserName) by Workstation > 3
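The first condition of this rule combines a time frame with an aggregate function. A minimal sketch of that logic is shown below. It interprets count(UserName) as the number of distinct user names, uses a fixed window that ends at a given reference time, and assumes each event carries a "time" field; all of these are simplifying assumptions, and the sample events are invented.

```python
from collections import defaultdict
from datetime import datetime, timedelta

FAILED_LOGON_IDS = {529, 4625}   # EventID values from selection1
TIMEFRAME = timedelta(hours=24)  # timeframe: 24h
THRESHOLD = 3                    # ... > 3


def noisy_workstations(events, now):
    """Workstations with more than THRESHOLD distinct failing users in the window."""
    users_by_host = defaultdict(set)
    for ev in events:
        if ev.get("EventID") not in FAILED_LOGON_IDS:
            continue
        # UserName: '*' and WorkstationName: '*' require the fields to exist.
        if "UserName" not in ev or "WorkstationName" not in ev:
            continue
        if not (now - TIMEFRAME <= ev["time"] <= now):
            continue
        users_by_host[ev["WorkstationName"]].add(ev["UserName"])
    return sorted(h for h, users in users_by_host.items() if len(users) > THRESHOLD)


now = datetime(2019, 3, 1, 12, 0)
events = [
    # Four different users failing from WS01 within the last few hours.
    {"time": now - timedelta(hours=i), "EventID": 4625,
     "UserName": f"user{i}", "WorkstationName": "WS01"}
    for i in range(4)
] + [
    # Four failures from WS02, but outside the 24-hour window.
    {"time": now - timedelta(hours=30), "EventID": 4625,
     "UserName": f"user{i}", "WorkstationName": "WS02"}
    for i in range(4)
]

print(noisy_workstations(events, now))
```

Only WS01 exceeds the threshold inside the window; the WS02 events are too old to count.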

Although the expressive means of the language are not universal, they are still quite broad and make it possible to describe a large number of attack detection cases.

Rule development tools

In addition to your favorite text editor, SOC Prime offers a web UI that can both validate the syntax of an already written rule and build rules manually from graphical blocks.

Sigma as a means of maintaining the knowledge base

Let's briefly summarize.

At the moment, the rule syntax concentrates mainly on describing threat detection logic and is not intended for a comprehensive description of a use case. Accordingly, a full-fledged library cannot be maintained using Sigma rules alone.

Of the use case structure we chose, Sigma covers only half (Objective, Data Requirements, Logic and Priority).

Conversion to various SIEMs

Since we are a SOC service provider, the idea of keeping all our correlation rule developments in one universal format and converting them into the format of the required SIEM at the implementation stage looked very tempting.

The project includes console utilities for generating queries in the formats of various SIEMs. Let's look at what the conversion consists of and what is under its hood.

The conversion takes place in three stages:

Parsing the rule - everything is clear here: the YAML document is parsed into its component blocks
Mapping to the taxonomy of the SIEM
This stage is needed because normalization is implemented a little differently in each SIEM, so the declarations from the Sigma rule must be brought to the event taxonomy of the selected SIEM.
Query generation for the SIEM
This stage requires another component: the backend for this SIEM.
In essence, the backend is a plugin for the conversion utility that contains the logic for converting to the final SIEM query format. The detection and logsource blocks are converted based on the previously applied field mapping, and additional SIEM-specific information is added.
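The three stages can be illustrated with a toy converter. This is a sketch under stated assumptions, not the code of the project's conversion utility: the rule is given as an already parsed dict (stage 1), the mapping and the target field names belong to an imaginary SIEM (stage 2), and the query syntax produced by the "backend" function (stage 3) is invented.

```python
# Stage 1 result: the rule as it looks after the YAML document is parsed.
rule = {
    "logsource": {"product": "windows", "service": "security"},
    "detection": {
        "selection": {"EventID": 5140, "ShareName": "Admin$"},
        "filter": {"SubjectUserName": "*$"},
    },
}

# Stage 2 input: a hypothetical mapping to the taxonomy of an imaginary SIEM.
MAPPING = {
    "logsources": {("windows", "security"): 'log_type="windows_security"'},
    "fields": {
        "EventID": "event_id",
        "ShareName": "share_name",
        "SubjectUserName": "src_user",
    },
}


def map_block(block, fields):
    """Stage 2: rename Sigma fields to SIEM fields, keeping unmapped names."""
    return {fields.get(name, name): value for name, value in block.items()}


def render_block(block):
    """Stage 3 helper: render one block as an AND of field="value" terms."""
    return " AND ".join(f'{name}="{value}"' for name, value in block.items())


def convert(rule, mapping):
    """Toy backend for 'selection and not filter' rules only."""
    src = rule["logsource"]
    source_clause = mapping["logsources"][(src["product"], src["service"])]
    selection = map_block(rule["detection"]["selection"], mapping["fields"])
    flt = map_block(rule["detection"]["filter"], mapping["fields"])
    return (f"{source_clause} AND ({render_block(selection)}) "
            f"AND NOT ({render_block(flt)})")


print(convert(rule, MAPPING))
```

Even this toy version shows why the mapping matters: any field missing from it passes through under its Sigma name and will most likely not match anything in the target SIEM.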

As a result, the conversion utility is launched as follows:

The following are passed as parameters:

Target SIEM
Rule
File with mappings for this SIEM

SOC Prime also has a ready-made UI for the conversion function (uncoder.io).

Conversion pitfalls

Having studied the mechanics of conversion, we ran into significant limitations that kept us from moving all our developments into the Sigma format:
The converter produces only a query. A correlation rule in a SIEM covers more aspects: time window, aggregation, and actions taken on detected alerts.
Key features of individual SIEMs, for example Active Lists, are not taken into account.
Field mapping is not detailed enough: the mapping configuration describes the fields of only a few sources, so with rules for several dozen types of event sources in the base you have to invest heavily in writing mappings.

Rule base

Let's see what the publicly available Sigma rule base contains. Currently, content is actively being added to two repositories:

The main repository of the project
SOC Prime Threat Detection Marketplace

The rule sets in the two repositories overlap.
SOC Prime also distributes a number of rules under a paid subscription; I do not consider their content in this article.

For the analysis, we need the sigmatools library for Python and some programming skills.

To parse the rules from a directory and load them into a dictionary, you can use the following code:

    import itertools
    import pathlib

    from sigma.parser.collection import SigmaCollectionParser


    def alliter(path):
        """Recursively yield all files under path, skipping hidden entries."""
        for sub in path.iterdir():
            if sub.name.startswith("."):
                continue
            if sub.is_dir():
                yield from alliter(sub)
            else:
                yield sub


    def get_inputs(paths, recursive):
        """Return the list of rule files for the given paths."""
        if recursive:
            return list(itertools.chain.from_iterable(
                [list(alliter(pathlib.Path(p))) for p in paths]))
        else:
            return [pathlib.Path(p) for p in paths]


    BASE_PATH = [r'sigma\rules']
    path_list = get_inputs(BASE_PATH, True)
    rules_map = {}

    for sigmafile in path_list:
        # Close each file after parsing instead of leaving the handle open.
        with sigmafile.open(encoding='utf-8') as f:
            parser = SigmaCollectionParser(f)
            rule = next(iter(parser))
        # Keying by title keeps one entry per rule title.
        rules_map[rule['title']] = rule

After deduplicating identical rules, the following picture emerges:

Within the unique list of rules, we obtain the following distributions:

By type of event source:

Slightly more detailed statistics:

Windows ~ 80%
Sysmon ~ 53%
Proxy ~ 8%
Linux ~ 4%

The current content is focused mainly on Windows, and on Sysmon in particular; only a few rules cover other systems.

By the degree of content readiness:

It turns out that the developers have marked fewer than 20% of all existing Sigma rules as stable.
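Distributions like these can be computed with a few lines on top of the loaded rules. The sketch below assumes that each rule behaves like a dict with optional logsource and status keys, as in the loading code earlier. To stay self-contained it runs on a tiny made-up list, so its numbers have nothing to do with the real statistics quoted in this article.

```python
from collections import Counter

# Toy stand-in for the parsed rules; real input would be rules_map.values().
rules = [
    {"title": "A", "logsource": {"product": "windows", "service": "sysmon"}, "status": "experimental"},
    {"title": "B", "logsource": {"product": "windows", "service": "security"}, "status": "stable"},
    {"title": "C", "logsource": {"product": "windows", "service": "sysmon"}, "status": "experimental"},
    {"title": "D", "logsource": {"product": "linux"}, "status": "experimental"},
]


def distribution(rules, key):
    """Percentage of rules per value of key(rule), most common first."""
    counts = Counter(key(rule) for rule in rules)
    total = sum(counts.values())
    return {name: round(100 * n / total) for name, n in counts.most_common()}


by_product = distribution(
    rules, lambda r: r.get("logsource", {}).get("product", "unknown"))
by_status = distribution(
    rules, lambda r: r.get("status", "unspecified"))

print(by_product)
print(by_status)
```

Swapping the key function, for example to the service field of logsource, gives the finer breakdown needed to separate Sysmon rules from the rest of the Windows rules.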

Let's sum up

There are a large number of rules in publicly available sources. They are updated regularly, and rules for detecting indicators, and sometimes even techniques, of the most high-profile APT campaigns appear quickly.

However, a number of limitations stand in the way of applying the rules in real life:

Many rules target Microsoft Sysmon, which is rarely used in the enterprise.
Many rules actually check IoCs (hashes, IP addresses, URLs, User-Agents). Such rules quickly become obsolete, and there are more efficient mechanisms than rules for IoC searching.
Much of the content is experimental, which imposes additional requirements for thorough testing before putting it into production.

At Infosecurity, we use the content of the Sigma rules as an additional source of knowledge for more efficient incident detection. If we find something interesting, we implement it within our own correlation rules, which take into account both our rule engine (Apache Spark) and the specifics of the infrastructures and security tools we use.

Source: https://habr.com/ru/post/442570/
