Hacking 101: Introduction to YARA rules

Akshay Jain
Jan 23, 2020

You might have heard about YARA rules. YARA is an open-source tool for identifying and classifying files, directories, and running processes on a system by matching their contents against patterns. Rules are run from an ordinary command line, and the result is a list of the rules that each scanned file matched. It's an extremely powerful tool.

YARA rules are a way of identifying malware (or other files) by creating rules that look for specific characteristics. YARA was originally developed by Victor Alvarez of VirusTotal and is mainly used in malware research and detection. It was designed to describe patterns that identify particular strains or entire families of malware.

YARA is a tool aimed at helping malware researchers identify and classify malware samples. With YARA you can create descriptions of malware families (or anything else you want to describe) based on textual or binary patterns. Each description, a.k.a. rule, consists of a set of strings and a Boolean expression that determines its logic. Below is an example:

rule test
{
    meta:
        description = "Example"
        threat_level = 3    // placeholder; meta values must be a string, integer, or boolean
    strings:
        $a = {6A 40 68 00 30 00 00 6A 14 8D 91}
        $b = {8D 4D B0 2B C1 83 C0 27 99 6A 4E 59 F7 F9}
        $c = "PEJXQZAKCBGMTUVODFRYSIHLNW"
    condition:
        ($a or $b) and $c
}

YARA rules are easy to write and understand, and their syntax resembles that of the C language. Rules are built from a set of reserved keywords such as rule, meta, strings, and condition.

Rules are generally composed of two sections: the strings definition and the condition. The strings definition section can be omitted if the rule doesn't rely on any string, but the condition section is always required. The strings definition section is where the strings that will be part of the rule are defined. Each string has an identifier consisting of a $ character followed by a sequence of alphanumeric characters and underscores; these identifiers can be used in the condition section to refer to the corresponding string. Strings can be defined in text or hexadecimal form, as shown in the following example:

rule Example_Rule
{
    strings:
        $my_text_string = "text"
        $my_hex_string = { E2 34 A1 C8 23 FB }
    condition:
        $my_text_string or $my_hex_string
}
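
To make the meaning of the two string forms concrete, here is a minimal Python sketch of what Example_Rule checks. It is a plain-Python emulation rather than YARA itself, and the function name example_rule_matches is made up for this illustration. It assumes the default behaviour of a text string with no modifiers, which is a case-sensitive ASCII match.

```python
# Plain-Python emulation of Example_Rule; this is not YARA itself.
TEXT_STRING = b"text"                             # $my_text_string = "text"
HEX_STRING = bytes.fromhex("E2 34 A1 C8 23 FB")   # $my_hex_string = { E2 34 A1 C8 23 FB }


def example_rule_matches(data: bytes) -> bool:
    """Mirror the condition: $my_text_string or $my_hex_string."""
    return TEXT_STRING in data or HEX_STRING in data


if __name__ == "__main__":
    # The text string appears somewhere in the data.
    print(example_rule_matches(b"some text here"))
    # Only the raw byte sequence appears in the data.
    print(example_rule_matches(b"\x00\xe2\x34\xa1\xc8\x23\xfb\x00"))
    # Neither string appears.
    print(example_rule_matches(b"nothing relevant"))
```

A hexadecimal string is just a sequence of raw bytes, so it can match data that contains no readable text at all.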

Each rule has to start with the keyword rule, followed by the rule's name or identifier. The identifier can contain any alphanumeric character and the underscore character, but the first character cannot be a digit. There is a list of YARA keywords that cannot be used as identifiers because they have a predefined meaning.
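
The naming rules above can be sketched as a small Python check. This is only an illustration: the helper is_valid_rule_name is hypothetical, and RESERVED_SUBSET is a deliberately partial list of keywords, so consult the YARA documentation for the complete set.

```python
import re

# Partial, illustrative subset of YARA's reserved keywords. The real list is
# longer; this subset is an assumption made to keep the sketch short.
RESERVED_SUBSET = {
    "rule", "meta", "strings", "condition",
    "and", "or", "not", "at", "in",
    "true", "false", "filesize",
}

# A letter or underscore first, then any mix of letters, digits, underscores.
IDENTIFIER_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_valid_rule_name(name: str) -> bool:
    """Return True if name follows the identifier rules described above."""
    if IDENTIFIER_RE.fullmatch(name) is None:
        return False
    return name not in RESERVED_SUBSET


if __name__ == "__main__":
    for candidate in ("Example_Rule", "_private1", "1st_rule", "my-rule", "condition"):
        print(candidate, is_valid_rule_name(candidate))
```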

Comments:

YARA rules can include comments in the same style as C source code. Both single-line and multi-line comments are supported.

/*
    This is a multi-line comment.
*/
rule Example // this is a single-line comment
{
    condition:
        true
}

Strings:

There are three types of strings in YARA: hexadecimal strings, text strings and regular expressions.

Hexadecimal strings are used for defining raw sequences of bytes, while text strings and regular expressions are useful for defining portions of legible text. However, text strings and regular expressions can also represent raw bytes by means of escape sequences.

Strings are what give the condition section its meaning. The strings section is where you define the strings that will be searched for in the file. Let's look at a simple example.

rule vendor
{
    strings:
        $text_string1 = "anything"
        $text_string2 = "dummy"
    condition:
        $text_string1 or $text_string2
}

Conditions:

Conditions are nothing more than Boolean expressions like those found in all programming languages, for example in an if statement. They can contain the typical Boolean operators and, or, and not, and the relational operators >=, <=, <, >, ==, and !=. In addition, the arithmetic operators (+, -, *, \, %) and the bitwise operators (&, |, <<, >>, ~, ^) can be used on numerical expressions.

String identifiers can also be used within a condition, acting as Boolean variables whose value depends on the presence or absence of the associated string in the file.

rule condition_Example
{
    strings:
        $a = "text1"
        $b = "text2"
        $c = "text3"
        $d = "text4"
    condition:
        ($a or $b) and ($c or $d)
}
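
As a rough illustration of how that condition evaluates, the following Python sketch treats each string identifier as a Boolean that is true when the string occurs anywhere in the data. It is an emulation for explanation only, and the function name condition_example is invented here.

```python
def condition_example(data: bytes) -> bool:
    """Emulate: ($a or $b) and ($c or $d)."""
    a = b"text1" in data   # $a
    b = b"text2" in data   # $b
    c = b"text3" in data   # $c
    d = b"text4" in data   # $d
    return (a or b) and (c or d)


if __name__ == "__main__":
    # One string from each group is present, so the condition holds.
    print(condition_example(b"...text1...text4..."))
    # Both strings come from the first group only, so it does not.
    print(condition_example(b"...text1...text2..."))
```

The rule therefore needs at least one string from each parenthesised group; any number of matches within a single group is not enough.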

String offsets:

In most cases, when a string identifier is used in a condition, we want to know whether the associated string is anywhere within the file or process memory. Sometimes, however, we need to know whether the string is at a specific offset in the file or at a specific virtual address within the process address space. In such situations, the at operator is what we need. It is used as shown in the following example:

rule AtExample
{
    strings:
        $a = "dummy1"
        $b = "dummy2"
    condition:
        $a at 100 and $b at 200
}

The expression $a at 100 in the above example is true only if string $a is found at offset 100 within the file (or at virtual address 100 if applied to a running process). The string $b must appear at offset 200. Note that both offsets are decimal; hexadecimal numbers can be written by adding the prefix 0x before the number, as in the C language, which comes in very handy when writing virtual addresses. Also note that the at operator has higher precedence than and.
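
The behaviour of at can be sketched in plain Python, under the assumption that we are scanning a file held in memory as bytes. The helpers string_at, at_example, and build_sample are hypothetical names used only for this illustration, and the second offset is written as 0xC8 to show that it is the same number as decimal 200.

```python
def string_at(data: bytes, pattern: bytes, offset: int) -> bool:
    """Emulate `$s at offset`: the pattern must start exactly at offset."""
    if offset < 0:
        return False
    return data.startswith(pattern, offset)


def at_example(data: bytes) -> bool:
    """Emulate: $a at 100 and $b at 200 (0xC8 == 200)."""
    return string_at(data, b"dummy1", 100) and string_at(data, b"dummy2", 0xC8)


def build_sample() -> bytes:
    """Build a 300-byte buffer with both strings at the expected offsets."""
    buf = bytearray(300)
    buf[100:106] = b"dummy1"
    buf[200:206] = b"dummy2"
    return bytes(buf)


if __name__ == "__main__":
    sample = build_sample()
    print(at_example(sample))
    # Shifting everything by one byte moves both strings off their offsets.
    print(at_example(b"\x00" + sample))
```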

While the at operator lets you search for a string at a fixed offset in the file or a fixed virtual address in a process memory space, the in operator lets you search for the string within a range of offsets or addresses.

rule example
{
    strings:
        $a = "dummy1"
        $b = "dummy2"
    condition:
        $a in (0..100) and $b in (100..filesize)
}

In the example above, string $a must be found at an offset between 0 and 100, while string $b must be at an offset between 100 and the end of the file. Again, numbers are decimal by default.
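
A similar Python sketch can illustrate the in operator. It is an emulation with two assumptions that should be checked against the YARA documentation: both ends of the range are treated as inclusive, and overlapping occurrences are counted. The helper names find_offsets, string_in, and in_example are made up for the example.

```python
from typing import List


def find_offsets(data: bytes, pattern: bytes) -> List[int]:
    """Return every offset at which pattern starts in data."""
    offsets = []
    pos = data.find(pattern)
    while pos != -1:
        offsets.append(pos)
        pos = data.find(pattern, pos + 1)
    return offsets


def string_in(data: bytes, pattern: bytes, low: int, high: int) -> bool:
    """Emulate `$s in (low..high)`, assuming inclusive bounds."""
    return any(low <= off <= high for off in find_offsets(data, pattern))


def in_example(data: bytes) -> bool:
    """Emulate: $a in (0..100) and $b in (100..filesize)."""
    filesize = len(data)
    return (string_in(data, b"dummy1", 0, 100)
            and string_in(data, b"dummy2", 100, filesize))


if __name__ == "__main__":
    # dummy1 starts at offset 0 and dummy2 at offset 156.
    print(in_example(b"dummy1" + b"\x00" * 150 + b"dummy2"))
    # dummy1 starts at offset 150, which is outside (0..100).
    print(in_example(b"\x00" * 150 + b"dummy1" + b"dummy2"))
```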

You can also get the offset or virtual address of the i-th occurrence of string $a by using @a[i]. The indexes are one-based, so the first occurrence would be @a[1], the second one @a[2], and so on. If you provide an index greater than the number of occurrences of the string, the result will be a NaN (Not A Number) value.
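
The one-based indexing of @a[i] is easy to get wrong, so here is a small Python sketch of the idea. It is an emulation only: occurrence_offset is a hypothetical helper, and it returns None as a stand-in for the undefined result YARA gives when the index exceeds the number of occurrences.

```python
from typing import List, Optional


def all_offsets(data: bytes, pattern: bytes) -> List[int]:
    """Return every offset at which pattern starts in data."""
    offsets = []
    pos = data.find(pattern)
    while pos != -1:
        offsets.append(pos)
        pos = data.find(pattern, pos + 1)
    return offsets


def occurrence_offset(data: bytes, pattern: bytes, i: int) -> Optional[int]:
    """Emulate @a[i]: offset of the i-th occurrence, counting from 1."""
    offsets = all_offsets(data, pattern)
    if 1 <= i <= len(offsets):
        return offsets[i - 1]   # convert the one-based index to zero-based
    return None                 # stand-in for YARA's undefined value


if __name__ == "__main__":
    data = b"xxABxxxABx"        # "AB" occurs at offsets 2 and 7
    print(occurrence_offset(data, b"AB", 1))
    print(occurrence_offset(data, b"AB", 2))
    print(occurrence_offset(data, b"AB", 3))
```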

Conclusion:

In short, YARA is flexible, powerful, and accessible. Its learning curve is gentle and its range of applications is wide. In a world where your adversary hides in plain sight and around the corner, it has impressive detection capability for shedding light on the suspicious, the malicious, or the just plain interesting. If it hasn't found a home in your toolkit yet, it deserves one.

Wanna connect:

Linkedin: https://www.linkedin.com/in/akshay-jain-533a79111/

Email: Akshayjain5@protonmail.com

You can also visit my GitHub account: https://github.com/akshay1729
