liminfo

YARA Reference

About YARA Reference

The YARA Reference is a searchable quick reference to the syntax and features of YARA, the pattern-matching tool used by malware researchers, incident responders, and threat intelligence analysts to identify and classify malicious files. It is organized into six categories:

  • Strings: text string definitions with modifiers like nocase, wide, ascii, and fullword; hex string patterns with byte sequences
  • Conditions: boolean logic with $a and $b, string occurrence counts with #a, offset conditions with @a, set operations like any of them and all of ($a*), and N-of-M matching with 2 of ($a, $b, $c)
  • Modules: import "pe" for PE file analysis, import "elf" for ELF binaries, import "math" for entropy calculation, import "hash" for MD5/SHA hash verification
  • Metadata: author, description, date, and reference fields for rule documentation
  • Regex: regular expression strings with /pattern/ syntax, the case-insensitive /i flag, and the single-line /s flag
  • Execution: yara CLI commands for single-file scanning, recursive directory scanning with -r, matched string display with -s, and count-only output with -c

YARA is the de facto standard for writing malware signatures in the security industry. Unlike hash-based detection, which breaks when a single byte changes, YARA rules match patterns in file content: specific strings, byte sequences, file structure characteristics, and statistical properties like entropy. Security operations centers (SOCs) deploy YARA rules to scan incoming files, email attachments, and network-captured binaries. Threat intelligence teams share YARA rules through platforms like VirusTotal and industry ISACs, and may map them to MITRE ATT&CK techniques, to enable collective defense against emerging threats. YARA's modules (PE, ELF, math, hash) enable structural analysis beyond simple byte matching.

Each entry includes the exact YARA syntax, a description of its purpose, and a complete working example. The string section covers text strings with modifiers (nocase for case-insensitive matching, wide for two-byte-per-character strings such as UTF-16, ascii wide for both encodings, fullword for word-boundary matching) and hex strings for raw byte patterns like MZ headers ({ 4D 5A 90 00 }). The condition section demonstrates boolean logic: counting occurrences (#a > 3), checking offsets (@a < 100), and set operations (any of them, all of ($a*), 2 of ($a, $b, $c)). Module examples show practical detection patterns: checking PE section counts, verifying ELF types, detecting high entropy (packed/encrypted content), and matching known file hashes. All content is browsable with instant search and dark mode support.
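As a rough illustration of how metadata, a regex string, and count/offset conditions fit together in one rule, here is a minimal sketch. The rule name, metadata values, the gate.php URL pattern, and the thresholds are all hypothetical placeholders chosen for illustration, not taken from any real signature.

```yara
rule Example_Meta_Regex_Counts
{
    meta:
        // All metadata values below are placeholders
        author      = "analyst (placeholder)"
        description = "Illustrative rule: metadata, regex, count and offset conditions"
        date        = "2024-01-01"
        reference   = "internal note (placeholder)"

    strings:
        $mz  = { 4D 5A }                                  // "MZ" magic bytes
        $url = /https?:\/\/[a-z0-9.]{4,64}\/gate\.php/i   // case-insensitive regex (hypothetical pattern)
        $cmd = "cmd.exe /c" nocase

    condition:
        $mz at 0 and       // MZ must sit at offset 0
        #cmd > 3 and       // the command string occurs more than three times
        @url[1] < 4096     // the first regex hit starts within the first 4 KiB
}
```

Metadata fields do not affect matching; they only document the rule. If $url never matches, @url[1] is undefined and the condition evaluates to false.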

Key Features

  • String definitions: text strings with nocase, wide, ascii, fullword modifiers; hex byte patterns like { 4D 5A 90 00 } for signature matching
  • Condition logic: boolean AND/OR, occurrence counting (#a > 3), offset conditions (@a < 100), set operations (any of them, all of, N of M)
  • PE module: import "pe" for Windows executable analysis including pe.number_of_sections, section names, imports, and exports
  • ELF, math, and hash modules: ELF binary type checking, entropy calculation (math.entropy > 7.0 for packed detection), MD5/SHA hash matching
  • Metadata fields: author, description, date, and reference for rule documentation, attribution, and threat intelligence sharing
  • Regex support: /pattern/ syntax with /i (case-insensitive) and /s (single-line/dotall) flags for flexible pattern matching
  • CLI execution commands: yara for single-file scan, -r for recursive directory scan, -s for matched string display, -c for count output
  • Instant search and category filtering across all YARA syntax elements with no server processing

Frequently Asked Questions

What YARA features does this reference cover?

The reference covers six categories: Strings (text and hex string definitions with nocase, wide, ascii, fullword modifiers), Conditions (boolean logic, occurrence counting, offset conditions, set operations like any of them and N-of-M), Modules (PE for Windows executables, ELF for Linux binaries, math for entropy, hash for MD5/SHA verification), Metadata (author, description, date, reference fields), Regex (regular expression patterns with /i and /s flags), and Execution (yara CLI commands for scanning files and directories).

What is the difference between nocase, wide, ascii, and fullword modifiers?

nocase makes string matching case-insensitive, so "Malware" matches "malware", "MALWARE", etc. wide matches strings stored with two bytes per character (common in Windows binaries), where each ASCII character is followed by a null byte, as in UTF-16LE text. ascii matches standard single-byte strings and is the default when neither ascii nor wide is given. You can combine ascii wide to match both encodings. fullword ensures the string appears as a complete word: "domain" with fullword matches "domain.com" but not "subdomain", because the match must be delimited by non-alphanumeric characters or the start or end of the file.
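The four modifiers described above could be exercised in a single rule along these lines. This is only a sketch; the rule name and the literal strings are arbitrary examples.

```yara
rule Example_String_Modifiers
{
    strings:
        $a = "malware" nocase          // also matches "Malware", "MALWARE", ...
        $b = "kernel32.dll" wide       // two bytes per character: k\x00e\x00r\x00...
        $c = "powershell" ascii wide   // matches either encoding
        $d = "domain" fullword         // matches in "domain.com", not in "subdomain"

    condition:
        any of them
}
```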

How do I detect packed or encrypted files with YARA?

Use the math module to check entropy: import "math" and then set the condition to math.entropy(0, filesize) > 7.0. Entropy is measured in bits per byte on a scale of 0 to 8; files above 7.0 are likely compressed, packed, or encrypted, as their byte distribution is nearly random. Legitimate compressed formats such as archives and some media files also score high, so entropy works best as one signal among several. You can combine it with PE module checks such as pe.number_of_sections > 5 or tests for unusual section names, which can indicate packing. For specific packers, match their characteristic byte signatures using hex strings.
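One way the entropy check and the PE heuristics mentioned above might be combined is sketched below. The rule name is hypothetical, and the "UPX0" section name is just one example of a packer-specific section; the thresholds are the ones quoted in the text, not tuned values.

```yara
import "math"
import "pe"

rule Example_High_Entropy_PE
{
    condition:
        uint16(0) == 0x5A4D and                // file starts with "MZ"
        math.entropy(0, filesize) > 7.0 and    // near-random byte distribution
        (
            pe.number_of_sections > 5 or
            // example packer section name; adjust for the packer of interest
            for any i in (0 .. pe.number_of_sections - 1) : (
                pe.sections[i].name == "UPX0"
            )
        )
}
```

Restricting the rule to PE files first keeps ordinary archives and media files, which are also high-entropy, from matching.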

How does the "any of them" condition work?

"any of them" matches if any of the strings defined in the rule are found in the target file. This is useful when a malware family uses different variants of command strings. "all of them" requires every defined string to be present. You can use wildcards: "all of ($a*)" requires all strings starting with $a. For more precise control, "2 of ($a, $b, $c)" requires at least 2 of the 3 specified strings to match. These set operations are essential for writing rules that detect malware families with variable indicators.
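A small sketch of these set operations follows. The rule name and strings are illustrative only, picked to show a wildcard set and an N-of-M set side by side.

```yara
rule Example_Set_Operations
{
    strings:
        $a1 = "cmd.exe /c"
        $a2 = "powershell -enc"
        $b  = "CreateRemoteThread"
        $c  = "VirtualAllocEx"
        $d  = "WriteProcessMemory"

    condition:
        all of ($a*) or        // both $a1 and $a2 are present
        2 of ($b, $c, $d)      // at least two of the three API names are present
}
```

Replacing the condition with any of them or all of them would apply the same test across every string defined in the rule.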

How do I use hex strings to match byte patterns?

Define hex strings with curly braces: $hex = { 4D 5A 90 00 } matches that exact byte sequence (in this case, the first bytes of a typical MZ header found at the start of PE files). You can use wildcards: { 4D 5A ?? 00 }, where ?? matches any single byte. Jumps are specified with brackets: { 4D 5A [2-4] 00 } matches 2 to 4 arbitrary bytes between 4D 5A and 00. Alternatives use parentheses: { (4D 5A | 7F 45) } matches either the MZ magic or the first two bytes of the ELF magic (7F 45 4C 46). These features make hex strings powerful for matching binary signatures that have minor variations across samples.
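The hex string forms above could be collected in one rule as a sketch. The rule and string names are hypothetical, and the alternation is extended to the full four-byte ELF magic here, a small deviation from the two-byte form in the text, so that both branches are less ambiguous.

```yara
rule Example_Hex_Strings
{
    strings:
        $exact    = { 4D 5A 90 00 }                    // exact byte sequence
        $wildcard = { 4D 5A ?? 00 }                    // ?? matches any single byte
        $jump     = { 4D 5A [2-4] 00 }                 // 2 to 4 arbitrary bytes in between
        $alt      = { ( 4D 5A 90 00 | 7F 45 4C 46 ) }  // MZ header start or ELF magic

    condition:
        any of them
}
```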

What are YARA modules and which ones are covered?

YARA modules extend rule capabilities beyond simple string matching. This reference covers four modules: pe (analyzes Windows PE files — section count, imports, exports, timestamps, resources), elf (analyzes Linux ELF binaries — file type ET_EXEC/ET_DYN, sections, symbols), math (mathematical functions — entropy calculation for detecting packed/encrypted content), and hash (cryptographic hashes — verify MD5, SHA1, SHA256 of files or file regions). Modules are loaded with import "module_name" and their functions/properties are used in the condition section.
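As a hedged sketch of how each module might appear in a condition, consider the three small rules below. The rule names, the imported function, and the section-count threshold are illustrative, and the MD5 value is the well-known hash of empty input, standing in as a placeholder for a real sample hash.

```yara
import "pe"
import "elf"
import "hash"

rule Example_PE_Checks
{
    condition:
        pe.number_of_sections > 5 and
        pe.imports("kernel32.dll", "VirtualAlloc")    // example DLL/function pair
}

rule Example_ELF_Type
{
    condition:
        elf.type == elf.ET_EXEC or elf.type == elf.ET_DYN
}

rule Example_Known_Hash
{
    condition:
        // placeholder hash; hash strings are compared in lowercase hex
        hash.md5(0, filesize) == "d41d8cd98f00b204e9800998ecf8427e"
}
```

When a file is not of the format a module parses, that module's fields are undefined and the comparison evaluates to false, so the PE rule does not fire on ELF files and vice versa.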

How do I scan files with the YARA command-line tool?

Basic scan: yara rules.yar target.exe applies rules to a single file. Recursive directory scan: yara -r rules.yar /path/to/directory/ scans all files in the directory tree. Show matched strings with offsets: yara -s rules.yar target.exe displays which strings matched and where. Count-only mode: yara -c rules.yar target.exe prints the file name followed by the number of matching rules instead of listing each rule. You can combine flags: yara -r -s rules.yar /samples/ recursively scans and shows matched strings for all files.

Is any data sent to a server when using this reference?

No. The entire YARA reference dataset is embedded in the page and rendered client-side. Searching, filtering by category (Strings, Conditions, Modules, Metadata, Regex, Execution), and browsing entries all happen within your browser using JavaScript. No YARA rules, file data, or search queries are transmitted to any server.