Just a brief pointer today to a useful write-up of YARA performance guidelines: https://github.com/Neo23x0/YARA-Performance-Guidelines/blob/master/README.md
The write-up contains a lot of good insights into YARA rule performance.
Musings on Technology
Today I’ll share just a short snippet that I used to look for some specific scheduled tasks on a Windows system. Luckily, Windows stores scheduled tasks as XML files, located somewhere like the C:\Windows\System32\Tasks folder. Each file contains an XML representation of a scheduled task, and it is these files that I am scanning with YARA.
Here’s a quick example of the rule:
// Detects the scheduled task on a Windows machine that runs Ransim
rule RansimTaskDetect : RansomwareScheduledTask {
    meta:
        author = "Ben Meadowcroft (@BenMeadowcroft)"
    strings:
        // Microsoft XML Task files are UTF-16 encoded so using wide strings
        $ransim = "ransim.ps1 -mode encrypt</Arguments>" ascii wide nocase
        $task1 = "<Task" ascii wide
        $task2 = "xmlns=\"http://schemas.microsoft.com/windows/2004/02/mit/task\">" ascii wide
    condition:
        all of them
}
The scheduled task runs a short PowerShell script that simulates some basic ransomware behavior, and this rule just matches the XML file for that task. The file is encoded in UTF-16, so the $task1 and $task2 strings use the wide modifier to match text that is common to these XML files (the start of the <Task element, and the XML namespace used to define the schema). Used together, the ascii wide modifiers search for the string in both ASCII and wide (double-byte) form. The remaining string looks for the invocation of the script as an argument to the task, and the nocase modifier makes that match ignore case.
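To make the ascii and wide modifiers more concrete, here is a rough Python sketch of the byte patterns they correspond to. It is an illustration of the idea rather than how YARA works internally: the helper names (wide, contains, contains_nocase, looks_like_ransim_task) and the sample XML are made up for this example, and it assumes plain ASCII text, where YARA's wide form (each character followed by a zero byte) is the same as UTF-16LE.

```python
# Hypothetical helpers that mimic the rule's string matching on raw bytes.

def wide(text: str) -> bytes:
    """Bytes a `wide` string looks for: each ASCII character followed by 0x00."""
    return text.encode("utf-16-le")

def contains(data: bytes, text: str) -> bool:
    """Approximate `ascii wide`: case-sensitive search in both forms."""
    return text.encode("ascii") in data or wide(text) in data

def contains_nocase(data: bytes, text: str) -> bool:
    """Approximate `ascii wide nocase`: lower-case both sides before searching."""
    lowered = data.lower()  # bytes.lower() only changes ASCII letters
    needle = text.lower()
    return needle.encode("ascii") in lowered or wide(needle) in lowered

def looks_like_ransim_task(data: bytes) -> bool:
    """Mirror the rule's `all of them` condition."""
    return (
        contains_nocase(data, "ransim.ps1 -mode encrypt</Arguments>")
        and contains(data, "<Task")
        and contains(data, 'xmlns="http://schemas.microsoft.com/windows/2004/02/mit/task">')
    )

# Made-up sample resembling a task file; real files contain many more elements.
sample_xml = (
    '<?xml version="1.0" encoding="UTF-16"?>\n'
    '<Task version="1.2" xmlns="http://schemas.microsoft.com/windows/2004/02/mit/task">\n'
    "<Arguments>-File RanSim.ps1 -Mode Encrypt</Arguments>\n"
    "</Task>\n"
)
# Byte order mark followed by UTF-16LE text, which is how the file sits on disk.
sample_bytes = b"\xff\xfe" + sample_xml.encode("utf-16-le")

print(wide("<Task"))
print(looks_like_ransim_task(sample_bytes))
```

Because both forms are checked, the same function also matches an ASCII copy of the XML, which is the practical effect of combining ascii and wide in the rule.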
If I were looking for the presence of a task on a live system then I would of course have other tools I could use, such as schtasks /query. However, as I am often operating on the backups of a system, this file-based approach can be very helpful: it doesn’t rely on the availability of the primary system when I want to identify whether a scheduled task was present at some historical point in time.
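The same file-based idea can be sketched outside of YARA as well. The snippet below is a minimal illustration, not a replacement for the rule: it walks a folder of task files (for example a Tasks folder restored from a backup) and lists the command and arguments of each Exec action. The function name list_task_actions, the demo file name RansimDemo, and the sample task XML are assumptions made up for the example; the namespace is the one used in the rule above.

```python
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path

# Namespace taken from the task XML files; "t" is just a local prefix.
TASK_NS = {"t": "http://schemas.microsoft.com/windows/2004/02/mit/task"}

def list_task_actions(tasks_root: Path):
    """Yield (file, command, arguments) for each Exec action under tasks_root."""
    for path in sorted(p for p in tasks_root.rglob("*") if p.is_file()):
        try:
            # Parsing the raw bytes lets the XML parser handle the UTF-16 encoding.
            root = ET.fromstring(path.read_bytes())
        except ET.ParseError:
            continue  # not a parseable XML file; skip it
        for action in root.iterfind(".//t:Exec", TASK_NS):
            command = action.findtext("t:Command", default="", namespaces=TASK_NS)
            arguments = action.findtext("t:Arguments", default="", namespaces=TASK_NS)
            yield path, command, arguments

# Demo against a temporary folder standing in for a restored backup.
with tempfile.TemporaryDirectory() as tmp:
    restored = Path(tmp) / "Tasks"
    restored.mkdir()
    xml_text = (
        '<?xml version="1.0" encoding="UTF-16"?>'
        '<Task version="1.2" xmlns="http://schemas.microsoft.com/windows/2004/02/mit/task">'
        "<Actions><Exec>"
        "<Command>powershell.exe</Command>"
        "<Arguments>-File ransim.ps1 -mode encrypt</Arguments>"
        "</Exec></Actions></Task>"
    )
    (restored / "RansimDemo").write_bytes(b"\xff\xfe" + xml_text.encode("utf-16-le"))
    found = [(p.name, cmd, args) for p, cmd, args in list_task_actions(restored)]

for name, cmd, args in found:
    print(name, cmd, args)
```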
My prior posts about examining ZIP archives have covered matching the file names within a ZIP archive, as well as matching the pre-compression CRC values of the files within the archive. In this post I am going to reference an interesting example of parsing the OOXML format used by modern Microsoft Office products. This format is essentially a ZIP archive containing a set of files that describe the Office document.
Aaron Stephens at Mandiant wrote a blog post called “Detecting Embedded Content in OOXML Documents”. In it, Aaron shared a few different techniques for detecting and clustering Microsoft Office documents. One example was detecting a specific PNG file embedded within documents; the image was used to guide the user towards enabling macros. The presence of the image in a phishing document could be used to cluster related attacks.
Given the image file’s CRC, its size, and the fact that it was a PNG file, the author was able to create a YARA rule that would match if this image was located within the OOXML document (essentially a ZIP archive). This rule approaches the ZIP file a little differently than my prior couple of posts did. The author skips looking for the start of the ZIP file entry and instead references the CRC ($crc) and uncompressed file size ($ufs) hex strings directly to narrow down the match. The rule also checks that the file name field ends with the ".png" extension.
rule png_397ba1d0601558dfe34cd5aafaedd18e {
    meta:
        author = "Aaron Stephens <[email protected]>"
        description = "PNG in OOXML document."
    strings:
        $crc = {f8158b40}
        $ext = ".png"
        $ufs = {b42c0000}
    condition:
        $ufs at @crc[1] + 8 and $ext at @crc[1] + uint16(@crc[1] + 12) + 16 - 4
}
In this example the condition uses @crc[1], the offset of the first match of $crc, as the base from which the other offsets are calculated, unlike our prior examples where the offsets were calculated from the start of the local file header. The at operator tests for the presence of the other strings at a specific offset (relative to the CRC value in this case).
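To see where the +8, +12 and +16 in that condition come from, here is a small Python sketch that builds a ZIP in memory and checks the same offsets relative to the CRC field of the local file header. The payload and file name are made up for the example (this is not the image from the original rule), and ZIP_STORED is used only to keep the example simple.

```python
import io
import struct
import zipfile
import zlib

# Hypothetical payload and name standing in for the embedded image.
payload = b"not really a PNG, just sample bytes" * 10
name = "word/media/image1.png"

# Build a small archive in memory with a fixed timestamp so the bytes are stable.
buffer = io.BytesIO()
info = zipfile.ZipInfo(name, date_time=(2020, 1, 1, 0, 0, 0))
with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_STORED) as archive:
    archive.writestr(info, payload)
data = buffer.getvalue()

# The CRC-32 and sizes are stored little-endian, like the hex strings in the rule.
crc_bytes = struct.pack("<I", zlib.crc32(payload))
ufs_bytes = struct.pack("<I", len(payload))

# Equivalent of @crc[1]: the offset of the first occurrence of the CRC bytes.
crc_at = data.find(crc_bytes)

# Layout after the CRC field in a local file header:
#   +4  compressed size (4 bytes)
#   +8  uncompressed size (4 bytes)
#   +12 file name length (2 bytes)
#   +14 extra field length (2 bytes)
#   +16 file name
ufs_ok = data[crc_at + 8:crc_at + 12] == ufs_bytes              # $ufs at @crc[1] + 8
name_len = struct.unpack_from("<H", data, crc_at + 12)[0]       # uint16(@crc[1] + 12)
ext_at = crc_at + 16 + name_len - 4                             # last 4 bytes of the name
ext_ok = data[ext_at:ext_at + 4] == b".png"

print(crc_at, ufs_ok, name_len, ext_ok)
```

The file name starts 16 bytes after the CRC field, so adding the name length and subtracting 4 lands on the last four characters of the name, which is where the extension is expected.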
An alternative approach to consider is using the wildcard character ? in the hex string. This allows us to match the CRC and uncompressed file size fields together while skipping over the 4 bytes that store the compressed file size. The condition then only needs to validate that the four-character .png extension is at the end of the file name field.
rule png_alt {
    strings:
        $crc_ufs = {f8158b40 ???????? b42c0000}
        $ext = ".png"
    condition:
        $ext at @crc_ufs[1] + uint16(@crc_ufs[1] + 12) + 16 - 4
}
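For readers more used to regular expressions, the wildcard hex string behaves much like a byte pattern with four "any byte" positions in the middle. The sketch below mirrors the png_alt logic in Python under that assumption; the function names png_alt_matches and fake_header_fragment are hypothetical, and the test data is a hand-built fragment rather than a real document.

```python
import re
import struct

CRC = bytes.fromhex("f8158b40")
UFS = bytes.fromhex("b42c0000")  # little-endian uncompressed size

# {f8158b40 ???????? b42c0000}: CRC, any 4 bytes (compressed size), then the size.
CRC_UFS = re.compile(re.escape(CRC) + b".{4}" + re.escape(UFS), re.DOTALL)

def png_alt_matches(data: bytes) -> bool:
    """Mirror the png_alt condition using the first match, like @crc_ufs[1]."""
    match = CRC_UFS.search(data)
    if match is None:
        return False
    base = match.start()
    if base + 14 > len(data):
        return False  # not enough bytes left to read the name length
    name_len = struct.unpack_from("<H", data, base + 12)[0]
    ext_at = base + 16 + name_len - 4
    return data[ext_at:ext_at + 4] == b".png"

def fake_header_fragment(name: bytes, compressed_size: int) -> bytes:
    """Hand-built bytes from the CRC field onwards, for illustration only."""
    return (
        CRC
        + struct.pack("<I", compressed_size)
        + UFS
        + struct.pack("<HH", len(name), 0)  # name length, extra field length
        + name
    )

print(png_alt_matches(b"junk" + fake_header_fragment(b"word/media/image1.png", 9000)))
print(png_alt_matches(b"junk" + fake_header_fragment(b"word/media/image1.jpg", 9000)))
```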