Using Bloom Filters to Efficiently Filter Out "Known Good"
There are many times in security investigations where we want to quickly filter out “Known Good” and only focus on what remains. When these are files shipped as part of the operating system, you can use Authenticode and Code Signing to figure this out. But what if this is something more ad-hoc, like the command lines used in Windows Scheduled Tasks?
You could store all of this in a database, schematize it, and then figure out a way to query it at scale. An alternative would be to hash the data (for example, the command line) and store the hashes as a data set somewhere.
A big problem with the hash approach, however, is that it takes up a relatively large amount of space. My system has 177 scheduled tasks on it, representing about 6KB of content. At 64 bytes per SHA256 hash written as a hex string (the raw hash is 32 bytes), that’s already 11KB of data.
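The arithmetic behind that size estimate can be checked with a short Python sketch. The command line below is a made-up placeholder, not one of the real scheduled tasks; only the count of 177 comes from the text above.

```python
import hashlib

# Hypothetical command line, used only to measure the size of one hash entry.
command_line = "hypothetical.exe /example-argument"

digest = hashlib.sha256(command_line.encode("utf-8"))
raw_bytes = len(digest.digest())       # 32 bytes in raw binary form
hex_chars = len(digest.hexdigest())    # 64 characters as a hex string

task_count = 177
hex_list_bytes = task_count * hex_chars   # 11328 bytes, about 11KB
raw_list_bytes = task_count * raw_bytes   # 5664 bytes, about 5.5KB

print(hex_list_bytes, raw_list_bytes)
```

Even in raw binary form, the list of hashes is about as large as the 6KB of content it describes.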
One data structure that’s been used to address the challenge of keeping large sets of hashes in memory is called a Bloom Filter. Bloom Filters can tell you with 100% certainty when something is not in the data set. They can also tell you that something is in the data set, subject to a very small false positive rate. At a false positive rate of 1 in a billion, roughly one lookup in a billion for an item that isn’t in the data set will be reported as present when it really wasn’t.
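To make the mechanics concrete, here is a minimal Bloom filter sketch in Python. It is an illustration only and makes no claim about how any particular tool stores its bits: the class name, the SHA256-based double hashing, the 8192-bit size, the 30 hash positions, and the task command lines are all assumptions chosen for the example.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter sketch (illustrative layout, not a real tool's format)."""

    def __init__(self, bit_count, hash_count):
        self.bit_count = bit_count
        self.hash_count = hash_count
        self.bits = bytearray((bit_count + 7) // 8)

    def _positions(self, item):
        # Derive hash_count bit positions from one SHA256 digest using
        # double hashing: position_i = (h1 + i * h2) mod bit_count.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # make h2 odd so it is never 0
        return [(h1 + i * h2) % self.bit_count for i in range(self.hash_count)]

    def add(self, item):
        # Adding an item sets every one of its bit positions.
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        # An item can only be present if all of its bit positions are set.
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item)
        )


# Hypothetical "known good" command lines standing in for real scheduled tasks.
known_good = [f"hypothetical-task-{i}.exe /run" for i in range(177)]

bloom = BloomFilter(bit_count=8192, hash_count=30)
for command in known_good:
    bloom.add(command)

all_known_found = all(command in bloom for command in known_good)
tampered_found = "hypothetical-task-5.exe /run; evil.exe" in bloom

print(all_known_found, tampered_found)
```

Every item that was added always tests as present, because adding it set all of its bits. That is why a negative answer is certain. A modified command line maps to different bit positions, so it tests as absent unless all of those bits happen to have been set by other items, which is the source of the small false positive rate.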
This probabilistic feature makes Bloom Filters very efficient. For that same set of 177 scheduled tasks, a Bloom Filter can encode these items into about 1KB of data, roughly 10 times smaller than a list of hashes. Encoded as Base64, the filter is easy to include in automation, database queries, Sentinel queries, and more.
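The 1KB figure is consistent with the standard Bloom filter sizing formulas, where the number of bits is m = -n ln(p) / (ln 2)^2 and the optimal number of hash functions is k = (m / n) ln 2. The Python sketch below applies them to 177 items at a 1 in a billion false positive rate. The function name is hypothetical, and the text does not say that any particular tool uses exactly these parameters.

```python
import base64
import math


def bloom_parameters(item_count, false_positive_rate):
    """Standard sizing formulas for an optimally tuned Bloom filter."""
    bit_count = math.ceil(
        -item_count * math.log(false_positive_rate) / (math.log(2) ** 2)
    )
    hash_count = max(1, round(bit_count / item_count * math.log(2)))
    return bit_count, hash_count


bit_count, hash_count = bloom_parameters(177, 1e-9)
byte_count = math.ceil(bit_count / 8)

# Base64 turns every 3 bytes into 4 characters, so the text form is about a
# third larger than the raw filter. An all-zero buffer stands in for the bits.
base64_length = len(base64.b64encode(bytes(byte_count)))

print(bit_count, hash_count, byte_count, base64_length)
```

Under these assumptions the filter needs a little under 1KB of raw data, or about 43 bits per item, and its Base64 form is still small enough to paste into a script or query.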
To make Bloom Filters easy to work with, I’ve uploaded New-BloomFilter and Test-BloomFilter to the PowerShell Gallery.
You can see this in action with this example, where our Bloom Filter can easily detect that a scheduled task has been tampered with.
These are distributed on the PowerShell Gallery as scripts rather than a module to make it easy for you to package the Test-BloomFilter command into endpoint investigation and remediation tools. You can download them from the PowerShell Gallery using the Install-Script command: