Event Stream Processing (ESP) has been a central component of CrowdStrike Falcon®’s IOA approach since CrowdStrike's inception.
In this post we'll take a closer look at ESP — along with its utility and challenges — in an endpoint protection platform like CrowdStrike Falcon®. ESP is really a category of approaches, with a subset of those approaches being commonly referred to as Complex Event Processing (CEP).
Applying ESP to information security is an approach that has been evaluated in various forms, at least academically, since the 1990s.
Basically, ESP can be contrasted with systems that use retrospective, or "offline" analysis methods, such as table-based queries commonly associated with SQL, Splunk, and cloud technologies such as column-oriented databases.
For systems intended to identify patterns in a stream of events, retrospective queries are inefficient – frequently requiring that redundant computations be performed repeatedly.
ESP’s intent is to answer the same questions as offline retrospective querying, but by using more efficient "online" algorithms, which do not require access to an entire and finite data set in order to perform useful analysis. Here is an example.
Suppose you have a stream of process creation events from endpoint sensors.
Each event might contain information such as:
- Identifier for the machine
- Identifier for the process
- Identifier for the parent process
- Filename of the created process's executable
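Suppose we want to detect every instance of cmd.exe that was launched by iexplore.exe. With a retrospective query system like SQL, we would need a nested query that first finds all process instances where ImageFileName=='cmd.exe', and then joins that result set with another query on ImageFileName=='iexplore.exe', keeping only the pairs where the ParentProcessId of the cmd.exe event equals the ProcessId of the iexplore.exe event.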
This search is inefficient, since we must make two passes through the data.
What’s worse, running it as a standing query repeats the same work on every execution, because each run rescans events that earlier runs already examined.
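To make the retrospective approach concrete, here is a minimal sketch using Python's built-in sqlite3 module. It is written as a self-join rather than a literal nested query, and the table name `process_events`, the `MachineId` column, and the sample rows are assumptions made up for illustration; only `ImageFileName`, `ProcessId`, and `ParentProcessId` come from the description above.

```python
import sqlite3

def find_cmd_under_iexplore(events):
    """Retrospective query: store every event first, then join the table to itself."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE process_events ("
        "MachineId TEXT, ProcessId INTEGER, "
        "ParentProcessId INTEGER, ImageFileName TEXT)"
    )
    conn.executemany("INSERT INTO process_events VALUES (?, ?, ?, ?)", events)
    # The same table is read under two aliases: once for the cmd.exe rows
    # (child) and once for the iexplore.exe rows (parent).
    rows = conn.execute(
        """
        SELECT child.MachineId, child.ProcessId, parent.ProcessId
        FROM process_events AS child
        JOIN process_events AS parent
          ON child.MachineId = parent.MachineId
         AND child.ParentProcessId = parent.ProcessId
        WHERE child.ImageFileName = 'cmd.exe'
          AND parent.ImageFileName = 'iexplore.exe'
        ORDER BY child.MachineId, child.ProcessId
        """
    ).fetchall()
    conn.close()
    return rows

# Hypothetical sample data: (MachineId, ProcessId, ParentProcessId, ImageFileName)
SAMPLE_EVENTS = [
    ("host-a", 100, 1, "explorer.exe"),
    ("host-a", 200, 100, "iexplore.exe"),
    ("host-a", 300, 200, "cmd.exe"),   # parent is iexplore.exe: should match
    ("host-a", 400, 100, "cmd.exe"),   # parent is explorer.exe: no match
    ("host-b", 300, 200, "cmd.exe"),   # same PIDs, but no iexplore.exe on host-b
]

matches = find_cmd_under_iexplore(SAMPLE_EVENTS)
```

Every call has to consider the full table again, which is the redundancy described above when the query is re-run as new events arrive.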
In contrast, ESP provides a much more efficient approach by statefully holding onto only relevant data, and then correlating later events with that information. One straightforward ESP-based approach would be to store each instance of iexplore.exe as it is observed on the endpoint, hanging onto that knowledge for later correlation.
When an instance of cmd.exe is observed, we can take the ParentProcessId of the new event and compare it with the current set of saved iexplore.exe ProcessIds.
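This approach is more efficient than the retrospective query because each event is examined only once, and the only state retained is the set of iexplore.exe ProcessIds.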
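The stateful correlation just described can be sketched in a few lines of Python. This is an illustration only, not how the Falcon sensor is implemented: the `ProcessEvent` type, its field names, and the `correlate` function are hypothetical, and the sketch assumes that a parent's creation event always arrives before its child's. It also ignores process exit and ProcessId reuse, which a real system would have to handle.

```python
from typing import Iterable, Iterator, NamedTuple

class ProcessEvent(NamedTuple):
    """Hypothetical process creation event with the four fields listed earlier."""
    machine_id: str
    process_id: int
    parent_process_id: int
    image_file_name: str

def correlate(events: Iterable[ProcessEvent]) -> Iterator[ProcessEvent]:
    """Yield each cmd.exe event whose parent is a previously seen iexplore.exe."""
    # The only state kept: (machine, pid) pairs for iexplore.exe instances seen so far.
    browsers = set()
    for event in events:
        name = event.image_file_name.lower()
        if name == "iexplore.exe":
            # Remember this instance for later correlation.
            browsers.add((event.machine_id, event.process_id))
        elif name == "cmd.exe":
            # Compare the new event's parent against the saved iexplore.exe set.
            if (event.machine_id, event.parent_process_id) in browsers:
                yield event

# Hypothetical stream, in arrival order.
stream = [
    ProcessEvent("host-a", 200, 100, "iexplore.exe"),
    ProcessEvent("host-a", 300, 200, "cmd.exe"),   # child of iexplore.exe
    ProcessEvent("host-a", 400, 100, "cmd.exe"),   # child of something else
    ProcessEvent("host-b", 300, 200, "cmd.exe"),   # different machine
]

hits = list(correlate(stream))
```

Because `correlate` is a generator, it works on an unbounded stream: each event is handled once as it arrives, and a match is reported as soon as the cmd.exe event is seen.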
This example is highly simplified.
There are many approaches that can be classified as ESP, but this stateful correlation approach is a straightforward starting point to explain the concept. Keep in mind that while ESP is a very useful tool, it is not, by itself, a breakthrough technology.
And while ESP itself is not difficult in theory, building a robust ESP-based commercial solution poses many challenges.
Two important ones are scale and efficiency. While CrowdStrike Falcon® is perhaps best known for its class-leading cloud technology, an important and often overlooked aspect of its platform is the endpoint sensor itself.
Being able to efficiently perform ESP correlation on the sensor (and in the kernel!) is unique in the industry. By performing ESP on sensors, in addition to correlation that happens in the cloud, the CrowdStrike Falcon® platform can operate on data at scales that would be prohibitive if all of the data had to be centralized.
For example, while CrowdStrike Falcon® gathers and processes a lot of data proactively in the cloud, sending all registry read operations to the cloud would multiply the data transmission, storage, and computational costs by perhaps 1000X.
And registry reads are useful for ESP correlation. Clearly, having to first centralize all data before being able to correlate it is the wrong approach.
Yet somehow, that bottleneck-laden approach is still common practice. However, simply "doing ESP" — even when correlation is done on the endpoint — is still not sufficient to create a detection and prevention platform that is truly “next-generation.”
Another important consideration is the nature of the events themselves, because details matter.
The CrowdStrike Falcon® sensor has access to over 1,000 types of events, many of which provide the sensor with data that is entirely unique in the industry, resulting in a detection and prevention capability that is second to none.
These events indicate activity ranging from simple file I/O operations to privilege escalation.
Behavioral IOA correlation ties these together to detect and prevent malicious activity.
The result is technology sophisticated enough to detect when credential theft is occurring from a reflectively injected module in PowerShell, and to prevent that activity before it can actually be observed by the attacker. Another significant challenge that faces a would-be adopter of ESP is the question, "Where do ESP patterns come from?"
Just relying on what you know from coarse-grained, centralized data is not enough to illuminate what behaviors distinguish malicious software from benign.
Human insight, combined with sophisticated analytical methods and data gathering techniques, is required.
This synthesis of human-computer effort also enables CrowdStrike to leverage machine learning methods to further accelerate the pace and accuracy of IOA development.
Perhaps one day the need for human insight may be eliminated from the process of developing ESP-based IOAs.
And perhaps in that Utopian future, when my job has been completely automated, I will be able to once again find gainful employment blogging about security. Of course, some data outside of ESP is still useful to send to humans for analysis.
This data helps expert threat hunters in CrowdStrike's OverWatch group find new ways of detecting malicious behavior and malware.
As one example among many, CrowdStrike's platform proactively collects all information about inter-process activity — including data that is completely unique in the industry — and makes it all available to analysts. Using that data, OverWatch threat hunters can perform additional analysis that culminates in deploying new IOAs into the product rapidly through the cloud, automating detection of newly discovered behaviors and malware.
The number of different ways that the resulting platform can detect instances of malicious behavior is striking, and more are being added every day to the list of methods under active research at CrowdStrike. So to do ESP right you need:
- An excellent data collection platform,
- A highly optimized sensor capable of providing those events without impacting users or server performance, and
- An agile and mature IOA development pipeline that starts with world-class adversary hunters.
Combining human expertise in information security with ESP technology has long been and will remain a core component of CrowdStrike’s approach, making the company a technology leader in the endpoint protection space.
CrowdStrike pioneered the use of ESP in the Endpoint Protection Platform market, and it will be unsurprising if others try to imitate CrowdStrike's approach, given its success. Perhaps the most obvious evidence of this leadership is the simple fact that, unlike others in the industry, CrowdStrike's IOA philosophy, message, and technical approach have remained consistent over the years and been well proven by withstanding the test of time. To learn more about CrowdStrike’s pioneering work in IOA behavioral detection, please read the whitepaper: Indicators of Attack Vs. Indicators of Compromise.