Security professionals no longer wonder if their organization will be breached, but when. Cyber criminals now routinely use sophisticated, multi-stage attack techniques that get past traditional defenses, which rely on rules and signatures for detection and prevention. So, in addition to dealing with the fallout from a data breach, security teams must also figure out how it happened in the first place and what else may have been impacted.

The clues needed to detect these sophisticated attacks are buried in vast troves of network traffic that, for a variety of reasons (deployment cost and complexity, enterprise activity in the cloud, and so on), most organizations are not able to collect. Even if they could, security teams would face the unenviable and time-consuming task of manually working out how the attack happened and what security measures are needed to prevent future occurrences.

Numerous in-market solutions attempt to automate this task by leveraging advanced analytics techniques that can surface the entire attack chain. Where they fall short is in the timeliness of threat detection. That’s because most solutions are built on big data technologies that have to analyze data in large batches in order to deliver on the efficiency promise of big data. Unfortunately, that introduces detection latencies, sometimes on the order of hours.

Though security professionals may have to resign themselves to the fact that their cyber defenses will be breached, there is no reason for them to suffer further because of delayed detection. Consider ransomware, a bane for healthcare organizations (though others are not immune). Every minute that such an attack goes undetected gives the malware more opportunity to spread laterally within the organization and encrypt more data that can be held for ransom.

Stream processing can help, delivering real-time detection of advanced attacks and an accompanying mitigation of the damage caused. A well-designed stream processing-based analytics solution is just as capable of producing correct, consistent and repeatable results as a batch (i.e., big data-based) solution. Plus, a streaming approach is designed with infinite data sets in mind. That’s important given that enterprise network traffic, already very large, is anticipated to grow significantly in the next three years.

Of course, building a stream processing-based analytics solution is easier said than done. Without compromising on detection accuracy, such a solution must, at a minimum, be able to:

  1. Analyze live data before it is stored (big data-based systems, by contrast, must store the data first).

  2. Integrate and stream historical data with live data for complete context and to continually reassess the presence of threats.

  3. Process large volumes of high-velocity data with minimal latency.

  4. Handle imperfections in the data stream (i.e., delayed, missing or out-of-order data).

  5. Apply an ever-growing variety of analytics in parallel to find any relationship between disparate security events and surface an attack.

  6. Automatically distribute processing across multiple processors and machines, scaling up and down as needed.

While the requirements are stringent, the end results are well worth it: an optimized analytics engine with minimal overhead that delivers real-time threat detection using high-volume network traffic data.
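
To make the fourth requirement (delayed or out-of-order data) a little more concrete, here is a minimal Python sketch of one common streaming technique: event-time tumbling windows with a watermark. It is an illustration under stated assumptions, not a description of any particular product. The class name `TumblingWindowCounter`, its parameters, and the idea of counting events per source host are all hypothetical choices made for the example.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class TumblingWindowCounter:
    """Counts events per (window, key) using event time, not arrival time.

    Hypothetical example class. Timestamps are non-negative integer seconds.
    """
    window_size: int        # length of each window, in seconds
    allowed_lateness: int   # how far behind the newest event data may arrive
    max_event_time: int = 0
    counts: dict = field(default_factory=lambda: defaultdict(int))

    def process(self, key, event_time):
        """Ingest one event; return (window_start, key, count) for windows that just closed."""
        watermark = self.max_event_time - self.allowed_lateness
        window_start = event_time - event_time % self.window_size
        if window_start + self.window_size <= watermark:
            # The event's window has already been emitted, so it is too late to count.
            return []
        self.counts[(window_start, key)] += 1
        self.max_event_time = max(self.max_event_time, event_time)
        return self._close_windows()

    def _close_windows(self):
        # A window is closed once the watermark has passed its end.
        watermark = self.max_event_time - self.allowed_lateness
        closed = sorted(k for k in self.counts if k[0] + self.window_size <= watermark)
        return [(start, key, self.counts.pop((start, key))) for start, key in closed]


counter = TumblingWindowCounter(window_size=60, allowed_lateness=30)
for ts in (5, 50, 70, 40, 95):  # 40 arrives after 70: out of order but still accepted
    for window_start, host, count in counter.process("10.0.0.5", ts):
        print(f"window starting at {window_start}s: {count} events from {host}")
```

The trade-off sits in `allowed_lateness`: a larger value tolerates more disorder in the stream but delays results by the same amount, which is why it is typically kept to seconds rather than the hours a batch cycle can take.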

Multi-stage attacks infiltrate enterprise networks through a variety of different channels, gestate over long time periods, and use creative ways to exfiltrate data. To be effective against such attacks, a stream processing-based analytics solution must be able to holistically evaluate the state of the enterprise, both in the present and the past. Creating a single, unified haystack of any enterprise network activity going back as far as necessary will enable the analytics engine to reason about the enterprise as a coherent whole.
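
As a rough sketch of what evaluating "the present and the past" together can look like, the Python snippet below replays stored history through the same logic that handles the live stream, and only raises findings for live events. The function name `new_destinations`, the `(host, destination)` event shape, and the "first contact with a destination" signal are assumptions chosen for illustration; a real engine would apply far richer analytics.

```python
from itertools import chain


def new_destinations(history, live):
    """Yield live (host, dest) pairs that the host has never contacted before.

    `history` and `live` are iterables of (host, dest) tuples; `live` may be
    an unbounded generator. Hypothetical helper for illustration only.
    """
    seen = set()
    tagged = chain(
        (("history", event) for event in history),
        (("live", event) for event in live),
    )
    for source, (host, dest) in tagged:
        if (host, dest) not in seen:
            seen.add((host, dest))
            # Historical events only build context; live ones produce findings.
            if source == "live":
                yield host, dest


history = [("laptop-1", "mail.example.com"), ("laptop-2", "crm.example.com")]
live = [("laptop-1", "mail.example.com"), ("laptop-1", "unknown.example.net")]
for host, dest in new_destinations(history, live):
    print(f"{host} contacted {dest} for the first time")
```

Because the function is a generator, it processes each live event as it arrives instead of waiting for a batch to accumulate.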

The cloud is a natural fit for a stream processing-based analytics solution. It provides the scalable compute that’s needed to perform multi-dimensional analytics on large volumes of live data. And the cloud also provides the scalable storage that can easily support the single unified haystack, however large, needed for the analytics engine to consider both the present and the past.

The purist may argue that "real time" detection of unknown attacks isn’t possible. After all, the breadth of data analysis needed to surface a multi-stage attack requires some amount of time. However, with a streaming solution, that delay exists only because the analytics are bounded by network latency. For security teams, in practice, near-real-time detection is more than good enough and allows them to quickly mitigate the impact of advanced attacks. The difference is detection of advanced modern threats within seconds, compared with minutes, hours, or sometimes even days with traditional batch systems.

So, as you evaluate solutions for detecting advanced threats, consider this question: if there is an analytics solution that can provide near real-time threat detection, why would you settle for anything else?
