One of the problems we set out to solve was to improve the signal-to-noise ratio for incident responders and security analysts. Most security teams are overwhelmed with security alerts and alarms from a multitude of point solutions. This leaves them with the daunting and often impossible task of prioritizing information and connecting the dots.

Our cloud platform makes it possible to automate this process by employing a wide range of analytic techniques, scoring their outputs, and then generating prioritized security Events from what we observe. In other words, we create and analyze thousands of Observations and roll them up into a more manageable, actionable list of security Events.

We receive a lot of questions about what we call Events and Observations. What exactly do they mean, and how do they translate to traditional security terms like alerts and alarms? In this post I am going to explain these terms and the methodology behind them. In short, Observations are the scored context that describes network traffic; scoring yields attributes such as severity and confidence. Events are collections of related Observations that together describe a security incident.
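To make that relationship concrete, here is a minimal sketch of how the two concepts might be modeled. This is illustrative Python only; the class names, fields, and scoring rule are assumptions for this post, not our actual schema.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Observation:
    """A single piece of scored context describing network traffic (hypothetical model)."""
    description: str   # e.g. "Connection to known malicious IP"
    severity: float    # how damaging the behavior would be if real (0.0 - 1.0)
    confidence: float  # how certain we are that the behavior occurred (0.0 - 1.0)

@dataclass
class Event:
    """A prioritized security Event composed of related Observations (hypothetical model)."""
    summary: str
    observations: List[Observation] = field(default_factory=list)

    @property
    def score(self) -> float:
        # One simple way to rank Events: the strongest single piece of supporting evidence.
        return max((o.severity * o.confidence for o in self.observations), default=0.0)
```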

Observations

Alarm fatigue is a serious problem for security teams. The 2015 Ponemon Institute survey "The Cost of Malware Containment" found that the average enterprise receives 16,937 malware alerts a week from its IT security products, of which only 19 percent are deemed reliable and only 4 percent are investigated. The sheer volume of alerts has created what is often referred to as car alarm syndrome: there are so many alerts that you become deaf to them. When was the last time you responded to a car alarm?

If you receive an alert that indicates a compromise, but it arrives alongside tens of thousands of similar alerts, the odds of that valid alert being discovered are very low. On the other hand, alerts provide a lot of context. Each can add a valuable piece of information to the overall puzzle, and without them you may miss detecting something significant in the first place. If each alert or alarm were high confidence on its own, this wouldn't be an issue. However, alerts typically need to be viewed in the context of the whole picture to provide meaning. You need to be able to step back and see how (and if) the thousands of Observations are connected.

The scale achieved by cloud computing gives us the ability to examine and store every aspect of a network's traffic. We break each flow into many components and then score each component to derive context. These scores are what we call Observations.
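As a rough sketch of what "scoring each component of a flow" could look like, consider the following, which continues the hypothetical model above. The component checks, thresholds, and threat-intel values are invented for illustration; the real analytics are far more varied.

```python
# Hypothetical: break a flow into components and score each one to produce Observations.
KNOWN_BAD_IPS = {"203.0.113.7"}     # placeholder threat-intel entry (TEST-NET address)
SUSPICIOUS_PORTS = {4444, 6667}     # ports commonly abused by malware, for illustration

def observations_for_flow(flow: dict) -> list:
    """flow is assumed to look like {"dst_ip": ..., "dst_port": ..., "bytes_out": ...}."""
    found = []
    if flow["dst_ip"] in KNOWN_BAD_IPS:
        found.append(Observation("Connection to known malicious IP", severity=0.9, confidence=0.8))
    if flow["dst_port"] in SUSPICIOUS_PORTS:
        found.append(Observation("Traffic to commonly abused port", severity=0.5, confidence=0.4))
    if flow["bytes_out"] > 50_000_000:
        found.append(Observation("Unusually large outbound transfer", severity=0.6, confidence=0.3))
    return found
```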

Observations can be as simple as a known malicious IP address associated with a flow or as complex as determining that a host has deviated from its typical behavior. Observations can be high volume and would typically cause alarm fatigue if an analyst had to review each one individually. Because this context is useful in finding attacks but comes in volumes too large for analysts to handle, our platform interprets the Observations in a way that produces high-confidence, low-volume security Events.
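In spirit, that roll-up step might look something like the sketch below: related Observations are grouped (here, simply by internal host) and only promoted to an Event when their combined score clears a threshold. The grouping key, combination rule, and threshold are all assumptions made for illustration, not the platform's actual logic.

```python
import math
from collections import defaultdict

PROMOTION_THRESHOLD = 0.75   # illustrative cutoff, not a real product setting

def roll_up(scored_flows: list) -> list:
    """scored_flows is a list of (internal_host, Observation) pairs; returns promoted Events."""
    by_host = defaultdict(list)
    for host, obs in scored_flows:
        by_host[host].append(obs)

    events = []
    for host, observations in by_host.items():
        # Naive combination rule: treat each Observation as independent evidence
        # and promote the group only when the combined score clears the threshold.
        combined = 1.0 - math.prod(1.0 - o.severity * o.confidence for o in observations)
        if combined >= PROMOTION_THRESHOLD:
            events.append(Event(summary=f"Suspicious activity on {host}", observations=observations))
    return events
```

With a rule like this, either a single strong Observation or several weaker ones taken together can cross the threshold, which is the behavior described above: lots of low-level context in, a short list of high-confidence Events out.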

Events

Humans can only process so much information in a given day, so the most important information should be prioritized for review, and any cumbersome, time-consuming dot connecting should be done automatically rather than by hand. Network intrusions are typically complex and involve putting many pieces together. For example, if a host is exploited and the payload sleeps for days before phoning home, it can be very laborious to identify the initial infection vector. And if the initial infection vector isn't identified, lingering questions remain about whether the host was infected from an outside source or through lateral movement within the network.

Figure 1 illustrates how Observations relate to an Event in our Visualizer. The screenshot shows both the Event and the seven Observations that make it up. In this example, an internal host was redirected to an Angler Exploit Kit page. The page served an exploit that successfully installed the Upatre downloader on the system, and the downloader then installed the Dyre banking trojan. Finally, Dyre was observed beaconing. Together, these Observations track the attack from exploit attempt through successful compromise.


Figure 1. Attack Progression Event spanning an exploit attempt through infection with a banking trojan.

With full-fidelity packet capture and cloud computing technology, it's possible to have all the context you need to detect malicious activity on your network without being overwhelmed by the data. Additionally, the ability to quickly move from a high-confidence Event to the packets that caused it greatly reduces the time and effort required to investigate and resolve security incidents.

Next blog post