Tuesday, 1 December 2009

Type I error and monitoring intrusions

The following post contains a few notes from an experiment on incident analysis and the rate of error in analysts' determinations.

An experiment was conducted in which known incidents were replayed. Existing PCAP capture traces from client sites, containing known attack and incident patterns, were loaded into an analysis system for evaluation. The OSSIM and BASE frontends to Snort were deployed for this exercise.
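As an aside, the "as if live" replay can be reproduced in a few lines of Python. The sketch below is not the tcpdump-based process used in the experiment; it is a minimal Scapy equivalent that pushes a capture onto an interface while preserving the original inter-packet gaps, so the sensor treats it as live traffic. The file name and interface are placeholders.

```python
import time
from scapy.all import rdpcap, sendp

def replay(pcap_path="incident.pcap", iface="eth0"):
    """Replay a capture onto an interface, keeping the original
    inter-packet timing so the sensor sees it as live traffic.
    pcap_path and iface are placeholders, not the experiment's values."""
    pkts = rdpcap(pcap_path)
    prev_time = None
    for pkt in pkts:
        if prev_time is not None:
            # sleep for the gap recorded between this packet and the last
            time.sleep(max(float(pkt.time - prev_time), 0))
        sendp(pkt, iface=iface, verbose=False)
        prev_time = pkt.time
```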

SQL scripts were altered to introduce a random lag into the responses, and tcpdump was used to replay the PCAP trace as if it were occurring 'live'. The analyst had to decide whether each incident was worth escalating or should be noted and bypassed. The results of this process are reported below as a plot of Type I errors.
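For illustration only, the effect of the altered SQL scripts can be approximated with a wrapper that holds each query result back for a random interval before handing it to the console. This is a sketch of the idea, not the scripts used in the experiment; the lag bounds and function name are assumptions.

```python
import random
import time

def delayed_query(cursor, sql, min_lag=0.5, max_lag=8.0):
    """Execute a query, then hold the result for a random interval to
    simulate a slow analysis console. Lag bounds are illustrative only."""
    lag = random.uniform(min_lag, max_lag)
    cursor.execute(sql)
    rows = cursor.fetchall()
    time.sleep(lag)   # artificial delay before the result reaches the analyst
    return lag, rows
```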
It is easy to see that as the response time of the system increases, so does the analyst's error rate: the lag in returning information to the analyst has a direct causal effect. The longer the lag between requesting a page and the page being returned, the greater the error rate in classifying events.
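A sketch of how such an error rate can be tabulated is shown below. It assumes one record per analyst decision with the ground truth known from the replayed trace, and it treats "not an incident" as the null hypothesis, so a Type I error is a false escalation of a benign event. The column names and bin width are assumptions for illustration.

```python
import pandas as pd

def type1_by_lag(df: pd.DataFrame, width: float = 1.0) -> pd.Series:
    """Type I error rate (escalations among benign events), grouped into
    response-time bins of `width` seconds.

    Expected columns (assumed names):
      lag       - page response time in seconds
      escalated - True if the analyst escalated the event
      incident  - True if the event was a genuine incident
    """
    benign = df[~df["incident"]]              # only non-incidents can yield Type I errors
    bins = (benign["lag"] // width) * width   # left edge of each lag bin
    return benign.groupby(bins)["escalated"].mean()
```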

To this we can add a Loess-calculated plot of the expected error against time.
In this plot we can clearly see that the slope increases sharply after around 4 seconds. As such, it is critical to ensure that responses to queries are returned in under 3 to 4 seconds.
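A Loess (locally weighted) fit of this kind can be reproduced with statsmodels. The sketch below uses placeholder values only, not the experiment's data; it simply shows how the observed error rates per lag bin would be smoothed and plotted against response time.

```python
import numpy as np
import matplotlib.pyplot as plt
from statsmodels.nonparametric.smoothers_lowess import lowess

# Placeholder values for illustration only -- not the experiment's data.
lag = np.array([0.5, 1, 2, 3, 4, 5, 6, 7, 8])          # response time (s)
err = np.array([0.02, 0.03, 0.04, 0.05, 0.08, 0.14, 0.22, 0.31, 0.40])

smooth = lowess(err, lag, frac=0.6)   # returns (x, fitted y) pairs sorted by x

plt.scatter(lag, err, label="observed error rate")
plt.plot(smooth[:, 0], smooth[:, 1], label="Loess fit")
plt.axvline(4, linestyle="--", label="~4 s knee")
plt.xlabel("response time (s)")
plt.ylabel("Type I error rate")
plt.legend()
plt.show()
```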
