Tuesday, 11 October 2011

Checklists make good better

We all like to believe that as we become better at a task, we no longer need help, and that somehow we will remember everything. This is false, and in this post I hope to allay that belief with a couple of quick results from some research I shall be publishing in the new year.

What was done was simple. We measured people as they responded to incidents, both with and without a checklist. The same individuals were measured in both conditions. They are simply normal people in a number of organisations I have consulted to. Nothing special in itself, but the results were measured as response times.

The results themselves will of course vary, as people have better and worse days. That stated, we can hypothesise that there is no statistically significant difference in the average time taken from the start of an incident or event to the determination that an event had or had not occurred. This is the value we measured, defined as the time t in minutes.

We could even state that if t was larger for those with a checklist, the effect was negative and a checklist made things worse.

So, to define our variables and hypothesis.

We measured the following variables:

·         tij             This is the jth measurement for the ith individual: the time in minutes taken to respond and determine whether an event was an incident or not.

·         tij(check)   This is the subset of readings, in minutes, where the individual i used a checklist.

·         tij(free)     This is the subset of readings, in minutes, where the individual i did not use a checklist.

Now, with these variables, we can calculate the following:

·         ti(ave)       This is the average response time in minutes for an individual i.

·         ti(check)    This is the average time for the individual to respond and determine if an event is an incident using the checklist.

·         ti(free)      This is the average time for the individual to respond and determine if an event is an incident without using the checklist.
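As a minimal sketch of how these averages could be computed in R, assume the readings sit in a long-format data frame with one row per response. The data frame, its column names and its values are hypothetical and made up for illustration; they are not the study data.

```r
# Hypothetical long-format data: one row per measured response.
# These values are invented for illustration only; they are NOT the study data.
readings <- data.frame(
  individual = c("A", "A", "A", "A", "B", "B", "B", "B"),
  checklist  = c(TRUE, FALSE, TRUE, FALSE, TRUE, TRUE, FALSE, FALSE),
  minutes    = c(12.0, 15.5, 13.0, 18.5, 10.0, 11.0, 9.5, 16.5)
)

# ti(ave): mean response time per individual over all readings
ti_ave <- tapply(readings$minutes, readings$individual, mean)

# Split the readings into the tij(check) and tij(free) subsets
with_check    <- readings[readings$checklist, ]
without_check <- readings[!readings$checklist, ]

# ti(check) and ti(free): per-individual means within each subset
ti_check <- tapply(with_check$minutes, with_check$individual, mean)
ti_free  <- tapply(without_check$minutes, without_check$individual, mean)

print(ti_ave)
print(ti_check)
print(ti_free)
```

With real data, the same three calls would give one average per responder for each condition.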

Now, the test and hypothesis are very simple.

We define Ho as the null hypothesis and Ha as the alternative hypothesis. We state our hypothesis as follows:

Ho          ti(check) =   ti(free)

Ha           ti(check) <>  ti(free)

Or, in words: the null hypothesis is that, on average, using a checklist makes no difference to how long an individual takes to respond and determine whether an event is an incident or not. The alternative hypothesis is that the use of a checklist will result in a difference in how long the responder takes. That is, the time with a checklist will be significantly different to that without a checklist.
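In symbols, the same two-sided test can be written as below. The notation is introduced here for illustration only: the two μ terms stand for the underlying mean response times with and without a checklist.

```latex
H_0:\ \mu_{\text{check}} = \mu_{\text{free}}
\qquad\text{versus}\qquad
H_a:\ \mu_{\text{check}} \neq \mu_{\text{free}}
```

Because the alternative is two-sided, a checklist that slowed responders down would count against Ho just as much as one that sped them up.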

Although each event will vary in nature and the responder will vary in ability through the day and at different points in their lives, the averages when taken over time should be the same. To ensure this, the responders used their own checklists based on the best practice as they determined and defined it.

The process to randomise whether a checklist was used was simple: a coin toss determined if the responder used the checklist or not. There are limitations to this, but we all have to work within the constraints of the world, and scientific studies on live companies with actual incidents need to be conducted in a manner that allows the organisation to function while it is being experimented on.
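For anyone wanting to repeat the exercise, a coin toss is easy to mimic in R. This is only a sketch: the seed, the number of incidents and the object names are assumptions made for the example, and the study itself used a real coin.

```r
# Sketch of coin-toss assignment; the seed and incident count are illustrative.
set.seed(2011)
n_incidents <- 10

# One fair coin toss per incident: TRUE = use the checklist, FALSE = work without it
use_checklist <- sample(c(TRUE, FALSE), size = n_incidents, replace = TRUE)

assignment <- data.frame(
  incident = seq_len(n_incidents),
  arm      = ifelse(use_checklist, "check", "free")
)

print(assignment)
print(table(assignment$arm))
```

Note that independent tosses do not guarantee equal numbers in each arm, which is one of the limitations of so simple a scheme.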

In the boxplot below we have displayed the results.

[Figure: boxplot of response times in minutes, with and without a checklist]

Just looking at the two datasets, we see that there is a difference in the standard deviations, with a larger range of values for the responses without a checklist than for those recorded when a checklist was used. If we look at the statistics in R (our statistical package), we see a mean (average) value of 14.3602 minutes for responses without the use of a checklist and 14.00188 minutes when a checklist is used.

> summary(ticheck)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  0.000   9.878  13.680  14.000  18.000  41.260

> summary(tifree)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  0.000   8.246  14.060  14.360  20.280  64.600

The mean values differ by only about 21 seconds, on a mean of around 14 minutes. This visual analysis does not give the real result. By conducting a t-test on the two datasets, we can see if a difference really exists or not. R's t.test applies the Welch variant by default, which does not assume that the two groups have equal variances. This is simple to do in R, and the results are displayed below.

> t.test(tifree, ticheck)

        Welch Two Sample t-test

data:  tifree and ticheck
t = 3.3964, df = 20638.23, p-value = 0.000684
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 0.1515310 0.5651013
sample estimates:
mean of x mean of y
 14.36020  14.00188
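For reference, the Welch statistic and its approximate degrees of freedom take the standard form below. The symbols are introduced here for illustration: the barred terms are the two sample means (tifree and ticheck in the call above), the s-squared terms are the sample variances, and the n terms are the sample sizes, which are not reported in this post.

```latex
t = \frac{\bar{x} - \bar{y}}{\sqrt{\dfrac{s_x^2}{n_x} + \dfrac{s_y^2}{n_y}}},
\qquad
\nu \approx \frac{\left(\dfrac{s_x^2}{n_x} + \dfrac{s_y^2}{n_y}\right)^2}
{\dfrac{\left(s_x^2/n_x\right)^2}{n_x - 1} + \dfrac{\left(s_y^2/n_y\right)^2}{n_y - 1}}
```

Because the degrees of freedom are estimated from the variances rather than counted, they need not be a whole number, which is why R reports df = 20638.23.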

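What this all means is that the p-value of 0.000684 is well below the alpha = 5% level, so we reject Ho and are confident that there is a statistically significant difference in the means. The 95% confidence interval puts that difference at between roughly 0.15 and 0.57 minutes.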
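The figures in the t.test output can be checked with a little arithmetic. The sketch below only re-uses the numbers printed above; the variable names are mine, and the standard error is backed out from the reported t value rather than taken from the raw data.

```r
# Figures copied from the t.test output above
mean_free  <- 14.36020
mean_check <- 14.00188
t_stat     <- 3.3964
ci         <- c(0.1515310, 0.5651013)  # 95% CI for the difference, in minutes
p_value    <- 0.000684
alpha      <- 0.05

diff_min <- mean_free - mean_check     # difference in means, in minutes
diff_sec <- diff_min * 60              # the same difference in seconds
ci_sec   <- ci * 60                    # confidence interval in seconds
std_err  <- diff_min / t_stat          # standard error implied by t = diff / SE

reject_null   <- p_value < alpha       # decision at the 5% level
ci_excludes_0 <- ci[1] > 0 || ci[2] < 0

cat(sprintf("difference: %.3f min (%.1f s)\n", diff_min, diff_sec))
cat(sprintf("95%% CI: %.1f s to %.1f s\n", ci_sec[1], ci_sec[2]))
cat(sprintf("implied standard error: %.4f min\n", std_err))
cat("reject Ho at 5%:", reject_null, "| CI excludes 0:", ci_excludes_0, "\n")
```

The difference works out at about 21.5 seconds, with a confidence interval of roughly 9 to 34 seconds. That interval does not contain zero, which is the same conclusion as the small p-value.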

Although there is little difference in the mean value, there are some outliers where something has gone wrong. We see from the boxplot that there are occasions without a checklist where errors do occur and these cost time.
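The size of those outliers can be gauged from the summary() output alone, using the conventional 1.5 x IQR rule for boxplot whiskers. This is a rough sketch: the column names are mine, and R's boxplot works from hinges rather than the quartiles shown by summary(), so the fences it draws may differ slightly.

```r
# Quartiles and maxima copied from the summary() output above (minutes)
q <- data.frame(
  group = c("check", "free"),
  q1    = c(9.878, 8.246),
  q3    = c(18.000, 20.280),
  max   = c(41.260, 64.600)
)

q$iqr         <- q$q3 - q$q1             # interquartile range
q$upper_fence <- q$q3 + 1.5 * q$iqr      # conventional 1.5 * IQR rule
q$max_beyond  <- q$max > q$upper_fence   # is the slowest response beyond the fence?
q$excess      <- q$max - q$upper_fence   # how far beyond, in minutes

print(q)
```

Both groups have a slowest response beyond the upper fence, but the worst case without a checklist sits about 26 minutes past it, against about 11 minutes with one.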

In responding to an incident, having a checklist helps even experienced professional incident responders. You may have worked many years and know your work inside out, but there are always things that you can overlook in a rush and in the moment.

So, the moral here is simple: create a checklist. A good incident responder is not afraid to use a checklist and follow a process.

Science does not mean we have to have huge budgets and it also does not need to be difficult. Simple experiments can be delivered by most people. In the DIT (Doctor of IT) program we have launched at CSU, we hope to have many experiments and in time, turn computer science back into a science from the art form it has become.

In the full paper, we will also be looking at accuracy and other measures, but this will have to wait until the paper is released.

3 comments:

Dr Craig S Wright GSE said...

It is also important to not be limited by the checklist. It is an aid that allows us to ensure we have done the minimum, not a limit that restrains us.

Kenneth G. Hartman, CISSP, CPHIMS, GSEC said...

Nice! This is valuable research for the security community. I would also add that the more routine the tasks, the more the checklist is needed. Have you ever flown so much, that you forget where you parked your car? I have a laminated checklist that I use to pack my gym bag, because I hate it when I show up at the health club without my running shoes or towel. We have all re-built servers, but in an incident response situation, it would be a shame to omit a critical security configuration because in the stress of the situation one lost track of which step they were on while being interrupted and multitasking.

I heard a great speech by Sully Sullenberger at HIMSS advocating increased usage of checklists in healthcare. (http://geovoices.geonetric.com/2010/03/a-pilots-thoughts-on-patient-safety/) In the speech he makes the point that pilots cannot believe checklists aren’t standard practice in medicine – “you can’t do anything that complicated without a checklist.” No one is infallible. However, by using checklists, you can at least be sure that your crises happen over the hard things, and not the easy things. This is also very applicable to incident handling.

Thanks again. I will be following your research.

Dr Craig S Wright GSE said...

Hello Kenneth,
A part of the abstract in the academic paper to be submitted this weekend has been inspired by your comment.

Thank you.