Saturday, 22 October 2011


There will be materials up this year, but the main seeds will start in the new year. What we are going to do is to have the basics covered. Free, instructor led and more.

I will be seeding what I hope to grow into a cloud based training platform in the new year. The idea is to have a pathway and materials to train MANY people. I will be donating all I have written as a seed and more.

I will have at least 12 complete courses a year. With the help of others, more. These will have audio, video and tutorials. In time, I want to have a platform where a junior analyst can ask how and get an answer, step by step with video on a tablet.

I will be requiring that those who do the training give back. They will create videos and write tutorials. These will be used as materials with the best being selected for ongoing training. The basic training will lead into research. I will detail this later. I am working to make this a lead into a doctoral program.

These courses are:

  • Pen Testing (multiple levels)
  • Digital Forensics (multiple levels)
  • CISSP primer (general security)
  • IT Law
  • Risk Management
  • Reverse Engineering (in development and a Beta is planned to start in Nov)

I am also working on the following (all of these will be written over the coming year):

  • Secure C coding (beginner and advanced)
  • Coding algorithms
  • Secure C++ coding (beginner and advanced)
  • Secure C# coding (beginner and advanced)
  • Secure Java coding (beginner and advanced)
  • Python for Ethical Hackers
  • Exploit development
  • Malware Engineering
  • LAMP Security
  • Windows Security
  • Cloud Security

There will also be several short courses (1 to 3 days) on:

  • IPv6 Security
  • Mobile forensics
  • Linux Security
  • Windows Security Issues
  • MySQL security
  • Database auditing
  • Systems Audit
  • SCADA security
  • Risk
I will be starting with simple materials and low-level introductory courses and working up to advanced ones. Right now, this is about training as many people as fast as possible.

This WILL grow. In time, I want to have a community model where we all work to add more.

I shall also be adding web-based certification guides in the coming year. PLEASE let me know what is wanted first. I have managed to collect a large volume of certifications, but I do not know what most people need. So, you tell me, I will create.

This is the start, more will follow soon. In the coming week, I will be starting a forum for this and we will create a more secure world through better education and training.

              More Assembly

              To follow, the links below are all good reference sites for a person wanting to learn Assembly language coding.

              First, I have a link to an online text book. This follows from Paul Carter’s PCASM page on his computer science site.

              There are also some good Assembly Language tutorials at Layer Networks.

              Learning Assembly

              Assembly code is still a useful skill. If you want to be able to develop exploit code or analyse malware at a deep level, then you really need to add this to your list of skills.

              One of the best sites I know of for MASM and NASM tutorials is

              These are simple, step by step lessons that allow the beginner to progress to an advanced stage.

              More on proof.

              Following my post from the other day, I am going to make it simple for people to check and validate my credentials. Charles Sturt University is easy enough as there are links from the University stating what I have done (and some old ACS ones). See the prior post for these.

              My problem.

              I have an issue with being called a liar. I should not react as I do, but I do. It is something I am working on and fail at. I do understand that extraordinary claims require extraordinary proof, but I have never been asked to prove anything first. I am accused and rarely even know what of. I am told that I am lying and at times, those I consider friends do not support me.

That is my flaw and failing. I am not humble; I am passionate about what I do. I want a world where security matters and if that means I have to fight people who call me a liar or more, I will leave my head on the block.

              I am proud.

              I love my achievements more than I should.

As a sometime lay pastor I know that this is a flaw and I should not be so. I know of no way to achieve what I see in the coming years without promotion. I will do whatever it takes, at any cost to myself, to achieve the goal I have set.

              My private life

              I remain a Uniting Church (NSW) Trust Association Member until 2013. It was resolved by the Synod Standing committee on 27/28 August 2010.

I have been lax in attending church. I used to lead what we called “Cafe’ Church” from time to time, but I have not led a service this year. I am in the process of a divorce following 15 years of marriage.

              My life has been consumed with my work. Not simply study, but applied roles. I have worked full time as I have studied full time my entire life. In coming months, like this post, I will document everything piece by piece.

              I am a man who loves children, but I have none. I have worked and studied my entire life and this was the sacrifice I had to make. There is always a cost and mine is one where I have not had and likely never will have the opportunity to hold my own child. Yet I know what I seek to achieve and I shall make sure I do it.

              In my studies and research I have spent a large 7-figure value on doing what I set out to measure and test. I do not expect anyone to believe this statement. Consequently, I shall in coming days and weeks also document this.

              I will document the costs and expenditures associated with my research. These have exceeded any grants or income I earned in the last decade and I have gone from a large 7 bedroom house and a farm to a small flat that I lease. In this, I did manage to pay for the experiments I needed to complete and this has allowed me to start to create risk models (there are several peer reviewed papers being published that I will link in coming weeks).

I am not sad for having done this; I value the knowledge I have gained. Again, my flaw: I am proud of it.

              I am also proud of my students and their achievements. 

              I will not discuss my family. I do not have the right to. They value their privacy and although mine is a figment that seems to have been long lost, I will retain theirs.


In the coming year, I shall be releasing training material and courses. These will be online, cloud based and free. I will be offering a good amount of my time to this end freely.

              If people think that calling me a liar or worse will shut me up and stop this, they are wrong.

              To proof.

              In the last post I noted my student number and degrees from Newcastle and Northumbria. Here, I have scanned copies of the degrees. So a little more to aid in validating/checking.

If for once in my life I was actually told what I was supposed to be defending, I would have an easier time, but I guess that is the point. Why make it easy for somebody to rebut an accusation?

              I admit, I have done many things and it does go to the extremes of believability just on my academic qualifications. So, I see I cannot expect to excel and also have people believe me. It would be nice, but I do not expect it anymore.

I never actually get told what it is I am supposed to have claimed that I have not actually done, so I will make it simple. In the coming months I will document my life.

              The good, the bad and the crazy.

              I will not link to new diagrams and information. I will link things/documents that are more than seven (7) years old. People keep saying I have not done things and some have denied projects that I have been a part of. Fine. I will simply offer proof until there is nothing left to be doubted.

              LLM Northumbria - Master of Law (International Commerce Law, Ecommerce Law with commendation).


              LLM Transcript

MSTAT - Master of Statistics

              Work and some of the systems I was involved with.

              First NSW Police.



              Next, RAC (Rail Access)


              Rail Services Australia (RSA)



              Energy Australia



              More to come later…

              A preamble into aligning Systems engineering and Information security risk

              The following details the start of a risk quantification process that will be expounded in more detail in a series of peer reviewed papers over the coming year.

              The results come from a few years of experiments such as those already noted and many more. Full details and methodologies of these experiments will be provided soon.

This paper will be loaded into the SANS Reading Room shortly, where the equations will be far more visible than on this blog.

Right now, many of the methodologies are computationally expensive, even on a small scale; however, with the exponential growth of system power, each generation of CPU makes this process and the calculations less expensive and more likely to be deployed effectively.

In probability studies, we do not calculate which particular host will fail, but rather look at the system population and obtain an overview of the risk as an economic function. Using the functions in section 4.4 we can model the censored expected survival time of hosts between audit and review processes.

Using a stochastic model, we can also calculate approximate answers to these variables and, as we add more computation, gain a more accurate estimate of the risk on a series of systems.



For many years, information security and risk management have been an art rather than a science. This has resulted in a reliance on experts whose methodologies and results can vary widely, and has led to the growth of fear, uncertainty and doubt within the community. At the same time, the failure to expend resources effectively in securing systems has created a misalignment of controls and a waste of scarce resources with alternative uses. This paper aims to introduce a number of models and methods that are common in many other areas of systems engineering, but which are only just starting to be used in the determination of information systems risk. This paper introduces the idea of using neural networks of hazard data to reliably model and train risk systems.


              1. Introduction

This paper presents the major statistical methods used in risk measurement and audit, and extends into other processes that are used within systems engineering (Elliott, Jeanblanc, & Yor, 2000). Security risk assessment is fundamental to the security of any organization (Grandell, 1991). It is essential in ensuring that controls and expenditure are fully commensurate with the risks to which the organization is exposed. The paper starts by defining risk and its terms; next it explores a few of the methods used.

The equations presented in this paper can be used by organizations to quantify the relative risk of various solutions and systems, and hence to assign risk strategies using the historical data of the organization as well as that of third parties, providing a means to optimize audits and system reviews in a manner that detects an incident in the most economical fashion. Projects are all risk-consequential endeavors, and if our profession can better manage and calculate risk, society will be at an advantage.

              The paper defines processes as being the methods that are utilized in order to achieve a set of desired objectives. What is needed is to know just how these processes are implemented within an organization. An objective on the other hand is a goal or something that people desire to have accomplished. It is important to ask just who sets these objectives and how they are designed if risk management solutions are to be achieved effectively and economically.

              Controls are the mechanisms through which an individual or group’s goals are achieved. Controls are useless if they are not effective. As such, it is important to ensure that any control that is implemented is both effective as well as being justifiable in economic terms. Controls are the countermeasures for vulnerabilities but they need to be economically viable to be effective. There are four types:

              1. Deterrent controls reduce the likelihood of a deliberate attack

              2. Preventative controls protect vulnerabilities and make an attack unsuccessful or reduce its impact

              3. Corrective controls reduce the effect of an attack

              4. Detective controls discover potential (attempted) or successful attacks and trigger preventative or corrective controls.

1.1. Identifying and classifying risk

              A risk analysis is a process that consists of numerous stages. Some of these are defined below:

              · Threat analysis,

· Vulnerability analysis,

              · Business impact analysis,

· Likelihood analysis (the probability of an event).

The risk analysis process should allow the organization to determine its risk based on threats and vulnerabilities. From this point, the auditor will be able to classify the severity of each risk and thus assign it an overall importance. It should be feasible to use this information to create a risk management plan (Wright, 2008). This should consist of:

· Preparing a risk treatment plan using a variety of control methods.

· Analyzing individual risks based on the impact of the threats and vulnerabilities that have been identified.

· Rating the individual risks from highest to lowest importance.

· Creating a risk treatment plan that categorizes each of the threats and vulnerabilities in order of its priority to the organization, together with some possible controls.

              1.1.1. Monte Carlo method

              A number of stochastic techniques have been developed to aid in the risk management process. These are based on complex mathematical models that use stochastically generated random values to compute likelihood and other ratios for our analysis model (Corcuera, Imkeller, Kohatsu-Higa, & Nualart, 2004).

Monte Carlo methods can aid in other risk methodologies such as time-based analysis (Curtis et al., 2001). There is a good introduction to Monte Carlo methods available at (Woller, 1996). This technique further allows for the determination of the range of possible outcomes and delivers a normalized distribution of probabilities for likelihood. Combining stochastic techniques with Bayesian probability and complex time series analysis techniques such as heteroscedastic mapping is mathematically complex, but can aid in situations where accuracy is crucial (Dellacherie & Meyer, 1982).

              These methods are truly quantitative. They help predict any realistic detection, response and thus exposure time in a manner that can be differentiated by the type or class of attack. These types of statistical methods are known to have a downside in that they are more expensive than the other methods. The level of knowledge needed to conduct a true quantitative type of analysis is not readily available and the level of knowledge of the organization needed by the analyst often excludes using an external consultant in all but the smallest of risk analysis engagements.
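As a sketch of the idea, the snippet below uses a Monte Carlo simulation to estimate the probability that a system survives beyond a time t, assuming (as in section 3 below) an exponentially distributed time to compromise. The 30-day mean time to compromise is an illustrative value, not one drawn from the experiments described here.

```python
import math
import random

def mc_survival(theta, t, trials=100_000, seed=42):
    """Estimate P(survival past t) for a system whose time to compromise
    is exponential with mean `theta`, by stochastic simulation."""
    rng = random.Random(seed)
    survived = sum(rng.expovariate(1.0 / theta) > t for _ in range(trials))
    return survived / trials

theta, t = 30.0, 10.0          # illustrative: 30-day mean, 10-day window
estimate = mc_survival(theta, t)
exact = math.exp(-t / theta)   # closed form for comparison
print(f"simulated {estimate:.3f} vs exact {exact:.3f}")
```

As more trials are added, the estimate converges on the closed-form value, which is the point made above about additional computation buying a more accurate estimate.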

              2. System Survival

When assessing network reliability, it is necessary to model the various access paths and survival times, not only for each system, but for each path to the system. This requires the calculation of the following quantitative fields:

· R(t) - the reliability function

· MTBF - Mean Time Between Failures

· MTTR - Mean Time to Repair/Recover

· λ - the expected survival/failure rate (Therneau et al., 1994)

Other measures will be introduced later. The expected survival or failure rate λ is used throughout this paper and is detailed further in EQ 3.6 and EQ 3.7. Where possible, the standard systems reliability engineering terms have been used. In the case of a measure such as the MTTR, this represents the time both to discover and to recover a compromised system. The true value estimate for the system comes as a measure of the applications on the system; a less economically expensive (though less accurate) estimate may be used instead. In this calculation, the compromise measure, MTBF, is best thought of as the mean time to the first failure.

This can be modelled with redundancy in the design. Here, each redundant system is a parallel addition to the model. Where an attacker is required to pass one system to reach another, a serial measure is added. For instance, if an attacker has to:

              · bypass system A (the firewall) to

              · compromise system B (an authentication server) which allows

              · an attack against a number of DMZ servers (C, D and E) where

              · systems C and D are connected to the database through

              · a secondary firewall (system F) to (Not in figure 2.1)

              · the database server G (as displayed in figure 2.1).


              Figure 2.1 Attacking a series of systems

The attacker can either attack the system directly through the VPN or follow the attack path indicated by the systems. If the firewall system A restricts the attacker to a number of IP addresses, the attacker may do one of a number of things in attacking this system (in order to gain access as if the attacker were one of these IPs):

              1. Compromise the input host

2. Spoof an address from within the allowed input IP address range (such as through a router compromise at an ISP or other system)

              3. Compromise the VPN

Other options, such as spoofing an address without acting as a MITM (Man In The Middle), will leave some attacks possible that cannot result in a compromise of system G. These could have an economic impact that would be calculated separately. One such calculable event would be a DDoS (Distributed Denial of Service) attack on the server.

              Hence, the effective attack paths are:

              · Input, A, B, C, F, G

· Input, A, B, D, F, G

              · Input, A, B, E, C, F, G

              · Input, A, B, E, D, F, G

              · VPN, G

In this instance, it is necessary to calculate conditional probabilities, as these paths are not independent. Here, the options to consider first include (the paper will use the term I to define an attack on the Input system and S to refer to a spoofed attack of the input system):

· The conditional probability (Annis, 2010) of compromising system A given a successful spoof attack on the Input system, P(A1 | S) (where A1 refers to an attack on system A using path No. 1, or Input, A, B, C, F, G)

              · The conditional probability of attacking system A

· The probability of attacking system G, P(G)

Each of the attack paths can then be treated as independent. Hence, the overall probability of an attack is a sum of the conditional probabilities from each attack path. As a consequence, the attacker will most likely come over the lowest-cost path, but the probabilistic result allows for an attack from any path. The high and low probability attack measures are jointly incorporated in the process.
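The path enumeration above can be sketched in code. The per-system probabilities here are purely illustrative placeholders, and the combination uses the complement-product form for independent paths, which for small probabilities approximates the simple sum described above:

```python
# Illustrative per-system compromise probabilities over the window of interest.
p = {"Input": 0.20, "A": 0.10, "B": 0.15, "C": 0.30, "D": 0.30,
     "E": 0.25, "F": 0.10, "G": 0.20, "VPN": 0.05}

paths = [
    ["Input", "A", "B", "C", "F", "G"],
    ["Input", "A", "B", "D", "F", "G"],
    ["Input", "A", "B", "E", "C", "F", "G"],
    ["Input", "A", "B", "E", "D", "F", "G"],
    ["VPN", "G"],
]

def path_probability(path):
    """A serial path succeeds only if every system along it is compromised."""
    prob = 1.0
    for system in path:
        prob *= p[system]
    return prob

# Probability that at least one attack path succeeds.
q_all_fail = 1.0
for route in paths:
    q_all_fail *= 1.0 - path_probability(route)
overall = 1.0 - q_all_fail
```

Note that the direct VPN path needs only two compromises, so it dominates unless the VPN is well hardened, matching the lowest-cost-path observation above.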

Presuming no other paths (such as internal attacks, etc.), it is feasible to model the alternate probability as not possible (or at least not feasible). Here, P = 0 for any other path. Additionally, the probability of an attack over path 5 (the VPN) can be readily calculated without further input as:



              clip_image018 (Blanchard & Fabrycky, 2006) and

              clip_image020 EQ 2.0

              Here there exists a single system behind the VPN. Where more than one system exists, it is necessary to calculate the joint probability as is detailed below. In the example, with only a single system:

              clip_image022 EQ 2.1

Equation 2.1 holds as the probability of the attacker compromising system G when the VPN has been compromised approaches 1. This is because the attacker has a single target behind the VPN; the utility of attacking the VPN and nothing more is negated, as no other systems exist and the VPN offers no other utility for the attacker on its own.

The values θ_VPN and θ_G are the expected survival times (the mean time to compromise) for the VPN and database respectively as configured, and t is the amount of time that has passed since install, representing the current survival time of the system (Jeanblanc & Valchev, 2005).


              clip_image028 EQ 2.2

On the other hand, the probability of compromise via system I is based on the number of systems: as n → ∞, P(I) → 1. Basically, as more systems are allowed to connect to system A, the closer the probability of compromise tends towards a value of P = 1. That is, as the systems available to be compromised increase, the probability of compromise approaches certainty. This is generalised as each addition to the system adds a positive probability of compromise when added to an existing system (Blanchard & Fabrycky, 2006). This occurs as no system can be shown to have a probability of compromise P = 0. Hence, for each additional component added into the system, the chance of compromise approaches P = 1 (where it is finally reached at n = ∞).
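A quick numeric check of this limit, assuming n independent systems that each share the same (illustrative) single-system compromise probability:

```python
def compromise_probability(p_single, n):
    """P(at least one of n independent, similar systems is compromised)."""
    return 1.0 - (1.0 - p_single) ** n

# As n grows, the probability tends towards P = 1.
for n in (1, 10, 100, 1000):
    print(n, round(compromise_probability(0.05, n), 6))
```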

Where there are only a limited number of systems, the probability can be computed as a sum over the systems. Where there are a large number of systems with equivalent (or at least similar) properties, these can be grouped and calculated together. If, in the above example, system E is replaced with a series of systems (each with the same configuration), it is possible to calculate the probability of a compromise of one of the "E" systems as follows:

              clip_image032 EQ 2.3

Here, P(E) is a multiplicative and not an additive function. As such, if system "E" is defined as a DNS server with a single BIND service and SSH for management of the host, an attacker has two means of compromising the system:

              · Attack SSH

              · Attack BIND

The probability can be considered as independent in this case if there are no restrictions. In the example, DNS is an open service, that is, P(I) = 1. The SSH service may or may not be open and could be restricted. If this is the case, 0 < P(I) < 1. In the simple case where no restrictions have been imposed on SSH, the probability can be calculated using the standard formula for independent events. This is: P(E) = 1 − (1 − P(SSH)) × (1 − P(BIND)) EQ 2.4
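Under the no-restriction assumption, the independent-events combination of the two services can be computed directly. The per-service probabilities below are hypothetical values chosen only for illustration:

```python
def host_compromise(service_probs):
    """Independent-events combination: the host falls if at least one
    exposed service is compromised."""
    q = 1.0
    for p_service in service_probs:
        q *= 1.0 - p_service
    return 1.0 - q

p_bind, p_ssh = 0.20, 0.10     # hypothetical per-service probabilities
p_e = host_compromise([p_bind, p_ssh])
print(round(p_e, 4))           # → 0.28
```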

The complication comes where one of the services has been restricted as a further control. This is a combination of the probability of compromising the restrictions on the service (that is, spoofing or otherwise bypassing IP address controls) and the compromise of the service itself. This can be represented by:


In this case, there exists a probability P(I) where the allowed source systems (I) are limited to a total of "n" IP addresses (or keys). The probability P(I_k) of any source system k being compromised will vary, but may be estimated based on the type and location of each system. As more systems are added into the equation, the polynomial equation becomes more complex. In the event that similar systems are also accessing this, these can be calculated and the equation simplified.

For example, if two (2) classes of systems exist (Linux and Windows Vista) that comprise the set of source systems I, for a total of 4 systems (2x Windows and 2x Linux), these can be defined using:


In the case where the Linux systems and the Windows systems have differing compromise probabilities due to their system configurations and patch status, it is possible to calculate P(I):

              clip_image050 EQ 2.5

              In this case, the probability of a compromise due to SSH would become:

              clip_image052 EQ 2.6
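Putting EQ 2.5 and EQ 2.6 together, the restricted SSH path can be sketched as follows. The per-class probabilities are hypothetical, chosen only to show the shape of the calculation:

```python
# Hypothetical compromise probabilities for the allowed source hosts.
p_linux, p_windows = 0.10, 0.30
allowed_sources = [p_linux, p_linux, p_windows, p_windows]  # 2x Linux, 2x Windows

# P(I): at least one allowed source host is compromised (cf. EQ 2.5).
q = 1.0
for p_src in allowed_sources:
    q *= 1.0 - p_src
p_i = 1.0 - q

# The restricted SSH service falls only if a permitted source is
# compromised AND the SSH service itself is compromised (cf. EQ 2.6).
p_ssh = 0.10
p_ssh_restricted = p_i * p_ssh
```

The restriction lowers the SSH contribution from p_ssh to p_i × p_ssh, which is the sense in which IP address controls act as a further control.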

              With the details from the example at 2.2, it is possible to calculate the survival function for system E:

              clip_image054 EQ 2.7

              Thus, there exists a method to calculate the probability of each system as well as the conditional probability of that system.

              The addition of a device (such as an IDS) changes or otherwise impacts t and adds additional complexity to the calculations. An IDS for instance can limit the value of t through a probabilistic feedback process. The more effective the IDS is, the quicker an attack or other incident will be intercepted. In this instance, t becomes a probabilistic function based on how effective the IDS itself is. This becomes a combination of the following factors:

· The inherent accuracy of the IDS (a trade-off between Type I and Type II errors (Rogers, 1996), and a cost function in itself)

· The missed detection rate (even where an incident is noted, the analyst may miss the detection. As more false positives are seen, the missed detection rate increases (Ikeda & Watanabe, 1962). As a result, increasing the alert volume to capture all possible attacks ends in a limit where the IDS is no longer effective).

A Type I error is often denoted a “false positive”. This involves incorrectly rejecting the null hypothesis in favor of the alternative. Where an IDS is involved, a false positive would involve detecting and alerting on an event that is not in fact an incident or attack. A Type II error is the opposite of a Type I error. A Type II error in an IDS involves the false acceptance of the null hypothesis and is commonly referred to as a false negative. It would imply that the packet or traffic is not an attack and is safe when it is in fact malicious or otherwise dangerous.

              The IDS forms a cost function as the increase in reporting results in a greater number of false positives that need to be investigated. In limiting the false positives, the likelihood of missing an incident of note also increases. Each validation of a false positive takes time and requires interaction from an analyst. Hence the tuning of an IDS is balanced on maximizing the detection against cost.

In the event that the IDS does not detect the attack, the function mirrors in effectiveness that of the system without the IDS. Note that the cost of the system with the IDS is greater than that of the system without it. As a result, the addition of an IDS is a limiting function. An increase in cost adds to the power of the IDS. That is, more analyst time and more detection capability lower the false negative and false positive rates through an increase in cost. Each IDS has an expected Type I and Type II error rate that will vary as the system is tuned to a particular environment. The result of this is an individualistic function for the organisation that can only be generally approximated for other organisations (even when the same IDS product is deployed).
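The tuning trade-off described above can be made concrete with a toy cost model. All of the rates and costs below are invented for illustration; the point is only that the expected cost is the sum of analyst time wasted on false positives and the expected loss from missed incidents:

```python
def ids_daily_cost(fpr, fnr, incident_rate, events_per_day,
                   cost_per_alert, cost_per_miss):
    """Expected daily cost of an IDS tuning point: false positives each
    consume analyst time, while false negatives incur incident costs."""
    benign = events_per_day * (1.0 - incident_rate)
    attacks = events_per_day * incident_rate
    false_positives = benign * fpr
    missed = attacks * fnr
    return false_positives * cost_per_alert + missed * cost_per_miss

# Two hypothetical tuning points: "noisy" (sensitive) vs "quiet" (lax).
noisy = ids_daily_cost(fpr=0.010, fnr=0.05, incident_rate=0.001,
                       events_per_day=100_000, cost_per_alert=5, cost_per_miss=10_000)
quiet = ids_daily_cost(fpr=0.001, fnr=0.40, incident_rate=0.001,
                       events_per_day=100_000, cost_per_alert=5, cost_per_miss=10_000)
print(noisy, quiet)
```

Minimising a function like this over the tuning parameter is what balancing detection against cost amounts to in practice.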

For a given probability of survival, it is possible to calculate the expected survival time (t) of the system. This process becomes computationally infeasible in large systems with numerous inputs. For instance, on system E (as defined in EQ 2.3) it is feasible to rearrange the equation for the expected probability of system E being compromised. If a calculation of the expected survival time for a set survival probability P is desired, rearrange the equations in EQ 2.3 as follows:

clip_image056 EQ 2.8

              This result is in the form of:


From EQ 2.8 it is clearly seen that as t → ∞, the exponential terms tend to zero. From these equations, as long as t is large, an approximation can be deployed to obtain a lower-limit estimate. As such, an approximate lower limit of the time for system E's survival is defined as:

              clip_image066 EQ 2.9

In EQ 2.4, it is demonstrated that the lower the value of t, the greater the error. Measuring "t" in seconds and substituting normal system values of λ allows for the use of Monte Carlo simulations to approximate the expected value of t.
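For a single exponential system, the inversion is straightforward and needs no simulation; the multi-system form of EQ 2.8 is what drives the computational cost. A minimal sketch, with an illustrative 30-day mean time to compromise:

```python
import math

def survival_time(p_survive, theta):
    """Invert R(t) = exp(-t / theta): the time t at which the survival
    probability has fallen to `p_survive`, for mean time-to-compromise theta."""
    return -theta * math.log(p_survive)

# Illustrative: with a 30-day mean time to compromise, the survival
# probability falls to 0.5 after roughly 20.8 days.
t_half = survival_time(0.5, 30.0)
```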

For simplicity, let R represent reliability and Q the unreliability (hence, Q = 1 − R).

For each application, a possibility exists to use Bayes' theorem to model the number of vulnerabilities and the associated risk. A mathematical introduction to Bayes' theorem is available online from Weisstein and in more detail from Joyce (2008). For open ports, the person evaluating risk can use the expected reliability of the software together with the expected risk of each individual vulnerability to model the expected risk of the application. For instance, it is conceivable to model the probability of the application being compromised using this method.

              clip_image072 EQ 2.10



              Over time, as vulnerabilities are uncovered and fixed (assuming that new vulnerabilities have not been introduced), fewer issues will remain. Hence, the confidence in the software product increases. This also means that mathematical observations can be used to produce better estimates of the number of software vulnerabilities as more are uncovered.

              It is thus possible to observe the time that elapses (Guo, Jarrow, & Zeng, 2005) since the last discovery of a vulnerability. This value is dependent upon the number of vulnerabilities in the system and the number of users of the software. The more vulnerabilities, the faster the discovery rate of bugs. Likewise, the more users of the software, the faster the existing vulnerabilities are found (through both formal and adverse discovery).

              2.1. Mapping Vulnerabilities within software

Now let E stand for the event where a vulnerability is discovered within the times T and T+h, for n vulnerabilities in the software.


Where a vulnerability is discovered between time T and T+h, use Bayes' theorem to compute the probability that n bugs exist in the software:

              clip_image078 EQ 2.11

              From this it can be seen that:

              clip_image080 EQ 2.12

              EQ 2.12 will apply for all versions of software (Wright & Zia, 2011). As patches and updates are applied to the software, existing vulnerabilities will be rectified and removed, but new flaws related to how many new lines of code have been added in the patching process will be introduced and will also need to be calculated.

By summing the denominator, it can be understood that, on observing a vulnerability at time T after the release, with a decay constant for defect discovery of λ, the conditional distribution for the number of defects remaining is a Poisson distribution whose expected number of defects decays exponentially with T.


              clip_image086 EQ 2.13

              This can be extended to create a method to calculate the expected failure of a system based on the interaction of multiple software products.

              3. Exponential Failure

              The reliability function (also called the survival function) represents the probability that a system will survive a specified time t. Reliability is expressed as either MTBF (Mean time between failures) or MTTF (Mean time to failure). The choice of terms is related to the system being analysed. In the case of system security, it relates to the time that the system can be expected to survive when exposed to attack. This function is hence defined as:

R(t) = 1 - F(t)   EQ 3.1

The function F(t) in EQ 3.1 is the probability that the system will fail within the time 't'. As such, this function is the failure distribution function (also called the unreliability function). The randomly distributed expected life of the system 't' can be represented by a density function f(t), and thus the reliability function R(t) can be expressed as:

R(t) = 1 - \int_{0}^{t} f(s)\,ds = \int_{t}^{\infty} f(s)\,ds   EQ 3.2

              The time to failure of a system under attack can be expressed as an exponential density function:

f(t) = \frac{1}{\theta}\,e^{-t/\theta}, \qquad t \ge 0   EQ 3.3

where θ is the mean survival time of the system in the hostile environment and t is the time of interest (the time over which the user wishes to evaluate the survival of the system). Hence, the reliability function R(t) can be expressed as:

R(t) = e^{-t/\theta}   EQ 3.4

The mean (θ), or expected life of the system under hostile conditions, can hence be expressed as:

\theta = \int_{0}^{\infty} R(t)\,dt = M   EQ 3.5

where M is the MTBF of the system or component under test and λ is the instantaneous failure rate (Brémaud, 1981). Mean life and failure rate are related by the formula:

\lambda = \frac{1}{M}   EQ 3.6

              The failure rate for a specific time interval can also be expressed as:

\lambda = \frac{\text{number of failures in the interval}}{\text{total operating time}}   EQ 3.7

Failure rates are generally expressed in terms of failures per hour, the percentage of failures per 1,000 hours, or the rate of failures per million hours. For instance, if a system has a 90 day patch cycle (the total mission time) and the total number of software failures in that time is expected to be (or is later measured to be) 6 vulnerabilities, the failure rate per hour can be calculated as:

\lambda = \frac{6}{90 \times 24 \text{ h}} = \frac{6}{2160 \text{ h}} \approx 0.00278 \text{ failures per hour}   EQ 3.8

              In the case of an exponential distribution for the system mean survival under attack, the MTBF can be defined as:

M = \frac{1}{\lambda} = \frac{2160}{6} = 360 \text{ hours} = 15 \text{ days}   EQ 3.9

Hence, the system is expected to survive 15 days before a vulnerability is discovered. This does not return when a system will actually be exploited, simply the expected probabilistic time, which can be used to project and plan future expenditure.
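The patch-cycle example above (EQ 3.8 and EQ 3.9) can be worked directly:

```python
import math

# Failure rate and MTBF for the 90-day patch-cycle example.
mission_hours = 90 * 24          # 90-day patch cycle = 2160 hours
failures = 6                     # vulnerabilities observed in the cycle

failure_rate = failures / mission_hours   # lambda, failures per hour
mtbf_hours = 1 / failure_rate             # M = 1 / lambda
mtbf_days = mtbf_hours / 24               # 15 days expected survival

def reliability(t_hours: float) -> float:
    """R(t) = exp(-t / M): probability of surviving t hours (EQ 3.4)."""
    return math.exp(-t_hours / mtbf_hours)

print(failure_rate)   # ~0.00278 failures per hour
print(mtbf_days)      # 15.0
```

Note that at t = M the survival probability is e^{-1} ≈ 0.37, so the MTBF is not a time the system is "safe" for, only the mean of an exponential survival distribution.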

              4. Modeling System Audit as a Sequential test with Discovery as a Failure Time Endpoint

Combining hazard models with SIR (Susceptible-Infected-Removed) epidemic modeling (Altmann, 1995) provides a means of calculating the optimal information systems audit strategy. Treating audit as a sequential test allows for the introduction of censoring techniques (Chakrabarty, & Guo, 2007) that enable the estimation of benefits from divergent audit strategies (Benveniste, & Jacod, 1973). This process can be used to gauge the economic benefits of these strategies in the selection of an optimal audit process designed to maximize the detection of compromised or malware infected hosts.

              Computer systems are modeled through periodic audit and monitoring activities. This complicates the standard failure and hazard models that are commonly deployed (Newman, et. al. 2001). A system that is found to have been compromised by an attacker, infected by malware or simply suffering a critical but unexploited vulnerability generally leads to early intervention. This intervention ranges from system patching or reconfiguration to complete rebuilds and decommissioning.

              Audits and reviews of computer systems usually follow a prescribed schedule in chronological time. This may be quarterly, annually or to any other set timeframe. Further, periodic reviews and analysis of systems in the form of operational maintenance activities also provide for a potential intervention and discovery of a potential system failure or existing compromise.

Using a combination of industry and organizational recurrence rates, stipulated from a preceding failure and covariate history derived from the individual organization, introduces a rational foundation for modeling current event data. An incident, as defined for the purposes of this paper, is an event leading to the failure of the system. This can include a system compromise by an attacker or an infection process of malware (such as a scanning worm). Denote the number of incidents within the organization by follow-up time t as N†(t), and the corresponding observed incidents in [0, t] as N(t). With regard to absolutely continuous event times, the hazard or intensity process λ{t; H(t)} for the intervention time t, conditioned on the covariate and event history H(t), can be expressed as:

\lambda\{t; H(t)\}\,dt = E\{dN^{\dagger}(t) \mid H(t)\}   EQ 4.1

Taking the assumption that the administrative and audit staff are not the direct cause of an incident, a point process N(t) will usually be observed for the system being examined. A system is defined as an isolated and interactive grouping of computers and processes. This could be a collection of client and server hosts located at a specific location and isolated by a common firewall.

Due to censoring through the audit process, N†(t) can be greater than N(t). Equation (4.1) carries the assumption that only a single incident occurs at any instant; that is, N†(t) increments in unit jumps. Live systems can and do experience multiple incidents and compromises between detection events. Hence, it is also necessary to model the mean increments in N†(t) over time:

d\Lambda\{t; H(t)\} = E\{dN^{\dagger}(t) \mid H(t)\}   EQ 4.2

with the cumulative intensity process \Lambda\{t; H(t)\} = \int_{0}^{t} \lambda\{s; H(s)\}\,ds.

In the case of a continuous-time process with unit jumps, expressions (EQ 4.1) and (EQ 4.2) can be expressed as

\lambda\{t; H(t)\}\,dt = d\Lambda\{t; H(t)\} = P\{dN^{\dagger}(t) = 1 \mid H(t)\}.   EQ 4.3

Independent censorship requires that the censoring mechanism carries no information about the incident process beyond that contained in H(t). This assumption of independent censorship allows the preceding covariate histories to be incorporated into the model. Defining Y(t) as the indicator that the system remains under observation (uncensored) at time t, it is now necessary that

E\{dN(t) \mid H(t)\} = Y(t)\,\lambda\{t; H(t)\}\,dt   EQ 4.4

for all times t prior to the audit or review.

4.1. NHPP (Non-Homogeneous Poisson Process)

Poisson processes have been used to model software (Zhu et. al 2002) and systems failures (Marti, 2008), but these models are too simplistic, and it is necessary to vary the intensity (rate) based on historical and other data in order to create accurate risk models for computer systems. The non-homogeneous Poisson process (NHPP) can be used to model a Poisson process with a variable intensity. In the special case where λ(t) takes a constant value λ, the NHPP reduces to a homogeneous Poisson process with intensity λ.

In the heterogeneous case of an NHPP with intensity λ(t), the increment N(t+h) − N(t) has a Poisson distribution with mean m(t, t+h) = \int_{t}^{t+h} \lambda(s)\,ds. Hence the distribution function of the incident discovery can be expressed as:

P\{N(t+h) - N(t) = k\} = \frac{\left[m(t, t+h)\right]^{k}}{k!}\,e^{-m(t, t+h)}, \qquad k = 0, 1, 2, \ldots
              The NHPP format is better suited to information systems risk modeling than is the homogeneous Poisson Process as it can incorporate changes that occur over time across the industry.

This can also be modeled as the Poisson process with parameter λ. Here, (N_t)_{t ≥ 0} represents the unique (in law) increasing, right-continuous process with independent, time-homogeneous increments. Each N_t has a Poisson distribution with rate λt. The process (N_t) is also stationary with independent time increments.

With t_0 = 0 and t_0 < t_1 < … < t_n, the random variables N_{t_1} − N_{t_0}, …, N_{t_n} − N_{t_{n−1}} are independent, and each N_{t_i} − N_{t_{i−1}} has the same distribution as N_{t_i − t_{i−1}}.

The characteristic function of (N_t − λt)/√λ can be computed for any u ∈ ℝ as:

E\left[\exp\!\left(iu\,\frac{N_t - \lambda t}{\sqrt{\lambda}}\right)\right] = \exp\!\left(\lambda t\left(e^{iu/\sqrt{\lambda}} - 1\right) - iu\sqrt{\lambda}\,t\right)

As λ → ∞, the expression converges to e^{−u²t/2}, which is the characteristic function of a Gaussian variable with variance t.
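A non-homogeneous process of this kind can be simulated by thinning (the Lewis-Shedler method): generate a homogeneous process at a rate that dominates the intensity, then accept each candidate point with probability proportional to the true intensity there. A minimal sketch, assuming an illustrative decaying discovery intensity λ(t) = 2e^{−t/50} incidents per day:

```python
import math
import random

def simulate_nhpp(intensity, t_max, lam_max, rng):
    """Simulate event times of an NHPP on [0, t_max] by thinning:
    candidates arrive at rate lam_max; a candidate at time t is kept
    with probability intensity(t) / lam_max."""
    times, t = [], 0.0
    while True:
        t += rng.expovariate(lam_max)        # next candidate arrival
        if t > t_max:
            return times
        if rng.random() <= intensity(t) / lam_max:
            times.append(t)

# Assumed intensity: discovery rate decaying after release.
rng = random.Random(42)
events = simulate_nhpp(lambda t: 2 * math.exp(-t / 50), 100.0, 2.0, rng)
# The expected count is the integral of the intensity over [0, 100]:
# 100 * (1 - exp(-2)), roughly 86.5 events.
print(len(events))
```

The dominating rate `lam_max` must be an upper bound on the intensity over the window; a tighter bound wastes fewer candidate points.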

              4.2. Recurrent Events

In many cases, audit and review processes are limited in scope and may not form a complete report of the historical processes that have occurred on a system (Revuz, & Yor, 1999). The audit samples selected systems and does not check neighboring systems unless a failure is discovered early in the testing. In these instances, the primary interest resides in selected marginalized intensities that condition only on selected parts of the preceding histories. Some marginal intensity rates drop the preceding incident history altogether:

m\{t; X(t)\}\,dt = E\{dN^{\dagger}(t) \mid X(t)\}   EQ 4.5

A common condition for the identification of m{t; X(t)} is that

E\{dN^{\dagger}(t) \mid X(t), Y(t) = 1\} = E\{dN^{\dagger}(t) \mid X(t)\}.   EQ 4.6

For (EQ 4.6) to be valid, the censoring intensity cannot depend on the preceding incident history of the system, {N†(s); s < t}. The process of randomly selecting systems to audit makes it unlikely that particularly problematic systems will be re-audited on every occasion. This would include the exclusion of targeting client systems that have been compromised several times in the past or which have suffered more than one incident in recent history. The result is that covariates that are functions of {N†(s); s < t} will also have to be excluded from the conditioning event. Here

d\mu\{t; X(t)\} = E\{dN^{\dagger}(t) \mid X(t)\}, \qquad \mu\{t; X(t)\} = \int_{0}^{t} m\{s; X(s)\}\,ds   EQ 4.7

When this occurs, μ{t; X(t)} models the expected number of incidents that have occurred in the system over [0, t] as a function of X(t).

              4.3. Cox Intensity Models

Using a Cox-type model

\lambda\{t; H(t)\} = \lambda_{0}(t)\,\exp\{Z(t)^{\mathsf{T}}\beta\},   EQ 4.8

with the regression vector Z(t) having been created using functions of X(t) and N†(t−), inference differs little from that for univariate failure time data. The log-partial likelihood function, the score statistic and the integral notation for the information matrix may be written, respectively, as

l(\beta) = \sum_{i=1}^{n} \int_{0}^{\infty} \left[ Z_{i}(t)^{\mathsf{T}}\beta - \log\!\left\{ \sum_{j} Y_{j}(t)\,e^{Z_{j}(t)^{\mathsf{T}}\beta} \right\} \right] dN_{i}(t)   EQ 4.9

U(\beta) = \sum_{i=1}^{n} \int_{0}^{\infty} \left[ Z_{i}(t) - \bar{Z}(t; \beta) \right] dN_{i}(t), \qquad \bar{Z}(t; \beta) = \frac{\sum_{j} Y_{j}(t)\,Z_{j}(t)\,e^{Z_{j}(t)^{\mathsf{T}}\beta}}{\sum_{j} Y_{j}(t)\,e^{Z_{j}(t)^{\mathsf{T}}\beta}}   EQ 4.10


I(\beta) = \sum_{i=1}^{n} \int_{0}^{\infty} \left[ \frac{\sum_{j} Y_{j}(t)\,Z_{j}(t)\,Z_{j}(t)^{\mathsf{T}}\,e^{Z_{j}(t)^{\mathsf{T}}\beta}}{\sum_{j} Y_{j}(t)\,e^{Z_{j}(t)^{\mathsf{T}}\beta}} - \bar{Z}(t; \beta)\,\bar{Z}(t; \beta)^{\mathsf{T}} \right] dN_{i}(t)   EQ 4.11





By defining Z(t) in terms of fixed or external time-varying covariates, (EQ 4.8) can be further extended by adding additional elements γ_j to Z(t)^T β. This would allow the intensity to be altered by a multiplicative factor e^{γ_j} following the jth incident on an individual system, when compared against another system without any incidents at the same point in time.
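The log-partial likelihood in EQ 4.9 can be sketched for a single covariate with no ties or censoring; the event times and covariate values below are assumptions for illustration only:

```python
import math

def cox_log_partial_likelihood(beta, data):
    """l(beta) = sum over events of [ z_i * beta - log( sum over the
    risk set of exp(z_j * beta) ) ] for one covariate (cf. EQ 4.9)."""
    total = 0.0
    for t_i, z_i in data:
        # Risk set at t_i: subjects still under observation at that time.
        risk = [z_j for t_j, z_j in data if t_j >= t_i]
        total += z_i * beta - math.log(sum(math.exp(z * beta) for z in risk))
    return total

# Toy data, assumed: (event time, covariate z), all observed as events.
data = [(2.0, 1.0), (3.5, 0.0), (5.0, 1.0), (7.0, 0.0)]

# Crude grid search for the maximizing beta (illustration, not inference).
betas = [b / 100 for b in range(-300, 301)]
beta_hat = max(betas, key=lambda b: cox_log_partial_likelihood(b, data))
print(beta_hat)
```

At β = 0 every term reduces to −log(size of the risk set), which gives a quick sanity check on an implementation.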

              4.4. SIR (Susceptible-Infected-Removed) epidemic modeling of incidents during Audit

Allowing that a compromised or infected system remains infected for a random amount of time τ, the discovery of an incident by an auditor will depend on a combination of the extent of the sample tested during the audit and the rate at which the incident impacts individual hosts. Here, it is assumed that the audit is effective and will uncover an incident if an infected host is reviewed. When a host in a system is infected, any neighboring hosts are attacked and infected at a rate r. The sample size selected in the audit is set as n, and the total number of hosts in the system being audited is defined by N, where n ≤ N. The time between audits (the censor time) is defined by C.

If τ > C, an infected or compromised system will still be undiscovered, and attacking other hosts within the system, when the audit occurs. At the end of the time τ, the system is removed, as it is either 'dead' (that is, decommissioned and reinstalled) or has been patched against the security vulnerability.

An NSW (Newman, Strogatz, and Watts, 2001) random graph is obtained by investigating the neighboring systems in the SIR model. From this, the epidemic thresholds can be computed.

4.4.1. Calculations with a constant τ

First, consider the case where τ is a constant value and, without loss of generality, rescale time so that τ = 1. Let p_k denote the degree distribution of the network.

Starting with a single infected host in a system, the probability that j of its k neighboring hosts become infected is given by:

\binom{k}{j}\left(1 - e^{-r}\right)^{j} e^{-r(k-j)}   EQ 4.12

Setting μ_p as the mean of the degree distribution p_k, the mean number of neighbors infected is then (1 − e^{−r})μ_p.

With the network constructed as an NSW random graph, systems that are compromised in the second and subsequent iterations are reached along an edge, so their degree counts both the subsequently compromised machines and the host that compromised the existing system. The resulting (size-biased) degree distribution associated with these neighboring hosts is given by:

q_{k} = \frac{(k+1)\,p_{k+1}}{\mu_{p}}   EQ 4.13

              This allows us to calculate the probability that j neighboring hosts also become infected:

\sum_{k \ge j} q_{k} \binom{k}{j}\left(1 - e^{-r}\right)^{j} e^{-r(k-j)}.   EQ 4.14

Setting μ_q to represent the mean of q_k leads to a mean number of (1 − e^{−r})μ_q hosts infected in each subsequent generation.


From this it is not too difficult to see that, for the attack or malware to propagate and infect other systems, it is necessary to have the condition where

\left(1 - e^{-r}\right)\mu_{q} > 1.   EQ 4.15

Using this condition provides the capability to calculate the probability that a particular attack or type of malware could result in an outbreak (Thomson, 2007). Setting G_0(x) = \sum_{k} p_{k} x^{k} (Newman, Strogatz, and Watts, 2001) returns:

G_{1}(x) = \sum_{k} q_{k} x^{k} = \frac{G_{0}'(x)}{G_{0}'(1)}   EQ 4.16

Similarly, it is simple to prove that μ_q = G_1′(1).

As such, the probability that an incident behaves as an epidemic is 1 − G_0(ξ), where ξ is the smallest fixed point of G_1(x) = x in [0, 1]. This is important, for where an incident can be demonstrated to behave in a manner that lies within [0, 1], it is possible to utilize the joint probability substitutions from section 2.
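The fixed-point calculation can be sketched for an assumed Poisson degree distribution with mean z, for which the two generating functions coincide (G_1 = G_0 = e^{z(x−1)}); the mean degree below is an assumption for illustration:

```python
import math

def G0(x, z):
    """Generating function of a Poisson(z) degree distribution."""
    return math.exp(z * (x - 1))

def G1(x, z):
    """Excess-degree generating function; for Poisson degrees, G1 = G0."""
    return G0(x, z)

def smallest_fixed_point(z, iters=200):
    """Iterate x <- G1(x) from 0; the iteration climbs monotonically to
    the smallest fixed point of G1(x) = x in [0, 1]."""
    x = 0.0
    for _ in range(iters):
        x = G1(x, z)
    return x

z = 2.0                         # assumed mean degree
xi = smallest_fixed_point(z)
epidemic_prob = 1 - G0(xi, z)   # probability an incident becomes an epidemic
print(xi, epidemic_prob)
```

For z = 2 the fixed point sits near 0.20, so roughly 80% of single-host introductions would grow into an epidemic under this assumed degree distribution.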

4.4.2. Calculations with a variable τ

Next, it is necessary to consider the effects of a variable or random value of τ. The probability that a compromised system causes a compromise in a neighboring system is:

T = 1 - E\left(e^{-r\tau}\right).   EQ 4.17

Here, T is the transmissibility factor.

Newman (Newman, Strogatz, and Watts, 2001) asserted that the infections of neighbors are independent. This does not hold as valid for malware, as neighboring systems are commonly linked to form workgroups and domains that share files and even execute code across network boundaries, but it gives a good approximation where most systems are not openly connected as a grid. Interactions between systems and the ability of software to rescan the same systems carry a degree of dependence. Here, the time to compromise may be modeled exponentially with rate λ (again, the failure rate), such that P(τ > t) = e^{−λt}.

From this it can be shown that the probability of a host not being compromised is

E\left(e^{-r\tau}\right) = \frac{\lambda}{\lambda + r}.   EQ 4.18

              Likewise, the probability that n hosts in a system are not compromised is,

E\left(e^{-nr\tau}\right) = \frac{\lambda}{\lambda + nr}   EQ 4.19

For cases where τ is not constant, Jensen's inequality (Dempster, Laird, and Rubin, 1977) implies that, for neighboring hosts, the probabilities of escaping compromise are positively correlated, as

E\left(e^{-2r\tau}\right) \ge \left[E\left(e^{-r\tau}\right)\right]^{2}.   EQ 4.20
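EQ 4.18 and the Jensen bound in EQ 4.20 can be checked numerically by integrating against the exponential density; the rates below are assumptions for illustration:

```python
import math

def laplace_exp_tau(rate_lam, r, dt=1e-3, t_max=60.0):
    """Numerically approximate E[exp(-r * tau)] for tau ~ Exponential(rate_lam)
    via a left Riemann sum over the density rate_lam * exp(-rate_lam * t)."""
    total, t = 0.0, 0.0
    while t < t_max:
        total += math.exp(-r * t) * rate_lam * math.exp(-rate_lam * t) * dt
        t += dt
    return total

lam, r = 0.5, 1.5                      # assumed failure and infection rates
numeric = laplace_exp_tau(lam, r)
closed_form = lam / (lam + r)          # EQ 4.18: 0.25 for these rates
jensen_lhs = laplace_exp_tau(lam, 2 * r)   # E[exp(-2 r tau)]
print(numeric, closed_form)
```

The numeric value matches λ/(λ + r), and E[e^{−2rτ}] exceeds (E[e^{−rτ}])², confirming the positive correlation that Jensen's inequality predicts for a random τ.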

It is now viable to compute the expected number of systems that will be compromised by substituting the generating functions G_0 (for p_k) and G_1 (for q_k), which gives

\langle s \rangle = 1 + \frac{T\,G_{0}'(1)}{1 - T\,G_{1}'(1)}   EQ 4.21

if T G_1′(1) < 1; G_1 is strictly convex, as G_1′′(x) ≥ 0 for x ∈ [0, 1].
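EQ 4.21 can be evaluated directly below the epidemic threshold; the degree distribution and transmissibility below are assumptions for illustration:

```python
def expected_outbreak_size(T, g0_prime_1, g1_prime_1):
    """Mean number of compromised hosts per introduction,
    <s> = 1 + T * G0'(1) / (1 - T * G1'(1))  (EQ 4.21),
    valid only below the epidemic threshold T * G1'(1) < 1."""
    if T * g1_prime_1 >= 1:
        raise ValueError("Above the epidemic threshold; mean size diverges.")
    return 1 + T * g0_prime_1 / (1 - T * g1_prime_1)

# Assumed example: Poisson degree distribution with mean z = 3, so
# G0'(1) = G1'(1) = 3, and transmissibility T = 0.2 (sub-threshold).
size = expected_outbreak_size(0.2, 3.0, 3.0)
print(size)   # 1 + 0.6/0.4 = 2.5 hosts per introduction
```

As T G_1′(1) approaches 1 the mean outbreak size diverges, which is the same threshold as EQ 4.15 expressed through the generating functions.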

              4.5. Applications to audit and review

The first case, where τ < C, has a host in the system being discovered as having been infected or compromised before the audit. Here, the rate of infection r determines the chance of other systems being uncovered during an audit. If the time between audits exceeds the time to compromise the first host, but is insufficient for the incident to spread (i.e. rτ < 1), then only the initial host will have been compromised, and this will be known prior to the audit.

If τ > C, a compromised host in the system will remain undiscovered, and attacking other hosts within the system, when the audit occurs. At the end of the time τ, the system is found. As such, if C < τ < C + c (where c is the average time taken to conduct an audit), a compromised host is discovered during the audit period through a process independent of the audit.

The alternative scenario, and that which is of most interest, is where τ > C + c. In this case, the incident will not be discovered independently of the audit. In this instance, the probability that an auditor will discover a compromised system during the audit process may be determined, as there exists a time-limited network function that is coupled with a discovery process formulated using the Bayesian prior. That is, a risk professional can calculate the probability that all systems are uncompromised, given a selected audit strategy that finds that none of the audited hosts have been compromised.

So, though it is never feasible to know absolutely whether systems outside the audit have been compromised, there is a level of relative risk that can be calculated within confidence bounds and used as a means of calculating the expected loss for a variety of risk mitigation strategies. The security professional can then use hypothesis tests to determine whether one strategy is significantly better than another, and select risk strategies based on expected loss.

              4.6. False Negatives in an Audit

A false negative results (Jacod, 1975) in an audit where an incorrect result is reported, noting the organization as safe when it is not (i.e. no compromise was detected where hosts have in fact been compromised). Letting A represent the condition where the organization has been compromised, and B represent positive evidence of a compromise being reported:

P\left(A \mid \bar{B}\right) = \frac{P\left(\bar{B} \mid A\right)P(A)}{P\left(\bar{B} \mid A\right)P(A) + P\left(\bar{B} \mid \bar{A}\right)P\left(\bar{A}\right)}   EQ 4.22

Here, it is practicable to model the actual rate of compromise in the system, P(A). Given the network compromise model (EQ 4.21), it is realistic to substitute the censored time C for τ, giving the transmissibility over an audit interval:

T(C) = 1 - e^{-rC}   EQ 4.23

From this, the probability of a host not being compromised in the censor time is

e^{-rC}.   EQ 4.24

Equation (EQ 4.24) also gives the probability of any single host being compromised between the audits:

p = 1 - e^{-rC}.   EQ 4.25

Depending on whether τ is constant or varies, the process can calculate the expected number of hosts that will be compromised in the period between audits as a fixed or variable function. In either event, each calculation is an exercise in Bayesian estimation of the type where a random sample is selected and the defects or failures are analyzed:

P(x \text{ failures in a sample of } n) = \binom{n}{x}\,p^{x}\,(1 - p)^{n - x}.   EQ 4.26

This binomial distribution simplifies in the case of a false negative (no failures, or x = 0, from a sample of n hosts):

P(x = 0) = (1 - p)^{n}.   EQ 4.27
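EQ 4.25 and EQ 4.27 combine into a simple false-negative calculation; the infection rate, censor time and sample size below are assumptions for illustration:

```python
import math

def p_compromised(r, C):
    """Probability a single host is compromised between audits (EQ 4.25)."""
    return 1 - math.exp(-r * C)

def false_negative_prob(n, r, C):
    """Probability an audit sample of n hosts finds nothing even though the
    per-host compromise probability is p (EQ 4.27): (1 - p)^n."""
    return (1 - p_compromised(r, C)) ** n

# Assumed rates: r = 0.01 compromises/day, quarterly audits (C = 90 days),
# sample of n = 10 hosts.
p = p_compromised(0.01, 90)
print(p)                               # ~0.593 per-host compromise probability
print(false_negative_prob(10, 0.01, 90))
```

Even with a modest sample, a clean audit is strong evidence under these assumed rates: the all-clear probability falls off exponentially in the sample size.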

Based on the types of systems, the audit periods can be selected to create an economically optimal choice. A baseline audit frequency is often set by regulatory guidelines. Here, the baseline requirement would be to have a minimum audit process designed around the most effective returns within the regulatory regime. In this, using an equation that calculates the expected number of compromised or infected hosts within a censor time allows for the selection of either a fixed audit schedule, C = constant, or a schedule that varies C over the course of the system life in order to maximize detection, C = C(t).

In conducting this exercise, the cost of the audit, and the differences that occur with scale, would also need to be modeled. The required effort for an audit of 10 hosts in a 1-month period is not necessarily linearly related to an audit of 60 hosts in a 6-month period. In each case, the individual constraints faced by the selected organization also need to be incorporated.

              5. Automating the process

The main advantage of a systems engineering approach is the ease with which it can be automated. The various inputs and formulae noted throughout this paper can become inputs into a neural network algorithm (Fig. 5.1). Equation (2.1) could be modeled in three layers (Fig 5.2).

Here, an input layer with one neuron for each input (system or application) could be used to map IP options, malware and buffer overflow conditions, selected attacks, and so on. The system of perceptrons would be processed using a hidden neuron layer, in which each neuron represents a combination of inputs and calculates a response based on current data coupled with expected future data, prior data and external systems data. Data processed at this level would feed into an output layer. The result of the neural network would be supplied as an economic risk function.

In this way, a risk function can be created that not only calculates risk based on existing and known variables (He, Wang, & Yan, 1992), but also updates automatically using external sources and trends. Many external sources have become available in recent years that provide external trending and correlation points. Unfortunately, most of these services have clipped data, as the determination of an attack is generally unclear and takes time to diagnose, so much otherwise useful data is lost. When monitoring the operation of a system or the actions of users, thresholds are characteristically defined above or below which alerting, alarms, and exceptions are not reported. This range of activity is regarded as baseline or routine activity.


Fig 5.1 A depiction of a multi-layer topology neural network

Multi-layer topology neural networks can be used to accept data from risk models and automatically update the risk profile of an organization. In modeling risk, each application and system can be modeled using a perceptron.


              Fig 5.2 Inputs being fed into a perceptron.

The perceptron is the computational workhorse of this system. With it, it is reasonable to model the selected risk factors for the system and calculate a base risk that is trained and updated over time. The data from multiple organizations can be fed into a central system (Kay, 1977) and distributed to all users. This could be integrated and sold as a product enhancement by existing vendors, or independent third parties could maintain external datasets.

v_{i,j} = f\!\left(w_{i,j,0} + \sum_{k=1}^{n} w_{i,j,k}\,x_{k}\right)   EQ 5.1

              EQ 5.1 defines the input variables as,

· x1 … xn are the inputs of the neuron,

· wi,j,0 … wi,j,n are the weights,

· f is a non-linear activation function, here the hyperbolic tangent (tanh),

· vi,j is the output of the neuron.
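EQ 5.1 can be sketched directly; the input values and weights below are illustrative assumptions rather than trained parameters:

```python
import math

def perceptron(inputs, weights):
    """EQ 5.1: v = tanh(w0 + sum_k w_k * x_k), where weights[0] is the
    bias term w_{i,j,0} and weights[1:] pair with the inputs."""
    activation = weights[0] + sum(w * x for w, x in zip(weights[1:], inputs))
    return math.tanh(activation)

# Assumed toy risk inputs (e.g. patch level, exposure, malware signal),
# scaled to [0, 1], with illustrative weights.
x = [0.8, 0.3, 0.5]
w = [-0.5, 1.2, 0.7, 0.9]       # bias followed by one weight per input
v = perceptron(x, w)
print(v)                         # a risk score in (-1, 1)
```

The tanh activation keeps each neuron's output bounded, so layered combinations of these units remain numerically stable as the risk model grows.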

A large vendor such as Microsoft could create an implementation model. In place of offering stale recommended security settings (as currently occurs with Microsoft's MBSA), the risk application could automatically collect data from user systems on patch levels and group policy configurations, and utilize these to calculate and report an estimated level of risk and an expected survival time for the system under a number of different scenarios. For instance, a notebook computer could have a set of risks: the risk when connected to the corporate network, when connected to a wireless hotspot, and so on.

              The training of the network would require the determination of the correct weights for each neuron. This is possible in selected systems, but a far larger effort would be required to enable this process for more generalized deployment. The data needed for such an effort already exists in projects such as DShield, the Honeynet Project and in many similar endeavors. The question is whether there truly exists a will as a community to move from an art to a science.

              6. Conclusion

The equations presented in this paper allow organizations to compare their deployed risk strategies against both their own historical data and that of third parties. In this manner, strategy can be formulated to optimize audits and system reviews so that incidents are detected in the most economical manner. All projects are risk-derived exercises, and if our profession can better manage and calculate risk, society will benefit.

Modeling the failure rate of systems and the propagation rate of an attack allows us to calculate the number of hosts that are anticipated to have been compromised in the time between audits, given a specified survival function or threat. Past data and comparisons from similar systems allow for the modeling of alternative systems where a number of events have been reported against those deployed.

              Dependence, variation, randomness, and frailty add to the risk toolset of multivariate failure event analysis. Using frailty theory to model information system risk allows us to better predict risk and to more effectively allocate scarce resources through selecting the most economically viable targets to defend as well as choosing the optimal detection strategies. The properties of censoring-handling and frailty modeling have turned multivariate survival analysis into an exceptional tool for the determination of system risk.

              For decades, information security practitioners have engaged in qualitatively derived risk practices due to the lack of a scientifically valid quantitative risk model. This has led to both a misallocation of valuable resources with alternative uses and a corresponding decrease in the levels of protection for many systems. Using a combination of modern scientific approaches and the advanced data mining techniques that are now available provides the technologies and data to create a new approach to information systems risk and security.

The optimal distribution of economic resources across information system risk allocations can only lead to a combination of more secure systems for a lower overall cost. The reality is that, like all safety issues, information security is based on a set of competing trade-offs between economic constraints. The goals of any economically based quantitative process are to minimize cost and risk through the appropriate allocation of capital expenditure. To do this, the correct assignment of economic and legal liability to the parties best able to manage the risk (that is, the lowest-cost insurer) is essential and needs to be assessed. This will allow insurance firms to develop expert systems that can calculate risk management figures associated with information risk, and in turn allow for the correct attribution of information security insurance products that can be provided to businesses generally.

Externality, the quantitative and qualitative effect on parties that are affected by, but not directly involved in, a transaction, is likewise seldom quantified, but is an integral component of any risk strategy. The costs (negative) or benefits (positive) that apply to third parties are an oft-overlooked feature of economics and risk calculations. For instance, network externality (a positive effect that can be related to Metcalfe's law: the value of a network is proportional to the square of the number of its users) delivers positive benefits to most organizations with little associated cost to themselves. In these calculations, the time-to-market and first-mover advantages are critical components of the overall economic function, with security playing both positive and negative roles at all stages of the process.

The processes that can enable the creation and release of actuarially sound threat risk models, incorporating heterogeneous tendencies in variance across multidimensional determinants while maintaining parsimony, already exist in rudimentary form. Extending these through a combination of heteroscedastic predictors (GARCH, ARIMA, etc.) coupled with non-parametric survival models will make these tools more robust. Effort needs to be expended on the creation of models where the underlying hazard rate (rather than survival time) is a function of the independent variables (covariates). Cox's proportional hazard model with time-dependent covariates would be a starting point, with a number of non-parametric methods available where cost allows.

              As we move further into the 21st century, it is time we as a profession started to model risk as a scientific process and move away from the art based cottage industry that exists right now. This paper has presented a number of methods that can be used to gauge the expected failure events in a system of computer hosts.

              7. References

              Altmann, M., (1995) "Susceptible-infected-removed epidemic models with dynamic partnerships", Journal of Mathematical Biology Volume 33, Number 6, 661-675

              Annis, C. (2010) “Joint, Marginal, and Conditional Distributions” Retrieved October 11, 2011, from

              Benveniste, A. and Jacod, J. (1973). Systèmes de Lévy des processus de Markov. Invent. Math. 21 183--198. Mathematical Reviews

              Blanchard, B. and Fabrycky, W. (2006) “Systems Engineering and Analysis”, 4th Ed. Prentice Hall International Series in Industrial and Systems Engineering, USA

              Brémaud, P. (1981). Point Processes and Queues: Martingale Dynamics. Springer, New York. Mathematical Reviews

              Chakrabarty, A. and Guo, X. (2007). A note on optimal stopping times with filtration expansion. Preprint, Univ. California, Berkeley.

              Corcuera, J. M., Imkeller, P., Kohatsu-Higa, A. and Nualart, D. (2004). Additional utility of insiders with imperfect dynamic information. Finance and Stochastics 8 437--450.

              Dellacherie, C. and Meyer, P. A. (1982). Probabilities and Potential. B. North-Holland, Amsterdam.

Dempster, A.P., Laird, N.M. and Rubin, D.B. (1977). Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society, B39.

              Elliott, R. J., Jeanblanc, M. and Yor, M. (2000). On models of default risk. Math. Finance 10 179--195.

              Grandell, J. (1991) "Aspects of Risk Theory". Springer, New York

              Guo, X., Jarrow, R. and Zeng, Y. (2005). Modeling the recovery rate in a reduced form model. Preprint, Cornell Univ.

              He, S. W., Wang, J. G. and Yan, J. A. (1992). Semimartingale Theory and Stochastic Calculus. Science Press, Beijing.

              Ikeda, N. and Watanabe, S. (1962). On some relations between the harmonic measure and the Lévy measure for a certain class of Markov processes. J. Math. Kyoto Univ. 2 79--95.

              Internet Storm Center StormCast. Retrieved October 11, 2011, from

              Jacod, J. (1975). Multivariate point processes: Predictable projection, Radon--Nikodým derivatives, representation of martingales. Z. Wahrsch. Verw. Gebiete 31 235--253.

              Jeanblanc, M. and Valchev, S. (2005). Partial information and hazard process. Int. J. Theor. Appl. Finance 8 807--838.

              Joyce, James, (2008) "Bayes' Theorem", The Stanford Encyclopedia of Philosophy (Fall 2008 Edition), Edward N. Zalta (ed.), Retrieved October 11, 2011, from

              Kay, R. (1977), "Proportional Hazard Regression Models and the Analysis of Censored Survival Data", Journal of the Royal Statistical Society, Series C (Applied Statistics), Vol. 26, No. 3, pp. 227-237, Blackwell Publishing.

              Marti, K. (2008) "Computation of probabilities of survival/failure of technical, economic systems/structures by means of piecewise linearization of the performance function", Structural and Multidisciplinary Optimization, Vol. 35, No. 3, pp. 225-244.

              Newman, M.E.J., Strogatz, S.H., and Watts, D.J. (2001) "Random graphs with arbitrary degree distributions and their applications". Phys. Rev. E 64. paper 026118

              Revuz, D. and Yor, M. (1999). Continuous Martingales and Brownian Motion, 3rd ed. Springer, Berlin.

              Rogers, T. (1996) “Type I and Type II Errors - Making Mistakes in the Justice System” Retrieved October 11, 2011, from

              Survival/Failure Time Analysis. Retrieved October 11, 2011, from

              Therneau, T.; Sicks, J.; Bergstral, E. and Offord, J. (1994) “Expected Survival based on Hazard Rates”, Technical Report No. 52, Mayo Foundation, Retrieved October 11, 2011, from

              Thomson, I. (2007) “Google warns of web malware epidemic” Retrieved October 11, 2011, from google-warns-of-web-malware-epidemic.aspx

              Weisstein, Eric W. "Bayes' Theorem" Retrieved October 11, 2011 from MathWorld--A Wolfram Web Resource.

              Woller, J (1996) “The Basics of Monte Carlo Simulations” Retrieved October 11, 2011, from

              Wright, C. (2007) “The IT Regulatory and Standards Compliance Handbook: How to Survive Information Systems Audit and Assessments” Syngress

              Wright, C. & Zia, T. (2011) “A Quantitative Analysis into the Economics of Correcting Software Bugs”, CISIS, Spain

              Wright, C. & Zia, T. (2011) “Modeling System Audit as a Sequential Test with Discovery as a Failure Time Endpoint”, in the proceedings of the 2011 International Conference on Business Intelligence and Financial Engineering (ICBIFE 2011), December 12-13, 2011, Hong Kong

              Zhu, H., Zhang, Y., Huo, Q. and Greenwood, S. (2002) "Application of Hazard Analysis to Software Quality Modelling", Proceedings of the 26th Annual International Computer Software and Applications Conference, p. 139.