Friday, 16 November 2007

Where Vulnerability Testing fails

This is the original unpublished research paper I completed in 2004, which led to a couple of published papers in audit and security journals and also to a SANS project. I hope that I have become a little more diplomatic in my writing in the intervening years (Nah...).

Here we show that “Ethical Attacks” often do not provide the benefits they purport to hold. In fact it will be shown that this type of service may be detrimental to the overall security of an organisation.

It has been extensively argued that blind or black box testing can act as a substitute for more in-depth internal tests by finding the flaws and allowing them to be fixed before they are exploited. This article will show not only that the premise that external tests are more likely to determine vulnerabilities is inherently flawed, but also that this style of testing may actually result in an organisation being more vulnerable to attack.

“Ethical Attacks” or as more commonly described “(white hat) hacker attacks” have become widely utilised tools in the organisational goal of risk mitigation. The legislative and commercial drivers are a pervasive force behind this push.
Many organisations do not perceive the gap in the service offerings they are presented with. It is often not known that “external testing” will not, and by its very nature cannot, account for all vulnerabilities, let alone risks.
For this reason, this article will address and compare the types and styles of security testing available and critique their shortfalls.

To do this we shall first look at what an audit or review is, what people are seeking from an audit, and the results and findings, whether perceived or not. It is necessary to explore the perceptions and outcomes of an audit or review against the commercial realities of this style of service, from both the provider’s and the recipient’s perspective.

Next, it is essential to detail the actualities of a black box style test. The unfounded fear that auditors are adversaries of an organisation, derived from the ill-founded concept of the auditor as the “policeman”, has done more damage to organisations than the attackers they seek to defend themselves against.

This misconceived premise results in the mistrust of the very people entrusted to assess risk, detect vulnerabilities and report on threats to an organisation. Effectively this places the auditors in a position of censure and metaphorically “ties their hands behind their backs”.
Often this argument has been justified by the principle that the auditor has the same resources as the attacker. For simple commercial reasons this is never the case. All audit work is done to a budget, whether internal or externally sourced. When internal audit tests a control, they are assigned costs on internal budgets based on time and the effectiveness of the results.
Externally sourced auditors are charged at an agreed rate for the time expended. Both internal and external testing work to a fixed cost.
An external attacker (often referred to wrongly as a “hacker”) on the other hand has no such constraints upon them. They are not faced with budgetary shortfalls or time constraints. It is often the case that the more skilled and pervasive attacker will spend months (or longer) in the planning and research of an attack before embarking on the execution.

Further, audit staff are limited in number compared to the attackers waiting to gain entry through the back door. It is a simple fact that the pervasiveness of the Internet has opened organisations to an unprecedented level of attack and risk. Where vulnerabilities could be open for years in the past without undue risk, systems are unlikely to last a week unpatched today.

The foundation of the argument that an auditor has the same resources must be determined to be false. There are numerous attackers all “seeking the keys to the kingdom” for each defender. There are the commercial aspects of security control testing and there are the realities of commerce to be faced.

It may be easier to give the customer what they perceive they want rather than to sell the benefits of what they need, but as security professionals, it is our role to ensure that we do what is right and not what is just easier.

What passes as an Audit
An “ethical attack” or “penetration testing” is a service designed to find and exploit (albeit legitimately) the vulnerabilities in a system rather than weaknesses in its controls. Conversely, an audit is a test of those controls in a scientific manner. An audit must by its nature be designed to be replicable and systematic through the collection and evaluation of empirical evidence.

The goal of an “ethical attack” is to determine and report the largest volume of vulnerabilities that may be detected. Conversely, the goal of an audit is to corroborate or rebut the premise that systems controls are functionally correct through the collection of observed proofs.
This may result in cases where “penetration testing will succeed at detecting a vulnerability even though controls are functioning as they should be. Similarly, it is quite common for penetration testing to fail to detect a vulnerability even though controls are not operating at all as they should be” [i].
When engaged in the testing of a system, the common flaws will generally be found fairly quickly. As the engagement goes on, fewer and fewer (and generally more obscure and difficult to determine) vulnerabilities will be discovered, with cumulative findings growing in a roughly logarithmic manner. Most “ethical attacks” fail to achieve comparable results to an attacker for this reason. The “ethical attacker” has a timeframe and budgetary limits on what they can test.

On the contrary, an attacker is often willing to leave a process running long after the budget of the auditor has been exhausted. A vulnerability that is obscure and difficult to determine in the timeframe of an “external attack” is just as likely (if not more so) to be the one that compromises the integrity of your system as one discovered early on in the testing.
Though it is often cast in this manner, an external test is in no way an audit.
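The diminishing-returns argument above can be sketched numerically. The following is an illustrative model only: the logarithmic curve, the total number of flaws, the discovery rate and the hour figures are all assumptions made for the sake of the example and are not taken from the experiment.

```python
import math


def cumulative_findings(hours: float, total_flaws: int, rate: float) -> int:
    """Hypothetical model: findings grow with the logarithm of effort,
    capped at the number of flaws actually present."""
    if hours <= 0:
        return 0
    return min(total_flaws, int(rate * math.log1p(hours)))


# All figures below are assumed values for illustration.
TOTAL_FLAWS = 40        # flaws present on the system
RATE = 6.0              # discovery-rate constant of the model
TESTER_HOURS = 40       # a budgeted one-week engagement
ATTACKER_HOURS = 700    # an attacker with months to spend

tester_found = cumulative_findings(TESTER_HOURS, TOTAL_FLAWS, RATE)
attacker_found = cumulative_findings(ATTACKER_HOURS, TOTAL_FLAWS, RATE)

print(f"budgeted tester finds  {tester_found} of {TOTAL_FLAWS}")
print(f"patient attacker finds {attacker_found} of {TOTAL_FLAWS}")
```

Under any curve of this shape the first hours are the most productive, so a fixed budget stops the tester while the obscure flaws are still undiscovered, and the attacker simply keeps going.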

What is External Testing Anyway?
There are several methods used in conducting external tests:
  • White box testing is a test where all of the data on a system is available to the auditor;

  • Grey box tests deliver a sample of the systems to the auditor but not all relevant information;

  • Black box tests are conducted “blind” with no prior knowledge of the systems to be tested.
White box tests are comparatively rare these days as they require the auditor to retain a high level of skill in the systems being tested and a complex knowledge of the ways that the systems interact. White box testing requires an in depth testing of all the controls protecting a system.
To complete a “white box” test, the auditor needs to have evaluated all (or as close to all as is practical) of the controls and processes used on a system. These controls are tested to ensure that they are functionally correct and, if possible, that no undisclosed vulnerabilities exist. It is possible for disclosed vulnerabilities to exist on the system if they are documented as exceptions and the organisation understands and accepts the risk associated with not mitigating these.
One stage in the evaluation of a known vulnerability in a “white box” test is to ensure that it is fully disclosed and understood. One example of this is to ensure that a vulnerable service, where the risk has been mitigated through the use of firewalling, is not accessible from untrusted networks.
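A check of this kind can be sketched in a few lines. This is a minimal illustration, not the procedure used in the research: the function name is hypothetical, the address and port in the comment are placeholders, and the check is only meaningful when run from a host that actually sits on the untrusted network.

```python
import socket


def is_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, filtered (timed out) or unroutable: treat as not reachable.
        return False


# Hypothetical usage, run from a host on the untrusted network:
#   is_reachable("203.0.113.10", 1433)
# should return False if the documented firewall rule is doing its job.
```

A negative result only shows that the service was not reachable from that one vantage point at that moment, so the result should be recorded alongside the documented exception rather than used in place of a review of the firewall rules themselves.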

Grey box testing is the most commonly employed external testing methodology. An example would be an external test of a web server where the tester was informed of a firewall system but not of the Honeypots, NIDS (Network Intrusion Detection System) and HIDS (Host-Based Intrusion Detection System) which were deployed.

Grey box tests are more likely to occur than “white box” tests due to budgetary or contractual constraints. It is not often that an organisation is willing to check all the controls that they have in place to a high degree, rather trusting in selective products and concentrating their efforts on the controls they have the least confidence in.

This testing methodology is used as it requires a lower level of skill from the tester and is generally less expensive. Grey box tests usually rely on the testing of network controls or user level controls to a far greater degree than “white box” tests do. Both Black and Grey box tests have a strong reliance on tools, reducing the time and knowledge requirements for the tester.
The prevalence of tools based tests generally limits the findings to well known vulnerabilities and common mis-configurations and is unlikely to determine many serious systems flaws within the timeframe of the checking process.

Black box testing (commonly also known as “hacker testing”) is conducted with little or no initial knowledge of the system. In this type of test the party testing the system is expected to determine not only the vulnerabilities which may exist, but also the systems that they have to check! This methodology relies heavily on tools based testing – far more so than Grey box tests.
One of the key failures of black box testing is the lack of a correctly determined fault model. The fault model is a list of things that may go wrong. For example, a valid fault model for an IIS web server could include attacks against the underlying Microsoft operating system, but would likely exclude Apache web server vulnerabilities.
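The idea of a fault model can be made concrete with a small sketch. Everything named here is hypothetical: the `Check` type, the component labels and the three example checks are invented for illustration and do not correspond to any particular scanner or to the tests run in the research.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List


@dataclass(frozen=True)
class Check:
    """A single candidate test, tagged with the component it targets."""
    name: str
    component: str


def build_fault_model(components: Iterable[str]) -> FrozenSet[str]:
    """The fault model here is simply the set of components known to be present."""
    return frozenset(c.lower() for c in components)


def in_scope(checks: Iterable[Check], fault_model: FrozenSet[str]) -> List[Check]:
    """Keep only the checks that can go wrong on this system."""
    return [c for c in checks if c.component.lower() in fault_model]


# Hypothetical candidate checks.
candidates = [
    Check("IIS request handling flaw", "iis"),
    Check("Windows privilege escalation", "windows"),
    Check("Apache module flaw", "apache"),
]

# Fault model for an IIS web server on a Microsoft operating system.
model = build_fault_model(["IIS", "Windows"])
selected = [c.name for c in in_scope(candidates, model)]
print(selected)
```

A black box tester has no such list to begin with and must either guess the components or run every check, which is where much of the limited testing time goes.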

In black box tests “you rarely get high coverage of any nontrivial function, but you can test for things like input overflows (for example, by sending enormous input strings) and off-by-one errors (by testing on each side of array size boundaries when you know them, for example), and so forth”[i]
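The two kinds of test named in the quotation, oversized inputs and values on each side of a known size boundary, can be sketched as follows. This is a toy harness under stated assumptions: the capacity of 16, the oversize factor and both target functions are hypothetical stand-ins for a real system under test.

```python
from typing import Callable, Dict, List


def boundary_lengths(capacity: int, oversize_factor: int = 1000) -> List[int]:
    """Lengths on each side of a known size boundary, plus one enormous input."""
    return [max(capacity - 1, 0), capacity, capacity + 1, capacity * oversize_factor]


def make_inputs(capacity: int) -> List[str]:
    """Build one test string per boundary length."""
    return ["A" * n for n in boundary_lengths(capacity)]


def probe(target: Callable[[str, int], bool], capacity: int) -> Dict[int, bool]:
    """Map each input length to whether the target accepted that input."""
    return {len(value): target(value, capacity) for value in make_inputs(capacity)}


def correct_check(value: str, capacity: int) -> bool:
    """Stand-in target with a correct bounds check."""
    return len(value) <= capacity


def off_by_one_check(value: str, capacity: int) -> bool:
    """Stand-in target with a deliberate off-by-one error."""
    return len(value) <= capacity + 1


print(probe(correct_check, 16))
print(probe(off_by_one_check, 16))
```

The faulty target is exposed only by the input one character over the boundary, which is why the tester needs to know the size in the first place, and that is exactly the knowledge a blind test lacks.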
After all is said and done, most vendors do not even do the above tests. Rather they rely on a toolset of programs such as NMAP or Nessus to do the work for them.
“It is often stated as an axiom that protection can only be done right if it's built in from the beginning. Protection as an afterthought tends to be very expensive, time-consuming, and ineffective”[ii]
Commonly, after all this (even using white box testing techniques), the tester will not always find all the intentionally placed vulnerabilities in a system.

What is an Audit? (Or what should an audit be)
An IT audit is a test of the controls in place on a system. An audit should always find more exposures than an “ethical attack” due to the depth it should cover. The key to any evaluation of an audit is that phrase: “the depth it should cover”. Again, budgetary and skills constraints affect the audit process.

The level of skills and knowledge of audit staff on selective systems will vary. The audit program that is developed will thus also vary, based on the technical capabilities of the auditor on the systems they are evaluating. Further, the levels of knowledge held by the auditor or staff collecting the information needed to complete the audit will also affect the result.

One of the key issues in ensuring the completeness of an audit is that the audit staff are adequately trained both in audit skills as well as in the systems they have to audit. It is all too common to have auditors involved in router and network evaluations who have never been trained nor have any practical skills in networking or any network devices.

Often it is argued that a good checklist developed by a competent reviewer will make up for the lack of skills held by the work-floor audit member, but this person is less likely to know when they are not being entirely informed by the organisation they are meant to audit. Many “techies” will find great sport in feeding misinformation to an unskilled auditor leading to a compromise of the audit process. This of course has its roots in the near universal mistrust of the auditor in many sections of the community.

It needs to be stressed that the real reason for an audit is not the allocation of blame, but as a requirement in a process of continual improvement. One of the major failings in an audit is the propensity for organisations to seek to hide information from the auditor. This is true of many types of audit, not just IT.

For both of the preceding reasons it is important to ensure that all audit staff have sufficient technical knowledge and skills to both ensure that they have completed the audit correctly and to be able to determine when information is withheld.

From this table, it is possible to deduce that a report of findings issued from the penetration test would be taken to be significant when presented to an organisation’s management. Without taking reference to either the audit or the control results as to the total number of vulnerabilities on a system, the penetration test would appear to provide valuable information to an organisation.

However, when viewed against the total number of vulnerabilities that may be exploited on the system, the penetration test methodology fails to report a significant result. Of primary concern, the penetration test reported only 13.3% of the total number of high-level vulnerabilities that may be exploited externally on the test systems. Compared to the system audit, which reported 96.7% of the externally exploitable high-level vulnerabilities on the system, the penetration test methodology has been unsuccessful.
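The percentages above are simple detection rates: vulnerabilities reported divided by vulnerabilities present. The sketch below shows the arithmetic only. The underlying counts are not given in this post, so the figures used (30 present, 4 and 29 reported) are assumptions chosen solely because they reproduce the quoted percentages.

```python
def detection_rate(reported: int, present: int) -> float:
    """Percentage of the vulnerabilities present that a method reported."""
    if present <= 0:
        raise ValueError("present must be a positive count")
    return round(100.0 * reported / present, 1)


# Assumed counts for illustration only; not taken from the results table.
HIGH_LEVEL_PRESENT = 30
PEN_TEST_REPORTED = 4
AUDIT_REPORTED = 29

pen_test_rate = detection_rate(PEN_TEST_REPORTED, HIGH_LEVEL_PRESENT)
audit_rate = detection_rate(AUDIT_REPORTED, HIGH_LEVEL_PRESENT)
print(pen_test_rate, audit_rate)  # 13.3 96.7
```

The point of the calculation is the denominator: a penetration test report on its own supplies only the numerator, so its findings look significant until they are set against what was actually present.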

External penetration testing is less effective than an IT audit
To demonstrate that an external penetration test is less effective than an audit, it is essential to show that both the number of high-level vulnerabilities detected and the total number of vulnerabilities discovered by the penetration test are significantly lower than those discovered during an audit.

Figure 1 - Graph of Vulnerabilities found by Test type
As may be seen in Figure 1 - Graph of Vulnerabilities found by Test type and Figure 2 - Graph of Vulnerabilities found by exploit type, both the total number of vulnerabilities discovered and the number of high-level vulnerabilities are appreciably lower in the penetration test results than in the audit results.

Figure 2 - Graph of Vulnerabilities found by exploit type

The primary indicator of the success of the penetration test would be both the detection of high-level vulnerabilities and the detection of a large number of vulnerabilities overall.

It is clear from Figure 3 - Graph of Vulnerabilities that the penetration test methodology reported a smaller number of exploitable external vulnerabilities, both as a whole and when comparing only the high-level vulnerability results.

Figure 3 - Graph of Vulnerabilities

It is not all Bad News
The key is sufficient planning. When an audit has been developed sufficiently, it becomes both a tool to ensure the smooth operations of an organisation and a method to understand the infrastructure more completely. Done correctly, an audit may be more than a tool for pointing out vulnerabilities to external “hackers”. It may be used within an organisation to simultaneously gain an understanding of the current infrastructure and associated risks and to produce a roadmap towards where an organisation needs to be.

A complete audit will give more results and, more importantly, is more accurate than any external testing. The excess data needs to be viewed critically at this point, as not all findings will be ranked to the same level of import. This is where external testing can be helpful.
After the completion of the audit and verification of the results, an external (preferably white box) test may be conducted to help prioritise the vulnerable parts of a system. This is the primary area where external testing has merit.

“Blind testing” by smashing away randomly does not help this process. The more details an auditor has, the better they may do their role and the lower the risk.


Just as Edsger W. Dijkstra in his book “A Discipline of Programming” denigrates the concept of "debugging" as being necessitated by sloppy thinking, so too may we relegate external vulnerability tests to the toolbox of the ineffectual security professional.

In his lecture, "The Humble Programmer", Edsger W Dijkstra argued:
"Today a usual technique is to make a program and then to test it. But: program testing can be a very effective way to show the presence of bugs, but it is hopelessly inadequate for showing their absence. The only effective way to raise the confidence level of a program significantly is to give proof for its correctness. But one should not first make the program and then prove its correctness, because then the requirement of providing the proof would only increase the poor programmers’ burden. On the contrary: the programmer should let correctness proof and program to go hand in hand..."

Just as in program development, where the best way of avoiding bugs is to formally structure development, systems design and audit need to be structured into the development phase rather than testing for vulnerabilities later.

It is necessary that the computer industry learns from the past. Dijkstra asserted that "the competent programmer is fully aware of the strictly limited size of his own skull; therefore he approaches the programming task in full humility, and among other things he avoids clever tricks like the plague...”[iv]. Similarly, security professionals, including testers and auditors, need to be aware of their limitations. Clever tricks and skills in the creation of popular “hacker styled” testing are not effective.

As the market potential has grown, unscrupulous vendors have been quoted overemphasising dangers to expand their customer base and, in some cases, selling products that may actually introduce more vulnerabilities than they protect against.

External testing is an immense industry. This needs to change. It is about time we started securing systems and not just reaping money in from them using ineffectual testing methodologies.


An audit is not designed to allocate blame. It is necessary that as many of the vulnerabilities affecting a system as possible are diagnosed and reported. The evidence clearly supports the assertion that external penetration testing is an ineffective method of assessing system vulnerabilities.

In some instances, it will not be possible or feasible to implement mitigating controls for all (even high-level) vulnerabilities. It is crucial however that all vulnerabilities are known and reported in order that compensating controls may be implemented.

The results of the experiment categorically show the ineffectiveness of vulnerability testing by "ethical attacks". This ineffectiveness makes the implementation of affected controls and countermeasures ineffective.

This type of testing results in an organisation's systems being susceptible and thus vulnerable to attack. The results of this experiment strongly support not using "ethical attacks" as a vulnerability reporting methodology.

The deployment of a secure system should be one of the goals in developing networks and information systems in the same way that meeting system performance objectives or business goals is essential in meeting an organisation’s functional goals.


I would like to thank Sonny Susilo for his help on this experiment and BDO for their support. In particular I would like to thank Allan Granger from BDO for his unwavering belief in this research.

Web Sites and Reference
S.C.O.R.E. – a standard for information security testing
The Auditor security collection is a Live-System based on KNOPPIX
Nessus is an Open Source Security Testing toolset

In support of the assertions made within this paper, experimental research was conducted. The paper from this research has been completed and is available to support these assertions. First, the system tested is detailed as per the results of an audit. Next, a scan of the system is completed as a Black, Grey and White box external test.

The results of these tests below support the assertions made in this paper. The configuration of the testing tool has been tailored based on the knowledge of the systems as supplied.

[ii] “Fred Cohen”
[iii] Fred Cohen,
[iv] Edsger W Dijkstra, EWD 340: The humble programmer published in Commun. ACM 15 (1972), 10: 859–866.

1 comment:

Craig S Wright said...

And if you are interested, here is the research I used in this paper:

Abbott, R.P., "Security Analysis and Enhancement of Computer Operating Systems," NBSIR 76-1041, National Bureau of Standards, ICST, Gaithersburg, MD., April 1976.
Anderson, Alison, and Michael Shain. "Risk Management." Information Security Handbook. Ed. William Caelli, Dennis Longley and Michael Shain. 1 ed. New York City: Stockton Press, 1991. 75-127.
Anderson, James P. (1972). Computer Security Technology Planning Study. Technical Report ESD-TR-73-51, Air Force Electronic Systems Division, Hanscom AFB, Bedford, MA. (Also available as Vol. I, DITCAD-758206. Vol II, DITCAD-772806).
Attanasio, C.R., P.W. Markstein and R.J. Phillips, "Penetrating an Operating System: A Study of VM/370 Integrity," IBM Systems Journal, Vol. 15, No. 1, 1976, pp. 102-106.
Bauer, D.S. and M.E. Koblentz, "NIDX - A Real-Time Intrusion Detection Expert System," Proceedings of the Summer 1988 USENIX Conference, June 1988
Bishop, M., "Security Problems with the UNIX Operating System," Computer Science Department, Purdue University, West Lafayette, Indiana, March 1982.
Bull, A., C.E. Landwehr, J.P. McDermott and W.S. Choi, "A Taxonomy of Computer Program Security Flaws," Center for Secure Information Technology, Naval Research Laboratory, draft in preparation, 1991.
Cohen, Fred, “Protection Testing”, September 1998
Dias, G.V., et. al., "DIDS (Distributed Intrusion Detection System) - Motivation, Architecture, and An Early Prototype," Proceedings of the 14th National Computer Conference, Washington, D.C., October 1991, pp. 167-176.
Dijkstra, Edsger W. (1976). A Discipline of Programming. Englewood Cliffs, NJ: Prentice Hall
Farmer, D. and E.H. Spafford, "The COPS Security Checker System," CSD-TR-993, Department of Computer Sciences, Purdue University, West Lafayette, Indiana, 1990. (Software available by anonymous ftp from
Farmer, Dan, and Wietse Venema. "Improving the Security of Your Site by Breaking into It." 1992
Garfinkel, S., G. Spafford, Practical Unix Security, O'Reilly & Associates, Inc., Sebastopol, CA., 1991.
Gasser, M., Building a Secure Computer System, Van Nostrand Reinhold, New York, N.Y., 1988.
Gupta, S. and V.D. Gligor, "Towards a Theory of Penetration-Resistant Systems and its Application," Proceedings of the 4th IEEE Workshop on Computer Security Foundations, Franconia, N.H., June 1991, pp. 62-78.
Gupta, S. and V.D. Gligor, "Experience With A Penetration Analysis Method And Tool," Proceedings of the 15th National Computer Security Conference, Baltimore, MD., October 1992, pp. 165-183.
Hamlet, Richard (1989). Testing for Trustworthiness. In J. Jackey and D. Schuler (Eds.), Directions and Implications of Advanced Computing, pp. 97-104. Norwood, NJ: Ablex Publishing Co.
Myers, Phillip (1980). Subversion: The Neglected Aspect of Computer Security. MS thesis, Naval Postgraduate School, Monterey, CA.
Harris, Shon; Harper, Allen; Eagle, Chris; Ness, Jonathan; Lester, Michael, 2004, “Gray Hat Hacking: The Ethical Hacker's Handbook”, McGraw-Hill Osborne Media; 1st edition (November 9, 2004).
Hollingworth, D., S. Glaseman and M. Hopwood, "Security Test and Evaluation Tools: an Approach to Operating System Security Analysis," P-5298, The Rand Corporation, Santa Monica, CA., September 1974.
Humphrey, Christopher; Jones, Julian; Khalifa, Rihab; Robson, Keith; “Business Risk Auditing And The Auditing Profession: Status, Identity And Fragmentation”, Manchester School of Accounting and Finance, CMS 3 Conference Paper 2003.
Intal, Tiina and Do, Linh Thuy; “Financial Statement Fraud - Recognition of Revenue and the Auditor’s Responsibility for Detecting Financial Statement Fraud”, Göteborg University, Masters Thesis, 2002, ISSN 1403-851X
Winkler, Ira, “Audits, Assessments & Tests (Oh, My)”, Corporate Espionage (Prima, 2nd ed., 1999).
Jacobson, Robert V. "Risk Assessment and Risk Management." Computer Security Handbook. Ed. Seymour Bosworth and M. E. Kabay. 4 ed. New York: Wiley, 2002. 47.1-47.16.
Jones, Andrew. "Identification of a Method for the Calculation of Threat in an Information Environment."
Kramer, John B. “The CISA Prep Guide: Mastering the Certified Information Systems Auditor Exam”, John Wiley and Sons, USA 2003. ISBN:0471250325
Linde, R.R., "Operating System Penetration," Proceedings of the National Computer Conference, Vol. 44, AFIPS Press, Montvale, N.J., 1975.
Parker, D.B., "Computer Abuse Perpetrators and Vulnerabilities of Computer Systems," Stanford Research Institute, Menlo Park, Ca., December 1975.
Phillips, R. , "VM/370 Penetration Study Final Report," TM(L)-5196/006/00, System Development Corporation, Santa Monica, CA., October 1973.
Rub, J.W., "Penetration Handbook," The Aerospace Corporation, El Segundo, CA., January 1986.
Sanchez, Luis, et al. "Requirements for the Multidimensional Management and Enforcement (MSME) System."
Security Concepts for Distributed Component Systems. 1. Ed. Walt Smith. 16 June 1998. NIST. 14 Nov 2003 (page 53)
Security Tracker Statistics. 2002. LLC. 23 October 2003
Siebenlist, Frank. CORBA-SECURITY-Glossary. 14 Nov 2003
Stanger, James; Lane, Patrick T. and Crothers, Tim, “CIW: Security Professional Study Guide”, Sybex USA, 2002, ISBN:078214084x
Stephenson, Peter. Modeling of Post-Incident Root Cause Analysis. “International Journal of Digital Evidence” Volume 2, Issue 2.
Stephenson, Peter, Forensic Analysis of Risks in Enterprise Systems, Center for Regional and National Security, Eastern Michigan University
Sterne, Daniel F. (1991). On the Buzzword "Security Policy". In Proceedings 1991 IEEE Symposium on Research in Security and Privacy, Oakland, pp. 219-230. Los Alamitos, CA: IEEE Computer Society Press.
Thompson, Kenneth (1984). Reflections on Trusting Trust. Communications of the A.C.M. 27(8), 761-763.
Weissman, Clark., "Penetration Testing," Information Security Essays, Abrams, M.D., S. Jajodia, H. Podell, eds., IEEE Computer Society Press, 1994
Weissman, Clark (1995). Penetration Testing. In M. Abrams, S. Jajodia, and H. Podell (Eds.), Information Security: An Integrated Collection of Essays, pp. 269-296. Los Alamitos, CA: IEEE Computer Society Press.
van Wyk, Kenneth, “Finding the Elusive Value in Penetration Testing”, August 11th 2004