Thursday, 1 July 2010

The fallacy inherent in the "Black Swan" Theory

The "Black Swan" theory so elegantly proposed by Taleb [1] asserts that risk is an unpredictable function which collapses under tong-tail events. This is based on a certain formulation of induction and the inductive process itself. The problem here is not induction per se, but a misapplication of the same. The following quote which is used as a foundation to the theory demonstrates the flaw as it can not be validly induced:

  • Every swan I’ve ever seen has been white, therefore all swans are white

To be logically correct and properly inductive, this quote should instead read in a manner similar to the one below:

  • Every swan I’ve ever seen in the limited locations I have visited has been white; therefore it is highly probable that all swans in the given locations are white, and I can state this with 95% certainty.

As I noted already, the former statement is logically unsound. It is not inductive (nor is it deductive). For the statement to be deductive, the viewer would have to have seen all swans that have ever existed, currently exist, or will exist at any time in the future. This is, of course, the reason why little if any valid deductive thought survives outside of philosophy, except under the most extreme of unsupported assumptions.

Science has a strong reliance on induction for this reason. It assumes that a hypothesis is valid within a set uncertainty range. The latter statement above still allows for risk, and we also have to have a set of processes based on statistically sound methods. We can further insure against the losses that are due to uncertainty and distribute these across market segments.
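
To make that "set uncertainty range" concrete, here is a minimal Python sketch (my own illustration, not taken from Taleb or the post) of the classic "rule of three": if all n observed swans are white, the exact one-sided 95% upper bound on the proportion of non-white swans in the sampled population solves (1 - p)^n = 0.05, which is roughly 3/n for large n.

    # Upper confidence bound on non-white swans after n all-white observations.
    # Exact bound: solve (1 - p)^n = alpha for p, i.e. p = 1 - alpha**(1/n).
    def upper_bound_nonwhite(n: int, confidence: float = 0.95) -> float:
        """One-sided upper bound on the proportion of non-white swans."""
        alpha = 1.0 - confidence
        return 1.0 - alpha ** (1.0 / n)

    for n in (10, 100, 1000):
        p = upper_bound_nonwhite(n)
        print(f"n={n:5d} all white -> p(non-white) <= {p:.4f} "
              f"(rule-of-three approximation: {3 / n:.4f})")

Note that the bound applies only to the population actually sampled; it says nothing about unvisited locations, which is exactly the limit the corrected statement above respects.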

A poor comprehension of inductive processes and their meaning does not make these errors logically valid. If we use the more correct inductive process, we have a means of formulating risk. To make this exercise truly scientifically valid, we have to go further than the simplified inductive statement given above as the more correct formulation: we need to report our data (and methodology, but I will leave that for another time).

For instance, if our hypothetical observer were truly a scientist, and not simply a philosopher with little (if any) grounding in logic, the swan case would need to be reported in a manner such as the following (here I am assuming that a valid methodology was used for the collection of the data, and for the sake of brevity I am ignoring the error rates in this process, which would also have to be incorporated for the result to be accurate):
  • I have noted 209 swans from 20 similar pools. These pools all have similar temperatures and ecological structures and hold similar environmental conditions (as described in the appendix). Each of the pools held 10 adult swans [with 95% confidence, similar ponds would hold between 9 and 11 swans]. The ponds were all sampled in the first week of spring. All ponds were within a 250 radius of a defined central point. Every swan noted in this experiment was white. Therefore it is highly probable that all swans in the given locations (latitude and longitude) which experience similar environments are white (at least in the periods observed, as no tests of whether swans change color at other times of the year have been conducted), and I can state this with 95% certainty.
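
As an illustration of where such a bracketed interval can come from, the sketch below computes a 95% prediction interval for the adult-swan count at a similar pond. The counts are invented for the example (the post does not publish its raw data) and the t critical value is hard-coded for 19 degrees of freedom.

    import statistics

    # Hypothetical adult-swan counts at the 20 sampled ponds (assumed data).
    counts = [10, 9, 11, 10, 10, 9, 11, 10, 10, 11,
              9, 10, 10, 11, 9, 10, 10, 10, 11, 9]

    n = len(counts)
    mean = statistics.mean(counts)      # 10.0
    s = statistics.stdev(counts)        # sample standard deviation
    t_crit = 2.093                      # t(0.975, df = n - 1 = 19)

    # 95% prediction interval for one new, similar pond:
    # mean +/- t * s * sqrt(1 + 1/n)
    half = t_crit * s * (1 + 1 / n) ** 0.5
    print(f"mean = {mean:.1f} swans/pond, "
          f"95% prediction interval = ({mean - half:.1f}, {mean + half:.1f})")

A prediction interval (for the count at one new pond) is the right object here, rather than a confidence interval for the mean, because the claim concerns what similar ponds would hold.
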
The issue with risk is not the uncertainty of a calculation, nor the long tail of many distributions. Rather, it is a combination of a lack of scientific rigor and a willingness to overlook uncertainty rather than account for it.

If this principle had been applied to financial markets (and they had been required to insure, either self-insure or via market methods), the error rates of the copula and other algorithmic methods would have required a 1-2% risk assignment to account for error (this was simply ignored). Had this occurred, around $16 trillion would have been available to cover the losses and no government intervention would have been needed.
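
The arithmetic behind such a reserve is simple multiplication. In the sketch below the notional exposure is my assumption for illustration, not a figure from the post, chosen so that a mid-range loading reproduces a reserve of the order described.

    # Reserve needed to self-insure model (e.g. copula) error:
    # reserve = notional exposure x error loading.
    def required_reserve(notional_exposure: float, error_loading: float) -> float:
        return notional_exposure * error_loading

    notional = 1.1e15   # assumed: roughly $1.1 quadrillion gross notional
    for loading in (0.01, 0.015, 0.02):
        reserve = required_reserve(notional, loading)
        print(f"loading {loading:.1%}: reserve = ${reserve / 1e12:.1f} trillion")

At a 1.5% loading this yields about $16.5 trillion, in line with the figure above.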

Too big to fail, and the uncertainty as to who receives this treatment, lead to perverse incentives for banks to ignore risk. This does not mean that risk cannot be assessed.

[1] N.N. Taleb, The Black Swan, Penguin, 2007

Wednesday, 30 June 2010

(Ir)responsible disclosure

"Responsible disclosure" (RD) is (as has been noted) loaded. Without an obligation to withhold (such as a contract to a single party), there is no responsibility to not disclose. Overall, when looking at the costs, RD is particularly irresponsible.


The issue is that an unpatched vulnerability has a direct impact on the vendor's share price. Vendors do not want early disclosure, as it has an immediate financial impact on them; in the case of a large firm (such as Microsoft) this can amount to several hundred million dollars. Early release is thus a strong incentive (especially for vendors such as Oracle and Adobe) to produce fewer vulnerabilities and to patch faster.

Most vulnerabilities are in fact known well before they are publicly released, which means they are being exploited well before they are released. Patching is also rarely the only option, and it is not always an available or desirable solution; with knowledge of a vulnerability, workarounds may suffice. This, of course, requires knowledge. Many penetration-testing firms do not want this, as they have a strong incentive to keep vulnerabilities unpatched: a vulnerability that is not common, but which is known to a specialized group, is unlikely to be fixed at a client site yet may well be known to the tester, who then appears more valuable due to this imperfect information. So many security testers also have a strong incentive to disclose later. The client has an incentive to receive information early, but under RD the vendor's clients are the ones without the power to achieve this. This follows from treating the patch as the only strategy for fixing the problem, when there are more ways to correct a vulnerability than simply patching.

The answer is to look at incentives. Market-based systems for disclosure (real systems, not the CERT ones, but an actual open market) allow more researchers to take part, as they can be paid for their efforts. These researchers can also specialize: if they are good enough at reversing and finding problems, they do not need to worry about being hired by a large firm.

Releasing early via a market is the best overall strategy for the largest number of parties, and it is the most cost-effective solution economically. The problem is that there are too many insiders with too many vested interests in the status quo.

The creation of a security and risk derivative should change this. The user would have an upfront estimate of the costs, and this could be forced back onto the software vendor. Where the derivative costs more than testing, the vendor would conduct more in-depth testing and reduce the level of bugs. This would most likely lead to product differentiation (as occurred in the past with Windows 95/Windows NT). Those businesses willing to pay for security could receive it; those wanting features would get what they asked for.
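
A minimal sketch of that decision follows; every parameter is hypothetical. Price the derivative as the expected exploit loss, and the vendor tests further only when the premium saved exceeds the cost of the extra testing.

    # Actuarially fair premium for cover against an exploit.
    def derivative_premium(p_exploit: float, loss_given_exploit: float) -> float:
        return p_exploit * loss_given_exploit

    p_exploit = 0.08                 # assumed annual exploit probability, untested
    p_after_testing = 0.02           # assumed residual probability after testing
    loss_given_exploit = 5_000_000   # assumed loss if exploited ($)
    testing_cost = 250_000           # assumed cost of in-depth testing

    premium_now = derivative_premium(p_exploit, loss_given_exploit)
    premium_tested = derivative_premium(p_after_testing, loss_given_exploit)
    saving = premium_now - premium_tested

    print(f"premium untested: ${premium_now:,.0f}")
    print(f"premium tested:   ${premium_tested:,.0f}")
    print(f"testing pays off: {saving > testing_cost}")

With these numbers the premium falls from $400,000 to $100,000, so the $250,000 of extra testing pays for itself, which is precisely the incentive the derivative is meant to create.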

For more reading see:
[ 1] Arora, A. & Telang, R. (2005), “Economics of Software Vulnerability Disclosure”, IEEE Security and Privacy, 3 (1), 20-2
[ 2] Arora, A., Telang, R. & Xu, H. (2004) “Optimal Time Disclosure of Software Vulnerabilities”, Conference on Information Systems and Technology, Denver CO, October 23-2
[ 3] Arora, A., Telang, R. & Xu, H. (2008), “Optimal Policy for Software Vulnerability Disclosure”, Management Science 54(4), 642-6
[ 4] Bacon, D. F., Chen, Y., Parkes, D., & Rao, M. (2009). A market-based approach to software evolution. Paper presented at the Proceeding of the 24th ACM SIGPLAN conference companion on Object oriented programming systems languages and applications.
[ 5] Beach, J. R., & Bonewell, M. L. (1993). Setting-up a successful software vendor evaluation/qualification process for `off-the-shelve' commercial software used in medical devices. Paper presented at the Computer-Based Medical Systems, 1993. Proceedings of Sixth Annual IEEE Symposium on.
[ 6] Brookes, F. (1995) “The Mythical Man-Month”. Addison-Wesley
[ 7] Campodonico, S. (1994). A Bayesian Analysis of the Logarithmic-Poisson Execution Time Model Based on Expert Opinion and Failure Data. IEEE Transactions on Software Engineering, 20, 677-683.
[ 8] Cavusoglu, H., Cavusoglu, H. & Zhang, J. (2006) Economics of Security Patch Management, The Fifth Workshop on the Economics of Information Security (WEIS 2006)
[ 9] Cohen,. J. (2006). Best Kept Secrets of Peer Code Review (Modern Approach. Practical Advice.). Smartbearsoftware.com
[ 10] de Villiers, M (2005) “Free Radicals in Cyberspace, Complex Issues in Information Warefare” 4 Nw. J. Tech. & Intell. Prop. 13, http://www.law.northwestern.edu/journals/njtip/v4/n1/2
[ 11] Dijkstra, E. W. (1972). “Chapter I: Notes on structured programming Structured programming” (pp. 1-82): Academic Press Ltd.
[ 12] Kannan K & R Telang (2004) ‘Market for Software Vulnerabilities? Think Again.’ Management Science.
[ 13] Mills, H. D. (1971) "Top-down programming in large systems", Debugging techniques in large systems, R. Rustin Ed., Englewoods Cliffs, N.J. Prentice-Hall
[ 14] Murphy, R. & Regnery, P. (2009) “The Politically Incorrect Guide to the Great Depression and the New Deal”.
[ 15] Nissan, N., Roughgarden, T., Tardos, E. & Vazirani, V. (Eds.) (2007) “Algorithmic Game Theory” Cambridge University Press, {P14, Pricing Game; P24, Algorithm for a simple market; P639 Information Asymmetry).
[ 16] Nizovtsev, D., & Thursby, M. (2005) “Economic analysis of incentives to disclose software vulnerabilities”. In Fourth Workshop on the Economics of Information Security.
[ 17] Ounce Labs, 2. http://www.ouncelabs.com/about/news/337-the_cost_of_fixing_an_application_vulnerability
[ 18] Ozment, A. (2004). “Bug auctions: Vulnerability markets reconsidered”. In Third Workshop on the Economics of Information Security
[ 19] Perrow, C. (1984/1999). Normal Accidents: Living with High-Risk Technologies, Princeton University Press.
[ 20] Telang, R., & Wattal, S. (2004) “Impact of Software Vulnerability Announcements on the Market Value of Software Vendors – an Empirical Investigation” http://infosecon.net/workshop/pdf/telang_wattal.pdf
[ 21] Turing, A (1936), “On computable numbers, with an application to the Entscheidungsproblem”, Proceedings of the London Mathematical Society, Series 2, 42 pp 230–265
[ 22] Weigelt, K. & Camerer, C. (1988) “Reputation and Corporate Strategy: A Review of Recent Theory and Applications” Strategic Management Journal, Vol. 9, No. 5 (Sep. - Oct., 1988), pp. 443-454, John Wiley & Sons
[ 23] Donald, D. (2006), "Economic Foundations of Law and Organization" Cambridge University Press

Please note that the model tested against a CERT by Arora, Telang and Xu uses a game-theoretic pricing game [15]. This model has players in the market who do not report their prices. These players use a model in which information is distributed simultaneously to the player's client and to the vendor. The CERT model was touted as optimal, yet it relies on waiting until a patch has been publicly released and only then releasing the vulnerability details to the public. This ignores many externalities and assumes the only control is a patch, in place of other alternative compensating controls.
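
As a toy illustration of such a pricing game (this is not the Arora, Telang and Xu model itself), the sketch below has two vulnerability sellers set prices simultaneously, with the buyer taking the cheaper offer, and searches a discrete price grid for pure-strategy Nash equilibria.

    from itertools import product

    PRICES = [1, 2, 3, 4, 5]   # assumed discrete price levels

    def payoff(own: int, other: int) -> float:
        """Seller's payoff: win at your own price if cheaper, split ties."""
        if own < other:
            return float(own)
        if own == other:
            return own / 2
        return 0.0

    equilibria = []
    for p1, p2 in product(PRICES, repeat=2):
        best1 = all(payoff(p1, p2) >= payoff(q, p2) for q in PRICES)
        best2 = all(payoff(p2, p1) >= payoff(q, p1) for q in PRICES)
        if best1 and best2:
            equilibria.append((p1, p2))

    print("Pure-strategy Nash equilibria:", equilibria)

On this grid the equilibria sit at the bottom of the price range, as mutual undercutting drives prices toward the floor; escaping exactly this dynamic is one reason sellers prefer channels where prices are not reported.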

Consequently, the examined "market" model is itself sub-optimal.