Saturday, 6 September 2008

Weekend Images

Another post with some of the local wildlife at the farm.
The male and female king parrots are about now that we are coming into spring.
The azalea is in bloom, which makes the wallabies happy, as they eat the flowers.
More parrots.
And a small community picture.


Thursday, 4 September 2008

My next course

As it received the most votes, I am enrolling in my fourth CSU IT Masters degree starting next year. I will be doing the .NET programming stream - Master of Systems Development (MSysDev) - http://www.itmasters.com.au/microsoft_mcsd_stream.htm

In two years (2011) I shall start the Masters degree in database design.

On a related note, the Masters degree I have been proposing in digital forensics looks as if it will be going ahead. The first intake will be in January 2010.

Finding more

This is a note to point readers to another forum that I am writing on. I have been selected as an author for the SANS forensics blog, which is hosted on WordPress.

I am still continuing this blog, but I am also starting a column covering mobile device forensics and malware issues.

Wednesday, 3 September 2008

The next tool - DD

I recently profiled the tool netcat. The next on the list is dd. Over the next few days I shall be profiling this versatile tool.

dd is a program found on most UNIX and Linux (and even a few Windows) systems. Its chief purpose is the low-level copying and converting of data at the bit level. It is a common tool in many forensic and incident response engagements.

The name dd is generally taken to come from the "Data Definition" statement in IBM's JCL. In this series of posts I shall cover all the standard uses as well as many unusual ones, including its use as a steganographic tool (a very good one), as an encryption engine with a set key, and many other uses that you probably never thought of for a copy tool.

Watch this space over the next few days for details.

Benford's Law in "R"

The following is a package to run a Benford's Law analysis using the R statistical package. I first wrote this as a project with the University of Newcastle; it has since been updated with help from staff and will continue to develop.
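For reference, Benford's Law predicts that the leading digit d in many naturally occurring datasets appears with probability log10(1 + 1/d): around 30.1% of values lead with a 1, but only about 4.6% lead with a 9. The dbenford function below implements exactly this density.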

Package: BenfordsLaw
Version: 0.9
Date: 2006-11-01
Title: An Implementation of Benford's Law
Author: Craig Wright
Maintainer: Craig Wright
Depends: R (>= 2.3.0)
Suggests:
Description: The package provides a series of functions that calculate the
Distortion Factor (DF) and the Bayesian factor associated with the
functions, and plot the fit of the dataset against a Benford
distribution.
License: GPL version 2 or newer
URL: http://www.r-project.org, http://www.bdo.com.au
Packaged: Wed Jan 23 16:46:00 2008; 1179
Built: R 2.6.1; i386-pc-mingw32; 2008-01-23 16:46:01; windows


.packageName <- "BenfordsLaw"
########################################################################################
############################# Calculation of the distortion factor #####################
# See page 66 of Mark J. Nigrini (2000) Digital Analysis Using Benford's Law, 2nd Ed., Global Audit Publications, Vancouver
"x.coll" <-
function(x){
x.collapsed <- (10*x)/(10^as.integer(log10(x)))
return(x.collapsed)
}
# This adds another as.integer to the x.coll equation; this step is not described in the text
"x.coll2" <-
function(x){
x.collapsed <- (10*x)/(10^as.integer(log10(x)))
x.collapsed2 <- as.integer(x.collapsed)
return(x.collapsed2)
}

# Actual mean (AM) of the collapsed data
"AM" <-
function(x){
# Actual Mean of the data
x.collapsed <- x.coll(x)
AM.value <- sum(x.collapsed)/length(x)
return(AM.value)
}


# Expected mean
"EM" <-
function(x){
# Expected Mean of the Data as determined by a Benford Distribution
n <- length(x)
x.collapsed <- x.coll(x)
EM.value <- 90/(n*(10^(1/n)-1))
return(EM.value)
}

# Distortion factor
"DF" <-
function(x){
# Distortion Factor used to create a probability model
n <- length(x)
AM.value <-AM(x)
EM.value <-EM(x)
DF.value <- ((AM(x) - EM(x))/EM(x))
return(DF.value)
}

# Standard deviation of the distortion factor test
"SD.DF" <-
function(x){
# This is the approx. Standard Deviation of the DF which is expected
# if the distribution follows Benford’s Law
n <- length(x)
# Hey, I see hard coded numbers in use!!
SD.df <- 0.63825342/(n)^0.5
return(SD.df)
}

# Z-statistics for the Distortion Factor model
"BF.Z.statistic" <-
function(x) {
DF.value <- DF(x)
SD.df <- SD.DF(x)
Z.statistic <- DF.value/SD.df
return(Z.statistic)
}
# This gives you the p-value from the Distortion Factor model
# (note that the normal density at z, not a tail probability, is used here)
"BF.p.value" <- function(x)
{
bf.z <- BF.Z.statistic(x)
return(dnorm(bf.z, mean=0, sd=1, log = FALSE))
}
########################################################################################

"BF.Bayesian" <-
function(x,alpha=5) {
x <- x.coll2(x)
pi0 <- alpha/100
n <- length(x)
x.bar <- mean(x)
# for a large sample - the SD of the distribution will approx. the SD for the population
S <- sd(x)
#
theta.o <- 90/(n * (10^(1/n) - 1)) # also set as EM above
#
z.beysian <- sqrt(n)*(x.bar - theta.o)/S
#
b1 <- sqrt(1 + n)/exp(((z.beysian^2)/2)/(1+1/n))
#
b2 <-1/b1
#
alpha.0 <- (1 + ((1 - pi0)/pi0)*exp((z.beysian^2/2)/(1 + 1/n))/sqrt(1+n))
#
bayes <- c()
bayes$b1 <- b1
bayes$b2 <- b2
bayes$alpha <- alpha.0
return (bayes)
}

#######################################################################################

# The main function
#
"Benford.s.Law" <-
function(x,alpha=5) {
#
par(mfrow=c(2, 1))
BenfordObsExp1(x,alpha)
BenfordObsExp12(x,alpha)
bf.z <- BF.Z.statistic(x)
p.value <- dnorm((bf.z), mean=0, sd=1, log = FALSE)
# If p.value< alpha - possible fraud / risk indicated in dataset
ifelse(p.value>(alpha/100),
print("The dataset shows signs of non-conformance. The result is significant", width = 20),
print("The dataset conforms to Benford's Law and the expected values are within the confidence range as set.", width = 20)
)
b.value <- BF.Bayesian(x,alpha)
print("")
print("The Z Statistic is: ")
print(bf.z)
print("")
print("The P.Value is: ")
print(p.value)
print("The Bayseian Value is: ")
print(b.value)
}

################# Graphing function ################
# Helper functions for plot
"benford.bound.lower" <-
function(x,y,alpha=5){
n <- length(x)
# replace 1.96 with Z.alpha
p <- 1-(alpha/2)
q <- qnorm((1-(alpha/200)), mean=0, sd=1, log.p = FALSE)
benFreq.lower <- dbenford(y) - q*sqrt(dbenford(y)*(1- dbenford(y))/n) - (1/(2*n))
return(benFreq.lower)
}
"benford.bound.upper" <-
function(x,y,alpha=5){
n <- length(x)
# replace 1.96 with Z.alpha
p <- 1-(alpha/2)
q <- qnorm((1-(alpha/200)), mean=0, sd=1, log.p = FALSE)
benFreq.upper <- dbenford(y) + q*sqrt(dbenford(y)*(1- dbenford(y))/n) + (1/(2*n))
return(benFreq.upper)
}
################# Plotting first digit frequency #############

"BenfordObsExp1" <-
function(x, alpha=5){
# Modified from the Plot function of Keith Wright
data <- substitute(x)
n <- length(x)
# Peel off the first digit
x <- as.numeric(substring(formatC(x, format = 'e'), 1, 1))
obsFreq <- tabulate(x, nbins = 9)
benFreq <- dbenford(1:9)
benFreq.U <- benford.bound.upper(x, 1:9, alpha)
benFreq.L <- benford.bound.lower(x, 1:9, alpha)
plot(1:9, obsFreq/n, xlim = c(1, 9), ylim = c(0, max(obsFreq/n, benFreq, 0.350)), type = 'h',
main = paste("First Digit Benford's Law Plot for Alpha =", alpha),
sub = paste("(Confidence Intervals are Green, Expected is blue, Observed are Black)"),
xlab = "First Digits", ylab = " Proportions")
axis(1, at = 1:9)
points(1:9, benFreq, col = "black", pch = 23)
lines(spline(1:9, benFreq.U), col = "green", pch = 16)
lines(spline(1:9, benFreq), col = "blue", pch = 16)
lines(spline(1:9, benFreq.L), col = "green", pch = 16)
}

########### Plotting first and second digits frequency #########

"BenfordObsExp12" <-
function(x,alpha=5){
#
data <- substitute(x)
n <- length(x)
# Peel off the first and second digit
x.collapsed2 <- x.coll2(x)
x.tab<- tabulate(x.collapsed2-9)
obsFreq <- tabulate(x.collapsed2-9, nbins=90)
benFreq <- dbenford(10:99)
benFreq.U <- benford.bound.upper(x, 10:99, alpha)
benFreq.L <- benford.bound.lower(x, 10:99,alpha)
plot(10:99, obsFreq/n, xlim = c(10, 99), ylim = c(0, max(obsFreq/n, benFreq, 0.06)), type = 'h',
main = paste("Two Digit Benford's Law Plot for Alpha =", alpha),
sub = paste("(Confidence Intervals are Green, Expected is blue, Observed are Black)"),
xlab = "First Two Digits", ylab = "Proportions")
axis(1, at = 10)
lines(spline(10:99, benFreq.U), col = "green", pch = 16)
lines(spline(10:99, benFreq), col = "blue", pch = 16)
lines(spline(10:99, benFreq.L), col = "green", pch = 16)
}

##################################################################
# Mirrors the dnorm, pnorm, qnorm, rnorm naming conventions
# Helper functions for plot
# Expected proportion of the digit x, gives the density
"dbenford" <-
function(x){
log10(1 + 1/x)
}

# Cumulative probability, gives the distribution function
# Input: q vector of quantiles.
# digits: the fixed number of digits in the analysis
"pbenford" <-
function(q, digits = 1){
d <- digits
digits_range <- (10^(d-1)):(10^d -1)
cumprobs <- cumsum(dbenford(digits_range))
return(cumprobs[q])
}
# Quantile function
# Input: p vector of probabilities.
# digits: the fixed number of digits in the analysis
"qbenford" <-
function(p, digits = 1){
d <- digits
digits_range <- (10^(d-1)):(10^d -1)
cumprobs <- cumsum(dbenford(digits_range))
cumprobs[length(digits_range)] <- 1 # To fix a rounding error
quantiles <- sapply(p, function(x) {10^d - sum(cumprobs >= x)})
return(quantiles)
}
# Benford's Law, generate random deviates
# Input: n number of observations. If length(n) > 1, the length is taken to be the number required.
# digits: the fixed number of digits in the analysis
"rbenford" <-
function(n, digits = 1){
if ( length(n) > 1)
{
n <- length(n)
}
d <- digits
digits_range <- (10^(d-1)):(10^d -1)
sample( digits_range, size = n, replace = TRUE, prob = dbenford(digits_range))
}
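To see the package in action, here is a minimal usage sketch. It assumes the functions above have been saved and sourced; the file name and the simulated invoice amounts are my own illustration, not part of the package.

# Assumes the code above has been sourced, e.g. source("BenfordsLaw.R")
set.seed(42)
# Log-normal data is a classic example of a Benford-conforming dataset
invoices <- round(rlnorm(5000, meanlog = 8, sdlog = 1.2), 2)
Benford.s.Law(invoices, alpha = 5)
# The helper distributions can also be called directly:
dbenford(1:9)   # expected first-digit proportions
pbenford(4)     # probability that the first digit is 4 or less
rbenford(10)    # ten random first digits drawn under Benford's Law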






Tuesday, 2 September 2008

Whoever said white wine does not age well

Whoever said that white wines do not age is sorely mistaken. I opened and consumed a bottle of 1970 Lindemans Chablis tonight. There is hardly any of it left (worldwide) and it will be sorely missed when it is gone (though I still have a little under two cases remaining).

This is a spectacular wine. Whenever I think that it must be approaching its peak, I am happily proven wrong.
This was the most successful wine to appear on the Australian wine show circuit until it was finally retired. It is a rich golden colour, full of butterscotch, with a fine aroma and a lingering aftertaste.
The Riedel glassware came out for this; no standard tasting glasses, as this wine deserves to be treated well. This wine is a cultured lady, and like all women of culture and breeding, she has only improved as she has aged.

There is a reason this has become a cult wine. With Handel's 1st Movement for Piano, a marvelous combination.


Tonight there is one less 1970 Lindemans Bin 3875 Hunter River Chablis in the world. Next week there will be one less 1990 Château Margaux.

Opening this wine was a toss of the coin against a 1937 Lupé-Cholet Nuits-Saint-Georges, which I first tried a number of years back; but I had only three of those, whereas I still have over a case (and nearly two) of the 1970 Chablis.

Each of these wines was also enjoyed in the past with our current opposition leader, Dr. Brendan Nelson, and his lovely wife. We had them over to dinner in 1999, where we served both of these wines. In fact, we also opened a bottle of 1947 Château Siran, a 1978 Château Lafite Rothschild, a 1990 Château Latour and a 1963 Croft fortified. The evening was completed with a Rémy Martin champagne cognac.

If you ever have the chance to taste a sip of Rémy Martin Louis XIII Black Pearl cognac, I recommend it. It was the best cognac I have ever tried.

Dr. Nelson was also party to the same wines at my 30th in 2000.

Monday, 1 September 2008

Advanced Methods to remotely determine Application Versions

I am speaking at SANS Network Security 2008 in Las Vegas this year. My talk is "Advanced Methods to Remotely Determine Application Versions", being held on Thursday, October 2, 8:00pm - 9:00pm.

In it I shall be covering a method to determine the DNS application version (and patch level) of a remote server.

The topic is summarised in the abstract:
Statistical and machine learning techniques make the hiding of information difficult. Statistical methods such as neural network perceptrons, and classification algorithms including Random Forest ensembles, allow for the determination of software versions and patch levels.

These methods can be used to find server versions and patch levels using standard calls to the application server. This appears as standard traffic to the server and does not register as an attack. It bypasses controls (such as the renaming of the version string in BIND), allowing an attacker to remotely gather information regarding the patch levels of a system.
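To make this concrete, the classification step can be sketched in R. This is only an illustration of the approach and not the code from the talk; the randomForest package is real, but the CSV files and their feature columns are hypothetical examples.

# A minimal sketch of the fingerprinting idea using the randomForest package.
# The CSV files and their columns are hypothetical; in practice the features
# would be measurements taken from standard queries to the server (timings,
# flag handling, answer ordering and the like).
library(randomForest)
# Training data: one row per observed server, with feature columns and a
# known 'version' label
train <- read.csv("dns_fingerprints.csv")
train$version <- as.factor(train$version)
model <- randomForest(version ~ ., data = train, ntree = 500)
print(model)    # reports the out-of-bag error estimate for the classifier
# Classify the responses gathered from an unknown server
unknown <- read.csv("target_responses.csv")
predict(model, unknown)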