Saturday, 3 August 2013

Seeking Sydney Software Developers

Job Description

We are seeking several talented Software Developers with a passion for building automation frameworks and tools. We are creating a big-data-driven, web-based platform and expert system the likes of which will revolutionize the world.

We seek people in Sydney who want to be a part of a highly technical team. In this role, you will engage with some of the world's most influential companies and a large developer community, and work closely with our world-class engineering and product teams. The ideal candidate will combine excellent technical and business skills with industry experience. This position is based in our Sydney office, with travel to Brisbane and California.

  • Deeply understand HPC and large-data platforms and their underlying implementation
  • Design and develop systems and tools, and apply algorithms in intelligent systems
  • Code across a number of languages: PHP, Java, Python, JavaScript, C++, etc.
  • Write best practices, documentation, samples and other materials
  • Developer outreach through conferences, blogs, hacks, etc.
  • Manage product launches by ensuring verification, documentation and communication as well as testing
  • Reproduce, track and drive partner bugs and feature requests with Product and Engineering teams
Requirements

  • BA/BS in Computer Science or a related field
  • At least three years' experience as a developer or software engineer
  • Excellent software development skills in one or more of the following languages: Java, Objective-C, Python, PHP, C/C++, Ruby or HTML5
  • Good understanding of Web technology and mobile technologies (iOS & Android)
  • Solid public speaking skills, experience presenting to large technical audiences a plus
  • Willingness and ability to travel approximately 10-15% of the time
  • The ideal candidate will already have a good understanding of large-data platforms (e.g. Hadoop at 10 PB+)

Company Description
The world grows through change and knowledge. To thrive, people need to develop wisdom in a social web. To enable this, we must look ahead, understand the trends and forces that will shape society and business in the future and move swiftly to prepare people for what's to come. We will help the world to get ready for tomorrow today. That's what our 2020 Vision is all about. It creates a long-term destination for our business and provides us with a "Roadmap" for winning together with our community and the society we will help to foster through trust and assurance.

Our Mission
Our Roadmap starts with our mission, which is enduring. It declares our purpose as a company and serves as the standard against which we weigh our actions and decisions.
             To make the world wiser and better...
             To inspire enduring optimism and trust...
             To create value and make a difference.

Thursday, 1 August 2013

Principles of Database Development

This is a little something based on a paper of mine from last decade.


As the main repository of an organisation’s historical data, the data warehouse is evolving into the memory of a corporation[1]. Through storage of a wide variety of data sources in an integrated format, the data warehouse is becoming both the storeroom of past events, and the central predictive engine.

The data warehouse contains the unrefined substance that when fed through a decision support system can provide management with up-to-date analytics and corporate predictors. The data analyst can use this technology to perform complex queries and analysis of information without adversely impacting operational systems.

Through both the rise in computational speed and power and the growth of data storage, data warehouses have driven other technologies to greater levels of development. Technologies such as data mining have grown symbiotically.

Section 1 - Review of Research as at 2007 (when I did this)

To be able to explore the effect of the rise of data warehouses on business and society, one first needs to characterize what the concept encompasses. William H. Inmon [2], known as one of the fathers of data warehousing, has stated that data warehouses are required to be subject oriented. That is, the data, and thus the database, is organised so that all records relating to the same event or object are associated together, in a manner that is concurrently time variant, non-volatile and integrated.

Whereas operational systems are necessarily optimised for ease of use and rapidity of response, the data warehouse is optimised for reporting and analysis. Online transaction processing (OLTP), crucial to the operational system, is of less importance to the data warehouse. Rather, online analytical processing (OLAP) and the need to query unusual data patterns result in heavily denormalised or dimension-based models, which are often required to achieve acceptable query response times.
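As a minimal illustration of the dimensional approach (the schema, table names and figures below are my own toy example, not drawn from any particular product), a denormalised star schema joins a central fact table to small dimension tables so that rollup queries stay simple and fast:

```python
import sqlite3

# A minimal star schema: one fact table joined to denormalised
# dimension tables (illustrative names and values).
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE dim_date  (date_id INTEGER PRIMARY KEY, year INTEGER, month INTEGER);
    CREATE TABLE dim_store (store_id INTEGER PRIMARY KEY, region TEXT);
    CREATE TABLE fact_sales (date_id INTEGER, store_id INTEGER, amount REAL);

    INSERT INTO dim_date  VALUES (1, 2007, 1), (2, 2007, 2);
    INSERT INTO dim_store VALUES (10, 'North'), (20, 'South');
    INSERT INTO fact_sales VALUES (1, 10, 100.0), (1, 20, 50.0), (2, 10, 75.0);
""")

# A typical OLAP-style rollup: aggregate the facts by dimension attributes.
rows = con.execute("""
    SELECT d.year, s.region, SUM(f.amount)
    FROM fact_sales f
    JOIN dim_date  d ON d.date_id  = f.date_id
    JOIN dim_store s ON s.store_id = f.store_id
    GROUP BY d.year, s.region
    ORDER BY d.year, s.region
""").fetchall()
print(rows)  # [(2007, 'North', 175.0), (2007, 'South', 50.0)]
```

The wide, query-friendly shape of the dimension tables is exactly the denormalisation the paragraph above refers to: it trades storage and update simplicity for fast analytical reads.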

By time variant, it is meant that changes to the objects and tables within the database are tracked and recorded. This allows for statistical analysis of the data over time to produce reports on time-variant trends. Heteroscedastic time series analysis of data (using models such as ARIMA, GARCH and ARCH) is one of the newer avenues of research.

A data warehouse requires that once data is committed to the database, it is stored as read-only. It is then used for future reporting, and any point-in-time view becomes an individual snapshot of the database over time. This allows for both historical analysis and future prediction.
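The read-only, time-variant idea can be sketched in a few lines. This is a toy illustration (the records, names and dates are hypothetical) of the pattern often called a Type 2 slowly changing dimension in Kimball's terminology: rows are never updated in place; each change appends a new versioned row, and a point-in-time query selects whichever version was current at the requested date.

```python
from datetime import date

# Append-only customer records: each change appends a new row with a
# validity period instead of overwriting the old one.
history = [
    # (customer_id, region, valid_from, valid_to)
    (1, "North", date(2006, 1, 1), date(2006, 12, 31)),
    (1, "South", date(2007, 1, 1), date(9999, 12, 31)),  # current row
]

def region_as_at(customer_id, as_at):
    """Return the customer's region as it was on a given date."""
    for cid, region, start, end in history:
        if cid == customer_id and start <= as_at <= end:
            return region
    return None  # no snapshot covers that date

print(region_as_at(1, date(2006, 6, 1)))  # North
print(region_as_at(1, date(2007, 6, 1)))  # South
```

Because nothing is ever overwritten, any historical report can be reproduced exactly, which is what makes the warehouse usable for trend analysis.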

To be effective, a data warehouse needs to contain data from as many as possible, if not all, of an organisation's operational applications and individual databases. Further, to be of any effective use, the data in the warehouse must be held in a consistent manner. A failure either to constrain data consistently or to provide an adequate sample of the organisation's data leads to the GIGO issue. That is, garbage in, garbage out.

Research into data warehousing has expanded into what has been termed the Corporate Information Factory (or "CIF"). A CIF is an organisational data structure which encompasses enterprise resource planning (ERP), eCommerce, customer relationship management (CRM) and many other formerly separate reporting structures. In some cases, a CIF has been known to encompass data marts, exploration warehouses, operational data stores (ODS), both nearline and secondary storage, and project warehouses.

However, the volumes of data, the expense and the lack of enterprise support leave CIF implementations as an idea for the future. There are still a number of difficulties associated with data warehousing which make implementations less widespread than might be expected. Of particular concern, the process of extracting, cleaning and processing data is both time-consuming and difficult. The failure of many organisations to implement and adhere to corporate-wide naming standards has only exacerbated this problem.

Widespread incompatibility between many database products has slowed the broadening of data warehousing across organisations. Technologies such as OLAP have aided in the development of cross-application data warehouses, but issues of table structure, normalisation and the type of data stored within individual databases remain.

Another issue that has hampered the implementation of data warehousing is security. In a world that is increasingly reliant on the Internet and the Web, security could develop into a serious issue. With links into the data warehouse from the Internet, an organisation's key informational assets are at risk of both discovery and compromise.

Section 2 - Implications of Research to Current Practice

There are a number of clear advantages to the adoption of data warehouse technologies. It is easy to see that organisations which adopt these new technologies successfully will gain a clear competitive advantage. These technologies enhance end user access to data and reports in a manner that allows for greater creativity and more informed evaluations.

The ability to create trend reports and use statistical methods to forecast probabilistic events based on past occurrences and experience provides for more focused corporate activity. For instance, marketing information from a particular sales push can be compared across different regions to evaluate the impact of differing advertising initiatives.

Data warehousing technology can also significantly increase the effectiveness of several commercial business applications. In particular, customer relationship management (CRM) benefits greatly from this technology. The ability to both gain an overall view and to be able to drill down to specific areas and individuals provides a significant advantage to many organisations. CRM has been one of the principal applications to make early use of data warehousing.

Data Mining

Data mining is a relatively new development. The use of statistical methods to routinely investigate large volumes of data for patterns, using techniques including classification, decision trees, association rule mining, and clustering, has resulted in a new field of computational mathematics. Data mining is a complex subject in itself; it has associations with numerous core disciplines, including computer science, and draws on influential computational techniques from statistics, information retrieval, machine learning and pattern recognition[3].

To many people, the key issue introduced through the use of data mining is not commercial or technological in nature. It is a social issue. The protection of individual privacy is a concern that has grown rapidly with the rise in data mining. Data mining increases the possibility of an individual's privacy being violated in some manner.

The processes used to analyse ordinary commercial transactions may be used to compile significant quantities of information about individuals from their purchasing behaviour and lifestyle preferences. In particular, where data is compiled across multiple organisations, the need to protect the privacy of individuals' identities is compounded.

Data mining consists of five major elements:

  • Extract, transform, and load transaction data onto the data warehouse system.
  • Store and manage the data in a multidimensional database system.
  • Provide data access to business analysts and information technology professionals.
  • Analyze the data by application software.
  • Present the data in a useful format, such as a graph or table.
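The first and last of these elements can be sketched concretely. The following is a minimal, illustrative example (the field names, values and table layout are invented for the sketch): extract raw transaction records, transform them by typing the fields and rejecting malformed rows, load the clean rows into a warehouse table, and present a simple aggregate.

```python
import csv
import io
import sqlite3

# Extract: raw transaction data as it might arrive from an operational
# system (an in-memory CSV here; fields and values are illustrative).
raw = io.StringIO(
    "txn_id,amount,region\n"
    "1,100.5,north\n"
    "2,not_a_number,south\n"   # a malformed row, as often found in practice
    "3,75.0,north\n"
)

# Transform: clean and type the records, discarding rows that fail validation.
clean = []
for row in csv.DictReader(raw):
    try:
        clean.append((int(row["txn_id"]), float(row["amount"]), row["region"].title()))
    except ValueError:
        continue  # reject malformed rows rather than load garbage (GIGO)

# Load: insert the cleaned rows into the warehouse table.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE txn (txn_id INTEGER, amount REAL, region TEXT)")
con.executemany("INSERT INTO txn VALUES (?, ?, ?)", clean)

# Analyse and present: a simple aggregate for reporting.
report = con.execute(
    "SELECT region, SUM(amount) FROM txn GROUP BY region ORDER BY region"
).fetchall()
print(report)  # [('North', 175.5)] -- only the valid rows survive
```

The explicit rejection step is where the "cleaning" effort discussed earlier actually happens; in real pipelines it is usually the most time-consuming part.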

Different levels of analysis are available:

  • Artificial neural networks: Non-linear predictive models that learn through training and resemble biological neural networks in structure.
  • Genetic algorithms: Optimization techniques that use processes such as genetic combination, mutation, and natural selection in a design based on the concepts of natural evolution.
  • Decision trees: Tree-shaped structures that represent sets of decisions. These decisions generate rules for the classification of a dataset. Specific decision tree methods include Classification and Regression Trees (CART) and Chi Square Automatic Interaction Detection (CHAID). CART and CHAID are decision tree techniques used for classification of a dataset. They provide a set of rules that you can apply to a new (unclassified) dataset to predict which records will have a given outcome. CART segments a dataset by creating 2-way splits while CHAID segments using chi square tests to create multi-way splits. CART typically requires less data preparation than CHAID.
  • Nearest neighbor method: A technique that classifies each record in a dataset based on a combination of the classes of the k record(s) most similar to it in a historical dataset (where k ≥ 1). Sometimes called the k-nearest neighbor technique.
  • Rule induction: The extraction of useful if-then rules from data based on statistical significance.
  • Data visualization: The visual interpretation of complex relationships in multidimensional data. Graphics tools are used to illustrate data relationships.
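As a concrete illustration of the nearest neighbor method above, here is a minimal k-nearest-neighbour classifier in pure Python (the training points and labels are invented for the example):

```python
from collections import Counter

# Toy historical dataset: 2-D feature points with known class labels.
training = [((1.0, 1.0), "A"), ((1.2, 0.9), "A"),
            ((5.0, 5.0), "B"), ((5.5, 4.8), "B")]

def knn_classify(point, k=3):
    """Classify a point by majority vote among its k nearest training records."""
    # Squared Euclidean distance: only the ordering of distances matters.
    dist = lambda p, q: (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2
    nearest = sorted(training, key=lambda rec: dist(point, rec[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

print(knn_classify((1.1, 1.1)))  # A
print(knn_classify((5.2, 5.1)))  # B
```

In a real warehouse the "training" records would be historical rows, and distance would be computed over whichever attributes the analyst selects.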


We expect to see a massive growth in data not just now, but into the foreseeable future.

When I first wrote this as an internal document in 2007, data was increasing as a source of growth. Now, it is becoming more and more an end in itself.

The fact that more data improves even poor AI algorithms makes the collection and storage of data essential.


Beynon-Davies, P. (2004) Database Systems, 3rd edn, Palgrave Macmillan.

Date, C.J. (2004) An Introduction to Database Systems, 8th edn, Addison Wesley.

Hoffer, J., Prescott, M. & McFadden, F. (2007) Modern Database Management, 8th edn, Prentice Hall.

Inmon, W.H. (2003) Building the Data Warehouse, Wiley Computer Publishing, USA.

Keith, S., Kaser, O. & Lemire, D. (2005) "Analyzing Large Collections of Electronic Text Using OLAP", UNBSJ CSAS, TR-05-001.

Kimball, R. & Ross, M. (2002) The Data Warehouse Toolkit: The Complete Guide to Dimensional Modeling, 2nd edn, Wiley Computer Publishing, USA.

IBM (2005) IBM Informix Database Design and Implementation Guide, IBM Informix Dynamic Server Enterprise and Workgroup Edition, v10.0, last updated 2 November 2005.

Kroenke, D. (2003) Database Processing: Fundamentals, Design and Implementation, 10th edn, Prentice Hall.

Mannino, M.V. (2004) Database Design, Application Development & Administration, 2nd edn, McGraw-Hill.

Mento, B. & Rapple, B. (2003) Data Mining and Data Warehousing, Association of Research Libraries, USA.

Pratt, P. & Adamski, J. (2005) The Concepts of Database Management, 5th edn, ITP.

Sullivan, D. (2001) Document Warehousing and Text Mining: Techniques for Improving Business Operations, Marketing, and Sales, Wiley, USA.

Whitehouse, P.R. (2006) Case Studies in Database Design and Implementation, UQ, Australia.


[2] The founder of Prism Solutions in 1991


Monday, 29 July 2013

What is ITIL®


ITIL®, or the Information Technology Infrastructure Library, is a set of ITSM best practices (much as PMP® represents a set of best practices for project management). In fact, it is the most widely adopted set of best practices for ITSM. It is non-proprietary, is maintained by experts, and incorporates the learning experiences and practices of leading organizations. Major corporations such as IBM, Bank of America, and Wal-Mart have adopted ITIL®, as have many universities, Ohio State, Stanford, BYU and Wake Forest to name a few.

The reason that ITIL® is so popular is that it provides many benefits such as:

· Alignment with organizational priorities and objectives

· Known and manageable IT costs

· Negotiated, achievable service levels

· Predictable, consistent processes

· Efficiency in service delivery

· Measurable, improvable services and processes

· A common set of terminology

ITIL® was created in the 1980s by the UK government, initially through the Central Computer and Telecommunications Agency (CCTA), whose work later passed to the Office of Government Commerce. The current version is ITIL® 2011 (an update to v3), and its basic concept is based on the lifecycle of a service. In total, the current set of best practices is split into five Service Lifecycle stages that are made up of 24 processes and four functions (see diagram below).


Image: ©Crown 2007

ITIL–Just a little so I have the materials online for my class :)

ITIL is an ongoing process, not a one-time activity. In the image below, we see how the process works.


  • Why is ITIL so successful?
  • What is a Service?
  • What is an IT Service?
  • What is the difference between core services, enabling services, and enhancing services?
  • What is Service Management?
  • What is IT Service Management?
  • What is a Service Provider?
  • What are the three different types of Service Providers?
  • Who are the key stakeholders in Service Management?
  • What is the difference between Utility and Warranty?
  • How do Best Practices evolve?

Basic Concepts

  • What are assets, resources, and capabilities all about?
  • What is the definition for a process?
  • What are some of the key characteristics for a process?
  • What does the process model look like?
  • What is a function all about?
  • What is the difference between a group and a team?
  • What is a department all about?
  • What are the four key functions that ITIL Service Operation covers?
  • What is the definition for a role?
  • Why is organizational culture and behaviour so important?
  • What is a Service Portfolio?
  • What components do you find in a Portfolio?
  • What is the difference between customer-facing services and supporting services?
  • What is knowledge management all about?
  • What is the SKMS all about?
  • What is the CMS all about?
  • What is the CMDB all about?
  • What are the four layers of the knowledge management architecture?

Governance and Management Systems

  • What is Governance?
  • What is a Management System?
  • Why is a Management System's approach useful?
  • What type of Management System Standards can be adopted by an organization's Management System?
  • What is ISO/IEC 20000 all about?
  • What is Deming's PDCA cycle all about?

The Service Lifecycle

  • What does the ITIL Service Lifecycle Model look like?
  • Which processes do you find in Service Strategy?
  • Which processes do you find in Service Design?
  • Which processes do you find in Service Transition?
  • Which processes do you find in Service Operation?
  • Which processes do you find in Continual Service Improvement?
  • Which is the key factor that determines the strength of the Service Lifecycle?

We'll cover questions for some of the other sections of our subject.