Thursday, 15 March 2012

What forms a loop?

The following is a small excerpt from an article I am publishing in Hakin9 next month.

All shellcode loops are composed of five (5) parts. These are:

  1. A control variable. Each loop contains one or more variables that are evaluated to decide whether the loop should continue or end.
  2. An initial value. Each control variable has to be set to a starting value before the loop begins.
  3. The body. This is the block of code that is run at each iteration of the loop.
  4. The modification process. This stage changes the control variable.
  5. An end condition. Although not strictly necessary (it is possible to have an endless loop), a loop is generally expected to have some end condition so that it stops rather than running forever.
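
The five parts listed above can be pointed out in a few lines of C. This is only a minimal sketch for illustration: the function name `sum_bytes` and its parameters are hypothetical and do not come from the article.

```c
#include <stddef.h>

/* Hypothetical helper (not from the article): adds up the first n bytes
 * of buf, with each of the five loop parts marked in a comment. */
unsigned int sum_bytes(const unsigned char *buf, size_t n)
{
    unsigned int total = 0;
    size_t i;               /* 1. the control variable                 */

    i = 0;                  /* 2. its initial value                    */
    while (i < n) {         /* 5. end condition: stop once i reaches n */
        total += buf[i];    /* 3. the body of the loop                 */
        i++;                /* 4. modification of the control variable */
    }
    return total;
}
```

Calling `sum_bytes` on a four-byte buffer holding 1, 2, 3 and 4 returns 10, and a length of 0 skips the body entirely.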

Loops are an important component of creating shellcode. They enable the author to obscure their code (through encryption and decryption routines), to add port and IP scanning functions into the shellcode, to enact denial-of-service attacks and to create keystroke loggers, amongst other things. Although there are many forms of looping instructions, the primary ones we will address are “for loops” and “while loops”.

For Loops

We can see a simple for loop, together with its equivalent assembly code, in Figure 1.

Figure 1: A For Loop in action

All loops are actually functionally equivalent (Zakharov, 1999) and can be written in different ways. We define them as we do for reasons of elegance and performance, which is more an art than a science.

In a “for loop”, the initialisation, update routine and ending condition are all specified at the start of the loop. This is the primary difference from a “while loop” as described here, where the ending condition is evaluated at the end of the loop (the form that C calls do-while) and the control variable and update routine are set within the body of the loop. There are of course many ways to represent even a simple for loop, and this makes the reversing process far more complex than it might seem (which is also part of the reason why there are as yet no truly automated decompilers).

From this, we can quickly deduce that a stopping condition at the start of the loop would best fit a “For Loop” whilst a stopping condition located at the end of the loop best forms a “While Loop”.
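
To make the placement of the stopping condition concrete, here is a hedged C sketch of the same count written twice, once with the test at the top and once with the test at the bottom. The function names are hypothetical, and the bottom-tested version uses C's do-while with a guard in front, which is an assumption about how one might keep the two forms equivalent rather than anything taken from the article.

```c
/* Test at the top: the condition is checked before every pass,
 * so a limit of 0 or less runs the body zero times. */
int count_test_at_top(int limit)
{
    int iterations = 0;
    for (int i = 0; i < limit; i++) {
        iterations++;
    }
    return iterations;
}

/* Test at the bottom: the body runs first and the condition is checked
 * afterwards. The guard in front keeps the behaviour identical to the
 * top-tested version when limit is 0 or negative. */
int count_test_at_bottom(int limit)
{
    int iterations = 0;
    int i = 0;
    if (i < limit) {
        do {
            iterations++;
            i++;
        } while (i < limit);
    }
    return iterations;
}
```

Both functions return the same count for any limit, which is the sense in which the two forms are functionally equivalent. Only the position of the comparison differs.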

In the example (Figure 1), we start by setting our variable “i” to a value of 0 and create a routine to increment this value by one on each iteration of the loop. The loop is set to end when the value of “i” reaches or exceeds 100. This means that our loop will iterate 100 times.

The C/C++ code is listed to the left of the figure with the functionally equivalent assembly code listed on the right. In order to initialise our variable “i” in assembly, we have set the EAX register to contain the value 0. As this is a “For Loop”, the ending condition is checked at the start of the loop, and we have this written as a “CMP EAX, 100” assembly instruction, where the conditional jump (JNL, jump if not less) is taken if EAX is greater than or equal to 100. Basically, we loop until the value stored in EAX equals 100.

The value in the EAX register is incremented by 1 each time the code block is iterated, and the check routine at (2) is then run again.
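
The flow described above can be mirrored in C with labels and `goto`, which reads much like the assembly. This is a rough sketch only: the figure itself is not reproduced here, so the `MOV` and `JMP` instructions named in the comments, the label names and the function name are all assumptions. Only `CMP EAX, 100`, `JNL` and the increment by 1 come from the text.

```c
/* Hypothetical C rendering of the loop described in the text.
 * The variable eax stands in for the EAX register. */
int for_loop_lowered(void)
{
    int eax = 0;            /* initialise i, e.g. MOV EAX, 0 (assumed) */
    int iterations = 0;     /* stands in for whatever the body does    */

check:
    if (eax >= 100)         /* CMP EAX, 100 followed by JNL            */
        goto done;          /* jump taken when EAX is not less than 100 */
    iterations++;           /* the loop body                           */
    eax += 1;               /* ADD EAX, 1                              */
    goto check;             /* back to the check (assumed JMP)         */

done:
    return iterations;
}
```

The function returns 100, matching the 100 iterations stated for the example.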

References

Zakharov, V. A. (1999). On the decidability of the equivalence problem for orthogonal sequential programs. Grammars, 2(3), 271-281.

2 comments:

Andrew Catford said...

In the graphic, it should read "Add Eax, 1", not "Add Eax, i"

Dr Craig S Wright GSE said...

Thanks, typos happen :)

I am really glad somebody actually noticed :)