Sunday, 7 September 2008

The next tool - DD

This week I am covering DD. This is the Swiss Army knife of file tools. Combined with bash's /dev/tcp it can also be a network tool, but nc is simpler.

First we need the basics for DD. For this we have the man page and some definitions. I have taken (blatantly paraphrased) the man page information for DD and included it below (the original is simple to obtain - "man dd").

Over the next few days we will be reversing files and swapping them, so for that purpose we need to concentrate on the following options:

  • bs - This is the block size. Setting "bs=1" means that we can use dd as a byte level tool (instead of a block level tool). Although this is slower than a block copy, we are not looking at how fast we can copy here.
  • skip - This tells dd to skip "n" blocks of input. With "bs=1" a block is one byte, so in our case we skip "n" bytes.

What we are going to do is start with "n" set to the last byte in the file. We will loop the dd function to copy byte "n", then byte "n - 1", then "n - 2", ... down to byte 1. This means byte n gets copied to byte 1, byte "n - 1" to byte 2, ..., and byte 1 to byte n.

In other words, on pass "i" (with i running from 1 to n) we need to copy byte "n - i + 1" in the source file to byte "i" in the destination file.
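
The loop described above can be sketched roughly as follows. This is only a minimal illustration, assuming a POSIX shell and a standard dd. Note that with bs=1 dd works on single bytes, and that skip and seek are zero-based offsets. Besides bs and skip it also uses count, seek and conv=notrunc from the man page below. The temporary files and the variable names (src, dst, reversed) are made up for the example.

```shell
#!/bin/sh
# Sketch: reverse a small file one byte at a time with dd.
# src and dst are throwaway temp files used only for this illustration.
src=$(mktemp)
dst=$(mktemp)
printf 'abcdef' > "$src"

# Number of bytes in the source; $(( )) strips any padding wc may print.
n=$(( $(wc -c < "$src") ))

i=0
while [ "$i" -lt "$n" ]; do
    # With bs=1, skip and seek are byte offsets counted from zero.
    # Pass i copies source offset n-1-i to destination offset i.
    # conv=notrunc stops dd from truncating what earlier passes wrote.
    dd if="$src" of="$dst" bs=1 count=1 \
       skip=$((n - 1 - i)) seek="$i" conv=notrunc 2>/dev/null
    i=$((i + 1))
done

reversed=$(cat "$dst")
echo "$reversed"
rm -f "$src" "$dst"
```

Starting one dd process per byte is very slow on anything but tiny files, which is fine here since speed is not the point.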

More on this in the next few days (inc. a simple script).

dd [bs=s] [cbs=s] [conv=conversion] [count=n] [ibs=s] [if=file] [imsg=string] [iseek=n] [obs=s] [of=file] [omsg=string] [seek=n] [skip=n]

DESCRIPTION
dd reads and writes data by blocks, and can convert the data between formats. dd is often used for devices such as tapes which have discrete block sizes, or for fast multi-sector reads from disks. The conversions can accommodate systems that need de-blocking, conversion to/from EBCDIC and fixed length records.

dd processes input data as follows:
  1. dd reads an input block.
  2. If you specified conv=sync and this input block is smaller than the specified input block size, dd pads it to the specified size with null bytes. If you also specified a block or unblock conversion, dd pads with spaces instead of null bytes.
  3. If you specified bs=size and requested no conversion other than sync or noerror, dd writes the input block (padded where necessary) to the output as a single block and omits the remaining steps.
  4. If you specified the swab conversion, dd swaps each pair of input bytes. If there is an odd number of input bytes, dd does not attempt to swap the last byte.
  5. dd performs all remaining conversions on the input data independently of the input block boundaries. A fixed-length input or output record may span these boundaries.
  6. dd collects the converted data into output blocks of the specified size. When dd reaches the end of the input, it writes the remaining output as a block (with added padding if the conv=sync option is used). Consequently, the final output block can be smaller than the output block size.
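
Steps 2 and 3 can be seen with a short input block. The sketch below assumes a POSIX shell and a standard dd. It feeds a 5-byte input through dd with bs=8, once without and once with conv=sync, and counts the bytes that come out. The variable names are only for illustration.

```shell
#!/bin/sh
# A 5-byte input read with bs=8 forms a single short input block.
# Without conv=sync the short block is written as it is.
plain=$(( $(printf 'hello' | dd bs=8 2>/dev/null | wc -c) ))

# With conv=sync the short block is padded with null bytes up to bs.
padded=$(( $(printf 'hello' | dd bs=8 conv=sync 2>/dev/null | wc -c) ))

echo "without sync: $plain bytes"
echo "with sync:    $padded bytes"
```

The first count is 5 and the second is 8, since the three missing bytes are filled with nulls.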
PARAMETERS
bs=size
This option sets both input and output block sizes to size bytes. You can suffix this decimal number with w, b, k, or xnumber to multiply it by 2, 512, 1024, or number respectively. You can also specify size as two decimal numbers (with or without suffixes) separated by x to indicate the product of the two values. Processing is faster when ibs and obs are equal, since this avoids buffer copying. The default block size is 1b (512 bytes). bs=size supersedes any settings of ibs=size or obs=size.

If you specify bs=size with no conversions other than noerror, notrunc, or sync, dd writes the data from each input block as a separate output block. If the input data is less than a full block and you did not request the sync conversion, the output block is the same size as the input block.
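
To get a feel for the size suffixes, here is a small sketch that copies from /dev/zero and counts the bytes written. It assumes a POSIX shell, a standard dd and a system that provides /dev/zero. Only the b and k suffixes are shown, and the variable names are hypothetical.

```shell
#!/bin/sh
# One block of 1b is 512 bytes.
one_b=$(( $(dd if=/dev/zero bs=1b count=1 2>/dev/null | wc -c) ))

# One block of 1k is 1024 bytes.
one_k=$(( $(dd if=/dev/zero bs=1k count=1 2>/dev/null | wc -c) ))

# count multiplies the block size: 3 blocks of 1k.
three_k=$(( $(dd if=/dev/zero bs=1k count=3 2>/dev/null | wc -c) ))

echo "$one_b $one_k $three_k"
```

This prints 512 1024 3072.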

cbs=size
Sets the size of the conversion buffer used by various conv options. It is possible to specify this option in the same way as for bs.

conv=conversion[, conversion, ...]
This option specifies the conversion method. conversion can be any of the following:
  • ascii
    Converts EBCDIC input to ASCII for output. dd copies cbs bytes at a time to the conversion buffer, maps them to ASCII, then strips trailing blanks, adds a newline, and copies this line to the output buffer.
  • block
    Converts variable-length records to fixed-length records. dd treats the input data as a sequence of variable-length records (each terminated by a newline or an EOF character) independent of the block boundaries. dd converts each input record by first removing any newline characters, then padding (with spaces) or truncating the record to the size of the conversion buffer. dd reports the number of truncated records on the standard error. It is necessary to specify cbs=size with this conversion setting.
  • ebcdic
    Converts ASCII input to EBCDIC for output. dd copies a line of ASCII to the conversion buffer, discards the newline, pads it with trailing blanks to cbs bytes, maps it to EBCDIC and copies it to the output buffer.
  • ibm
    Converts ASCII to a variant of EBCDIC which gives better output on many IBM printers.
  • lcase
    Converts uppercase input to lowercase.
  • noerror
    Ignores errors on input.
  • notrunc
    Sets dd so that it does not truncate the output file. If a block is explicitly written, it replaces the existing block; all other blocks are unchanged. See also of=file and seek=n.
  • swab
    Swaps the order of every pair of input bytes. If the current input record has an odd number of bytes, this conversion does not attempt to swap the last byte of the record.
  • sync
    Pads any input block shorter than ibs to that size with null bytes before conversion and output. If you also specified block or unblock, dd uses spaces instead of null bytes for padding.
  • ucase
    Converts lowercase input to uppercase.
  • unblock
    Converts fixed-length records to variable-length records by reading a number of bytes equal to the size of the conversion buffer (or the number of bytes remaining in the input, if less than the conversion buffer size), deleting all trailing spaces, and appending a newline character. You must specify cbs=size with this conversion.
  • convfile
    Uses convfile as a translation table if it is not one of the conversion formats listed here and it is the name of a file of exactly 256 bytes.

You can perform multiple conversions at the same time by separating the arguments to conv with commas; however, some conversions are mutually exclusive (for example, ucase and lcase).
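
A few of these conversions are easy to try on short strings. The sketch below assumes a POSIX shell and a dd that implements the standard swab, ucase and block conversions (some cut-down dd builds do not). The variable names are only for the example.

```shell
#!/bin/sh
# swab: swap each pair of bytes; the odd fifth byte is left alone.
swapped=$(printf 'abcde' | dd conv=swab 2>/dev/null)

# ucase: lowercase letters become uppercase.
upper=$(printf 'Hello dd' | dd conv=ucase 2>/dev/null)

# block: newline-terminated records become fixed 4-byte records,
# padded with spaces ("ab" becomes "ab  ", "cdef" already fits).
blocked=$(printf 'ab\ncdef\n' | dd conv=block cbs=4 2>/dev/null)

echo "$swapped"
echo "$upper"
echo "[$blocked]"
```

This prints badce, HELLO DD and [ab  cdef].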

count=n
Copies only n input blocks to the output.

ibs=size
Sets the input block size to size bytes. Specify this option in the same way as bs.

if=file
Reads input data from file. If you don't specify this option, dd reads data from the standard input.

imsg=string
Displays string when all data has been read from the current volume, replacing all occurrences of %d in string with the number of the next volume to be read. dd then reads and discards a line from the controlling terminal, giving you a chance to change volumes (usually a floppy disk).

iseek=n
Seeks to the nth block of the input file. The distinction between this and skip is that iseek does not read the discarded data; however there are some devices, such as tape drives and communication lines, on which seeking is not possible, so only skip is appropriate.

obs=size
Sets the output block size to size bytes. Specify this option in the same way as bs. The size of the destination should be a multiple of the value chosen for size. For example, if you choose obs=10k, the destination's size should be a multiple of 10k.

of=file
Writes output data to file. Without setting this option, dd writes data to the standard output. dd truncates the output file before writing to it, unless you specified the seek=n operand. If you specify seek=n, but do not specify conv=notrunc, dd preserves only those blocks in the output file over which it seeks. If the size of the seek plus the size of the input file is less than the size of the output file, this can result in a shortened output file.
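
The truncation rules for of=file are easiest to see side by side. This is a small sketch, assuming a POSIX shell and a standard dd. It writes two bytes into the middle of a 10-byte file, first with conv=notrunc and then without it. The temp file and variable names are hypothetical.

```shell
#!/bin/sh
f=$(mktemp)

# With conv=notrunc only the two written bytes change.
printf 'AAAAAAAAAA' > "$f"
printf 'xy' | dd of="$f" bs=1 seek=4 conv=notrunc 2>/dev/null
patched=$(cat "$f")

# Without conv=notrunc dd keeps the 4 bytes it seeks over,
# writes the new data, and everything after it is lost.
printf 'AAAAAAAAAA' > "$f"
printf 'xy' | dd of="$f" bs=1 seek=4 2>/dev/null
shortened=$(cat "$f")

echo "$patched"
echo "$shortened"
rm -f "$f"
```

The first result is AAAAxyAAAA and the second is AAAAxy, which is the shortened output file described above.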

omsg=string
Displays string when dd runs out of room while writing to the current volume. Any occurrences of %d in string are replaced with the number of the next volume to be written. dd then reads and discards a line from the controlling terminal, giving you a chance to change volumes (usually a floppy disk).

seek=n
Initially seeks to the nth block of the output file.

skip=n
Reads and discards the first n blocks of input.
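
As a quick illustration of skip together with count, the sketch below pulls a slice out of a ten-byte file. It assumes a POSIX shell and a standard dd. The temp file and variable names are made up for the example.

```shell
#!/bin/sh
f=$(mktemp)
printf '0123456789' > "$f"

# bs=1: skip 3 one-byte blocks, then copy 4 of them.
slice=$(dd if="$f" bs=1 skip=3 count=4 2>/dev/null)

# bs=2: skip and count are in 2-byte blocks, so this skips "01"
# and copies the blocks "23" and "45".
pairs=$(dd if="$f" bs=2 skip=1 count=2 2>/dev/null)

echo "$slice"
echo "$pairs"
rm -f "$f"
```

This prints 3456 and 2345. The same numbers mean different byte offsets once the block size changes, which is why bs=1 is convenient for byte-level work.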
