A GRF file is a collection of "sprites", meaning rectangular graphics
objects that are drawn to the screen. Examples are the vehicle graphics,
the landscape tiles, and pretty much everything that you see on the
screen.
The GRF file is simply one sprite following after another until the end
of the file with no meta-information. The list of sprites is terminated
by a zero-size sprite, and followed by what looks to be a
four-byte checksum at the very end of the file. TTD never even looks at the
checksum though.
Sprite Info
Each sprite starts with a record of the following data:
WORD size
Note, size can be either compressed or uncompressed size,
see below. It includes the following info bytes.
BYTE info
Bitcoded value that determines what type this sprite is.
Bit
Val
Meaning
0
1
?
1
2
Size is compressed size if set
If this bit is set, the given size is simply the
size in the file. If it is unset, you *must*
decompress it to find out how large it is in the
file.
2
4
?
3
8
Has transparency (i.e. is a tile), see below
others
?
If info==0xff, the sprite is a special type and has none
of the following info byte, it is simply a stream of some
bytes with the given size. For example, these are used
for specifying colour maps for the transparency feature and
the company colours, as well as making a grey-scale image
for the newspaper. With TTDPatch, these sprites are so-called
"actions", see TTDPatch's sprites.txt for more details.
BYTE ydim
How many lines there are in the sprite (y dimension)
WORD xdim
How many columns there are (x dimension)
WORD xrel
Horizontal offset. The offset is counted from the base coordinate for each sprite.
WORD yrel
Vertical offset
After this follows the actual compressed data. If info bit 3 is not set, the
data is simply a stream of pixels from left to right, and from top to bottom,
making up xdim*ydim bytes.
Finally, the file ends with a four byte checksum. I do not know the algorithm
to calculate this, however it isn't important because this checksum is never
even looked at anyway.
Tile sprites
If info bit 3 is set, the sprite is a tile and has some special transparency
information that is encoded like follows. Each line is encoded separately
and split into "chunks". Each chunk contains pixels, but the chunks may
skip a few pixels which are then transparent.
The sprite data first starts off with a list of two-byte offsets, one for each
line. These determine at which offset each line starts, counted from the
first info byte. Then follow the chunks for the lines:
BYTE cinfo
The high bit is set if this is the last chunk in the line.
The line need not be filled entirely, any remaining pixels
are simply transparent.
The lower seven bits give the length of this chunk in pixels.
BYTE cofs
x offset at which this chunk starts. The pixels between this
chunk and the last one will be transparent.
After this follow (cinfo & 0x7f) bytes of pixels.
Compression algorithm
The compression used is a variation on the LZ77 algorithm which detects
redundancy and losslessly reduces the size of the data. Here's how the
compressed data looks in a GRF file.
The compressed stream contains either a pointer to an earlier location and
a length, which means that these bytes are copied over from the given location,
or it contains a length and a verbatim chunk which is copied to the output
stream.
BYTE code
The high bit of the code shows whether this is a verbatim
chunk (not set) or a repetition of earlier data (set).
The meaning of the following bytes depends on whether the high bit of code is set.
If the high bit is not set, what follows is code&0x7f bytes of verbatim data,
or 0x80 bytes of verbatim data if code==0.
If the high bit is set, the code has a slightly different meaning. Bits 3 to 7
are now five bits defining the negative value of the length, that is how much data
should be copied from the earlier location. Bits 0 to 2 are the high bits of an offset,
with the low bits being in the next byte.
BYTE lofs
Low bits of the offset
Use this to extract length and offset:
unsigned long length = -(code >> 3);
unsigned long offset = ( (code & 7) << 8 ) | lofs;
It's important that the variables are unsigned and at least two bytes large.
(But "code" must be a signed char, of course.)
The offset is counted backwards from the current location. So you subtract
the offset from your position in the output stream and copy the given number
of bytes.
And that's pretty much all you need to know about a GRF file!