        using the Length Shannon-Fano tree, read and decode
          the Length value.
        Length <- Length + Minimum Match Length
        if Length = 63 + Minimum Match Length
            read 8 bits from the input stream,
            add this value to Length.
        move backwards Distance+1 bytes in the output stream, and
        copy Length characters from this position to the output
        stream.  (If this position is before the start of the output
        stream, then assume that all the data before the start of
        the output stream is filled with zeros.)
end loop
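
The copy step above can be sketched in Python as follows. This is only
an illustration: the function name copy_match and the use of a
bytearray for the output stream are assumptions made for the example,
not part of the format. Copying one byte at a time lets a match overlap
the bytes it is producing.

```python
def copy_match(out: bytearray, distance: int, length: int) -> None:
    """Append `length` bytes taken from `distance + 1` bytes back in `out`.

    Hypothetical helper: positions before the start of the output are
    treated as zero bytes, as the text above requires.
    """
    src = len(out) - (distance + 1)
    for _ in range(length):
        # Read from the output itself; fall back to zero before its start.
        out.append(out[src] if src >= 0 else 0)
        src += 1


window = bytearray(b"ab")
copy_match(window, 1, 4)   # go back 2 bytes, copy 4: the match overlaps itself
print(bytes(window))       # b'ababab'
```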
VIII. Tokenizing - Method 7
---------------------------
This method is not used by PKZIP.
IX. Deflating - Method 8
------------------------
The Deflate algorithm is similar to the Implode algorithm: it uses
a sliding dictionary of up to 32K, with secondary compression
from Huffman/Shannon-Fano codes.
The compressed data is stored in blocks with a header describing
the block and the Huffman codes used in the data block. The header
format is as follows:
   Bit 0: Last Block bit     This bit is set to 1 if this is the last
                             compressed block in the data.
   Bits 1-2: Block type
      00 (0) - Block is stored - All stored data is byte aligned.
               Skip bits until the next byte boundary; the next word
               is the block length, followed by the one's complement
               of the block length word. The remaining data in the
               block is the stored data.

      01 (1) - Use fixed Huffman codes for literal and distance codes.
               Lit Code    Bits             Dist Code   Bits
               ---------   ----             ---------   ----
                 0 - 143    8                 0 - 31      5
               144 - 255    9
               256 - 279    7
               280 - 287    8

               Literal codes 286-287 and distance codes 30-31 are
               never used but participate in the Huffman construction.

      10 (2) - Dynamic Huffman codes.  (See Expanding Huffman Codes)

      11 (3) - Reserved - Flag an "Error in compressed data" if seen.
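
As a rough illustration of the header layout above, the following
Python sketch reads the three header bits and, for block type 0, the
stored data. It assumes that bits are consumed starting from the
least-significant bit of each byte and that a "word" is 16 bits stored
low byte first (as in RFC 1951). The names BitReader,
read_block_header and read_stored_block are hypothetical.

```python
class BitReader:
    """Hypothetical bit reader; assumes least-significant-bit-first order."""

    def __init__(self, data: bytes) -> None:
        self.data = data
        self.pos = 0  # position in bits from the start of `data`

    def bits(self, n: int) -> int:
        value = 0
        for i in range(n):
            byte = self.data[self.pos >> 3]
            value |= ((byte >> (self.pos & 7)) & 1) << i
            self.pos += 1
        return value

    def align(self) -> None:
        # Skip bits until the next byte boundary.
        self.pos = (self.pos + 7) & ~7


def read_block_header(br: BitReader):
    last = br.bits(1)        # Bit 0: Last Block bit
    block_type = br.bits(2)  # Bits 1-2: Block type
    if block_type == 3:
        raise ValueError("Error in compressed data")
    return bool(last), block_type


def read_stored_block(br: BitReader) -> bytes:
    br.align()
    length = br.bits(16)
    check = br.bits(16)
    # The second word must be the one's complement of the first.
    if length ^ check != 0xFFFF:
        raise ValueError("Error in compressed data")
    start = br.pos >> 3
    br.pos += 8 * length
    return br.data[start:start + length]


br = BitReader(bytes([0x01, 0x03, 0x00, 0xFC, 0xFF]) + b"abc")
print(read_block_header(br))  # (True, 0)
print(read_stored_block(br))  # b'abc'
```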
Expanding Huffman Codes
-----------------------
If the data block is stored with dynamic Huffman codes, the Huffman
codes are sent in the following compressed format:
   5 Bits: # of Literal codes sent - 257 (257 - 286)
           All other codes are never sent.
   5 Bits: # of Dist codes - 1           (1 - 32)
   4 Bits: # of Bit Length codes - 4     (4 - 19)
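
A minimal sketch of reading these three counts is shown below. It
assumes the offsets used by RFC 1951 (257, 1 and 4) and a
least-significant-bit-first bit order. The helper names
make_bit_source and read_code_counts are hypothetical.

```python
def make_bit_source(data: bytes):
    """Return bits(n), which reads n bits, least-significant bit first.

    Hypothetical helper; the bit order is an assumption of this sketch.
    """
    pos = 0

    def bits(n: int) -> int:
        nonlocal pos
        value = 0
        for i in range(n):
            value |= ((data[pos >> 3] >> (pos & 7)) & 1) << i
            pos += 1
        return value

    return bits


def read_code_counts(bits):
    n_literal = bits(5) + 257    # 257 - 286 literal codes
    n_distance = bits(5) + 1     # 1 - 32 distance codes
    n_bit_length = bits(4) + 4   # 4 - 19 bit length codes
    return n_literal, n_distance, n_bit_length


print(read_code_counts(make_bit_source(b"\x00\x00")))  # (257, 1, 4)
```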
The Huffman codes are sent as bit lengths and the codes are built as
described in the Implode algorithm. The bit lengths themselves are
compressed with Huffman codes. There are 19 bit length codes:
   0 - 15: Represent bit lengths of 0 - 15
       16: Copy the previous bit length 3 - 6 times.
           The next 2 bits indicate repeat length (0 = 3, ..., 3 = 6)
              Example:  Codes 8, 16 (+2 bits 11), 16 (+2 bits 10) will
                        expand to 12 bit lengths of 8 (1 + 6 + 5)
       17: Repeat a bit length of 0 for 3 - 10 times.
           (3 bits of length)
       18: Repeat a bit length of 0 for 11 - 138 times.
           (7 bits of length)
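
The repeat codes above can be sketched as follows. For simplicity this
example assumes the bit length codes have already been decoded into
(code, extra) pairs, where extra is the value of the extra bits that
follow codes 16, 17 and 18 (and is ignored for codes 0 - 15). The
function name expand_bit_lengths is hypothetical.

```python
def expand_bit_lengths(tokens):
    """Expand (code, extra) pairs into a flat list of bit lengths."""
    lengths = []
    for code, extra in tokens:
        if 0 <= code <= 15:
            lengths.append(code)                 # literal bit length
        elif code == 16:
            if not lengths:
                raise ValueError("code 16 with no previous bit length")
            lengths.extend([lengths[-1]] * (3 + extra))   # 3 - 6 copies
        elif code == 17:
            lengths.extend([0] * (3 + extra))    # 3 - 10 zeros
        elif code == 18:
            lengths.extend([0] * (11 + extra))   # 11 - 138 zeros
        else:
            raise ValueError("invalid bit length code")
    return lengths


# The example from the text: 8, 16 (+2 bits 11), 16 (+2 bits 10)
print(expand_bit_lengths([(8, 0), (16, 0b11), (16, 0b10)]))
# [8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8]
```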
The lengths of the bit length codes are sent packed 3 bits per value
(0 - 7) in the following order:
16, 17, 18, 0, 8, 7, 9, 6, 10, 5, 11, 4, 12, 3, 13, 2, 14, 1, 15
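
The following sketch shows one way to place the 3-bit values into a
table indexed by bit length code. It assumes the values have already
been read from the input, and that any code whose length was not sent
has length zero (as in RFC 1951). The names ORDER and
place_bit_length_code_lengths are hypothetical.

```python
# Order in which the lengths of the bit length codes are sent.
ORDER = (16, 17, 18, 0, 8, 7, 9, 6, 10, 5, 11, 4, 12, 3, 13, 2, 14, 1, 15)


def place_bit_length_code_lengths(sent):
    """Map 3-bit values, in the order sent, to a table indexed by code."""
    if len(sent) > len(ORDER):
        raise ValueError("at most 19 values can be sent")
    lengths = [0] * 19   # assumption: codes that are not sent get length 0
    for code, value in zip(ORDER, sent):
        lengths[code] = value
    return lengths


table = place_bit_length_code_lengths([3, 4, 5, 2])
print(table[16], table[17], table[18], table[0])  # 3 4 5 2
```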
The Huffman codes should be built as described in the Implode algorithm
except codes are assigned starting at the shortest bit length, i.e. the
shortest code should be all 0's rather than all 1's. Also, codes with
a bit length of zero do not participate in the tree construction. The
codes are then used to decode the bit lengths for the literal and
distance tables.
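
A minimal sketch of this construction is given below. It follows the
code assignment described in RFC 1951, which is assumed here to match
the rule above: shorter codes come first, the shortest code is all
0's, codes of equal length are assigned in increasing order of the
value they represent, and zero-length entries are skipped. The
function name build_codes is hypothetical, and codes are returned as
bit strings only to keep the example readable.

```python
def build_codes(lengths):
    """Return {value: code as a bit string} for a list of bit lengths."""
    max_len = max(lengths, default=0)

    # Count how many codes there are of each non-zero bit length.
    count = [0] * (max_len + 1)
    for n in lengths:
        if n:
            count[n] += 1

    # Find the numerically smallest code for each bit length.
    next_code = [0] * (max_len + 1)
    code = 0
    for n in range(1, max_len + 1):
        code = (code + count[n - 1]) << 1
        next_code[n] = code

    # Assign consecutive codes to values of the same bit length.
    codes = {}
    for value, n in enumerate(lengths):
        if n:
            codes[value] = format(next_code[n], "0{}b".format(n))
            next_code[n] += 1
    return codes


print(build_codes([3, 3, 3, 3, 3, 2, 4, 4]))
# {0: '010', 1: '011', 2: '100', 3: '101', 4: '110', 5: '00',
#  6: '1110', 7: '1111'}
```

Applied to the fixed literal bit lengths listed earlier (8 bits for
0 - 143, 9 for 144 - 255, 7 for 256 - 279, 8 for 280 - 287), the same
function assigns 0000000 to code 256 and 00110000 to code 0.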