欢迎来到蓝梦软件下载中心!
免责声明:本站软件仅用于恢复和销毁存储介质数据,如果涉及个人隐私等问题,请使用者自行承担,使用软件默认同意本声明!
Q Q:1731278955
传真:0510-82737376
手机:13400027332
E-mail:1731278955@qq.com

技术文章
您所在的位置:首页 > 技术文章 >

ZIP压缩文件数据结构解析十九



作者: 来源: 日期:2018/1/7 7:22:41 人气:4 

For compression factor 4:

    L(X) equals the lower 4 bits of X.

    F(X) equals 2 if X equals 15 otherwise F(X) equals 3.

    D(X,Y) equals the (upper 4 bits of X) * 256 + Y + 1.

VII. Imploding - Method 6

The Imploding algorithm is actually a combination of two distinct

algorithms.  The first algorithm compresses repeated byte

sequences using a sliding dictionary.  The second algorithm is

used to compress the encoding of the sliding dictionary output,

using multiple Shannon-Fano trees.

The Imploding algorithm can use a 4K or 8K sliding dictionary

size. The dictionary size used can be determined by bit 1 in the

general purpose flag word; a 0 bit indicates a 4K dictionary

while a 1 bit indicates an 8K dictionary.

The Shannon-Fano trees are stored at the start of the compressed

file. The number of trees stored is defined by bit 2 in the

general purpose flag word; a 0 bit indicates two trees stored, a

1 bit indicates three trees are stored.  If 3 trees are stored,

the first Shannon-Fano tree represents the encoding of the

Literal characters, the second tree represents the encoding of

the Length information, the third represents the encoding of the

Distance information.  When 2 Shannon-Fano trees are stored, the

Length tree is stored first, followed by the Distance tree.

The Literal Shannon-Fano tree, if present is used to represent

the entire ASCII character set, and contains 256 values.  This

tree is used to compress any data not compressed by the sliding

dictionary algorithm.  When this tree is present, the Minimum

Match Length for the sliding dictionary is 3.  If this tree is

not present, the Minimum Match Length is 2.

The Length Shannon-Fano tree is used to compress the Length part

of the (length,distance) pairs from the sliding dictionary

output.  The Length tree contains 64 values, ranging from the

Minimum Match Length, to 63 plus the Minimum Match Length.

The Distance Shannon-Fano tree is used to compress the Distance

part of the (length,distance) pairs from the sliding dictionary

output. The Distance tree contains 64 values, ranging from 0 to

63, representing the upper 6 bits of the distance value.  The

distance values themselves will be between 0 and the sliding

dictionary size, either 4K or 8K.

The Shannon-Fano trees themselves are stored in a compressed

format. The first byte of the tree data represents the number of

bytes of data representing the (compressed) Shannon-Fano tree

minus 1.  The remaining bytes represent the Shannon-Fano tree

data encoded as:

    High 4 bits: Number of values at this bit length + 1. (1 - 16)

    Low  4 bits: Bit Length needed to represent value + 1. (1 - 16)

The Shannon-Fano codes can be constructed from the bit lengths

using the following algorithm:

1)  Sort the Bit Lengths in ascending order, while retaining the

    order of the original lengths stored in the file.

2)  Generate the Shannon-Fano trees:

    Code <- 0

    CodeIncrement <- 0

    LastBitLength <- 0

    i <- number of Shannon-Fano codes - 1   (either 255 or 63)

    loop while i >= 0

        Code = Code + CodeIncrement

        if BitLength(i) <> LastBitLength then

            LastBitLength=BitLength(i)

            CodeIncrement = 1 shifted left (16 - LastBitLength)

        ShannonCode(i) = Code

        i <- i - 1

    end loop