Image Processing Course Project: Huffman Coding

Tags: info, coding course project, index, len, length, vector, codeword, Huffman

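Huffman coding is a form of variable-length coding (VLC). David A. Huffman proposed the method in 1952: it builds prefix-free codewords with the shortest average length purely from the probability with which each symbol occurs. For that reason it is sometimes called optimal coding, but it is usually just called Huffman coding. The code is derived from a Huffman tree, that is, an optimal binary tree with the minimum weighted path length, and it is widely used in data compression.

In computer information processing, Huffman coding is an entropy coding method used for lossless data compression. The term refers to encoding source symbols (for example, a symbol in a file) with a special code table. What makes the table special is that it is built from the estimated probability of each source symbol: symbols with a high probability get shorter codes and symbols with a low probability get longer codes, which lowers the expected length of the encoded string and so compresses the data without loss.

For example, in English text the letter e occurs very frequently, while z has the lowest frequency. When a piece of English text is compressed with Huffman coding, e may well be represented by a single bit, whereas z could take as many as 25 bits (not 26). In an ordinary representation every English letter occupies one byte, i.e. 8 bits. By comparison, e then uses 1/8 of the ordinary length, while z uses more than 3 times as much. If the probabilities of the individual letters can be estimated reasonably accurately, the lossless compression ratio can be improved substantially.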
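The idea that frequent symbols receive short codes can be checked on a toy source. The following is a minimal sketch, not the course code itself: the four probabilities in `p` and the variable names (`groups`, `codelen`, `avglen`, `H`) are assumptions chosen for illustration. It repeatedly merges the two lightest nodes, as a Huffman tree does, and only tracks how deep each symbol ends up.

```matlab
% Toy source: four symbols with assumed (illustrative) probabilities.
p = [0.5 0.25 0.125 0.125];
n = numel(p);

% Each group lists the symbol indices that sit below one tree node.
groups = num2cell(1:n);
w = p;                   % current node weights
codelen = zeros(1, n);   % code length (tree depth) of each symbol

while numel(w) > 1
    [w, order] = sort(w);                    % the two lightest nodes come first
    groups = groups(order);
    merged = [groups{1} groups{2}];
    codelen(merged) = codelen(merged) + 1;   % symbols below the new node move one level down
    w = [w(1) + w(2), w(3:end)];
    groups = [{merged}, groups(3:end)];
end

avglen = sum(p .* codelen);   % expected number of bits per symbol
H = -sum(p .* log2(p));       % source entropy in bits per symbol
fprintf('code lengths: %s\n', mat2str(codelen));
fprintf('average length %.3f bits, entropy %.3f bits\n', avglen, H);
```

For this particular distribution every probability is a power of 1/2, so the average code length equals the entropy; for general probabilities the Huffman average length lies between the entropy and the entropy plus one bit.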

clear

load woman; %load the image data

%X=imread('girl.bmp','bmp');

data=uint8(X);

[zipped,info]=huffencode(data);

%call the Huffman encoder to compress the data

unzipped=huffdecode(zipped,info,data);

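%call the Huffman decoder to decompress the data

%show the original and the decoded image, print the compression ratio, and compute the root-mean-square error; erms=0 confirms that Huffman coding is lossless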

subplot(121);imshow(data);

subplot(122);imshow(unzipped);

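%compare() is not included in this listing, so the RMS error is computed inline

erms=sqrt(mean((double(data(:))-double(unzipped(:))).^2))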

cr=info.ratio

whos data unzipped zipped

%huffencode applies Huffman coding to the input matrix vector and returns

%the encoded vector (the compressed data) together with the related information

function [zipped,info]=huffencode(vector)

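%input and output are both uint8

%info holds the structure fields needed for decoding

%info.pad is the number of padding bits appended

%info.huffcodes holds the Huffman codewords

%info.ratio is the compressed size divided by the original size

%info.rows is the number of rows of the original image

%info.cols is the number of columns of the original image

%info.length is the length of the original image data

%info.maxcodelen is the maximum codeword length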

if ~isa(vector,'uint8')

error('input argument must be a uint8 vector');

end

[m,n]=size(vector);

vector=vector(:)';

f=frequency(vector); %relative frequency (probability) of each symbol

symbols=find(f~=0);

f=f(symbols);

[f,sortindex]=sort(f);

%sort the symbols by ascending probability

symbols=symbols(sortindex);

len=length(symbols);

symbols_index=num2cell(1:len);

codeword_tmp=cell(len,1);

while length(f)>1 %build the Huffman tree to obtain the codeword table

index1=symbols_index{1};

index2=symbols_index{2};

codeword_tmp(index1)=addnode(codeword_tmp(index1),uint8(0));

codeword_tmp(index2)=addnode(codeword_tmp(index2),uint8(1));

f=[sum(f(1:2)) f(3:end)];

symbols_index=[{[index1,index2]} symbols_index(3:end)];

[f,sortindex]=sort(f);

symbols_index=symbols_index(sortindex);

end

codeword=cell(256,1);

codeword(symbols)=codeword_tmp;

len=0;

for index=1:length(vector) %count the total number of bits for the whole image

len=len+length(codeword{double(vector(index))+1});

end

string=repmat(uint8(0),1,len);

pointer=1;

for index=1:length(vector) %encode the input image

code=codeword{double(vector(index))+1};

len=length(code);

string(pointer+(0:len-1))=code;

pointer=pointer+len;

end

len=length(string);

pad=mod(8-mod(len,8),8);

%when the length is not a multiple of 8, append pad zeros at the end (pad is 0 if it already is)

if pad>0

string=[string uint8(zeros(1,pad))];

end

codeword=codeword(symbols);

codelen=zeros(size(codeword));

weights=2.^(0:23);

maxcodelen=0;

for index=1:length(codeword)

len=length(codeword{index});

if len>maxcodelen

maxcodelen=len;

end

if len>0

code=sum(weights(codeword{index}==1));

code=bitset(code,len+1);

codeword{index}=code;

codelen(index)=len;

end

end

codeword=[codeword{:}];

%compute the compressed vector

cols=length(string)/8;

string=reshape(string,8,cols);

weights=2.^(0:7);

zipped=uint8(weights*double(string));

%store the code table in a sparse matrix

huffcodes=sparse(1,1);

for index=1:nnz(codeword)

huffcodes(codeword(index),1)=symbols(index);

end

%fill in the structure fields needed for decoding

info.pad=pad;

info.huffcodes=huffcodes;

info.ratio=cols./length(vector);

info.length=length(vector);

info.maxcodelen=maxcodelen;

info.rows=m;

info.cols=n;

%huffdecode applies Huffman decoding to the compressed vector zipped and

%returns the decompressed image data

function vector=huffdecode(zipped,info,image)

if ~isa(zipped,'uint8')

error('input argument must be a uint8 vector');

end

%expand into a 0/1 sequence, one bit per byte

len=length(zipped);

string=repmat(uint8(0),1,len.*8);

bitindex=1:8;

for index=1:len

string(bitindex+8.*(index-1))=uint8(bitget(zipped(index),bitindex));

end

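string=logical(string(:)');

len=length(string);

%drop the info.pad padding bits appended by the encoder

string((len-info.pad+1):end)=[];

len=length(string);

%start decoding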

weights=2.^(0:51);

vector=repmat(uint8(0),1,info.length);

vectorindex=1;

codeindex=1;

code=0;

for index=1:len

code=bitset(code,codeindex,string(index));

codeindex=codeindex+1;

byte=decode(bitset(code,codeindex),info);

if byte>0

vector(vectorindex)=byte-1;

codeindex=1;

code=0;

vectorindex=vectorindex+1;

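end

end

%restore the original image shape; any symbols decoded from padding bits are cut off

vector=reshape(vector(1:info.length),info.rows,info.cols);

%addnode prepends the bit item to every codeword of a node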

function codeword_new=addnode(codeword_old,item)

codeword_new=cell(size(codeword_old));

for index=1:length(codeword_old)

codeword_new{index}=[item codeword_old{index}];

end

%frequency computes the probability of each symbol

function f=frequency(vector)

if ~isa(vector,'uint8')

error('input argument must be a uint8 vector');

end

f=repmat(0,1,256);

len=length(vector);

for index=0:255

f(index+1)=sum(vector==uint8(index));

end

f=f./len;

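%decode returns the entry stored for a codeword: the symbol value plus 1, or 0 if code is not a codeword

function byte=decode(code,info)

byte=full(info.huffcodes(code)); %full() turns the sparse element into an ordinary scalar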
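The encoder packs the bit stream into bytes with the weights `2.^(0:7)`, and the decoder reads the bits back with `bitget`. The round trip can be tried in isolation with the sketch below. The bit pattern in `bits` and the names `padded`, `packed`, `unpacked` and `restored` are illustrative assumptions and do not appear in the listing above.

```matlab
% Example bit stream (arbitrary illustrative values), one bit per element.
bits = uint8([1 0 1 1 0 0 1 0 1 1 1]);

% Pad with zeros up to the next multiple of 8; pad is 0 when already aligned.
pad = mod(-numel(bits), 8);
padded = [bits zeros(1, pad, 'uint8')];

% Pack: each column of the 8-by-N matrix becomes one byte, least significant bit first.
weights = 2.^(0:7);
packed = uint8(weights * double(reshape(padded, 8, [])));

% Unpack: read the 8 bits of every byte back, again least significant bit first.
unpacked = zeros(1, 8 * numel(packed), 'uint8');
for k = 1:numel(packed)
    unpacked(8*(k-1) + (1:8)) = bitget(packed(k), 1:8);
end

% Drop the padding bits to recover the original stream.
restored = unpacked(1:end-pad);
disp(isequal(restored, bits));
```

Removing the padding before decoding matters: trailing zero bits can themselves form valid codewords, which would otherwise show up as extra symbols at the end of the decoded data.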

From： https://blog.51cto.com/u_15815923/5743805
