1. Reference book: Introduction to Data Compression, 4th edition (《数据压缩导论(第4版)》), page 100, problems 5 and 6.
5. Given the probability model in Table 4-9, find the real-valued tag for the sequence a1a1a3a2a3a1.
Solution: From Table 4-9, P(a1) = 0.2, P(a2) = 0.3, P(a3) = 0.5.
Define the random variable X(ai) = i, so that X(a1) = 1, X(a2) = 2, X(a3) = 3. The sequence a1a1a3a2a3a1 then maps to 113231.
From the cumulative distribution function,
Fx(0) = 0, Fx(1) = P(a1) = 0.2, Fx(2) = P(a1) + P(a2) = 0.5, Fx(3) = P(a1) + P(a2) + P(a3) = 1.
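The cumulative values above are running sums of the symbol probabilities. A minimal sketch of that step in C follows; the function name `build_cdf` and the array layout (index 0 holds Fx(0) = 0) are assumptions made for illustration, not part of the textbook solution.

```c
#include <stddef.h>

/* Fill cdf[0..n] with running sums of p[0..n-1]:
   cdf[0] = 0 and cdf[i] = p[0] + ... + p[i-1], so cdf[i] plays the role of Fx(i). */
void build_cdf(const double *p, size_t n, double *cdf)
{
    cdf[0] = 0.0;
    for (size_t i = 0; i < n; i++) {
        cdf[i + 1] = cdf[i] + p[i];
    }
}
```

Calling it with p = {0.2, 0.3, 0.5} and n = 3 should reproduce the four values listed above, up to floating-point rounding.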
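Using the update formulas
u(k) = l(k-1) + (u(k-1) - l(k-1)) * Fx(x_k)
l(k) = l(k-1) + (u(k-1) - l(k-1)) * Fx(x_k - 1)
where x_k is the k-th element of the sequence, we can compute the upper and lower bounds of the interval. Initialize u(0) = 1, l(0) = 0.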
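A single application of the two update formulas might be sketched in C as follows. The helper name `update_interval` and the convention that `cdf[x]` stores Fx(x) with `cdf[0] = 0` are assumptions made for this illustration.

```c
/* Narrow the interval [*lo, *hi) to the sub-interval of symbol x (1-based).
   Both new bounds are computed from the OLD lower bound and width. */
void update_interval(double *lo, double *hi, const double *cdf, int x)
{
    double width  = *hi - *lo;
    double new_hi = *lo + width * cdf[x];      /* u(k) */
    double new_lo = *lo + width * cdf[x - 1];  /* l(k) */
    *lo = new_lo;
    *hi = new_hi;
}
```

Computing both bounds before overwriting either one matters: updating the lower bound first and then reusing it for the upper bound would apply the formula to the wrong interval.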
The first element of the sequence is 1:
u(1) = l(0) + (u(0) - l(0)) * Fx(1) = 0 + (1 - 0) * 0.2 = 0.2
l(1) = l(0) + (u(0) - l(0)) * Fx(0) = 0 + (1 - 0) * 0 = 0
so the tag lies in the interval [0, 0.2).
The second element of the sequence is 1:
u(2) = l(1) + (u(1) - l(1)) * Fx(1) = 0 + (0.2 - 0) * 0.2 = 0.04
l(2) = l(1) + (u(1) - l(1)) * Fx(0) = 0 + (0.2 - 0) * 0 = 0
so the tag lies in the interval [0, 0.04).
The third element of the sequence is 3:
u(3) = l(2) + (u(2) - l(2)) * Fx(3) = 0 + (0.04 - 0) * 1 = 0.04
l(3) = l(2) + (u(2) - l(2)) * Fx(2) = 0 + (0.04 - 0) * 0.5 = 0.02
so the tag lies in the interval [0.02, 0.04).
The fourth element of the sequence is 2:
u(4) = l(3) + (u(3) - l(3)) * Fx(2) = 0.02 + (0.04 - 0.02) * 0.5 = 0.03
l(4) = l(3) + (u(3) - l(3)) * Fx(1) = 0.02 + (0.04 - 0.02) * 0.2 = 0.024
so the tag lies in the interval [0.024, 0.03).
The fifth element of the sequence is 3:
u(5) = l(4) + (u(4) - l(4)) * Fx(3) = 0.024 + (0.03 - 0.024) * 1 = 0.03
l(5) = l(4) + (u(4) - l(4)) * Fx(2) = 0.024 + (0.03 - 0.024) * 0.5 = 0.027
so the tag lies in the interval [0.027, 0.03).
The sixth element of the sequence is 1:
u(6) = l(5) + (u(5) - l(5)) * Fx(1) = 0.027 + (0.03 - 0.027) * 0.2 = 0.0276
l(6) = l(5) + (u(5) - l(5)) * Fx(0) = 0.027 + (0.03 - 0.027) * 0 = 0.027
so the tag lies in the interval [0.027, 0.0276).
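The midpoint of the final interval is usually taken as the tag, so the tag for the sequence 113231 is
Tx(113231) = (0.027 + 0.0276) / 2 = 0.0273,
that is, the real-valued tag of the sequence a1a1a3a2a3a1 is 0.0273.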
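The six hand-computed steps can be cross-checked with a short loop. The following is a minimal sketch, assuming the symbols are given as 1-based integers and `cdf[x]` stores Fx(x) with `cdf[0] = 0`; the function name `sequence_tag` is hypothetical.

```c
#include <stddef.h>

/* Return the midpoint tag of the interval obtained by processing
   seq[0..len-1] (symbols numbered from 1) with the update formulas. */
double sequence_tag(const double *cdf, const int *seq, size_t len)
{
    double lo = 0.0, hi = 1.0;               /* l(0) = 0, u(0) = 1 */
    for (size_t k = 0; k < len; k++) {
        double width = hi - lo;              /* width of the previous interval */
        hi = lo + width * cdf[seq[k]];       /* u(k), uses the old lo */
        lo = lo + width * cdf[seq[k] - 1];   /* l(k), updated last */
    }
    return (lo + hi) / 2.0;
}
```

With cdf = {0, 0.2, 0.5, 1} and seq = {1, 1, 3, 2, 3, 1}, the result should agree with 0.0273 up to floating-point rounding.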
6. For the probability model in Table 4-9, decode a sequence of length 10 whose tag is 0.63215699.
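Solution: program code:

#include <stdio.h>

#define N 100   /* index 0 holds the initial interval, so at most N-1 symbols */

int main(void)
{
    /* F[j] = Fx(j) for Table 4-9: Fx(0)=0, Fx(1)=0.2, Fx(2)=0.5, Fx(3)=1 */
    const double F[4] = {0.0, 0.2, 0.5, 1.0};
    double l[N] = {0.0}, u[N] = {1.0};   /* l[0] = 0, u[0] = 1 */
    double T, tag;
    int n, j, M[N];

    printf("Enter the tag value: ");
    if (scanf("%lf", &tag) != 1 || tag < 0.0 || tag >= 1.0) return 1;
    printf("Enter the sequence length: ");
    if (scanf("%d", &n) != 1 || n < 1 || n >= N) return 1;

    for (int i = 1; i <= n; i++) {
        /* Rescale the tag into the current interval [l[i-1], u[i-1]). */
        T = (tag - l[i-1]) / (u[i-1] - l[i-1]);
        /* Pick the symbol j with F[j-1] <= T < F[j]. */
        j = 1;
        while (j < 3 && T >= F[j]) j++;
        M[i] = j;
        u[i] = l[i-1] + (u[i-1] - l[i-1]) * F[j];
        l[i] = l[i-1] + (u[i-1] - l[i-1]) * F[j-1];
    }

    /* Print the decoded symbol indices in order. */
    for (int i = 1; i <= n; i++) printf("%d", M[i]);
    printf("\n");
    return 0;
}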
Output:
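A hand trace of the decoding recurrences for the tag 0.63215699 and length 10 gives 3221213223 (an editor-side calculation, not a result quoted from the source). One way to check any decoded sequence is to re-encode it and confirm that the tag falls inside the final interval. The sketch below assumes 1-based symbols and `cdf[x]` = Fx(x); the function name `tag_in_sequence_interval` is hypothetical.

```c
#include <stddef.h>

/* Return 1 if tag lies in the interval [lo, hi) generated by seq[0..len-1],
   and 0 otherwise. Symbols are numbered from 1 and cdf[0] must be 0. */
int tag_in_sequence_interval(const double *cdf, const int *seq,
                             size_t len, double tag)
{
    double lo = 0.0, hi = 1.0;
    for (size_t k = 0; k < len; k++) {
        double width = hi - lo;
        hi = lo + width * cdf[seq[k]];       /* upper bound from the old lo */
        lo = lo + width * cdf[seq[k] - 1];   /* lower bound updated last */
    }
    return tag >= lo && tag < hi;
}
```

For the sequence above, the final interval is roughly [0.63215235, 0.6321645), which contains the tag. Changing the last symbol to 2 moves the interval entirely below the tag, so the check fails, as it should.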