• PTA 5-9 Huffman Codes (30)


    Problem: http://pta.patest.cn/pta/test/16/exam/4/question/671

    PTA - Data Structures and Algorithms (English) - 5-9

    In 1953, David A. Huffman published his paper "A Method for the Construction of Minimum-Redundancy Codes", and hence printed his name in the history of computer science. As a professor who gives the final exam problem on Huffman codes, I am encountering a big problem: the Huffman codes are NOT unique. For example, given a string "aaaxuaxz", we can observe that the frequencies of the characters 'a', 'x', 'u' and 'z' are 4, 2, 1 and 1, respectively. We may either encode the symbols as {'a'=0, 'x'=10, 'u'=110, 'z'=111}, or in another way as {'a'=1, 'x'=01, 'u'=001, 'z'=000}, both compress the string into 14 bits. Another set of code can be given as {'a'=0, 'x'=11, 'u'=100, 'z'=101}, but {'a'=0, 'x'=01, 'u'=011, 'z'=001} is NOT correct since "aaaxuaxz" and "aazuaxax" can both be decoded from the code 00001011001001. The students are submitting all kinds of codes, and I need a computer program to help me determine which ones are correct and which ones are not.

    Input Specification:

    Each input file contains one test case. For each case, the first line gives an integer N (2≤N≤63), then followed by a line that contains all the N distinct characters and their frequencies in the following format:

    c[1] f[1] c[2] f[2] ... c[N] f[N]

    where c[i] is a character chosen from {'0' - '9', 'a' - 'z', 'A' - 'Z', '_'}, and f[i] is the frequency of c[i] and is an integer no more than 1000. The next line gives a positive integer M (≤1000), then followed by M student submissions. Each student submission consists of N lines, each in the format:

    c[i] code[i]
    

    where c[i] is the i-th character and code[i] is a non-empty string of no more than 63 '0's and '1's.

    Output Specification:

    For each test case, print in each line either "Yes" if the student's submission is correct, or "No" if not.

    Note: The optimal solution is not necessarily generated by Huffman algorithm. Any prefix code with code length being optimal is considered correct.

    Sample Input:
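    7                             // number of nodes: num
    A 1 B 1 C 1 D 3 E 3 F 6 G 6   // each node's data and its frequency: weight
    4                             // number of submissions to check: checkNum
    A 00000                       // the next 4*7 lines give each node's character ch and its code s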
    B 00001
    C 0001
    D 001
    E 01
    F 10
    G 11
    A 01010
    B 01011
    C 0100
    D 011
    E 10
    F 11
    G 00
    A 000
    B 001
    C 010
    D 011
    E 100
    F 101
    G 110
    A 00000
    B 00001
    C 0001
    D 001
    E 00
    F 10
    G 11
    
    Sample Output:
    Yes
    Yes
    No
    No

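    Problem analysis:

    This problem tests Huffman coding, but the Huffman tree does not actually have to be built. As the problem statement puts it: "Note: The optimal solution is not necessarily generated by Huffman algorithm."

    - Input: the first line is the number of nodes num; the second line gives each node's data and its frequency weight; the third line is the number of submissions checkNum; from the fourth line on, each line gives a node's character ch and its code s.

    - Output: for each submission, print Yes if the codes form a valid optimal ("Huffman") encoding, otherwise print No.

    - A valid encoding must satisfy two conditions: ① its WPL (weighted path length) is minimal; ② no code is a prefix of any other code.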
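
    Condition ① compares the total encoded length of a submission (the sum of frequency × code length over all characters) with the optimum. A minimal sketch of that sum follows; the helper name encodedLength and its parameter types are assumptions made for illustration and are not part of the original solution.

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: total encoded length of one submission,
// i.e. the sum over all characters of frequency * code length.
int encodedLength(const std::map<char, int>& freq,
                  const std::vector<std::pair<char, std::string>>& codes)
{
    int total = 0;
    for (const auto& entry : codes) {
        // entry.first is the character, entry.second is its submitted code
        total += freq.at(entry.first) * static_cast<int>(entry.second.size());
    }
    return total;
}
```

    On the sample input, the first submission costs 53 bits in total and the third (every code three bits long) costs 63, which is why the third one is rejected.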

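    Solution adapted from: http://www.cnblogs.com/clevercong/p/4193370.html

    1) A map stores the input pairs, e.g. A 1 B 1 C 1 D 3 E 3 F 6 G 6: each node's data and its frequency (weight).

    2) Use the C++ standard library priority queue, priority_queue, from the header <queue>. It is implemented on top of a heap, so after each push the element with the highest priority (here, the smallest weight) is always at the top.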

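    #include <functional>  // greater
    #include <map>
    #include <queue>
    #include <vector>
    
    map<char, int> myMap;
    priority_queue<int, vector<int>, greater<int> > pq;  // min-heap
    
    for(int i=0; i<num; i++)  // read each node's data c[i] and weight f[i]
    {
        cin >> c[i] >> f[i];
        myMap[c[i]] = f[i];  // map the character to its weight
        pq.push(f[i]);  // add the weight to the queue
    }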
    

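    3) Compute the WPL: repeatedly take the two smallest elements out of the priority_queue, add them, and push the sum back until only one element is left.

    // compute the WPL
    int myWpl = 0;
    while(!pq.empty())
    {
        int myTop = pq.top();
        pq.pop();
        if(!pq.empty())
        {
            int myTop2 = pq.top();
            pq.pop();
            pq.push(myTop + myTop2);
            int m = myTop + myTop2;
            myWpl += m;  // adding m at every merge counts each leaf weight once per level, which equals path length * weight
        }
    }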
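
    Why summing the merged values gives the WPL: a leaf at depth d takes part in exactly d merges on its way up to the root, so its weight is added d times. The same computation can be sketched as a standalone function; the name optimalWpl is an assumption chosen for illustration.

```cpp
#include <functional>
#include <queue>
#include <vector>

// Hypothetical helper: minimum weighted path length for the given weights,
// obtained by summing the value of every Huffman merge.
int optimalWpl(const std::vector<int>& weights)
{
    // min-heap initialised with all weights
    std::priority_queue<int, std::vector<int>, std::greater<int>> pq(
        weights.begin(), weights.end());
    int wpl = 0;
    while (pq.size() > 1) {
        int a = pq.top(); pq.pop();
        int b = pq.top(); pq.pop();
        wpl += a + b;    // one merge: every leaf below it gains one level
        pq.push(a + b);
    }
    return wpl;
}
```

    For the sample weights 1 1 1 3 3 6 6 the merges produce 2, 3, 6, 9, 12 and 21, which add up to 53. For the weights 4 2 1 1 from the problem statement the result is 14, matching the 14 bits quoted there.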

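    4) The submitted codes need to be sorted by code length, so that each code only has to be tested as a possible prefix of the codes that come after it. A map is always ordered by its key and cannot be re-sorted with sort(), so we store pair values in a vector instead: this mimics the map and still works with the standard sort function.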

    #include <algorithm>  // sort()
    #include <string>
    #include <vector>
    typedef pair<char, string> PAIR;  // PAIR stands for pair<char, string> (the code is a string)
    
    // cmp(): custom ordering for sort()
    // here: order by code length, shortest first
    bool cmp(const PAIR& x, const PAIR& y)
    {
        return x.second.size() < y.second.size();
    }
    // vector + pair<,> mimics a map
    vector<PAIR> checkVec;
    checkVec.push_back(make_pair(ch, s));  // add an element to the vector
    sort(checkVec.begin(), checkVec.end(), cmp);  // sort by code length
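
    As a small illustration of what the sort does, the sketch below orders a few entries by code length using a lambda in place of cmp(). The alias CodeEntry and the function name sortedByCodeLength are assumptions made for this example; std::stable_sort is used only so that codes of equal length keep their input order.

```cpp
#include <algorithm>
#include <string>
#include <utility>
#include <vector>

using CodeEntry = std::pair<char, std::string>;  // hypothetical alias, same role as PAIR

// Returns a copy of the entries ordered from the shortest code to the longest.
std::vector<CodeEntry> sortedByCodeLength(std::vector<CodeEntry> entries)
{
    std::stable_sort(entries.begin(), entries.end(),
                     [](const CodeEntry& x, const CodeEntry& y) {
                         return x.second.size() < y.second.size();
                     });
    return entries;
}
```

    Given the entries A 00000, E 01, D 001, F 10, the result is E 01, F 10, D 001, A 00000.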

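    5) Prefix check: use substr() to take the leading part of each later (longer or equally long) code and compare it with the current code.

    bool flag = true;  // condition ① (minimal WPL) already holds at this point
    for(int i=0; i<num; i++)
    {
        string tmp = checkVec[i].second;
        for(int j=i+1; j<num; j++)
        {
            if(checkVec[j].second.substr(0,tmp.size())==tmp)
                flag = false;  // tmp is a prefix of a later code
        }
    }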
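
    An alternative to comparing every pair, shown here only as a sketch and not as the original author's method: sort the codes lexicographically, after which a code that is a prefix of any other code is necessarily a prefix of its immediate successor, so only adjacent pairs need to be compared. The function name isPrefixFree is an assumption for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: true if no code is a prefix of (or equal to) another code.
bool isPrefixFree(std::vector<std::string> codes)
{
    std::sort(codes.begin(), codes.end());  // lexicographic order
    for (std::size_t i = 0; i + 1 < codes.size(); ++i) {
        const std::string& cur = codes[i];
        const std::string& next = codes[i + 1];
        // compare the first cur.size() characters of next with cur
        if (next.compare(0, cur.size(), cur) == 0)
            return false;
    }
    return true;
}
```

    With N ≤ 63 the pairwise loop above is already fast enough; the sorted variant mainly reduces the number of comparisons. On the sample, the first two submissions pass this check and the fourth fails because 00 is a prefix of 00000.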
    

    Complete code:

    #include <iostream>
    #include <algorithm>  // sort()
    #include <functional>  // greater
    #include <map>
    #include <queue>
    #include <string>
    #include <vector>
    using namespace std;
    
    typedef pair<char, string> PAIR;  // used with vector to mimic a map
    
    bool cmp(const PAIR& x, const PAIR& y)  // tells sort() to order by code length
    {
        return x.second.size() < y.second.size();
    }
    
    int main()
    {
        int num;
        cin >> num;
        vector<char> c(num);  // node data
        vector<int> f(num);   // node weights
        map<char, int> myMap;  // maps each node's data to its weight
        // use a priority queue as a min-heap
        priority_queue<int, vector<int>, greater<int> >pq;
    
        for(int i=0; i<num; i++)  // read each node and its frequency (weight)
        {
            cin >> c[i] >> f[i];
            myMap[c[i]] = f[i];
            pq.push(f[i]);  // push the weight into the priority queue
        }
        // compute the WPL
        int myWpl = 0;
        while(!pq.empty())
        {
            int myTop = pq.top();
            pq.pop();
            if(!pq.empty())
            {
                int myTop2 = pq.top();
                pq.pop();
                pq.push(myTop + myTop2);
                int m = myTop + myTop2;
                myWpl += m;
            }
        }
        // read the submissions
        int checkNum;  // number of submissions
        cin >> checkNum;
        for(int i=0; i<checkNum; i++)
        {
            int wpl = 0;
            char ch;
            string s;
            // vector + PAIR mimics a map and can be sorted
            vector<PAIR> checkVec;
            for(int j=0; j<num; j++)
            {
                cin >> ch >> s;
                checkVec.push_back(make_pair(ch, s));  // store the character and its submitted code
                wpl += s.size() * myMap[ch];
            }
            sort(checkVec.begin(), checkVec.end(), cmp);  // sort by code length
            if(wpl != myWpl)
            {
                cout << "No" << endl;
                continue;
            }
            else
            {
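                bool flag = true;  // condition ① already holds: the WPL is minimal (wpl == myWpl)
    
                // condition ②: no code may be a prefix of another code, checked with substr()
                for(int p=0; p<num; p++)  // p, q avoid shadowing the outer loop variable i
                {
                    string tmp = checkVec[p].second;
                    for(int q=p+1; q<num; q++)
                    {
                        if(checkVec[q].second.substr(0,tmp.size())==tmp)
                            flag = false;
                    }
                }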
                cout << (flag ? "Yes" : "No") << endl;
            }
        }
        return 0;
    }
    
  • Original article: https://www.cnblogs.com/claremore/p/4826473.html