• C++实现二进制序列化/反序列化


    toc

    背景

    涉及对象传输或存储时,均需要序列化的参与,将对象转为连续的二进制数据后才能进行传输或存储,需要还原对象时,通过反序列化逆向处理二进制数据遍能得到原对象

    • 这里的对象是一种广泛的概念,往大了说就是一段有意义的内存

    实现

    • 实现过程中主要使用模板应对各种类型,特化+宏应对基础数据类型,逐层处理应对STL容器嵌套情况
    • C++语言本身不带反射,自定义结构需继承接口类Serializable,并实现相应接口
    • 实现时使用了std::move()进行构造与赋值,自定义结构最好实现自己的移动构造函数与移动赋值函数
    • 获取序列化buffer时,先计算序列化目标在序列化后的总字节数,一次性申请足够内存
    • 获取序列化buffer时,返回的buffer之前藏了序列化目标的总字节数,藏东西这招跟redis学的
    • 藏的目标总字节数目的用于,反序列化前简单判断待反序列化buffer是否有效
    • 未支持指针类型,传递错误类型的指针(强转目标类型与实际类型不符合)会导致内存错误,序列化过程中没法判断类型是否正确
    • 未支持C风格字符串,C风格字符串以0结尾,非二进制安全
    • 需要序列化一段连续内存时,构造std::string后传入
    • 反序列化时,调用者需自行保证,待反序列化buffer的原始类型与目标类型一致,代码不能识别出类型差异与结构改变
    • 使用方式 申请序列化buffer->序列化->三方库压缩->三方库加密->传输/落盘->接收/读取->三方库解密->三方库解压->简单检查->反序列化->释放buffer
    • 序列化后的数据格式如下
      • 先写入对象序列化后总长度,后面才接着写入对象序列化数据
        • 对象是基础类型则直接写入数据
        • 对象是STL容器先写入元素个数,再写入每个元素序列化数据
          • 如果元素是std::pairl类型,则先写入序列化Key,再写入序列化Value
        • 对象是自定义结构,则需用户定义数据格式

    总体结构:

    对象序列化后总长度 对象序列化数据

    对象是基础数据:

    对象序列化后总长度 对象序列化数据(charintshortlong.........)

    对象是STL容器:

    对象序列化后总长度 元素个数 元素1序列化数据 元素2序列化数据 元素3序列化数据 ..... 元素n序列化数据

    对象是STL容器之Map:

    对象序列化后总长度 元素个数 Pair 1之Key序列化数据 Pair 1之Value序列化数据 ..... Pair n之Key序列化数据 Pair n之Value序列化数据

    代码

    #ifndef _Serialization_H_
    #define _Serialization_H_
    
    #include <map>
    #include <set>
    #include <list>
    #include <string>
    #include <vector>
    
    //基础数据长度计算
    #define BASIC_DATA_LENGTH(BasicType)                                            
        template<>                                                                  
        static unsigned Length(const BasicType& elem){                              
            return sizeof(elem);                                                    
        }                                        
    
        //申请用于序列化的buffer         总格式    |数据总长度|数据|
    #define MAKE_BUFFER(uLen){                                                      
            char *pBuffer = new char[uLen + sizeof(LEN)];                           
            if(nullptr == pBuffer){                                                 
                return nullptr;                                                     
            }                                                                       
            *reinterpret_cast<LEN *const>(pBuffer) = uLen;                          
            return pBuffer + sizeof(LEN);                                           
        }
    
        //基础数据序列化buffer获取
    #define BASIC_DATA_GET_BUFFER(BasicType)                                        
        template<>                                                                  
        static char *const GetBuffer(const BasicType& elem){                        
            LEN uLen = sizeof(elem);                                                
            MAKE_BUFFER(uLen);                                                      
        }
    
        //基础数据序列化主体
    #define BASIC_DATA_SERIALIZE(BasicType)                                         
        template<>                                                                  
        static unsigned Serialize(const BasicType& elem, char *const pBuffer){      
            ::memcpy(pBuffer, &elem, sizeof(BasicType));                            
            return sizeof(BasicType);                                               
        }
    
        //基础数据反序列化主体
    #define BASIC_DATA_UNSERIALIZE(BasicType)                                       
        template<>                                                                  
        static unsigned UnSerialize(const char *const pBuffer, BasicType& elem){    
            ::memcpy(&elem, pBuffer, sizeof(BasicType));                            
            return sizeof(BasicType);                                               
        }
    
    namespace Serialization{
    
        //存放长度信息类型
        using LEN = unsigned;
    
        //自定义结构序列化基类
        class Serializable{
        public:
            virtual unsigned Length() const = 0;
            virtual unsigned Serialize(char *const pBuffer) const = 0;
            virtual unsigned UnSerialize(const char *const pBuffer) = 0;
            virtual ~Serializable(){};
        };
    
        /*
            前项声明部分,STL容器嵌套编译问题
        */
        template<typename Elem>
        unsigned Length(const std::list<Elem>& list);
        template<typename Key, typename Value>
        unsigned Length(const std::map<Key, Value>& map);
        template<typename Elem>
        unsigned Length(const std::vector<Elem>& vec);
        template<typename Elem>
        unsigned Length(const std::set<Elem>& set);
    
        template<typename Elem>
        char *const GetBuffer(const std::list<Elem>& list);
        template<typename Key, typename Value>
        char *const GetBuffer(const std::map<Key, Value>& map);
        template<typename Elem>
        char *const GetBuffer(const std::vector<Elem>& vec);
        template<typename Elem>
        char *const GetBuffer(const std::set<Elem>& set);
    
        template<typename Elem>
        unsigned Serialize(const std::list<Elem>& list, char *const pBuffer);
        template<typename Key, typename Value>
        unsigned Serialize(const std::map<Key, Value>& map, char *const pBuffer);
        template<typename Elem>
        unsigned Serialize(const std::vector<Elem>& vec, char *const pBuffer);
        template<typename Elem>
        unsigned Serialize(const std::set<Elem>& set, char *const pBuffer);
    
        template<typename Elem>
        unsigned UnSerialize(const char* pBuffer, std::list<Elem>& list);
        template<typename Key, typename Value>
        unsigned UnSerialize(const char* pBuffer, std::map<Key, Value>& map);
        template<typename Elem>
        unsigned UnSerialize(const char* pBuffer, std::vector<Elem>& vec);
        template<typename Elem>
        unsigned UnSerialize(const char* pBuffer, std::set<Elem>& set);
    
        /*
            长度计算部分
            得到数据长度
        */
        template<typename Elem>
        static unsigned Length(const Elem& elem){
            return elem.Length();
        }
    
        BASIC_DATA_LENGTH(char)
            BASIC_DATA_LENGTH(short)
            BASIC_DATA_LENGTH(int)
            BASIC_DATA_LENGTH(float)
            BASIC_DATA_LENGTH(long)
            BASIC_DATA_LENGTH(double)
            BASIC_DATA_LENGTH(long long)
            BASIC_DATA_LENGTH(unsigned char)
            BASIC_DATA_LENGTH(unsigned short)
            BASIC_DATA_LENGTH(unsigned int)
            BASIC_DATA_LENGTH(unsigned long)
            BASIC_DATA_LENGTH(unsigned long long)
    
            template<>
        static unsigned Length(const std::string& str){
            return sizeof(LEN) + static_cast<unsigned>(str.size());                //容器对象 格式 |元素个数|数据|
        }
    
        template<typename Elem>
        static unsigned Length(const std::list<Elem>& list){
            LEN uLen = 0;
            for(const auto& elem : list){
                uLen += Length(elem);
            }
            return sizeof(LEN) + uLen;
        }
    
        template<typename Key, typename Value>
        static unsigned Length(const std::map<Key, Value>& map){
            LEN uLen = 0;
            for(const auto& elem : map){
                uLen += Length(elem.first);
                uLen += Length(elem.second);
            }
            return sizeof(LEN) + uLen;
        }
    
        template<typename Elem>
        static unsigned Length(const std::vector<Elem>& vec){
            LEN uLen = 0;
            for(const auto& elem : vec){
                uLen += Length(elem);
            }
            return sizeof(LEN) + uLen;
        }
    
        template<typename Elem>
        static unsigned Length(const std::set<Elem>& set){
            LEN uLen = 0;
            for(const auto& elem : set){
                uLen += Length(elem);
            }
            return sizeof(LEN) + uLen;
        }
    
        /*
            buffer申请、释放部分
            一次性申请足够内存
        */
        template<typename Elem>
        static char *const GetBuffer(const Elem& elem){
            LEN uLen = elem.Length();
            MAKE_BUFFER(uLen);
        }
    
        BASIC_DATA_GET_BUFFER(char)
            BASIC_DATA_GET_BUFFER(short)
            BASIC_DATA_GET_BUFFER(int)
            BASIC_DATA_GET_BUFFER(float)
            BASIC_DATA_GET_BUFFER(long)
            BASIC_DATA_GET_BUFFER(double)
            BASIC_DATA_GET_BUFFER(long long)
            BASIC_DATA_GET_BUFFER(unsigned char)
            BASIC_DATA_GET_BUFFER(unsigned short)
            BASIC_DATA_GET_BUFFER(unsigned int)
            BASIC_DATA_GET_BUFFER(unsigned long)
            BASIC_DATA_GET_BUFFER(unsigned long long)
    
            template<>
        static char *const GetBuffer(const std::string& str){
            LEN uLen = Length(str);
            MAKE_BUFFER(uLen);
        }
    
        template<typename Elem>
        static char *const GetBuffer(const std::list<Elem>& list){
            LEN uLen = Length(list);
            MAKE_BUFFER(uLen);
        }
    
        template<typename Key, typename Value>
        static char *const GetBuffer(const std::map<Key, Value>& map){
            LEN uLen = Length(map);
            MAKE_BUFFER(uLen);
        }
    
        template<typename Elem>
        static char *const GetBuffer(const std::vector<Elem>& vec){
            LEN uLen = Length(vec);
            MAKE_BUFFER(uLen);
        }
    
        template<typename Elem>
        static char *const GetBuffer(const std::set<Elem>& set){
            LEN uLen = Length(set);
            MAKE_BUFFER(uLen);
        }
    
        static void ReleaseBuffer(char *const pBuffer){
            delete[] (pBuffer - sizeof(LEN));
        }
    
        /*
            序列化部分
        */
        template<typename Elem>
        static unsigned Serialize(const Elem& elem, char *const pBuffer){
            return elem.Serialize(pBuffer);
        }
    
        BASIC_DATA_SERIALIZE(char)
            BASIC_DATA_SERIALIZE(short)
            BASIC_DATA_SERIALIZE(int)
            BASIC_DATA_SERIALIZE(float)
            BASIC_DATA_SERIALIZE(long)
            BASIC_DATA_SERIALIZE(double)
            BASIC_DATA_SERIALIZE(long long)
            BASIC_DATA_SERIALIZE(unsigned char)
            BASIC_DATA_SERIALIZE(unsigned short)
            BASIC_DATA_SERIALIZE(unsigned int)
            BASIC_DATA_SERIALIZE(unsigned long)
            BASIC_DATA_SERIALIZE(unsigned long long)
    
            template<>
        static unsigned Serialize(const std::string& str, char* const pBuffer){
            *reinterpret_cast<LEN *const>(pBuffer) = static_cast<LEN>(str.size());            //元素个数
            ::memcpy(pBuffer + sizeof(LEN), str.data(), str.size());                        //数据
            return static_cast<unsigned>(str.size()) + sizeof(LEN);
        }
    
        template<typename Elem>
        static unsigned Serialize(const std::list<Elem>& list, char *const pBuffer){
            unsigned uPos = sizeof(LEN);
            *reinterpret_cast<LEN *const>(pBuffer) = static_cast<LEN>(list.size());            //元素个数
            for(const auto& elem : list){
                uPos += Serialize(elem, pBuffer + uPos);
            }
            return uPos;
        }
    
        template<typename Key, typename Value>
        static unsigned Serialize(const std::map<Key, Value>& map, char *const pBuffer){
            unsigned uPos = sizeof(LEN);
            *reinterpret_cast<LEN *const>(pBuffer) = static_cast<LEN>(map.size());            //元素个数
            for(const auto& elem : map){
                uPos += Serialize(elem.first, pBuffer + uPos);
                uPos += Serialize(elem.second, pBuffer + uPos);
            }
            return uPos;
        }
    
        template<typename Elem>
        static unsigned Serialize(const std::vector<Elem>& vec, char *const pBuffer){
            unsigned uPos = sizeof(LEN);
            *reinterpret_cast<LEN *const>(pBuffer) = static_cast<LEN>(vec.size());            //元素个数
            for(const auto& elem : vec){
                uPos += Serialize(elem, pBuffer + uPos);
            }
            return uPos;
        }
    
        template<typename Elem>
        static unsigned Serialize(const std::set<Elem>& set, char *const pBuffer){
            unsigned uPos = sizeof(LEN);
            *reinterpret_cast<LEN *const>(pBuffer) = static_cast<LEN>(set.size());            //元素个数
            for(const auto& elem : set){
                uPos += Serialize(elem, pBuffer + uPos);
            }
            return uPos;
        }
    
        /*
            检查反序列化之前的buffer
            BufferLen指包含数据总长度在内的长度,即从硬盘上读取并解压缩后的整个buffer长度
        */
        static int CheckLength(const char *const pBuffer, int BufferLen){
            if(*reinterpret_cast<const LEN *const>(pBuffer - sizeof(LEN)) + sizeof(LEN) != BufferLen){
                return 0;
            }
            return 1;
        }
    
        /*
            反序列化部分
        */
        template<typename Elem>
        static unsigned UnSerialize(const char* pBuffer, Elem& elem){
            return elem.UnSerialize(pBuffer);
        }
    
        BASIC_DATA_UNSERIALIZE(char)
            BASIC_DATA_UNSERIALIZE(short)
            BASIC_DATA_UNSERIALIZE(int)
            BASIC_DATA_UNSERIALIZE(float)
            BASIC_DATA_UNSERIALIZE(long)
            BASIC_DATA_UNSERIALIZE(double)
            BASIC_DATA_UNSERIALIZE(long long)
            BASIC_DATA_UNSERIALIZE(unsigned char)
            BASIC_DATA_UNSERIALIZE(unsigned short)
            BASIC_DATA_UNSERIALIZE(unsigned int)
            BASIC_DATA_UNSERIALIZE(unsigned long)
            BASIC_DATA_UNSERIALIZE(unsigned long long)
    
            template<>
        static unsigned UnSerialize(const char* pBuffer, std::string& str){
            LEN uLen = *reinterpret_cast<const LEN *>(pBuffer);
            std::string strTemp(pBuffer + sizeof(LEN), uLen);
            str = std::move(strTemp);
            return uLen + sizeof(LEN);
        }
    
        template<typename Elem>
        static unsigned UnSerialize(const char* pBuffer, std::list<Elem>& list){
            LEN uLen = *reinterpret_cast<const LEN *>(pBuffer);
            unsigned uPos = sizeof(LEN);
            for(LEN i = 0; i < uLen; i++){
                Elem elem;
                uPos += UnSerialize(pBuffer + uPos, elem);
                list.push_back(std::move(elem));
            }
            return uPos;
        }
    
        template<typename Key, typename Value>
        static unsigned UnSerialize(const char* pBuffer, std::map<Key, Value>& map){
            LEN uLen = *reinterpret_cast<const LEN *>(pBuffer);
            unsigned uPos = sizeof(LEN);
            for(LEN i = 0; i < uLen; i++){
                Key key;
                Value val;
                uPos += UnSerialize(pBuffer + uPos, key);
                uPos += UnSerialize(pBuffer + uPos, val);
                map.insert(std::make_pair(std::move(key), std::move(val)));
            }
            return uPos;
        }
    
        template<typename Elem>
        static unsigned UnSerialize(const char* pBuffer, std::vector<Elem>& vec){
            LEN uLen = *reinterpret_cast<const LEN *>(pBuffer);
            vec.resize(uLen);
            unsigned uPos = sizeof(LEN);
            for(LEN i = 0; i < uLen; i++){
                Elem elem;
                uPos += UnSerialize(pBuffer + uPos, elem);
                vec[i] = std::move(elem);
            }
            return uPos;
        }
    
        template<typename Elem>
        static unsigned UnSerialize(const char* pBuffer, std::set<Elem>& set){
            LEN uLen = *reinterpret_cast<const LEN *>(pBuffer);
            unsigned uPos = sizeof(LEN);
            for(LEN i = 0; i < uLen; i++){
                Elem elem;
                uPos += UnSerialize(pBuffer + uPos, elem);
                set.emplace(std::move(elem));
            }
            return uPos;
        }
    };
    #endif // !_Serialization_H_
    

    测试代码

    #include <cstdlib>
    #include <cassert>
    #include <iostream>
    
    #include "Serialization.h"
    
    template<typename T>
    void Test(const T& a){
        std::cout << Serialization::Length(a) << std::endl;
    
        auto pBuffer = Serialization::GetBuffer(a);
    
        auto len = Serialization::Serialize(a, pBuffer);
        std::cout << len << std::endl;
    
        assert(1 == Serialization::CheckLength(pBuffer, Serialization::Length(a) + sizeof(Serialization::LEN)));
    
        T b = T();
        len = Serialization::UnSerialize(pBuffer, b);
        std::cout << len << std::endl;
    
        Serialization::ReleaseBuffer(pBuffer);
        assert(a == b);
    }
    
    class MyTest : public Serialization::Serializable{
    public:
        MyTest() : m_i(0), m_c(0), m_d(0){}
    
        MyTest(int i, char c, double d, const std::string& str, const std::list<int>& list,
            const std::map<int, double>& map, const std::vector<int>& vec, const std::set<int>& set)
            :m_i(i), m_c(c), m_d(d), m_str(str), m_list(list), m_map(map), m_vec(vec), m_set(set){}
    
        MyTest(int i, char c, double d, std::string&& str, std::list<int>&& list,
            std::map<int, double>&& map, std::vector<int>&& vec, std::set<int>&& set)        //针对列表初始化
            :m_i(i), m_c(c), m_d(d), m_str(std::move(str)), m_list(std::move(list)), m_map(std::move(map)), m_vec(std::move(vec)), m_set(std::move(set)){}
    
        bool operator==(const MyTest& rhs) const{
            return m_i == rhs.m_i && m_c == rhs.m_c && m_d == rhs.m_d && m_str == rhs.m_str && m_list == rhs.m_list && m_map == rhs.m_map && m_vec == rhs.m_vec && m_set == rhs.m_set;
        }
    
        virtual unsigned Length() const override{
            return Serialization::Length(m_i) + Serialization::Length(m_c) + Serialization::Length(m_d) + Serialization::Length(m_str) + Serialization::Length(m_list)
                + Serialization::Length(m_map) + Serialization::Length(m_vec) + Serialization::Length(m_set);
        }
        virtual unsigned Serialize(char *const pBuffer) const override{
            unsigned uPos = 0;
            uPos += Serialization::Serialize(m_i, pBuffer + uPos);
            uPos += Serialization::Serialize(m_c, pBuffer + uPos);
            uPos += Serialization::Serialize(m_d, pBuffer + uPos);
            uPos += Serialization::Serialize(m_str, pBuffer + uPos);
            uPos += Serialization::Serialize(m_list, pBuffer + uPos);
            uPos += Serialization::Serialize(m_map, pBuffer + uPos);
            uPos += Serialization::Serialize(m_vec, pBuffer + uPos);
            uPos += Serialization::Serialize(m_set, pBuffer + uPos);
            return  uPos;
        }
        virtual unsigned UnSerialize(const char *const pBuffer)override{
            unsigned uPos = 0;
            uPos += Serialization::UnSerialize(pBuffer + uPos, m_i);
            uPos += Serialization::UnSerialize(pBuffer + uPos, m_c);
            uPos += Serialization::UnSerialize(pBuffer + uPos, m_d);
            uPos += Serialization::UnSerialize(pBuffer + uPos, m_str);
            uPos += Serialization::UnSerialize(pBuffer + uPos, m_list);
            uPos += Serialization::UnSerialize(pBuffer + uPos, m_map);
            uPos += Serialization::UnSerialize(pBuffer + uPos, m_vec);
            uPos += Serialization::UnSerialize(pBuffer + uPos, m_set);
            return uPos;
        }
    
    private:
        int m_i;
        char m_c;
        double m_d;
        std::string m_str;
        std::list<int> m_list;
        std::map<int, double> m_map;
        std::vector<int> m_vec;
        std::set<int> m_set;
    };
    
    
    
    int main(int argc, char** argv){
        //基础类型
        Test<char>('a');
        Test<short>(-1);
        Test<int>(-1);
        Test<float>(-1.0);
        Test<long>(-1);
        Test<double>(-1.0);
        Test<long long>(-1);
        Test<unsigned char>(1);
        Test<unsigned short>(1);
        Test<unsigned int>(1);
        Test<unsigned long>(1);
        Test<unsigned long long>(1);
    
        //STL容器
        Test<std::string>("Test!!!");
        Test<std::list<int>>({1, 2, 3, 4, 5, 6});
        Test<std::map<int, double>>({{1, 1.0}, {2, 2.0}, {3, 3.0}});
        Test<std::vector<int>>({1, 2, 3, 4, 5, 6});
        Test<std::set<int>>({1, 2, 3, 4, 5, 6});
    
        //STL容器嵌套
        Test<std::set<std::string>>({"test1", "test2", "test3"});
        Test<std::list<std::set<std::string>>>({{"test1", "test2", "test3"}, {"test1", "test2", "test3"}, {"test1", "test2", "test3"}});
        Test<std::map<int, std::list<std::set<std::string>>>>({{1, {{"test1", "test2", "test3"}, {"test1", "test2", "test3"}, {"test1", "test2", "test3"}}},
            {2, {{"test1", "test2", "test3"}, {"test1", "test2", "test3"}, {"test1", "test2", "test3"}}},
            {3, {{"test1", "test2", "test3"}, {"test1", "test2", "test3"}, {"test1", "test2", "test3"}}}});
        Test<std::vector<std::map<int, std::list<std::set<std::string>>>>>({{{1, {{"test1", "test2", "test3"}, {"test1", "test2", "test3"}, {"test1", "test2", "test3"}}},
            {2, {{"test1", "test2", "test3"}, {"test1", "test2", "test3"}, {"test1", "test2", "test3"}}},
            {3, {{"test1", "test2", "test3"}, {"test1", "test2", "test3"}, {"test1", "test2", "test3"}}}},
            {{1, {{"test1", "test2", "test3"}, {"test1", "test2", "test3"}, {"test1", "test2", "test3"}}},
            {2, {{"test1", "test2", "test3"}, {"test1", "test2", "test3"}, {"test1", "test2", "test3"}}},
            {3, {{"test1", "test2", "test3"}, {"test1", "test2", "test3"}, {"test1", "test2", "test3"}}}},
            {{1, {{"test1", "test2", "test3"}, {"test1", "test2", "test3"}, {"test1", "test2", "test3"}}},
            {2, {{"test1", "test2", "test3"}, {"test1", "test2", "test3"}, {"test1", "test2", "test3"}}},
            {3, {{"test1", "test2", "test3"}, {"test1", "test2", "test3"}, {"test1", "test2", "test3"}}}}});
    
        //自定义结构
        MyTest test(1, 'a', 1.0, "Test!!!", {1, 2, 3, 4, 5, 6}, {{1, 1.0}, {2, 2.0}, {3, 3.0}}, {1, 2, 3, 4, 5, 6}, {1, 2, 3, 4, 5, 6});
        Test<MyTest>(test);
    
        system("pause");
        return 0;
    }
    

    改进

    • 在序列化后总长度前方增加两个字段
      • 字段一:特定标识字符串,用于直接判断buffer是否是可序列化内存,如rdb文件开头就有"REDIS"
      • 字段二:增加校验和字段,存储序列化后数据的校验和,用于检查序列化数据是否被篡改,如网络协议上一般都会有此字段

    无法增加协议版本号,因为自定义结构的序列化、反序列化由用户控制,无法统一决定版本





    原创不易,转载请注明出处,谢谢
  • 相关阅读:
    ORACLE 11g RAC-RAC DG Duplicate 搭建(生产操作文档)
    1.kafka是什么
    11.扩展知识-redis持久化
    10.Redis-服务器命令
    9.扩展知识-redis批量操作-事务(了解)
    8.扩展知识-多数据库(了解)
    7.Redis扩展知识-消息订阅与发布(了解)
    K8S上部署ES集群报错
    ORM 常用字段和参数
    celery的使用
  • 原文地址:https://www.cnblogs.com/Keeping-Fit/p/14952742.html
Copyright © 2020-2023  润新知