• How to serialize data with Avro using Maven


    Add the Avro dependency to Maven:

    <dependency>
        <groupId>org.apache.avro</groupId>
        <artifactId>avro</artifactId>
        <version>1.7.7</version>
    </dependency>
    Add the Avro build plugin to Maven:

    <build>
        <plugins>
            <plugin>
                <groupId>org.apache.avro</groupId>
                <artifactId>avro-maven-plugin</artifactId>
                <version>1.7.7</version>
                <executions>
                    <execution>
                        <phase>generate-sources</phase>
                        <goals>
                            <goal>schema</goal>
                            <goal>protocol</goal>
                            <goal>idl-protocol</goal>
                        </goals>
                        <configuration>
                            <!-- Source directory holding the Avro schema and protocol files. Without the configuration below, the plugin looks for .avsc files under src/main/avro by default and writes the generated Java files to target/generated-sources/avro. -->
                            <sourceDirectory>${project.basedir}/src/main/avro/</sourceDirectory>
                            <outputDirectory>${project.basedir}/src/main/java/</outputDirectory>
                        </configuration>
                    </execution>
                </executions>
            </plugin>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-compiler-plugin</artifactId>
                <configuration>
                    <source>1.7</source>
                    <target>1.7</target>
                </configuration>
            </plugin>
        </plugins>
    </build>
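    Place the JSON schema file (with the .avsc extension) under ${project.basedir}/src/main/avro/; this is the data schema. Note that "default":"null" below is the four-character string "null", not a JSON null.
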
    {
       "namespace":"user_machine_learning",
       "type":"record",
       "name":"product",
       "fields":[
              {"name":"product_id","type":"string","default":"null"},
              {"name":"company_name","type":"string","default":"null"},
              {"name":"product_info","type":"string","default":"null"},
              {"name":"direction","type":"string","default":"null"}   
        ]
    }
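    Once the pom and the schema are in place, run the build (the original post uses mvn install; the plugin is bound to the generate-sources phase, so any build that reaches that phase triggers it). The generated serialization class is written under ${project.basedir}/src/main/java/, the configured outputDirectory, in a package matching the schema namespace, and can then be called from Java directly.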
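
    The generated class exposes a getter and a setter per schema field, with snake_case field names turned into camelCase (product_id becomes setProductId in the Java code that follows). The sketch below only approximates that naming convention so the mapping is easy to see; it is not Avro's own code, and AccessorNameSketch and its methods are hypothetical names.

```java
// Illustrative only: approximates how schema field names map to accessor names.
class AccessorNameSketch {

    // Turns "product_id" into "ProductId": split on '_' and capitalize each part.
    static String toCamelCase(String fieldName) {
        StringBuilder sb = new StringBuilder();
        for (String part : fieldName.split("_")) {
            if (part.isEmpty()) {
                continue; // skip empty parts caused by repeated or leading underscores
            }
            sb.append(Character.toUpperCase(part.charAt(0)));
            sb.append(part.substring(1));
        }
        return sb.toString();
    }

    static String setterFor(String fieldName) {
        return "set" + toCamelCase(fieldName);
    }

    static String getterFor(String fieldName) {
        return "get" + toCamelCase(fieldName);
    }

    public static void main(String[] args) {
        // The four field names from the schema above
        String[] fields = {"product_id", "company_name", "product_info", "direction"};
        for (String f : fields) {
            System.out.println(f + " -> " + getterFor(f) + " / " + setterFor(f));
        }
    }
}
```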

    Serializing and deserializing from Java:

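    import java.io.File;
    import java.io.IOException;

    import org.apache.avro.file.DataFileReader;
    import org.apache.avro.file.DataFileWriter;
    import org.apache.avro.io.DatumReader;
    import org.apache.avro.io.DatumWriter;
    import org.apache.avro.specific.SpecificDatumReader;
    import org.apache.avro.specific.SpecificDatumWriter;

    // Generated by avro-maven-plugin; the package comes from the schema namespace
    import user_machine_learning.product;

    public class Test_avro {
        public static void main(String[] args) throws IOException {

            // Serialization: build a record (fields start at their schema defaults) and populate it
            product pro = product.newBuilder().build();
            pro.setProductId("1");
            pro.setCompanyName("This is a test");
            pro.setProductInfo("Detailed description of the test");
            pro.setDirection("1");
            // Save the record to a local file
            File file = new File("/Users/niutao/Desktop/avro_test/user.avro");
            DatumWriter<product> productDatumWriter = new SpecificDatumWriter<product>(product.class);
            DataFileWriter<product> dataFileWriter = new DataFileWriter<product>(productDatumWriter);
            dataFileWriter.create(product.getClassSchema(), file);
            dataFileWriter.append(pro);
            dataFileWriter.close();

            // Deserialization: read every record back from the file
            DatumReader<product> productDatumReader = new SpecificDatumReader<product>(product.class);
            DataFileReader<product> productDataFileReader = new DataFileReader<product>(file, productDatumReader);
            try {
                product pro_reader = null;
                while (productDataFileReader.hasNext()) {
                    pro_reader = productDataFileReader.next();
                    System.out.println(pro_reader);
                }
            } finally {
                // Release the file handle even if reading fails
                productDataFileReader.close();
            }
        }
    }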
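
    For intuition about what ends up in the file: in Avro's binary encoding, a record is written as its fields in schema order, and each string as a zigzag varint byte length followed by its UTF-8 bytes. The sketch below reproduces that layout for the four string fields using only the JDK. It is not Avro's implementation, and AvroStringEncodingSketch and its methods are hypothetical names. The real user.avro file additionally carries a header with the schema and sync markers around the data blocks, which are not modelled here.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Illustrative only: mimics the binary layout of a record made of string fields.
class AvroStringEncodingSketch {

    // Writes a long as a zigzag-encoded variable-length integer.
    static void writeLong(ByteArrayOutputStream out, long n) {
        long z = (n << 1) ^ (n >> 63);            // zigzag: small magnitudes get small codes
        while ((z & ~0x7FL) != 0) {
            out.write((int) ((z & 0x7F) | 0x80)); // 7 payload bits plus a continuation bit
            z >>>= 7;
        }
        out.write((int) z);                       // last group, continuation bit clear
    }

    // Writes a string as its UTF-8 byte length followed by the UTF-8 bytes.
    static void writeString(ByteArrayOutputStream out, String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        writeLong(out, utf8.length);
        out.write(utf8, 0, utf8.length);
    }

    // Encodes the four fields in schema order with nothing between them.
    static byte[] encodeProduct(String productId, String companyName,
                                String productInfo, String direction) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeString(out, productId);
        writeString(out, companyName);
        writeString(out, productInfo);
        writeString(out, direction);
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] body = encodeProduct("1", "This is a test",
                "Detailed description of the test", "1");
        System.out.println("record body: " + body.length + " bytes");
    }
}
```

    Because the field names and types live in the schema rather than in each record, the record body carries only lengths and values, which is why the reader needs the schema to interpret the bytes.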
  • Original article: https://www.cnblogs.com/niutao/p/10548003.html