• Adding LZO Compression Support to Hadoop


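    Enabling LZO compression is very useful on small clusters: compressed logs shrink to roughly 1/3 of their original size, and decompression is also fast.

    Installation

    Preparing the jar

    1) First download the hadoop-lzo source project:
    https://github.com/twitter/hadoop-lzo/archive/master.zip

    2) The downloaded file is named hadoop-lzo-master and is a zip archive. Unzip it, then build it with Maven. This produces hadoop-lzo-0.4.20.jar.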
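
    The unzip-and-build step above can be sketched as a small shell function. This is only an illustration under assumptions that are not in the original: the archive is saved as master.zip, a JDK and Maven are installed, and the native LZO development headers that the build typically needs are present. The run helper and the DRY_RUN switch are hypothetical conveniences so the commands can be previewed before they are executed.

```shell
# DRY_RUN=1 (the default here) only prints each command; set DRY_RUN=0 to execute.
DRY_RUN="${DRY_RUN:-1}"
run() {
  if [ "$DRY_RUN" = "1" ]; then echo "+ $*"; else "$@"; fi
}

# Unpack the archive and build the jar. The body runs in a subshell so the
# caller's working directory is left unchanged.
build_hadoop_lzo() (
  set -e
  archive="${1:-master.zip}"        # assumed file name of the downloaded archive
  run unzip -o "$archive"           # extracts into hadoop-lzo-master/
  run cd hadoop-lzo-master
  run mvn clean package -DskipTests # skip the test suite to shorten the build
  run ls target/                    # the built jar should appear here
)

# Usage: DRY_RUN=0 build_hadoop_lzo master.zip
```

    With DRY_RUN left at 1 the function only echoes the four commands, which makes it easy to check them before running a real build.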

    3) Copy the built hadoop-lzo-0.4.20.jar into hadoop-2.7.4/share/hadoop/common/

    [root@bigdata-01 common]$ pwd
    /export/servers/hadoop-2.7.4/share/hadoop/common
    [root@bigdata-01 common]$ ls
    hadoop-lzo-0.4.20.jar

    4) Use scp to sync hadoop-lzo-0.4.20.jar to the other nodes
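
    One possible way to script the scp step is a loop over the remaining nodes. The host names bigdata-02 and bigdata-03 in the usage line are hypothetical (the original only shows bigdata-01), and the sketch assumes root SSH access and the same install path on every node. The run helper and DRY_RUN switch are illustrative, not part of any Hadoop tooling.

```shell
# DRY_RUN=1 (the default here) only prints each command; set DRY_RUN=0 to execute.
DRY_RUN="${DRY_RUN:-1}"
run() {
  if [ "$DRY_RUN" = "1" ]; then echo "+ $*"; else "$@"; fi
}

# Copy one file to the same absolute path on each host given after it.
sync_to_nodes() {
  src="$1"
  shift
  for host in "$@"; do
    run scp "$src" "root@${host}:${src}"
  done
}

# Usage (hypothetical host names):
#   DRY_RUN=0 sync_to_nodes \
#     /export/servers/hadoop-2.7.4/share/hadoop/common/hadoop-lzo-0.4.20.jar \
#     bigdata-02 bigdata-03
```

    The same function can be reused for any other file that has to be identical on every node.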

    Configuration

    1) Add the following configuration to core-site.xml to enable LZO compression

    <?xml version="1.0" encoding="UTF-8"?>
    <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
    
    <configuration>
    
    <property>
    <name>io.compression.codecs</name>
    <value>
    org.apache.hadoop.io.compress.GzipCodec,
    org.apache.hadoop.io.compress.DefaultCodec,
    org.apache.hadoop.io.compress.BZip2Codec,
    org.apache.hadoop.io.compress.SnappyCodec,
    com.hadoop.compression.lzo.LzoCodec,
    com.hadoop.compression.lzo.LzopCodec
    </value>
    </property>
    <property>
        <name>io.compression.codec.lzo.class</name>
        <value>com.hadoop.compression.lzo.LzoCodec</value>
    </property>
    
    </configuration>

    2) Use scp to sync core-site.xml to the other nodes
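
    After the file has been synced, it can be worth checking on each node that the codec list really contains both LZO classes. The helper below is a minimal sketch: has_lzo_codecs is a hypothetical function name, and the usage line assumes the hdfs command is on the PATH so that hdfs getconf can read the effective value of io.compression.codecs.

```shell
# Return 0 if the comma-separated codec list names both LZO codec classes.
has_lzo_codecs() {
  # Strip spaces, tabs and newlines so a multi-line <value> still matches.
  list="$(printf '%s' "$1" | tr -d ' \t\n')"
  case ",$list," in
    *,com.hadoop.compression.lzo.LzoCodec,*) ;;
    *) return 1 ;;
  esac
  case ",$list," in
    *,com.hadoop.compression.lzo.LzopCodec,*) ;;
    *) return 1 ;;
  esac
  return 0
}

# Usage on a cluster node:
#   has_lzo_codecs "$(hdfs getconf -confKey io.compression.codecs)" \
#     && echo "LZO codecs configured"
```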

    Testing

    1) Start Hive and create an LZO table

    CREATE TABLE lzo_test (
    id STRING,
    name STRING
    )
    partitioned by (
    dt STRING
    )
    row format delimited
    fields terminated by '\t'
    STORED AS INPUTFORMAT "com.hadoop.mapred.DeprecatedLzoTextInputFormat"
    OUTPUTFORMAT "org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat";

    2) Load data

    load data inpath '/xxx/xxx/2019-07-25' into table lzo_test partition(dt='2019-07-25');
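
    To try the table end to end, a small tab-separated file can be compressed, uploaded, loaded, and queried. The sketch below assumes that the lzop command-line tool is installed and that the input format expects lzop-compressed .lzo files. The function name load_sample, the local file sample.tsv, and the staging directory /tmp/lzo_staging are hypothetical and do not come from the original. The run helper only prints the commands unless DRY_RUN=0 is set.

```shell
# DRY_RUN=1 (the default here) only prints each command; set DRY_RUN=0 to execute.
DRY_RUN="${DRY_RUN:-1}"
run() {
  if [ "$DRY_RUN" = "1" ]; then echo "+ $*"; else "$@"; fi
}

# Compress a local file with lzop, stage it in HDFS, load it into one
# partition of lzo_test, and read a few rows back.
load_sample() {
  local_file="$1"   # tab-separated file with id and name columns
  hdfs_dir="$2"     # staging directory in HDFS (hypothetical)
  dt="$3"           # partition value, e.g. 2019-07-25
  run lzop -f "$local_file"                      # writes <file>.lzo next to the input
  run hdfs dfs -mkdir -p "$hdfs_dir"
  run hdfs dfs -put "${local_file}.lzo" "${hdfs_dir}/"
  run hive -e "LOAD DATA INPATH '${hdfs_dir}' INTO TABLE lzo_test PARTITION (dt='${dt}')"
  run hive -e "SELECT * FROM lzo_test WHERE dt='${dt}' LIMIT 10"
}

# Usage: DRY_RUN=0 load_sample sample.tsv /tmp/lzo_staging 2019-07-25
```

    If the final SELECT returns readable rows rather than binary noise, the LZO codec is being picked up by Hive.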
  • Original article: https://www.cnblogs.com/blazeZzz/p/11244543.html