• HBase custom filter


    1. First, write the custom filter, package it as a jar, and copy the jar to the lib directory of the HBase installation on the server.

    1.1 The custom filter, CustomFilter

    import com.google.protobuf.InvalidProtocolBufferException;
    import org.apache.hadoop.hbase.Cell;
    import org.apache.hadoop.hbase.CellUtil;
    import org.apache.hadoop.hbase.exceptions.DeserializationException;
    import org.apache.hadoop.hbase.filter.Filter;
    import org.apache.hadoop.hbase.filter.FilterBase;
    import org.apache.hadoop.hbase.util.ByteStringer;
    
    import java.io.IOException;
    
    
    /**
     * @Author: xu.dm
     * @Date: 2019/4/14 12:16
     * @Description: Custom filter that selects rows containing a cell whose value matches
     */
    public class CustomFilter extends FilterBase {
        private byte[] value = null;
        private boolean filterRow = true;
    
        public CustomFilter() {
            super();
        }
    
        public CustomFilter(byte[] value) {
            this.value = value;
        }
    
        @Override
        public void reset() throws IOException {
            this.filterRow = true;
        }
    
    
        @Override
        public boolean filterRow() throws IOException {
            return this.filterRow;
        }
    
    
        // A row that contains a matching cell value is not filtered out
        @Override
        public ReturnCode filterCell(Cell c) throws IOException {
            if(CellUtil.matchingValue(c,value))
                filterRow = false;
            return ReturnCode.INCLUDE;
        }
    
        /**
         * Serializes the filter through MyFilterProtos, which is generated by protobuf
         */
        @Override
        public byte[] toByteArray() throws IOException {
            MyFilterProtos.CustomFilter.Builder builder = MyFilterProtos.CustomFilter.newBuilder();
            if(value!=null)
                builder.setValue(ByteStringer.wrap(value));
    
            return builder.build().toByteArray();
        }
    
        public static Filter parseFrom(final byte[] pbBytes)
                throws DeserializationException {
            MyFilterProtos.CustomFilter proto;
            try {
                proto = MyFilterProtos.CustomFilter.parseFrom(pbBytes);
            } catch (InvalidProtocolBufferException e) {
                throw new DeserializationException(e);
            }
            return new CustomFilter(proto.getValue().toByteArray());
        }
    }
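The filter's row-level behaviour depends on the order in which its methods are called: reset() before each row, filterCell() for every cell of the row, then filterRow() once the cells have been seen. The following is a minimal stand-alone sketch of that sequence in plain Java with no HBase dependency. `RowFlagSketch` and `keepsRow` are hypothetical names used only for illustration, and plain strings stand in for cell values.

```java
import java.util.Arrays;
import java.util.List;

/** Stand-alone imitation of CustomFilter's row flag; this is not HBase code. */
public class RowFlagSketch {
    private final String wanted;      // plays the role of CustomFilter.value
    private boolean filterRow = true; // true means "drop this row"

    public RowFlagSketch(String wanted) {
        this.wanted = wanted;
    }

    /** Called before each row: the row is dropped unless a match is seen. */
    public void reset() {
        filterRow = true;
    }

    /** Called once per cell: a matching value marks the whole row as kept. */
    public void filterCell(String cellValue) {
        if (wanted.equals(cellValue)) {
            filterRow = false;
        }
    }

    /** Called after the row's cells: true means the row is dropped. */
    public boolean filterRow() {
        return filterRow;
    }

    /** Runs the reset / filterCell / filterRow sequence for one row. */
    public static boolean keepsRow(RowFlagSketch filter, List<String> cellValues) {
        filter.reset();
        for (String v : cellValues) {
            filter.filterCell(v);
        }
        return !filter.filterRow();
    }

    public static void main(String[] args) {
        RowFlagSketch filter = new RowFlagSketch("user30");
        // Row with a matching cell value: prints true
        System.out.println(keepsRow(filter, Arrays.asList("street30", "user30", "remark30")));
        // Row without a matching cell value: prints false
        System.out.println(keepsRow(filter, Arrays.asList("street20", "user20", "remark20")));
    }
}
```

Because filterCell() in CustomFilter returns INCLUDE for every cell, a kept row is returned with all of its cells, not only the one that matched.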

    1.2 MyFilterProtos is generated by protobuf. Note that HBase uses protobuf 2.5.0, so generate the class with that version and not with anything newer; otherwise HBase fails with class-not-found errors.

    The proto file:

    syntax = "proto2";
    
    option java_package = "";
    option java_outer_classname = "MyFilterProtos";
    option java_generic_services = true;
    option java_generate_equals_and_hash = true;
    option optimize_for = SPEED;
    
    message CustomFilter {
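      // optional, not required: toByteArray() sets value only when it is non-null,
      // and build() throws on an unset required field
      optional bytes value = 1;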
    }
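For orientation, this message has a single length-delimited field, so its serialized form is short enough to build by hand. The sketch below, in plain Java without the protobuf library, writes the bytes that the protobuf encoding rules prescribe for a bytes field with field number 1: a tag byte, a varint length, then the payload. `WireFormatSketch`, `encodeValueField` and `toHex` are hypothetical names; the filter itself still relies on the generated MyFilterProtos class.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

/** Hand-written encoding of "bytes value = 1"; illustration only. */
public class WireFormatSketch {

    public static byte[] encodeValueField(byte[] value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // Tag = (field number << 3) | wire type; field 1, wire type 2 (length-delimited)
        out.write((1 << 3) | 2);
        // Length as a base-128 varint: 7 bits per byte, high bit set on all but the last byte
        int len = value.length;
        while ((len & ~0x7F) != 0) {
            out.write((len & 0x7F) | 0x80);
            len >>>= 7;
        }
        out.write(len);
        // Payload bytes follow unchanged
        out.write(value, 0, value.length);
        return out.toByteArray();
    }

    public static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] encoded = encodeValueField("user30".getBytes(StandardCharsets.UTF_8));
        // prints 0a06757365723330: tag 0x0a, length 0x06, then the ASCII bytes of "user30"
        System.out.println(toHex(encoded));
    }
}
```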

    1.3 Compile the code, package it as a jar, and deploy the jar to HBase's lib directory. HBase must be restarted afterwards.

    [root@bigdata-senior01 lib]# ls $HBASE_HOME/lib/Custom*.*
    /opt/hbase-2.0.4/lib/CustomFilter.jar
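The jar has to be on the server's classpath because, when a scan carrying the filter arrives, HBase loads the filter class by name and calls its static parseFrom(byte[]) method through reflection. The sketch below is a hypothetical pre-deployment check along those lines, written in plain Java. `FilterContractCheck`, `hasParseFrom` and the two nested sample classes are illustrative names and not part of HBase.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

/** Checks that a class is loadable and has a public static parseFrom(byte[]). */
public class FilterContractCheck {

    public static boolean hasParseFrom(String className) {
        try {
            Class<?> c = Class.forName(className);
            // getMethod only finds public methods
            Method m = c.getMethod("parseFrom", byte[].class);
            return Modifier.isStatic(m.getModifiers());
        } catch (ClassNotFoundException | NoSuchMethodException e) {
            return false;
        }
    }

    /** Hypothetical stand-in with the expected method shape. */
    public static class GoodSample {
        public static Object parseFrom(byte[] bytes) {
            return new GoodSample();
        }
    }

    /** Hypothetical stand-in that lacks parseFrom. */
    public static class BadSample {
    }

    public static void main(String[] args) {
        System.out.println(hasParseFrom(GoodSample.class.getName())); // true
        System.out.println(hasParseFrom(BadSample.class.getName()));  // false
        System.out.println(hasParseFrom("no.such.Filter"));           // false
    }
}
```

To try it against the real filter, one would put CustomFilter.jar and the HBase client jars on the classpath and pass the filter's fully qualified class name.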

    2. Use the custom filter

    2.1 Add the jar deployed above to the client program's dependencies

    2.2 Filter data with the custom filter

        // Use the custom filter: only rows that contain a matching cell value are returned
        private static void customFilterData() throws IOException{
            List<Filter> filters = new ArrayList<Filter>();
    
            Filter filter1 = new CustomFilter(Bytes.toBytes("user30"));
            filters.add(filter1);
    
            Filter filter2 = new CustomFilter(Bytes.toBytes("user20"));
            filters.add(filter2);
    
            Filter filter3 = new CustomFilter(Bytes.toBytes("user90"));
            filters.add(filter3);
    
            // MUST_PASS_ONE: a row is returned if at least one filter keeps it
            FilterList filterList = new FilterList(
                    FilterList.Operator.MUST_PASS_ONE, filters);
    
            Scan scan = new Scan();
            scan.setFilter(filterList);
    
            // try-with-resources closes the scanner and the table even if the scan throws
            try (Table table = helper.getConnection().getTable(TableName.valueOf("testtable"));
                 ResultScanner scanner = table.getScanner(scan)) {
                for(Result result:scanner){
                    helper.dumpResult(result);
                }
            }
        }
    }

    Output:

    Cell: rowKey20/ex:addr/1555078771906/Put/vlen=8/seqid=0, Value: street20
    Cell: rowKey20/info:username/1555078771906/Put/vlen=6/seqid=0, Value: user20
    Cell: rowKey20/memo:detail/1555078771906/Put/vlen=8/seqid=0, Value: remark20
    Cell: rowKey30/ex:addr/1555078771906/Put/vlen=8/seqid=0, Value: street30
    Cell: rowKey30/info:username/1555078771906/Put/vlen=6/seqid=0, Value: user30
    Cell: rowKey30/memo:detail/1555078771906/Put/vlen=8/seqid=0, Value: remark30
    Cell: rowKey90/ex:addr/1555078771906/Put/vlen=8/seqid=0, Value: street90
    Cell: rowKey90/info:username/1555078771906/Put/vlen=6/seqid=0, Value: user90
    Cell: rowKey90/memo:detail/1555078771906/Put/vlen=8/seqid=0, Value: remark90
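This result follows from how MUST_PASS_ONE combines the row-level decisions: a row is dropped only if every filter in the list votes to drop it. Below is a minimal sketch of that OR logic in plain Java, using the values from the output above. `MustPassOneSketch` and `keptRows` are hypothetical names, the maps stand in for HBase rows, and rowKey10 is an assumed non-matching row added for contrast.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Stand-alone imitation of three value filters combined with MUST_PASS_ONE. */
public class MustPassOneSketch {

    /** Returns the keys of the rows in which at least one wanted value appears. */
    public static List<String> keptRows(Map<String, List<String>> rows, List<String> wantedValues) {
        List<String> kept = new ArrayList<>();
        for (Map.Entry<String, List<String>> row : rows.entrySet()) {
            boolean dropRow = true; // dropped unless some filter keeps the row
            for (String wanted : wantedValues) {
                // One CustomFilter-like check: does any cell value equal 'wanted'?
                if (row.getValue().contains(wanted)) {
                    dropRow = false;
                    break;
                }
            }
            if (!dropRow) {
                kept.add(row.getKey());
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        Map<String, List<String>> rows = new LinkedHashMap<>();
        rows.put("rowKey10", Arrays.asList("street10", "user10", "remark10")); // assumed row
        rows.put("rowKey20", Arrays.asList("street20", "user20", "remark20"));
        rows.put("rowKey30", Arrays.asList("street30", "user30", "remark30"));
        rows.put("rowKey90", Arrays.asList("street90", "user90", "remark90"));

        // prints [rowKey20, rowKey30, rowKey90]
        System.out.println(keptRows(rows, Arrays.asList("user30", "user20", "user90")));
    }
}
```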
  • Original article: https://www.cnblogs.com/asker009/p/10706457.html