• Using filters to select data in HBase


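    1. What filters can do

    HBase provides a set of filters for selecting data. A filter can restrict a read along several dimensions of the stored data: rows, columns, and cell versions.
    In practice, filtering by row key or by column is the most common case.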

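    2. Common filters

    Row-based filters

    PrefixFilter: matches rows whose row key starts with a given prefix
    PageFilter: row-based pagination; it limits the number of rows returned, but the limit is applied on each region server separately, so the client can still receive more rows than the page size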
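To make the two row-based behaviours concrete, here is a minimal plain-Java sketch that models them over a sorted set of row keys. It does not call the HBase API; the class `RowFilterSketch` and its methods are hypothetical names used only for illustration, and the "page" logic assumes the common pattern of resuming after the last row key seen.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

/** Plain-Java model of prefix matching and paging over sorted row keys (not the HBase API). */
final class RowFilterSketch {

    /** Keeps the row keys that start with the given prefix, in sorted order. */
    static List<String> withPrefix(TreeSet<String> rowKeys, String prefix) {
        List<String> out = new ArrayList<>();
        // In a sorted key space, keys sharing a prefix are contiguous: start at the
        // prefix itself and stop at the first key that no longer matches.
        for (String key : rowKeys.tailSet(prefix, true)) {
            if (!key.startsWith(prefix)) {
                break;
            }
            out.add(key);
        }
        return out;
    }

    /** Returns at most pageSize keys that sort strictly after lastSeen (null means "from the start"). */
    static List<String> page(TreeSet<String> rowKeys, String lastSeen, int pageSize) {
        Iterable<String> source = (lastSeen == null) ? rowKeys : rowKeys.tailSet(lastSeen, false);
        List<String> out = new ArrayList<>();
        for (String key : source) {
            if (out.size() == pageSize) {
                break;
            }
            out.add(key);
        }
        return out;
    }

    public static void main(String[] args) {
        TreeSet<String> rows = new TreeSet<>(Arrays.asList("rowkey1", "rowkey2", "rowkey3", "user1"));
        System.out.println(withPrefix(rows, "rowkey"));
        System.out.println(page(rows, null, 2));
        System.out.println(page(rows, "rowkey2", 2));
    }
}
```

The sketch relies on the same property HBase does: row keys are stored in sorted order, so a prefix match is a contiguous range rather than a full-table check.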

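    Column-based filters

    ColumnPrefixFilter: matches columns whose qualifier starts with a given prefix
    FirstKeyOnlyFilter: returns only the first cell (the first column) of each row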
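The effect of these two filters on a single row can be modelled in a few lines of plain Java. This is only a sketch: a row is represented as a sorted map from qualifier to value (one column family, one version), and `ColumnFilterSketch` and its methods are hypothetical names, not HBase classes.

```java
import java.util.Map;
import java.util.TreeMap;

/** Plain-Java model of column-level filtering on one row (not the HBase API). */
final class ColumnFilterSketch {

    /** Keeps only the columns whose qualifier starts with the given prefix. */
    static TreeMap<String, String> columnPrefix(TreeMap<String, String> row, String prefix) {
        TreeMap<String, String> out = new TreeMap<>();
        for (Map.Entry<String, String> cell : row.entrySet()) {
            if (cell.getKey().startsWith(prefix)) {
                out.put(cell.getKey(), cell.getValue());
            }
        }
        return out;
    }

    /** Keeps only the first column of the row, in qualifier sort order. */
    static TreeMap<String, String> firstKeyOnly(TreeMap<String, String> row) {
        TreeMap<String, String> out = new TreeMap<>();
        if (!row.isEmpty()) {
            out.put(row.firstKey(), row.get(row.firstKey()));
        }
        return out;
    }

    public static void main(String[] args) {
        TreeMap<String, String> row = new TreeMap<>();
        row.put("name", "file1.txt");
        row.put("type", "txt");
        row.put("size", "1024");
        System.out.println(columnPrefix(row, "nam"));
        System.out.println(firstKeyOnly(row));
    }
}
```

A column that the filter removes is simply absent from the result, so looking it up afterwards yields `null` rather than an empty string.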

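    Cell-level filters

    KeyOnlyFilter: returns only the key part of each cell (row key, column family, qualifier, timestamp); the value is dropped
    TimestampsFilter: keeps only the cells whose timestamp (version) appears in a given list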

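    Filters on a column and its cell value

    SingleColumnValueFilter: compares the value of one specified column and keeps or drops the whole row based on the result
    SingleColumnValueExcludeFilter: performs the same test as SingleColumnValueFilter, but leaves the tested column itself out of the returned row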
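The difference between the two filters is easiest to see on one row. The following plain-Java sketch models only an equality test; `SingleColumnValueSketch` is a hypothetical name, and rows that lack the tested column are dropped here as a simplification (in HBase that corresponds to calling `setFilterIfMissing(true)`; by default such rows are kept).

```java
import java.util.Optional;
import java.util.TreeMap;

/** Plain-Java model of a single-column value test on one row (not the HBase API). */
final class SingleColumnValueSketch {

    /**
     * Returns the row if row[column] equals expected, otherwise an empty Optional.
     * When excludeTested is true, the tested column is removed from the returned row.
     */
    static Optional<TreeMap<String, String>> apply(
            TreeMap<String, String> row, String column, String expected, boolean excludeTested) {
        // A missing column yields null here and therefore fails the test (simplification).
        if (!expected.equals(row.get(column))) {
            return Optional.empty();
        }
        TreeMap<String, String> out = new TreeMap<>(row);
        if (excludeTested) {
            out.remove(column);
        }
        return Optional.of(out);
    }

    public static void main(String[] args) {
        TreeMap<String, String> row = new TreeMap<>();
        row.put("name", "file2.jpg");
        row.put("type", "jpg");
        System.out.println(apply(row, "type", "jpg", false));
        System.out.println(apply(row, "type", "jpg", true));
        System.out.println(apply(row, "type", "txt", false));
    }
}
```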

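    Comparison filters

    A comparison filter takes a comparison operator and a comparator, and applies them to one part of each cell:
    RowFilter (row key), FamilyFilter (column family), QualifierFilter (column qualifier), ValueFilter (cell value)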
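The operator-plus-comparator idea can be sketched without HBase. The code below is a plain-Java model under stated assumptions: `CompareSketch`, `Op` and `matches` are hypothetical names, the operator set mirrors the usual six comparisons, and the comparator is a byte-wise lexicographic comparison that treats bytes as unsigned, which is what a binary comparator amounts to.

```java
import java.nio.charset.StandardCharsets;

/** Plain-Java model of "comparison operator + comparator" (not the HBase API). */
final class CompareSketch {

    enum Op { LESS, LESS_OR_EQUAL, EQUAL, NOT_EQUAL, GREATER_OR_EQUAL, GREATER }

    /** Lexicographic comparison of two byte arrays, treating each byte as unsigned. */
    static int compareBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = a[i] & 0xff;
            int y = b[i] & 0xff;
            if (x != y) {
                return x - y;
            }
        }
        // All shared positions are equal: the shorter array sorts first.
        return a.length - b.length;
    }

    /** True when "cellPart op reference" holds under the byte-wise ordering. */
    static boolean matches(Op op, byte[] cellPart, byte[] reference) {
        int c = compareBytes(cellPart, reference);
        switch (op) {
            case LESS:             return c < 0;
            case LESS_OR_EQUAL:    return c <= 0;
            case EQUAL:            return c == 0;
            case NOT_EQUAL:        return c != 0;
            case GREATER_OR_EQUAL: return c >= 0;
            case GREATER:          return c > 0;
            default:               throw new IllegalArgumentException("unknown operator: " + op);
        }
    }

    public static void main(String[] args) {
        byte[] key = "rowkey10".getBytes(StandardCharsets.UTF_8);
        byte[] ref = "rowkey2".getBytes(StandardCharsets.UTF_8);
        // Byte-wise ordering is not numeric ordering.
        System.out.println(matches(Op.LESS, key, ref));
    }
}
```

One practical consequence of byte-wise ordering is that "rowkey10" sorts before "rowkey2", so numeric row key suffixes are usually zero-padded.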
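
    Summary of common filters

    Filter and what it does:

    • RowFilter: selects the rows whose row key satisfies the comparison
    • PrefixFilter: selects the rows whose row key starts with a given prefix
    • KeyOnlyFilter: returns only the key part of each cell; the values come back empty
    • ColumnPrefixFilter: selects cells by the prefix of the column name
    • ValueFilter: selects cells by comparing their value
    • TimestampsFilter: selects cells by their timestamp (version)
    • FilterList: combines several filters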
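How a filter list combines its members can be modelled as predicate composition. This is a plain-Java sketch, not the HBase implementation: `FilterListSketch` and `combine` are hypothetical names, and only the two operator names `MUST_PASS_ALL` (logical AND) and `MUST_PASS_ONE` (logical OR) are taken from HBase. The result for an empty list is a choice made by this sketch.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

/** Plain-Java model of combining filters with AND / OR semantics (not the HBase API). */
final class FilterListSketch {

    enum Operator { MUST_PASS_ALL, MUST_PASS_ONE }

    /** Builds one predicate out of several, short-circuiting as soon as the outcome is known. */
    static <T> Predicate<T> combine(Operator op, List<Predicate<T>> filters) {
        return item -> {
            for (Predicate<T> filter : filters) {
                boolean passed = filter.test(item);
                if (op == Operator.MUST_PASS_ALL && !passed) {
                    return false; // AND: one failure rejects the item
                }
                if (op == Operator.MUST_PASS_ONE && passed) {
                    return true;  // OR: one success accepts the item
                }
            }
            // AND: nothing failed, so accept. OR: nothing passed, so reject.
            return op == Operator.MUST_PASS_ALL;
        };
    }

    public static void main(String[] args) {
        Predicate<String> hasPrefix = key -> key.startsWith("rowkey");
        Predicate<String> endsWithTwo = key -> key.endsWith("2");
        Predicate<String> both = combine(Operator.MUST_PASS_ALL, Arrays.asList(hasPrefix, endsWithTwo));
        Predicate<String> either = combine(Operator.MUST_PASS_ONE, Arrays.asList(hasPrefix, endsWithTwo));
        System.out.println(both.test("rowkey1") + " " + either.test("rowkey1"));
    }
}
```

With a single member, as in a list that wraps just one filter, both operators behave the same as the member on its own.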

    3. Demo

    import java.util.Arrays;

    import org.apache.hadoop.hbase.client.ResultScanner;
    import org.apache.hadoop.hbase.filter.BinaryComparator;
    import org.apache.hadoop.hbase.filter.ColumnPrefixFilter;
    import org.apache.hadoop.hbase.filter.CompareFilter;
    import org.apache.hadoop.hbase.filter.Filter;
    import org.apache.hadoop.hbase.filter.FilterList;
    import org.apache.hadoop.hbase.filter.KeyOnlyFilter;
    import org.apache.hadoop.hbase.filter.PrefixFilter;
    import org.apache.hadoop.hbase.filter.RowFilter;
    import org.apache.hadoop.hbase.util.Bytes;
    import org.junit.Test;

    /**
     * @title HBaseFilterTest
     * @date 2019/12/9 15:01
     * @description Trying out filters. HBaseUtil is a helper class that is not shown in this post.
     */
    public class HBaseFilterTest {

        @Test
        public void createTable() {
            HBaseUtil.createTable("FileTable", new String[]{"fileInfo", "saveInfo"});
        }

        @Test
        public void addFileDetails() {
            HBaseUtil.putRow("FileTable", "rowkey1", "fileInfo", "name", "file1.txt");
            HBaseUtil.putRow("FileTable", "rowkey1", "fileInfo", "type", "txt");
            HBaseUtil.putRow("FileTable", "rowkey1", "fileInfo", "size", "1024");
            HBaseUtil.putRow("FileTable", "rowkey1", "saveInfo", "creator", "suiwo1");
            HBaseUtil.putRow("FileTable", "rowkey2", "fileInfo", "name", "file2.jpg");
            HBaseUtil.putRow("FileTable", "rowkey2", "fileInfo", "type", "jpg");
            HBaseUtil.putRow("FileTable", "rowkey2", "fileInfo", "size", "2048");
            HBaseUtil.putRow("FileTable", "rowkey2", "saveInfo", "creator", "suiwo3");
            HBaseUtil.putRow("FileTable", "rowkey3", "fileInfo", "name", "file3.jpg");
            HBaseUtil.putRow("FileTable", "rowkey3", "fileInfo", "type", "jpg");
            HBaseUtil.putRow("FileTable", "rowkey3", "fileInfo", "size", "2048");
            HBaseUtil.putRow("FileTable", "rowkey3", "saveInfo", "creator", "suiwo3");
        }

        /**
         * Expected output:
         * rowkey = rowkey1
         * fileName = file1.txt
         */
        @Test
        public void rowFilterTest() {
            // CompareFilter.CompareOp is the HBase 1.x enum; HBase 2.x deprecates it in favor of CompareOperator.
            Filter filter = new RowFilter(CompareFilter.CompareOp.EQUAL, new BinaryComparator(Bytes.toBytes("rowkey1")));

            // MUST_PASS_ALL: a row must pass every filter in the list
            FilterList filterList = new FilterList(FilterList.Operator.MUST_PASS_ALL, Arrays.asList(filter));

            // Assuming getScanner treats "rowkey1" and "rowkey3" as the scan's start and stop rows,
            // the range is [rowkey1, rowkey3): the stop row is exclusive, so rowkey3 never
            // appears in the outputs of these tests.
            ResultScanner scanner = HBaseUtil.getScanner("FileTable", "rowkey1", "rowkey3", filterList);

            if (scanner != null) {
                scanner.forEach(result -> {
                    System.out.println("rowkey = " + Bytes.toString(result.getRow()));
                    System.out.println("fileName = " + Bytes.toString(result.getValue(Bytes.toBytes("fileInfo"), Bytes.toBytes("name"))));
                });
                scanner.close();
            }
        }

        /**
         * Expected output:
         * rowkey = rowkey2
         * fileName = file2.jpg
         */
        @Test
        public void prefixFilterTest() {
            Filter filter = new PrefixFilter(Bytes.toBytes("rowkey2"));
            FilterList filterList = new FilterList(FilterList.Operator.MUST_PASS_ALL, Arrays.asList(filter));
            ResultScanner scanner = HBaseUtil.getScanner("FileTable", "rowkey1", "rowkey3", filterList);

            if (scanner != null) {
                scanner.forEach(result -> {
                    System.out.println("rowkey = " + Bytes.toString(result.getRow()));
                    System.out.println("fileName = " + Bytes.toString(result.getValue(Bytes.toBytes("fileInfo"), Bytes.toBytes("name"))));
                });
                scanner.close();
            }
        }

        /**
         * Expected output (KeyOnlyFilter(true) replaces each value with its length
         * as a 4-byte int, so printing it as a string gives unprintable characters):
         * rowkey = rowkey1
         * fileName = (unprintable bytes)
         * rowkey = rowkey2
         * fileName = (unprintable bytes)
         */
        @Test
        public void keyOnlyFilterTest() {
            Filter filter = new KeyOnlyFilter(true); // true = lenAsVal
            FilterList filterList = new FilterList(FilterList.Operator.MUST_PASS_ALL, Arrays.asList(filter));
            ResultScanner scanner = HBaseUtil.getScanner("FileTable", "rowkey1", "rowkey3", filterList);

            if (scanner != null) {
                scanner.forEach(result -> {
                    System.out.println("rowkey = " + Bytes.toString(result.getRow()));
                    System.out.println("fileName = " + Bytes.toString(result.getValue(Bytes.toBytes("fileInfo"), Bytes.toBytes("name"))));
                });
                scanner.close();
            }
        }

        /**
         * Expected output (fileType is null because only columns whose
         * qualifier starts with "nam" pass the filter):
         * rowkey = rowkey1
         * fileName = file1.txt
         * fileType = null
         * rowkey = rowkey2
         * fileName = file2.jpg
         * fileType = null
         */
        @Test
        public void columnPrefixFilterTest() {
            Filter filter = new ColumnPrefixFilter(Bytes.toBytes("nam")); // qualifier prefix "nam"
            FilterList filterList = new FilterList(FilterList.Operator.MUST_PASS_ALL, Arrays.asList(filter));
            ResultScanner scanner = HBaseUtil.getScanner("FileTable", "rowkey1", "rowkey3", filterList);

            if (scanner != null) {
                scanner.forEach(result -> {
                    System.out.println("rowkey = " + Bytes.toString(result.getRow()));
                    System.out.println("fileName = " + Bytes.toString(result.getValue(Bytes.toBytes("fileInfo"), Bytes.toBytes("name"))));
                    System.out.println("fileType = " + Bytes.toString(result.getValue(Bytes.toBytes("fileInfo"), Bytes.toBytes("type"))));
                });
                scanner.close();
            }
        }
    }
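Two details of the demo output are worth modelling in isolation: rowkey3 never appears, and the KeyOnlyFilter test prints unreadable values. The plain-Java sketch below reproduces both without HBase. It assumes the scan range is start-inclusive and stop-exclusive and that `KeyOnlyFilter(true)` replaces a value with its length as a 4-byte big-endian int; `DemoSemanticsSketch` and its methods are hypothetical names.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeMap;

/** Plain-Java model of two behaviours seen in the demo output (not the HBase API). */
final class DemoSemanticsSketch {

    /** Row keys in [startRow, stopRow): the start row is included, the stop row is not. */
    static List<String> scanRange(TreeMap<String, String> table, String startRow, String stopRow) {
        return new ArrayList<>(table.subMap(startRow, true, stopRow, false).keySet());
    }

    /** The 4-byte big-endian length that stands in for a value when only its length is kept. */
    static byte[] lengthAsValue(byte[] value) {
        return ByteBuffer.allocate(4).putInt(value.length).array();
    }

    public static void main(String[] args) {
        TreeMap<String, String> table = new TreeMap<>();
        table.put("rowkey1", "file1.txt");
        table.put("rowkey2", "file2.jpg");
        table.put("rowkey3", "file3.jpg");

        System.out.println(scanRange(table, "rowkey1", "rowkey3"));

        byte[] original = table.get("rowkey1").getBytes(StandardCharsets.UTF_8);
        System.out.println(Arrays.toString(lengthAsValue(original)));
    }
}
```

Decoding those four bytes as text yields control characters, which is why the value should be read back as an int rather than as a string when the length is what you want.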
    
  • Original article: https://www.cnblogs.com/cfas/p/15959458.html