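HBase usage notes
Background
A table in HBase has a rowkey defined as a timestamp followed by a string.
Requirement
Select the rows whose timestamp falls within one day and whose value in a given column of a column family equals "abc", then export that day's data to Excel.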
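Because the rowkey begins with the timestamp, "one day of data" maps to a contiguous rowkey range. The sketch below shows one way to derive the two boundary keys in plain Java. It assumes the timestamp prefix is epoch milliseconds written as a 13-digit decimal string; the post does not state the actual format, and the class and method names here are hypothetical.

```java
import java.time.LocalDate;
import java.time.ZoneId;

// Hypothetical helper: derives scan boundaries for one calendar day.
// Assumption: rowkey = <13-digit epoch millis> + <string>, so keys sort by time.
class DayRowKeyRange {

    // First possible key prefix of the day (a scan's start row is inclusive).
    static String startKey(LocalDate day, ZoneId zone) {
        long millis = day.atStartOfDay(zone).toInstant().toEpochMilli();
        return String.valueOf(millis);
    }

    // First key prefix of the following day (a scan's stop row is exclusive).
    static String stopKey(LocalDate day, ZoneId zone) {
        return startKey(day.plusDays(1), zone);
    }

    public static void main(String[] args) {
        ZoneId utc = ZoneId.of("UTC");
        LocalDate day = LocalDate.of(2018, 1, 17);
        System.out.println("start = " + startKey(day, utc));
        System.out.println("stop  = " + stopKey(day, utc));
    }
}
```

For 17 January 2018 in UTC this yields 1516147200000 as the start key and 1516233600000 as the stop key. Every rowkey written during that day sorts at or after the start key and before the stop key, so no extra timestamp filter is needed.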
Using FilterList
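import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.filter.CompareFilter;
import org.apache.hadoop.hbase.filter.FilterList;
import org.apache.hadoop.hbase.filter.SingleColumnValueFilter;
import org.apache.hadoop.hbase.util.Bytes;

// Keep only rows where info:supplier equals "abc"
FilterList filterList = new FilterList(FilterList.Operator.MUST_PASS_ALL);
SingleColumnValueFilter filter = new SingleColumnValueFilter("info".getBytes(), "supplier".getBytes(), CompareFilter.CompareOp.EQUAL, "abc".getBytes());
filter.setFilterIfMissing(true); // drop rows that lack the column
filterList.addFilter(filter);
List<String> list = new ArrayList<String>();
List<ResultDTO> listSpider = new ArrayList<ResultDTO>();
// Restrict the scan to the rowkey range for the requested day
Scan scan = new Scan();
scan.setStartRow(Bytes.toBytes(startKey));
scan.setStopRow(Bytes.toBytes(endtKey));
scan.setFilter(filterList);
Connection conn = null;
HTable table = null;
try {
    conn = getConnection();
    table = (HTable) conn.getTable(TableName.valueOf(tableName));
    ResultScanner rs = table.getScanner(scan);
    // ...

1. Rowkey range: set the scan's start row and stop row.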
2. Column value filtering: use SingleColumnValueFilter. By default, a row that does not contain the column at all is still included in the results.
filter.setFilterIfMissing(true); // exclude rows that do not contain the column
Official documentation: To prevent the entire row from being emitted if the column is not found on a row, use setFilterIfMissing(boolean). Otherwise, if the column is found, the entire row will be emitted only if the value passes. If the value fails, the row will be filtered out.
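The three cases described in the documentation can be spelled out as a small decision function. The following is a plain-Java model of that documented behaviour for an equality comparison, not the HBase implementation; the class name, the method name, and the use of a Map to stand in for a row are all illustrative assumptions.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative model only: mirrors the documented SingleColumnValueFilter rules
// for an EQUAL comparison, using a Map as a stand-in for one row's columns.
class FilterIfMissingModel {

    static boolean rowIsEmitted(Map<String, String> row, String column,
                                String expected, boolean filterIfMissing) {
        if (!row.containsKey(column)) {
            // Column not found: emitted by default, dropped when filterIfMissing is true.
            return !filterIfMissing;
        }
        // Column found: emitted only if the value passes the comparison.
        return expected.equals(row.get(column));
    }

    public static void main(String[] args) {
        Map<String, String> matching = new HashMap<>();
        matching.put("info:supplier", "abc");
        Map<String, String> other = new HashMap<>();
        other.put("info:supplier", "xyz");
        Map<String, String> missing = new HashMap<>();
        missing.put("info:name", "n1");

        System.out.println(rowIsEmitted(matching, "info:supplier", "abc", true));
        System.out.println(rowIsEmitted(other, "info:supplier", "abc", true));
        System.out.println(rowIsEmitted(missing, "info:supplier", "abc", false));
        System.out.println(rowIsEmitted(missing, "info:supplier", "abc", true));
    }
}
```

Only the row without the column changes outcome when the flag is flipped, which is why the export needs setFilterIfMissing(true) to avoid rows that have no supplier value at all.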
Reposted from: https://www.cnblogs.com/davidwang456/p/8303152.html