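Comparison operators: CompareFilter.CompareOp
A comparison operator defines the relation to test. The following values are available:
- EQUAL: equal to
- GREATER: greater than
- GREATER_OR_EQUAL: greater than or equal to
- LESS: less than
- LESS_OR_EQUAL: less than or equal to
- NOT_EQUAL: not equal to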
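Comparators: ByteArrayComparable
Comparators provide different ways of matching the target. The following subclasses are available:
- BinaryComparator: matches the complete byte array
- BinaryPrefixComparator: matches a byte array prefix
- BitComparator: bitwise comparison (AND, OR, XOR); rarely used
- NullComparator: tests whether a value is null; rarely used
- RegexStringComparator: matches a regular expression
- SubstringComparator: matches a substring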
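To make the difference between these comparators concrete, here is a plain-Java sketch that mimics how four of them decide whether a value "equals" the target under CompareOp.EQUAL. It does not use HBase at all: the class name ComparatorSketch and its method names are made up for illustration, and the real classes in org.apache.hadoop.hbase.filter should be consulted for exact behavior.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Locale;
import java.util.regex.Pattern;

// Illustrative only: imitates the matching rule of four comparators
// when they are combined with CompareOp.EQUAL. This is not HBase code.
class ComparatorSketch {

    // BinaryComparator: the whole byte array must be identical.
    static boolean binaryEquals(byte[] value, byte[] expected) {
        return Arrays.equals(value, expected);
    }

    // BinaryPrefixComparator: the value must start with the given bytes.
    static boolean binaryPrefixEquals(byte[] value, byte[] prefix) {
        if (value.length < prefix.length) {
            return false;
        }
        return Arrays.equals(Arrays.copyOf(value, prefix.length), prefix);
    }

    // SubstringComparator: case-insensitive "contains" on the string form.
    static boolean substringEquals(byte[] value, String needle) {
        String text = new String(value, StandardCharsets.UTF_8).toLowerCase(Locale.ROOT);
        return text.contains(needle.toLowerCase(Locale.ROOT));
    }

    // RegexStringComparator: the pattern may match anywhere in the value.
    static boolean regexEquals(byte[] value, String regex) {
        String text = new String(value, StandardCharsets.UTF_8);
        return Pattern.compile(regex).matcher(text).find();
    }

    public static void main(String[] args) {
        byte[] value = "address".getBytes(StandardCharsets.UTF_8);
        byte[] full = "address".getBytes(StandardCharsets.UTF_8);
        byte[] prefix = "add".getBytes(StandardCharsets.UTF_8);
        System.out.println("binary:       " + binaryEquals(value, full));
        System.out.println("binaryprefix: " + binaryPrefixEquals(value, prefix));
        System.out.println("substring:    " + substringEquals(value, "DRE"));
        System.out.println("regexstring:  " + regexEquals(value, "d+r"));
    }
}
```

The labels printed by main mirror the comparator prefixes used in the Shell examples later in this article (binaryprefix:, substring:, regexstring:).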
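1. Multiple filters -- FilterList (no FilterList class in the Shell)
A FilterList represents a chain of filters that will be applied to the target data set. The filters in the list are combined with either an "and" relation (FilterList.Operator.MUST_PASS_ALL) or an "or" relation (FilterList.Operator.MUST_PASS_ONE). The Shell has no FilterList; in the Shell filter language, filters are combined with the AND and OR operators instead.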
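// Imports used by the examples in this article (pre-2.0 HBase client API)
import java.io.IOException;
import java.io.UnsupportedEncodingException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.CellUtil;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.filter.*;
import org.apache.hadoop.hbase.filter.CompareFilter.CompareOp;
import org.apache.hadoop.hbase.util.Bytes;

// Combine filters: fetch all rows whose info:age lies between "15" and "30".
// The values are compared as bytes (lexicographically), not as numbers.
private static void scanFilter() throws IOException,
        UnsupportedEncodingException {
    Configuration conf = HBaseConfiguration.create();
    conf.set("hbase.rootdir", "hdfs://ncst:9000/hbase");
    conf.set("hbase.zookeeper.quorum", "ncst");
    HTable ht = new HTable(conf, "users");
    // AND: a row must pass every filter in the list
    FilterList filterList = new FilterList(FilterList.Operator.MUST_PASS_ALL);
    // age >= "15"
    SingleColumnValueFilter filter1 = new SingleColumnValueFilter(
            "info".getBytes(), "age".getBytes(), CompareOp.GREATER_OR_EQUAL, "15".getBytes());
    // age <= "30"
    SingleColumnValueFilter filter2 = new SingleColumnValueFilter(
            "info".getBytes(), "age".getBytes(), CompareOp.LESS_OR_EQUAL, "30".getBytes());
    filterList.addFilter(filter1);
    filterList.addFilter(filter2);
    Scan scan = new Scan();
    // Attach the filter list to the scan
    scan.setFilter(filterList);
    ResultScanner rs = ht.getScanner(scan);
    for (Result result : rs) {
        for (Cell cell : result.rawCells()) {
            System.out.println(new String(CellUtil.cloneRow(cell)) + "\t"
                    + new String(CellUtil.cloneFamily(cell)) + "\t"
                    + new String(CellUtil.cloneQualifier(cell)) + "\t"
                    + new String(CellUtil.cloneValue(cell), "UTF-8") + "\t"
                    + cell.getTimestamp());
        }
    }
    ht.close();
}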
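One caveat with the range example above: the ages are stored as strings, so the two SingleColumnValueFilter conditions compare bytes lexicographically rather than numerically. The following plain-Java sketch (no HBase dependency; the class LexicographicAgeDemo and its helpers are hypothetical names) reproduces that byte-wise comparison and shows which string values fall inside the range "15" to "30".

```java
import java.nio.charset.StandardCharsets;

// Illustrative only: shows how string-encoded numbers are ordered as bytes.
class LexicographicAgeDemo {

    // Unsigned, byte-by-byte comparison; a shorter array sorts first on a tie.
    static int compareBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xff) - (b[i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    }

    // True when lower <= value <= upper in byte order.
    static boolean inRange(String value, String lower, String upper) {
        byte[] v = value.getBytes(StandardCharsets.UTF_8);
        byte[] lo = lower.getBytes(StandardCharsets.UTF_8);
        byte[] hi = upper.getBytes(StandardCharsets.UTF_8);
        return compareBytes(v, lo) >= 0 && compareBytes(v, hi) <= 0;
    }

    public static void main(String[] args) {
        // "2" and "200" are outside 15..30 numerically but inside it byte-wise.
        String[] ages = {"18", "24", "38", "2", "200", "9"};
        for (String age : ages) {
            System.out.println(age + " in [\"15\", \"30\"]: " + inRange(age, "15", "30"));
        }
    }
}
```

If numeric ordering matters, one common remedy is to store the values zero-padded to a fixed width (for example "018"), so that byte order and numeric order agree.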
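2. Column value filter -- SingleColumnValueFilter
Tests a column value for equality (CompareOp.EQUAL), inequality (CompareOp.NOT_EQUAL), or a one-sided range (such as CompareOp.GREATER). When the tested column matches, the whole row is returned, not just that column, as the Shell output further down shows. Constructors:
2.1. The value to compare against is a byte array (Shell support unclear)
SingleColumnValueFilter(byte[] family, byte[] qualifier, CompareFilter.CompareOp compareOp, byte[] value)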
// SingleColumnValueFilter example
private static void scanFilter01() throws IOException,
        UnsupportedEncodingException {
    Configuration conf = HBaseConfiguration.create();
    conf.set("hbase.rootdir", "hdfs://ncst:9000/hbase");
    conf.set("hbase.zookeeper.quorum", "ncst");
    HTable ht = new HTable(conf, "users");
    // Match rows whose info:age equals "18"
    SingleColumnValueFilter scvf = new SingleColumnValueFilter(
            "info".getBytes(), "age".getBytes(), CompareOp.EQUAL, "18".getBytes());
    Scan scan = new Scan();
    scan.setFilter(scvf);
    ResultScanner rs = ht.getScanner(scan);
    for (Result result : rs) {
        for (Cell cell : result.rawCells()) {
            System.out.println(new String(CellUtil.cloneRow(cell)) + "\t"
                    + new String(CellUtil.cloneFamily(cell)) + "\t"
                    + new String(CellUtil.cloneQualifier(cell)) + "\t"
                    + new String(CellUtil.cloneValue(cell), "UTF-8") + "\t"
                    + cell.getTimestamp());
        }
    }
    ht.close();
}
2.2. The value to compare against is a comparator (ByteArrayComparable)
SingleColumnValueFilter(byte[] family, byte[] qualifier, CompareFilter.CompareOp compareOp, ByteArrayComparable comparator)
// SingleColumnValueFilter example 2 -- RegexStringComparator
private static void scanFilter02() throws IOException,
        UnsupportedEncodingException {
    Configuration conf = HBaseConfiguration.create();
    conf.set("hbase.rootdir", "hdfs://ncst:9000/hbase");
    conf.set("hbase.zookeeper.quorum", "ncst");
    HTable ht = new HTable(conf, "users");
    // Compare the value against a regular expression -- RegexStringComparator
    // ".4" matches info:age values that contain any character followed by "4",
    // such as "24"; use ".4$" to require that the value ends with it
    RegexStringComparator comparator = new RegexStringComparator(".4");
    // Only the fourth argument differs from the byte-array constructor
    SingleColumnValueFilter scvf = new SingleColumnValueFilter(
            "info".getBytes(), "age".getBytes(), CompareOp.EQUAL, comparator);
    Scan scan = new Scan();
    scan.setFilter(scvf);
    ResultScanner rs = ht.getScanner(scan);
    for (Result result : rs) {
        for (Cell cell : result.rawCells()) {
            System.out.println(new String(CellUtil.cloneRow(cell)) + "\t"
                    + new String(CellUtil.cloneFamily(cell)) + "\t"
                    + new String(CellUtil.cloneQualifier(cell)) + "\t"
                    + new String(CellUtil.cloneValue(cell), "UTF-8") + "\t"
                    + cell.getTimestamp());
        }
    }
    ht.close();
}
hbase(main):032:0> scan 'users',{FILTER=>"SingleColumnValueFilter('info','age',=,'regexstring:.4')"}
ROW COLUMN+CELL
xiaoming01 column=address:contry, timestamp=1442000277200, value=\xE4\xB8\xAD\xE5\x9B\xBD
xiaoming01 column=address:country, timestamp=1442000228945, value=\xE4\xB8\xAD\xE5\x9B\xBD
xiaoming01 column=info:age, timestamp=1441998917568, value=24
xiaoming02 column=info:age, timestamp=1441998917594, value=24
xiaoming03 column=info:age, timestamp=1441998919607, value=24
3 row(s) in 0.0130 seconds
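The pattern ".4" used above is not anchored, so it matches anywhere inside the value. The sketch below uses only java.util.regex (not HBase) to show the difference between the unanchored pattern and anchored variants; it assumes the comparator searches the value for the pattern the way Matcher.find() does, and the class name RegexAnchorDemo is hypothetical.

```java
import java.util.regex.Pattern;

// Illustrative only: compares unanchored and anchored patterns on sample ages.
class RegexAnchorDemo {

    // True when the pattern occurs anywhere in the value (Matcher.find()).
    static boolean found(String regex, String value) {
        return Pattern.compile(regex).matcher(value).find();
    }

    public static void main(String[] args) {
        String[] values = {"24", "245", "4", "14"};
        String[] patterns = {".4", ".4$", "4$"};
        for (String pattern : patterns) {
            for (String value : values) {
                System.out.println("pattern " + pattern + " on " + value
                        + ": " + found(pattern, value));
            }
        }
    }
}
```

With these samples, ".4" also accepts "245", while ".4$" rejects it; "4$" is the pattern to use when a single-character value such as "4" should match as well.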
// SingleColumnValueFilter example 3 -- SubstringComparator
private static void scanFilter03() throws IOException,
        UnsupportedEncodingException {
    Configuration conf = HBaseConfiguration.create();
    conf.set("hbase.rootdir", "hdfs://ncst:9000/hbase");
    conf.set("hbase.zookeeper.quorum", "ncst");
    HTable ht = new HTable(conf, "users");
    // Test whether a substring occurs in the value (case-insensitive) -- SubstringComparator
    // Match rows whose info:age value contains "4"
    SubstringComparator comparator = new SubstringComparator("4");
    // Only the fourth argument differs from the byte-array constructor
    SingleColumnValueFilter scvf = new SingleColumnValueFilter(
            "info".getBytes(), "age".getBytes(), CompareOp.EQUAL, comparator);
    Scan scan = new Scan();
    scan.setFilter(scvf);
    ResultScanner rs = ht.getScanner(scan);
    for (Result result : rs) {
        for (Cell cell : result.rawCells()) {
            System.out.println(new String(CellUtil.cloneRow(cell)) + "\t"
                    + new String(CellUtil.cloneFamily(cell)) + "\t"
                    + new String(CellUtil.cloneQualifier(cell)) + "\t"
                    + new String(CellUtil.cloneValue(cell), "UTF-8") + "\t"
                    + cell.getTimestamp());
        }
    }
    ht.close();
}
hbase(main):033:0> scan 'users',{FILTER=>"SingleColumnValueFilter('info','age',=,'substring:4')"}
ROW COLUMN+CELL
xiaoming01 column=address:contry, timestamp=1442000277200, value=\xE4\xB8\xAD\xE5\x9B\xBD
xiaoming01 column=address:country, timestamp=1442000228945, value=\xE4\xB8\xAD\xE5\x9B\xBD
xiaoming01 column=info:age, timestamp=1441998917568, value=24
xiaoming02 column=info:age, timestamp=1441998917594, value=24
xiaoming03 column=info:age, timestamp=1441998919607, value=24
3 row(s) in 0.0180 seconds
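3. Column name filters
Because HBase stores its data internally as key-value pairs, column name filters test whether a row contains given column names (ColumnFamily:Qualifier), just as the previous section tested column values.
3.1. FamilyFilter: filtering by column family
FamilyFilter(CompareFilter.CompareOp familyCompareOp, ByteArrayComparable familyComparator)
Notes:
1. If you are looking for a known column family, scan.addFamily(family); is more efficient than a filter.
2. Because HBase's support for multiple column families is still limited, this filter currently has few uses.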
// FamilyFilter: filter data by column family
private static void scanFilter04() throws IOException,
        UnsupportedEncodingException {
    Configuration conf = HBaseConfiguration.create();
    conf.set("hbase.rootdir", "hdfs://ncst:9000/hbase");
    conf.set("hbase.zookeeper.quorum", "ncst");
    HTable ht = new HTable(conf, "users");
    // Match the column family equal to 'address'
    //FamilyFilter familyFilter = new FamilyFilter(CompareOp.EQUAL, new BinaryComparator("address".getBytes()));
    // Match column families that start with 'add'
    FamilyFilter familyFilter = new FamilyFilter(
            CompareOp.EQUAL, new BinaryPrefixComparator("add".getBytes()));
    Scan scan = new Scan();
    scan.setFilter(familyFilter);
    ResultScanner rs = ht.getScanner(scan);
    for (Result result : rs) {
        for (Cell cell : result.rawCells()) {
            System.out.println(new String(CellUtil.cloneRow(cell)) + "\t"
                    + new String(CellUtil.cloneFamily(cell)) + "\t"
                    + new String(CellUtil.cloneQualifier(cell)) + "\t"
                    + new String(CellUtil.cloneValue(cell), "UTF-8") + "\t"
                    + cell.getTimestamp());
        }
    }
    ht.close();
}
hbase(main):021:0> scan 'users',{FILTER=>"FamilyFilter(=,'binaryprefix:add')"}
ROW COLUMN+CELL
xiaoming column=address:city, timestamp=1441997498965, value=hangzhou
xiaoming column=address:contry, timestamp=1441997498911, value=china
xiaoming column=address:province, timestamp=1441997498939, value=zhejiang
xiaoming01 column=address:contry, timestamp=1442000277200, value=\xE4\xB8\xAD\xE5\x9B\xBD
xiaoming01 column=address:country, timestamp=1442000228945, value=\xE4\xB8\xAD\xE5\x9B\xBD
zhangyifei column=address:city, timestamp=1441997499108, value=jieyang
zhangyifei column=address:contry, timestamp=1441997499077, value=china
zhangyifei column=address:province, timestamp=1441997499093, value=guangdong
zhangyifei column=address:town, timestamp=1441997500711, value=xianqiao
3 row(s) in 0.0400 seconds
3.2. QualifierFilter: filtering by column qualifier (column name)
QualifierFilter(CompareFilter.CompareOp op, ByteArrayComparable qualifierComparator)
Note: this filter is probably used more often than FamilyFilter.
// QualifierFilter: filter data by qualifier (column name)
private static void scanFilter05() throws IOException,
        UnsupportedEncodingException {
    Configuration conf = HBaseConfiguration.create();
    conf.set("hbase.rootdir", "hdfs://ncst:9000/hbase");
    conf.set("hbase.zookeeper.quorum", "ncst");
    HTable ht = new HTable(conf, "users");
    // Columns whose name equals 'age'
    //QualifierFilter qualifierFilter = new QualifierFilter(CompareOp.EQUAL, new BinaryComparator("age".getBytes()));
    // Columns whose name starts with 'age' (including 'age' itself)
    //QualifierFilter qualifierFilter = new QualifierFilter(CompareOp.EQUAL, new BinaryPrefixComparator("age".getBytes()));
    // Columns whose name contains 'age' (including 'age' itself)
    //QualifierFilter qualifierFilter = new QualifierFilter(CompareOp.EQUAL, new SubstringComparator("age"));
    // Columns whose name matches the regular expression '.ge'
    QualifierFilter qualifierFilter = new QualifierFilter(
            CompareOp.EQUAL, new RegexStringComparator(".ge"));
    Scan scan = new Scan();
    scan.setFilter(qualifierFilter);
    ResultScanner rs = ht.getScanner(scan);
    for (Result result : rs) {
        for (Cell cell : result.rawCells()) {
            System.out.println(new String(CellUtil.cloneRow(cell)) + "\t"
                    + new String(CellUtil.cloneFamily(cell)) + "\t"
                    + new String(CellUtil.cloneQualifier(cell)) + "\t"
                    + new String(CellUtil.cloneValue(cell), "UTF-8") + "\t"
                    + cell.getTimestamp());
        }
    }
    ht.close();
}
hbase(main):020:0> scan 'users',{FILTER=>"QualifierFilter(=,'regexstring:.ge')"}
ROW COLUMN+CELL
xiaoming column=info:age, timestamp=1441997971945, value=38
xiaoming01 column=info:age, timestamp=1441998917568, value=24
xiaoming02 column=info:age, timestamp=1441998917594, value=24
xiaoming03 column=info:age, timestamp=1441998919607, value=24
zhangyifei column=info:age, timestamp=1442247255446, value=18
5 row(s) in 0.0460 seconds
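3.3. ColumnPrefixFilter: filtering by column name prefix (QualifierFilter can achieve the same result)
ColumnPrefixFilter(byte[] prefix)
Note: the same column name can appear in several column families; this filter returns the matching columns from all of them.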
// ColumnPrefixFilter example
private static void scanFilter06() throws IOException,
        UnsupportedEncodingException {
    Configuration conf = HBaseConfiguration.create();
    conf.set("hbase.rootdir", "hdfs://ncst:9000/hbase");
    conf.set("hbase.zookeeper.quorum", "ncst");
    HTable ht = new HTable(conf, "users");
    // Match all columns whose name starts with 'ag'
    ColumnPrefixFilter columnPrefixFilter = new ColumnPrefixFilter("ag".getBytes());
    Scan scan = new Scan();
    scan.setFilter(columnPrefixFilter);
    ResultScanner rs = ht.getScanner(scan);
    for (Result result : rs) {
        for (Cell cell : result.rawCells()) {
            System.out.println(new String(CellUtil.cloneRow(cell)) + "\t"
                    + new String(CellUtil.cloneFamily(cell)) + "\t"
                    + new String(CellUtil.cloneQualifier(cell)) + "\t"
                    + new String(CellUtil.cloneValue(cell), "UTF-8") + "\t"
                    + cell.getTimestamp());
        }
    }
    ht.close();
}
hbase(main):018:0> scan 'users',{FILTER=>"ColumnPrefixFilter('ag')"}
ROW COLUMN+CELL
xiaoming column=info:age, timestamp=1441997971945, value=38
xiaoming01 column=info:age, timestamp=1441998917568, value=24
xiaoming02 column=info:age, timestamp=1441998917594, value=24
xiaoming03 column=info:age, timestamp=1441998919607, value=24
zhangyifei column=info:age, timestamp=1442247255446, value=18
5 row(s) in 0.0280 seconds
3.4. MultipleColumnPrefixFilter: filtering by several column name prefixes
MultipleColumnPrefixFilter behaves much like ColumnPrefixFilter, but accepts several prefixes.
// MultipleColumnPrefixFilter example
private static void scanFilter07() throws IOException,
        UnsupportedEncodingException {
    Configuration conf = HBaseConfiguration.create();
    conf.set("hbase.rootdir", "hdfs://ncst:9000/hbase");
    conf.set("hbase.zookeeper.quorum", "ncst");
    HTable ht = new HTable(conf, "users");
    // Match all columns whose name starts with 'a' or 'c'
    // (the prefixes are passed as a two-dimensional byte array)
    byte[][] prefixes = new byte[][]{"a".getBytes(), "c".getBytes()};
    MultipleColumnPrefixFilter multipleColumnPrefixFilter = new MultipleColumnPrefixFilter(prefixes);
    Scan scan = new Scan();
    scan.setFilter(multipleColumnPrefixFilter);
    ResultScanner rs = ht.getScanner(scan);
    for (Result result : rs) {
        for (Cell cell : result.rawCells()) {
            System.out.println(new String(CellUtil.cloneRow(cell)) + "\t"
                    + new String(CellUtil.cloneFamily(cell)) + "\t"
                    + new String(CellUtil.cloneQualifier(cell)) + "\t"
                    + new String(CellUtil.cloneValue(cell), "UTF-8") + "\t"
                    + cell.getTimestamp());
        }
    }
    ht.close();
}
hbase(main):017:0> scan 'users',{FILTER=>"MultipleColumnPrefixFilter('a','c')"}
ROW COLUMN+CELL
xiaoming column=address:city, timestamp=1441997498965, value=hangzhou
xiaoming column=address:contry, timestamp=1441997498911, value=china
xiaoming column=info:age, timestamp=1441997971945, value=38
xiaoming column=info:company, timestamp=1441997498889, value=alibaba
xiaoming01 column=address:contry, timestamp=1442000277200, value=\xE4\xB8\xAD\xE5\x9B\xBD
xiaoming01 column=address:country, timestamp=1442000228945, value=\xE4\xB8\xAD\xE5\x9B\xBD
xiaoming01 column=info:age, timestamp=1441998917568, value=24
xiaoming02 column=info:age, timestamp=1441998917594, value=24
xiaoming03 column=info:age, timestamp=1441998919607, value=24
zhangyifei column=address:city, timestamp=1441997499108, value=jieyang
zhangyifei column=address:contry, timestamp=1441997499077, value=china
zhangyifei column=info:age, timestamp=1442247255446, value=18
zhangyifei column=info:company, timestamp=1441997499039, value=alibaba
5 row(s) in 0.0430 seconds
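3.5. ColumnRangeFilter: filtering by column range (not row range)
- Fetches a range of columns. For example, if a row has a million columns but you only want to see those whose names lie between bbbb and dddd.
- Introduced in HBase 0.92.
- The same column name can appear in several column families; this filter returns the matching columns from all of them.
Constructor:
ColumnRangeFilter(byte[] minColumn, boolean minColumnInclusive, byte[] maxColumn, boolean maxColumnInclusive)
Parameters:
- minColumn - lower bound of the column range; if null, there is no lower bound
- minColumnInclusive - whether the range includes minColumn
- maxColumn - upper bound of the column range; if null, there is no upper bound
- maxColumnInclusive - whether the range includes maxColumn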
// ColumnRangeFilter example
private static void scanFilter08() throws IOException,
        UnsupportedEncodingException {
    Configuration conf = HBaseConfiguration.create();
    conf.set("hbase.rootdir", "hdfs://ncst:9000/hbase");
    conf.set("hbase.zookeeper.quorum", "ncst");
    HTable ht = new HTable(conf, "users");
    // Match all columns whose name lies from 'a' (inclusive) up to 'c' (exclusive)
    ColumnRangeFilter columnRangeFilter = new ColumnRangeFilter(
            "a".getBytes(), true, "c".getBytes(), false);
    Scan scan = new Scan();
    scan.setFilter(columnRangeFilter);
    ResultScanner rs = ht.getScanner(scan);
    for (Result result : rs) {
        for (Cell cell : result.rawCells()) {
            System.out.println(new String(CellUtil.cloneRow(cell)) + "\t"
                    + new String(CellUtil.cloneFamily(cell)) + "\t"
                    + new String(CellUtil.cloneQualifier(cell)) + "\t"
                    + new String(CellUtil.cloneValue(cell), "UTF-8") + "\t"
                    + cell.getTimestamp());
        }
    }
    ht.close();
}
hbase(main):016:0> scan 'users',{FILTER=>"ColumnRangeFilter('a',true,'c',false)"}
ROW COLUMN+CELL
xiaoming column=info:age, timestamp=1441997971945, value=38
xiaoming column=info:birthday, timestamp=1441997498851, value=1987-06-17
xiaoming01 column=info:age, timestamp=1441998917568, value=24
xiaoming02 column=info:age, timestamp=1441998917594, value=24
xiaoming03 column=info:age, timestamp=1441998919607, value=24
zhangyifei column=info:age, timestamp=1442247255446, value=18
zhangyifei column=info:birthday, timestamp=1441997498990, value=1987-4-17
5 row(s) in 0.0340 seconds
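The four constructor parameters can be read as an interval test on the column name. The sketch below models that test in plain Java, without HBase. It is only an approximation under a stated assumption: it compares ASCII names with String.compareTo, which for ASCII agrees with byte order. ColumnRangeSketch and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative only: models the [min, max) style interval test on column names.
class ColumnRangeSketch {

    // A null bound means "unbounded" on that side.
    static boolean inRange(String qualifier, String min, boolean minInclusive,
                           String max, boolean maxInclusive) {
        if (min != null) {
            int c = qualifier.compareTo(min);
            if (c < 0 || (c == 0 && !minInclusive)) {
                return false;
            }
        }
        if (max != null) {
            int c = qualifier.compareTo(max);
            if (c > 0 || (c == 0 && !maxInclusive)) {
                return false;
            }
        }
        return true;
    }

    // Keeps only the qualifiers that fall inside the range.
    static List<String> select(List<String> qualifiers, String min, boolean minInclusive,
                               String max, boolean maxInclusive) {
        List<String> kept = new ArrayList<>();
        for (String qualifier : qualifiers) {
            if (inRange(qualifier, min, minInclusive, max, maxInclusive)) {
                kept.add(qualifier);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> qualifiers = Arrays.asList("age", "birthday", "city", "company", "contry");
        // Same bounds as ColumnRangeFilter('a', true, 'c', false)
        System.out.println(select(qualifiers, "a", true, "c", false));
    }
}
```

Names such as "city" sort after the bare bound "c", which is why the Shell output above lists only the age and birthday columns.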
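4. RowKey
When you need a range of rows selected by a property of the row key, setting startRow and stopRow on the Scan is more efficient. However, startRow and stopRow can only match the leading characters of the row key, not characters in the middle. For more complex filtering on the row key, use RowFilter.
Constructor: RowFilter(CompareFilter.CompareOp rowCompareOp, ByteArrayComparable rowComparator)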
// RowFilter example
private static void scanFilter09() throws IOException,
        UnsupportedEncodingException {
    Configuration conf = HBaseConfiguration.create();
    conf.set("hbase.rootdir", "hdfs://ncst:9000/hbase");
    conf.set("hbase.zookeeper.quorum", "ncst");
    HTable ht = new HTable(conf, "users");
    // Match all rows whose row key contains '01'
    RowFilter rowFilter = new RowFilter(CompareOp.EQUAL, new SubstringComparator("01"));
    Scan scan = new Scan();
    scan.setFilter(rowFilter);
    ResultScanner rs = ht.getScanner(scan);
    for (Result result : rs) {
        for (Cell cell : result.rawCells()) {
            System.out.println(new String(CellUtil.cloneRow(cell)) + "\t"
                    + new String(CellUtil.cloneFamily(cell)) + "\t"
                    + new String(CellUtil.cloneQualifier(cell)) + "\t"
                    + new String(CellUtil.cloneValue(cell), "UTF-8") + "\t"
                    + cell.getTimestamp());
        }
    }
    ht.close();
}
hbase(main):013:0> scan 'users',{FILTER=>"RowFilter(=,'substring:01')"}
ROW COLUMN+CELL
xiaoming01 column=address:contry, timestamp=1442000277200, value=\xE4\xB8\xAD\xE5\x9B\xBD
xiaoming01 column=address:country, timestamp=1442000228945, value=\xE4\xB8\xAD\xE5\x9B\xBD
xiaoming01 column=info:age, timestamp=1441998917568, value=24
1 row(s) in 0.0190 seconds
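For the prefix case, where startRow and stopRow are the more efficient choice, the stop row has to be the smallest key that sorts after every key with that prefix. A minimal sketch of one way to compute it, in plain Java with no HBase dependency, follows. PrefixStopRow and stopRowForPrefix are hypothetical names, and the sketch assumes that an empty stop row means "scan to the end of the table".

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Illustrative only: derives an exclusive stop row from a row key prefix.
class PrefixStopRow {

    // Drops trailing 0xFF bytes, then increments the last remaining byte.
    // Returns an empty array when no finite stop row exists.
    static byte[] stopRowForPrefix(byte[] prefix) {
        int end = prefix.length;
        while (end > 0 && prefix[end - 1] == (byte) 0xFF) {
            end--;
        }
        if (end == 0) {
            return new byte[0];
        }
        byte[] stop = Arrays.copyOf(prefix, end);
        stop[end - 1]++;
        return stop;
    }

    public static void main(String[] args) {
        byte[] start = "xiaoming".getBytes(StandardCharsets.UTF_8);
        byte[] stop = stopRowForPrefix(start);
        // Every key that starts with "xiaoming" sorts in [start, stop).
        System.out.println(new String(start, StandardCharsets.UTF_8)
                + " .. " + new String(stop, StandardCharsets.UTF_8));
    }
}
```

The resulting pair would be passed as the scan's start row and stop row; a RowFilter is then only needed for conditions that are not a prefix, such as the substring match above.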
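5. PageFilter (Shell support unclear)
Sets a page size in rows and returns a result set of that size.
Note that this filter cannot guarantee that the total number of rows returned is at most the page size, because the filter is applied separately on each region server. It only guarantees that no single region returns more rows than the page size.
Constructor: PageFilter(long pageSize)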
// PageFilter example
private static void scanFilter10() throws IOException,
        UnsupportedEncodingException {
    Configuration conf = HBaseConfiguration.create();
    conf.set("hbase.rootdir", "hdfs://ncst:9000/hbase");
    conf.set("hbase.zookeeper.quorum", "ncst");
    HTable ht = new HTable(conf, "users");
    // Start at row key "xiaoming" and fetch 3 rows (including xiaoming)
    PageFilter pageFilter = new PageFilter(3L);
    Scan scan = new Scan();
    scan.setStartRow("xiaoming".getBytes());
    scan.setFilter(pageFilter);
    ResultScanner rs = ht.getScanner(scan);
    for (Result result : rs) {
        for (Cell cell : result.rawCells()) {
            System.out.println(new String(CellUtil.cloneRow(cell)) + "\t"
                    + new String(CellUtil.cloneFamily(cell)) + "\t"
                    + new String(CellUtil.cloneQualifier(cell)) + "\t"
                    + new String(CellUtil.cloneValue(cell), "UTF-8") + "\t"
                    + cell.getTimestamp());
        }
    }
    ht.close();
}
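Note: because this filter cannot guarantee that the number of rows returned stays within the page size, a better way to fetch a fixed number of rows is ResultScanner.next(int nbRows), that is: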
// Modified version of the demo above
private static void scanFilter11() throws IOException,
        UnsupportedEncodingException {
    Configuration conf = HBaseConfiguration.create();
    conf.set("hbase.rootdir", "hdfs://ncst:9000/hbase");
    conf.set("hbase.zookeeper.quorum", "ncst");
    HTable ht = new HTable(conf, "users");
    // Start at row key "xiaoming" and fetch 3 rows (including xiaoming)
    //PageFilter pageFilter = new PageFilter(3L);
    Scan scan = new Scan();
    scan.setStartRow("xiaoming".getBytes());
    //scan.setFilter(pageFilter);
    ResultScanner rs = ht.getScanner(scan);
    // Ask the scanner for at most 3 rows
    for (Result result : rs.next(3)) {
        for (Cell cell : result.rawCells()) {
            System.out.println(new String(CellUtil.cloneRow(cell)) + "\t"
                    + new String(CellUtil.cloneFamily(cell)) + "\t"
                    + new String(CellUtil.cloneQualifier(cell)) + "\t"
                    + new String(CellUtil.cloneValue(cell), "UTF-8") + "\t"
                    + cell.getTimestamp());
        }
    }
    ht.close();
}
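Why a per-region limit can overshoot the page size is easiest to see with a small simulation. The sketch below is plain Java with no HBase dependency: the region layout and row keys are invented for the illustration, and PageLimitSketch is a hypothetical name. It applies a limit of 3 rows to each region separately and then caps the combined result on the client, in the spirit of ResultScanner.next(int nbRows).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative only: models a page limit that is enforced separately per region.
class PageLimitSketch {

    // Each inner list holds the row keys of one hypothetical region, in key order.
    static List<String> scanWithPerRegionLimit(List<List<String>> regions, int pageSize) {
        List<String> returned = new ArrayList<>();
        for (List<String> region : regions) {
            // The limit starts over for every region; the counter is not shared.
            int limit = Math.min(pageSize, region.size());
            returned.addAll(region.subList(0, limit));
        }
        return returned;
    }

    // Client-side cap on the combined result.
    static List<String> capOnClient(List<String> rows, int pageSize) {
        return new ArrayList<>(rows.subList(0, Math.min(pageSize, rows.size())));
    }

    public static void main(String[] args) {
        // Made-up layout: two regions holding four and two rows.
        List<List<String>> regions = Arrays.asList(
                Arrays.asList("row-a1", "row-a2", "row-a3", "row-a4"),
                Arrays.asList("row-b1", "row-b2"));
        List<String> rows = scanWithPerRegionLimit(regions, 3);
        System.out.println(rows.size() + " rows with per-region limit: " + rows);
        System.out.println("after client-side cap: " + capOnClient(rows, 3));
    }
}
```

With this layout the per-region limit lets five rows through, while the client-side cap returns exactly three.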
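6. SkipFilter (no SkipFilter class in the Shell)
Filters on every column of a row: if any single column fails the condition, the whole row is dropped. The Shell has no SkipFilter class; its filter language offers a SKIP operator for this purpose.
Constructor: SkipFilter(Filter filter)
For example, suppose every column in a row holds the weight of a different item. In a real scenario all of these values must be greater than zero, and we want to drop every row that has a 0 in any column. We combine ValueFilter and SkipFilter to do this:
scan.setFilter(new SkipFilter(new ValueFilter(CompareOp.NOT_EQUAL, new BinaryComparator(Bytes.toBytes(0)))));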
// SkipFilter example
private static void scanFilter12() throws IOException,
        UnsupportedEncodingException {
    Configuration conf = HBaseConfiguration.create();
    conf.set("hbase.rootdir", "hdfs://ncst:9000/hbase");
    conf.set("hbase.zookeeper.quorum", "ncst");
    HTable ht = new HTable(conf, "users");
    // Drop every row that contains a cell whose value equals "24"
    SkipFilter skipFilter = new SkipFilter(
            new ValueFilter(CompareOp.NOT_EQUAL, new BinaryComparator("24".getBytes())));
    Scan scan = new Scan();
    scan.setFilter(skipFilter);
    ResultScanner rs = ht.getScanner(scan);
    for (Result result : rs) {
        for (Cell cell : result.rawCells()) {
            System.out.println(new String(CellUtil.cloneRow(cell)) + "\t"
                    + new String(CellUtil.cloneFamily(cell)) + "\t"
                    + new String(CellUtil.cloneQualifier(cell)) + "\t"
                    + new String(CellUtil.cloneValue(cell), "UTF-8") + "\t"
                    + cell.getTimestamp());
        }
    }
    ht.close();
}
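The "one failing cell drops the whole row" rule can be modelled without HBase. In the plain-Java sketch below, a row is a list of cell values and is kept only if every value differs from the banned one. This mirrors the SkipFilter plus ValueFilter(NOT_EQUAL) combination above, but it is only an approximation: SkipRowSketch and keepRowsWithout are hypothetical names, and the sample rows are abbreviated from the Shell output in this article.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: models "drop the whole row if any cell fails the test".
class SkipRowSketch {

    // Keeps a row only when none of its cell values equals the banned value.
    static Map<String, List<String>> keepRowsWithout(Map<String, List<String>> rows,
                                                     String banned) {
        Map<String, List<String>> kept = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> row : rows.entrySet()) {
            boolean everyCellPasses = true;
            for (String value : row.getValue()) {
                if (value.equals(banned)) {
                    everyCellPasses = false;
                    break;
                }
            }
            if (everyCellPasses) {
                kept.put(row.getKey(), row.getValue());
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        Map<String, List<String>> rows = new LinkedHashMap<>();
        rows.put("xiaoming", Arrays.asList("hangzhou", "china", "38", "alibaba"));
        rows.put("xiaoming02", Arrays.asList("24"));
        rows.put("zhangyifei", Arrays.asList("jieyang", "china", "18", "alibaba"));
        // xiaoming02 holds a cell equal to "24", so the whole row is dropped.
        System.out.println(keepRowsWithout(rows, "24").keySet());
    }
}
```

Without the SkipFilter wrapper, a plain ValueFilter would drop only the offending cells and still return the rest of each row.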
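7. Utility -- FirstKeyOnlyFilter
This filter returns only the first cell of each row, which makes it an efficient way to count rows. It probably has little other practical use.
Constructor: public FirstKeyOnlyFilter()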
// FirstKeyOnlyFilter example
private static void scanFilter13() throws IOException,
        UnsupportedEncodingException {
    Configuration conf = HBaseConfiguration.create();
    conf.set("hbase.rootdir", "hdfs://ncst:9000/hbase");
    conf.set("hbase.zookeeper.quorum", "ncst");
    HTable ht = new HTable(conf, "users");
    // Return only the first cell of each row
    FirstKeyOnlyFilter firstKeyOnlyFilter = new FirstKeyOnlyFilter();
    Scan scan = new Scan();
    scan.setFilter(firstKeyOnlyFilter);
    ResultScanner rs = ht.getScanner(scan);
    int rowCount = 0;
    for (Result result : rs) {
        for (Cell cell : result.rawCells()) {
            System.out.println(new String(CellUtil.cloneRow(cell)) + "\t"
                    + new String(CellUtil.cloneFamily(cell)) + "\t"
                    + new String(CellUtil.cloneQualifier(cell)) + "\t"
                    + new String(CellUtil.cloneValue(cell), "UTF-8") + "\t"
                    + cell.getTimestamp());
        }
        // One Result per row, so count rows here rather than cells
        rowCount++;
    }
    // Print the total number of rows
    System.out.println(rowCount);
    ht.close();
}
hbase(main):009:0> scan 'users',{FILTER=>'FirstKeyOnlyFilter()'}
ROW COLUMN+CELL
xiaoming column=address:city, timestamp=1441997498965, value=hangzhou
xiaoming01 column=address:contry, timestamp=1442000277200, value=\xE4\xB8\xAD\xE5\x9B\xBD
xiaoming02 column=info:age, timestamp=1441998917594, value=24
xiaoming03 column=info:age, timestamp=1441998919607, value=24
zhangyifei column=address:city, timestamp=1441997499108, value=jieyang
5 row(s) in 0.0240 seconds