Trying to use continuation hashes - NoSQL pagination returning paged results with Riak

Time: 2020-12-11 08:48:53

Edit: I added an answer with a more generic approach for NoSQL situations.

I am working on a project using Riak (with LevelDB).

Using the REST API that Riak offers, I am able to get data based on an index and a range, which returns the results sorted alphanumerically by the index, plus a continuation hash.

Example call: http://server/buckets/bucketname/index/someindex_int/333333333/555555555?max_results=10&return_terms=true&continuation=somehashhere

Example results:

{
  "results": [
    { "about_river": "12312" },
    { "balloon_tall": "45345" },
    { "basket_written": "23434523" }
  ],
  "continuation": "g2987392479789879087987asdfasdf="
}
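
For reference, here is roughly what that first page and the follow-up page look like with the official Riak Java client. This is a sketch, not a drop-in implementation: the bucket and index names come from the example call above, and I am assuming the 2.x client API, where IntIndexQuery appends the "_int" suffix to the index name for you.

import com.basho.riak.client.api.RiakClient;
import com.basho.riak.client.api.commands.indexes.IntIndexQuery;
import com.basho.riak.client.core.query.Namespace;
import com.basho.riak.client.core.util.BinaryValue;

public class ContinuationExample {
    public static void main(String[] args) throws Exception {
        RiakClient client = RiakClient.newClient("server"); // hypothetical node address
        Namespace ns = new Namespace("bucketname");

        // First page: max_results=10, return_terms=true, no continuation yet.
        IntIndexQuery first = new IntIndexQuery.Builder(ns, "someindex", 333333333L, 555555555L)
                .withMaxResults(10)
                .withKeyAndIndex(true) // return_terms=true
                .build();
        IntIndexQuery.Response page = client.execute(first);
        System.out.println("first page size: " + page.getEntries().size());

        // The continuation is an opaque token; hand it back unchanged to get the next page.
        if (page.hasContinuation()) {
            BinaryValue continuation = page.getContinuation();
            IntIndexQuery next = new IntIndexQuery.Builder(ns, "someindex", 333333333L, 555555555L)
                    .withMaxResults(10)
                    .withContinuation(continuation)
                    .build();
            IntIndexQuery.Response nextPage = client.execute(next);
            System.out.println("second page size: " + nextPage.getEntries().size());
        }
        client.shutdown();
    }
}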

I am also making a separate call, without max_results and return_terms, to get a count of the docs in the set. Knowing the number of docs per set and the total number of docs easily gives the number of "pages".
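
Sketched on the same assumed 2.x Java client, reusing client and ns from the sketch above, that count call is just the same range query without max_results, so it streams every matching entry back:

// No max_results: the response carries every entry in the range,
// so the entry count is the total number of docs.
IntIndexQuery countQuery = new IntIndexQuery.Builder(ns, "someindex", 333333333L, 555555555L)
        .build();
int totalDocs = client.execute(countQuery).getEntries().size();

int pageSize = 10;
int pageCount = (totalDocs + pageSize - 1) / pageSize; // ceiling division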

While I am able to make a call for each set of documents based on the hash and receive the next hash with each result set, I am looking for a way to predict the hashes so I can pre-populate the client with pagination links.

Is this possible? Are the hashes derived deterministically from the index/range info, or are they some random value generated by the node your data is returned from?

A coworker has mentioned that the hashes depend on which node in the cluster you hit, but I am unable to find documentation on this.

Secondly, the idea was brought up of cycling through the entire set in the background to collect the hashes (sketched below). This would work, but seems pretty expensive.
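
For what it's worth, that background walk is straightforward, just costly: page through the whole range once, saving each continuation so that page N's link can carry the token that starts it. Again a sketch on the assumed Java client, reusing client, ns, and the imports from the first sketch:

import java.util.ArrayList;
import java.util.List;

// pageTokens.get(n) is the continuation that starts page n;
// page 0 needs no token, so slot 0 is null.
List<BinaryValue> pageTokens = new ArrayList<>();
pageTokens.add(null);
BinaryValue token = null;
do {
    IntIndexQuery.Builder builder = new IntIndexQuery.Builder(ns, "someindex", 333333333L, 555555555L)
            .withMaxResults(10);
    if (token != null) {
        builder.withContinuation(token);
    }
    IntIndexQuery.Response page = client.execute(builder.build());
    token = page.hasContinuation() ? page.getContinuation() : null;
    if (token != null) {
        pageTokens.add(token); // this token begins the next page
    }
} while (token != null);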

I am brand new to Riak and any advice here would be great. I am not able to find any good examples of pagination with Riak. The one example that did exist is gone from the internet, as far as I can tell.

3 Answers

#1


1  

No, the continuation is not "predictable", nor is what your co-worker is saying correct.

Unfortunately there is no way to know the total number of objects in the specified range except by querying the range without the max_results parameter, as you are already doing (outside of a 1:1 relation between index key and object key, obviously).

#2


0  

The other answer was the answer I needed, but with some help from CodingHorror, I came up with the answer I wanted.

No pagination. With no pagination, only ever needing the hash for the next result set is no problem; in fact, it's ideal for my use case. Just merge the next set onto your existing set(s). But don't let it go on forever.
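
In code the pattern is tiny; each "load more" click runs something like the following (same assumed Java client and setup as the sketches in the question, with loaded and cursor living for the session):

// Session state: everything shown so far, plus where the stream resumes.
List<IntIndexQuery.Response.Entry> loaded = new ArrayList<>();
BinaryValue cursor = null;

// One "load more" click: fetch the next page and merge it onto the set.
IntIndexQuery.Builder builder = new IntIndexQuery.Builder(ns, "someindex", 333333333L, 555555555L)
        .withMaxResults(10);
if (cursor != null) {
    builder.withContinuation(cursor);
}
IntIndexQuery.Response page = client.execute(builder.build());
loaded.addAll(page.getEntries());
cursor = page.hasContinuation() ? page.getContinuation() : null;
boolean exhausted = (cursor == null); // hide the "load more" button when true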

My inspiration: http://blog.codinghorror.com/the-end-of-pagination/

Thanks, Jeff Atwood!

#3


0  

Isn't the number of results available in the response itself?

Something like:

// Execute the search asynchronously, then block until the response arrives.
RiakFuture<SearchOperation.Response, BinaryValue> searchResult = client.executeAsync(searchOp);
searchResult.await();
com.basho.riak.client.core.operations.SearchOperation.Response response = searchResult.get();
// numResults() reports how many documents matched the query.
logger.debug("number of results {}", response.numResults());
