How best to 'tail -f' a large collection in mongo through meteor?

Time: 2021-08-05 16:47:22

I have a collection in a mongo database to which I append logging-type information. I'm trying to figure out the most efficient/simplest way to "tail -f" that in a meteor app: as a new document is added to the collection, it should be sent to the client, which should append it to the end of its current set of documents from the collection.

The client won't be sent, and won't keep, all of the documents in the collection; likely just the last ~100 or so.

Now, from a Mongo perspective, I don't see a way of saying "the last N documents in the collection" that wouldn't need any sort at all. It seems like the best option available is a natural sort descending followed by a limit call, so something like what's listed in the mongo doc on $natural:

db.collection.find().sort( { $natural: -1 } )

So, on the server side AFAICT the way of publishing this 'last 100 documents' Meteor collection would be something like:

Meteor.publish('logmessages', function () {
  return LogMessages.find({}, { sort: { $natural: -1 }, limit: 100 });
});

Now, from a 'tail -f' perspective, this seems to have the right effect of sending the 'last 100 documents' to the client, but does so in the wrong order (the newest document would be at the start of the Meteor collection instead of at the end).

On the client side, this seems to mean needing to (unfortunately) reverse the collection. Now, I don't see a reverse() in the Meteor Collection docs and sorting by $natural: 1 doesn't work on the client (which seems reasonable, since there's no real Mongo context). In some cases, the messages will have timestamps within the documents and the client could sort by that to get the 'natural order' back, but that seems kind of hacky.

In any case, it feels like I'm likely missing a much simpler way to have a live 'last 100 documents inserted into the collection' collection published from mongo through meteor. :)

Thanks!

EDIT - looks like if I change the collection in Mongo to a capped collection, then the server could create a tailable cursor to efficiently (and quickly) get notified of new documents added to the collection. However, it's not clear to me if/how to get the server to do so through a Meteor collection.

An alternative that seems a little less efficient but doesn't require switching to a capped collection (AFAICT) is using Smart Collections, which tail the oplog, so at least it's event-driven instead of polling. Since all the operations on the source collection will be inserts, it seems like it'd still be pretty efficient. Unfortunately, AFAICT I'm still left with the sorting issues since I don't see how to define the server-side collection as 'last 100 documents inserted'. :(

If there is a way of creating a collection in Mongo as a query of another ("materialized view" of sorts), then maybe I could create a log-last-100 "collection view" in Mongo, and then Meteor would be able to just publish/subscribe the entire pseudo-collection?

1 solution

#1



For insert-only data, $natural should get you the same results as indexing on timestamp and sorting so that's a good idea. The reverse thing is unfortunate; I think you have a couple choices:

  1. use $natural and do the reverse yourself
  2. add timestamp, still use $natural
  3. add timestamp, index by time, sort

'#1' - For 100 items, doing the reverse client-side should be no problem even for mobile devices and that will off-load it from the server. You can use .fetch() to convert to an array and then reverse it to maintain order without needing to use timestamps. You'll be playing in normal array-land though; no more nice mini-mongo features so do any filtering first before reversing.
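
As a rough illustration of option #1, the sketch below reverses a copy of a fetched array. It assumes the array arrives newest-first, as the publication above intends; `toTailOrder` and the sample documents are made up for the example, and in a Meteor client the input would come from something like LogMessages.find().fetch().

```javascript
// Minimal sketch: turn a newest-first result set into tail order (oldest first).
// `newestFirst` stands in for the array returned by LogMessages.find().fetch();
// the function name and sample documents are hypothetical.
function toTailOrder(newestFirst) {
  // slice() copies first, so reverse() does not mutate the fetched array.
  return newestFirst.slice().reverse();
}

const fetched = [
  { _id: 'c', msg: 'third' },
  { _id: 'b', msg: 'second' },
  { _id: 'a', msg: 'first' },
];
const tail = toTailOrder(fetched);
console.log(tail.map((doc) => doc.msg).join(', ')); // first, second, third
```

Any filtering is easiest to express in the find() selector before the fetch, since the result is a plain array afterwards.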

'#2' - This one is interesting because you don't have to use an index but you can still use the timestamp on the client to sort the records. This gives you the benefit of staying in mini-mongo-land.
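
A possible shape for the client side of option #2 is sketched here. The field name `createdAt` and the helper `tailCursor` are assumptions for illustration; the only API relied on is a Mongo-style find(selector, options) with a sort option, which client-side Meteor collections provide.

```javascript
// Hypothetical helper: `collection` is anything exposing a Mongo-style
// find(selector, options), such as a client-side Meteor collection.
// `createdAt` is an assumed timestamp field written when the log entry is inserted.
function tailCursor(collection) {
  // Ascending sort puts the oldest document first and the newest last,
  // which is the order a 'tail -f' style view wants.
  return collection.find({}, { sort: { createdAt: 1 } });
}
```

In a template helper this would be called as tailCursor(LogMessages), and the result stays a cursor rather than a plain array.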

'#3' - Costs space on the db (for the extra index), but it's the most straightforward.
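
For option #3, the server-side publication might look roughly like the sketch below. `publishRecentLogs` and the `createdAt` field are assumptions, the Meteor global and the collection are passed in as parameters so the snippet has no hidden dependencies, and creating the index on `createdAt` is left out.

```javascript
// Sketch only: `meteor` is expected to be the Meteor global and `collection`
// the server-side LogMessages collection. `createdAt` is an assumed field name.
function publishRecentLogs(meteor, collection, limit) {
  meteor.publish('logmessages', function () {
    // Newest first, so the limit keeps the most recently inserted documents.
    return collection.find({}, { sort: { createdAt: -1 }, limit: limit });
  });
}
```

On the server this would be invoked once at startup, for example publishRecentLogs(Meteor, LogMessages, 100); the client can then sort ascending on the same field.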

If you don't need the capabilities of mini-mongo (or are comfortable doing array filtering yourself) then #1 is probably best.

Unfortunately MongoDB doesn't have views, so you can't do your log-last-100 view idea (although that would be a nice feature).

Beyond the above, keep an eye on your subscription life-cycle so users don't continually pull down log updates in the background when not viewing the log. I could see that quickly becoming a performance killer.
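
One way to keep the subscription tied to the log view is sketched below. It relies on Meteor.subscribe returning a handle with a stop() method; the helper name `openLogView` is hypothetical, and the Meteor global is passed in as a parameter to keep the snippet self-contained.

```javascript
// Sketch: start the subscription when the log view opens and hand back
// a function that stops it when the view closes.
// `meteor` is expected to be the Meteor global; `openLogView` is a made-up name.
function openLogView(meteor) {
  const handle = meteor.subscribe('logmessages');
  return function closeLogView() {
    // Stopping the subscription ends the stream of log updates for this client.
    handle.stop();
  };
}
```

The returned function would be called from whatever hook runs when the user navigates away from the log.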
