MongoDB: Which is faster? A regex search on a string or a direct search on an array?

Time: 2021-05-30 04:15:19

My application currently performs a regex search on a text field that holds a comma-separated list of ObjectIds. According to the MongoDB documentation, MongoDB uses indexes when performing regex searches.

My initial thought was to use an array to store the ObjectIds instead of using the string. But will the array search have better performance than the regex search since both are using indexes?

1 Answer

#1


Using an array of ObjectIds instead of a comma-separated list of ObjectId strings is the way to go here.

  1. An array uses less space: an ObjectId stored as a hex string takes 24 characters, while a BSON ObjectId is 12 bytes.
  2. An array index is more effective: a regex search that isn't anchored to the beginning of the text (i.e. doesn't start with ^) has to scan the entire index, O(n), whereas with an array each element gets its own multikey index entry, so a lookup is O(log n).
  3. In MongoDB versions before 4.2, an index entry must be smaller than 1024 bytes, which would limit you to roughly 40 ObjectIds in a text field (24 characters each, plus separators).
  4. Array elements are atomically modifiable: you can use array update operators to modify individual elements directly.
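
To make the points above concrete, here is a rough standard-library Python sketch. It does not talk to MongoDB at all: the sorted lists only imitate how a B-tree index would be consulted, and the sample ids, the `refs` field name, and the helper functions are all made up for illustration.

```python
import bisect
import re

# Hypothetical sample data: 24-character hex strings standing in for ObjectIds.
ids = [f"{i:024x}" for i in range(1, 6)]

# Space: the hex-string form is 24 bytes, the binary form is 12 bytes.
hex_size = len(ids[0].encode("ascii"))
binary_size = len(bytes.fromhex(ids[0]))


def max_ids_in_string(limit_bytes, id_len=24, sep_len=1):
    """Largest n such that n ids joined by separators stay under limit_bytes."""
    n = 0
    while (n + 1) * id_len + n * sep_len < limit_bytes:
        n += 1
    return n


# Hypothetical documents: document id -> list of referenced ids.
docs = {
    "a": [ids[0], ids[1]],
    "b": [ids[1], ids[2], ids[3]],
    "c": [ids[4]],
}

# String model: one index key per document (the whole comma-separated string).
string_index = sorted((",".join(refs), doc_id) for doc_id, refs in docs.items())

# Array model: one index entry per array element, kept sorted (multikey-style).
multikey_index = sorted(
    (ref, doc_id) for doc_id, refs in docs.items() for ref in refs
)


def find_by_regex(target):
    """Unanchored regex: every key in the index has to be tested."""
    pattern = re.compile(re.escape(target))  # no leading ^
    examined = 0
    hits = []
    for key, doc_id in string_index:
        examined += 1
        if pattern.search(key):
            hits.append(doc_id)
    return sorted(hits), examined


def find_by_multikey(target):
    """Binary search to the first matching entry, then read the matches."""
    i = bisect.bisect_left(multikey_index, (target, ""))
    hits = []
    while i < len(multikey_index) and multikey_index[i][0] == target:
        hits.append(multikey_index[i][1])
        i += 1
    return sorted(hits)


# Update documents using the real $addToSet and $pull array operators.
# With a driver these would be passed to an update call, and the values
# would be ObjectId instances rather than hex strings.
add_ref = {"$addToSet": {"refs": ids[4]}}
remove_ref = {"$pull": {"refs": ids[0]}}

if __name__ == "__main__":
    print(hex_size, binary_size)
    print(max_ids_in_string(1024))
    print(find_by_regex(ids[1]))
    print(find_by_multikey(ids[1]))
```

The regex helper touches every key no matter where the match is, while the multikey helper jumps straight to the matching entries, which is the O(n) versus O(log n) difference described in the list.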
