MongoDB: How to return only the array elements that match a query list

Time: 2022-01-05 21:17:54

I have a collection called shops. Its structure looks like this:

[
     {
          '_id' : id1,
          'details' : {name: 'shopA'},
          'products' : [{
               _id: 'p1',
               details:  {
                    'name': 'product1'
               }
          },{
               _id: 'p2',
               details:  {
                    'name': 'product2'
               }
          }, {
               _id: 'p4',
               details:  {
                    'name': 'product4'
               }
          }]
     },{
          '_id' : id2,
          'details' : {name: 'shopB'},
          'products' : [{
               _id: 'p1',
               details:  {
                    'name': 'product1'
               }
          },{
               _id: 'p4',
               details:  {
                    'name': 'product4'
               }
          }, {
               _id: 'p5',
               details:  {
                    'name': 'product5'
               }
          }]
     },{
          '_id' : id3,
          'details' : {name: 'shopC'},
          'products' : [{
               _id: 'p1',
               details:  {
                    'name': 'product1'
               }
          },{
               _id: 'p2',
               details:  {
                    'name': 'product2'
               }
          }, {
               _id: 'p3',
               details:  {
                    'name': 'product3'
               }
          }]
     },{
          '_id' : id4,
          'details' : {name: 'shopOther'},
          'products' : [{
               _id: 'p10',
               details:  {
                    'name': 'product10'
               }
          },{
               _id: 'p12',
               details:  {
                    'name': 'product12'
               }
          }, {
               _id: 'p13',
               details:  {
                    'name': 'product13'
               }
          }]
     }
]

Now a user can select some products from a menu and try to find the shops that sell them. The result should be all the shops that provide at least one of the selected items.

Example:

Suppose the user selects ['p1', 'p2', 'p3'] (product ids). Then only three shops, id1, id2 and id3, should be listed (id4 has none of these items). In addition, each shop document in the results array should contain only the selected products, with the rest of that shop's products removed.

Is there a way to get such a result directly from MongoDB?

1 solution

#1



Since you asked nicely and the question is well formed, it deserves its own answer: similar existing answers may not suit as a reference, especially if your experience with MongoDB is limited.

Options like $redact may seem simple, and they are often well suited. But this is not such a case. Consider how you would need to construct the statement:

db.collection.aggregate([
  { "$match": { "products._id": { "$in": ["p1","p2","p3"] } }},
  { "$redact": {
    "$cond": {
      "if": {
        "$or": [
          { "$eq": [ "$_id", "p1" ] },
          { "$eq": [ "$_id", "p2" ] },
          { "$eq": [ "$_id", "p3" ] }
        ]
      },
      "then": "$$DESCEND",
      "else": "$$PRUNE"
    }
  }}
])

That makes the "not so obvious" use of $or as an aggregation operator, and the syntax and form are correct, but it is actually a "complete fail". The reason is that $redact is generally a "recursive" operation: it inspects "all levels" of the document, not just a specific level. So at the "top level" the _id assertion fails, because the top-level field of the same name (the shop's own _id) does not match the condition, and the whole document is pruned.
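
To make the recursion problem concrete, here is a rough plain-JavaScript emulation of how $$DESCEND and $$PRUNE apply the same condition at every level. It is only a sketch: redactSketch, isPlainObject and the sample shopA document are made up for illustration and are not how MongoDB implements $redact.

```javascript
// Simplified emulation of $redact with $$DESCEND / $$PRUNE (an assumption for
// illustration, not MongoDB's implementation): the same test runs at every level.
const wantedIds = ["p1", "p2", "p3"];

function isPlainObject(value) {
  return value !== null && typeof value === "object" && !Array.isArray(value);
}

function redactSketch(doc) {
  // Same test as the $cond in the pipeline: keep this level only if its own _id matches.
  if (!wantedIds.includes(doc._id)) return undefined; // $$PRUNE
  const kept = {};                                    // $$DESCEND
  for (const [key, value] of Object.entries(doc)) {
    if (Array.isArray(value)) {
      // Recurse into embedded documents inside arrays, dropping pruned ones.
      kept[key] = value
        .map((el) => (isPlainObject(el) ? redactSketch(el) : el))
        .filter((el) => el !== undefined);
    } else if (isPlainObject(value)) {
      const sub = redactSketch(value);
      if (sub !== undefined) kept[key] = sub;
    } else {
      kept[key] = value;
    }
  }
  return kept;
}

// Hypothetical sample shaped like the shops collection in the question.
const shopA = {
  _id: "id1",
  details: { name: "shopA" },
  products: [
    { _id: "p1", details: { name: "product1" } },
    { _id: "p4", details: { name: "product4" } },
  ],
};

// The shop's own _id is "id1", so the document is pruned before the
// products array is ever inspected.
console.log(redactSketch(shopA)); // undefined
```

Note that even the nested details sub-documents would be pruned under this condition, since they carry no _id at all.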

There is not much you can do about that with $redact here, but since _id in the array is actually a "unique" element, you can perform this operation in a $project stage with the help of $map and $setDifference:

db.collection.aggregate([
  { "$match": { "products._id": { "$in": ["p1","p2","p3"] } }},
  { "$project": {
    "details": 1,
    "products": {
      "$setDifference": [
        { "$map": {
          "input": "$products",
          "as": "el",
          "in": {
            "$cond": {
              "if": { 
                "$or": [
                  { "$eq": [ "$$el._id", "p1" ] },
                  { "$eq": [ "$$el._id", "p2" ] },
                  { "$eq": [ "$$el._id", "p3" ] }
                ]
              },
              "then": "$$el",
              "else": false
            }
          }
        }},
        [false]
      ]
    }
  }}
])

It seems lengthy, but it is actually very efficient. The $map operator processes the array "inline" for each document, acting on each element to produce a new array. Where the conditions do not match, $cond returns false in place of the element; $setDifference then compares that "set" of results against [false], which effectively "filters" the false values out of the resulting array and leaves only the valid matches behind.
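
Since the user's selection is not known in advance, the hard-coded $eq conditions would in practice be generated from the selected ids. A minimal sketch of that, assuming a non-empty list of ids; buildFilterPipeline is a hypothetical helper name, not part of any MongoDB API:

```javascript
// Hypothetical helper: builds the $match + $project pipeline shown above
// from an arbitrary (non-empty) list of selected product ids.
function buildFilterPipeline(ids) {
  // One { $eq: [ "$$el._id", <id> ] } test per selected id, combined with $or.
  const conditions = ids.map((id) => ({ $eq: ["$$el._id", id] }));
  return [
    { $match: { "products._id": { $in: ids } } },
    {
      $project: {
        details: 1,
        products: {
          $setDifference: [
            {
              $map: {
                input: "$products",
                as: "el",
                in: {
                  $cond: { if: { $or: conditions }, then: "$$el", else: false },
                },
              },
            },
            [false],
          ],
        },
      },
    },
  ];
}

const pipeline = buildFilterPipeline(["p1", "p2", "p3"]);
// In the mongo shell this would be passed along as: db.collection.aggregate(pipeline)
console.log(JSON.stringify(pipeline, null, 2));
```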

Of course, where the _id values or entire objects are not truly "unique", a "set" is no longer valid, since set operators discard duplicates. With this consideration, and the fact that the mentioned operators are not available in versions of MongoDB prior to 2.6, the more traditional approach is to $unwind the array members and then "filter" them via a $match operation.

db.collection.aggregate([
  { "$match": { "products._id": { "$in": ["p1","p2","p3"] } }},
  { "$unwind": "$products" },
  { "$match": { "products._id": { "$in": ["p1","p2","p3"] } }},
  { "$group": {    
      "_id": "$_id",
      "details": { "$first": "$details" },
      "products": { "$push": "$products" }
  }}
])

As in the other examples, the first $match stage should be executed first in the pipeline in order to reduce the "possible" documents to those matching the condition. The "second" $match stage does the actual "filtering" of the elements inside the array while they are in the "de-normalized" form.

Since the array was "deconstructed" by $unwind, the purpose of $group is to "re-build" the array, now "filtered" of the elements that do not match the condition.
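
If it helps to see what each stage does to the data, the $unwind / $match / $group sequence can be mimicked in plain JavaScript. This is only an illustration of the data flow, not how the server executes it; unwindMatchGroup and sampleShops are made-up names:

```javascript
// Plain-JavaScript walk-through of the $unwind -> $match -> $group stages.
function unwindMatchGroup(shops, ids) {
  // $unwind: one row per (shop, product) pair, the "de-normalized" form.
  const rows = shops.flatMap((shop) =>
    (shop.products || []).map((product) => ({
      _id: shop._id,
      details: shop.details,
      products: product,
    }))
  );

  // Second $match: keep only the rows whose product was selected.
  const matching = rows.filter((row) => ids.includes(row.products._id));

  // $group: re-build one document per shop, with $first on details
  // and $push on the surviving products.
  const groups = new Map();
  for (const row of matching) {
    if (!groups.has(row._id)) {
      groups.set(row._id, { _id: row._id, details: row.details, products: [] });
    }
    groups.get(row._id).products.push(row.products);
  }
  return [...groups.values()];
}

// Hypothetical sample shaped like the shops collection in the question.
const sampleShops = [
  { _id: "id1", details: { name: "shopA" }, products: [
    { _id: "p1", details: { name: "product1" } },
    { _id: "p2", details: { name: "product2" } },
    { _id: "p4", details: { name: "product4" } },
  ] },
  { _id: "id4", details: { name: "shopOther" }, products: [
    { _id: "p10", details: { name: "product10" } },
  ] },
];

console.log(JSON.stringify(unwindMatchGroup(sampleShops, ["p1", "p2", "p3"])));
```

In this sketch a shop with no selected products simply produces no rows; in the real pipeline the initial $match removes such shops before $unwind, which is what keeps the amount of unwound data down.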

MongoDB also offers the positional $ operator in order to select matched array elements from a query condition. Like so:

db.collection.find(
    { "products._id": { "$in": ["p1","p2","p3"] } },
    { "details": 1, "products.$": 1 }
)

But the problem here is that this operator only returns the "first" match for the conditions supplied in the query document. This is by design, and as yet there is no operator syntax that caters for more than a single match.
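
The effect of that limitation can be mimicked in plain JavaScript: for each shop, only the first array element that satisfies the condition survives. This is a sketch of the behaviour only; firstMatchOnly and shopC are illustrative names, not MongoDB API:

```javascript
// Mimics the positional $ projection: only the FIRST matching element is kept.
function firstMatchOnly(shop, ids) {
  const first = shop.products.find((product) => ids.includes(product._id));
  return {
    _id: shop._id,
    details: shop.details,
    products: first ? [first] : [],
  };
}

// Hypothetical sample: shopC from the question sells p1, p2 and p3.
const shopC = {
  _id: "id3",
  details: { name: "shopC" },
  products: [
    { _id: "p1", details: { name: "product1" } },
    { _id: "p2", details: { name: "product2" } },
    { _id: "p3", details: { name: "product3" } },
  ],
};

// All three products were selected, yet only p1 comes back.
console.log(JSON.stringify(firstMatchOnly(shopC, ["p1", "p2", "p3"]).products));
```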

So your best approach is currently to use the .aggregate() method in order to achieve the match filtering on inner arrays that you desire. Either that, or filter the returned contents yourself in client code, depending on how palatable that ultimately is to you.
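
For the client-code route, a minimal sketch might look like the following. It assumes the shop documents were already fetched (for example with a plain find() using the same $in condition) and are available as an array; filterShopsClientSide and fetchedShops are hypothetical names:

```javascript
// Client-side alternative: trim each shop's products array after fetching.
function filterShopsClientSide(shops, ids) {
  const idSet = new Set(ids);
  return shops
    // Copy each shop, keeping only the selected products.
    .map((shop) => ({
      ...shop,
      products: shop.products.filter((product) => idSet.has(product._id)),
    }))
    // Drop shops left with nothing (a find() with $in would already exclude these).
    .filter((shop) => shop.products.length > 0);
}

// Hypothetical result of a query against the shops collection.
const fetchedShops = [
  { _id: "id2", details: { name: "shopB" }, products: [
    { _id: "p1", details: { name: "product1" } },
    { _id: "p4", details: { name: "product4" } },
    { _id: "p5", details: { name: "product5" } },
  ] },
  { _id: "id4", details: { name: "shopOther" }, products: [
    { _id: "p10", details: { name: "product10" } },
  ] },
];

console.log(JSON.stringify(filterShopsClientSide(fetchedShops, ["p1", "p2", "p3"])));
```

The trade-off is that the full products arrays still travel over the wire before being trimmed.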
