ES Series, Part 14: ES Aggregation Analysis (Introduction, Metric Aggregations, Bucket Aggregations)

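I. Introduction to Aggregation Analysis

1. What is ES aggregation analysis?

Aggregation analysis is an important database feature: it performs aggregate computations over the data set returned by a query, such as finding the maximum or minimum of a field (or of a computed expression), or computing a sum or an average. As both a search engine and a database, ES also provides powerful aggregation capabilities.

Aggregations that compute a metric such as max, min, sum, or avg over a data set are called metric aggregations in ES.

Besides aggregate functions, relational databases can also group the queried data with group by and then compute metrics on each group. In ES, group by corresponds to bucketing, that is, bucket aggregations.

ES also provides matrix aggregations and pipeline aggregations, which were still being refined at the time of writing.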

2. How to write an ES aggregation query

Aggregations are defined in the search request body under the aggregations node, using the following syntax:

"aggregations" : {
    "<aggregation_name>" : { <!-- name of the aggregation -->
        "<aggregation_type>" : { <!-- type of the aggregation -->
            <aggregation_body> <!-- aggregation body: which fields to aggregate on -->
        }
        [,"meta" : {  [<meta_data_body>] } ]? <!-- metadata -->
        [,"aggregations" : { [<sub_aggregation>]+ } ]? <!-- sub-aggregations defined inside this aggregation -->
    }
    [,"<aggregation_name_2>" : { ... } ]* <!-- further aggregations -->
}

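Notes:

aggregations can also be abbreviated as aggs.

3. Value sources for aggregations

The values that an aggregation computes over can come from a field, or from the result of a script.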
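As a rough illustration of the syntax and the two value sources described above, the following Python sketch assembles an aggregation request body as a plain dictionary and serializes it to JSON. The helper name `metric_agg` and the aggregation names are assumptions made for this illustration; only the `aggs`, `field`, and `script` keys come from the request syntax used in this article.

```python
import json


def metric_agg(agg_type, field=None, script=None):
    """Build the {"<aggregation_type>": {<aggregation_body>}} part of one aggregation.

    The value source is a field, a script, or both (hypothetical helper).
    """
    body = {}
    if field is not None:
        body["field"] = field
    if script is not None:
        body["script"] = {"source": script}
    return {agg_type: body}


# "size": 0 asks ES to return only the aggregation results, no hits.
request_body = {
    "size": 0,
    "aggs": {
        "max_age": metric_agg("max", field="age"),
        "avg_age_plus_10": metric_agg("avg", script="doc.age.value + 10"),
    },
}

print(json.dumps(request_body, indent=2))
```

The printed JSON can be pasted as the body of a `_search` request; building it as a dictionary first just makes the name / type / body nesting easier to see.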

II. Metric Aggregations

1. max min sum avg

Example 1: find the maximum age across all documents

POST /book1/_search?pretty

{

  "size": 0,

  "aggs": {

    "maxage": {

      "max": {

        "field": "age"

      }

    }

  }

}

Result 1:

{

    "took": ,

    "timed_out": false,

    "_shards": {

        "total": ,

        "successful": ,

        "skipped": ,

        "failed":

    },

    "hits": {

        "total": ,

        "max_score": ,

        "hits": []

    },

    "aggregations": {

        "maxage": {

            "value": 54

        }

    }

}

Example 2: add a query condition and find the maximum age among documents whose name contains 'test':

POST /book1/_search?pretty

{

  "query":{

     "term":{

         "name":"test"

     }

  },

  "size": 2,

  "sort": [

    {

      "age": {

        "order": "desc"

      }

    }

  ],

  "aggs": {

    "maxage": {

      "max": {

        "field": "age"

      }

    }

  }

}

Result 2:

{

    "took": ,

    "timed_out": false,

    "_shards": {

        "total": ,

        "successful": ,

        "skipped": ,

        "failed":

    },

    "hits": {

        "total": ,

        "max_score": null,

        "hits": [

            {

                "_index": "book1",

                "_type": "english",

                "_id": "6IUkUmUBRzBxBrDgFok2",

                "_score": null,

                "_source": {

                    "name": "test goog my money",

                    "age": [

                        ,

                        ,

                        ,

                    ],

                    "class": "dsfdsf",

                    "addr": "中国"

                },

                "sort": [

                ]

            },

            {

                "_index": "book1",

                "_type": "english",

                "_id": "54UiUmUBRzBxBrDgfIl9",

                "_score": null,

                "_source": {

                    "name": "test goog my money",

                    "age": [

                        ,

                        ,

                    ],

                    "class": "dsfdsf",

                    "addr": "中国"

                },

                "sort": [

                ]

            }

        ]

    },

    "aggregations": {

        "maxage": {

            "value": 54

        }

    }

}

Example 3: the value comes from a script. Compute the average age across all documents, and also the average of age plus 10

POST /book1/_search?pretty

{

  "size": 0,

  "aggs": {

    "avg_age": {

      "avg": {

        "script": {

          "source": "doc.age.value"

        }

      }

    },

    "avg_age10": {

      "avg": {

        "script": {

          "source": "doc.age.value + 10"

        }

      }

    }

  }

}

Result 3:

{

    "took": ,

    "timed_out": false,

    "_shards": {

        "total": ,

        "successful": ,

        "skipped": ,

        "failed":

    },

    "hits": {

        "total": ,

        "max_score": ,

        "hits": []

    },

    "aggregations": {

        "avg_age": {

            "value": 7.585365853658536

        },

        "avg_age10": {

            "value": 17.585365853658537

        }

    }

}

Example 4: specify field, and use _value in the script to read the field value

POST  /book1/_search?pretty

{

  "size": 0,

  "aggs": {

    "sun_age": {

      "sum": {

          "field":"age",

        "script": {

          "source": "_value * 2"

        }

      }

    }

  }

}

Result 4:

{

    "took": ,

    "timed_out": false,

    "_shards": {

        "total": ,

        "successful": ,

        "skipped": ,

        "failed":

    },

    "hits": {

        "total": ,

        "max_score": ,

        "hits": []

    },

    "aggregations": {

        "sun_age": {

            "value":

        }

    }

}

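Example 5: specify a value to use for documents that have no value for the field. If missing is not specified, documents lacking the field are ignored: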

POST /book1/_search?pretty

{

  "size": 0,

  "aggs": {

    "sun_age": {

      "avg": {

          "field":"age",

        "missing":

      }

    }

  }

}

Result 5:

{

    "took": ,

    "timed_out": false,

    "_shards": {

        "total": ,

        "successful": ,

        "skipped": ,

        "failed":

    },

    "hits": {

        "total": ,

        "max_score": ,

        "hits": []

    },

    "aggregations": {

        "sun_age": {

            "value": 12.847826086956522

        }

    }

}
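The effect of missing can be pictured with a small Python sketch. This is only a local model of the behaviour described in Example 5, not how ES implements it; the sample documents and the function name `avg_age` are made up for the illustration.

```python
def avg_age(docs, missing=None):
    """Average the "age" values of docs (illustrative model only).

    Documents without an "age" are skipped unless a `missing` value is given,
    in which case they are treated as if they had that value.
    """
    values = []
    for doc in docs:
        age = doc.get("age")
        if age is None:
            if missing is not None:
                values.append(missing)
            continue
        # A field may hold a single value or a list of values.
        values.extend(age if isinstance(age, list) else [age])
    return sum(values) / len(values) if values else None


docs = [{"age": 10}, {"age": [20, 30]}, {"name": "no age here"}]

print(avg_age(docs))             # the document without age is ignored
print(avg_age(docs, missing=0))  # the document without age counts as 0
```

Supplying a missing value changes both the sum and the number of values, which is why the average in Result 5 differs from the plain average of the field.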

2. Document count (count)

Example 1: count the documents in the book1 index whose age is 12

POST book1/english/_count

{

    "query":{

        "match":{

            "age": 12

        }

    }

}

Result 1:

{

    "count": ,

    "_shards": {

        "total": ,

        "successful": ,

        "skipped": ,

        "failed":

    }

}

3. Value count: count the values of a field (for a multi-valued field, every value is counted)

Example 1:

POST /book1/_search?size=0

{

    "aggs":{

        "age_count":{

            "value_count":{

                "field":"age"

            }

        }

    }

}

Result 1:

{

    "took": ,

    "timed_out": false,

    "_shards": {

        "total": ,

        "successful": ,

        "skipped": ,

        "failed":

    },

    "hits": {

        "total": ,

        "max_score": ,

        "hits": []

    },

    "aggregations": {

        "age_count": {

            "value":

        }

    }

}

4. cardinality: count of distinct values

Example 1:

POST /book1/_search?size=0

{

    "aggs":{

        "age_count":{

            "value_count":{

                "field":"age"

            }

        },

        "name_count":{

            "cardinality":{

                "field":"age"

            }

        }

    }

}

Result 1:

{

    "took": ,

    "timed_out": false,

    "_shards": {

        "total": ,

        "successful": ,

        "skipped": ,

        "failed":

    },

    "hits": {

        "total": ,

        "max_score": ,

        "hits": []

    },
    "aggregations": {

        "name_count": {

            "value": 11

        },

        "age_count": {

            "value": 38

        }

    }
}

Note: the field has 38 values in total; after removing duplicates, 11 distinct values remain.
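The difference between value_count and cardinality can be sketched locally in Python. The two functions below are illustrative models under the assumption that a field holds either one value or a list of values; the sample documents are invented.

```python
def value_count(docs, field):
    """Count every value of `field`, including duplicates and each element of a list."""
    count = 0
    for doc in docs:
        value = doc.get(field)
        if value is None:
            continue
        count += len(value) if isinstance(value, list) else 1
    return count


def cardinality(docs, field):
    """Count the distinct values of `field` (an exact count in this sketch)."""
    distinct = set()
    for doc in docs:
        value = doc.get(field)
        if value is None:
            continue
        distinct.update(value if isinstance(value, list) else [value])
    return len(distinct)


docs = [{"age": 12}, {"age": [12, 13]}, {"age": 1}, {"name": "no age"}]
print(value_count(docs, "age"), cardinality(docs, "age"))
```

Note that ES computes cardinality with an approximate algorithm, so on large data sets its result can differ slightly from an exact count like this one.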

5. stats: returns five values (count, max, min, avg, sum)

Example 1:

POST /book1/_search?size=0

{

    "aggs":{

        "age_count":{

            "stats":{

                "field":"age"

            }

        }

    }

}

Result 1:

{

    "took": ,

    "timed_out": false,

    "_shards": {

        "total": ,

        "successful": ,

        "skipped": ,

        "failed":

    },

    "hits": {

        "total": ,

        "max_score": ,

        "hits": []

    },

    "aggregations": {

        "age_count": {
            "count": 38,

            "min": 1,

            "max": 54,

            "avg": 12.394736842105264,

            "sum": 471

        }

    }

}

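6. Extended stats

Extended statistics: compared with stats, it returns four more results: the sum of squares, the variance, the standard deviation, and the interval of the mean plus/minus two standard deviations.

Example 1:

POST /book1/_search?size=0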

{

    "aggs":{

        "age_stats":{

            "extended_stats":{

                "field":"age"

            }

        }

    }

}

Result 1:

{

    "took": ,

    "timed_out": false,

    "_shards": {

        "total": ,

        "successful": ,

        "skipped": ,

        "failed":

    },

    "hits": {

        "total": ,

        "max_score": ,

        "hits": []

    },

    "aggregations": {

        "age_stats": {

            "count": 38,

            "min": 1,

            "max": 54,

            "avg": 12.394736842105264,

            "sum": 471,

            "sum_of_squares": 11049,

            "variance": 137.13365650969527,

            "std_deviation": 11.710408041981085,

            "std_deviation_bounds": {

                "upper": 35.81555292606743,

                "lower": -11.026079241856905

            }
        }

    }

}
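The extra values returned by extended_stats can be re-derived from count, sum, and sum_of_squares. The following Python sketch does this for the numbers in this article, taking count and sum from the stats result on the same field; the variable names are my own, and treating the variance as a population variance is an assumption that happens to reproduce the figures shown.

```python
import math

# Values taken from the stats / extended_stats results shown in this article.
count = 38
total = 471            # "sum"
sum_of_squares = 11049

avg = total / count
# Population variance: mean of the squares minus the square of the mean.
variance = sum_of_squares / count - avg ** 2
std_deviation = math.sqrt(variance)
# Bounds at two standard deviations around the mean.
upper = avg + 2 * std_deviation
lower = avg - 2 * std_deviation

print(avg, variance, std_deviation, upper, lower)
```

Up to floating-point rounding, the printed values agree with avg, variance, std_deviation, and std_deviation_bounds in the result above.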

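7. Percentiles: the values at given percentiles

Example 1:

For the values of the specified field (or script), ES accumulates, from the smallest value to the largest, the share of documents for each value (as a percentage of all matching documents) and returns the value at each requested percentage. By default it returns the values at the [ 1, 5, 25, 50, 75, 95, 99 ] percentiles. The middle entry of the result below can be read as: 50% of the documents have age <= 12, or conversely, documents with age <= 12 make up 50% of all matching documents.

POST /book1/_search?size=0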

{

    "aggs":{

        "age_percentiles":{

            "percentiles":{

                "field":"age"

            }

        }

    }

}

Result 1:

{

    "took": ,

    "timed_out": false,

    "_shards": {

        "total": ,

        "successful": ,

        "skipped": ,

        "failed":

    },

    "hits": {

        "total": ,

        "max_score": ,

        "hits": []

    },
    "aggregations": {

        "age_percentiles": {

            "values": {

                "1.0": 1,

                "5.0": 1,

                "25.0": 1,

                "50.0": 12,

                "75.0": 13,

                "95.0": 40.600000000000016,

                "99.0": 54

            }

        }

    }
}

Example 2: specify the percentiles (what are the values at the 50%, 96%, and 99% percentiles)

POST /book1/_search?size=0

{

    "aggs":{

        "age_percentiles":{

            "percentiles":{

                "field":"age",

                "percents" : [50, 96, 99]

            }

        }

    }

}

Result 2:

{

    "took": ,

    "timed_out": false,

    "_shards": {

        "total": ,

        "successful": ,

        "skipped": ,

        "failed":

    },

    "hits": {

        "total": ,

        "max_score": ,

        "hits": []

    },

    "aggregations": {

        "age_percentiles": {

            "values": {

                "50.0": 12,

                "96.0": 44.779999999999966,

                "99.0": 54

            }

        }

    }

}

Note: 50% of the values are <= 12, 96% of the values are <= 44.78 (approximately), and 99% of the values are <= 54
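To make the idea of "the value at a given percentile" concrete, here is a minimal Python sketch that interpolates linearly between the closest ranks of the sorted values. It is an exact, in-memory illustration and not the algorithm ES uses (ES relies on an approximation), so its numbers will not necessarily match the results above; the function name and sample ages are invented.

```python
def percentile(values, percent):
    """Return the value at `percent` (0-100) using linear interpolation
    between the closest ranks of the sorted values."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    rank = (percent / 100) * (len(ordered) - 1)
    low = int(rank)
    high = min(low + 1, len(ordered) - 1)
    fraction = rank - low
    return ordered[low] + (ordered[high] - ordered[low]) * fraction


ages = [1, 1, 12, 12, 13, 54]
for p in (50, 96, 99):
    print(p, percentile(ages, p))
```

Non-integer results such as 44.78 in Result 2 come from this kind of interpolation between two neighbouring values.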

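8. Percentile ranks: the percentage of documents whose value is less than or equal to a given value

Example 1: compute the percentage of documents with age up to 12 and up to 35; this is the inverse of item 7

POST /book1/_search?size=0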

{

    "aggs":{

        "aggs_perc_rank":{

            "percentile_ranks":{

                "field":"age",

                "values" : [12, 35]

            }

        }

    }

}

Result 1:

{

    "took": ,

    "timed_out": false,

    "_shards": {

        "total": ,

        "successful": ,

        "skipped": ,

        "failed":

    },

    "hits": {

        "total": ,

        "max_score": ,

        "hits": []

    },

    "aggregations": {

        "aggs_perc_rank": {

            "values": {

                "12.0": 71.05263157894737,

                "35.0": 92.76315789473685

            }

        }

    }

}

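Explanation of the result: documents with age up to 12 make up about 71% of the documents, and documents with age up to 35 make up about 92.8%.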
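The inverse relationship with percentiles can be sketched in a few lines of Python. This is an exact count over an in-memory list with invented sample ages; ES uses an approximation with interpolation, so its percentages (such as the ones above) can differ from a plain count.

```python
def percentile_rank(values, threshold):
    """Percentage of `values` that are less than or equal to `threshold`."""
    if not values:
        raise ValueError("values must not be empty")
    at_or_below = sum(1 for v in values if v <= threshold)
    return 100.0 * at_or_below / len(values)


ages = [1, 1, 12, 12, 13, 20, 40, 54]
for threshold in (12, 35):
    print(threshold, percentile_rank(ages, threshold))
```

Where percentiles takes a percentage and returns a value, percentile_ranks takes a value and returns a percentage.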

9. Geo Bounds aggregation: computes the bounding box of the geo points in the document set

See the official documentation:

https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-metrics-geobounds-aggregation.html

10. Geo Centroid aggregation: computes the centroid of the geo points

See the official documentation:

https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-metrics-geocentroid-aggregation.html

III. Bucket Aggregations

1. Terms Aggregation: group by the values of a field

Example 1:

POST /book1/_search?size=0

{

    "aggs":{

        "age_terms":{

            "terms":{

                "field":"age"

            }

        }

    }

}

Note: this is equivalent to group by age

Result 1:

{

    "took": ,

    "timed_out": false,

    "_shards": {

        "total": ,

        "successful": ,

        "skipped": ,

        "failed":

    },

    "hits": {

        "total": ,

        "max_score": ,

        "hits": []

    },

    "aggregations": {

        "age_terms": {

            "doc_count_error_upper_bound": ,

            "sum_other_doc_count": ,

            "buckets": [

                {

                    "key": ,

                    "doc_count":

                },

                {

                    "key": ,

                    "doc_count":

                },

                {

                    "key": ,

                    "doc_count":

                },

                {

                    "key": ,

                    "doc_count":

                },

                {

                    "key": ,

                    "doc_count":

                },

                {

                    "key": ,

                    "doc_count":

                },

                {

                    "key": ,

                    "doc_count":

                },

                {

                    "key": ,

                    "doc_count":

                },

                {

                    "key": ,

                    "doc_count":

                },

                {

                    "key": ,

                    "doc_count":

                }

            ]

        }

    }

}

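Explanation of the result:

"doc_count_error_upper_bound": 0: the upper bound of the error on the document counts

"sum_other_doc_count": 1: the number of documents that were not returned, that is, documents that are not in any of the returned buckets

By default, the top 10 groups are returned, ordered by document count from high to low.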
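The shape of a terms result, including sum_other_doc_count, can be modelled with a short Python sketch. It runs in a single process, so there is no counting error to report; the function name `terms_agg`, the tie-breaking rule, and the sample ages are assumptions of this illustration.

```python
from collections import Counter


def terms_agg(values, size=10):
    """Group values, keep the `size` most frequent groups, and report
    how many documents fall outside the returned groups."""
    counts = Counter(values)
    # Sort by count descending, then by key ascending to break ties
    # (the tie-breaking rule is an assumption of this sketch).
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    top = ordered[:size]
    buckets = [{"key": key, "doc_count": n} for key, n in top]
    other = sum(counts.values()) - sum(n for _, n in top)
    return {"sum_other_doc_count": other, "buckets": buckets}


ages = [12, 12, 12, 1, 1, 13, 54]
print(terms_agg(ages, size=2))
```

With size=2 only the two largest groups are returned, and the two documents in the remaining groups show up in sum_other_doc_count.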

Example 2: size specifies how many groups to return

POST /book1/_search?size=0

{

    "aggs":{

        "age_terms":{

            "terms":{

                "field":"age",

                "size": 5

            }

        }

    }

}

Result 2:

{

    "took": ,

    "timed_out": false,

    "_shards": {

        "total": ,

        "successful": ,

        "skipped": ,

        "failed":

    },

    "hits": {

        "total": ,

        "max_score": ,

        "hits": []

    },

    "aggregations": {

        "age_terms": {

            "doc_count_error_upper_bound": ,

            "sum_other_doc_count": ,

            "buckets": [

                {

                    "key": ,

                    "doc_count":

                },

                {

                    "key": ,

                    "doc_count":

                },

                {

                    "key": ,

                    "doc_count":

                },

                {

                    "key": ,

                    "doc_count":

                },

                {

                    "key": ,

                    "doc_count":

                }

            ]

        }

    }

}

Example 3: show the error bound on each group

POST /book1/_search?size=0

{

    "aggs":{

        "age_terms":{

            "terms":{

                "field":"age",

                "size": 5,

                 "show_term_doc_count_error": true

            }

        }

    }

}

Result 3:

{

    "took": ,

    "timed_out": false,

    "_shards": {

        "total": ,

        "successful": ,

        "skipped": ,

        "failed":

    },

    "hits": {

        "total": ,

        "max_score": ,

        "hits": []

    },

    "aggregations": {

        "age_terms": {

            "doc_count_error_upper_bound": ,

            "sum_other_doc_count": ,

            "buckets": [

                {

                    "key": ,

                    "doc_count": ,

                    "doc_count_error_upper_bound":

                },

                {

                    "key": ,

                    "doc_count": ,

                    "doc_count_error_upper_bound":

                },

                {

                    "key": ,

                    "doc_count": ,

                    "doc_count_error_upper_bound":

                },

                {

                    "key": ,

                    "doc_count": ,

                    "doc_count_error_upper_bound":

                },

                {

                    "key": ,

                    "doc_count": ,

                    "doc_count_error_upper_bound":

                }

            ]

        }

    }

}

Example 4: shard_size specifies how many groups each shard returns

POST /book1/_search?size=0

{

    "aggs":{

        "age_terms":{

            "terms":{

                "field":"age",

                "size":,

                 "shard_size":

            }

        }

    }

}

Result 4:

{

    "took": ,

    "timed_out": false,

    "_shards": {

        "total": ,

        "successful": ,

        "skipped": ,

        "failed":

    },

    "hits": {

        "total": ,

        "max_score": ,

        "hits": []

    },

    "aggregations": {

        "age_terms": {

            "doc_count_error_upper_bound": ,

            "sum_other_doc_count": ,

            "buckets": [

                {

                    "key": ,

                    "doc_count":

                },

                {

                    "key": ,

                    "doc_count":

                },

                {

                    "key": ,

                    "doc_count":

                }

            ]

        }

    }

}

order specifies how the groups are sorted

Example 5: sort by the group value "_key"

POST /book1/_search?size=0

{

    "aggs":{

        "age_terms":{

            "terms":{

                "field":"age",

                "size": 3,

                "order":{"_key":"desc"}

            }

        }

    }

}

Result 5:

{

    "took": ,

    "timed_out": false,

    "_shards": {

        "total": ,

        "successful": ,

        "skipped": ,

        "failed":

    },

    "hits": {

        "total": ,

        "max_score": ,

        "hits": []

    },

    "aggregations": {

        "age_terms": {

            "doc_count_error_upper_bound": ,

            "sum_other_doc_count": ,

            "buckets": [

                {

                    "key": ,

                    "doc_count":

                },

                {

                    "key": ,

                    "doc_count":

                },

                {

                    "key": ,

                    "doc_count":

                }

            ]

        }

    }

}

Example 6: sort by the document count "_count"

POST /book1/_search?size=0

{

    "aggs":{

        "age_terms":{

            "terms":{

                "field":"age",

                "size": 3,

                 "order":{"_count":"desc"}

            }

        }

    }

}

Result 6:

{

    "took": ,

    "timed_out": false,

    "_shards": {

        "total": ,

        "successful": ,

        "skipped": ,

        "failed":

    },

    "hits": {

        "total": ,

        "max_score": ,

        "hits": []

    },

    "aggregations": {

        "age_terms": {

            "doc_count_error_upper_bound": ,

            "sum_other_doc_count": ,

            "buckets": [

                {

                    "key": ,

                    "doc_count":

                },

                {

                    "key": ,

                    "doc_count":

                },

                {

                    "key": ,

                    "doc_count":

                }

            ]

        }

    }

}

Example 7: sort by a metric computed on each group

POST /book1/_search?size=0

{

    "aggs":{

        "age_terms":{

            "terms":{

                "field":"age",

                 "order":{"max_age":"desc"}

            },

            "aggs":{

                "max_age":{

                    "max":{

                        "field":"age"

                    }

                },

                "min_age":{

                    "min":{

                        "field":"age"

                    }

                }

            }

        }

    }

}

Note: first group by age, then compute the max and min of each group, and finally sort the groups by the max in descending order
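The group, compute, then sort sequence just described can be sketched in Python. To make max and min differ within a group, this illustration groups by a hypothetical `class` field instead of age; the function name and the documents are invented.

```python
from collections import defaultdict


def terms_with_metrics(docs, group_field, metric_field):
    """Group docs by `group_field`, compute max/min of `metric_field` per group,
    and order the groups by the max, descending."""
    groups = defaultdict(list)
    for doc in docs:
        groups[doc[group_field]].append(doc[metric_field])
    buckets = [
        {
            "key": key,
            "doc_count": len(values),
            "max_age": max(values),
            "min_age": min(values),
        }
        for key, values in groups.items()
    ]
    buckets.sort(key=lambda b: b["max_age"], reverse=True)
    return buckets


docs = [
    {"class": "a", "age": 12},
    {"class": "a", "age": 30},
    {"class": "b", "age": 54},
    {"class": "c", "age": 1},
]
for bucket in terms_with_metrics(docs, "class", "age"):
    print(bucket)
```

The sub-aggregation names (max_age, min_age) appear inside each bucket, and the order clause refers to one of them by name, just as in the request above.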

Example 8: filter groups by matching their values against regular expressions

POST book1/_search?size=0

{

    "aggs":{

        "tags":{

            "terms":{

                "field":"name",

                "include":"里*",

                "exclude":"test*"

            }

        }

    }

}

Result 8:

{

    "took": ,

    "timed_out": false,

    "_shards": {

        "total": ,

        "successful": ,

        "skipped": ,

        "failed":

    },

    "hits": {

        "total": ,

        "max_score": ,

        "hits": []

    },

    "aggregations": {

        "tags": {

            "doc_count_error_upper_bound": ,

            "sum_other_doc_count": ,

            "buckets": [

                {

                    "key": "里",

                    "doc_count":

                }

            ]

        }

    }

}

Example 9: filter groups with an explicit list of values

POST book1/_search?size=0

{

    "aggs":{

        "Chinese":{

            "terms":{

                "field":"name",

                "include":["里","国"]

            }

        },

        "Test":{

            "terms":{

                "field":"name",

                "exclude":["test","the"]

            }

        }

    }

}

Result 9:

{

    "took": ,

    "timed_out": false,

    "_shards": {

        "total": ,

        "successful": ,

        "skipped": ,

        "failed":

    },

    "hits": {

        "total": ,

        "max_score": ,

        "hits": []

    },

    "aggregations": {

        "Test": {

            "doc_count_error_upper_bound": ,

            "sum_other_doc_count": ,

            "buckets": [

                {

                    "key": "里",

                    "doc_count":

                },

                {

                    "key": "否",

                    "doc_count":

                },

                {

                    "key": "a",

                    "doc_count":

                },

                {

                    "key": "default",

                    "doc_count":

                },

                {

                    "key": "document",

                    "doc_count":

                },

                {

                    "key": "for",

                    "doc_count":

                },

                {

                    "key": "absolute",

                    "doc_count":

                },

                {

                    "key": "account",

                    "doc_count":

                },

                {

                    "key": "accurate",

                    "doc_count":

                },

                {

                    "key": "documents",

                    "doc_count":

                }

            ]

        },

        "Chinese": {

            "doc_count_error_upper_bound": ,

            "sum_other_doc_count": ,

            "buckets": [

                {

                    "key": "国",

                    "doc_count":

                }

            ]

        }

    }

}

Example 10: group by a value computed by a script

POST book1/_search?size=0

{

    "aggs":{

        "name":{

            "terms":{

                "script":{

                    "source":"doc['age'].value + doc.age.value",

                    "lang": "painless"

                }

            }

         }

     }

}

Note: in a script, a field value can be read as doc['age'].value or doc.age.value

Result 10:

{
    "took": 18,
    "timed_out": false,
    "_shards": {
        "total": 5,
        "successful": 5,
        "skipped": 0,
        "failed": 0
    },
    "hits": {
        "total": 41,
        "max_score": 0,
        "hits": []
    },
    "aggregations": {
        "name": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [
                {
                    "key": "24",
                    "doc_count": 16
                },
                {
                    "key": "2",
                    "doc_count": 11
                },
                {
                    "key": "0",
                    "doc_count": 8
                },
                {
                    "key": "22",
                    "doc_count": 1
                },
                {
                    "key": "26",
                    "doc_count": 1
                },
                {
                    "key": "28",
                    "doc_count": 1
                },
                {
                    "key": "32",
                    "doc_count": 1
                },
                {
                    "key": "42",
                    "doc_count": 1
                },
                {
                    "key": "66",
                    "doc_count": 1
                }
            ]
        }
    }
}

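2. filter Aggregation: aggregate over the documents that match a filter query

Example 1: from the documents matched by the query, select those that satisfy the filter and aggregate over them, that is, filter first and then aggregate (as opposed to Example 9 above, group filtering, which aggregates first and then filters)

POST book1/_search?size=0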

{

    "aggs":{

        "age_terms":{

            "filter":{

                "match":{"name":"test"}

            },

            "aggs":{

                "avg_age":{

                    "avg":{"field":"age" }

                }

            }

        }

    }

}

Result 1:

{

    "took": ,

    "timed_out": false,

    "_shards": {

        "total": ,

        "successful": ,

        "skipped": ,

        "failed":

    },

    "hits": {

        "total": ,

        "max_score": ,

        "hits": []

    },

    "aggregations": {

        "age_terms": {

            "doc_count": ,

            "avg_age": {

                "value": 19.9

            }

        }

    }

}
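The "filter first, then aggregate" order can be modelled with a small Python sketch. The whole-word match on name is a simplification standing in for the match query, and the function name and documents are invented for the illustration.

```python
def filter_then_avg(docs, keyword):
    """Keep docs whose name contains `keyword` as a whole word, then average age."""
    matched = [d for d in docs if keyword in d["name"].lower().split()]
    ages = [d["age"] for d in matched]
    return {
        "doc_count": len(matched),
        "avg_age": sum(ages) / len(ages) if ages else None,
    }


docs = [
    {"name": "test goog my money", "age": 10},
    {"name": "another test", "age": 30},
    {"name": "something else", "age": 50},
]
print(filter_then_avg(docs, "test"))
```

As in the result above, the output carries both the number of documents that passed the filter (doc_count) and the metric computed over just those documents.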

3. Filters Aggregation: aggregate over multiple filter groups

Example 1: count the documents containing 'test' and the documents containing '里' separately

POST book1/_search?size=0

{

    "aggs":{

        "age_terms":{

            "filters":{

                "filters":{

                    "test":{

                        "match":{"name":"test"}

                    },

                    "china":{

                        "match":{"name":"里"}

                    }

                }

            }

        }

    }

}

Result:

{

    "took": ,

    "timed_out": false,

    "_shards": {

        "total": ,

        "successful": ,

        "skipped": ,

        "failed":

    },

    "hits": {

        "total": ,

        "max_score": ,

        "hits": []

    },

    "aggregations": {

        "age_terms": {

            "buckets": {

                "china": {

                    "doc_count":

                },

                "test": {

                    "doc_count":

                }

            }

        }

    }

}

For example: count the error and warning entries in a log index, as a basis for log alerting

GET logs/_search

{

  "size": 0,

  "aggs": {

    "messages": {

      "filters": {

        "filters": {

          "errors": {

            "match": {

              "body": "error"

            }

          },

          "warnings": {

            "match": {

              "body": "warning"

            }

          }

        }

      }

    }

  }

}

Example 2: specify a key for the bucket that collects all other documents

POST book1/_search?size=0

{

    "aggs":{

        "age_terms":{

            "filters":{

                "other_bucket_key": "other_messages",

                "filters":{

                    "test":{

                        "match":{"name":"test"}

                    },

                    "china":{

                        "match":{"name":"里"}

                    }

                }

            }

        }

    }

}

Result 2:

{

    "took": ,

    "timed_out": false,

    "_shards": {

        "total": ,

        "successful": ,

        "skipped": ,

        "failed":

    },

    "hits": {

        "total": ,

        "max_score": ,

        "hits": []

    },

    "aggregations": {

        "age_terms": {

            "buckets": {

                "china": {

                    "doc_count":

                },

                "test": {

                    "doc_count":

                },

                "other_messages": {

                    "doc_count":

                }

            }

        }

    }

}
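A filters aggregation with an "other" bucket can be pictured as follows. This Python sketch uses plain substring predicates in place of match queries; the function name `filters_agg` and the log lines are invented, while `other_bucket_key` mirrors the request parameter shown above.

```python
def filters_agg(docs, filters, other_bucket_key=None):
    """Count docs per named filter; a doc may land in several buckets.

    Docs matching no filter are counted under `other_bucket_key` when it is set.
    """
    buckets = {name: {"doc_count": 0} for name in filters}
    other = 0
    for doc in docs:
        matched_any = False
        for name, predicate in filters.items():
            if predicate(doc):
                buckets[name]["doc_count"] += 1
                matched_any = True
        if not matched_any:
            other += 1
    if other_bucket_key is not None:
        buckets[other_bucket_key] = {"doc_count": other}
    return buckets


logs = [
    {"body": "error: disk full"},
    {"body": "warning: slow query"},
    {"body": "error after warning"},
    {"body": "info: started"},
]
filters = {
    "errors": lambda d: "error" in d["body"],
    "warnings": lambda d: "warning" in d["body"],
}
print(filters_agg(logs, filters, other_bucket_key="other_messages"))
```

Because the filters are independent, the bucket counts do not have to add up to the total number of documents: a document can match several filters, or none.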

4. Range Aggregation: group by ranges

Example 1:

POST book1/_search?size=0

{

    "aggs":{

        "age_range":{

            "range":{

                "field":"age",

                "keyed":true,

                "ranges":[

                    {

                        "to":,

                        "key":"TW"

                    },

                    {

                        "from":,

                        "to":,

                        "key":"TH"

                    },

                    {

                        "from":,

                        "key":"SIX"

                    }

                ]

            }

        }

    }

}

Result 1:

{

    "took": ,

    "timed_out": false,

    "_shards": {

        "total": ,

        "successful": ,

        "skipped": ,

        "failed":

    },

    "hits": {

        "total": ,

        "max_score": ,

        "hits": []

    },

    "aggregations": {

        "age_range": {

            "buckets": {

                "TW": {

                    "to": ,

                    "doc_count":

                },

                "TH": {

                    "from": ,

                    "to": ,

                    "doc_count":

                },

                "SIX": {

                    "from": ,

                    "doc_count":

                }

            }

        }

    }

}
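How values are assigned to range buckets can be sketched in Python. In a range aggregation the from bound is inclusive and the to bound is exclusive, which the sketch reproduces; the boundaries 20 and 60, the bucket keys, and the sample ages are made-up values for this illustration, not the ones used in the request above.

```python
def range_agg(values, ranges):
    """Count values per range; "from" is inclusive and "to" is exclusive."""
    buckets = {}
    for r in ranges:
        low = r.get("from", float("-inf"))
        high = r.get("to", float("inf"))
        buckets[r["key"]] = {"doc_count": sum(1 for v in values if low <= v < high)}
    return buckets


# The boundaries 20 and 60 are hypothetical values chosen for this example.
ranges = [
    {"key": "low", "to": 20},
    {"key": "middle", "from": 20, "to": 60},
    {"key": "high", "from": 60},
]
ages = [1, 12, 19, 20, 35, 59, 60, 75]
print(range_agg(ages, ranges))
```

A value that sits exactly on a boundary, such as 20 or 60 here, goes into the bucket whose from equals it, not the one whose to equals it.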

5. Date Range Aggregation: group by date ranges

Example 1:

POST /bank/_search?size=0

{

  "aggs": {

    "range": {

      "date_range": {

        "field": "date",

        "format": "MM-yyy",

        "ranges": [

          {

            "to": "now-10M/M"

          },

          {

            "from": "now-10M/M"

          }

        ]

      }

    }

  }

}

Result 1:

{

  "took": ,

  "timed_out": false,

  "_shards": {

    "total": ,

    "successful": ,

    "skipped": ,

    "failed":

  },

  "hits": {

    "total": ,

    "max_score": ,

    "hits": []

  },

  "aggregations": {

    "range": {

      "buckets": [

        {

          "key": "*-2017-08-01T00:00:00.000Z",

          "to": ,

          "to_as_string": "2017-08-01T00:00:00.000Z",

          "doc_count":

        },

        {

          "key": "2017-08-01T00:00:00.000Z-*",

          "from": ,

          "from_as_string": "2017-08-01T00:00:00.000Z",

          "doc_count":

        }

      ]

    }

  }

}

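6. Date Histogram Aggregation: date histogram (bar chart) aggregation

This aggregates by day, month, year, and so on. You can aggregate by the intervals year (1y), quarter (1q), month (1M), week (1w), day (1d), hour (1h), minute (1m), second (1s), or by a specified time interval.

Example 1:

POST /bank/_search?size=0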

{

  "aggs": {

    "sales_over_time": {

      "date_histogram": {

        "field": "date",

        "interval": "month"

      }

    }

  }

}

Result 1:

{

  "took": ,

  "timed_out": false,

  "_shards": {

    "total": ,

    "successful": ,

    "skipped": ,

    "failed":

  },

  "hits": {

    "total": ,

    "max_score": ,

    "hits": []

  },

  "aggregations": {

    "sales_over_time": {

      "buckets": []

    }

  }

}
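The result above has no buckets, so here is a small Python sketch of what bucketing by month means. It counts dates per calendar month on invented sample data; the function name `month_histogram` and the output keys are assumptions of the sketch, loosely modelled on a date histogram bucket.

```python
from collections import Counter
from datetime import date


def month_histogram(dates):
    """Count dates per calendar month, keyed by the first day of the month."""
    counts = Counter(d.replace(day=1) for d in dates)
    return [
        {"key_as_string": month.isoformat(), "doc_count": counts[month]}
        for month in sorted(counts)
    ]


sales = [date(2017, 8, 3), date(2017, 8, 21), date(2017, 10, 1)]
for bucket in month_histogram(sales):
    print(bucket)
```

Each bucket is identified by the start of its interval. Months without documents are simply omitted in this sketch.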

7. Missing Aggregation: bucket for documents with a missing value

Example: count the documents that have no value for the field

POST /book/_search?size=0

{

    "aggs" : {

        "account_without_age" : {

            "missing" : { "field" : "age" }

        }

    }

}

Result 1:

{

    "took": ,

    "timed_out": false,

    "_shards": {

        "total": ,

        "successful": ,

        "skipped": ,

        "failed":

    },

    "hits": {

        "total": ,

        "max_score": ,

        "hits": []

    },

    "aggregations": {

        "account_without_age": {

            "doc_count":

        }

    }

}

8. Geo Distance Aggregation: group by geographic distance

See the official documentation:

https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-geodistance-aggregation.html
