
Elasticsearch 7: Aggregation Queries


#Elasticsearch


Example 1

Create the index and import data

Create the index:

PUT student
{
  "mappings" : {
    "properties" : {
      "name" : {
        "type" : "keyword"
      },
      "age" : {
        "type" : "integer"
      },
      "height": {
        "type": "integer"
      }
    }
  }
}

Use _bulk to create several documents:

POST _bulk
{ "index" : { "_index" : "student", "_id" : "1" } }
{ "name" : "张三", "age": 12 }
{ "index" : { "_index" : "student", "_id" : "2" } }
{ "name" : "李四", "age": 10,  "height": 112 }
{ "index" : { "_index" : "student", "_id" : "3" } }
{ "name" : "王五", "age": 11, "height": 108 }
{ "index" : { "_index" : "student", "_id" : "4" } }
{ "name" : "陈六", "age": 11, "height": 111 }

There are 4 documents in total; Zhang San (张三) has no height value.

Query all data

# Request
GET student/_search

# Response
... omitted

Total number of students

# Request
POST student/_count

# Response
{
  "count" : 4,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  }
}

Total number of 11-year-old students

Method 1:

# Request
POST student/_count
{
  "query": {
    "bool": {
      "must": [
        {
          "term": {"age": 11}
        }
      ]
    }
  }
}

# Response
{
  "count" : 2,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  }
}

Method 2: run the query, then aggregate the matching documents:

# Request
POST student/_search
{
  "query": {
    "bool": {
      "must": [
        {"term": {"age": 11} }
      ]
    }
  },
  "aggs":{
    "age_count": {
      "terms": {"field": "age"}
    }
  },
  "size": 0
}

# Response
{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "age_count" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : 11,
          "doc_count" : 2
        }
      ]
    }
  }
}

Number of students at each age

Method 1:

# Request
POST student/_search
{
  "aggs":{
    "age_count": {
      "terms": {"field": "age"}
    }
  },
  "size": 0
}

# Response
{
  "took" : 71,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 4,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "age_count" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : 11,
          "doc_count" : 2
        },
        {
          "key" : 10,
          "doc_count" : 1
        },
        {
          "key" : 12,
          "doc_count" : 1
        }
      ]
    }
  }
}

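As the buckets show, there are 2 students aged 11, 1 aged 10, and 1 aged 12.

Note that in the Elasticsearch request body, the keys aggs and aggregations are equivalent.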
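
To make the bucket counts above easier to follow, here is a minimal local sketch in Python of what the terms aggregation computes for this data set. It does not call Elasticsearch; STUDENTS and terms_buckets are hypothetical names introduced for illustration, and the tie-breaking rule (ascending key) is chosen so the order matches the response shown above.

```python
from collections import Counter

# Sample documents mirroring the _bulk request earlier in the article.
STUDENTS = [
    {"name": "张三", "age": 12},
    {"name": "李四", "age": 10, "height": 112},
    {"name": "王五", "age": 11, "height": 108},
    {"name": "陈六", "age": 11, "height": 111},
]


def terms_buckets(docs, field):
    """Group docs by `field` and count them, like the buckets of a terms aggregation."""
    counts = Counter(doc[field] for doc in docs if field in doc)
    # Largest doc_count first; ties are broken by ascending key.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [{"key": key, "doc_count": n} for key, n in ordered]


age_buckets = terms_buckets(STUDENTS, "age")
print(age_buckets)
```

Each distinct value of the field becomes one bucket, and doc_count is simply the number of documents that carry that value.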

Method 2:

# Request
POST student/_search
{
  "size": 0,
  "aggregations": {
    "group_by_age": {
      "aggregations": {
        "count_age": {
          "value_count": {
            "field": "_index"
          }
        }
      },
      "terms": {
        "field": "age"
      }
    }
  }
}

# Response
{
  "took" : 13,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 4,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "group_by_age" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : 11,
          "doc_count" : 2,
          "count_age" : {
            "value" : 2
          }
        },
        {
          "key" : 10,
          "doc_count" : 1,
          "count_age" : {
            "value" : 1
          }
        },
        {
          "key" : 12,
          "doc_count" : 1,
          "count_age" : {
            "value" : 1
          }
        }
      ]
    }
  }
}

Number of students at each height

# Request
POST student/_search
{
  "aggs":{
    "group_by_height": {
      "terms": {"field": "height"}
    }
  },
  "size": 0
}

# Response
{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 4,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "group_by_height" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : 108,
          "doc_count" : 1
        },
        {
          "key" : 111,
          "doc_count" : 1
        },
        {
          "key" : 112,
          "doc_count" : 1
        }
      ]
    }
  }
}

Zhang San (张三), who has no height value, is not counted in any bucket.
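
The following Python sketch models that behavior locally: a document without the field joins no bucket unless a fallback value is supplied. The helper terms_counts and its missing argument are hypothetical; the argument is loosely modeled on the `missing` option of the terms aggregation, and nothing here talks to a cluster.

```python
from collections import Counter

# Sample documents mirroring the _bulk request earlier in the article.
STUDENTS = [
    {"name": "张三", "age": 12},
    {"name": "李四", "age": 10, "height": 112},
    {"name": "王五", "age": 11, "height": 108},
    {"name": "陈六", "age": 11, "height": 111},
]


def terms_counts(docs, field, missing=None):
    """Count docs per value of `field`; docs lacking the field are skipped
    unless a `missing` fallback value is given (hypothetical helper)."""
    counts = Counter()
    for doc in docs:
        value = doc.get(field, missing)
        if value is None:
            continue  # no value and no fallback: the document joins no bucket
        counts[value] += 1
    return dict(sorted(counts.items()))


by_height = terms_counts(STUDENTS, "height")
by_height_with_default = terms_counts(STUDENTS, "height", missing=0)
print(by_height)
print(by_height_with_default)
```

Without a fallback the bucket counts add up to 3 although hits.total is 4; with a fallback of 0 the fourth student appears in a bucket of its own.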

Average, minimum, maximum, and sum of the students' ages

Method 1:

# Request
POST student/_search
{
  "aggs":{
    "age_stat": {
      "stats": {"field": "age"}
    }
  },
  "size": 0
}

# Response
{
  "took" : 45,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 4,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "age_stat" : {
      "count" : 4,
      "min" : 10.0,
      "max" : 12.0,
      "avg" : 11.0,
      "sum" : 44.0
    }
  }
}

The stats aggregation computes count, min, max, avg, and sum for the specified field.
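
As a cross-check of the numbers in the response, the same five values can be reproduced locally. This is a minimal sketch under the assumption that the four sample ages are 12, 10, 11, and 11; the function name stats is illustrative and it is not the Elasticsearch implementation.

```python
# Sample documents mirroring the _bulk request earlier in the article.
STUDENTS = [
    {"name": "张三", "age": 12},
    {"name": "李四", "age": 10, "height": 112},
    {"name": "王五", "age": 11, "height": 108},
    {"name": "陈六", "age": 11, "height": 111},
]


def stats(docs, field):
    """Return count, min, max, avg, and sum of `field` over docs that have it."""
    values = [doc[field] for doc in docs if field in doc]
    if not values:
        return {"count": 0, "min": None, "max": None, "avg": None, "sum": 0.0}
    total = float(sum(values))
    return {
        "count": len(values),
        "min": float(min(values)),
        "max": float(max(values)),
        "avg": total / len(values),
        "sum": total,
    }


age_stat = stats(STUDENTS, "age")
print(age_stat)
```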

Method 2:

# Request
POST student/_search
{
  "aggs":{
    "age_avg": {
      "avg": {"field": "age"}
    },
    "age_sum": {
      "sum": {"field": "age"}
    },
    "age_min": {
      "min": {"field": "age"}
    },
    "age_max": {
      "max": {"field": "age"}
    },
    "age_count": {
      "value_count": {"field": "age"}
    }
  },
  "size": 0
}

# Response
{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 4,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "age_sum" : {
      "value" : 44.0
    },
    "age_min" : {
      "value" : 10.0
    },
    "age_avg" : {
      "value" : 11.0
    },
    "age_count" : {
      "value" : 4
    },
    "age_max" : {
      "value" : 12.0
    }
  }
}


Average height for each age

# Request
POST student/_search
{
  "size": 0,
  "aggregations": {
    "group_by_age": {
      "aggregations": {
        "avg_height": {
          "avg": {
            "field": "height"
          }
        }
      },
      "terms": {
        "field": "age"
      }
    }
  }
}

# Response
{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 4,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "group_by_age" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : 11,
          "doc_count" : 2,
          "avg_height" : {
            "value" : 109.5
          }
        },
        {
          "key" : 10,
          "doc_count" : 1,
          "avg_height" : {
            "value" : 112.0
          }
        },
        {
          "key" : 12,
          "doc_count" : 1,
          "avg_height" : {
            "value" : null
          }
        }
      ]
    }
  }
}
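
The null for age 12 in the response above follows from the data: the only 12-year-old has no height, so there is nothing to average. A minimal local sketch of this nested "group, then average" logic, with avg_by_group as a hypothetical helper name:

```python
from collections import defaultdict

# Sample documents mirroring the _bulk request earlier in the article.
STUDENTS = [
    {"name": "张三", "age": 12},
    {"name": "李四", "age": 10, "height": 112},
    {"name": "王五", "age": 11, "height": 108},
    {"name": "陈六", "age": 11, "height": 111},
]


def avg_by_group(docs, group_field, value_field):
    """Bucket docs by `group_field`, then average `value_field` inside each bucket.
    A bucket whose docs all lack `value_field` gets None (null in the response)."""
    groups = defaultdict(list)
    for doc in docs:
        if group_field not in doc:
            continue
        values = groups[doc[group_field]]  # creates the bucket even if no value is added
        if value_field in doc:
            values.append(doc[value_field])
    return {key: (sum(v) / len(v) if v else None) for key, v in groups.items()}


avg_height = avg_by_group(STUDENTS, "age", "height")
print(avg_height)
```

The bucket for age 12 still exists because bucketing is driven by age alone; only the inner average is empty.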

Average height for each age, sorted by age in ascending order

Method 1:

# Request
POST student/_search
{
  "size": 0,
  "aggregations": { 
    "group_by_age": {
      "aggregations": {
        "avg_height": {
          "avg": {
            "field": "height"
          }
        }
      },
      "terms": {
        "field": "age",
        "order": {
          "_term": "asc"
        }
      }
    }
  }
}

# Response (the response warns that _term is deprecated and _key should be used instead)
#! Deprecation: Deprecated aggregation order key [_term] used, replaced by [_key]
{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 4,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "group_by_age" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : 10,
          "doc_count" : 1,
          "avg_height" : {
            "value" : 112.0
          }
        },
        {
          "key" : 11,
          "doc_count" : 2,
          "avg_height" : {
            "value" : 109.5
          }
        },
        {
          "key" : 12,
          "doc_count" : 1,
          "avg_height" : {
            "value" : null
          }
        }
      ]
    }
  }
}
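
Following the deprecation warning above, the same request can be written with _key in place of _term. The sketch below only builds that request body in Python and prints it in a form that can be pasted into the Kibana console; it does not contact a cluster, and the variable name body is illustrative.

```python
import json

# Same aggregation as Method 1, with the deprecated "_term" order key
# replaced by "_key" as the warning suggests.
body = {
    "size": 0,
    "aggregations": {
        "group_by_age": {
            "terms": {"field": "age", "order": {"_key": "asc"}},
            "aggregations": {
                "avg_height": {"avg": {"field": "height"}},
            },
        }
    },
}

print("POST student/_search")
print(json.dumps(body, indent=2))
```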

Method 2:

POST student/_search
{
  "size": 0,
  "aggregations": { 
    "group_by_age": {
      "aggregations": {
        "avg_height": {
          "avg": {
            "field": "height"
          }
        },
        "bucket_sort_by_avg_height": {
          "bucket_sort": {
            "sort": [
              {"_key": {"order": "asc"}}
            ]
          }
        }
      },
      "terms": {
        "field": "age"
      }
    }
  }
}

Average height for each age, sorted by average height in descending order

# Request

POST student/_search
{
  "size": 0,
  "aggregations": { 
    "group_by_age": {
      "aggregations": {
        "avg_height": {
          "avg": {
            "field": "height"
          }
        },
        "bucket_sort_by_avg_height": {
          "bucket_sort": {
            "sort": [
              {"avg_height": {"order": "desc"}}
            ]
          }
        }
      },
      "terms": {
        "field": "age"
      }
    }
  }
}

# Response
{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 4,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "group_by_age" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : 10,
          "doc_count" : 1,
          "avg_height" : {
            "value" : 112.0
          }
        },
        {
          "key" : 11,
          "doc_count" : 2,
          "avg_height" : {
            "value" : 109.5
          }
        }
      ]
    }
  }
}
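
Note that the response above contains only two buckets: the age-12 bucket, whose avg_height is null, is gone. This looks like the effect of bucket_sort's gap_policy setting, which as far as I know defaults to skip. The Python sketch below models that reading locally; bucket_sort_desc is a hypothetical helper and the input buckets are copied from the earlier unsorted response.

```python
# Buckets as returned by the unsorted "average height for each age" query.
buckets = [
    {"key": 11, "doc_count": 2, "avg_height": 109.5},
    {"key": 10, "doc_count": 1, "avg_height": 112.0},
    {"key": 12, "doc_count": 1, "avg_height": None},
]


def bucket_sort_desc(buckets, metric):
    """Drop buckets whose metric is missing, then sort the rest in descending order.
    Models the assumed 'skip' handling of gaps; not the Elasticsearch implementation."""
    kept = [b for b in buckets if b[metric] is not None]
    return sorted(kept, key=lambda b: b[metric], reverse=True)


sorted_buckets = bucket_sort_desc(buckets, "avg_height")
print([(b["key"], b["avg_height"]) for b in sorted_buckets])
```

If the age-12 bucket must be kept, one option under this assumption is to sort on the client side after fetching the unsorted buckets.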


(End of article)