[es] Inaccurate cardinality aggregation counts
Problem:
The two requests below return different results, although they should agree.
Result 1 (request):
{
  "query": {
    "bool": {
      "must_not": [
        {
          "match_phrase": {
            "reqUA": "Jakarta Commons-HttpClient/3.1"
          }
        },
        {
          "match_phrase": {
            "reqReferer": "http://www.baidu.com/s?wd=www"
          }
        }
      ],
      "must": [
        {
          "range": {
            "reqTime": {
              "gte": "2016-09-25 22:00:00",
              "lte": "2016-09-26 22:00:00"
            }
          }
        },
        {
          "range": {
            "operateBeforeObj.sendTime": {
              "gte": "2016-09-25 22:00:00",
              "lte": "2016-09-26 22:00:00"
            }
          }
        },
        {
          "terms": {
            "productPageCode": [
              "10001",
              "33002"
            ]
          }
        }
      ]
    }
  },
  "from": 0,
  "aggs": {
    "channelTag": {
      "terms": {
        "field": "channelTag",
        "size": 0
      },
      "aggs": {
        "userId": {
          "cardinality": {
            "field": "user.userId"
          }
        }
      }
    }
  },
  "size": 0
}
Result 2 (request):
{
  "query": {
    "bool": {
      "must_not": [
        {
          "match_phrase": {
            "reqUA": "Jakarta Commons-HttpClient/3.1"
          }
        },
        {
          "match_phrase": {
            "reqReferer": "http://www.baidu.com/s?wd=www"
          }
        }
      ],
      "must": [
        {
          "range": {
            "reqTime": {
              "gte": "2016-09-25 22:00:00",
              "lte": "2016-09-26 22:00:00"
            }
          }
        },
        {
          "range": {
            "operateBeforeObj.sendTime": {
              "gte": "2016-09-25 22:00:00",
              "lte": "2016-09-26 22:00:00"
            }
          }
        },
        {
          "terms": {
            "productPageCode": [
              "10001",
              "33002"
            ]
          }
        }
      ]
    }
  },
  "from": 0,
  "aggs": {
    "userId": {
      "cardinality": {
        "field": "user.userId"
      }
    }
  },
  "size": 0
}
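Analysis:
The problem most likely lies in the cardinality aggregation itself, which returns an approximate count rather than an exact one (it is based on the HyperLogLog++ algorithm). It accepts a parameter such as "precision_threshold": 100. The 100 is a threshold set in the request: when the true distinct count is below it, the computed value is expected to be close to exact; when the true count is above it, the computed value is only approximate. The threshold can be customized, at the cost of more memory, and values above 40000 are treated as 40000.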
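To make the "approximate above a certain size" behaviour concrete, here is a minimal sketch of a plain HyperLogLog estimator in Python. It is an illustration only: Elasticsearch uses the more refined HyperLogLog++ variant, and the function name `hll_estimate` and the parameter `p` (number of index bits, so `2**p` registers) are assumptions made for this example, not part of any Elasticsearch API. The register count plays a role loosely comparable to `precision_threshold`: more registers use more memory and give a tighter estimate.

```python
import hashlib
import math


def hll_estimate(values, p=10):
    """Estimate the number of distinct values with a basic HyperLogLog
    that uses 2**p registers (simplified; not Elasticsearch's HyperLogLog++)."""
    m = 1 << p
    registers = [0] * m
    for v in values:
        # Derive a 64-bit hash from the value's string form.
        h = int.from_bytes(hashlib.sha1(str(v).encode()).digest()[:8], "big")
        idx = h >> (64 - p)                    # first p bits select a register
        rest = h & ((1 << (64 - p)) - 1)       # remaining 64 - p bits
        # Rank = 1-based position of the leftmost 1-bit in the remaining bits.
        rank = (64 - p) - rest.bit_length() + 1
        registers[idx] = max(registers[idx], rank)

    alpha = 0.7213 / (1 + 1.079 / m)           # bias constant, valid for m >= 128
    raw = alpha * m * m / sum(2.0 ** -r for r in registers)
    zeros = registers.count(0)
    if raw <= 2.5 * m and zeros:
        # Small cardinalities: fall back to linear counting, which is near exact.
        return m * math.log(m / zeros)
    return raw


if __name__ == "__main__":
    # 50,000 distinct ids, each seen twice.
    ids = [i % 50000 for i in range(100000)]
    for p in (10, 12):
        print(f"registers={1 << p:5d}  estimate={hll_estimate(ids, p):.0f}")
    print(f"small set of 50 ids: estimate={hll_estimate(range(50)):.1f}")
```

Small sets land very close to the true count, while large sets come back with an error of a few percent that shrinks as the number of registers grows. Two requests that run the same cardinality aggregation at different effective precisions can therefore report different numbers for the same data.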
Solution:
{
  "aggs": {
    "author_count": {
      "cardinality": {
        "field": "author_hash",
        "precision_threshold": 100
      }
    }
  }
}
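The snippet above is a generic example. Applied to the requests in this note, the same parameter goes on the `user.userId` cardinality aggregation. A minimal Python sketch of one way to add it to an existing request body follows. The helper name `set_precision_threshold` is hypothetical, the `query` clause is omitted for brevity, and 40000 is used only because it is the largest value that has any effect. A smaller value may be enough and costs less memory per bucket.

```python
import copy
import json


def set_precision_threshold(body, threshold):
    """Return a copy of a request body in which every cardinality
    aggregation carries the given precision_threshold."""
    result = copy.deepcopy(body)  # leave the caller's dict untouched

    def visit(aggs):
        for definition in aggs.values():
            if "cardinality" in definition:
                definition["cardinality"]["precision_threshold"] = threshold
            # Sub-aggregations may be keyed as "aggs" or "aggregations".
            for key in ("aggs", "aggregations"):
                if key in definition:
                    visit(definition[key])

    for key in ("aggs", "aggregations"):
        if key in result:
            visit(result[key])
    return result


# Aggregation part of the first request in this note (query clause omitted).
body = {
    "size": 0,
    "aggs": {
        "channelTag": {
            "terms": {"field": "channelTag", "size": 0},
            "aggs": {"userId": {"cardinality": {"field": "user.userId"}}},
        }
    },
}

patched = set_precision_threshold(body, 40000)
print(json.dumps(patched, indent=2))
```

Running both requests with the same explicit threshold makes the two cardinality aggregations comparable. If the true number of distinct users exceeds the threshold, the results remain estimates.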