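Implementing E-commerce Search with Elasticsearch
I recently set out to build a search feature like the one in mobile e-commerce apps, and Elasticsearch was the natural choice. My first reference was the Meituan app: given a category, user preferences, and a keyword, it filters by distance and returns the shops that match, along with the matching products under each shop (3 of them), similar to the screenshot below.
I am new to Elasticsearch, so I started by reading the documentation. Elasticsearch offers two document structures that could meet this requirement: nested and parent/child. With nested, a shop and its products are treated as a single document, so when a shop has many products, or whenever a product changes, the whole document has to be updated. In addition, after filtering on the child set, as far as I could tell there is no way to return only the first 3 children, so I set nested aside for now. Parent/child is called join as of 6.x, and its performance is reportedly not great. Elasticsearch belongs to the NoSQL family, where data should be kept as flat as possible, and the common practice found online is to build a denormalized wide table with redundant fields.
That is the approach I took: part of the shop information (shop name, icon, monthly sales volume, and coordinates) is copied redundantly into each product document. The figure below shows the mapping of my product index.
Looking at the product documents, the requirement is to collapse the results by shop. For more on field collapsing, see: https://elasticsearch.cn/article/132
Now search for all products whose keyword matches 数码 (digital) and whose goodTypeId is 2, displayed by shop: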
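A minimal sketch of this redundancy in plain Java, under stated assumptions: the names `ProductDocBuilder`, `Shop`, and `toDoc` are hypothetical, `shopIcon` is an assumed field name for the shop icon, and the remaining field names are taken from the request and response in this post. It only shows how shop fields might be copied onto each product document before indexing, not the actual indexing code.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration of wide-table redundancy:
// selected shop fields are copied onto every product document.
final class ProductDocBuilder {

    static final class Shop {
        final String shopId;
        final String shopName;
        final String shopIcon;          // assumed field name, not from the post
        final Integer monthSalesVolume;
        final String coordinate;        // "lat,lon" string form of a geo_point

        Shop(String shopId, String shopName, String shopIcon,
             Integer monthSalesVolume, String coordinate) {
            this.shopId = shopId;
            this.shopName = shopName;
            this.shopIcon = shopIcon;
            this.monthSalesVolume = monthSalesVolume;
            this.coordinate = coordinate;
        }
    }

    // Builds one flat product document that carries its shop's fields.
    static Map<String, Object> toDoc(Shop shop, long goodsId, String goodName, long price) {
        Map<String, Object> doc = new LinkedHashMap<>();
        // Product's own fields
        doc.put("goodsId", goodsId);
        doc.put("goodName", goodName);
        doc.put("price", price);
        // Redundant shop fields, duplicated on every product of the shop
        doc.put("shopId", shop.shopId);
        doc.put("shopName", shop.shopName);
        doc.put("shopIcon", shop.shopIcon);
        doc.put("monthSalesVolume", shop.monthSalesVolume);
        doc.put("coordinate", shop.coordinate);
        return doc;
    }

    public static void main(String[] args) {
        Shop shop = new Shop("2", "华为荣耀官方旗舰店", null, null, "59.929986,106.395645");
        System.out.println(toDoc(shop, 3L, "荣耀V20", 1800L));
    }
}
```

The trade-off of this layout is that a change to a shop's name or coordinates has to be written to every product document of that shop, in exchange for queries that never need a join.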
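To illustrate what collapsing by shop means, here is a small in-memory simulation in plain Java. It is only a sketch of the semantics (one group per `shopId`, groups ordered by distance, each group keeping its top products by price), not how Elasticsearch implements it. The class `CollapseSketch`, the `Hit` type, and the second shop and its products in `main` are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory imitation of "collapse on shopId with inner_hits".
final class CollapseSketch {

    static final class Hit {
        final String shopId;
        final String goodName;
        final double distanceKm;
        final long price;

        Hit(String shopId, String goodName, double distanceKm, long price) {
            this.shopId = shopId;
            this.goodName = goodName;
            this.distanceKm = distanceKm;
            this.price = price;
        }
    }

    // Returns one entry per shop, nearest shop first; each entry holds at most
    // innerSize products of that shop, most expensive first.
    static Map<String, List<Hit>> collapseByShop(List<Hit> hits, int innerSize) {
        List<Hit> byDistance = new ArrayList<>(hits);
        byDistance.sort(Comparator.comparingDouble((Hit h) -> h.distanceKm));

        // LinkedHashMap keeps the order in which shops are first seen.
        Map<String, List<Hit>> groups = new LinkedHashMap<>();
        for (Hit h : byDistance) {
            groups.computeIfAbsent(h.shopId, k -> new ArrayList<>()).add(h);
        }
        for (Map.Entry<String, List<Hit>> e : groups.entrySet()) {
            List<Hit> inner = e.getValue();
            inner.sort(Comparator.comparingLong((Hit h) -> h.price).reversed());
            e.setValue(new ArrayList<>(inner.subList(0, Math.min(innerSize, inner.size()))));
        }
        return groups;
    }

    public static void main(String[] args) {
        // Sample data: only "荣耀V20" comes from this post, the rest is invented.
        List<Hit> hits = Arrays.asList(
            new Hit("5", "demo-product-a", 3.2, 50),
            new Hit("2", "demo-product-b", 0.0, 999),
            new Hit("2", "荣耀V20", 0.0, 1800));
        for (Map.Entry<String, List<Hit>> e : collapseByShop(hits, 1).entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue().get(0).goodName);
        }
    }
}
```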
GET product/_search
{
  "from": 0,
  "size": 10,
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "goodsLabels": {
              "query": "数码",
              "operator": "OR",
              "prefix_length": 0,
              "max_expansions": 50,
              "fuzzy_transpositions": true,
              "lenient": false,
              "zero_terms_query": "NONE",
              "auto_generate_synonyms_phrase_query": true,
              "boost": 1.0
            }
          }
        },
        {
          "term": {
            "goodTypeId": {
              "value": 2,
              "boost": 1.0
            }
          }
        },
        {
          "geo_distance": {
            "coordinate": [106.395645, 59.929986],
            "distance": 15000.0,
            "distance_type": "arc",
            "validation_method": "STRICT",
            "ignore_unmapped": false,
            "boost": 1.0
          }
        }
      ],
      "adjust_pure_negative": true,
      "boost": 1.0
    }
  },
  "_source": {
    "includes": ["shopId", "merchantId", "monthSalesVolume", "shopName", "coordinate"],
    "excludes": []
  },
  "script_fields": {
    "distance": {
      "script": {
        "source": "doc['coordinate'].arcDistance(params.lat,params.lon)",
        "lang": "painless",
        "params": {
          "lon": 106.395645,
          "lat": 59.929986
        }
      },
      "ignore_failure": false
    }
  },
  "sort": [
    {
      "_geo_distance": {
        "coordinate": [
          {
            "lat": 59.929986,
            "lon": 106.395645
          }
        ],
        "unit": "km",
        "distance_type": "arc",
        "order": "asc",
        "validation_method": "STRICT",
        "ignore_unmapped": false
      }
    }
  ],
  "collapse": {
    "field": "shopId",
    "inner_hits": {
      "name": "top_price",
      "ignore_unmapped": true,
      "from": 0,
      "size": 1,
      "version": false,
      "explain": false,
      "track_scores": true,
      "_source": {
        "includes": ["goodsId", "goodName", "price", "discountedPrice", "goodUrlThumb"],
        "excludes": []
      },
      "sort": [
        {
          "price": {
            "order": "desc"
          }
        }
      ]
    }
  }
}
That is the JSON body of the ES request; it should not be hard to follow if you read through it.
The response is:
{
  "took" : 10,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 2,
    "max_score" : null,
    "hits" : [
      {
        "_index" : "product",
        "_type" : "goods",
        "_id" : "3",
        "_score" : null,
        "_source" : {
          "coordinate" : "59.929986,106.395645",
          "merchantId" : null,
          "monthSalesVolume" : null,
          "shopName" : "华为荣耀官方旗舰店",
          "shopId" : "2"
        },
        "fields" : {
          "distance" : [
            0.0
          ],
          "shopId" : [
            "2"
          ]
        },
        "sort" : [
          0.0
        ],
        "inner_hits" : {
          "top_price" : {
            "hits" : {
              "total" : 2,
              "max_score" : 2.3884578,
              "hits" : [
                {
                  "_index" : "product",
                  "_type" : "goods",
                  "_id" : "3",
                  "_score" : 2.3296995,
                  "_source" : {
                    "goodName" : "荣耀V20",
                    "discountedPrice" : null,
                    "goodsId" : 3,
                    "price" : 1800,
                    "goodUrlThumb" : null
                  },
                  "sort" : [
                    1800
                  ]
                }
              ]
            }
          }
        }
      }
    ]
  }
}
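Here is the Java code. The hard part along the way was that the ES API is difficult to map onto the spring-data-elasticsearch API:
Here is what the request looks like in Swagger:
The main feature used is field collapsing; beyond that, the data model still deserves more thought. The query above also uses Elasticsearch's script_fields to compute the distance between the supplied coordinates and the shop's coordinates, which is the distance field in the response.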
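For a rough idea of the number that the distance field carries: `arcDistance` returns a great-circle distance in meters, while the `_geo_distance` sort above reports its values in km. The sketch below is a plain-Java haversine approximation, assuming a spherical Earth with a mean radius of 6,371,008.8 m; the class name `GeoDistanceSketch` is hypothetical, and Elasticsearch's own computation may differ slightly in its numerics.

```java
// Hypothetical stand-alone approximation of a great-circle distance in meters.
final class GeoDistanceSketch {

    // Assumed mean Earth radius in meters (spherical model).
    private static final double EARTH_RADIUS_M = 6_371_008.8;

    // Haversine distance between two points given in degrees.
    static double haversineMeters(double lat1, double lon1, double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double sinLat = Math.sin(dLat / 2);
        double sinLon = Math.sin(dLon / 2);
        double a = sinLat * sinLat
                + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2)) * sinLon * sinLon;
        // Clamp to 1.0 to guard against rounding just above the asin domain.
        return 2 * EARTH_RADIUS_M * Math.asin(Math.min(1.0, Math.sqrt(a)));
    }

    public static void main(String[] args) {
        // Query point and shop coordinate from this post are identical.
        double meters = haversineMeters(59.929986, 106.395645, 59.929986, 106.395645);
        System.out.println(meters);
    }
}
```

With the query point equal to the shop's coordinate, as in the sample data, the distance is zero, which agrees with the 0.0 shown in the response.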