ES Performance Testing with esrally

ES storage structure:

Let's look at how ES stores a single document to understand its structure:

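_index is roughly equivalent to a database in MySQL.

_type is roughly equivalent to a table name.

_id is like the primary key of a row; it uniquely identifies the document.

_version is the version number of the document.

_source holds the document's data, analogous to the contents of a row; ES stores it as JSON.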
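
To make the mapping above concrete, here is a minimal sketch in Python. The response body is a hand-written sample shaped like what ES returns when a single document is fetched; the index name, type name, and field values are illustrative assumptions, not output captured from a real cluster.

```python
import json

# Hand-written sample of a single-document response (illustrative values only).
sample_response = """
{
  "_index": "megacorp",
  "_type": "employee",
  "_id": "5",
  "_version": 1,
  "found": true,
  "_source": {"first_name": "John", "last_name": "Smith", "age": 25}
}
"""

# Rough MySQL analogue for each metadata field, as described in the text.
MYSQL_ANALOGY = {
    "_index": "database",
    "_type": "table",
    "_id": "primary key",
    "_version": "row version number",
    "_source": "row data (stored as JSON)",
}


def describe(doc):
    """Return (field, value, analogue) tuples for the metadata fields present."""
    return [(field, doc[field], analogue)
            for field, analogue in MYSQL_ANALOGY.items() if field in doc]


doc = json.loads(sample_response)
for field, value, analogue in describe(doc):
    print(f"{field:9} ~ {analogue:28} value: {value!r}")
```

The analogy is only approximate, but it is a convenient way to read a response at a glance.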

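I. Environment setup
esrally has the following software requirements:

Python 3.4+ and pip3
JDK 8
git 1.9+
Run pip3 install esrally and wait a few minutes for the installation to finish. If the output does not ask you to modify the PATH variable, the installation succeeded.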

Run esrally -h or esrally --version to confirm that the installation completed.
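
As a rough complement to the manual check above, the sketch below looks for the required tools on PATH. It only tests for presence; it does not verify the JDK or git version, and the tool names (`pip3`, `java`, `git`, `esrally`) are assumed to match how they are installed on your system.

```python
import shutil
import sys

# Tools named in the requirements list, plus esrally itself once installed.
REQUIRED_TOOLS = ["pip3", "java", "git", "esrally"]


def check_environment(tools=REQUIRED_TOOLS):
    """Map each tool name to its location on PATH, or None if it is missing."""
    return {tool: shutil.which(tool) for tool in tools}


def python_ok(minimum=(3, 4)):
    """True if the running interpreter meets the stated minimum version."""
    return sys.version_info[:2] >= minimum


print("Python version OK:", python_ok())
for tool, path in check_environment().items():
    status = "found at " + path if path else "NOT FOUND on PATH"
    print(f"{tool:8} {status}")
```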
Run a benchmark:

esrally --distribution-version=5.0.0

The results are as follows:

| Metric | Task | Value | Unit |
|---|---|---|---|

| Cumulative indexing time of primary shards | | 37.8926 | min |

| Min cumulative indexing time across primary shards | | 7.16975 | min |

| Median cumulative indexing time across primary shards | | 7.52045 | min |

| Max cumulative indexing time across primary shards | | 7.92857 | min |

| Cumulative indexing throttle time of primary shards | | 0.268283 | min |

| Min cumulative indexing throttle time across primary shards | | 0.00691667 | min |

| Median cumulative indexing throttle time across primary shards | | 0.05425 | min |

| Max cumulative indexing throttle time across primary shards | | 0.0968333 | min |

| Cumulative merge time of primary shards | | 11.8935 | min |

| Cumulative merge count of primary shards | | 162 | |

| Min cumulative merge time across primary shards | | 2.14578 | min |

| Median cumulative merge time across primary shards | | 2.37747 | min |

| Max cumulative merge time across primary shards | | 2.59807 | min |

| Cumulative merge throttle time of primary shards | | 1.37435 | min |

| Min cumulative merge throttle time across primary shards | | 0.166783 | min |

| Median cumulative merge throttle time across primary shards | | 0.207617 | min |

| Max cumulative merge throttle time across primary shards | | 0.412633 | min |

| Cumulative refresh time of primary shards | | 2.76212 | min |

| Cumulative refresh count of primary shards | | 202 | |

| Min cumulative refresh time across primary shards | | 0.51945 | min |

| Median cumulative refresh time across primary shards | | 0.554767 | min |

| Max cumulative refresh time across primary shards | | 0.583167 | min |

| Cumulative flush time of primary shards | | 0.2689 | min |

| Cumulative flush count of primary shards | | 10 | |

| Min cumulative flush time across primary shards | | 0.04745 | min |

| Median cumulative flush time across primary shards | | 0.056 | min |

| Max cumulative flush time across primary shards | | 0.0588 | min |

| Total Young Gen GC | | 154.5 | s |

| Total Old Gen GC | | 128.553 | s |

| Store size | | 3.28761 | GB |

| Translog size | | 2.00234e-07 | GB |

| Index size | | 3.28761 | GB |

| Total written | | 23.0033 | GB |

| Heap used for segments | | 18.6395 | MB |

| Heap used for doc values | | 0.129642 | MB |

| Heap used for terms | | 17.0492 | MB |

| Heap used for norms | | 0.072998 | MB |

| Heap used for points | | 0.577609 | MB |

| Heap used for stored fields | | 0.810059 | MB |

| Segment count | | 95 | |

| Min Throughput | index-append | 55983.6 | docs/s |

| Median Throughput | index-append | 56580.3 | docs/s |

| Max Throughput | index-append | 57170 | docs/s |

| 50th percentile latency | index-append | 578.132 | ms |

| 90th percentile latency | index-append | 1140.02 | ms |

| 99th percentile latency | index-append | 1537.41 | ms |

| 100th percentile latency | index-append | 2007.21 | ms |

| 50th percentile service time | index-append | 578.132 | ms |

| 90th percentile service time | index-append | 1140.02 | ms |

| 99th percentile service time | index-append | 1537.41 | ms |

| 100th percentile service time | index-append | 2007.21 | ms |

| error rate | index-append | 0 | % |

| Min Throughput | index-stats | 90.03 | ops/s |

| Median Throughput | index-stats | 90.06 | ops/s |

| Max Throughput | index-stats | 90.08 | ops/s |

| 50th percentile latency | index-stats | 5.53262 | ms |

| 90th percentile latency | index-stats | 6.77283 | ms |

| 99th percentile latency | index-stats | 183.032 | ms |

| 99.9th percentile latency | index-stats | 272.737 | ms |

| 100th percentile latency | index-stats | 282.355 | ms |

| 50th percentile service time | index-stats | 2.96778 | ms |

| 90th percentile service time | index-stats | 3.03523 | ms |

| 99th percentile service time | index-stats | 4.33188 | ms |

| 99.9th percentile service time | index-stats | 125.705 | ms |

| 100th percentile service time | index-stats | 233.488 | ms |

| error rate | index-stats | 0 | % |

| Min Throughput | node-stats | 90.04 | ops/s |

| Median Throughput | node-stats | 90.08 | ops/s |

| Max Throughput | node-stats | 90.24 | ops/s |

| 50th percentile latency | node-stats | 5.73745 | ms |

| 90th percentile latency | node-stats | 6.76601 | ms |

| 99th percentile latency | node-stats | 287.349 | ms |

| 99.9th percentile latency | node-stats | 377.557 | ms |

| 100th percentile latency | node-stats | 386.949 | ms |

| 50th percentile service time | node-stats | 3.00129 | ms |

| 90th percentile service time | node-stats | 3.15221 | ms |

| 99th percentile service time | node-stats | 4.13536 | ms |

| 99.9th percentile service time | node-stats | 192.916 | ms |

| 100th percentile service time | node-stats | 213.165 | ms |

| error rate | node-stats | 0 | % |

| Min Throughput | default | 50.01 | ops/s |

| Median Throughput | default | 50.03 | ops/s |

| Max Throughput | default | 50.05 | ops/s |

| 50th percentile latency | default | 10.4133 | ms |

| 90th percentile latency | default | 11.9695 | ms |

| 99th percentile latency | default | 13.3122 | ms |

| 99.9th percentile latency | default | 17.1096 | ms |

| 100th percentile latency | default | 24.5131 | ms |

| 50th percentile service time | default | 5.81862 | ms |

| 90th percentile service time | default | 6.15389 | ms |

| 99th percentile service time | default | 7.93529 | ms |

| 99.9th percentile service time | default | 15.8226 | ms |

| 100th percentile service time | default | 17.0188 | ms |

| error rate | default | 0 | % |

| Min Throughput | term | 199.93 | ops/s |

| Median Throughput | term | 200.11 | ops/s |

| Max Throughput | term | 200.15 | ops/s |

| 50th percentile latency | term | 2.78975 | ms |

| 90th percentile latency | term | 2.88009 | ms |

| 99th percentile latency | term | 3.37198 | ms |

| 99.9th percentile latency | term | 5.07478 | ms |

| 100th percentile latency | term | 6.85981 | ms |

| 50th percentile service time | term | 1.65418 | ms |

| 90th percentile service time | term | 1.69345 | ms |

| 99th percentile service time | term | 1.7669 | ms |

| 99.9th percentile service time | term | 2.84947 | ms |

| 100th percentile service time | term | 3.9311 | ms |

| error rate | term | 0 | % |

| Min Throughput | phrase | 199.97 | ops/s |

| Median Throughput | phrase | 200.09 | ops/s |

| Max Throughput | phrase | 200.18 | ops/s |

| 50th percentile latency | phrase | 3.11549 | ms |

| 90th percentile latency | phrase | 3.23707 | ms |

| 99th percentile latency | phrase | 4.12888 | ms |

| 99.9th percentile latency | phrase | 5.50829 | ms |

| 100th percentile latency | phrase | 5.8348 | ms |

| 50th percentile service time | phrase | 2.13505 | ms |

| 90th percentile service time | phrase | 2.1709 | ms |

| 99th percentile service time | phrase | 2.30986 | ms |

| 99.9th percentile service time | phrase | 5.00575 | ms |

| 100th percentile service time | phrase | 5.15111 | ms |

| error rate | phrase | 0 | % |

| Min Throughput | country_agg_uncached | 4.01 | ops/s |

| Median Throughput | country_agg_uncached | 4.01 | ops/s |

| Max Throughput | country_agg_uncached | 4.01 | ops/s |

| 50th percentile latency | country_agg_uncached | 117.212 | ms |

| 90th percentile latency | country_agg_uncached | 120.426 | ms |

| 99th percentile latency | country_agg_uncached | 124.483 | ms |

| 100th percentile latency | country_agg_uncached | 129.454 | ms |

| 50th percentile service time | country_agg_uncached | 109.475 | ms |

| 90th percentile service time | country_agg_uncached | 112.182 | ms |

| 99th percentile service time | country_agg_uncached | 115.064 | ms |

| 100th percentile service time | country_agg_uncached | 119.394 | ms |

| error rate | country_agg_uncached | 0 | % |

| Min Throughput | country_agg_cached | 100.04 | ops/s |

| Median Throughput | country_agg_cached | 100.06 | ops/s |

| Max Throughput | country_agg_cached | 100.09 | ops/s |

| 50th percentile latency | country_agg_cached | 4.63347 | ms |

| 90th percentile latency | country_agg_cached | 5.14059 | ms |

| 99th percentile latency | country_agg_cached | 6.25025 | ms |

| 99.9th percentile latency | country_agg_cached | 10.0302 | ms |

| 100th percentile latency | country_agg_cached | 11.9625 | ms |

| 50th percentile service time | country_agg_cached | 2.01421 | ms |

| 90th percentile service time | country_agg_cached | 2.10823 | ms |

| 99th percentile service time | country_agg_cached | 2.17912 | ms |

| 99.9th percentile service time | country_agg_cached | 3.99838 | ms |

| 100th percentile service time | country_agg_cached | 7.81819 | ms |

| error rate | country_agg_cached | 0 | % |

| Min Throughput | scroll | 20.06 | pages/s |

| Median Throughput | scroll | 20.07 | pages/s |

| Max Throughput | scroll | 20.08 | pages/s |

| 50th percentile latency | scroll | 219.574 | ms |

| 90th percentile latency | scroll | 234.308 | ms |

| 99th percentile latency | scroll | 255.818 | ms |

| 100th percentile latency | scroll | 257.276 | ms |

| 50th percentile service time | scroll | 215.828 | ms |

| 90th percentile service time | scroll | 229.639 | ms |

| 99th percentile service time | scroll | 255.259 | ms |

| 100th percentile service time | scroll | 255.775 | ms |

| error rate | scroll | 0 | % |

| Min Throughput | expression | 2 | ops/s |

| Median Throughput | expression | 2 | ops/s |

| Max Throughput | expression | 2.01 | ops/s |

| 50th percentile latency | expression | 201.947 | ms |

| 90th percentile latency | expression | 209.363 | ms |

| 99th percentile latency | expression | 223.856 | ms |

| 100th percentile latency | expression | 226.376 | ms |

| 50th percentile service time | expression | 196.148 | ms |

| 90th percentile service time | expression | 203.086 | ms |

| 99th percentile service time | expression | 217.498 | ms |

| 100th percentile service time | expression | 219.35 | ms |

| error rate | expression | 0 | % |

| Min Throughput | painless_static | 1.5 | ops/s |

| Median Throughput | painless_static | 1.5 | ops/s |

| Max Throughput | painless_static | 1.5 | ops/s |

| 50th percentile latency | painless_static | 286.95 | ms |

| 90th percentile latency | painless_static | 293.601 | ms |

| 99th percentile latency | painless_static | 309.749 | ms |

| 100th percentile latency | painless_static | 332.214 | ms |

| 50th percentile service time | painless_static | 279.215 | ms |

| 90th percentile service time | painless_static | 286.218 | ms |

| 99th percentile service time | painless_static | 304.195 | ms |

| 100th percentile service time | painless_static | 323.788 | ms |

| error rate | painless_static | 0 | % |

| Min Throughput | painless_dynamic | 1.5 | ops/s |

| Median Throughput | painless_dynamic | 1.5 | ops/s |

| Max Throughput | painless_dynamic | 1.5 | ops/s |

| 50th percentile latency | painless_dynamic | 274.605 | ms |

| 90th percentile latency | painless_dynamic | 281.152 | ms |

| 99th percentile latency | painless_dynamic | 288.11 | ms |

| 100th percentile latency | painless_dynamic | 289.746 | ms |

| 50th percentile service time | painless_dynamic | 267.396 | ms |

| 90th percentile service time | painless_dynamic | 275.03 | ms |

| 99th percentile service time | painless_dynamic | 280.448 | ms |

| 100th percentile service time | painless_dynamic | 281.075 | ms |

| error rate | painless_dynamic | 0 | % |

| Min Throughput | large_terms | 0.31 | ops/s |

| Median Throughput | large_terms | 0.31 | ops/s |

| Max Throughput | large_terms | 0.32 | ops/s |

| 50th percentile latency | large_terms | 629055 | ms |

| 90th percentile latency | large_terms | 728902 | ms |

| 99th percentile latency | large_terms | 751172 | ms |

| 100th percentile latency | large_terms | 753360 | ms |

| 50th percentile service time | large_terms | 3234.67 | ms |

| 90th percentile service time | large_terms | 3525.99 | ms |

| 99th percentile service time | large_terms | 3677.94 | ms |

| 100th percentile service time | large_terms | 4029.56 | ms |

| error rate | large_terms | 0 | % |

| Min Throughput | large_filtered_terms | 1.5 | ops/s |

| Median Throughput | large_filtered_terms | 1.5 | ops/s |

| Max Throughput | large_filtered_terms | 1.5 | ops/s |

| 50th percentile latency | large_filtered_terms | 300.138 | ms |

| 90th percentile latency | large_filtered_terms | 306.847 | ms |

| 99th percentile latency | large_filtered_terms | 313.358 | ms |

| 100th percentile latency | large_filtered_terms | 313.884 | ms |

| 50th percentile service time | large_filtered_terms | 295 | ms |

| 90th percentile service time | large_filtered_terms | 300.679 | ms |

| 99th percentile service time | large_filtered_terms | 304.527 | ms |

| 100th percentile service time | large_filtered_terms | 306.65 | ms |

| error rate | large_filtered_terms | 0 | % |

| Min Throughput | large_prohibited_terms | 1.5 | ops/s |

| Median Throughput | large_prohibited_terms | 1.5 | ops/s |

| Max Throughput | large_prohibited_terms | 1.5 | ops/s |

| 50th percentile latency | large_prohibited_terms | 287.037 | ms |

| 90th percentile latency | large_prohibited_terms | 294.421 | ms |

| 99th percentile latency | large_prohibited_terms | 298.977 | ms |

| 100th percentile latency | large_prohibited_terms | 299.136 | ms |

| 50th percentile service time | large_prohibited_terms | 279.832 | ms |

| 90th percentile service time | large_prohibited_terms | 287.316 | ms |

| 99th percentile service time | large_prohibited_terms | 290.093 | ms |

| 100th percentile service time | large_prohibited_terms | 291.777 | ms |

| error rate | large_prohibited_terms | 0 | % |

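The metrics usually worth watching are:

throughput: the throughput of each operation, such as index or search

latency: the response time of each operation

Heap used for x: the heap memory used by each segment component (doc values, terms, norms, and so on)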
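
Picking these rows out of a long report by eye is tedious, so here is a minimal sketch that filters them. It assumes the report has been saved as text in the pipe-delimited layout shown above; `parse_report` and `key_metrics` are hypothetical helper names, not part of esrally.

```python
def parse_report(text):
    """Parse pipe-delimited report rows into dicts, skipping header and rule lines."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue
        # Splitting on "|" leaves an empty string at each end; drop both.
        cells = [c.strip() for c in line.split("|")[1:-1]]
        if len(cells) != 4 or cells[0] == "Metric" or set(cells[0]) <= set("-:"):
            continue
        metric, task, value, unit = cells
        try:
            value = float(value)
        except ValueError:
            continue  # not a numeric row
        rows.append({"metric": metric, "task": task, "value": value, "unit": unit})
    return rows


def key_metrics(rows):
    """Keep only the throughput, latency, and heap rows."""
    wanted = ("Throughput", "latency", "Heap used for")
    return [r for r in rows if any(w in r["metric"] for w in wanted)]


# A few rows copied from the report above.
sample = """
| Metric | Task | Value | Unit |
|---|---|---|---|
| Heap used for terms | | 17.0492 | MB |
| Segment count | | 95 | |
| Median Throughput | index-append | 56580.3 | docs/s |
| 99th percentile latency | index-append | 1537.41 | ms |
| 99th percentile service time | index-append | 1537.41 | ms |
"""

for row in key_metrics(parse_report(sample)):
    print(row["task"] or "(all)", row["metric"], row["value"], row["unit"])
```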

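esrally race --distribution-version=7.0.1 --pipeline=benchmark-only --target-hosts=172.18.3.94:9200,172.18.3.179:9200,172.18.3.177:9200 --track=nyc_taxis --challenge=append-no-conflicts

This command runs one complete benchmark. --pipeline=benchmark-only tells esrally to benchmark a cluster you have set up yourself, --target-hosts lists the hosts and ports of that cluster, --track=nyc_taxis selects the nyc_taxis data set, and --challenge=append-no-conflicts selects the append-no-conflicts strategy.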
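
When the same benchmark is run against several clusters, it can help to assemble the command line from parameters. The following is a sketch only: `build_race_command` is a hypothetical helper, and it simply strings together the flags discussed above without invoking esrally.

```python
def build_race_command(target_hosts, track, challenge,
                       pipeline="benchmark-only", distribution_version=None):
    """Assemble the argument list for an `esrally race` invocation."""
    args = ["esrally", "race"]
    if distribution_version:
        args.append(f"--distribution-version={distribution_version}")
    args += [
        f"--pipeline={pipeline}",
        f"--target-hosts={','.join(target_hosts)}",
        f"--track={track}",
        f"--challenge={challenge}",
    ]
    return args


hosts = ["172.18.3.94:9200", "172.18.3.179:9200", "172.18.3.177:9200"]
command = build_race_command(hosts, "nyc_taxis", "append-no-conflicts",
                             distribution_version="7.0.1")
print(" ".join(command))
# To run it, pass the list to subprocess.run(command, check=True).
```

Returning a list rather than a single string avoids shell quoting problems when the command is handed to `subprocess`.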

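Terminology
track
A track is the data set and test strategy used for a benchmark. Much like choosing a car and a course in Need for Speed, you choose from a variety of tracks, such as the http_logs test data.

Take the geonames track as an example (path: .rally/benchmarks/tracks/default/geonames). That directory contains a track.json file, which defines the data and the test strategy used for the whole benchmark. track.json has several key parameters.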

race
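A race is a single benchmark run. A race needs a course and a car: if you do not specify a car, the default configuration is used, and if you do not specify a track, the geonames track is used by default.

esrally does not require an existing Elasticsearch cluster, either local or on the network; it can even download the source code from GitHub, build and install it locally, and then benchmark it.

esrally is the official open-source benchmarking tool, so its authority speaks for itself, but it is somewhat complex: learning it is like learning a harder-to-deploy JMeter. If all you want is to test ES search performance, JMeter is the recommended choice.

For more details, see: https://www.jianshu.com/p/979f548c233e and https://elasticsearch.cn/article/275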

Basic ES commands:

1. Write data

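POST /megacorp/employee
{
"first_name" : "John111111",
"last_name" : "Smith11111",
"age" : 115,
"about" : "I love to go rock climbing",
"interests": [ "sports", "music" ]
}

megacorp is analogous to a database in MySQL, and employee is analogous to a table. POST is used here because no ID is given, so ES generates one; to choose the ID yourself, use PUT with the ID appended to the path.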

2. Query by ID

GET /megacorp/employee/5

3. Delete

DELETE /megacorp/

This deletes the entire index (the "database").

4. List all indices (the "databases") in ES

GET /_cat/indices?v
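
The same calls can be made from a script. The sketch below builds the HTTP requests with Python's standard library but does not send them. It assumes an unsecured node at http://localhost:9200, and it writes the document with an explicit ID of 5 (an illustrative choice) so that the query by ID would find it.

```python
import json
import urllib.request

BASE_URL = "http://localhost:9200"  # assumption: a local, unsecured ES node


def es_request(method, path, body=None):
    """Build (but do not send) an HTTP request for the ES REST API."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    headers = {"Content-Type": "application/json"} if data else {}
    return urllib.request.Request(BASE_URL + path, data=data,
                                  headers=headers, method=method)


employee = {
    "first_name": "John111111",
    "last_name": "Smith11111",
    "age": 115,
    "about": "I love to go rock climbing",
    "interests": ["sports", "music"],
}

calls = [
    es_request("PUT", "/megacorp/employee/5", employee),  # write with explicit ID
    es_request("GET", "/megacorp/employee/5"),            # query by ID
    es_request("DELETE", "/megacorp"),                    # delete the whole index
    es_request("GET", "/_cat/indices?v"),                 # list all indices
]

for req in calls:
    print(req.get_method(), req.full_url)
    # To actually send it: urllib.request.urlopen(req).read()
```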