
Load-testing TF-Serving with wrk

Date: 2022-11-18 16:01:28


TF-Serving service

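# Start the container
# REPO_DIR is a placeholder for the directory that contains the cloned `serving` repository
# (do not use PATH here: that is the shell's executable search path)
docker run -t --rm -p 8501:8501 \
-v "${REPO_DIR}/serving/tensorflow_serving/servables/tensorflow/testdata/saved_model_half_plus_two_cpu:/models/half_plus_two" \
-e MODEL_NAME=half_plus_two \
tensorflow/serving &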

# Verify
curl -d '{"instances": [1.2, 2.0, 5.0]}' \
-X POST http://localhost:8501/v1/models/half_plus_two:predict
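The `half_plus_two` test model computes 0.5 * x + 2 for each input, so the reply to the verification request can be checked against known values. A minimal sketch follows; `expected_half_plus_two` is a hypothetical helper name introduced here for illustration, and the server may print the same values with float32 rounding.

```shell
# expected_half_plus_two (hypothetical helper): print 0.5 * x + 2
# for every argument, one value per line.
expected_half_plus_two() {
  for x in "$@"; do
    # LC_ALL=C keeps "." as the decimal separator regardless of locale
    LC_ALL=C awk -v x="$x" 'BEGIN { printf "%.1f\n", 0.5 * x + 2 }'
  done
}

# Same inputs as the curl request above
expected_half_plus_two 1.2 2.0 5.0
```

If the `predictions` array returned by curl differs noticeably from these values, or the reply is an error object, the model is probably not being served correctly and a load test would only measure error responses.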

Install wrk

macOS:
brew install wrk
Linux:
git clone https://github.com/wg/wrk.git
cd wrk
make

Edit test.lua

wrk.method = "POST"
wrk.headers["Content-Type"] = "application/json"
wrk.body = '{"instances": [1.2, 2.0, 5.0]}'

Run the load test

wrk -t8 -c200 -d20s --script=test.lua --latency http://localhost:8501/v1/models/half_plus_two:predict

# Results
Running 20s test @ http://localhost:8501/v1/models/half_plus_two:predict
  8 threads and 200 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    49.89ms   31.43ms 322.27ms   94.01%
    Req/Sec   550.19    145.19   790.00     70.58%
  Latency Distribution
     50%   41.94ms
     75%   49.79ms
     90%   64.09ms
     99%  215.99ms
  86347 requests in 20.09s, 15.48MB read
  Non-2xx or 3xx responses: 86347
Requests/sec:   4297.20
Transfer/sec:    788.94KB
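
In the report above, the `Non-2xx or 3xx responses` count equals the total request count (86347 of 86347), so the latency and throughput figures describe error responses rather than successful predictions. A minimal sketch for catching this automatically, assuming the wrk report is piped in as text; `wrk_error_rate` is a hypothetical helper name, not part of wrk.

```shell
# wrk_error_rate (hypothetical helper, not part of wrk):
# reads a wrk text report on stdin and prints the percentage of
# responses that wrk counted as non-2xx/3xx.
wrk_error_rate() {
  LC_ALL=C awk '
    / requests in /             { total = $1 }   # e.g. "86347 requests in 20.09s, ..."
    /Non-2xx or 3xx responses:/ { bad = $NF }    # line is absent when every response succeeded
    END {
      if (total + 0 == 0) { exit 1 }             # no request count found in the input
      printf "%.2f\n", 100 * bad / total
    }'
}
```

For example, `wrk -t8 -c200 -d20s --script=test.lua --latency http://localhost:8501/v1/models/half_plus_two:predict | wrk_error_rate` prints the error percentage for a run. For the report above it prints 100.00, which suggests re-running the curl verification before trusting the numbers.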


From: https://blog.51cto.com/u_15879559/5868558
