TensorFlow Serving Introduction
The TensorFlow Serving system is well suited to serving multiple, large-scale models that are trained on real-world data and change dynamically. It provides a solution for taking models into production.
TensorFlow Serving simplifies and accelerates the path from model to production. It makes it possible to safely deploy new models and run experiments while keeping the server architecture and APIs unchanged. Beyond its native integration with TensorFlow, it can be extended to serve other types of models.
TensorFlow Serving uses a (previously trained) model to perform inference: predictions based on data presented by clients. Because clients typically communicate with the serving system through a remote procedure call (RPC) interface, TensorFlow Serving provides a reference front-end implementation based on gRPC, a high-performance open-source RPC framework developed by Google. Since this June, TensorFlow Serving also supports a RESTful API alongside the existing gRPC API, making access more consistent with common JSON conventions. It is common to load new model iterations as new data becomes available or as models are improved. In fact, at Google, many pipelines run continuously, producing new model versions as soon as new data is available.
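As a minimal sketch of the RESTful access path mentioned above: TensorFlow Serving's documented REST API exposes a `/v1/models/<name>:predict` endpoint that accepts a JSON body with an `"instances"` field. The host/port and the model name `my_model` below are placeholders, not values from this document.

```python
import json

def build_predict_request(host, model_name, instances):
    """Return the URL and JSON body for a TensorFlow Serving REST predict call.

    Follows the documented endpoint shape:
      POST http://<host>/v1/models/<model_name>:predict
      body: {"instances": [...]}
    """
    url = "http://%s/v1/models/%s:predict" % (host, model_name)
    body = json.dumps({"instances": instances})
    return url, body

# Placeholder host and model name for illustration only.
url, body = build_predict_request("localhost:8501", "my_model", [[1.0, 2.0]])
print(url)  # http://localhost:8501/v1/models/my_model:predict

# To actually send it (assuming a server is running):
#   req = urllib.request.Request(url, body.encode(),
#                                {"Content-Type": "application/json"})
#   urllib.request.urlopen(req)
```

The same prediction can also be issued over gRPC with the `PredictionService` stub; the REST path is simply the more JSON-friendly option the paragraph describes.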
Simple TensorFlow Serving
Simple TensorFlow Serving is a generic, easy-to-use serving service for machine learning models.
It is the bridge between TensorFlow models and clients, bringing machine learning to almost any programming language, such as Bash, Python, C++, Java, Scala, Go, Ruby, JavaScript, PHP, Erlang, Lua, Rust, Swift, Perl, Lisp, Haskell, Clojure, and R.
- Support distributed TensorFlow models
- Support the general RESTful/HTTP APIs
- Support inference with accelerated GPU
- Support curl and other command-line tools
- Support clients in any programming language
- Support generating client code from models without hand-coding
- Support inference from raw image files for image models
- Support statistical metrics for verbose requests
- Support serving multiple models at the same time
- Support dynamically bringing model versions online and offline
- Support loading new custom op for TensorFlow models
- Support secure authentication with configurable basic auth
- Support multiple models of TensorFlow/MXNet/PyTorch/Caffe2/CNTK/ONNX/H2o/Scikit-learn/XGBoost/PMML
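Since Simple TensorFlow Serving exposes a general RESTful/HTTP API, a client in any language just POSTs a JSON document. The sketch below builds such a request body in Python; the endpoint `http://localhost:8500` is the project's default, while the model name `default` and the `keys` feature name are placeholders that depend on your model's signature.

```python
import json

def build_inference_request(model_name, data):
    """Return the JSON body for a Simple TensorFlow Serving inference POST.

    The body shape {"model_name": ..., "data": {...}} follows the project's
    RESTful API; the keys inside "data" must match the model's input tensors.
    """
    return json.dumps({"model_name": model_name, "data": data})

# Placeholder model name and input data for illustration only.
body = build_inference_request("default", {"keys": [[1.0], [2.0]]})
print(body)

# Any HTTP client works, which is how clients in Bash, Java, Go, etc. connect:
#   curl -H "Content-Type: application/json" -X POST \
#        -d '<body>' http://localhost:8500
```

Because the request is plain JSON over HTTP, no generated gRPC stubs are required, which is what makes the service accessible from the long list of languages above.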