2022-10-26
Paddle Serving is PaddlePaddle's online inference service framework.
Motivation
We consider deploying deep learning inference services online to be a user-facing application in the future. The goal of this project: once you have trained a deep neural network with Paddle, you can also deploy the model online easily. A demo of Paddle Serving is as follows:
Installation
We highly recommend running Paddle Serving in Docker; please refer to Run in Docker.
```shell
# Run CPU Docker
docker pull hub.baidubce.com/paddlepaddle/serving:latest
docker run -p 9292:9292 --name test -dit hub.baidubce.com/paddlepaddle/serving:latest
docker exec -it test bash
```
```shell
# Run GPU Docker
nvidia-docker pull hub.baidubce.com/paddlepaddle/serving:latest-gpu
nvidia-docker run -p 9292:9292 --name test -dit hub.baidubce.com/paddlepaddle/serving:latest-gpu
nvidia-docker exec -it test bash
```
```shell
pip install paddle-serving-client
pip install paddle-serving-server      # CPU
pip install paddle-serving-server-gpu  # GPU
```
You may need to use a domestic mirror source to speed up the download (in China, you can use the Tsinghua mirror by adding `-i https://pypi.tuna.tsinghua.edu.cn/simple` to the pip command).
If you need to install modules compiled from the develop branch, please download packages from the latest packages list and install them with the pip install command.
Packages of Paddle Serving support CentOS 6/7 and Ubuntu 16/18. You can also use the HTTP service without installing the client.
Pre-built services with Paddle Serving
Chinese Word Segmentation
```shell
> python -m paddle_serving_app.package --get_model lac
> tar -xzf lac.tar.gz
> python lac_web_service.py lac_model/ lac_workdir 9393 &
> curl -H "Content-Type:application/json" -X POST -d '{"feed":[{"words": "我爱北京天安门"}], "fetch":["word_seg"]}' http://127.0.0.1:9393/lac/prediction
{"result":[{"word_seg":"我|爱|北京|天安门"}]}
```
Image Classification
```shell
> python -m paddle_serving_app.package --get_model resnet_v2_50_imagenet
> tar -xzf resnet_v2_50_imagenet.tar.gz
> python resnet50_imagenet_classify.py resnet50_serving_model &
> curl -H "Content-Type:application/json" -X POST -d '{"feed":[{"image": "https://paddle-serving.bj.bcebos.com/imagenet-example/daisy.jpg"}], "fetch": ["score"]}' http://127.0.0.1:9292/image/prediction
{"result":{"label":["daisy"],"prob":[0.9341403245925903]}}
```
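Both pre-built services above accept the same JSON contract over HTTP: a `feed` list of input dicts and a `fetch` list of output names. As a minimal sketch (the `build_payload` helper below is hypothetical, not part of Paddle Serving), the LAC request body can be constructed in Python and POSTed to the running service instead of using curl:

```python
import json

def build_payload(feed_dict, fetch_keys):
    # Hypothetical helper: wraps a single input sample into the
    # {"feed": [...], "fetch": [...]} body that the HTTP services expect.
    return {"feed": [feed_dict], "fetch": fetch_keys}

payload = build_payload({"words": "我爱北京天安门"}, ["word_seg"])
print(json.dumps(payload, ensure_ascii=False))
# Send it to the LAC service started above, e.g. with the requests library:
#   requests.post("http://127.0.0.1:9393/lac/prediction", json=payload)
```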
Quick Start Example
This quick start example is only for users who already have a model to deploy; we provide a ready-to-deploy model here. If you want to learn how to use Paddle Serving from offline training to online serving, please refer to Train_To_Service.
Boston House Price Prediction model
```shell
wget --no-check-certificate https://paddle-serving.bj.bcebos.com/uci_housing.tar.gz
tar -xzf uci_housing.tar.gz
```
Paddle Serving provides HTTP-based and RPC-based services for users to access.
HTTP service
Paddle Serving provides a built-in Python module called `paddle_serving_server.serve` that can start an RPC service or an HTTP service with a one-line command. If we specify the argument `--name uci`, we get an HTTP service with the URL `$IP:$PORT/uci/prediction`.
```shell
python -m paddle_serving_server.serve --model uci_housing_model --thread 10 --port 9292 --name uci
```
| Argument | Type | Default | Description |
|---|---|---|---|
| `thread` | int | `4` | Concurrency of current service |
| `port` | int | `9292` | Exposed port of current service to users |
| `name` | str | `""` | Service name, can be used to generate the HTTP request URL |
| `model` | str | `""` | Path of the Paddle model directory to be served |
| `mem_optim` | - | - | Enable memory / graphics memory optimization |
| `ir_optim` | - | - | Enable analysis and optimization of the computation graph |
| `use_mkl` (CPU version only) | - | - | Run inference with MKL |
Here, we use curl to send an HTTP POST request to the service we just started. Users can send HTTP POST requests with any Python library as well, e.g., requests.
```shell
curl -H "Content-Type:application/json" -X POST -d '{"feed":[{"x": [0.0137, -0.1136, 0.2553, -0.0692, 0.0582, -0.0727, -0.1583, -0.0584, 0.6283, 0.4919, 0.1856, 0.0795, -0.0332]}], "fetch":["price"]}' http://127.0.0.1:9292/uci/prediction
```
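The same request can be issued from Python's standard library instead of curl. This is a sketch, assuming the HTTP service above is running on `127.0.0.1:9292`; the helper names `build_body` and `predict_price` are illustrative, not part of Paddle Serving:

```python
import json
import urllib.request

def build_body(features):
    # Builds the same JSON body as the curl command above.
    return json.dumps({"feed": [{"x": features}], "fetch": ["price"]})

def predict_price(features, url="http://127.0.0.1:9292/uci/prediction"):
    # POSTs the body to the running service and returns the decoded response.
    req = urllib.request.Request(
        url,
        data=build_body(features).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))

x = [0.0137, -0.1136, 0.2553, -0.0692, 0.0582, -0.0727,
     -0.1583, -0.0584, 0.6283, 0.4919, 0.1856, 0.0795, -0.0332]
print(build_body(x))
# result = predict_price(x)  # uncomment once the service is running
```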
RPC service
A user can also start an RPC service with `paddle_serving_server.serve`. The RPC service is usually faster than the HTTP service, although it requires some coding against Paddle Serving's Python client API. Note that we do not specify `--name` here.
```shell
python -m paddle_serving_server.serve --model uci_housing_model --thread 10 --port 9292
```
```python
# A user can visit the RPC service through the paddle_serving_client API
from paddle_serving_client import Client

client = Client()
client.load_client_config("uci_housing_client/serving_client_conf.prototxt")
client.connect(["127.0.0.1:9292"])
data = [0.0137, -0.1136, 0.2553, -0.0692, 0.0582, -0.0727,
        -0.1583, -0.0584, 0.6283, 0.4919, 0.1856, 0.0795, -0.0332]
fetch_map = client.predict(feed={"x": data}, fetch=["price"])
print(fetch_map)
```
Here, the `client.predict` function takes two arguments. `feed` is a Python dict mapping model input variable alias names to values. `fetch` specifies the prediction variables to be returned from the server. In the example, the names "x" and "price" were assigned when the servable model was saved during training.
Some Key Features of Paddle Serving
- Integrates with the Paddle training pipeline seamlessly; most Paddle models can be deployed with a one-line command.
- Industrial serving features supported, such as model management, online loading, online A/B testing, etc.
- Distributed key-value indexing supported, which is especially useful for large-scale sparse features as model inputs.
- Highly concurrent and efficient communication between clients and servers.
- Multiple programming languages supported on the client side, such as Golang, C++ and Python.
Document
New to Paddle Serving
- How to save a servable model?
- An end-to-end tutorial from training to inference service deployment
- Write Bert-as-Service in 10 minutes
Developers
- How to config Serving native operators on the server side?
- How to develop a new Serving operator?
- How to develop a new Web Service?
- Golang client
- Compile from source code
- Deploy Web Service with uWSGI
- Hot loading for model files
About Efficiency
- How to profile Paddle Serving latency?
- How to optimize performance?
- Deploy multi-services on one GPU (Chinese)
- CPU Benchmarks (Chinese)
- GPU Benchmarks (Chinese)
FAQ
FAQ(Chinese)
Design
Design Doc
Community
User Group in China
PaddleServing QQ discussion group · PaddleServing WeChat group
Slack
To connect with other users and contributors, you are welcome to join our Slack channel.
Contribution
If you want to contribute code to Paddle Serving, please refer to the Contribution Guidelines.
Feedback
For any feedback, or to report a bug, please open a GitHub issue.
License
Apache 2.0 License