PyTorch Model Deployment

Containerized deployment (Docker + microservices) with gRPC communication

1. Architecture Design

  • Decouple the AI service: split the AI module out into a standalone gRPC service so that model inference cannot block the main game logic.
  • Communication: the Go game server calls the AI service over gRPC, sending the current game state and receiving a decision result; a sketch of the service contract follows below.
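
The .proto contract itself is not shown in this article, but the server code in Appendix 1 implies its shape: a ModelService with a unary Predict method, a request carrying an Observation, and a Response with code, msg, and action fields. A minimal model_service.proto consistent with that usage might look like the sketch below; the Observation and action field layouts are illustrative assumptions, since the real encoding is game-specific.

```protobuf
// model_service.proto -- hypothetical sketch. Only ModelService, Predict,
// Observation, Response, and the code/msg/action fields are implied by the
// server code in Appendix 1; everything else is an illustrative assumption.
syntax = "proto3";

package model_service;

service ModelService {
  // Unary call: the game server sends the current game state and
  // receives the model's chosen action.
  rpc Predict (Request) returns (Response);
}

// The game-state encoding is game-specific; these fields are placeholders.
message Observation {
  repeated int32 hand_cards = 1;
  repeated int32 played_history = 2;
}

message Request {
  Observation observation = 1;
}

message Response {
  int32 code = 1;             // 0 = OK, -1 = error
  string msg = 2;
  repeated int32 action = 3;  // shape depends on what the model returns
}
```

The proto.model_service_pb2 modules imported by the server would then be generated at development time with grpcio-tools, e.g. `python -m grpc_tools.protoc -I. --python_out=. --grpc_python_out=. proto/model_service.proto`.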

2. Containerized vs. Traditional Deployment

| Step | Containerized approach | Traditional approach |
| --- | --- | --- |
| Environment setup | Install Docker or Kubernetes | Install Python and dependencies |
| Code build | Build a Docker image | Compile an executable locally |
| Service deployment | Docker Compose / Kubernetes | Service managed by systemd |
| Networking | Services call each other via internal DNS names | Nginx reverse proxy |
| Scalability | Scale out container instances horizontally | Deploy additional nodes by hand |
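
For a concrete picture of the containerized column, here is a minimal docker-compose.yml sketch that runs the AI service (built from the Dockerfile in Appendix 2) alongside a Go game server; the game-server image name and environment variable are illustrative assumptions.

```yaml
# docker-compose.yml -- minimal sketch; the game-server image is hypothetical
services:
  ai-service:
    build: .                # Dockerfile from Appendix 2
    ports:
      - "50051:50051"       # expose the gRPC port
  game-server:
    image: example/go-game-server:latest   # placeholder image
    depends_on:
      - ai-service
    environment:
      # the Go server reaches the AI service via its internal DNS name
      AI_SERVICE_ADDR: "ai-service:50051"
```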

3. HTTP vs. gRPC Comparison

| Dimension | HTTP (RESTful) | gRPC |
| --- | --- | --- |
| Performance | HTTP/1.1, text payloads (JSON/XML), heavy serialization overhead, limited connection reuse (no multiplexing) | HTTP/2, binary payloads (Protobuf), multiplexing and header compression; roughly 10x faster |
| Development efficiency | API docs maintained by hand; clients parse JSON themselves; mature debugging tools (e.g., Postman) | Client/server code generated from Protobuf; strongly typed interfaces catch errors before runtime |
| Communication modes | Request-response only | Unary, server/client streaming, and bidirectional streaming; well suited to real-time interaction |
| Protocol flexibility | Works in browsers and on mobile; very general | Requires dedicated client support; best suited to internal service-to-service communication |
| Debugging tools | Rich tooling (curl, browsers) | Needs dedicated tools (e.g., Apifox and Apipost, both of which can load .proto files) |
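
As a concrete counterpart to the table, a minimal Python test client for the service in Appendix 1 might look like the sketch below. It assumes the generated proto modules and the hypothetical message layout from section 1; adapt the field names to your actual .proto.

```python
# grpc_client.py -- minimal test-client sketch; message fields follow the
# hypothetical model_service.proto from section 1.
import grpc

import proto.model_service_pb2 as pb2
import proto.model_service_pb2_grpc as pb2_grpc


def main():
    # One HTTP/2 connection is multiplexed across all calls on this channel.
    with grpc.insecure_channel("localhost:50051") as channel:
        stub = pb2_grpc.ModelServiceStub(channel)
        # Observation fields are game-specific placeholders.
        request = pb2.Request(observation=pb2.Observation())
        response = stub.Predict(request, timeout=5.0)
        print(f"code={response.code} msg={response.msg} action={response.action}")


if __name__ == "__main__":
    main()
```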

Appendix 1: Python service script grpc_server.py

```python
import os
import sys
import logging
import grpc
from concurrent import futures
from env.game import InfoSet
from env.move_generator import MoveGenerator
from env import utils
import evaluation.deep_agent as deep_agent
import proto.model_service_pb2
import proto.model_service_pb2_grpc
from google.protobuf.json_format import MessageToJson
from grpc_reflection.v1alpha import reflection

# logging.basicConfig(level=logging.INFO, filename="test.log", filemode="w")
logging.basicConfig(level=logging.DEBUG, format='%(asctime)s - %(levelname)s - %(message)s')
logger = logging.getLogger("game")

# NOTE: adjust this to however the DouZero project actually loads its model
modelDA = deep_agent.DeepAgent(0, "models/model.ckpt")
generator = MoveGenerator()


def _get_obs(observation: proto.model_service_pb2.Observation) -> InfoSet:
    # ... your own Observation-to-InfoSet conversion code goes here ...
    return info

class PyTorchModel:
    def __init__(self, model):
        self.model = model

    def predict(self, obs):
        action = self.model.act(obs)
        return action

class ModelServicer(proto.model_service_pb2_grpc.ModelServiceServicer):
    def __init__(self, model):
        self.model = model

    def Predict(self, request, context):
        try:
            obs = _get_obs(request.observation)
            action = self.model.predict(obs)
            logger.debug(f"obs: '{obs}' action: '{action}'")
            return proto.model_service_pb2.Response(code=0, msg="OK", action=action)
        except Exception as e:
            json_str = MessageToJson(request.observation)
            logger.error(f"error: {str(e)}, obs: '{json_str}'")
            return proto.model_service_pb2.Response(code=-1, msg=str(e))

def serve():
    model = PyTorchModel(modelDA)
    server = grpc.server(futures.ThreadPoolExecutor(max_workers=10))
    proto.model_service_pb2_grpc.add_ModelServiceServicer_to_server(
        ModelServicer(model), server
    )
      
    # Enable server reflection so generic clients (e.g., grpcurl) can discover the service
    SERVICE_NAMES = (
        proto.model_service_pb2.DESCRIPTOR.services_by_name['ModelService'].full_name,
        reflection.SERVICE_NAME,
    )
    reflection.enable_server_reflection(SERVICE_NAMES, server)

    server.add_insecure_port("[::]:50051")
    server.start()
    logger.info("gRPC Server running on port 50051")
    server.wait_for_termination()

if __name__ == "__main__":
    serve()
```
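
Because the server registers the reflection service, it can be probed with a generic gRPC CLI such as grpcurl, no .proto file required. The package and service names below follow the hypothetical sketch from section 1.

```bash
# list all services exposed via reflection
grpcurl -plaintext localhost:50051 list

# describe the service and call Predict with an empty observation
grpcurl -plaintext localhost:50051 describe model_service.ModelService
grpcurl -plaintext -d '{"observation": {}}' \
    localhost:50051 model_service.ModelService/Predict
```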



Appendix 2: Dockerfile

```dockerfile
FROM pytorch/pytorch:latest

# Install dependencies in a single layer so the apt index is always fresh
RUN apt-get update && apt-get install -y git && rm -rf /var/lib/apt/lists/*
RUN pip install grpcio protobuf grpcio-reflection GitPython -i https://pypi.tuna.tsinghua.edu.cn/simple

# Set the working directory
WORKDIR /app

# Copy the model and application code
COPY models/model.ckpt /app/models/model.ckpt
COPY env /app/env
COPY dmc /app/dmc
COPY evaluation /app/evaluation
COPY proto /app/proto
COPY grpc_server.py /app/grpc_server.py
COPY version.txt /app/version.txt

# Expose the gRPC port
EXPOSE 50051

# Start the gRPC server
CMD ["python3", "grpc_server.py"]
```
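
Building and running the image locally (the tag deepai-game1 matches the Kubernetes manifest in Appendix 3; any tag works):

```bash
docker build -t deepai-game1 .
docker run --rm -p 50051:50051 deepai-game1
```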

Appendix 3: Kubernetes configuration

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: deepai-game1
  labels:
    app: deepai-game1
spec:
  replicas: 1
  selector:
    matchLabels:
      app: deepai-game1
  template:
    metadata:
      labels:
        app: deepai-game1
    spec:
      containers:
      - name: deepai-game1
        image: registry-vpc.cn-hangzhou.aliyuncs.com/balt/deepai-game1:latest
        imagePullPolicy: Always
        ports:
        - containerPort: 50051
          protocol: TCP
          name: grpc
        resources:
          requests:
            cpu: "100m"
            memory: "256Mi"
          limits:
            cpu: "1000m"
            memory: "512Mi"
---
apiVersion: v1
kind: Service
metadata:
  name: deepai-game1
spec:
  clusterIP: None  # key setting: a headless Service lets gRPC clients resolve Pod IPs directly for client-side load balancing
  ports:
    - port: 50051
      name: grpc
  selector:
    app: deepai-game1  # must match the Pod labels above
```

Deployments are split along business lines, i.e., one AI service deployment per game.
