Containerized Deployment (Docker + Microservices) + gRPC Communication
1. Architecture Design
- Decouple the AI service: split the AI module out into a standalone gRPC service so that inference never blocks the main game logic.
- Communication: the Go game server calls the AI service over gRPC, passing the current game state and receiving the decision result (see the client sketch below).
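A minimal sketch of this call path, shown in Python for consistency with Appendix 1 (the game server itself is written in Go, but the generated-stub pattern is identical in both languages). The `Request` message name and the `Observation` fields are assumptions inferred from the server code; adjust them to your actual .proto definition.

```python
# Hypothetical client sketch: message names assumed from the server in
# Appendix 1 (ModelService.Predict taking a request with an observation field).
import grpc
import proto.model_service_pb2 as pb2
import proto.model_service_pb2_grpc as pb2_grpc

def request_action(address: str = "localhost:50051"):
    # In a real service the channel should be created once and reused.
    with grpc.insecure_channel(address) as channel:
        stub = pb2_grpc.ModelServiceStub(channel)
        request = pb2.Request(observation=pb2.Observation())  # fill in game state
        return stub.Predict(request)

if __name__ == "__main__":
    response = request_action()
    print(response.code, response.msg, response.action)
```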
2. Containerized vs. Traditional Deployment

| Step | Containerized | Traditional |
| --- | --- | --- |
| Environment setup | Install Docker or Kubernetes | Install Python and its dependencies |
| Build | Build a Docker image | Compile an executable locally |
| Deployment | Docker Compose / Kubernetes | Service managed by systemd |
| Networking | Services call each other via internal DNS names | Nginx reverse proxy |
| Scaling | Scale container instances horizontally | Deploy extra nodes by hand |
3. HTTP vs. gRPC

| Dimension | HTTP (RESTful) | gRPC |
| --- | --- | --- |
| Performance | HTTP/1.1 with text payloads (JSON/XML); heavy serialization overhead; no multiplexing over a single connection | HTTP/2 with binary payloads (Protobuf); multiplexing and header compression; often cited as roughly a 10x speedup |
| Developer efficiency | API docs maintained by hand; clients parse JSON themselves; mature tooling (e.g. Postman) | Code generated from Protobuf; strongly typed interfaces catch errors before runtime |
| Communication modes | Request-response only | Unary, client/server/bidirectional streaming, and server push; suits real-time interaction |
| Protocol flexibility | Works in browsers and on mobile; broadly compatible | Needs dedicated client support; best suited to internal service-to-service calls |
| Debugging tools | Plentiful (curl, the browser) | Dedicated tools required (e.g. Apifox and Apipost, which can debug from .proto files) |
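To make the developer-efficiency row concrete, here is a side-by-side sketch of the two call styles in Python. The REST endpoint (`/predict`) and its JSON shape are hypothetical; the gRPC names reuse the generated stubs from Appendix 1, with the `Request` message name assumed from the server code.

```python
# REST vs. gRPC call styles. The /predict endpoint and its JSON keys are
# hypothetical; the gRPC message/stub names follow Appendix 1.
import requests
import grpc
import proto.model_service_pb2 as pb2
import proto.model_service_pb2_grpc as pb2_grpc

# REST: hand-built URL, untyped dict in, untyped dict out.
resp = requests.post("http://localhost:8000/predict", json={"observation": {}})
action = resp.json()["action"]  # a typo in the key only fails at runtime

# gRPC: generated, strongly typed stubs; a wrong field name is rejected
# by the generated classes instead of silently producing bad JSON.
with grpc.insecure_channel("localhost:50051") as channel:
    stub = pb2_grpc.ModelServiceStub(channel)
    reply = stub.Predict(pb2.Request(observation=pb2.Observation()))
    action = reply.action
```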
Appendix 1: Python service script (grpc_server.py)
```python
import os
import sys
import logging
import grpc
from concurrent import futures

from env.game import InfoSet
from env.move_generator import MoveGenerator
from env import utils
import evaluation.deep_agent as deep_agent
import proto.model_service_pb2
import proto.model_service_pb2_grpc
from google.protobuf.json_format import MessageToJson
from grpc_reflection.v1alpha import reflection

# logging.basicConfig(level=logging.INFO, filename="test.log", filemode="w")
logging.basicConfig(level=logging.DEBUG, format='%(asctime)s - %(levelname)s - %(message)s')
logger = logging.getLogger("game")

# Adjust this to however the DouZero project actually loads its model.
modelDA = deep_agent.DeepAgent(0, "models/model.ckpt")
generator = MoveGenerator()


def _get_obs(observation: proto.model_service_pb2.Observation) -> InfoSet:
    # ... your own Observation -> InfoSet conversion goes here;
    # it should build the `info` object returned below ...
    return info


class PyTorchModel:
    def __init__(self, model):
        self.model = model

    def predict(self, obs):
        action = self.model.act(obs)
        return action


class ModelServicer(proto.model_service_pb2_grpc.ModelServiceServicer):
    def __init__(self, model):
        self.model = model

    def Predict(self, request, context):
        try:
            obs = _get_obs(request.observation)
            action = self.model.predict(obs)
            logger.debug(f"obs: '{obs}' action: '{action}'")
            return proto.model_service_pb2.Response(code=0, msg="OK", action=action)
        except Exception as e:
            json_str = MessageToJson(request.observation)
            logger.error(f"error: {str(e)}, obs: '{json_str}'")
            return proto.model_service_pb2.Response(code=-1, msg=str(e))


def serve():
    model = PyTorchModel(modelDA)
    server = grpc.server(futures.ThreadPoolExecutor(max_workers=10))
    proto.model_service_pb2_grpc.add_ModelServiceServicer_to_server(
        ModelServicer(model), server
    )
    # Enable server reflection so generic clients can discover the service.
    SERVICE_NAMES = (
        proto.model_service_pb2.DESCRIPTOR.services_by_name['ModelService'].full_name,
        reflection.SERVICE_NAME,
    )
    reflection.enable_server_reflection(SERVICE_NAMES, server)
    server.add_insecure_port("[::]:50051")
    server.start()
    logger.info("gRPC Server running on port 50051")
    server.wait_for_termination()


if __name__ == "__main__":
    serve()
```
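A quick smoke test against the server above. It exercises the error contract baked into `ModelServicer.Predict` (code=0 with an action on success, code=-1 plus the exception message on failure) and sets a call deadline; as before, the `Request`/`Observation` construction is a placeholder.

```python
# Smoke test: distinguishes application errors (reply.code) from transport
# errors (grpc.RpcError). Request/Observation fields are placeholders.
import grpc
import proto.model_service_pb2 as pb2
import proto.model_service_pb2_grpc as pb2_grpc

with grpc.insecure_channel("localhost:50051") as channel:
    stub = pb2_grpc.ModelServiceStub(channel)
    try:
        reply = stub.Predict(pb2.Request(observation=pb2.Observation()), timeout=2.0)
        if reply.code == 0:
            print("action:", reply.action)
        else:
            print("model error:", reply.msg)  # the server caught an exception
    except grpc.RpcError as err:
        print("transport error:", err.code(), err.details())
```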
Appendix 2: Dockerfile
```dockerfile
FROM pytorch/pytorch:latest

# Install dependencies (update and install in one layer to avoid a stale apt cache)
RUN apt update && apt install -y git
RUN pip install grpcio protobuf grpcio-reflection GitPython -i https://pypi.tuna.tsinghua.edu.cn/simple

# Copy the code and model
COPY models/model.ckpt /app/models/model.ckpt
COPY env /app/env
COPY dmc /app/dmc
COPY evaluation /app/evaluation
COPY proto /app/proto
COPY grpc_server.py /app/grpc_server.py
COPY version.txt /app/version.txt

# Set the working directory
WORKDIR /app

# Expose the gRPC port
EXPOSE 50051

# Start the service
CMD ["python3", "grpc_server.py"]
```
Appendix 3: Kubernetes configuration
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: deepai-game1
  labels:
    app: deepai-game1
spec:
  replicas: 1
  selector:
    matchLabels:
      app: deepai-game1
  template:
    metadata:
      labels:
        app: deepai-game1
    spec:
      containers:
        - name: deepai-game1
          image: registry-vpc.cn-hangzhou.aliyuncs.com/balt/deepai-game1:latest
          imagePullPolicy: Always
          ports:
            - containerPort: 50051
              protocol: TCP
              name: grpc
          resources:
            requests:
              cpu: "100m"
              memory: "256Mi"
            limits:
              cpu: "1000m"
              memory: "512Mi"
---
apiVersion: v1
kind: Service
metadata:
  name: deepai-game1
spec:
  clusterIP: None  # the key setting: a headless Service, so DNS returns Pod IPs instead of a single virtual IP
  ports:
    - port: 50051
      name: grpc
  selector:
    app: deepai-game1  # must match the backend Pod labels above
```
Deployments are split along business lines, i.e. one Deployment per game.
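One reason `clusterIP: None` is called out as the key setting: gRPC holds long-lived HTTP/2 connections, so a normal ClusterIP Service would pin every call from a client onto whichever Pod its connection happened to reach. A headless Service instead exposes all Pod IPs through DNS, letting the client balance calls itself. A minimal sketch of that client-side setup, assuming the in-cluster DNS name from the Service above:

```python
# Client-side load balancing against the headless Service: the dns:/// target
# resolves to every Pod IP, and round_robin spreads calls across them.
import grpc
import proto.model_service_pb2_grpc as pb2_grpc

channel = grpc.insecure_channel(
    "dns:///deepai-game1:50051",  # Service name resolvable inside the cluster
    options=[("grpc.lb_policy_name", "round_robin")],
)
stub = pb2_grpc.ModelServiceStub(channel)
```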