Argo Workflows: a workflow engine for Kubernetes

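Argo CD focuses on automating application deployment and management as part of the continuous integration and continuous deployment process, whereas Argo Workflows focuses on orchestrating and running complex workflow tasks on Kubernetes.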

What is Argo Workflows?

Argo Workflows is an open-source, container-native workflow engine for orchestrating parallel jobs on Kubernetes. It is implemented as a Kubernetes CRD (Custom Resource Definition).

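  • Define workflows in which each step is a container.
  • Model multi-step workflows as a sequence of tasks, or describe the dependencies between tasks with a directed acyclic graph (DAG).
  • Run compute-intensive jobs for machine learning or data processing on Kubernetes easily and in a short time.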
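
As a rough illustration of the DAG style mentioned in the list above, the following sketch writes a diamond-shaped workflow to a file: tasks B and C depend on A, and D depends on both. The file name dag-diamond.yaml, the template and task names, and the busybox image are illustrative assumptions rather than anything prescribed by Argo Workflows.

```shell
# Write a hypothetical DAG workflow manifest (file and task names are assumptions).
# The quoted 'EOF' keeps the shell from touching the {{...}} placeholders.
cat > dag-diamond.yaml <<'EOF'
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: dag-diamond-
spec:
  entrypoint: diamond
  templates:
  - name: echo
    inputs:
      parameters:
      - name: message
    container:
      image: busybox
      command: [echo]
      args: ["{{inputs.parameters.message}}"]
  - name: diamond
    dag:
      tasks:
      - name: A
        template: echo
        arguments:
          parameters: [{name: message, value: A}]
      - name: B
        dependencies: [A]          # runs after A
        template: echo
        arguments:
          parameters: [{name: message, value: B}]
      - name: C
        dependencies: [A]          # runs after A, in parallel with B
        template: echo
        arguments:
          parameters: [{name: message, value: C}]
      - name: D
        dependencies: [B, C]       # runs once both B and C have finished
        template: echo
        arguments:
          parameters: [{name: message, value: D}]
EOF
```

With a cluster and the argo CLI available, a file like this could be submitted with argo submit --watch dag-diamond.yaml.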

Installing Argo Workflows

kubectl create namespace argo
kubectl apply -n argo -f https://github.com/argoproj/argo-workflows/releases/download/v<<ARGO_WORKFLOWS_VERSION>>/install.yaml
# Or use the latest release
kubectl apply -n argo -f https://github.com/argoproj/argo-workflows/releases/latest/download/install.yaml
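
To pin a specific release instead of the latest one, the version placeholder can be filled in through a shell variable. This is only a sketch: 3.5.10 is an assumed example version (it matches the argoexec tag that appears later in this article), and the block merely prints the resulting command rather than running it.

```shell
# Assumed example version; replace it with the release you actually want.
ARGO_WORKFLOWS_VERSION="3.5.10"
INSTALL_URL="https://github.com/argoproj/argo-workflows/releases/download/v${ARGO_WORKFLOWS_VERSION}/install.yaml"

# Print the command so it can be reviewed before it is run against a cluster.
echo "kubectl apply -n argo -f ${INSTALL_URL}"
```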

After the installation finishes, the resources look like this:

$ kubectl -n argo get all
NAME                                      READY   STATUS    RESTARTS   AGE
pod/argo-server-5bb489b6b4-zwplh          1/1     Running   0          107s
pod/workflow-controller-86858b796-mv7sv   1/1     Running   0          107s

NAME                  TYPE        CLUSTER-IP    EXTERNAL-IP   PORT(S)    AGE
service/argo-server   ClusterIP   10.96.0.158   <none>        2746/TCP   107s

NAME                                  READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/argo-server           1/1     1            1           107s
deployment.apps/workflow-controller   1/1     1            1           107s

NAME                                            DESIRED   CURRENT   READY   AGE
replicaset.apps/argo-server-5bb489b6b4          1         1         1       107s
replicaset.apps/workflow-controller-86858b796   1         1         1       107s

Exposing the Argo Server through a NodePort

kubectl edit service argo-server -n argo
----
apiVersion: v1
kind: Service
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"v1","kind":"Service","metadata":{"annotations":{},"name":"argo-server","namespace":"argo"},"spec":{"ports":[{"name":"web","port":2746,"targetPort":2746}],"selector":{"app":"argo-server"}}}
  creationTimestamp: "2023-01-02T13:55:29Z"
  name: argo-server
  namespace: argo
  resourceVersion: "167527"
  uid: c0c42005-a4e4-4a43-82d6-5ff3d002c209
spec:
  clusterIP: 10.233.37.209
  clusterIPs:
  - 10.233.37.209
  externalTrafficPolicy: Cluster
  internalTrafficPolicy: Cluster
  ipFamilies:
  - IPv4
  ipFamilyPolicy: SingleStack
  ports:
  - name: web
    nodePort: 32746    # set the node port
    port: 2746
    protocol: TCP
    targetPort: 2746
  selector:
    app: argo-server
  sessionAffinity: None
  type: NodePort     # changed to NodePort
status:
  loadBalancer: {}

Note that with the default configuration the server serves HTTPS using a self-signed certificate, so make sure to access it with the https:// scheme. For example: https://192.168.59.59:32746/

Changing the default authentication mode

kubectl patch deployment \
  argo-server \
  --namespace argo \
  --type='json' \
  -p='[{"op": "replace", "path": "/spec/template/spec/containers/0/args", "value": [
  "server",
  "--auth-mode=server"
]}]'

Installing the Argo Workflows CLI

# Download the binary
curl -sLO https://github.com/argoproj/argo/releases/download/v3.0.0-rc4/argo-linux-amd64.gz

# Unzip
gunzip argo-linux-amd64.gz

# Make binary executable
chmod +x argo-linux-amd64

# Move binary to path
mv ./argo-linux-amd64 /usr/local/bin/argo

After installation, run the following commands to verify that the CLI was installed successfully.

argo version
$ argo help

You can use the CLI in the following modes:

#### Kubernetes API Mode (default)

Requests are sent directly to the Kubernetes API. No Argo Server is needed. Large workflows and the workflow archive are not supported.

Use when you have direct access to the Kubernetes API, and don't need large workflow or workflow archive support.

If you're using instance ID (which is very unlikely), you'll need to set it:

        ARGO_INSTANCEID=your-instanceid

#### Argo Server GRPC Mode

Requests are sent to the Argo Server API via GRPC (using HTTP/2). Large workflows and the workflow archive are supported. Network load-balancers that do not support HTTP/2 are not supported.

Use if you do not have access to the Kubernetes API (e.g. you're in another cluster), and you're running the Argo Server using a network load-balancer that support HTTP/2.

To enable, set ARGO_SERVER:

        ARGO_SERVER=localhost:2746 ;# The format is "host:port" - do not prefix with "http" or "https"

If you're have transport-layer security (TLS) enabled (i.e. you are running "argo server --secure" and therefore has HTTPS):

        ARGO_SECURE=true

If your server is running with self-signed certificates. Do not use in production:

        ARGO_INSECURE_SKIP_VERIFY=true

By default, the CLI uses your KUBECONFIG to determine default for ARGO_TOKEN and ARGO_NAMESPACE. You probably error with "no configuration has been provided". To prevent it:

        KUBECONFIG=/dev/null

You will then need to set:

        ARGO_NAMESPACE=argo

And:

        ARGO_TOKEN='Bearer ******' ;# Should always start with "Bearer " or "Basic ".

#### Argo Server HTTP1 Mode

As per GRPC mode, but uses HTTP. Can be used with ALB that does not support HTTP/2. The command "argo logs --since-time=2020...." will not work (due to time-type).

Use this when your network load-balancer does not support HTTP/2.

Use the same configuration as GRPC mode, but also set:

        ARGO_HTTP1=true

If your server is behind an ingress with a path (running "argo server --base-href /argo" or "ARGO_BASE_HREF=/argo argo server"):

        ARGO_BASE_HREF=/argo

Usage:
  argo [flags]
  argo [command]

Available Commands:
  archive          manage the workflow archive
  auth             manage authentication settings
  cluster-template manipulate cluster workflow templates
  completion       output shell completion code for the specified shell (bash, zsh or fish)
  cp               copy artifacts from workflow
  cron             manage cron workflows
  delete           delete workflows
  executor-plugin  manage executor plugins
  get              display details about a workflow
  help             Help about any command
  lint             validate files or directories of manifests
  list             list workflows
  logs             view logs of a pod or workflow
  node             perform action on a node in a workflow
  resubmit         resubmit one or more workflows
  resume           resume zero or more workflows (opposite of suspend)
  retry            retry zero or more workflows
  server           start the Argo Server
  stop             stop zero or more workflows allowing all exit handlers to run
  submit           submit a workflow
  suspend          suspend zero or more workflows (opposite of resume)
  template         manipulate workflow templates
  terminate        terminate zero or more workflows immediately
  version          print version information
  wait             waits for workflows to complete
  watch            watch a workflow until it completes

Flags:
      --argo-base-href string          Path to use with HTTP client due to Base HREF. Defaults to the ARGO_BASE_HREF environment variable.
      --argo-http1                     If true, use the HTTP client. Defaults to the ARGO_HTTP1 environment variable.
  -s, --argo-server host:port          API server host:port. e.g. localhost:2746. Defaults to the ARGO_SERVER environment variable.
      --as string                      Username to impersonate for the operation
      --as-group stringArray           Group to impersonate for the operation, this flag can be repeated to specify multiple groups.
      --as-uid string                  UID to impersonate for the operation
      --certificate-authority string   Path to a cert file for the certificate authority
      --client-certificate string      Path to a client certificate file for TLS
      --client-key string              Path to a client key file for TLS
      --cluster string                 The name of the kubeconfig cluster to use
      --context string                 The name of the kubeconfig context to use
      --disable-compression            If true, opt-out of response compression for all requests to the server
      --gloglevel int                  Set the glog logging level
  -H, --header strings                 Sets additional header to all requests made by Argo CLI. (Can be repeated multiple times to add multiple headers, also supports comma separated headers) Used only when either ARGO_HTTP1 or --argo-http1 is set to true.
  -h, --help                           help for argo
      --insecure-skip-tls-verify       If true, the server's certificate will not be checked for validity. This will make your HTTPS connections insecure
  -k, --insecure-skip-verify           If true, the Argo Server's certificate will not be checked for validity. This will make your HTTPS connections insecure. Defaults to the ARGO_INSECURE_SKIP_VERIFY environment variable.
      --instanceid string              submit with a specific controller's instance id label. Default to the ARGO_INSTANCEID environment variable.
      --kubeconfig string              Path to a kube config. Only required if out-of-cluster
      --loglevel string                Set the logging level. One of: debug|info|warn|error (default "info")
  -n, --namespace string               If present, the namespace scope for this CLI request
      --password string                Password for basic authentication to the API server
      --proxy-url string               If provided, this URL will be used to connect via proxy
      --request-timeout string         The length of time to wait before giving up on a single server request. Non-zero values should contain a corresponding time unit (e.g. 1s, 2m, 3h). A value of zero means don't timeout requests. (default "0")
  -e, --secure                         Whether or not the server is using TLS with the Argo Server. Defaults to the ARGO_SECURE environment variable. (default true)
      --server string                  The address and port of the Kubernetes API server
      --tls-server-name string         If provided, this name will be used to validate server certificate. If this is not provided, hostname used to contact the server is used.
      --token string                   Bearer token for authentication to the API server
      --user string                    The name of the kubeconfig user to use
      --username string                Username for basic authentication to the API server
  -v, --verbose                        Enabled verbose logging, i.e. --loglevel debug

Use "argo [command] --help" for more information about a command.

Main commands

argo submit hello-world.yaml    # submit a workflow spec to Kubernetes
argo list                       # list current workflows
argo get hello-world-xxx        # get info about a specific workflow
argo logs hello-world-xxx       # print the logs from a workflow
argo delete hello-world-xxx     # delete workflow

argo submit

$ argo submit --help
submit a workflow

Usage:
  argo submit [FILE... | --from `kind/name] [flags]

Examples:
# Submit multiple workflows from files:

  argo submit my-wf.yaml

# Submit and wait for completion:

  argo submit --wait my-wf.yaml

# Submit and watch until completion:

  argo submit --watch my-wf.yaml

# Submit and tail logs until completion:

  argo submit --log my-wf.yaml

# Submit a single workflow from an existing resource

  argo submit --from cronwf/my-cron-wf

# Submit multiple workflows from stdin:

  cat my-wf.yaml | argo submit -


Flags:
      --dry-run                      modify the workflow on the client-side without creating it
      --entrypoint string            override entrypoint
      --from kind/name               Submit from an existing kind/name E.g., --from=cronwf/hello-world-cwf
      --generate-name string         override metadata.generateName
  -h, --help                         help for submit
  -l, --labels string                Comma separated labels to apply to the workflow. Will override previous values.
      --log                          log the workflow until it completes
      --name string                  override metadata.name
      --node-field-selector string   selector of node to display, eg: --node-field-selector phase=abc
  -o, --output string                Output format. One of: name|json|yaml|wide
  -p, --parameter stringArray        pass an input parameter
  -f, --parameter-file string        pass a file containing all input parameters
      --priority int32               workflow priority
      --scheduled-time string        Override the workflow's scheduledTime parameter (useful for backfilling). The time must be RFC3339
      --server-dry-run               send request to server with dry-run flag which will modify the workflow without creating it
      --serviceaccount string        run all pods in the workflow using specified serviceaccount
      --status string                Filter by status (Pending, Running, Succeeded, Skipped, Failed, Error). Should only be used with --watch.
      --strict                       perform strict workflow validation (default true)
  -w, --wait                         wait for the workflow to complete
      --watch                        watch the workflow until it completes

Global Flags:
      --argo-base-href string          Path to use with HTTP client due to Base HREF. Defaults to the ARGO_BASE_HREF environment variable.
      --argo-http1                     If true, use the HTTP client. Defaults to the ARGO_HTTP1 environment variable.
  -s, --argo-server host:port          API server host:port. e.g. localhost:2746. Defaults to the ARGO_SERVER environment variable.
      --as string                      Username to impersonate for the operation
      --as-group stringArray           Group to impersonate for the operation, this flag can be repeated to specify multiple groups.
      --as-uid string                  UID to impersonate for the operation
      --certificate-authority string   Path to a cert file for the certificate authority
      --client-certificate string      Path to a client certificate file for TLS
      --client-key string              Path to a client key file for TLS
      --cluster string                 The name of the kubeconfig cluster to use
      --context string                 The name of the kubeconfig context to use
      --disable-compression            If true, opt-out of response compression for all requests to the server
      --gloglevel int                  Set the glog logging level
  -H, --header strings                 Sets additional header to all requests made by Argo CLI. (Can be repeated multiple times to add multiple headers, also supports comma separated headers) Used only when either ARGO_HTTP1 or --argo-http1 is set to true.
      --insecure-skip-tls-verify       If true, the server's certificate will not be checked for validity. This will make your HTTPS connections insecure
  -k, --insecure-skip-verify           If true, the Argo Server's certificate will not be checked for validity. This will make your HTTPS connections insecure. Defaults to the ARGO_INSECURE_SKIP_VERIFY environment variable.
      --instanceid string              submit with a specific controller's instance id label. Default to the ARGO_INSTANCEID environment variable.
      --kubeconfig string              Path to a kube config. Only required if out-of-cluster
      --loglevel string                Set the logging level. One of: debug|info|warn|error (default "info")
  -n, --namespace string               If present, the namespace scope for this CLI request
      --password string                Password for basic authentication to the API server
      --proxy-url string               If provided, this URL will be used to connect via proxy
      --request-timeout string         The length of time to wait before giving up on a single server request. Non-zero values should contain a corresponding time unit (e.g. 1s, 2m, 3h). A value of zero means don't timeout requests. (default "0")
  -e, --secure                         Whether or not the server is using TLS with the Argo Server. Defaults to the ARGO_SECURE environment variable. (default true)
      --server string                  The address and port of the Kubernetes API server
      --tls-server-name string         If provided, this name will be used to validate server certificate. If this is not provided, hostname used to contact the server is used.
      --token string                   Bearer token for authentication to the API server
      --user string                    The name of the kubeconfig user to use
      --username string                Username for basic authentication to the API server
  -v, --verbose                        Enabled verbose logging, i.e. --loglevel debug

You can also run workflow specs directly with kubectl, but the Argo CLI provides syntax checking, nicer output, and requires less typing.

Usage

1. Create the file arguments-parameters.yaml

apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: hello-world-parameters-
spec:
  # invoke the print-message template with "hello world" as the argument to the message parameter
  entrypoint: print-message
  arguments:
    parameters:
    - name: message
      value: hello world

  templates:
  - name: print-message
    inputs:
      parameters:
      - name: message       # parameter declaration
    container:
      # run echo with that message input parameter as args
      image: busybox
      command: [echo]
      args: ["{{inputs.parameters.message}}"]

2. Submit it, overriding the message parameter: argo submit arguments-parameters.yaml -p message="goodbye world"
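
When there are several parameters, repeating -p becomes tedious. The submit help shown earlier lists a --parameter-file flag; the sketch below writes such a file, using the hypothetical file name params.yaml.

```shell
# Write a parameter file (the file name is an assumption).
# Each top-level key is the name of a workflow input parameter.
cat > params.yaml <<'EOF'
message: goodbye world
EOF

# Then, against a cluster:
#   argo submit arguments-parameters.yaml --parameter-file params.yaml
```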

When no service account is specified, the default account is used.
A service account can be specified with the --serviceaccount flag (run all pods in the workflow using specified serviceaccount).

$ argo submit arguments-parameters.yaml -p message="goodbye world"
Name:                hello-world-parameters-8smmg
Namespace:           default
ServiceAccount:      unset (will run with the default ServiceAccount)
Status:              Pending
Created:             Mon Mar 24 01:20:08 +0000 (now)
Progress:            
Parameters:          
  message:           goodbye world

Specifying the service account and namespace: argo submit arguments-parameters.yaml -n argo -p message="goodbye world" --serviceaccount argo-server

$ argo submit arguments-parameters.yaml -n argo -p message="goodbye world" --serviceaccount argo-server
Name:                hello-world-parameters-hdxgr
Namespace:           argo
ServiceAccount:      argo-server
Status:              Pending
Created:             Mon Mar 24 01:29:24 +0000 (now)
Progress:            
Parameters:          
  message:           goodbye world

Listing workflows

$ argo list -n argo
NAME                           STATUS      AGE   DURATION   PRIORITY   MESSAGE
hello-world-parameters-hdxgr   Error       2m    10s        0          Error (exit code 64): workflowtaskresults.argoproj.io is forbidden: User "system:serviceaccount:argo:argo-server" cannot create resource "workflowtaskresults" in API group "argoproj.io" in the namespace "argo"

Getting information about a specific workflow

$ argo get hello-world-parameters-hdxgr -n argo
Name:                hello-world-parameters-hdxgr
Namespace:           argo
ServiceAccount:      argo-server
Status:              Error
Message:             Error (exit code 64): workflowtaskresults.argoproj.io is forbidden: User "system:serviceaccount:argo:argo-server" cannot create resource "workflowtaskresults" in API group "argoproj.io" in the namespace "argo"
Conditions:          
 PodRunning          False
 Completed           True
Created:             Mon Mar 24 01:37:19 +0000 (4 minutes ago)
Started:             Mon Mar 24 01:37:04 +0000 (4 minutes ago)
Finished:            Mon Mar 24 01:37:14 +0000 (4 minutes ago)
Duration:            10 seconds
Progress:            0/1
ResourcesDuration:   0s*(1 cpu),5s*(100Mi memory)
Parameters:          
  message:           goodbye world

STEP                             TEMPLATE       PODNAME                       DURATION  MESSAGE
 ⚠ hello-world-parameters-hdxgr  print-message  hello-world-parameters-hdxgr  6s        Error (exit code 64): workflowtaskresults.argoproj.io is forbidden: User "system:serviceaccount:argo:argo-server" cannot create resource "workflowtaskresults" in API group "argoproj.io" in the namespace "argo

The error message, Error (exit code 64): workflowtaskresults.argoproj.io is forbidden: User "system:serviceaccount:argo:argo-server" cannot create resource "workflowtaskresults" in API group "argoproj.io" in the namespace "argo", shows that the argo-server service account in the argo namespace lacks a permission. Add workflowtaskresults to the resources of its role:
kubectl edit role argo-server-role -n argo -o yaml

- apiGroups:
  - argoproj.io
  resources:
  - eventsources
  - sensors
  - workflows
  - workfloweventbindings
  - workflowtemplates
  - workflowtaskresults   # added: permission on task results
  - cronworkflows
  - cronworkflows/finalizers

After a retry, fetching the workflow information again shows that the task succeeded.

$ argo get hello-world-parameters-hdxgr -n argo
Name:                hello-world-parameters-hdxgr
Namespace:           argo
ServiceAccount:      argo-server
Status:              Succeeded
Conditions:          
 PodRunning          False
 Completed           True
Created:             Mon Mar 24 01:37:19 +0000 (11 minutes ago)
Started:             Mon Mar 24 01:48:03 +0000 (26 seconds ago)
Finished:            Mon Mar 24 01:48:13 +0000 (16 seconds ago)
Duration:            10 seconds
Progress:            1/1
ResourcesDuration:   0s*(1 cpu),5s*(100Mi memory)
Parameters:          
  message:           goodbye world

STEP                             TEMPLATE       PODNAME                       DURATION  MESSAGE
 ✔ hello-world-parameters-hdxgr  print-message  hello-world-parameters-hdxgr  6s
$ argo list -n argo
NAME                           STATUS      AGE   DURATION   PRIORITY   MESSAGE
hello-world-parameters-hdxgr   Succeeded   32m   10s        0          
       
$ kubectl get pod -n argo
NAME                                   READY   STATUS      RESTARTS   AGE
hello-world-parameters-hdxgr           0/2     Completed   0          21m

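After the workflow finishes, its pod remains in the Completed state. How can the pods that belong to a workflow be deleted automatically once the workflow completes?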

https://argo-workflows.readthedocs.io/en/latest/cost-optimisation/#limit-the-total-number-of-workflows-and-pods

Pod GC strategy must be one of the following:

  • OnPodCompletion - delete pods immediately when pod is completed (including errors/failures)
  • OnPodSuccess - delete pods immediately when pod is successful
  • OnWorkflowCompletion - delete pods when workflow is completed
  • OnWorkflowSuccess - delete pods when workflow is successful

$ kubectl edit configmap workflow-controller-configmap -n argo

apiVersion: v1
kind: ConfigMap
metadata:
  name: workflow-controller-configmap
data:
  workflowDefaults: |
    spec:
      # must complete in 8h (28,800 seconds)
      activeDeadlineSeconds: 28800
      # keep workflows for 1d (86,400 seconds);
      # use 604800 for 1 week
      ttlStrategy:
        secondsAfterCompletion: 86400
      # delete all pods as soon as the workflow completes
      podGC:
        strategy: OnWorkflowCompletion
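
The controller-level workflowDefaults apply to every workflow. The same podGC and ttlStrategy fields can also be set on an individual Workflow spec when only some workflows should be cleaned up differently. A minimal sketch, assuming a hypothetical file name hello-podgc.yaml and a one-hour retention chosen purely as an example:

```shell
# Write a hypothetical workflow that configures its own pod cleanup.
cat > hello-podgc.yaml <<'EOF'
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: hello-podgc-
spec:
  entrypoint: main
  # delete each pod as soon as it succeeds; failed pods are kept for debugging
  podGC:
    strategy: OnPodSuccess
  # delete the Workflow object itself one hour after completion (example value)
  ttlStrategy:
    secondsAfterCompletion: 3600
  templates:
  - name: main
    container:
      image: busybox
      command: [echo]
      args: ["hello"]
EOF
```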

Access Token

https://argo-workflows.readthedocs.io/en/latest/access-token/

1. Create a role with minimal permissions.

# This example role only allows jenkins to update and list workflows:
kubectl -n jenkins create role jenkins-role --verb=list,update --resource=workflows.argoproj.io

2. Create a service account for the service

kubectl -n jenkins create sa jenkins

3. Bind the service account to the role

kubectl -n jenkins create rolebinding jenkins-rolebinding --role=jenkins-role --serviceaccount=jenkins:jenkins

4. Create a secret to hold the token

kubectl -n jenkins apply -f - <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: jenkins.service-account-token
  annotations:
    kubernetes.io/service-account.name: jenkins
type: kubernetes.io/service-account-token
EOF

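View the token (the secret lives in the jenkins namespace, so the namespace is passed explicitly):

ARGO_TOKEN="Bearer $(kubectl -n jenkins get secret jenkins.service-account-token -o=jsonpath='{.data.token}' | base64 --decode)"
echo "$ARGO_TOKEN"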

5. Using and testing the token

  • To use the token with the CLI, you need to set ARGO_SERVER.
  • To use the token with the API:
curl https://localhost:2746/api/v1/workflows/argo -H "Authorization: $ARGO_TOKEN"
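
For the CLI case, the environment variables described in the argo help output above can be combined roughly as follows. This is a sketch under assumptions: the server address reuses the NodePort example from earlier in this article, the namespace is the jenkins namespace the role was created in, and ARGO_TOKEN is assumed to have been set already as shown in step 4.

```shell
# Assumed address: the NodePort example used earlier; replace with your own.
export ARGO_SERVER='192.168.59.59:32746'   # host:port only, no http:// or https:// prefix
export ARGO_SECURE=true                    # the server is serving HTTPS
export ARGO_INSECURE_SKIP_VERIFY=true      # self-signed certificate; do not use in production
export ARGO_NAMESPACE=jenkins              # namespace the token's role applies to
export KUBECONFIG=/dev/null                # keep the CLI from taking defaults from kubeconfig

# With ARGO_TOKEN exported as well, the CLI talks to the Argo Server, for example:
#   argo list
```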

Problem 1: pulling the argoexec image fails

Back-off pulling image "quay.io/argoproj/argoexec:v3.5.10"

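By default, argoexec is pulled from quay.io/argoproj/argoexec:<version>. You can change the executor setting in the Argo Workflows workflow-controller-configmap so that the image is pulled from a privately hosted registry instead, which shortens image pull time and lets pods start faster.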

apiVersion: v1
data:
  executor: |
    imagePullPolicy: IfNotPresent
    image: harbor.com/quay.io/argoproj/argoexec:v3.5.5
    resources:
      requests:
        cpu: 10m
        memory: 64Mi
kind: ConfigMap
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"v1","kind":"ConfigMap","metadata":{"annotations":{},"name":"workflow-controller-configmap","namespace":"argo"}}
  creationTimestamp: "2025-03-20T02:31:19Z"
  name: workflow-controller-configmap
  namespace: argo
  resourceVersion: "298627"

References:
https://github.com/argoproj/argo-workflows/
https://argo-workflows.readthedocs.io/en/latest/
https://zhuanlan.zhihu.com/p/441580936
