Toil validation notes

Reply to Toil validation notes on Thu, 09 Mar 2023 10:49:34 GMT

anneng — Thu, 09 Mar 2023 10:49:34 GMT

Toil server mode

docker run -d --name wes-rabbitmq -p 5672:5672 rabbitmq:3.9.5
celery -A toil.server.celery_app worker --loglevel=INFO
toil server

curl --location --request POST 'http://localhost:8000/ga4gh/wes/v1/runs' \
    --user test:test \
    --form 'workflow_url="example.cwl"' \
    --form 'workflow_type="cwl"' \
    --form 'workflow_type_version="v1.0"' \
    --form 'workflow_params="{\"message\": \"Hello world!\"}"' \
    --form 'workflow_attachment=@"./example.cwl"'
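
The nested quotes in workflow_params are easy to get wrong by hand. As a minimal sketch, the text fields of the same request can be built in Python so that the JSON is always valid. The helper name build_wes_form is hypothetical and not part of Toil; the field names and values are the ones used in the curl command above.

```python
import json


def build_wes_form(message: str) -> dict:
    """Return the text form fields for the WES run request (hypothetical helper)."""
    return {
        "workflow_url": "example.cwl",
        "workflow_type": "cwl",
        "workflow_type_version": "v1.0",
        # json.dumps emits valid JSON, so no manual quote escaping is needed.
        "workflow_params": json.dumps({"message": message}),
    }


fields = build_wes_form("Hello world!")
print(fields["workflow_params"])  # {"message": "Hello world!"}
```

These fields, together with example.cwl as the workflow_attachment file, could then be sent as multipart form data by any HTTP client.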

=========== metrics-server is required ===========
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
To uninstall metrics-server:
kubectl delete -f components.yaml

Following https://stackoverflow.com/questions/71843068/metrics-server-is-currently-unable-to-handle-the-request, add --kubelet-insecure-tls to the metrics-server container args in components.yaml:

labels:
    k8s-app: metrics-server
spec:
  containers:
  - args:
    - --cert-dir=/tmp
    - --secure-port=443
    - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
    - --kubelet-use-node-status-port
    - --metric-resolution=15s
    - --kubelet-insecure-tls  # add this line

kubectl apply -f components.yaml
Otherwise the metrics API service reports the following error:
kubectl get apiservice v1beta1.metrics.k8s.io
v1beta1.metrics.k8s.io kube-system/metrics-server False (MissingEndpoints) 44m
Test:
kubectl top nodes
NAME CPU(cores) CPU% MEMORY(bytes) MEMORY%
node1 1076m 13% 17670Mi 75%
node2 1295m 16% 10048Mi 65%
node3 1168m 14% 14871Mi 63%
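
To check this output from a script rather than by eye, the table can be parsed into one record per node. This is only a sketch: parse_top_nodes is a hypothetical helper, the sample text is the output recorded above, and the 70% memory threshold is an arbitrary value chosen for illustration.

```python
from typing import Dict, List


def parse_top_nodes(output: str) -> List[Dict[str, str]]:
    """Turn `kubectl top nodes` text into one dict per node, keyed by column name."""
    lines = [line.split() for line in output.strip().splitlines()]
    header, rows = lines[0], lines[1:]
    return [dict(zip(header, row)) for row in rows]


sample = """\
NAME CPU(cores) CPU% MEMORY(bytes) MEMORY%
node1 1076m 13% 17670Mi 75%
node2 1295m 16% 10048Mi 65%
node3 1168m 14% 14871Mi 63%
"""

nodes = parse_top_nodes(sample)
# Hypothetical threshold: list nodes using at least 70% of their memory.
busy = [n["NAME"] for n in nodes if int(n["MEMORY%"].rstrip("%")) >= 70]
print(busy)  # ['node1']
```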

============ AppArmor may interfere, so stop and disable the service (in production, configure it according to https://github.com/adamnovak/gi-kubernetes-autoscaling-config/blob/e1350ac9ad17d94b5073b20db3c75620957926e3/kubenode.ubuntu.cloud-config.yaml#L27-L67 instead) =====
sudo systemctl stop apparmor.service
sudo systemctl disable apparmor.service

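When Toil server launches toil-cwl-runner, the run fails, possibly because the environment variables are not passed through to the runner. Submitting directly with the commands below succeeds: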
export TOIL_WORKDIR=/cephfs_data/toil
export TOIL_KUBERNETES_HOST_PATH=/cephfs_data/toil
toil-cwl-runner --writeMessages=/cephfs_data/toil/run-6aef556521e1460e94b0557ce848f49e/bus_messages --batchSystem=kubernetes --workDir=/cephfs_data/toil --clean=always --outdir=/cephfs_data/toil/run-6aef556521e1460e94b0557ce848f49e/outputs --jobStore=/cephfs_data/toil/run-6aef556521e1460e94b0557ce848f49e/toil_job_store /cephfs_data/toil/run-6aef556521e1460e94b0557ce848f49e/execution/example.cwl /cephfs_data/toil/run-6aef556521e1460e94b0557ce848f49e/execution/wes_inputs.json

cat /cephfs_data/toil/run-6aef556521e1460e94b0557ce848f49e/outputs/output.txt
Hello world!
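
If the missing environment variables are indeed the cause, one way to rule them out is to start the runner from a script that sets them explicitly. The sketch below only assembles the command and environment shown above. build_runner_invocation is a hypothetical helper, and the assumption that both variables should point at the work directory comes from these notes, not from Toil itself.

```python
import os


def build_runner_invocation(run_dir: str, work_dir: str):
    """Return (command, environment) for a direct toil-cwl-runner submission."""
    env = dict(os.environ)
    # Set the variables explicitly instead of relying on the parent shell.
    env["TOIL_WORKDIR"] = work_dir
    env["TOIL_KUBERNETES_HOST_PATH"] = work_dir
    cmd = [
        "toil-cwl-runner",
        f"--writeMessages={run_dir}/bus_messages",
        "--batchSystem=kubernetes",
        f"--workDir={work_dir}",
        "--clean=always",
        f"--outdir={run_dir}/outputs",
        f"--jobStore={run_dir}/toil_job_store",
        f"{run_dir}/execution/example.cwl",
        f"{run_dir}/execution/wes_inputs.json",
    ]
    return cmd, env


cmd, env = build_runner_invocation(
    "/cephfs_data/toil/run-6aef556521e1460e94b0557ce848f49e",
    "/cephfs_data/toil",
)
print(" ".join(cmd))
```

To actually launch the workflow, the pair can be passed to subprocess.run(cmd, env=env, check=True).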

For production, we still need to decide whether to use hostPath or a PV (PersistentVolume).

Reply to Toil validation notes on Thu, 09 Mar 2023 08:06:33 GMT

anneng — Thu, 09 Mar 2023 08:06:33 GMT

AWS support for Toil:
https://aws.github.io/amazon-genomics-cli/docs/workflow-engines/toil/

Reply to Toil validation notes on Tue, 07 Mar 2023 03:12:23 GMT

anneng — Tue, 07 Mar 2023 03:12:23 GMT

A case study: Rapid and efficient analysis of 20,000 RNA-seq samples with Toil
https://www.researchgate.net/publication/345904527_Rapid_and_efficient_analysis_of_20000_RNA-seq_samples_with_Toil

Reply to Toil validation notes on Wed, 08 Mar 2023 09:04:16 GMT

anneng — Wed, 08 Mar 2023 09:04:16 GMT

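Toil on Kubernetes, deployment option 2: run the Toil leader outside the Kubernetes cluster
This is mainly for development and testing, and it currently requires that the local machine can reach AWS.
Real-time logging will not work unless the local machine is reachable from the Internet. The Toil documentation states: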
Real time logging will not work unless your local machine is able to listen for incoming UDP packets on arbitrary ports on the address it uses to contact the IPv4 Internet; Toil does no NAT traversal or detection.

$ export TOIL_KUBERNETES_OWNER=demo-user  # This defaults to your local username if not set
$ export TOIL_AWS_SECRET_NAME=aws-credentials
$ export TOIL_KUBERNETES_HOST_PATH=/data/scratch
$ virtualenv --python python3 --system-site-packages venv
$ . venv/bin/activate
$ wget https://raw.githubusercontent.com/DataBiosphere/toil/releases/4.1.0/src/toil/test/docs/scripts/tutorial_helloworld.py
$ python3 tutorial_helloworld.py \
      aws:us-west-2:demouser-toil-test-jobstore \
      --batchSystem kubernetes \
      --realTimeLogging \
      --logInfo

If the run fails with ModuleNotFoundError: No module named 'boto', install the AWS dependencies:
pip install boto botocore boto3 mypy_boto3_s3

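Next attempt: launch the job on Kubernetes with a file job store. The job is dispatched to minikube but the pod fails to start, because Toil still tries to mount the AWS credentials volume:
python3 tutorial_helloworld.py file:job-store --batchSystem kubernetes --realTimeLogging --logInfo
MountVolume.SetUp failed for volume "s3-credentials" : secret "aws-credentials" not found
Fix: do not set TOIL_AWS_SECRET_NAME (skip export TOIL_AWS_SECRET_NAME=aws-credentials).

Verified working setup: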

export TOIL_KUBERNETES_HOST_PATH=/home/jynlix/Downloads/src/toil/data
export TOIL_WORKDIR=/home/jynlix/Downloads/src/toil/data
minikube mount /home/jynlix/Downloads/src/toil/data:/home/jynlix/Downloads/src/toil/data
python3 -m pdb tutorial_helloworld.py file:job-store  --batchSystem kubernetes --realTimeLogging --logInfo
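
The secret-mount failure above is easy to reintroduce from a stale shell session, so a small pre-flight check can help. This is a sketch only: check_job_store_env is a hypothetical helper, and the list of expected variables simply mirrors the setup recorded in these notes rather than a requirement documented by Toil.

```python
from typing import Dict, List


def check_job_store_env(job_store: str, env: Dict[str, str]) -> List[str]:
    """Return a list of likely misconfigurations for a Kubernetes run."""
    problems = []
    # With a file job store there is no AWS secret to mount into the pods.
    if job_store.startswith("file:") and "TOIL_AWS_SECRET_NAME" in env:
        problems.append("unset TOIL_AWS_SECRET_NAME when using a file job store")
    # The setup recorded in these notes exports both of these variables.
    for name in ("TOIL_KUBERNETES_HOST_PATH", "TOIL_WORKDIR"):
        if name not in env:
            problems.append(f"{name} is not set")
    return problems


stale_env = {
    "TOIL_AWS_SECRET_NAME": "aws-credentials",
    "TOIL_KUBERNETES_HOST_PATH": "/data/scratch",
}
for problem in check_job_store_env("file:job-store", stale_env):
    print(problem)
```

Passing os.environ as the second argument runs the same check against the current shell environment.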

Reply to Toil validation notes on Thu, 09 Mar 2023 08:25:36 GMT

anneng — Thu, 09 Mar 2023 08:25:36 GMT

Toil on Kubernetes, deployment option 1: run the Toil leader inside the Kubernetes cluster

apiVersion: batch/v1
kind: Job
metadata:
  # It is good practice to include your username in your job name.
  # Also specify it in TOIL_KUBERNETES_OWNER
  name: demo-user-toil-test
spec:
 # Do not try and rerun the leader job if it fails
 backoffLimit: 0
 template:
   spec:
     # Do not restart the pod when the job fails, but keep it around so the
     # log can be retrieved
     restartPolicy: Never
     volumes:
     - name: aws-credentials-vol
       secret:
         # Make sure the AWS credentials are available as a volume.
         # This should match TOIL_AWS_SECRET_NAME
         secretName: aws-credentials
     # You may need to replace this with a different service account name as
     # appropriate for your cluster.
     serviceAccountName: default
     containers:
     - name: main
       image: quay.io/ucsc_cgl/toil:5.5.0
       env:
       # Specify your username for inclusion in job names
       - name: TOIL_KUBERNETES_OWNER
         value: demo-user
       # Specify where to find the AWS credentials to access the job store with
       - name: TOIL_AWS_SECRET_NAME
         value: aws-credentials
       # Specify where per-host caches should be stored, on the Kubernetes hosts.
       # Needs to be set for Toil's caching to be efficient.
       - name: TOIL_KUBERNETES_HOST_PATH
         value: /data/scratch
       volumeMounts:
       # Mount the AWS credentials volume
       - mountPath: /root/.aws
         name: aws-credentials-vol
       resources:
         # Make sure to set these resource limits to values large enough
         # to accommodate the work your workflow does in the leader
         # process, but small enough to fit on your cluster.
         #
         # Since no request values are specified, the limits are also used
         # for the requests.
         limits:
           cpu: 2
           memory: "4Gi"
           ephemeral-storage: "10Gi"
       command:
       - /bin/bash
       - -c
       - |
         # This Bash script will set up Toil and the workflow to run, and run them.
         set -e
         # We make sure to create a work directory; Toil can't hot-deploy a
         # script from the root of the filesystem, which is where we start.
         mkdir /tmp/work
         cd /tmp/work
         # We make a virtual environment to allow workflow dependencies to be
         # hot-deployed.
         #
         # We don't really make use of it in this example, but for workflows
         # that depend on PyPI packages we will need this.
         #
         # We use --system-site-packages so that the Toil installed in the
         # appliance image is still available.
         virtualenv --python python3 --system-site-packages venv
         . venv/bin/activate
         # Now we install the workflow. Here we're using a demo workflow
         # script from Toil itself.
         wget https://raw.githubusercontent.com/DataBiosphere/toil/releases/4.1.0/src/toil/test/docs/scripts/tutorial_helloworld.py
         # Now we run the workflow. We make sure to use the Kubernetes batch
         # system and an AWS job store, and we set some generally useful
         # logging options. We also make sure to enable caching.
         python3 tutorial_helloworld.py \
             aws:us-west-2:demouser-toil-test-jobstore \
             --batchSystem kubernetes \
             --realTimeLogging \
             --logInfo

kubectl apply -f leader.yaml

Note:
Note that the leader pod will need your workflow script, its other dependencies, and Toil all installed. An easy way to get Toil installed is to start with the Toil appliance image for the version of Toil you want to use. In this example, we use quay.io/ucsc_cgl/toil:5.5.0.
In other words, this mode requires the workflow script and Toil itself to be packaged into the container image.

Reply to Toil validation notes on Fri, 03 Mar 2023 03:49:16 GMT

anneng — Fri, 03 Mar 2023 03:49:16 GMT

Additional minikube reference:
https://www.zhaowenyu.com/minikube-doc/ops/minikube.html