The publishing-bot for the Kubernetes project runs in the publishing-bot namespace on a CNCF-sponsored GKE cluster named aaa in the kubernetes-public project.
If you need access to any of the following, please update groups.yaml.
publishing-bot runs in a GKE cluster named aaa in the kubernetes-public project.
The cluster can be accessed by [email protected]. To access the cluster, please see these instructions.
Publishing-bot images can be pushed by [email protected].
Make sure you are at the root of the publishing-bot repo before running these commands.
This script needs to be run whenever a new staging repo is added to kubernetes/kubernetes:
hack/fetch-all-latest-and-push.sh kubernetes
make validate build-image push-image deploy CONFIG=configs/kubernetes
You can use Activate Cloud Shell in the GCP console and, in that shell, run the following command:
gcloud container clusters get-credentials aaa --region us-central1 --project kubernetes-public
Then run kubectl commands to confirm you can see what's running in the cluster.
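As a rough sketch of that check, the function below compares the current kubectl context with the name gcloud normally generates for a GKE cluster (gke_<project>_<location>_<cluster>) and then lists the pods in the publishing-bot namespace. The function name check_publishing_bot_access and the EXPECTED_CONTEXT variable are illustrative and not part of the repo; the context name assumes the default naming was not overridden.

```shell
# Context name gcloud is assumed to create for the aaa cluster:
# gke_<project>_<location>_<cluster>.
EXPECTED_CONTEXT="gke_kubernetes-public_us-central1_aaa"

# Hypothetical helper: fail early if kubectl points somewhere else.
check_publishing_bot_access() {
  local current
  current="$(kubectl config current-context)" || return 1
  if [ "$current" != "$EXPECTED_CONTEXT" ]; then
    echo "unexpected context: $current (want $EXPECTED_CONTEXT)" >&2
    return 1
  fi
  # Listing pods confirms the credentials actually grant access.
  kubectl get pods -n publishing-bot
}
```

Call check_publishing_bot_access after the get-credentials command; a non-zero exit status means either the context or the permissions need another look.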
The publishing-bot runs in a separate Kubernetes namespace of the same name in the aaa cluster.
The manifests here have the definitions for these Kubernetes resources. Example below:
davanum@cloudshell:~ (kubernetes-public)$ kubectl get pv,pvc,replicaset,pod -n publishing-bot
NAME                                                        CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                             STORAGECLASS   REASON   AGE
persistentvolume/pvc-084a4d52-0a57-4f70-a76a-5d2d2667429d   100Gi      RWO            Delete           Bound    publishing-bot/publisher-gopath   ssd                     8h

NAME                                     STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
persistentvolumeclaim/publisher-gopath   Bound    pvc-084a4d52-0a57-4f70-a76a-5d2d2667429d   100Gi      RWO            ssd            8h

NAME                        DESIRED   CURRENT   READY   AGE
replicaset.apps/publisher   1         1         1       45d

NAME                  READY   STATUS    RESTARTS   AGE
pod/publisher-cdvwj   1/1     Running   0          9h
Follow Kubernetes issue #56876. When the bot fails, it re-opens this issue with a fresh log, so if you are subscribed to the issue you will see it re-open on every failure.
You can stream the logs of the pod to see what the publishing-bot is doing:
kubectl -n publishing-bot logs pod/publisher-cdvwj -f
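The pod name suffix (cdvwj above) changes every time the pod is recreated. A minimal sketch for looking the name up instead of hard-coding it follows; it assumes the publisher pod is the only pod in the namespace, as in the example listing, and the function names publisher_pod and stream_publisher_logs are illustrative.

```shell
# Hypothetical helper: print the first pod in the namespace,
# in the form pod/<name>.
publisher_pod() {
  kubectl -n publishing-bot get pods -o name | head -n 1
}

# Hypothetical helper: follow the logs of that pod.
stream_publisher_logs() {
  local pod
  pod="$(publisher_pod)"
  if [ -z "$pod" ]; then
    echo "no pod found in the publishing-bot namespace" >&2
    return 1
  fi
  kubectl -n publishing-bot logs "$pod" -f
}
```

Running stream_publisher_logs then behaves like the command above, with the current pod name filled in.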
To do its work, the publishing-bot has to download all the repositories and perform git surgery on them. It keeps the downloaded copies around and re-uses them. For example, if the pod gets killed, the new pod can still work off the downloaded git repositories on the persistent volume. Occasionally, if we suspect the downloaded git repos are corrupted for some reason (say, GitHub flakiness), we may have to clean up the PV/PVC. In other words, the volume is cache only. Wiping it is not harmful in general, other than the time it takes to recreate all the data.
Step 1: Scale down the replicaset
kubectl scale -n publishing-bot --replicas=0 replicaset publisher
Step 2: Delete the PVC
kubectl delete -n publishing-bot persistentvolumeclaim/publisher-gopath
Step 3: Make sure the PVC is deleted and removed from the namespace
kubectl get -n publishing-bot pvc
This should not list any PVCs.
Step 4: Re-create the PVC
kubectl apply -n publishing-bot -f artifacts/manifests/pvc.yaml
Step 5: Scale up the replicaset
kubectl scale -n publishing-bot --replicas=1 replicaset publisher
Step 6: Watch the pod start back up from Pending
kubectl -n publishing-bot get pods
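The six steps above can be strung together roughly as follows. This is a sketch, not a script shipped with the repo: the function name reset_publisher_cache and the variables are illustrative, it assumes kubectl already points at the aaa cluster and that it is run from the root of the publishing-bot repo, and it stops at the first failing step. The --wait=true flag on kubectl delete (the default) makes the delete block until the PVC is gone.

```shell
NAMESPACE="publishing-bot"
PVC="publisher-gopath"
PVC_MANIFEST="artifacts/manifests/pvc.yaml"

# Hypothetical helper that runs Steps 1-6 in order.
reset_publisher_cache() {
  local remaining
  # Step 1: stop the publisher so nothing is using the volume.
  kubectl scale -n "$NAMESPACE" --replicas=0 replicaset publisher || return 1
  # Step 2: delete the PVC and wait for it to disappear.
  kubectl delete -n "$NAMESPACE" "persistentvolumeclaim/$PVC" --wait=true || return 1
  # Step 3: refuse to continue if any PVC is still listed.
  remaining="$(kubectl get -n "$NAMESPACE" pvc -o name)" || return 1
  if [ -n "$remaining" ]; then
    echo "PVCs still present: $remaining" >&2
    return 1
  fi
  # Step 4: re-create the PVC from the manifest.
  kubectl apply -n "$NAMESPACE" -f "$PVC_MANIFEST" || return 1
  # Step 5: bring the publisher back.
  kubectl scale -n "$NAMESPACE" --replicas=1 replicaset publisher || return 1
  # Step 6: show the pod, which should move on from Pending.
  kubectl -n "$NAMESPACE" get pods
}
```

Call reset_publisher_cache once the function is defined. Because the volume is cache only, the first run after the reset takes longer while the repositories are downloaded again.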