
Tag: kubernetes

AKS + Private Link Service + Private Endpoint

This walkthrough shows how to set up a Private Link Service (PLS) with an AKS cluster and create a Private Endpoint (PE) in a separate VNet. While many tutorials give you a full ARM template, this walkthrough uses only the CLI so you can understand what's happening at every step of the process. It focuses on an "uninteresting" workload and uses podinfo as the sample app, because it's easy to deploy and customize with a sample Helm chart. It is inspired by, and leans heavily on, the Azure Docs for creating a Private Link Service.

Architecture

[Diagram: Private Link Endpoint Service]

Prerequisites

- Azure CLI
- jq

Assumptions

This walkthrough assumes you let Azure create the VNet when creating the AKS cluster. If you created the VNet manually, the general steps are the same, except you must set the AKS_MC_VNET and AKS_MC_SUBNET env vars yourself.

Setup Steps

First, create a sample AKS cluster and install podinfo on it.

Set these values:

```bash
AKS_NAME=
AKS_RG=
LOCATION=
```

Create the AKS cluster:

```bash
az aks create -n $AKS_NAME -g $AKS_RG
```

Get the MC resource group:

```bash
AKS_MC_RG=$(az aks show -n $AKS_NAME -g $AKS_RG | jq -r '.nodeResourceGroup')
echo $AKS_MC_RG
```

Get the VNet and subnet names:

```bash
AKS_MC_VNET=$(az network vnet list -g $AKS_MC_RG | jq -r '.[0].name')
echo $AKS_MC_VNET

AKS_MC_SUBNET=$(az network vnet subnet list -g $AKS_MC_RG --vnet-name $AKS_MC_VNET | jq -r '.[0].name')
echo $AKS_MC_SUBNET
```

Deploy a sample app using an Internal LB (this assumes the podinfo Helm repository has already been added, e.g. `helm repo add podinfo https://stefanprodan.github.io/podinfo`):

```bash
helm upgrade --install --wait podinfo-internal-lb \
  --set-string service.annotations."service\.beta\.kubernetes\.io\/azure-load-balancer-internal"=true \
  --set service.type=LoadBalancer \
  --set ui.message=podinfo-internal-lb \
  podinfo/podinfo
```

The kubernetes-internal load balancer (and the rule whose frontend IP configuration we need) only exists once an internal LoadBalancer Service has been provisioned, so look it up after the sample app is deployed:

```bash
AKS_MC_LB_INTERNAL=kubernetes-internal

AKS_MC_LB_INTERNAL_FE_CONFIG=$(az network lb rule list -g $AKS_MC_RG --lb-name=$AKS_MC_LB_INTERNAL | jq -r '.[0].frontendIpConfiguration.id')
echo $AKS_MC_LB_INTERNAL_FE_CONFIG
```

Install Steps - Create the Private Link Service

These steps are done in the MC_ resource group.

Disable the private link service network policies:

```bash
az network vnet subnet update \
  --name $AKS_MC_SUBNET \
  --resource-group $AKS_MC_RG \
  --vnet-name $AKS_MC_VNET \
  --disable-private-link-service-network-policies true
```

Create the PLS:

```bash
PLS_NAME=aks-pls

az network private-link-service create \
  --resource-group $AKS_MC_RG \
  --name $PLS_NAME \
  --vnet-name $AKS_MC_VNET \
  --subnet $AKS_MC_SUBNET \
  --lb-name $AKS_MC_LB_INTERNAL \
  --lb-frontend-ip-configs $AKS_MC_LB_INTERNAL_FE_CONFIG
```
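The Private Endpoint created in the next section connects to the PLS by its Azure resource ID, so capture that ID now (the same lookup is used again in the Multiple PLS/PE section):

```bash
# Look up the resource ID of the PLS we just created.
PLS_ID=$(az network private-link-service show \
  --name $PLS_NAME \
  --resource-group $AKS_MC_RG \
  --query id \
  --output tsv)
echo $PLS_ID
```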
Install Steps - Create the Private Endpoint

These steps are done in our private-endpoint-rg resource group.

Create the resource group and VNet:

```bash
PE_RG=private-endpoint-rg

az group create \
  --name $PE_RG \
  --location $LOCATION

PE_VNET=pe-vnet
PE_SUBNET=pe-subnet

az network vnet create \
  --resource-group $PE_RG \
  --name $PE_VNET \
  --address-prefixes 10.0.0.0/16 \
  --subnet-name $PE_SUBNET \
  --subnet-prefixes 10.0.0.0/24
```

Disable the private endpoint network policies:

```bash
az network vnet subnet update \
  --name $PE_SUBNET \
  --resource-group $PE_RG \
  --vnet-name $PE_VNET \
  --disable-private-endpoint-network-policies true
```

Create the Private Endpoint, connected to the PLS:

```bash
PE_CONN_NAME=pe-conn
PE_NAME=pe

az network private-endpoint create \
  --connection-name $PE_CONN_NAME \
  --name $PE_NAME \
  --private-connection-resource-id $PLS_ID \
  --resource-group $PE_RG \
  --subnet $PE_SUBNET \
  --manual-request false \
  --vnet-name $PE_VNET
```

We need the NIC ID to get the newly created private IP:

```bash
PE_NIC_ID=$(az network private-endpoint show -g $PE_RG --name $PE_NAME -o json | jq -r '.networkInterfaces[0].id')
echo $PE_NIC_ID
```

Get the private IP from the NIC:

```bash
PE_IP=$(az network nic show --ids $PE_NIC_ID -o json | jq -r '.ipConfigurations[0].privateIpAddress')
echo $PE_IP
```

Validation Steps - Create a VM

Lastly, validate that this works by creating a VM in the VNet with the Private Endpoint.

```bash
VM_NAME=ubuntu

az vm create \
  --resource-group $PE_RG \
  --name ubuntu \
  --image UbuntuLTS \
  --public-ip-sku Standard \
  --vnet-name $PE_VNET \
  --subnet $PE_SUBNET \
  --admin-username $USER \
  --ssh-key-values ~/.ssh/id_rsa.pub

VM_PIP=$(az vm list-ip-addresses -g $PE_RG -n $VM_NAME | jq -r '.[0].virtualMachine.network.publicIpAddresses[0].ipAddress')
echo $VM_PIP
```

SSH into the host:

```bash
ssh $VM_PIP
```

From the VM, curl the Private Endpoint's IP on podinfo's port (9898):

```bash
curl COPY_THE_VALUE_FROM_PE_IP:9898
```

The output should look like:

```
$ curl 10.0.0.5:9898
{
  "hostname": "podinfo-6ff68cbf88-cxcvv",
  "version": "6.0.3",
  "revision": "",
  "color": "#34577c",
  "logo": "/images/2022/cuddle_clap.gif",
  "message": "podinfo-internal-lb",
  "goos": "linux",
  "goarch": "amd64",
  "runtime": "go1.16.9",
  "num_goroutine": "9",
  "num_cpu": "2"
}
```

Multiple PLS/PE

To test a specific use case, I wanted to create multiple PLSs and PEs. This set of instructions lets you easily loop through and create multiple instances, as sketched below.
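A minimal sketch of what that loop might look like, using the per-port helm install from the next section (the PLS/PE commands can be added inside the same loop body; ports 9000-9002 are just example values):

```bash
# Create one podinfo release per port; append the PLS/PE commands from the
# next section inside the loop to create a matching PLS and PE for each one.
for SUFFIX in 9000 9001 9002; do
  helm upgrade --install --wait podinfo-$SUFFIX \
    --set-string service.annotations."service\.beta\.kubernetes\.io\/azure-load-balancer-internal"=true \
    --set service.type=LoadBalancer \
    --set service.httpPort=$SUFFIX \
    --set service.externalPort=$SUFFIX \
    --set ui.message=podinfo-$SUFFIX \
    podinfo/podinfo
done
```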
podinfo requires a high-numbered port, e.g. 9000+:

```bash
SUFFIX=9000

helm upgrade --install --wait podinfo-$SUFFIX \
  --set-string service.annotations."service\.beta\.kubernetes\.io\/azure-load-balancer-internal"=true \
  --set service.type=LoadBalancer \
  --set service.httpPort=$SUFFIX \
  --set service.externalPort=$SUFFIX \
  --set ui.message=podinfo-$SUFFIX \
  podinfo/podinfo
```

This might be easier to hard-code:

```bash
AKS_MC_LB_INTERNAL_FE_CONFIG=$(az network lb rule list -g $AKS_MC_RG --lb-name=$AKS_MC_LB_INTERNAL -o json | jq -r ".[] | select( .backendPort == $SUFFIX) | .frontendIpConfiguration.id")
echo $AKS_MC_LB_INTERNAL_FE_CONFIG
```

Create the PLS and PE for this instance:

```bash
PLS_NAME=aks-pls-$SUFFIX
PE_CONN_NAME=pe-conn-$SUFFIX
PE_NAME=pe-$SUFFIX

az network private-link-service create \
  --resource-group $AKS_MC_RG \
  --name $PLS_NAME \
  --vnet-name $AKS_MC_VNET \
  --subnet $AKS_MC_SUBNET \
  --lb-name $AKS_MC_LB_INTERNAL \
  --lb-frontend-ip-configs $AKS_MC_LB_INTERNAL_FE_CONFIG

PLS_ID=$(az network private-link-service show \
  --name $PLS_NAME \
  --resource-group $AKS_MC_RG \
  --query id \
  --output tsv)
echo $PLS_ID

az network private-endpoint create \
  --connection-name $PE_CONN_NAME \
  --name $PE_NAME \
  --private-connection-resource-id $PLS_ID \
  --resource-group $PE_RG \
  --subnet $PE_SUBNET \
  --manual-request false \
  --vnet-name $PE_VNET

PE_NIC_ID=$(az network private-endpoint show -g $PE_RG --name $PE_NAME -o json | jq -r '.networkInterfaces[0].id')
echo $PE_NIC_ID

PE_IP=$(az network nic show --ids $PE_NIC_ID -o json | jq -r '.ipConfigurations[0].privateIpAddress')
echo $PE_IP

echo "From your Private Endpoint VM run: curl $PE_IP:$SUFFIX"
```

I created this article to help myself (and hopefully you!) clearly understand all of the resources involved and how they interact to create a Private Link Service and Private Endpoint fronting a private service inside an AKS cluster. It has been highly enlightening for me, and I hope it has been for you too.

· 5 min read

When, How and Where to use ClusterAPI (CAPI) and ClusterAPI for Azure (CAPZ)

This article explains why, when, and how to use self-managed Kubernetes clusters in Azure for testing custom scenarios.

Kubernetes has gotten so large and complex that most companies prefer to use a managed service (e.g. AKS, GKE) instead of running it themselves. Using a managed Kubernetes service frees up the operations team to focus on their core competency instead of optimizing, backing up, and upgrading Kubernetes. While this reduces the operational burden, you lose the ability to modify the platform. Sometimes that's an acceptable tradeoff; sometimes you need to manage it yourself.

Historically, AKS Engine was the OSS tool for creating unmanaged Kubernetes clusters on Azure, but it had some limitations. CAPI/CAPZ is the go-forward solution for creating and operating self-managed clusters declaratively. I highly recommend reading Scott Lowe's article, An introduction to CAPI; it covers a lot of the terminology and concepts used here.

One of the reasons for using CAPI/CAPZ is as a testing and development tool for Kubernetes on Azure. For example, you might need to build and test the following scenarios:

- A kernel change to the worker nodes
- A modification to the K8S config on control plane nodes
- An installation of a different CNI
- The use of K8S to manage K8S

This diagram represents a high-level architecture of a starter CAPI/CAPZ cluster.

The rest of this article explains how to implement the above scenarios using the CAPI quickstart. Because the command arguments will change over time, this article describes the steps and provides a link to the full details, like this:

Link to CAPI Quick Start with details: base command to run

Create the KIND Cluster

Similar to RepRap, CAPI uses a Kubernetes cluster to make more Kubernetes clusters. The easiest way is with Kubernetes IN Docker (KIND). As the name implies, it's a Kubernetes cluster which runs as a Docker container. This is our starting point, which we call the "Bootstrap Cluster".

Create KIND cluster: `kind create cluster`

Initialize cluster for Azure

We will use this bootstrap cluster to initialize the "Management Cluster", which contains all of the CRDs and runs the CAPI controllers. This is where we will apply all of our changes to meet our scenarios.

Initialize cluster for Azure: `clusterctl init --infrastructure azure`

Generate cluster configuration

Now that our management cluster is ready, we want to define what our workload cluster will look like. Thankfully, there are different flavors we can pick from. Using the default, we get an unmanaged K8S cluster running on virtual machines.

Generate cluster configuration: `clusterctl generate cluster capi-quickstart > capi-quickstart.yaml`

We now have a file containing the CRDs which define our workload cluster. We will modify capi-quickstart.yaml and edit the CRDs to implement each of our scenarios. Full documentation is available for the CAPI (baseline) CRDs and the CAPZ (Azure-specific resources) CRDs.

Scenario: Worker node kernel change

If we want to modify the worker nodes, we likely want to add preKubeadmCommands and postKubeadmCommands directives in the KubeadmConfigTemplate. preKubeadmCommands is a list of commands to run on the worker node BEFORE joining the cluster; postKubeadmCommands is a list of commands to run on the worker node AFTER joining the cluster.
```yaml
apiVersion: bootstrap.cluster.x-k8s.io/v1alpha4
kind: KubeadmConfigTemplate
metadata:
  name: capi-quickstart-md-0
  namespace: default
spec:
  template:
    spec:
      preKubeadmCommands:
        - wget -P /tmp https://kernel.ubuntu.com/.deb
        - dpkg -i /tmp/.deb
      postKubeadmCommands:
        - reboot
```

After you've made these changes, you can proceed to the rest of the steps by applying the resources to your management cluster, which will then create your workload cluster and deploy the CNI.

Scenario: Modify Kubernetes components

If we want to modify the control plane, we can make changes to the KubeadmControlPlane. This allows us to leverage the kubeadm API to customize various components. For example, to enable a feature gate on the kube-apiserver:

```yaml
apiVersion: controlplane.cluster.x-k8s.io/v1alpha4
kind: KubeadmControlPlane
metadata:
  name: capi-quickstart-control-plane
  namespace: default
spec:
  kubeadmConfigSpec:
    clusterConfiguration:
      apiServer:
        extraArgs:
          feature-gates: MyFeatureGate=true
```

The above example omits some fields for brevity. Make sure you keep in place any existing args and configuration that you are not modifying.

After you've made these changes, you can proceed to the rest of the steps by applying the resources to your management cluster, which will then create your workload cluster and deploy the CNI.

Apply the Workload Cluster

Now that we have defined what our cluster should look like, apply the resources to the management cluster. The CAPZ operator will detect the updated resources and talk to Azure Resource Manager.

Apply the workload cluster:

```bash
kubectl apply -f capi-quickstart.yaml
```

Monitor the Cluster Creation

After you've made the changes to the capi-quickstart.yaml resources and applied them, you're ready to watch the cluster come up.

Watch the cluster creation:

```bash
kubectl get cluster
clusterctl describe cluster capi-quickstart
# Verify the control plane is up
kubectl get kubeadmcontrolplane
```

Now that the workload cluster is up and running, it's time to start using it!

Get the Kubeconfig for the Workload Cluster

Now that we're dealing with two clusters (the management cluster in Docker and the workload cluster in Azure), we have two kubeconfig files. For ease, we will save the workload cluster's kubeconfig to the local directory.

Get the kubeconfig for the workload cluster:

```bash
clusterctl get kubeconfig capi-quickstart > capi-quickstart.kubeconfig
```

Install the CNI

By default, the workload cluster will not have a CNI, so one must be installed.

Deploy the CNI:

```bash
kubectl --kubeconfig=./capi-quickstart.kubeconfig apply -f https://...calico.yaml
```

Scenario: Install a different CNI

If you want to use flannel as your CNI, you can still apply the resources to your management cluster, which will then create your workload cluster. However, instead of deploying the CNI as above, follow the steps in the Install Flannel walkthrough.

Cleanup

When you're done, you can clean up both the workload and management clusters easily.

Delete the workload cluster:

```bash
kubectl delete cluster capi-quickstart
```

If you want to create the workload cluster again, you can do so by re-applying capi-quickstart.yaml.

Delete the management cluster:

```bash
kind delete cluster
```

If you want to create the management cluster again, you must start from scratch. If you delete the management cluster without deleting the workload cluster, the workload cluster and its Azure resources will remain.

Summary

Similar to how Kubernetes allows you to orchestrate containers using a declarative syntax, CAPI/CAPZ allows you to do the same, but for Kubernetes clusters in Azure.
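As a concrete illustration of that parallel, scaling the workload cluster's worker nodes is just another declarative change against the management cluster. A quick sketch, assuming the quickstart's default MachineDeployment name (capi-quickstart-md-0) and that your kubectl context points at the management cluster:

```bash
# MachineDeployments support the scale subresource, so this behaves like
# scaling a regular Deployment, except it adds or removes Azure VMs.
kubectl scale machinedeployment capi-quickstart-md-0 --replicas=3
```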
This article covered example scenarios for when to use CAPI/CAPZ, as well as a walkthrough of how to implement them. I'm especially excited about the future of CAPI/CAPZ and how it can integrate with other cloud native methodologies, like GitOps, to declaratively manage clusters.

P.S. I am extremely grateful for Cecile Robert-Michon's (Twitter & GitHub) technical guidance on this article. Without her support, I wouldn't have gotten this far and definitely would have missed a few key scenarios. Thanks, Cecile!

· 6 min read

Ark + Azure Kubernetes Service

As much as cloud providers tout their availability and uptime, disasters happen. It's inevitable, and it's usually up to you to be prepared. There are services that can help; however, they're not always "Kubernetes aware". Thankfully, the great folks at Heptio open-sourced Ark, a disaster recovery tool which works with all the major cloud providers.

I got hands-on with Ark and followed their Azure steps. They were a good start, but they didn't highlight how an actual failover and recovery would look to the operator, so I took their steps and created a step-by-step guide to performing a full migration.

Ark supports Azure-native resources, namely Managed Disks + Snapshots. You can review those steps here: https://github.com/heptio/ark/blob/master/docs/azure-config.md

Another option is to use Restic, which performs backups to a local file system. Later, I'll detail the steps for using Restic with Azure.

If you're looking for best practices on supporting Business Continuity and Disaster Recovery for AKS/K8S clusters in Azure, you're in luck! I wrote a Microsoft article covering this use case, which can be found here: https://docs.microsoft.com/en-us/azure/aks/operator-best-practices-multi-region
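For a sense of what the backup and restore halves of such a migration look like, here is a minimal sketch. It assumes Ark is already installed and configured for Azure on both clusters, the namespace name my-app is purely hypothetical, and exact flags vary by Ark version:

```bash
# On the source cluster: back up a namespace (plus volume snapshots, if configured).
ark backup create my-app-backup --include-namespaces my-app

# On the recovery cluster, pointed at the same backup storage:
ark restore create --from-backup my-app-backup
```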

· 1 min read

The Journey to Kubernetes

I created this article with the intent of explaining the migration journey from deploying a legacy application with manual steps to an automated Kubernetes deployment with proper DevOps practices. Its intent is not to help you understand Kubernetes more deeply (there's an abundance of material out there already). As a Cloud Solution Architect for Microsoft, every week I work with our partners to assist them towards containerization and Kubernetes. I'll use AKS and discuss its strengths and weaknesses without pulling punches.

Disclaimer: Given that I work for Microsoft, I am aware of my bias, so in this article I will make an effort to be more critical of Azure to balance that out.

Beginning with the end in mind, I created the following outline:

Intent

Duckiehunt is secure, monitored, and deployable with the least amount of manual effort, cost, and code change.

Purpose

I wrote Duckiehunt in 2007 as a LAMP website. It embodies many of the customer requirements I see:

- Old code, using legacy tooling
- Want a reliable, resilient infrastructure
- Want to automate deployment
- Don't want to re-write
- Migration should involve minimal/no code change
- Need to update to modern standards (e.g. HTTPS, MySQL encryption, private DB instance with backups)

Outcomes

- CI/CD (code check-in triggers automated tests and pushes to production)
- Monitoring of cluster + app (visualization + alerts if down)
- HTTPS enabled for duckiehunt.com (CA cert + forced redirection to HTTPS)
- Running on Kubernetes (AKS)
- Managed MySQL

Milestones (in reverse order of accomplishment):

- Production DNS migrated
- Azure Monitor + Container Monitoring Solution + LogAnalytics
- Distinct Dev + Prod environments
- VSTS + GitHub integration
- Securely expose UI + API
- Integrated MySQL instance
- Installed on AKS
- Test in Minikube
- Migrate App to Container

From here on, I'll explain my journey as steps fulfilling the milestones I created. I'll list my estimated time along with my actual time, for comparison. The times below are not "time to get X working", but "time to get X working correctly and automated as if I had to support this in production" (which I do). As a result, they're much higher than a simple success case.

Migrate app to Container

Estimated time: 4 hours. Actual time: 10 hours

I wrote this in 2007 using a PHP version that is no longer supported (5.3) and a framework (CodeIgniter) that is no longer very active. I didn't want to re-write it yet. Thankfully, PHP 5.6 is mostly backwards compatible, and I was able to find a container image using it. I would have been done in ~4 hours; however, I lost an embarrassing number of hours banging my head against the wall when I automated the Docker build (I would always get a 404). I learned this was because Linux's file system is case-sensitive and OSX's is not, and the PHP framework I chose in 2007 expects some file names to start with a capital letter. *grumble grumble*

Test in Minikube

Estimated time: 12 hours. Actual time: 10 hours

Now that I had my PHP app running in a container, it was time to get it running inside Kubernetes. To do this, I needed to deploy, integrate and test the following: Pod, Service, Secrets, Configuration, MySQL and environment variables. This is a pretty iterative approach of "This, this…nope…how about this?...Nope...This?...ah ha!...Ok, now this...Nope." This is where Draft comes in.
It's a Kubernetes tool specifically designed for this use case, and I think I've started to develop romantic feelings for it because of how much time and headache it saved me while being dead simple to use.

Install in AKS

Estimated time: 8 hours. Actual time: 2 hours

Creating a new AKS cluster takes about 10 minutes and is instantly ready to use. Because I had already done the testing in Minikube, the hard work was done, but I expected some additional hiccups. Again, this is where my love and adoration of Draft started to shine. I was almost done in 30 minutes, but I took some shortcuts with Minikube that came back to bite me.

Integrated MySQL instance

Estimated time: 2 hours. Actual time: 3 hours

Azure now offers MySQL as a service (aka Azure Database for MySQL), and I chose to use that. I could have run MySQL in a container in the cluster; however, I would have had to manage my own SLA, backups, scaling, etc. Given that the intent of this project is the least amount of work and cost, and the cost is still within my MSDN budget, I chose to splurge.

I spent an hour experimenting with Open Service Broker for Azure (a way of managing external dependencies, like MySQL, natively in K8S). I really like the idea, but I wanted one instance for both Dev + Prod and needed tight control over how my app reads in database parameters (since it was written in 2007). If I were doing more than one deployment, OSBA would be the right fit, but not this time.

Steps taken:

- Created the Azure Database for MySQL instance
- Created the dev/prod accounts
- Migrated the data (mysqldump)
- White-listed the source IPs (to MySQL, the cluster traffic looks as if it's coming from the Ingress IP address)
- Injected the connection string into my application (using K8S Secrets)

Then I was off to the races. OSBA would have automated all of that for me, but I'll save that for a proverbial rainy day.

Securely expose UI + API

Estimated time: 4 hours. Actual time: 20 hours

This was the most frustrating part of the entire journey. I decided to use the NGINX Ingress Controller with cert-manager (for SSL). There's lots of old documentation that conflicts with recommended practices, which led to lots of confusion and frustration. I got so frustrated I purposely deleted the entire cluster and started from scratch.

Lessons learned:

- nginx-ingress is pretty straightforward and stable.
- cert-manager is complicated and I had to restart it a lot. I really miss kube-lego (same functionality, but deprecated; kube-lego was simple and reliable).
- Put your nginx-ingress + cert-manager in kube-system, not in the same namespace as your app.
- You might have to restart cert-manager pods when you modify services. I had issues where cert-manager was not registering my changes.
- cert-manager might take ~30 minutes to re-calibrate itself and successfully pull the cert it's been failing on for the last 6 hours.
- cert-manager creates secrets when it tries to negotiate, so be mindful of extra resources left around, even if you delete the Helm chart.
- cert-manager injects its own ingress into your service for verifying that you own the domain. If you don't have your service/ingress working properly, cert-manager will not work.
- If you're doing DNS changes, cert-manager will take a long time to "uncache" the result. Rebooting kube-dns doesn't help.
- There's no documentation of best practices for setting up 2 different domains with cert-manager (e.g. dev.duckiehunt.com; www.duckiehunt.com).
- AKS's HTTP application routing is a neat idea, but you cannot use custom domains, so you're forced to use its *.aksapps.io domain for your services. Great idea, but not useful in real-world scenarios.
To summarize, I was finally able to get development and production running in two different namespaces with one ingress controller and one cert-manager. It should have been simple, but death by 1,000 papercuts ensued while managing certs for each of them. Now I'm wiser, but the journey was long and frustrating. That might deserve a blog post of its own.

VSTS + GitHub integration

Estimated time: 4 hours. Actual time: 2 hours

VSTS makes CI/CD easy. Real easy. Almost too easy. I lost some time (and ~8 failed builds) because the VSTS UX isn't intuitive to me and documentation is sparse. But now that it's working, I have a fully automated GitHub commit -> production release pipeline which completes within 5 minutes. This will save me a tremendous amount of time in the future, and it's what I'm most excited about.

Azure Monitor + Container Monitoring Solution + LogAnalytics

Estimated time: 3 hours. Actual time: none

This was the surprising part. All of this work was already done for me when setting up the AKS cluster and is integrated into the portal. I was impressed that this was glued together without any additional effort needed. That said, here are some gotchas:

- The LogAnalytics SLA is 6 hours. My testing showed that new logs showed up within 5 minutes, but after a cluster is newly created, initial logs would take 30 minutes to appear.
- The LogAnalytics UX isn't intuitive, but the query language is extremely powerful, and each pod's logs were available by clicking through the dashboard.
- Monitoring and Logging are two pillars of the solution; however, Alerting is missing from the documentation. That integration is forthcoming, and will likely involve another blog entry.
- The "Health" tile is useful for getting an overview of your cluster; however, the "Metrics" tile seems pretty limited. Both are still in Preview, and I expect to see additional improvements coming soon.

Production DNS migrated

Estimated time: 1 hour. Actual time: 1 hour

Since I did the heavy lifting in the "Securely expose UI + API" section, this was as easy as flipping a light switch and updating the DNS record at my registrar (dreamhost.com). No real magic here.

Summary

This has been a wonderful learning experience for me, because I was not just trying to showcase AKS/K8S and its potential, but also using it as it is intended to be used, thus getting my hands dirtier than normal. Most of the underestimated time was spent on a few issues that "rat-holed" me due to technical misunderstandings and gaps in my knowledge. I've filled in many of those gaps now and hope that this saves you some time too.

If this has been valuable for you, please let me know by commenting below. And if you're interested in getting a DuckieHunt duck, let me know, as I'd love to see more take flight!

P.S. The source code for this project is also available here.

· 9 min read

How to SSH into an AKS agent node

WARNING: SSH'ing into an agent node is an anti-pattern and should be avoided. However, we don't live in an ideal world, and sometimes we have to do the needful.

Overview

This walkthrough creates an SSH server running as a Pod in your Kubernetes cluster and uses it as a jumpbox to the agent nodes. It is designed for users managing a Kubernetes cluster who cannot readily SSH into their agent nodes (e.g. AKS does not publicly expose the agent nodes for security reasons). This is one of the steps in the Kubernetes Workshop I built while working with our partners.

NOTE: This has been tested on an AKS cluster; however, it should also work with other cloud providers.

You can follow the steps in the SSH to AKS Cluster Nodes walkthrough; however, that requires you to upload your private SSH key, which I would rather avoid.

Assumptions

- Your SSH public key has been installed for your user on the agent host.
- You have jq installed. Not vital, but it makes the last step easier to understand.

Install an SSH Server

If you're paranoid, you can build your own SSH server container; however, corbinu/ssh-server (https://github.com/corbinu/ssh-server) has some pretty good security defaults and is available on Docker Hub.

```bash
kubectl run ssh-server --image=corbinu/ssh-server --port=22 --restart=Never
```

Setup port forward

Instead of exposing a service with an IP + port, we'll take the easy way and use kubectl to port-forward to your localhost.

NOTE: Run this in a separate window, since it needs to keep running for as long as you want the SSH connection.

```bash
kubectl port-forward ssh-server 2222:22
```

Inject your public SSH key

Since we're using the ssh-server as a jumphost, we need to inject our SSH key into the SSH server. I'm using root for simplicity's sake, but I recommend a more secure approach going forward. (TODO: Change this to use a non-privileged user.)

```bash
cat ~/.ssh/id_rsa.pub | kubectl exec -i ssh-server -- /bin/bash -c "cat >> /root/.ssh/authorized_keys"
```

SSH to the proxied port

Using the SSH server as a jumphost (via the port-forward proxy), SSH to the IP address of the desired host.

```bash
# Get the list of hosts + IPs
kubectl get nodes -o json | jq '.items[].status.addresses[].address'

# $USER = username on the agent host
# $IP   = IP of the agent host
ssh -J root@127.0.0.1:2222 $USER@$IP
```

NOTE: If you get "WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED!", you might need to add -o StrictHostKeyChecking=no to the SSH command if you bounce across clusters. This happens because SSH believes the identity of the host has changed; you need to either remove that entry from your ~/.ssh/known_hosts or tell SSH to ignore the host identity.

Cleanup

```bash
kubectl delete pod ssh-server
```

Kill the kubectl port-forward command.
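A small aside on the node-address lookup above: that jq filter prints every address type (hostnames and IPs mixed together). If you only want the internal IPs, a filter along these lines should work; it is a minor refinement, not part of the original walkthrough:

```bash
# Print only the InternalIP entries for each node.
kubectl get nodes -o json | jq -r '.items[].status.addresses[] | select(.type == "InternalIP") | .address'
```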

· 3 min read