Terraform, EKS, Auto-scale in Action!

TL;DR

Kubernetes has matured a lot, and features such as cluster autoscaling are literally a tag away … well, a little bit more than that, but I'm sure you understand what I am talking about. In this post I'll be using Terraform to spin up an EKS cluster, which is the easy part, and autoscaling, which is "an annotation away" … plus installing the cluster-autoscaler and, of course, putting the scale-up and scale-out to the test, with some cool tips down the road ;)

So let’s get started!

What we are going to build is an EKS control plane, which is basically what EKS is: a highly available Kubernetes control plane and key-value store. This means that provisioning an EKS cluster without attaching worker nodes to it doesn't make much sense, unless (like in our case) you are planning on using this cluster for testing / ML / DL purposes which need to grow and shrink on demand.

Now, considering I am planning on having some "long living" processes such as my logging & monitoring, ingress and cert-manager stacks, I would like to have the following setup:

| Node Pool | Purpose | Min Count | Max Count | Desired at init |
|--|--|--|--|--|
| Infra workers | Long living processes | 3 | 6 | 3 |
| GPU workers | Long living processes | 0 | 6 | 0 |
| CPU workers | Long living processes | 2 | 10 | 2 |

  • How I got to these numbers is an interesting topic I will be blogging about: "Kubernetes Performance Baselining" and its challenges …

Please note: In a multi-AZ cluster, one should consider the spread of computations, especially if there is persistence involved (btw, something EKS takes care of, well, the Cloud Controller Manager does it on its behalf).

So before we get started, the prerequisites …

Prerequisites

| Tool | Purpose | Installation instructions |
|--|--|--|
| aws-cli | set up our AWS_PROFILE | link |
| aws-iam-authenticator | must be installed on the machine you are going to connect to the cluster from (laptop / CI server etc.) | link |
| Terraform | set up an EKS cluster in an AWS VPC | link |
| kubectl | Kubernetes cluster management | link |
| helm | install the cluster-autoscaler chart | link |

Steps from 0 to 100

  1. Configure AWS account
  2. Review VPC for worker nodes
  3. Review Terraform code
  4. Run Terraform
  5. Connect to Kubernetes Cluster
  6. Install HELM
  7. Install cluster-autoscaler via HELM
  8. Test our autoscaling capabilities

Setting up the cluster

1. Configuring your AWS_PROFILE

  • Assuming you have many profiles, this is an important best practice; if you have one AWS account you could probably stick with the default profile name, which is achieved with the aws configure command followed by your access key and secret key, which you should have from configuring your IAM account.
  • Once complete you can just set the profile like so:
    export AWS_PROFILE=profileName
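For completeness, a minimal sketch of creating a named profile and sanity-checking it (profileName is just a placeholder):

    # create a named profile (prompts for access key, secret key, region, output format)
    aws configure --profile profileName

    # confirm which IAM identity the CLI will use for this profile
    AWS_PROFILE=profileName aws sts get-caller-identity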
    

2. Review VPC for worker nodes

Please note -> you don't have to create the VPC; you can use an existing one and just feed the eks module with the VPC information, and you should have yourself a cluster in ~20 min …

  • Standard VPC setup: Our VPC consists of 3 private subnets for the worker nodes and 3 public subnets for potentially customer-facing instances such as a VPN server / bastion hosts, which are used to access nodes if needed (IMHO never, but still). The following diagram best describes the VPC layout:

If you were wondering why I'm using Terraform, the long answer lies here; in a nutshell:

HashiCorp’s Terraform enables you to safely and predictably create, change, and improve infrastructure.

and the reason I took this approach (over CloudFormation, BTW, which is much more "Amazon friendly") is that hopefully I will be able at some stage to change my vpc module from AWS to GCP (or DigitalOcean), and likewise swap the eks module for a GKE one, while the rest of my "logic" remains the same … (via Helm & friends).

The main reason I create the VPC prior to EKS is that I want to be able to spin up more and more clusters, mainly for testing purposes and probably when we "go prod"; plus, every time there is a new Kubernetes version I want to be able to test the same configuration in a different VPC and, once done, move the prod one …

3. Review Terraform code

please note: A similar version of this implementation can be found at https://github.com/spacemeshos/k8sm. SpaceMesh is an open-source blockmesh operating system, and I've had the privilege of working with them on Kubernetes and their testing framework infrastructure.

3.1 The usual terraform setup resources:

  • AWS provider setup,
  • AWS availability zones, which are calculated via the region variable and used by the vpc module later on
  • An aws_key_pair for which I provide the public key as a variable, so I can SSH to the machines (not that I should want to), while the private key stays in the safety of my laptop ;).

      provider "aws" {
        version = ">= 1.24.0"
        region  = "${var.region}"
      }
    
      data "aws_availability_zones" "available" {}
    
      resource "aws_key_pair" "tikal-eks" {
        key_name   = "tikal-eks"
        public_key = "${var.eks_support_pubkey}"
      }
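In case you don't already have a key pair to feed into eks_support_pubkey, a quick sketch (the file path and key comment below are placeholders I picked) would be:

      # generate a dedicated key pair for the EKS workers (path & comment are placeholders)
      ssh-keygen -t rsa -b 4096 -f ~/.ssh/tikal-eks -C "eks-support" -N ""

      # hand the public key to terraform, e.g. as a variable on the command line
      terraform plan -var "eks_support_pubkey=$(cat ~/.ssh/tikal-eks.pub)"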
    

3.2 The VPC module

The following block utilizes the official HashiCorp AWS VPC module, which is quite flexible and literally implements the diagram above with the options it supports, such as private subnets, public subnets, NAT gateway and other cool options.

I initially wrote my own VPC module, which wasn't half as flexible as the official one … and I am one step away from utilizing the same approach with GKE … (another blog post, perhaps)

So, the VPC module:


module "vpc" {
  source              = "terraform-aws-modules/vpc/aws"
  version             = "1.14.0"
  name                = "prd-eu-vpc"
  cidr                = "${var.vpc_cidr_block}"
  azs                 = ["${data.aws_availability_zones.available.names[0]}", "${data.aws_availability_zones.available.names[1]}", "${data.aws_availability_zones.available.names[2]}"]
  public_subnets      = "${var.public_subnets}"
  private_subnets     = "${var.private_subnets}"
  enable_dns_hostnames = true
  enable_nat_gateway  = true
  single_nat_gateway  = true
  tags                = "${merge(var.cluster_tags, map("kubernetes.io/cluster/${var.cluster_name}", "shared"))}"
}
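Since I bring the VPC up on its own before touching EKS (more on that below), the targeted workflow, assuming the module keeps the name vpc as above, looks roughly like:

# initialise providers & modules, then plan and apply only the VPC module
terraform init
terraform plan -out "$(basename $PWD).plan" -target=module.vpc
terraform apply "$(basename $PWD).plan"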

At this point you are ready to spin up a cluster which could potentially be spread across n availability zones, depending on your region of course; in our case 3 availability zones.

The info we need to feed EKS is described below under the EKS module section.

3.3 EKS module

The EKS module saves a ton of work you would have manually done if going through the AWS web console or CLI, such as:

  • IAM roles & instance profiles
  • Map IAM roles and users to your EKS control plane
  • Create Auto Scaling Groups and corresponding AWS Launch Configurations for worker groups
  • Create a kubeconfig file

Once we have a VPC, we want to feed its output to the eks module like so:
module "eks" {
  source                               = "terraform-aws-modules/eks/aws"
  cluster_name                         = "${var.cluster_name}"
  cluster_version                      = "${var.cluster_version}"
  subnets                              = ["${module.vpc.private_subnets}"]
  tags                                 = "${var.cluster_tags}"
  vpc_id                               = "${module.vpc.vpc_id}"
  worker_groups                        = "${local.worker_groups}"
  worker_group_count                   = "3"
  worker_additional_security_group_ids = ["${aws_security_group.eks-workers-controlPlaneSg.id}"]
  map_users                            = "${var.map_users}"
  map_roles                            = "${var.map_roles}"
}
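Once applied, a quick sanity check (a sketch, assuming the cluster name is tikal-lab as in the rest of this post) is to ask AWS for the control-plane status and the Auto Scaling Groups that were created:

# the control plane should eventually report ACTIVE
aws eks describe-cluster --name tikal-lab --query cluster.status

# list the Auto Scaling Groups in the region (the worker groups will be among them)
aws autoscaling describe-auto-scaling-groups \
    --query "AutoScalingGroups[].{Name:AutoScalingGroupName,Min:MinSize,Max:MaxSize,Desired:DesiredCapacity}" \
    --output table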

3.4 The ‘IAM Authentication’ Sauce

This is a vast subject to cover in this blog post; in a nutshell, AWS implements webhook token authentication, which is how the EKS control plane authenticates both the nodes and the users. Considering every entity communicating with the Kubernetes API does so with an IAM role (or user), alongside the aws-iam-authenticator you are able to connect. I recommend you read more about this in this comprehensive article. In my case, in order to enable the other IAM users / IAM roles, the maps you pass as arguments above to the EKS cluster:

	map_users                            = "${var.map_users}"
	map_roles                            = "${var.map_roles}"

are the glue between your users/roles and the cluster. It is important to note that these users/roles will be granted the system:masters group, which is a pre-defined group that the RBAC authorizer interprets as full access to the cluster!
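Under the hood these mappings end up in the aws-auth ConfigMap in kube-system (that is what module.eks.null_resource.update_config_map_aws_auth in the plan output below is about), so once you can reach the cluster you can inspect the result:

    # show the IAM -> Kubernetes role/user mappings rendered by the eks module
    kubectl -n kube-system get configmap aws-auth -o yaml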

3.5 The ‘Auto Scaling’ Sauce

The eks module needs two things to get this to work, and it supports them out of the box: add the option autoscaling_enabled = true, and if you want the Kubernetes cluster-autoscaler component to take care of the scale-up/down behaviour, set protect_from_scale_in = true as well. I haven't put the protect_from_scale_in option to the test (yet) …

The worker group section looks like so:

locals {
  worker_groups = [
    {
      asg_desired_capacity          = 1
      asg_max_size                  = 5
      asg_min_size                  = 1
      key_name                      = "${aws_key_pair.tikal-eks.key_name}"
      instance_type                 = "t2.medium"
      name                          = "infra-workload"
      subnets                       = "${join(",", module.vpc.private_subnets)}"
      additional_security_group_ids = "${aws_security_group.allow_office_to_all_sg.id}"
      worker_group_tags             = "${local.worker_group_tags}"
      worker_group_launch_template_tags = "${local.worker_group_tags}"
      kubelet_extra_args            = "--register-with-taints=key=value:NoSchedule --node-labels=workload-type=infra-workload"	
    },
    {
      asg_desired_capacity          = 2
      asg_max_size                  = 10
      asg_min_size                  = 2
      autoscaling_enabled           = true
      protect_from_scale_in         = true
      key_name                      = "${aws_key_pair.tikal-eks.key_name}"
      instance_type                 = "t2.medium"
      name                          = "cpu-workload"
      subnets                       = "${join(",", module.vpc.private_subnets)}"
      additional_security_group_ids = "${aws_security_group.allow_office_to_all_sg.id}"
      worker_group_tags             = "${local.worker_group_tags}"
      worker_group_launch_template_tags = "${local.worker_group_tags}"
      kubelet_extra_args            = "--node-labels=workload-type=cpu-workload"
    },
    {
      asg_desired_capacity          = 0
      asg_max_size                  = 6
      asg_min_size                  = 0
      autoscaling_enabled           = true
      protect_from_scale_in         = true
      key_name                      = "${aws_key_pair.tikal-eks.key_name}"
      instance_type                 = "p2.xlarge"
      name                          = "gpu-workload"
      subnets                       = "${join(",", module.vpc.private_subnets)}"
      additional_security_group_ids = "${aws_security_group.allow_office_to_all_sg.id}"
      worker_group_tags             = "${local.worker_group_tags}"
      worker_group_launch_template_tags = "${local.worker_group_tags}"
      kubelet_extra_args            = "--register-with-taints=key=value:NoSchedule --node-labels=workload-type=gpu-workload"
    },
  ]
}

3.6 node-selectors and tolerations

As mentioned above, I have a mixture of workloads spread over 3 worker groups (backed by Auto Scaling Groups); as you know, in order to make the Kubernetes nodes aware of these, we need to pass arguments to kubelet saying something like:

--register-with-taints=key=value:NoSchedule --node-labels=workload-type=gpu-workload

So unless a pod has a toleration for the NoSchedule taint and selects the gpu-workload label, no GPU resources will be allocated … I've tested this with the cpu-workload, considering the costs of my games ;) So in this example, in the block I specified above, you will find the kubelet_extra_args for each worker group (with a pod-spec sketch right after the list):

  • Infra -> kubelet_extra_args = "--register-with-taints=key=value:NoSchedule --node-labels=workload-type=infra-workload"
  • CPU -> kubelet_extra_args = "--node-labels=workload-type=cpu-workload"
  • GPU -> kubelet_extra_args = "--register-with-taints=key=value:NoSchedule --node-labels=workload-type=gpu-workload"
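To actually land a pod on, say, the GPU nodes, the pod spec needs both a toleration for that taint and a node selector for that label. A minimal sketch (the deployment name and image are placeholders; the key=value taint matches the kubelet args above):

cat <<EOF | kubectl apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: gpu-test                        # placeholder name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: gpu-test
  template:
    metadata:
      labels:
        app: gpu-test
    spec:
      nodeSelector:
        workload-type: gpu-workload     # set via kubelet --node-labels
      tolerations:
      - key: "key"                      # matches --register-with-taints=key=value:NoSchedule
        operator: "Equal"
        value: "value"
        effect: "NoSchedule"
      containers:
      - name: app
        image: nginx                    # placeholder image
EOF

Drop the tolerations block and the scheduler will simply never place the pod on the tainted GPU (or infra) nodes.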


So, assuming we have wrapped our heads around the above, we should end up with a VPC, an EKS control plane & 3 worker groups which have different purposes (infra, cpu, gpu) … the next step is to see it in action and test the autoscaling portion of the cluster.

Please note: the GPU nodes above will auto-select the ami_id based on the region you're running EKS in; the full list of official AMIs can be found here.

4. Run Terraform

Please note: I already ran the VPC setup before applying the eks module for this post, using the following command: terraform plan -out $(basename $PWD).plan -target=module.vpc, so I can focus on the EKS part …

As the routine goes with Terraform, it's init, plan, apply … I will spare you most of the plan output for the additional 29 resources for eks …

terraform plan -out $(basename $PWD).plan
...
  + module.eks.null_resource.tags_as_list_of_maps[2]
      id:                                        <computed>
      triggers.%:                                "3"
      triggers.key:                              "Workspace"
      triggers.propagate_at_launch:              "true"
      triggers.value:                            "tikal-dev"

  + module.eks.null_resource.update_config_map_aws_auth
      id:                                        <computed>
      triggers.%:                                <computed>


Plan: 29 to add, 0 to change, 2 to destroy.

------------------------------------------------------------------------

This plan was saved to: prd.plan

To perform exactly these actions, run the following command to apply:
    terraform apply "prd.plan"

Releasing state lock. This may take a few moments...

So now that we've got the plan, let's apply it:

terraform apply $(basename $PWD).plan
...
module.eks.aws_autoscaling_group.workers[0]: Modifications complete after 2s (ID: tikal-lab-infra-workloads20190216210053363200000005)
module.eks.aws_autoscaling_group.workers[1]: Modifications complete after 2s (ID: tikal-lab-cpu-workloads20190216210053365400000006)
module.eks.aws_launch_configuration.workers.deposed[1]: Destroying... (ID: tikal-lab-cpu-workloads20190216210050345000000003)
module.eks.aws_launch_configuration.workers.deposed[0]: Destroying... (ID: tikal-lab-infra-workloads20190216210050441100000004)
module.eks.aws_launch_configuration.workers.deposed[1]: Destruction complete after 1s
module.eks.aws_launch_configuration.workers.deposed[0]: Destruction complete after 1s

Apply complete! Resources: 2 added, 2 changed, 2 destroyed.
Releasing state lock. This may take a few moments...

5. Connect to Kubernetes Cluster

Assuming:

  • aws-iam-authenticator is in your $PATH
  • kubectl is installed
  • terraform executed successfully

run kubectl with the generated kubeconfig, which resides in the terraform directory and is named kubeconfig_$clustername, in our case kubeconfig_tikal-lab.
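If you prefer not to pass --kubeconfig on every call, a small convenience sketch is to export it for the current shell:

    # point kubectl at the generated kubeconfig for this shell session
    export KUBECONFIG=$PWD/kubeconfig_tikal-lab
    kubectl get nodes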

  • A kubectl get nodes --kubeconfig=kubeconfig_tikal-lab should yield:

      kubectl get nodes --kubeconfig=kubeconfig_tikal-lab
      NAME                            STATUS   ROLES    AGE   VERSION
      ip-172-31-56-172.ec2.internal   Ready    <none>   1h    v1.11.5
      ip-172-31-82-114.ec2.internal   Ready    <none>   1h    v1.11.5
      ip-172-31-92-171.ec2.internal   Ready    <none>   1h    v1.11.5
    

    please note: I only launched the infra & cpu worker groups for cost purposes …

  • Let's check our nodes and their labels (via kubectl get nodes --show-labels):
      NAME                            STATUS   ROLES    AGE   VERSION   LABELS
      ip-172-31-56-172.ec2.internal   Ready    <none>   1h    v1.11.5   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/instance-type=t2.medium,beta.kubernetes.io/os=linux,failure-domain.beta.kubernetes.io/region=us-east-1,failure-domain.beta.kubernetes.io/zone=us-east-1b,kubernetes.io/hostname=ip-172-31-56-172.ec2.internal,workload-type=cpu-workload
      ip-172-31-82-114.ec2.internal   Ready    <none>   1h    v1.11.5   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/instance-type=t2.medium,beta.kubernetes.io/os=linux,failure-domain.beta.kubernetes.io/region=us-east-1,failure-domain.beta.kubernetes.io/zone=us-east-1d,kubernetes.io/hostname=ip-172-31-82-114.ec2.internal,workload-type=cpu-workload
      ip-172-31-92-171.ec2.internal   Ready    <none>   1h    v1.11.5   beta.kubernetes.io/arch=amd64,beta.kubernetes.io/instance-type=t2.medium,beta.kubernetes.io/os=linux,failure-domain.beta.kubernetes.io/region=us-east-1,failure-domain.beta.kubernetes.io/zone=us-east-1d,kubernetes.io/hostname=ip-172-31-92-171.ec2.internal,workload-type=infra-workload
    
  • Let’s check our nodes by taints:

      hagzags-mac:prd hagzag$ k describe nodes -l workload-type=infra-workload
      Name:               ip-172-31-92-171.ec2.internal
      Roles:              <none>
      Labels:             beta.kubernetes.io/arch=amd64
                          beta.kubernetes.io/instance-type=t2.medium
                          beta.kubernetes.io/os=linux
                          failure-domain.beta.kubernetes.io/region=us-east-1
                          failure-domain.beta.kubernetes.io/zone=us-east-1d
                          kubernetes.io/hostname=ip-172-31-92-171.ec2.internal
                          workload-type=infra-workload
      Annotations:        node.alpha.kubernetes.io/ttl: 0
                          volumes.kubernetes.io/controller-managed-attach-detach: true
      CreationTimestamp:  Sun, 17 Feb 2019 00:29:57 +0200
      Taints:             key=value:NoSchedule
    

    You can see both the taint key=value:NoSchedule and the label workload-type=infra-workload, which means I will need to specify both (a toleration and a node selector, as sketched in section 3.6 above) in order to schedule a pod on those instances.

6. Install cluster-autoscaler via HELM

Assuming you have the helm CLI installed and perhaps have already installed the server-side component named tiller: considering the cluster-autoscaler chart comes from the helm/stable repository (hosted by Google, via Google Storage), you need a few command-line arguments to helm (or you can alternatively pass a values.yaml) and can install the autoscaler like so:

helm upgrade --install cluster-autoscaler \
     --set clusterName=tikal-cloud \
     --set awsRegion=eu-west-1 \
     stable/cluster-autoscaler

In the example above the release name is cluster-autoscaler, and it will be installed or upgraded as needed thanks to the upgrade --install flags (especially handy in CI/CD scenarios).
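If tiller isn't running in the cluster yet (this post is from the Helm 2 era), here is a minimal bootstrap sketch, with a lab-grade cluster-admin binding rather than a hardened production setup, plus a quick check that the release landed:

# service account + (permissive) RBAC binding for tiller, then install it
kubectl -n kube-system create serviceaccount tiller
kubectl create clusterrolebinding tiller --clusterrole=cluster-admin --serviceaccount=kube-system:tiller
helm init --service-account tiller --wait

# after the helm upgrade --install above, verify the release and its pod
helm ls
kubectl get pods --all-namespaces | grep cluster-autoscaler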

7. Test our autoscaling capabilities

Before we start, let's take a look at the number of nodes we have to begin with:

NAME                                       STATUS   ROLES    AGE   VERSION   INTERNAL-IP   EXTERNAL-IP   OS-IMAGE         KERNEL-VERSION               CONTAINER-RUNTIME
ip-10-0-1-192.eu-west-1.compute.internal   Ready    <none>   2d    v1.11.5   10.0.1.192    <none>        Amazon Linux 2   4.14.94-89.73.amzn2.x86_64   docker://17.6.2
ip-10-0-1-37.eu-west-1.compute.internal    Ready    <none>   2d    v1.11.5   10.0.1.37     <none>        Amazon Linux 2   4.14.94-89.73.amzn2.x86_64   docker://17.6.2
ip-10-0-2-124.eu-west-1.compute.internal   Ready    <none>   2d    v1.11.5   10.0.2.124    <none>        Amazon Linux 2   4.14.94-89.73.amzn2.x86_64   docker://17.6.2
ip-10-0-3-201.eu-west-1.compute.internal   Ready    <none>   2d    v1.11.5   10.0.3.201    <none>        Amazon Linux 2   4.14.94-89.73.amzn2.x86_64   docker://17.6.2
ip-10-0-3-29.eu-west-1.compute.internal    Ready    <none>   2d    v1.11.5   10.0.3.29     <none>        Amazon Linux 2   4.14.94-89.73.amzn2.x86_64   docker://17.6.2

So, in order to enlarge the number of instances, let's run nudnik to make some chaos and cause our autoscaler to scale …

nudnik is a gRPC stress-testing tool, open source and written in Python, maintained by SaloShp.
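If you just want a quick, generic way to force a scale-up without generating gRPC traffic, a hedged alternative is to over-request CPU on the cpu-workload nodes and watch the autoscaler react (the deployment name, image, replica count and request size below are arbitrary):

# a throwaway deployment that requests more CPU than the current cpu-workload nodes can offer
cat <<EOF | kubectl apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cpu-burn                 # placeholder name
spec:
  replicas: 20                   # enough Pending pods to trigger a scale-up
  selector:
    matchLabels:
      app: cpu-burn
  template:
    metadata:
      labels:
        app: cpu-burn
    spec:
      nodeSelector:
        workload-type: cpu-workload
      containers:
      - name: burn
        image: busybox           # placeholder image
        command: ["sh", "-c", "while true; do :; done"]
        resources:
          requests:
            cpu: 500m
EOF

# Pending pods should make cluster-autoscaler bump the ASG; watch new nodes join
kubectl get pods -l app=cpu-burn
kubectl get nodes -w

Scale the deployment back to zero (or delete it) afterwards and, after the cool-down period, the autoscaler should scale the group back in.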
