
Replace AWS VPC-CNI with Calico on AWS EKS cluster

Background

EKS clusters ship with the default AWS VPC CNI plugin, which provides some excellent features such as assigning pods IP addresses from the VPC subnet range. One limitation of the AWS CNI comes from the number of IP addresses and ENIs that you can assign to a single instance. Refer to the official AWS documentation page that shows this limit. As you can see,

Instance type | Maximum network interfaces | Private IPv4 addresses per interface | IPv6 addresses per interface
t3.large      | 3                          | 12                                   | 12

AWS VPC-CNI IPs limitation

For t3.large, you can assign 3 × 12 = 36 IP addresses to a single EC2 instance, which severely limits the number of pods we can schedule on a single node.
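You can also look these limits up directly with the AWS CLI instead of the documentation page (a quick check; assumes a reasonably recent AWS CLI that supports the describe-instance-types command):

# Print the maximum ENIs and IPv4 addresses per ENI for t3.large
aws ec2 describe-instance-types \
    --instance-types t3.large \
    --query "InstanceTypes[0].NetworkInfo.[MaximumNetworkInterfaces,Ipv4AddressesPerInterface]" \
    --output text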

Here is the formula for the maximum number of pods:


Max Pods = (Maximum Network Interfaces ) * ( IPv4 Addresses per Interface ) - 1


For example, a t3.large instance supports a maximum of three network interfaces and 12 IPs per interface, so you can create only 35 pods, including the Kubernetes internal pods, because one IP is reserved for the node itself.


3 * 12 - 1 = 35


If you want to replace the default VPC CNI plugin with Calico, here is the process.

Prerequisites

Before you get started, make sure you have downloaded and configured the necessary prerequisites.

Create EKS cluster

Create cluster “aws eks way”

We can replace the VPC-CNI with Calico in an EKS cluster no matter how the cluster was created in the first place. But I ran into problems while trying to increase the number of pods we can deploy on each machine when the cluster is created using the **aws eks** command. If you want to avoid those, follow the “eksctl way” method mentioned later in this post to create the cluster.

I have created a quick-and-dirty bash script to create a cluster. It is not perfect, but it will do the job.

  • First, create a CloudFormation template file with the following content and call it amazon-eks-nodegroup-role.yaml. You can also find the template on the AWS official page, but that version is missing the following two sections:

...
- !FindInMap [ServicePrincipals, !Ref "AWS::Partition", eks]
...
- !Sub "arn:${AWS::Partition}:iam::aws:policy/AmazonEKSClusterPolicy"
...


amazon-eks-nodegroup-role.yaml


AWSTemplateFormatVersion: "2010-09-09"

Description: Amazon EKS - Node Group Role

Mappings:
  ServicePrincipals:
    aws-cn:
      ec2: ec2.amazonaws.com.cn
    aws-us-gov:
      ec2: ec2.amazonaws.com
    aws:
      ec2: ec2.amazonaws.com
      eks: eks.amazonaws.com

Resources:
  NodeInstanceRole:
    Type: "AWS::IAM::Role"
    Properties:
      AssumeRolePolicyDocument:
        Version: "2012-10-17"
        Statement:
          - Effect: Allow
            Principal:
              Service:
                - !FindInMap [ServicePrincipals, !Ref "AWS::Partition", ec2]
                - !FindInMap [ServicePrincipals, !Ref "AWS::Partition", eks]
            Action:
              - "sts:AssumeRole"
      ManagedPolicyArns:
        - !Sub "arn:${AWS::Partition}:iam::aws:policy/AmazonEKSWorkerNodePolicy"
        - !Sub "arn:${AWS::Partition}:iam::aws:policy/AmazonEKS_CNI_Policy"
        - !Sub "arn:${AWS::Partition}:iam::aws:policy/AmazonEC2ContainerRegistryReadOnly"
        - !Sub "arn:${AWS::Partition}:iam::aws:policy/AmazonEKSClusterPolicy"
      Path: /

Outputs:
  NodeInstanceRole:
    Description: The node instance role
    Value: !GetAtt NodeInstanceRole.Arn


Now set the following environment variables for your bash script as per your environment.


export AWS_ACCOUNT_ID=111111111111
export REGION="us-west-2"
export SUBNET1="subnet-00212121212121"
export SUBNET2="subnet-00212123232323"
export SECURITY_GROUP="sg-095df33a10a8"
export CLUSTER_NAME="democluster"
export INSTANCE_TYPE="t3.large"
export PRIVATE_AWS_KEY_NAME="demokey"


Let's create the cluster using the following script:


aws cloudformation create-stack \
    --stack-name eksrole \
    --template-body file://amazon-eks-nodegroup-role.yaml \
    --capabilities CAPABILITY_IAM \
    --output text || true

# Wait until the stack finishes creating; CloudFormation can take some time.
aws cloudformation wait stack-create-complete --stack-name eksrole

export eks_role_arn=$(aws cloudformation describe-stacks \
    --stack-name eksrole \
    --query "Stacks[0].Outputs[?OutputKey=='NodeInstanceRole'].OutputValue" \
    --output text)

echo ${eks_role_arn}

# This will create a cluster.
aws eks create-cluster \
   --region ${REGION} \
   --name ${CLUSTER_NAME} \
   --kubernetes-version 1.16 \
   --role-arn ${eks_role_arn} \
   --resources-vpc-config subnetIds=${SUBNET1},${SUBNET2}


Now wait for some time before you create the node group; the cluster takes approximately 10 minutes to become active. After that, you can run the following command.
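Instead of guessing the timing, you can block until the control plane is active with the AWS CLI waiter (a small sketch; it reuses the variables exported above):

# Returns once the cluster status is ACTIVE
aws eks wait cluster-active --name ${CLUSTER_NAME} --region ${REGION}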


aws eks create-nodegroup --cluster-name ${CLUSTER_NAME} \
    --nodegroup-name ${CLUSTER_NAME} \
    --subnets ${SUBNET1} ${SUBNET2} \
    --node-role ${eks_role_arn} \
    --remote-access=ec2SshKey=${PRIVATE_AWS_KEY_NAME},sourceSecurityGroups=${SECURITY_GROUP} \
    --kubernetes-version=1.16 \
    --scaling-config=minSize=1,maxSize=1,desiredSize=1 \
    --instance-types ${INSTANCE_TYPE} \
    --region ${REGION}


Fetch the kubeconfig file locally so that you can use the kubectl command:


aws eks update-kubeconfig --name ${CLUSTER_NAME}

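Optionally, confirm that kubectl can reach the new cluster and that the node has joined (a quick sanity check):

kubectl get nodes
kubectl get pods -n kube-system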

Deploy a sample application

Let's deploy a simple Nginx application to make sure pods come up as expected.


cat <<EOF | kubectl apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    run: load-balancer-example
  name: hello-world
spec:
  replicas: 1
  selector:
    matchLabels:
      run: load-balancer-example
  template:
    metadata:
      labels:
        run: load-balancer-example
    spec:
      containers:
      - image: nginx:1.15.8
        name: hello-world
        ports:
        - containerPort: 80
          protocol: TCP
---
apiVersion: v1
kind: Service
metadata:
  labels:
    run: load-balancer-example
  name: hello-service
spec:
  ports:
  - port: 80
    protocol: TCP
    targetPort: 80
  selector:
    run: load-balancer-example
  type: LoadBalancer
EOF

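A quick way to confirm the Deployment and Service came up, using the names and labels from the manifest above:

kubectl get deployment hello-world
kubectl get pods -l run=load-balancer-example -o wide
kubectl get service hello-service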

Scale Up sample application

Let's create 20 replicas of the sample application as follows:


kubectl scale deployment hello-world --replicas=20


and validate


kubectl get deployment
NAME READY UP-TO-DATE AVAILABLE AGE
hello-world 20/20 20 20 16d


So here, all 20 pods are active. Now let’s scale these replicas to 50.


kubectl scale deployment hello-world --replicas=50


and validate


kubectl get deployment
NAME READY UP-TO-DATE AVAILABLE AGE
hello-world 30/50 50 30 16d


Only 30 pods are active. Since I am using a t3.large, the remaining pods (5 pods, based on the calculation above) must be system-related pods. Let's validate:


$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
ip-10-10-26-180.us-west-2.compute.internal Ready <none> 16d v1.16.8-eks-e16311

$ kubectl get pods -A -o wide | grep ip-10-10-26-180.us-west-2.compute.internal | wc -l
35


As per this, the node is running 35 pods in total, which matches our calculation.
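You can also read the pod limit the kubelet advertises for this node (a quick check; the node name is taken from the output above, and on a t3.large with the VPC CNI it should report 35):

kubectl get node ip-10-10-26-180.us-west-2.compute.internal -o jsonpath='{.status.capacity.pods}{"\n"}'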

Remove existing AWS CNI components

First, we need to get rid of the AWS CNI. We could delete the individual resources, but there are a lot of them. The easy way is to delete them using the following manifest file. At the time of writing, I am using v1.6 of the VPC CNI plugin, which is also the latest version available.


kubectl delete -f https://raw.githubusercontent.com/aws/amazon-vpc-cni-k8s/release-1.6/config/v1.6/aws-k8s-cni.yaml

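To confirm the VPC CNI components are gone, check that the aws-node DaemonSet no longer exists (a quick check; expect a NotFound error):

kubectl get daemonset aws-node -n kube-system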

Deploy calico components

You can follow this quickstart guide to deploy Calico. Since we are installing Calico in an existing cluster, you only need to run the following command:


kubectl apply -f https://docs.projectcalico.org/manifests/calico.yaml


You will see output similar to this,


...
clusterrole.rbac.authorization.k8s.io/calico-node created
clusterrolebinding.rbac.authorization.k8s.io/calico-node created
daemonset.apps/calico-node created
serviceaccount/calico-node created
deployment.apps/calico-kube-controllers created
serviceaccount/calico-kube-controllers created


Let's confirm whether everything is up and running.


watch kubectl get pods -n kube-system -o wide  


Here the calico-node DaemonSet pod has a STATUS of Running, but calico-kube-controllers is not running.


NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
calico-kube-controllers-77d6cbc65f-cvhkn 0/1 ContainerCreating 0 19m <none> ip-10-10-3-213.us-west-2.compute.internal <none> <none>
calico-node-2qthl 1/1 Running 0 19m 10.10.3.213 ip-10-10-3-213.us-west-2.compute.internal <none> <none>
coredns-5c97f79574-sxvc7 1/1 Running 0 43m 10.10.3.31 ip-10-10-3-213.us-west-2.compute.internal <none> <none>
coredns-5c97f79574-txm9f 1/1 Running 0 43m 10.10.7.151 ip-10-10-3-213.us-west-2.compute.internal <none> <none>
kube-proxy-lnknf 1/1 Running 0 36m 10.10.3.213 ip-10-10-3-213.us-west-2.compute.internal <none> <none>


Let’s troubleshoot,


kubectl describe po calico-kube-controllers-77d6cbc65f-cvhkn -n kube-system 


Here I see the following error log,


...
kubelet, ip-10-10-3-213.us-west-2.compute.internal Failed create pod sandbox: rpc error: code = Unknown desc = [failed to set up sandbox container "c5446664b39653e3ef08f88bb55d25b70cae569eb059e4be3d201740ba5b50f7" network for pod "calico-kube-controllers-77d6cbc65f-cvhkn": networkPlugin cni failed to set up pod "calico-kube-controllers-77d6cbc65f-cvhkn_kube-system" network: add cmd: Error received from AddNetwork gRPC call: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial tcp 127.0.0.1:50051: connect: connection refused", failed to clean up sandbox container "c5446664b39653e3ef08f88bb55d25b70cae569eb059e4be3d201740ba5b50f7" network for pod "calico-kube-controllers-77d6cbc65f-cvhkn": networkPlugin cni failed to teardown pod "calico-kube-controllers-77d6cbc65f-cvhkn_kube-system" network: del cmd: error received from DelNetwork gRPC call: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial tcp 127.0.0.1:50051: connect: connection refused"]
  Normal SandboxChanged 4m41s (x160 over 39m) kubelet, ip-10-10-3-213.us-west-2.compute.internal Pod sandbox changed, it will be killed and re-created.


In this log, the CNI plugin is getting connection refused while trying to reach port 50051. This port is mentioned in the readinessProbe and livenessProbe of the VPC CNI manifest file: https://github.com/aws/amazon-vpc-cni-k8s/blob/master/config/v1.6/aws-k8s-cni.yaml#L108-L115


...
          readinessProbe:
            exec:
              command: ["/app/grpc-health-probe", "-addr=:50051"]
            initialDelaySeconds: 35
          livenessProbe:
            exec:
              command: ["/app/grpc-health-probe", "-addr=:50051"]
            initialDelaySeconds: 35
...


This means the node is still trying to reach the VPC CNI; possibly some caching is going on. I deleted the node, and the Auto Scaling group brought up a new one.
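For reference, this is roughly how the node can be recycled (a sketch; the node name comes from the output above, and the instance lookup assumes the AWS CLI is configured for the same account and region):

# Drain and remove the node object from the cluster
kubectl drain ip-10-10-3-213.us-west-2.compute.internal --ignore-daemonsets --delete-local-data --force
kubectl delete node ip-10-10-3-213.us-west-2.compute.internal

# Find the backing EC2 instance and terminate it; the node group replaces it
INSTANCE_ID=$(aws ec2 describe-instances \
    --filters "Name=private-dns-name,Values=ip-10-10-3-213.us-west-2.compute.internal" \
    --query "Reservations[].Instances[].InstanceId" \
    --output text)
aws ec2 terminate-instances --instance-ids ${INSTANCE_ID}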


kubectl get nodes -o wide               
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
ip-10-10-26-180.us-west-2.compute.internal Ready <none> 4m1s v1.16.8-eks-e16311 10.10.26.180 34.219.58.217 Amazon Linux 2 4.14.177-139.254.amzn2.x86_64 docker://18.9.9


After that, everything seems fine and working:


kubectl get pods -n kube-system         
NAME READY STATUS RESTARTS AGE
calico-kube-controllers-77d6cbc65f-pvx6r 1/1 Running 0 5m18s
calico-node-rjctk 1/1 Running 0 4m26s
coredns-5c97f79574-746kc 1/1 Running 0 5m18s
coredns-5c97f79574-qvdjr 1/1 Running 0 5m19s
kube-proxy-lgl9k 1/1 Running 0 4m26s


Validate hello-world application

Check the status of the sample application deployed previously. The new pods are coming up with IPs in the 192.168.*.* range, which is the Calico network.

And verify using the following.


kubectl get all -o wide 



NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
pod/hello-world-64db9f698b-2sd79 1/1 Running 0 16d 192.168.36.132 ip-10-10-26-180.us-west-2.compute.internal <none> <none>
pod/hello-world-64db9f698b-4dw94 0/1 Pending 0 13m <none> <none> <none> <none>
pod/hello-world-64db9f698b-5v2t9 0/1 Pending 0 13m <none> <none> <none> <none>
pod/hello-world-64db9f698b-6dld2 1/1 Running 0 16d 192.168.36.150 ip-10-10-26-180.us-west-2.compute.internal <none> <none>
...
...

NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR
service/hello-service LoadBalancer 172.20.130.23 a90c1d3d1ac3b4bcd8ed21ece59a6b47-2002783034.us-west-2.elb.amazonaws.com 80:31095/TCP 16d run=load-balancer-example
service/kubernetes ClusterIP 172.20.0.1 <none> 443/TCP 16d <none>

NAME READY UP-TO-DATE AVAILABLE AGE CONTAINERS IMAGES SELECTOR
deployment.apps/hello-world 30/50 50 30 16d hello-world nginx:1.15.8 run=load-balancer-example

NAME DESIRED CURRENT READY AGE CONTAINERS IMAGES SELECTOR
replicaset.apps/hello-world-64db9f698b 50 50 30 16d hello-world nginx:1.15.8 pod-template-hash=64db9f698b,run=load-balancer-example


But still, the total number of active pods is 30, which means some restriction is applied at the EKS node group level. The option to raise it is missing from the aws eks command, so now I have to install eksctl just to handle this. If you know a workaround, please let me know.

Using eksctl I found the following options,


$ eksctl create nodegroup --help
...
New nodegroup flags:
  -n, --name string name of the new nodegroup (generated if unspecified, e.g. "ng-91a7a011")
  -t, --node-type string node instance type (default "m5.large")
...
      --max-pods-per-node int maximum number of pods per node (set automatically if unspecified)
...


So initially, I thought the following command would do the trick.


eksctl create nodegroup --cluster ${CLUSTER_NAME} --node-type t3.large --node-ami auto --max-pods-per-node 100


But as per this GitHub issue, eksctl does not support clusters that were not created by eksctl. I did not create the cluster using eksctl, so I am hitting this limitation and seeing the following error:


[ℹ] eksctl version 0.22.0
[ℹ] using region us-west-2
[ℹ] will use version 1.16 for new nodegroup(s) based on control plane version
Error: getting VPC configuration for cluster "calico": no eksctl-managed CloudFormation stacks found for "calico"


Test Sample application

Let's verify that the application is running using the following:


curl a90c1d3d1ac3b4bcd8ed21ece59a6b47-2002783034.us-west-2.elb.amazonaws.com


I see the following output, which is the response coming from Nginx.


<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
    body {
        width: 35em;
        margin: 0 auto;
        font-family: Tahoma, Verdana, Arial, sans-serif;
    }
</style>
</head>
<body>
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.</p>

<p>For online documentation and support please refer to
<a href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a href="http://nginx.com/">nginx.com</a>.</p>

<p><em>Thank you for using nginx.</em></p>
</body>
</html>


eksctl way

  • First, create an Amazon EKS cluster without any nodes. This command will create everything from scratch, including the VPC, subnets, and other resources. If you want to use an existing VPC and subnets, explore eksctl further via its help menu or the official documentation.

eksctl create cluster --name my-calico-cluster --without-nodegroup


Output:


[ℹ] eksctl version 0.22.0
[ℹ] using region us-west-2
[ℹ] setting availability zones to [us-west-2c us-west-2b us-west-2a]
[ℹ] subnets for us-west-2c - public:192.168.0.0/19 private:192.168.96.0/19
[ℹ] subnets for us-west-2b - public:192.168.32.0/19 private:192.168.128.0/19
[ℹ] subnets for us-west-2a - public:192.168.64.0/19 private:192.168.160.0/19
[ℹ] using Kubernetes version 1.16
[ℹ] creating EKS cluster "my-calico-cluster" in "us-west-2" region with 
[ℹ] if you encounter any issues, check CloudFormation console or try 'eksctl utils describe-stacks --region=us-west-2 --cluster=my-calico-cluster'
[ℹ] CloudWatch logging will not be enabled for cluster "my-calico-cluster" in "us-west-2"
[ℹ] you can enable it with 'eksctl utils update-cluster-logging --region=us-west-2 --cluster=my-calico-cluster'
[ℹ] Kubernetes API endpoint access will use default of {publicAccess=true, privateAccess=false} for cluster "my-calico-cluster" in "us-west-2"
[ℹ] 2 sequential tasks: { create cluster control plane "my-calico-cluster", no tasks }
[ℹ] building cluster stack "eksctl-my-calico-cluster-cluster"
[ℹ] deploying stack "eksctl-my-calico-cluster-cluster"

[ℹ] waiting for the control plane availability...
[✔] saved kubeconfig as "/Users/pandeyb/.kube/config"
[ℹ] no tasks
[✔] all EKS cluster resources for "my-calico-cluster" have been created
[ℹ] kubectl command should work with "/Users/pandeyb/.kube/config", try 'kubectl get nodes'
[✔] EKS cluster "my-calico-cluster" in "us-west-2" region is ready


Next, remove the AWS VPC CNI components and apply the Calico manifest as shown earlier, then create a node group with a higher per-node pod limit:

eksctl create nodegroup --cluster my-calico-cluster --node-type t3.large --node-ami auto --max-pods-per-node 100


Output:


[ℹ] eksctl version 0.22.0
[ℹ] using region us-west-2
[ℹ] will use version 1.16 for new nodegroup(s) based on control plane version
[ℹ] nodegroup "ng-6d80fb78" will use "ami-06e2c973f2d0373fa" [AmazonLinux2/1.16]
[ℹ] 1 nodegroup (ng-6d80fb78) was included (based on the include/exclude rules)
[ℹ] will create a CloudFormation stack for each of 1 nodegroups in cluster "my-calico-cluster"
[ℹ] 2 sequential tasks: { fix cluster compatibility, 1 task: { 1 task: { create nodegroup "ng-6d80fb78" } } }
[ℹ] checking cluster stack for missing resources
[ℹ] cluster stack has all required resources
[ℹ] building nodegroup stack "eksctl-my-calico-cluster-nodegroup-ng-6d80fb78"
[ℹ] --nodes-min=2 was set automatically for nodegroup ng-6d80fb78
[ℹ] --nodes-max=2 was set automatically for nodegroup ng-6d80fb78
[ℹ] deploying stack "eksctl-my-calico-cluster-nodegroup-ng-6d80fb78"
[ℹ] no tasks
[ℹ] adding identity "arn:aws:iam::xxxxxxxxxx:role/eksctl-my-calico-cluster-nodegrou-NodeInstanceRole-S668LPGH9HFZ" to auth ConfigMap
[ℹ] nodegroup "ng-6d80fb78" has 0 node(s)
[ℹ] waiting for at least 2 node(s) to become ready in "ng-6d80fb78"
[ℹ] nodegroup "ng-6d80fb78" has 2 node(s)
[ℹ] node "ip-192-168-26-237.us-west-2.compute.internal" is ready
[ℹ] node "ip-192-168-68-246.us-west-2.compute.internal" is ready
[✔] created 1 nodegroup(s) in cluster "my-calico-cluster"
[✔] created 0 managed nodegroup(s) in cluster "my-calico-cluster"
[ℹ] checking security group configuration for all nodegroups
[ℹ] all nodegroups have up-to-date configuration

  • Validate

$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
ip-192-168-26-237.us-west-2.compute.internal Ready <none> 72s v1.16.8-eks-e16311
ip-192-168-68-246.us-west-2.compute.internal Ready <none> 69s v1.16.8-eks-e16311

$ kubectl get pods -n kube-system -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
calico-kube-controllers-69cb4d4df7-pm9k4 1/1 Running 0 9m19s 172.16.178.195 ip-192-168-26-237.us-west-2.compute.internal <none> <none>
calico-node-htg6v 1/1 Running 0 102s 192.168.26.237 ip-192-168-26-237.us-west-2.compute.internal <none> <none>
calico-node-lzqbg 1/1 Running 0 99s 192.168.68.246 ip-192-168-68-246.us-west-2.compute.internal <none> <none>
coredns-5c97f79574-dtrv5 1/1 Running 0 69m 172.16.178.194 ip-192-168-26-237.us-west-2.compute.internal <none> <none>
coredns-5c97f79574-r59gk 1/1 Running 0 69m 172.16.178.193 ip-192-168-26-237.us-west-2.compute.internal <none> <none>
kube-proxy-lbvl9 1/1 Running 0 99s 192.168.68.246 ip-192-168-68-246.us-west-2.compute.internal <none> <none>
kube-proxy-njwxh 1/1 Running 0 102s 192.168.26.237 ip-192-168-26-237.us-west-2.compute.internal <none> <none>


Everything looks good.
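You can also confirm that the new per-node pod limit took effect; with --max-pods-per-node 100 set above, each node should advertise a pod capacity of 100 (a quick check using jsonpath):

kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.capacity.pods}{"\n"}{end}'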

Deploy the hello-world sample application again (as shown earlier) and validate as follows:


$ kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
hello-world-64db9f698b-wsmbw 1/1 Running 0 22s 172.16.180.1 ip-192-168-68-246.us-west-2.compute.internal <none> <none>


Now let's create 100 replicas; this time it should be able to schedule all the pods.


kubectl scale deployment hello-world --replicas=100  


Validate:


kubectl get deployment hello-world
NAME READY UP-TO-DATE AVAILABLE AGE
hello-world 100/100 100 100 2m55s


Cleanup

Delete the eksctl-managed EKS cluster and its AWS resources using the following:


eksctl delete cluster --name my-calico-cluster


Output:


[ℹ] eksctl version 0.22.0
[ℹ] using region us-west-2
[ℹ] deleting EKS cluster "my-calico-cluster"
[ℹ] deleted 0 Fargate profile(s)
[✔] kubeconfig has been updated
[ℹ] cleaning up LoadBalancer services
[ℹ] 2 sequential tasks: { delete nodegroup "ng-6d80fb78", delete cluster control plane "my-calico-cluster" [async] }
[ℹ] will delete stack "eksctl-my-calico-cluster-nodegroup-ng-6d80fb78"
[ℹ] waiting for stack "eksctl-my-calico-cluster-nodegroup-ng-6d80fb78" to get deleted
[ℹ] will delete stack "eksctl-my-calico-cluster-cluster"
[✔] all cluster resources were deleted

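If you also created the first cluster with the aws CLI earlier, clean up those resources as well (a sketch reusing the variables and stack name from earlier in this post):

# Delete the node group first, then the cluster, then the IAM role stack
aws eks delete-nodegroup --cluster-name ${CLUSTER_NAME} --nodegroup-name ${CLUSTER_NAME} --region ${REGION}
aws eks wait nodegroup-deleted --cluster-name ${CLUSTER_NAME} --nodegroup-name ${CLUSTER_NAME} --region ${REGION}
aws eks delete-cluster --name ${CLUSTER_NAME} --region ${REGION}
aws cloudformation delete-stack --stack-name eksrole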

Troubleshooting

I encountered various problems while completing this blog. Those errors are listed here; they have already been mitigated in the steps above (a tagging example for the last error follows the list).

  • An error occurred (InvalidParameterException) when calling the CreateNodegroup operation: Subnets are required
  • An error occurred (InvalidParameterException) when calling the CreateNodegroup operation: One or more security groups in remote-access is not valid!
  • An error occurred (InvalidParameterException) when calling the CreateNodegroup operation: Following required service principals [eks.amazonaws.com] were not found in the trust relationships of clusterRole arn:aws:iam::XXXXXXXXXX:role/eksrole-NodeInstanceRole-DYX5G48JN3NP
  • An error occurred (AlreadyExistsException) when calling the CreateStack operation: Stack [eksrole] already exists
  • An error occurred (InvalidParameterException) when calling the CreateNodegroup operation: The role with name eksrole-NodeInstanceRole-DYX2148JN21P cannot be found. (Service: AmazonIdentityManagement; Status Code: 404; Error Code: NoSuchEntity; Request ID: ca90689e-daas-4233-955b-963121d5604c; Proxy: null)
  • An error occurred (InvalidRequestException) when calling the CreateNodegroup operation: Cluster ‘democluster’ is not in ACTIVE status
  • An error occurred (InvalidParameterException) when calling the CreateNodegroup operation: Subnets are not tagged with the required tag. Please tag all subnets with Key: kubernetes.io/cluster/democluster Value: shared
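For example, the subnet tagging error at the end of the list can be fixed by tagging the subnets before creating the node group (a sketch reusing the subnet and cluster variables exported earlier):

aws ec2 create-tags \
    --resources ${SUBNET1} ${SUBNET2} \
    --tags Key=kubernetes.io/cluster/${CLUSTER_NAME},Value=shared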
