Managed Kubernetes on AWS with Amazon EKS: Enterprise Architecture and Operational Guide
Kubernetes has become the de facto operating system for cloud-native applications. However, managing the Kubernetes control plane (the API server, etcd database, controller manager, and scheduler) presents significant operational overhead, high availability risks, and administrative complexity. Amazon Elastic Kubernetes Service (Amazon EKS) mitigates these challenges by providing a fully managed, highly available, and secure Kubernetes control plane integrated deeply with the AWS ecosystem.
In this comprehensive masterclass, we will explore the internal mechanics of Amazon EKS. We will cover control plane architecture, data plane optimization, advanced networking via the AWS VPC CNI, enterprise security via IAM Roles for Service Accounts (IRSA) and EKS Pod Identity, dynamic scaling with Karpenter, and multi-tenant observability. By the end of this guide, you will possess the production-ready knowledge and infrastructure code required to architect, provision, and operate enterprise-grade Kubernetes clusters on AWS.
Table of Contents
- What You Will Learn
- Prerequisites
- 1. Amazon EKS Architecture Deep Dive
- 2. Data Plane Architectures: MNGs, Fargate, and Karpenter
- 3. Advanced EKS Networking & AWS VPC CNI
- 4. Enterprise EKS Security: IRSA vs. Pod Identity
- 5. Step-by-Step Production Provisioning with Terraform
- 6. EKS Storage Architecture: EBS & EFS CSI Drivers
- 7. Scaling EKS: Karpenter and Horizontal Pod Autoscaler
- 8. GitOps Continuous Delivery with ArgoCD
- 9. Monitoring, Logging, and Observability
- 10. Troubleshooting and Common Failure Modes
- 11. Enterprise Best Practices Checklist
- 12. Architectural Interview Questions & Answers
- 13. Frequently Asked Questions (FAQs)
- Summary & Next Steps
What You Will Learn
- Architect highly available EKS control planes across multiple Availability Zones.
- Provision EKS infrastructure programmatically using production-grade Terraform configuration.
- Optimize IP address allocation using AWS VPC CNI custom networking and prefix delegation.
- Implement zero-trust security using EKS Pod Identity, KMS envelope encryption, and private endpoints.
- Configure Karpenter for sub-minute, right-sized node provisioning based on pending workload demands.
- Deploy persistent stateful workloads utilizing AWS EBS and EFS Container Storage Interface (CSI) drivers.
- Establish enterprise-grade observability utilizing Prometheus, Grafana, and AWS CloudWatch.
- Troubleshoot real-world production failures such as node registration issues, IP exhaustion, and OOM kills.
Prerequisites
To fully benefit from this advanced guide, you should be familiar with fundamental AWS and Kubernetes concepts:
- AWS Foundations: Deep understanding of VPCs, subnets, route tables, security groups, and IAM. Refer to our AWS VPC Networking Deep Dive.
- Kubernetes Basics: Familiarity with Pods, Deployments, Services, Namespaces, and basic kubectl operations.
- Infrastructure as Code: Working knowledge of Terraform syntax, providers, and state management. See Infrastructure as Code with Terraform.
1. Amazon EKS Architecture Deep Dive
Amazon EKS operates on a shared responsibility model split between the AWS-managed control plane and the customer-managed data plane. Understanding this architectural boundary is critical for diagnosing performance issues, configuring security parameters, and designing network topologies.
The AWS-Managed Control Plane
The Kubernetes control plane must be highly available, fault-tolerant, and resilient to traffic spikes. In a self-managed Kubernetes deployment, setting up a multi-master cluster with replicated etcd instances requires complex load balancing, quorum management, and certificate rotation.
Amazon EKS automates this entire lifecycle. For every EKS cluster you provision, AWS deploys and manages a dedicated, highly available control plane across at least three Availability Zones (AZs) within an AWS-managed VPC.
+-------------------------------------------------------------------------------------------------+
| AWS-MANAGED VPC |
| |
| +--------------------------+ +--------------------------+ +--------------------------+ |
| | Availability Zone A | | Availability Zone B | | Availability Zone C | |
| | | | | | | |
| | +--------------------+ | | +--------------------+ | | +--------------------+ | |
| | | kube-apiserver | | | | kube-apiserver | | | | kube-apiserver | | |
| | +---------+----------+ | | +---------+----------+ | | +---------+----------+ | |
| | | | | | | | | | |
| | +---------v----------+ | | +---------v----------+ | | +---------v----------+ | |
| | | etcd node <--+---+--> etcd node <--+---+--> etcd node | | |
| | +--------------------+ | | +--------------------+ | | +--------------------+ | |
| +------------+-------------+ +------------+-------------+ +------------+-------------+ |
+----------------|------------------------------|------------------------------|------------------+
| | |
| ENI | ENI | ENI
+----------------|------------------------------|------------------------------|------------------+
| v v v |
| +--------------------------+ +--------------------------+ +--------------------------+ |
| | Availability Zone A | | Availability Zone B | | Availability Zone C | |
| | | | | | | |
| | Private Subnet A | | Private Subnet B | | Private Subnet C | |
| | +------------------+ | | +------------------+ | | +------------------+ | |
| | | Worker Node | | | | Worker Node | | | | Worker Node | | |
| | +------------------+ | | +------------------+ | | +------------------+ | |
| | | | | | | |
| +--------------------------+ +--------------------------+ +--------------------------+ |
| |
| CUSTOMER VPC |
+-------------------------------------------------------------------------------------------------+
This managed control plane features:
- Redundant API Servers: EKS runs multiple instances of kube-apiserver behind a managed Network Load Balancer (NLB) to handle incoming API traffic and guarantee high availability.
- Replicated etcd Cluster: A dedicated, three-node etcd cluster is deployed across three AZs. AWS monitors and automatically replaces unhealthy etcd nodes, handles periodic backups, and manages database defragmentation.
- Automatic Scaling: The EKS service monitors load on the control plane and automatically scales the instance sizes of the API servers and etcd nodes to sustain high request throughput.
- Managed Upgrades: AWS provides a streamlined upgrade path for Kubernetes minor versions, ensuring that control plane components are updated safely and sequentially.
Cross-VPC Communication via Elastic Network Interfaces (ENIs)
Because the control plane runs in an AWS-managed VPC and your worker nodes run in your customer-owned VPC, EKS must establish a secure, low-latency communication channel between them.
During cluster creation, EKS provisions managed Elastic Network Interfaces (ENIs) directly into the private subnets of your customer VPC. These ENIs are owned and managed by the EKS service, but they reside within your network boundary.
When the kube-apiserver needs to communicate with a Pod (e.g., during kubectl logs, kubectl exec, or proxying traffic), the traffic flows from the AWS-managed VPC through these cross-VPC ENIs directly into your private subnets. Conversely, the kubelet and kube-proxy agents running on your worker nodes communicate with the control plane by sending requests to the EKS cluster endpoint, which routes traffic to the managed API servers.
EKS Cluster Endpoints: Access Patterns
EKS supports three distinct network access configurations for the Kubernetes API server endpoint. Selecting the correct access pattern is a fundamental security and architectural decision.
| Endpoint Access Type | Description | Use Case / Security Posture |
|---|---|---|
| Public | The API server endpoint resolves to public IP addresses and is accessible from the internet. Worker node traffic to the control plane leaves the VPC, although it stays on Amazon's network. | Dev/Test environments only. Not recommended for production due to exposure to public scanning and brute-force attempts. |
| Public and Private | The API server endpoint is resolvable to a public IP for external management, but worker nodes route traffic internally through the cross-VPC ENIs. Public access can be restricted to specific CIDR blocks. | Standard enterprise configuration. Allows remote developers to run kubectl commands (restricted by an IP allowlist) while keeping internal data-plane-to-control-plane traffic private. |
| Private | The API server endpoint is reachable only from within your VPC or connected networks (VPN, Direct Connect). Public DNS still resolves the endpoint name, but only to private IP addresses inside the VPC. | Maximum security production environments. Requires administrative traffic to originate from a bastion host, a transit VPC, or a secure corporate VPN/Direct Connect tunnel. |
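As a rough sketch, the three access patterns in the table map onto two boolean settings passed to `aws eks update-cluster-config` through `--resources-vpc-config`. The helper name `endpoint_vpc_config` is hypothetical and exists only for illustration; the cluster name is the one used in the Terraform section of this guide. The script prints the command instead of running it.

```shell
#!/usr/bin/env bash
# Hypothetical helper: translate an access pattern from the table above into
# the shorthand string accepted by --resources-vpc-config.
endpoint_vpc_config() {
  case "$1" in
    public)         echo "endpointPublicAccess=true,endpointPrivateAccess=false" ;;
    public-private) echo "endpointPublicAccess=true,endpointPrivateAccess=true" ;;
    private)        echo "endpointPublicAccess=false,endpointPrivateAccess=true" ;;
    *) echo "unknown access pattern: $1" >&2; return 1 ;;
  esac
}

# Print the command rather than executing it, so the sketch has no side effects.
echo aws eks update-cluster-config \
  --name eks-prod-cluster \
  --resources-vpc-config "$(endpoint_vpc_config private)"
```

Dropping the leading `echo` would apply the change to a real cluster, so review the printed command first.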
2. Data Plane Architectures: MNGs, Fargate, and Karpenter
The Kubernetes data plane consists of the worker nodes where your containerized workloads actually execute. EKS provides three primary models for provisioning and managing the data plane.
Managed Node Groups (MNGs)
EKS Managed Node Groups automate the provisioning, lifecycle management, and updating of Amazon EC2 instances. When you create an MNG, EKS provisions an Auto Scaling Group (ASG) on your behalf and handles the operational tasks of node management.
- Automated OS Provisioning: Nodes are launched using the EKS-optimized Amazon Linux AMI (or custom AMIs), which includes pre-configured container runtimes (containerd), kubelet, and the AWS VPC CNI.
- Graceful Upgrades: When upgrading node versions, EKS orchestrates a rolling update. It provisions new nodes, taints old nodes with NoSchedule, drains existing pods gracefully using the Eviction API, and terminates old instances once workloads have safely migrated.
- Managed Labels and Taints: You can apply Kubernetes labels and taints directly through the EKS API/Terraform, and EKS ensures they are applied to all EC2 instances as they boot.
AWS Fargate for EKS
AWS Fargate is a serverless compute engine that allows you to run Kubernetes Pods without managing underlying EC2 instances. Fargate abstracts the server layer entirely.
When a Pod matches a defined Fargate Profile (based on Kubernetes namespace and selectors), EKS schedules the Pod onto a dedicated, single-tenant microVM managed by Fargate.
- Strong Security Isolation: Each Pod runs in its own isolated virtualization boundary (using Firecracker microVMs), sharing no kernel, memory, or storage resources with other Pods. This is highly beneficial for multi-tenant or untrusted code execution.
- No Node Management: You do not configure Auto Scaling Groups, manage OS patching, or configure container runtime versions.
- Pricing Model: You pay strictly for the vCPU and memory resources requested by the Pod, calculated from the time the Pod pulls its image until it terminates.
- Limitations: Fargate does not support DaemonSets, privileged containers, host networking, or local EBS volume mounts (EFS is supported). This makes it unsuitable for running system-level tooling like custom monitoring agents or service meshes that rely on node-level access.
Karpenter: Just-in-Time Node Provisioning
Traditional Kubernetes autoscaling relies on the Cluster Autoscaler, which monitors the cluster for "unschedulable" pods and increases the capacity of existing EC2 Auto Scaling Groups. This model has several limitations: it is slow (often taking 2 to 5 minutes to provision nodes), bound to rigid ASG configurations, and struggles with diverse instance-type requirements.
Karpenter is an open-source, high-performance node provisioning engine designed specifically for Kubernetes on AWS. Instead of managing ASGs, Karpenter bypasses them entirely and interacts directly with the EC2 Fleet API to launch instances in seconds.
- Rapid Provisioning: Karpenter reads pending pod requirements (CPU, memory, volume mounts, node selectors, tolerations) and provisions the exact EC2 instance types needed, bringing nodes online in under 60 seconds.
- Heterogeneous Node Pools: Karpenter can dynamically mix and match Spot and On-Demand instances, Graviton (ARM64) and x86 architectures, and GPU instances within a single configuration.
- Consolidation and Bin-Packing: Karpenter continuously monitors the cluster for underutilized nodes. If it detects that workloads can be consolidated onto fewer or smaller instances, it automatically drains and terminates the redundant nodes, significantly reducing AWS infrastructure costs.
3. Advanced EKS Networking & AWS VPC CNI
Unlike traditional Kubernetes container network interfaces (CNIs) that use overlay networks (such as Flannel or Calico vxlan) where pods are assigned IPs from an isolated, non-routable CIDR block, EKS defaults to the AWS VPC CNI.
AWS VPC CNI Fundamentals
The AWS VPC CNI plugin runs as a DaemonSet (aws-node) on every worker node. It allocates IP addresses directly from your VPC's subnets to your Kubernetes Pods. This means Pods are first-class citizens in your VPC: they receive fully routable, private IPv4/IPv6 addresses that can communicate directly with other AWS resources (RDS, ElastiCache, EC2 instances) without passing through Network Address Translation (NAT).
+---------------------------------------------------------------------------------+
| CUSTOMER VPC |
| VPC CIDR: 10.0.0.0/16 |
| |
| +---------------------------------------------------------------------------+ |
| | Private Subnet A (CIDR: 10.0.1.0/24) | |
| | | |
| | +---------------------------------------------------------------------+ | |
| | | EC2 Worker Node (IP: 10.0.1.50) | | |
| | | | | |
| | | Primary ENI (eth0) | | |
| | | +-- Node IP: 10.0.1.50 | | |
| | | | | |
| | | Secondary ENI (eth1) | | |
| | | +-- Pod 1 IP: 10.0.1.61 <--- Allocated directly from Subnet A | | |
| | | +-- Pod 2 IP: 10.0.1.62 <--- Allocated directly from Subnet A | | |
| | | +-- Pod 3 IP: 10.0.1.63 <--- Allocated directly from Subnet A | | |
| | +---------------------------------------------------------------------+ | |
| +---------------------------------------------------------------------------+ |
+---------------------------------------------------------------------------------+
While this architecture delivers bare-metal network performance and simplifies security group auditing, it introduces a major challenge: IP Address Exhaustion.
The IP Address Exhaustion Challenge
Every EC2 instance type has a hard limit on the number of Elastic Network Interfaces (ENIs) it can attach, and the number of secondary private IP addresses it can host per ENI.
The maximum number of Pods that can run on an EC2 instance is calculated using the following formula:
Max Pods Formula:
(Number of ENIs * (IPs per ENI - 1)) + 2
The -1 represents the primary IP of the ENI (used by the node itself), and the +2 accounts for the host network pods (like aws-node and kube-proxy) that do not consume a secondary IP, plus a buffer.
Let's look at a common instance type, t3.medium:
- Max ENIs: 3
- IPv4 addresses per ENI: 6
- Calculation: (3 * (6 - 1)) + 2 = 17 Pods
If you run small, microservice-based workloads, you will hit the IP limit (17 Pods) long before you exhaust the CPU and memory resources of the t3.medium instance. This leads to severe underutilization of your compute resources.
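The formula above can be checked with a few lines of shell arithmetic. This is a minimal sketch: the function name `max_pods` is made up for illustration, and the ENI and IP limits must be looked up per instance type (the m5.large values of 3 ENIs and 10 IPs per ENI are added here as a second data point).

```shell
#!/usr/bin/env bash
# max_pods ENIS IPS_PER_ENI
# Applies (ENIs * (IPs per ENI - 1)) + 2 using shell integer arithmetic.
max_pods() {
  local enis=$1 ips_per_eni=$2
  echo $(( enis * (ips_per_eni - 1) + 2 ))
}

max_pods 3 6     # t3.medium: prints 17
max_pods 3 10    # m5.large:  prints 29
```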
Mitigating IP Exhaustion: Prefix Delegation
To resolve the IP bottleneck, the AWS VPC CNI supports Prefix Delegation. Instead of allocating individual secondary IP addresses (e.g., 10.0.1.61) to ENIs, the CNI allocates entire /28 IP prefixes (blocks of 16 contiguous IP addresses, e.g., 10.0.1.64/28).
This increases the density of Pods you can run per node. For a t3.medium instance with Prefix Delegation enabled:
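- Each secondary IP slot on an ENI can hold a /28 prefix (16 IPs) instead of a single IP.
- The address capacity rises to (3 * (6 - 1) * 16) + 2 = 242, well over 110 pods. Pod density is then bounded by the kubelet max-pods setting (AWS recommends 110 for smaller instance types) and by the CPU and memory of the instance rather than by IP addresses.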
To enable Prefix Delegation, the VPC CNI environment variables must be configured as follows:
- ENABLE_PREFIX_DELEGATION: "true"
- WARM_PREFIX_TARGET: "1" (tells the CNI to keep one pre-allocated /28 prefix in reserve for rapid pod startup)
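To make the density gain concrete, the sketch below recomputes the address capacity when every secondary IP slot holds a /28 prefix. The function name `prefix_capacity` is hypothetical. The `kubectl set env` command in the trailing comment is one common way to set the two variables on the aws-node DaemonSet; it is left commented out because it needs access to a live cluster.

```shell
#!/usr/bin/env bash
# prefix_capacity ENIS IPS_PER_ENI
# Each secondary-IP slot holds a /28 prefix (16 addresses) instead of one IP.
prefix_capacity() {
  local enis=$1 ips_per_eni=$2
  echo $(( enis * (ips_per_eni - 1) * 16 + 2 ))
}

prefix_capacity 3 6    # t3.medium: prints 242

# Applying the settings to the VPC CNI (requires cluster access):
# kubectl set env daemonset aws-node -n kube-system \
#   ENABLE_PREFIX_DELEGATION=true WARM_PREFIX_TARGET=1
```

The computed figure is an upper bound on addresses, not a recommended pod count; the kubelet max-pods value still has to be set to something the instance can actually sustain.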
Mitigating IP Exhaustion: Custom Networking
In many enterprise networks, the primary VPC CIDR block is assigned from a highly constrained, corporately routed IPv4 range (e.g., a /20 block, about 4,096 addresses). Allocating Pod IPs for large clusters from this limited space quickly becomes impractical.
Custom Networking allows you to assign Pod IPs from a completely different CIDR block that is not routed on the corporate network (typically the shared address space 100.64.0.0/10 or a range such as 198.19.0.0/16), attached as a secondary CIDR block to your VPC.
- Node Subnets: Reside in the primary corporate-routed CIDR (e.g., 10.0.0.0/24). Nodes consume only one corporate IP.
- Pod Subnets: Reside in the secondary CGNAT CIDR (e.g., 100.64.0.0/16). Pods consume IPs from this secondary range.
- Routing: The VPC CNI handles translation and routing. When a Pod needs to communicate with resources outside the VPC, the traffic is source-NATed to the node's primary IP, preserving corporate address space.
4. Enterprise EKS Security: IRSA vs. Pod Identity
Securing access to AWS resources (like S3 buckets, DynamoDB tables, or KMS keys) from workloads running inside Kubernetes is a primary concern for enterprise security architects.
The Anti-Pattern: Node-Level IAM Roles
In early Kubernetes deployments, developers assigned IAM permissions to the EC2 worker node's instance profile. This meant *every* Pod running on that node inherited those exact same AWS permissions. If a single, non-critical frontend Pod was compromised, an attacker could access the node's instance metadata service (IMDS) and assume permissions to delete S3 buckets or modify database tables.
Enterprise environments require least-privilege isolation at the individual Pod level.
IAM Roles for Service Accounts (IRSA)
IRSA establishes a secure link between Kubernetes ServiceAccounts and AWS IAM Roles using an OpenID Connect (OIDC) Identity Provider.
+-------------------------------------------------------------------------------------------------+
| AMAZON EKS |
| |
| 1. Pod requests AWS resource |
| +------------------+ 2. Mutating Webhook injects: |
| | Pod | - AWS_ROLE_ARN |
| | | - AWS_WEB_IDENTITY_TOKEN_FILE (JWT) |
| | +------------+ | - Projected Volume Mount (Token) |
| | | AWS SDK | | |
| | +-----+------+ | |
| +--------|---------+ |
| | |
| | 3. SDK calls STS:AssumeRoleWithWebIdentity(JWT) |
| v |
| +------------------+ |
| | AWS STS | <-- Checks OIDC trust relationship & validates JWT signature |
| +--------|---------+ |
| | |
| | 4. Returns temporary AWS credentials |
| v |
| +------------------+ |
| | AWS SDK | --> 5. Accesses AWS Resource (e.g., S3 Bucket) |
| +------------------+ |
+-------------------------------------------------------------------------------------------------+
How IRSA works under the hood:
- You create an IAM Role with a trust policy that trusts your cluster's unique OIDC provider URL.
- You create a Kubernetes ServiceAccount and annotate it with the IAM Role ARN:

    apiVersion: v1
    kind: ServiceAccount
    metadata:
      name: s3-reader
      annotations:
        eks.amazonaws.com/role-arn: arn:aws:iam::111122223333:role/eks-s3-reader-role

- An EKS-managed mutating admission webhook detects this annotation when a Pod that uses the ServiceAccount is created.
- The webhook modifies the Pod spec to inject two environment variables: AWS_ROLE_ARN and AWS_WEB_IDENTITY_TOKEN_FILE. It also mounts a projected volume containing a short-lived JSON Web Token (JWT) representing the Kubernetes ServiceAccount identity.
- When the AWS SDK inside your application runs, it detects these variables, calls the AWS Security Token Service (STS) endpoint using AssumeRoleWithWebIdentity, exchanges the JWT for temporary AWS credentials, and accesses the target AWS service.
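A quick way to confirm that the webhook did its job is to check for the two injected variables from inside the container. The following is a minimal sketch; the function name `irsa_status` and its output format are invented for this example and are not part of any AWS tooling.

```shell
#!/usr/bin/env bash
# Hypothetical check, meant to run inside a container: reports whether the two
# variables injected by the IRSA webhook are present in the environment.
irsa_status() {
  if [ -n "${AWS_ROLE_ARN:-}" ] && [ -n "${AWS_WEB_IDENTITY_TOKEN_FILE:-}" ]; then
    echo "irsa: role=${AWS_ROLE_ARN} token_file=${AWS_WEB_IDENTITY_TOKEN_FILE}"
  else
    echo "irsa: not configured"
  fi
}

irsa_status
```

If the variables are missing, the usual suspects are a Pod that was created before the ServiceAccount was annotated (the webhook only acts at Pod creation) or a Pod that references a different ServiceAccount.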
EKS Pod Identity: The Modern Alternative
While IRSA is highly secure, managing OIDC providers across dozens of clusters, configuring complex IAM trust policies with long OIDC URLs, and handling cross-account access can be operationally intensive.
Released in late 2023, EKS Pod Identity simplifies this architecture. It removes the requirement for an external OIDC provider and decouples the IAM trust policy from the cluster's unique OIDC metadata.
- Simplified Trust Policies: Instead of trusting a specific OIDC URL, the IAM role trust policy trusts the system service pods.eks.amazonaws.com:

    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Principal": {
            "Service": "pods.eks.amazonaws.com"
          },
          "Action": [
            "sts:AssumeRole",
            "sts:TagSession"
          ]
        }
      ]
    }

- No OIDC Administration: You define a direct association between a Kubernetes namespace, ServiceAccount, and IAM Role using the EKS API or Terraform.
- Performance: Credentials are served locally on the node via a daemon agent (eks-pod-identity-agent) listening on a link-local address, eliminating external STS calls over the internet for token exchange.
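The association described above can be created with the AWS CLI. The sketch below only assembles and prints the command; the wrapper name `pod_identity_cmd` is hypothetical, the `default` namespace is an assumption, and the cluster, ServiceAccount, and role names are reused from elsewhere in this guide.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper that assembles the association command. It prints the
# command rather than executing it, so it is safe to run anywhere.
pod_identity_cmd() {
  local cluster=$1 namespace=$2 service_account=$3 role_arn=$4
  echo "aws eks create-pod-identity-association" \
    "--cluster-name ${cluster}" \
    "--namespace ${namespace}" \
    "--service-account ${service_account}" \
    "--role-arn ${role_arn}"
}

pod_identity_cmd eks-prod-cluster default s3-reader \
  arn:aws:iam::111122223333:role/eks-s3-reader-role
```

The association has no effect until the eks-pod-identity-agent add-on is running on the nodes, so installing that add-on is a prerequisite for this step.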
5. Step-by-Step Production Provisioning with Terraform
In this section, we will deploy a production-ready, highly available EKS cluster using Terraform. This configuration adheres to security best practices, including private subnets, KMS key encryption for Kubernetes secrets, and managed node groups utilizing IMDSv2 and encrypted root volumes.
Terraform Configuration Structure
We will define our infrastructure in a unified, highly detailed Terraform configuration. This configuration utilizes the official AWS provider to build the VPC, KMS, IAM, and EKS resources from scratch.
# ==========================================
# 1. PROVIDERS & TERRAFORM SETTINGS
# ==========================================
terraform {
  required_version = ">= 1.5.0"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
    kubernetes = {
      source  = "hashicorp/kubernetes"
      version = "~> 2.23"
    }
  }
}

provider "aws" {
  region = "us-east-1"
}
# ==========================================
# 2. VPC AND NETWORKING CONFIGURATION
# ==========================================
data "aws_availability_zones" "available" {
  state = "available"
}

resource "aws_vpc" "eks_vpc" {
  cidr_block           = "10.0.0.0/16"
  enable_dns_hostnames = true
  enable_dns_support   = true

  tags = {
    Name                                     = "eks-production-vpc"
    "kubernetes.io/cluster/eks-prod-cluster" = "shared"
  }
}

# Public subnets: 10.0.1.0/24 through 10.0.3.0/24, one per AZ
resource "aws_subnet" "public_subnets" {
  count                   = 3
  vpc_id                  = aws_vpc.eks_vpc.id
  cidr_block              = "10.0.${count.index + 1}.0/24"
  availability_zone       = data.aws_availability_zones.available.names[count.index]
  map_public_ip_on_launch = true

  tags = {
    Name                                     = "eks-public-subnet-${count.index}"
    "kubernetes.io/cluster/eks-prod-cluster" = "shared"
    "kubernetes.io/role/elb"                 = "1"
  }
}

# Private subnets: 10.0.10.0/24 through 10.0.12.0/24, one per AZ
resource "aws_subnet" "private_subnets" {
  count             = 3
  vpc_id            = aws_vpc.eks_vpc.id
  cidr_block        = "10.0.${count.index + 10}.0/24"
  availability_zone = data.aws_availability_zones.available.names[count.index]

  tags = {
    Name                                     = "eks-private-subnet-${count.index}"
    "kubernetes.io/cluster/eks-prod-cluster" = "shared"
    "kubernetes.io/role/internal-elb"        = "1"
  }
}
resource "aws_internet_gateway" "igw" {
  vpc_id = aws_vpc.eks_vpc.id

  tags = {
    Name = "eks-vpc-igw"
  }
}

resource "aws_eip" "nat_eip" {
  domain     = "vpc"
  depends_on = [aws_internet_gateway.igw]
}

# A single NAT gateway keeps the example short; it is a single-AZ dependency
# for all private subnets. Consider one NAT gateway per AZ for production.
resource "aws_nat_gateway" "nat_gw" {
  allocation_id = aws_eip.nat_eip.id
  subnet_id     = aws_subnet.public_subnets[0].id

  tags = {
    Name = "eks-vpc-nat-gw"
  }
}

resource "aws_route_table" "public_rt" {
  vpc_id = aws_vpc.eks_vpc.id

  route {
    cidr_block = "0.0.0.0/0"
    gateway_id = aws_internet_gateway.igw.id # must reference the .id attribute
  }

  tags = {
    Name = "eks-public-route-table"
  }
}

resource "aws_route_table_association" "public_assoc" {
  count          = 3
  subnet_id      = aws_subnet.public_subnets[count.index].id
  route_table_id = aws_route_table.public_rt.id
}

resource "aws_route_table" "private_rt" {
  vpc_id = aws_vpc.eks_vpc.id

  route {
    cidr_block     = "0.0.0.0/0"
    nat_gateway_id = aws_nat_gateway.nat_gw.id # must reference the .id attribute
  }

  tags = {
    Name = "eks-private-route-table"
  }
}

resource "aws_route_table_association" "private_assoc" {
  count          = 3
  subnet_id      = aws_subnet.private_subnets[count.index].id
  route_table_id = aws_route_table.private_rt.id
}
# ==========================================
# 3. KMS KEY FOR EKS SECRETS ENVELOPE ENCRYPTION
# ==========================================
resource "aws_kms_key" "eks_secrets" {
  description             = "KMS Key for EKS Secrets Envelope Encryption"
  deletion_window_in_days = 7
  enable_key_rotation     = true

  tags = {
    Environment = "production"
    Application = "eks"
  }
}
# ==========================================
# 4. IAM ROLES FOR CONTROL PLANE
# ==========================================
resource "aws_iam_role" "eks_control_plane_role" {
  name = "eks-control-plane-role"

  # Allow the EKS service to assume this role
  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Action = "sts:AssumeRole"
        Effect = "Allow"
        Principal = {
          Service = "eks.amazonaws.com"
        }
      }
    ]
  })
}

resource "aws_iam_role_policy_attachment" "eks_cluster_policy" {
  policy_arn = "arn:aws:iam::aws:policy/AmazonEKSClusterPolicy"
  role       = aws_iam_role.eks_control_plane_role.name
}
# ==========================================
# 5. EKS CLUSTER PROVISIONING
# ==========================================
resource "aws_eks_cluster" "prod_cluster" {
  name     = "eks-prod-cluster"
  role_arn = aws_iam_role.eks_control_plane_role.arn
  version  = "1.29"

  vpc_config {
    subnet_ids              = aws_subnet.private_subnets[*].id
    endpoint_private_access = true
    endpoint_public_access  = true
    public_access_cidrs     = ["203.0.113.0/24"] # Restrict to corporate IP space
  }

  # Envelope-encrypt Kubernetes Secrets with the KMS key defined above
  encryption_config {
    resources = ["secrets"]
    provider {
      key_arn = aws_kms_key.eks_secrets.arn
    }
  }

  depends_on = [
    aws_iam_role_policy_attachment.eks_cluster_policy
  ]
}
# ==========================================
# 6. IAM ROLES FOR WORKER NODES
# ==========================================
resource "aws_iam_role" "node_group_role" {
  name = "eks-node-group-role"

  # Allow EC2 instances (the worker nodes) to assume this role
  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Action = "sts:AssumeRole"
        Effect = "Allow"
        Principal = {
          Service = "ec2.amazonaws.com"
        }
      }
    ]
  })
}

resource "aws_iam_role_policy_attachment" "worker_node_policy" {
  policy_arn = "arn:aws:iam::aws:policy/AmazonEKSWorkerNodePolicy"
  role       = aws_iam_role.node_group_role.name
}

resource "aws_iam_role_policy_attachment" "cni_policy" {
  policy_arn = "arn:aws:iam::aws:policy/AmazonEKS_CNI_Policy"
  role       = aws_iam_role.node_group_role.name
}

resource "aws_iam_role_policy_attachment" "ecr_policy" {
  policy_arn = "arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryReadOnly"
  role       = aws_iam_role.node_group_role.name
}
# ==========================================
# 7. LAUNCH TEMPLATE FOR WORKER NODES (IMDSv2 & Encryption)
# ==========================================
resource "aws_launch_template" "eks_node_lt" {
  name_prefix   = "eks-node-lt-"
  instance_type = "t3.medium"

  # Encrypted gp3 root volume
  block_device_mappings {
    device_name = "/dev/xvda"
    ebs {
      volume_size           = 50
      volume_type           = "gp3"
      encrypted             = true
      delete_on_termination = true
    }
  }

  metadata_options {
    http_endpoint               = "enabled"
    http_tokens                 = "required" # Enforce IMDSv2
    http_put_response_hop_limit = 2
  }

  monitoring {
    enabled = true
  }

  tag_specifications {
    resource_type = "instance"
    tags = {
      Name = "eks-prod-worker-node"
    }
  }
}
# ==========================================
# 8. EKS MANAGED NODE GROUP
# ==========================================
resource "aws_eks_node_group" "prod_nodes" {
  cluster_name    = aws_eks_cluster.prod_cluster.name
  node_group_name = "production-app-nodes"
  node_role_arn   = aws_iam_role.node_group_role.arn
  subnet_ids      = aws_subnet.private_subnets[*].id

  scaling_config {
    desired_size = 3
    max_size     = 10
    min_size     = 2
  }

  update_config {
    max_unavailable = 1
  }

  launch_template {
    id = aws_launch_template.eks_node_lt.id
    # Reference the numeric version instead of "$Latest": the EKS API resolves
    # "$Latest" to a number on read, which produces a diff on every plan.
    version = aws_launch_template.eks_node_lt.latest_version
  }

  depends_on = [
    aws_iam_role_policy_attachment.worker_node_policy,
    aws_iam_role_policy_attachment.cni_policy,
    aws_iam_role_policy_attachment.ecr_policy
  ]
}
Verifying Cluster Connection
Once Terraform completes execution (which typically takes 10 to 15 minutes), configure your local kubectl context to point to your new cluster by running the following AWS CLI command:
aws eks update-kubeconfig \
--region us-east-1 \
--name eks-prod-cluster
Validate cluster connectivity and verify that worker nodes have successfully joined the cluster:
kubectl get nodes -o wide
kubectl get pods -A
kubectl cluster-info
Expected output should display all worker nodes in a Ready state along with the EKS API endpoint.
6. EKS Storage Architecture: EBS & EFS CSI Drivers
Kubernetes workloads are generally ephemeral, but enterprise applications frequently require persistent storage. Amazon EKS supports storage integration through Container Storage Interface (CSI) drivers.
Amazon EBS CSI Driver
Amazon Elastic Block Store (EBS) provides high-performance block storage for stateful workloads such as:
- MySQL
- PostgreSQL
- MongoDB
- Kafka
- Elasticsearch
Example StorageClass:
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gp3-storage
provisioner: ebs.csi.aws.com
parameters:
  type: gp3
volumeBindingMode: WaitForFirstConsumer
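When a PersistentVolumeClaim (PVC) references this StorageClass, the EBS CSI driver dynamically provisions a new EBS volume and attaches it to the node hosting the pod. Because volumeBindingMode is WaitForFirstConsumer, provisioning is deferred until a pod using the PVC is scheduled, so the volume is created in the same Availability Zone as that node.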
Amazon EFS CSI Driver
Amazon Elastic File System (EFS) provides shared POSIX-compliant storage accessible simultaneously from multiple pods. Typical use cases include:
- Shared application content
- WordPress uploads
- Machine learning datasets
- Multi-replica applications
Example PersistentVolumeClaim (assumes a StorageClass named efs-sc backed by the EFS CSI driver already exists):
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: efs-pvc
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 5Gi
  storageClassName: efs-sc
| Feature | EBS | EFS |
|---|---|---|
| Access Mode | ReadWriteOnce | ReadWriteMany |
| Performance | Very High | Moderate |
| Availability | Single AZ | Multi AZ |
| Best For | Databases | Shared Storage |
7. Scaling EKS: Karpenter and Horizontal Pod Autoscaler
Horizontal Pod Autoscaler (HPA)
HPA dynamically scales the number of pod replicas based on observed CPU, memory, or custom metrics.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: payments-api
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: payments-api
  minReplicas: 3
  maxReplicas: 20
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
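Karpenter NodePool
apiVersion: karpenter.sh/v1beta1
kind: NodePool
metadata:
  name: production
spec:
  template:
    spec:
      # Assumes an EC2NodeClass named "default" already exists in the cluster.
      nodeClassRef:
        apiVersion: karpenter.k8s.aws/v1beta1
        kind: EC2NodeClass
        name: default
      requirements:
        - key: kubernetes.io/arch
          operator: In
          values:
            - amd64
        - key: karpenter.sh/capacity-type
          operator: In
          values:
            - spot
            - on-demand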
Karpenter continuously evaluates pending pods and launches appropriately sized EC2 instances without requiring preconfigured Auto Scaling Groups.
8. GitOps Continuous Delivery with ArgoCD
GitOps treats Git repositories as the single source of truth for cluster configuration.
Developer Push
|
v
Git Repository
|
v
ArgoCD
|
v
Amazon EKS
Benefits include:
- Declarative deployments
- Automatic drift detection
- Rollback through Git commits
- Auditability and compliance
To install ArgoCD into a dedicated namespace:
kubectl create namespace argocd
kubectl apply -n argocd \
-f https://raw.githubusercontent.com/argoproj/argo-cd/stable/manifests/install.yaml
9. Monitoring, Logging, and Observability
Enterprise EKS platforms require comprehensive observability.
Metrics
- Prometheus
- Amazon Managed Prometheus
- Grafana
- Amazon Managed Grafana
Logs
- Fluent Bit
- CloudWatch Logs
- Loki
Tracing
- OpenTelemetry
- AWS X-Ray
- Jaeger
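To install the kube-prometheus-stack chart (Prometheus, Alertmanager, and Grafana) with Helm:
helm repo add prometheus-community \
  https://prometheus-community.github.io/helm-charts
helm repo update
helm install kube-prometheus-stack \
  prometheus-community/kube-prometheus-stack \
  --namespace monitoring --create-namespace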
10. Troubleshooting and Common Failure Modes
| Problem | Cause | Resolution |
|---|---|---|
| Nodes Not Joining | IAM or subnet issue | Verify node IAM role and route tables |
| Pods Pending | Insufficient resources | Scale nodes or review requests |
| IP Exhaustion | ENI limits reached | Enable Prefix Delegation |
| Image Pull Errors | ECR permissions missing | Review node IAM policies |
| OOMKilled | Memory limits too low | Increase limits or optimize application |
11. Enterprise Best Practices Checklist
- Use private API endpoints whenever possible.
- Enable envelope encryption with AWS KMS.
- Grant workload permissions through EKS Pod Identity (or IRSA) instead of node IAM roles.
- Enable VPC CNI Prefix Delegation.
- Run nodes in private subnets.
- Use Karpenter for cost optimization.
- Enable audit logging.
- Implement Pod Security Standards.
- Use Network Policies.
- Continuously patch Kubernetes versions.
12. Architectural Interview Questions & Answers
Q1. Why does EKS create ENIs in customer VPCs?
ENIs provide secure communication between the AWS-managed control plane and customer-managed worker nodes.
Q2. What problem does Prefix Delegation solve?
It significantly increases pod density by allocating IP prefixes instead of individual IP addresses.
Q3. Why is Karpenter preferred over Cluster Autoscaler?
Karpenter launches nodes faster, supports heterogeneous instance types, and performs automatic consolidation.
Q4. IRSA vs Pod Identity?
IRSA relies on OIDC federation, whereas Pod Identity uses a native EKS-managed mechanism with simpler administration.
Q5. Why run worker nodes in private subnets?
It minimizes attack surface and prevents direct internet exposure.
13. Frequently Asked Questions (FAQs)
Can EKS run Windows containers?
Yes. EKS supports both Linux and Windows worker nodes.
Does EKS support IPv6?
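Yes. A cluster can be created with the IPv6 IP family, in which case Pods and Services receive IPv6 addresses. The IP family is chosen at cluster creation and cannot be changed afterwards, and EKS does not support dual-stack Pods or Services.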
Can EKS be fully private?
Yes. Configure private-only API endpoints and access through VPN or Direct Connect.
Is Karpenter production ready?
Yes. Karpenter is widely adopted for production-scale EKS deployments.
Summary & Next Steps
Amazon EKS abstracts the operational complexity of Kubernetes control plane management while providing deep integration with AWS networking, security, storage, observability, and automation services.
By combining private networking, Pod Identity, KMS encryption, Karpenter-based autoscaling, GitOps delivery through ArgoCD, and enterprise observability patterns, organizations can build highly secure, scalable, and resilient Kubernetes platforms capable of supporting mission-critical workloads at global scale.
In the next chapter, we will explore Advanced Amazon EKS Networking: Custom Networking, Prefix Delegation, Security Groups for Pods, and Multi-Cluster Architectures.