1. Table of Contents
- 1. Table of Contents
- 2. Intro
- 3. Certification
- 4. CKA Lecture and Lab Notes
- 5. Core Concepts
- 5.1. Kubernetes Architecture
- 5.2. Open Container Initiative (OCI)
- 5.3. Container Runtime Interface (CRI)
- 5.4. Docker vs ContainerD
- 5.5. ETCD
- 5.6. Kube-API Server
- 5.7. Kube Controller Manager
- 5.8. Kube Scheduler
- 5.9. Kubelet
- 5.10. Kube Proxy
- 5.11. Pods
- 5.12. ReplicaSets
- 5.13. Labels and Selectors
- 5.14. Annotations
- 5.15. Deployments
- 5.16. Services
- 5.17. Namespaces
- 5.18. Imperative vs. Declarative
- 6. Scheduling
- 7. Commands
- 8. References
2. Intro
This document contains my notes on the Certified Kubernetes Administrator (CKA) certification preparation course by Mumshad Mannambeth.
3. Certification
4. CKA Lecture and Lab Notes
5. Core Concepts
5.1. Kubernetes Architecture
- Worker Nodes: host applications as containers
- Worker Node Components:
  - kubelet: an agent that runs on every Worker Node
  - kube-proxy: a service responsible for networking on Worker Nodes
- Master Node: manages, plans, schedules and monitors the nodes using the Control Plane Components
- Control Plane Components:
  - kube-apiserver: responsible for orchestrating all operations within the k8s cluster
  - ETCD Cluster: a database that stores cluster information in a key-value format
  - kube-scheduler: identifies the right node for a container based on its resource requirements
  - Controller-Manager
    - Node-Controller
    - Replication-Controller
5.2. Open Container Initiative (OCI)
- imagespec: specification for how a container image must be built
- runtimespec: specification for how a container runtime should be developed
5.3. Container Runtime Interface (CRI)
- Kubernetes introduced the Container Runtime Interface (CRI), which allows any vendor's container runtime to work with Kubernetes as long as it adheres to the OCI specifications. Docker was not built for CRI, so it was initially supported through an adapter called dockershim.
- Starting with version v1.24, Kubernetes removed dockershim and, with it, direct Docker support
5.4. Docker vs ContainerD
5.4.1. Docker
- Docker is not only a container runtime; it also consists of multiple tools
- CLI
- API
- BUILD
- Volumes
- AUTH
- SECURITY
- ContainerD
- Docker uses ContainerD as its container runtime
5.4.2. ContainerD
- ContainerD is part of the Cloud Native Computing Foundation (CNCF)
- ContainerD is a higher-level tool that manages the entire lifecycle of containers.
- ContainerD is a daemon that manages runc and is compatible with the Container Runtime Interface (CRI)
- runc is a low-level container runtime
- ContainerD works as a container runtime on its own, separate from Docker
- ContainerD comes with several CLI tools:
  - ctr: a CLI tool mainly intended for debugging ContainerD
  - nerdctl: a Docker-like CLI that provides a stable and human-friendly user experience
  - crictl: a CLI tool from the k8s community that is compatible with CRI and works across different container runtimes; mainly used for debugging
- Read more
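A quick, hedged comparison of the three CLIs on the same host (image names and flags are illustrative; exact behaviour depends on how ContainerD and the CRI endpoint are installed):

```bash
# ctr: low-level ContainerD CLI, mainly for debugging
ctr images pull docker.io/library/nginx:latest
ctr run -d docker.io/library/nginx:latest nginx-test

# nerdctl: Docker-like user experience on top of ContainerD
nerdctl run -d --name nginx -p 8080:80 nginx

# crictl: CRI-compatible CLI, works across CRI runtimes
crictl pods
crictl ps
crictl images
```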
5.5. ETCD
- ETCD is a distributed, reliable key-value store that is simple, secure and fast
- A key-value store keeps information in the form of documents or pages (i.e. files), each of which can have its own format and structure
5.5.1. Get started quickly with ETCD
- Download the relevant binary for your operating system from the GitHub releases page (https://github.com/etcd-io/etcd/releases). For example, to download ETCD v3.5.6, run the curl command below:
  $ curl -LO https://github.com/etcd-io/etcd/releases/download/v3.5.6/etcd-v3.5.6-linux-amd64.tar.gz
- Extract it:
  $ tar xvzf etcd-v3.5.6-linux-amd64.tar.gz
- Run the ETCD service:
  $ ./etcd
- When you start ETCD it listens on port 2379 by default
- The default client that comes with ETCD is the etcdctl client. You can use it to store and retrieve key-value pairs.
  - To store a key-value pair:
    $ ./etcdctl put key1 value1
  - To retrieve the stored data:
    $ ./etcdctl get key1
  - To view more commands, run etcdctl without any arguments:
    $ ./etcdctl
# etcdctl commands in API version 3
etcdctl snapshot save
etcdctl endpoint health
etcdctl get
etcdctl put
export ETCDCTL_API=3 # set the environment variable ETCDCTL_API for desired API version
# Certs for ETCDCTL to authenticate to the ETCD API Server
--cacert /etc/kubernetes/pki/etcd/ca.crt
--cert /etc/kubernetes/pki/etcd/server.crt
--key /etc/kubernetes/pki/etcd/server.key
# Example:
kubectl exec etcd-master -n kube-system -- sh -c "ETCDCTL_API=3 etcdctl get / --prefix --keys-only --limit=10 --cacert /etc/kubernetes/pki/etcd/ca.crt --cert /etc/kubernetes/pki/etcd/server.crt --key /etc/kubernetes/pki/etcd/server.key"
5.5.2. ETCD in Kubernetes
- The ETCD datastore stores the following cluster information:
  - Nodes
  - PODs
  - Configs
  - Secrets
  - Accounts
  - Roles
  - Bindings
  - Others
- The kubectl get command retrieves all of this information from the ETCD server
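As a rough illustration, on a kubeadm-based cluster these objects end up as keys under the /registry prefix in ETCD. The command below mirrors the earlier example and assumes the usual kubeadm pod name and certificate paths:

```bash
# List a few of the keys the kube-apiserver has written under /registry
kubectl exec etcd-master -n kube-system -- sh -c \
  "ETCDCTL_API=3 etcdctl get /registry --prefix --keys-only --limit=10 \
   --cacert /etc/kubernetes/pki/etcd/ca.crt \
   --cert /etc/kubernetes/pki/etcd/server.crt \
   --key /etc/kubernetes/pki/etcd/server.key"
# Keys typically look like /registry/pods/<namespace>/<pod-name>
```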
5.5.3. ETCD Setup - Manual vs kubeadm
- Read more
5.6. Kube-API Server
- Kube-API Server is the primary management component in k8s
- The kube-apiserver performs the following actions when a kubectl command is executed (see the example below):
  - Authenticate user
  - Validate request
  - Retrieve data
- Read more
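A minimal sketch of talking to the kube-apiserver directly instead of through kubectl; it assumes `kubectl proxy` is used to expose the authenticated API on localhost port 8001:

```bash
# Start a local proxy that handles authentication to the kube-apiserver
kubectl proxy --port=8001 &

# The same data kubectl retrieves can be fetched from the REST API directly
curl http://localhost:8001/version
curl http://localhost:8001/api/v1/namespaces/default/pods
```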
5.7. Kube Controller Manager
- Kube Controller Manager manages various controllers in kubernetes
- A Controller is a process that continuously monitors the state of the components within the system and works to bring the whole system to the desired functioning state
- Controllers:
- Node Controller - monitors the health of nodes every 5s (see the sketch after this list)
  - Node Monitor Period = 5s
  - Node Monitor Grace Period = 40s (i.e. waits 40s before marking a node as unreachable)
  - POD Eviction Timeout = 5m (i.e. waits 5m for an unreachable node to come back up before evicting its pods and provisioning them on healthy nodes, if they are part of a replica set)
- Replication Controller - monitors state of replica sets
- Deployment Controller
- Namespace Controller
- Endpoint-Controller
- CronJob
- Job Controller
- PV Protection Controller
- Service Account Controller
- Stateful Set
- Replica Set
- PV Binder Controller
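These timings are configurable on the kube-controller-manager. A hedged sketch of the relevant options in a kubeadm static pod manifest (the file path and surrounding fields are assumptions; the flags exist but may be deprecated in newer releases in favour of taint-based eviction):

```yaml
# /etc/kubernetes/manifests/kube-controller-manager.yaml (excerpt, kubeadm layout assumed)
spec:
  containers:
  - name: kube-controller-manager
    command:
    - kube-controller-manager
    - --node-monitor-period=5s        # how often the node controller checks node status
    - --node-monitor-grace-period=40s # how long before a node is marked unreachable
    - --pod-eviction-timeout=5m0s     # how long before pods are evicted from an unreachable node
```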
5.7.1. Installation of Kube-Controller-Manager
- Read more
5.8. Kube Scheduler
- kube-scheduler is responsible for scheduling pods on nodes. It only decides which pod goes on which node based on certain criteria (e.g. filtering nodes, ranking nodes); it doesn't actually place the pod on the node, that's the job of the kubelet.
- Read more
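Filtering and ranking are driven largely by the resources a pod requests. A small illustrative pod spec (the numbers are arbitrary) that the scheduler would use when choosing a node:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: resource-demo
spec:
  containers:
  - name: nginx
    image: nginx
    resources:
      requests:        # used by the scheduler to filter/rank nodes with enough free capacity
        cpu: "500m"
        memory: "256Mi"
      limits:          # enforced at runtime, not used for node selection
        cpu: "1"
        memory: "512Mi"
```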
5.9. Kubelet
- kubelet is a node agent that is responsible for creating the pods on its node and is the node's sole point of contact with the kubernetes cluster
- kubelet also collects status reports and sends them regularly to the master node
5.10. Kube Proxy
- The POD Network is an internal virtual network that spans all nodes in the cluster and to which all pods are connected. Within a k8s cluster every pod can reach every other pod, which is accomplished by deploying a networking solution, i.e. the POD Network.
- kube-proxy is a network component in Kubernetes that helps manage network communication to and from the Pods in a k8s cluster. Key functions of kube-proxy:
- Traffic Routing
- Load Balancing
- Service Discovery
- Protocol Support
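A hedged way to see how kube-proxy is deployed on a kubeadm-based cluster (the DaemonSet and ConfigMap names below are the usual kubeadm defaults, so treat them as assumptions):

```bash
# kube-proxy normally runs as a DaemonSet in kube-system
kubectl get daemonset kube-proxy -n kube-system

# Its proxy mode (iptables, ipvs, ...) is usually set in the kube-proxy ConfigMap
kubectl get configmap kube-proxy -n kube-system -o yaml | grep -A1 "mode"
```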
5.11. Pods
- Pod is a logical host for one or more containers that are deployed together on the same host
# Example of Pod definition with YAML
apiVersion: v1
kind: Pod
metadata: # dictionary
  name: myapp-pod
  namespace: dev
  labels: # under labels custom key:value allowed
    app: myapp
    type: front-end
spec: # dictionary
  containers: # list/array
  - name: nginx-controller
    image: nginx
5.12. ReplicaSets
- ReplicaSet and ReplicationController (deprecated in favor of ReplicaSet) are both Kubernetes resources for managing pod replication, ensuring that a specified number of pod replicas are running at any given time
# Example of ReplicationController definition with YAML
apiVersion: v1
kind: ReplicationController
metadata: # dictionary
  name: myapp-rc
  namespace: dev
  labels: # under labels custom key:value allowed
    app: myapp
    type: front-end
spec: # dictionary
  template: # Pod template for replication
    metadata:
      name: myapp-pod
      labels:
        app: myapp
        type: front-end
    spec:
      containers:
      - name: nginx-controller
        image: nginx
  replicas: 3
# Example of ReplicaSet definition with YAML
apiVersion: apps/v1
kind: ReplicaSet
metadata: # dictionary
  name: myapp-rs
  namespace: dev
  labels: # under labels custom key:value allowed
    app: myapp
    type: front-end
spec: # dictionary
  template: # Pod template for replication
    metadata:
      name: myapp-pod
      labels:
        app: myapp
        type: front-end
    spec:
      containers:
      - name: nginx-controller
        image: nginx
  replicas: 3
  selector: # Major difference from ReplicationController
    matchLabels:
      type: front-end # To select all existing Pods matching the defined label
5.13. Labels and Selectors
- Labels are key/value pairs that are attached to objects such as Pods for categorizing resources.
- Selectors are queries to filter resources based on labels.
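A hedged sketch of both sides: labels attached to a pod imperatively, then selector queries used to find it (the label values are illustrative):

```bash
# Attach or update labels on a pod
kubectl label pod myapp-pod environment=production tier=frontend

# Equality-based selector
kubectl get pods -l environment=production,tier=frontend

# Set-based selector
kubectl get pods -l 'environment in (production, staging)'
```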
5.14. Annotations
- Annotations are key-value pairs that can be attached to various Kubernetes objects (e.g. pods, services, deployments etc.) to store arbitrary metadata that is not used for identification or selection.
- Usage of Annotations:
- Storing build or version information
- Providing links to documentation or related resources
- Specifying configuration options for tools that interact with the resource (e.g., deployment tools, monitoring systems)
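A small illustrative manifest with annotations (the keys and values are made up for the example):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: myapp-pod
  annotations:
    buildVersion: "1.34"                                # build/version information
    example.com/docs: "https://example.com/myapp-docs"  # link to related documentation
spec:
  containers:
  - name: nginx
    image: nginx
```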
5.15. Deployments
- Deployment is a higher-level abstraction that manages a ReplicaSet, which in turn manages the Pods.
# Example of Deployment definition with YAML
apiVersion: apps/v1
kind: Deployment
metadata: # dictionary
  name: myapp-rs
  namespace: dev
  labels: # under labels custom key:value allowed
    app: myapp
    type: front-end
spec: # dictionary
  template: # Pod template for replication
    metadata:
      name: myapp-pod
      labels:
        app: myapp
        type: front-end
    spec:
      containers:
      - name: nginx-controller
        image: nginx
  replicas: 3
  selector:
    matchLabels:
      type: front-end # To select all existing Pods matching the defined label
5.16. Services
- Service is used to expose a set of Pods and enable communication between them
- Services enable communication between different components of an application, allowing Pods to discover and interact with each other reliably
- There are different types of Services, each serving a specific purpose
- Service acts as a built-in load balancer to distribute load across different Pods (Algorithm: Random, SessionAffinity: Yes); see the sketch below
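A hedged sketch of enabling session affinity on a Service so that requests from one client keep hitting the same Pod (the service name and timeout are illustrative):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: myapp-service
spec:
  selector:
    app: myapp
  ports:
  - port: 80
    targetPort: 8080
  sessionAffinity: ClientIP   # default is None; ClientIP pins a client to one Pod
  sessionAffinityConfig:
    clientIP:
      timeoutSeconds: 10800   # how long the affinity lasts
```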
5.16.1. ClusterIP
- This is the default service type
- It exposes the service on a cluster-internal IP
- Only accessible from within the cluster
- Useful for internal communication between services
# Example of Service of type ClusterIP definition with YAML
apiVersion: v1
kind: Service
metadata: # dictionary
  name: back-end
  namespace: dev
spec:
  type: ClusterIP
  ports:
  - port: 80 # The port that the service will expose
    targetPort: 8080 # The port on the pod that the service should forward to
  selector:
    app: myapp
    type: back-end
5.16.2. NodePort
- A Service of type NodePort listens on a static port on each node's IP address in the cluster and forwards requests to the Pods
- Node port range: 30000 to 32767
# Example of Service of type NodePort definition with YAML
apiVersion: v1
kind: Service
metadata: # dictionary
  name: myapp-service
  namespace: dev
spec:
  type: NodePort
  ports:
  - port: 80 # The port that the service will expose
    targetPort: 8080 # The port on the pod that the service should forward to
    nodePort: 30008 # The port on each node that will forward to the service
  selector:
    app: myapp
    type: front-end
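Once applied, the application should be reachable on any node's IP at the chosen node port (the IP below is a placeholder for one of your node IPs):

```bash
# Find the node IPs, then hit the NodePort from outside the cluster
kubectl get nodes -o wide
curl http://192.168.1.10:30008   # placeholder node IP + the nodePort from the manifest above
```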
5.16.3. LoadBalancer
- Creates an external load balancer (if supported by the cloud provider)
- Automatically assigns a public IP address to the service
- Distributes incoming traffic across multiple nodes
- Ideal for production environments where you need to handle large amounts of traffic
# Example of Service of type LoadBalancer definition with YAML
apiVersion: v1
kind: Service
metadata: # dictionary
  name: myapp-service
  namespace: dev
spec:
  type: LoadBalancer
  ports:
  - port: 80 # The port that the service will expose
    targetPort: 8080 # The port on the pod that the service should forward to
    nodePort: 30008 # The port on each node that will forward to the service
5.16.4. ExternalName
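- An ExternalName Service maps a Service name to an external DNS name by returning a CNAME record; no proxying or port mapping is involved
- Useful for reaching external systems (e.g. a managed database) through a stable in-cluster name

A minimal sketch (the external hostname is illustrative):

```yaml
# Example of Service of type ExternalName definition with YAML
apiVersion: v1
kind: Service
metadata:
  name: external-db
  namespace: dev
spec:
  type: ExternalName
  externalName: db.example.com # DNS name that the in-cluster service name resolves to (CNAME)
```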
5.17. Namespaces
- Namespace is a way to organize and manage resources within a cluster which provides a mechanism for isolating groups of resources
- Key Points of Namespaces:
- Isolation
- Resource Management
- Access Control
- Resource Quotas
- Network Policies
- Scoped Resources
Info: A pattern to access a service in a different namespace using DNS:
<service-name>.<namespace>.svc.cluster.local
- <service-name>: name of the service to access
- <namespace>: namespace where the service is located
- svc: indicates that a service is being accessed
- cluster.local: default domain name for services in the k8s cluster (can be customized)
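For example, reusing the back-end ClusterIP Service defined earlier in the dev namespace, a pod in another namespace could reach it like this (the pod name and the presence of curl in the image are assumptions):

```bash
kubectl exec -it some-frontend-pod -- curl http://back-end.dev.svc.cluster.local:80
```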
# Example of Namespace definition with YAML
apiVersion: v1
kind: Namespace
metadata:
  name: dev
5.17.1. Resource Quota
# Example of ResourceQuota definition with YAML
apiVersion: v1
kind: ResourceQuota
metadata:
  name: compute-quota
  namespace: dev
spec:
  hard:
    pods: "10"
    requests.cpu: "4"
    requests.memory: "5Gi"
    limits.cpu: "10"
    limits.memory: "10Gi"
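A hedged usage sketch, assuming the manifest above is saved as compute-quota.yaml:

```bash
kubectl apply -f compute-quota.yaml
kubectl describe resourcequota compute-quota -n dev   # shows used vs. hard limits
```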
5.18. Imperative vs. Declarative
In the context of Infrastructure as Code (IaC), imperative and declarative are two different approaches to defining and managing infrastructure.
- In an imperative approach, you specify how to achieve a desired state by providing a sequence of commands or instructions.
- In a declarative approach, you specify what the desired state of the infrastructure should be, without detailing the steps to achieve that state.
# IMPERATIVE APPROACH
# create objects
kubectl run --image=nginx nginx
kubectl create deployment --image=nginx nginx
kubectl expose deployment nginx --port 80
# update objects
kubectl edit deployment nginx
kubectl scale deployment nginx --replicas=5
kubectl set image deployment nginx nginx=nginx:1.18
kubectl create -f nginx.yaml
kubectl replace -f nginx.yaml
kubectl delete -f nginx.yaml
# DECLARATIVE APPROACH
kubectl apply -f object-definition-file.yaml
6. Scheduling
- Scheduling is a process of assigning pods (the smallest deployable units in Kubernetes) to nodes (the machines or virtual machines in the cluster) based on resource availability and other constraints.
6.1. Manual Scheduling
apiVersion: v1
kind: Pod
metadata:
  name: nginx
  labels:
    name: nginx
spec:
  nodeName: node02 # Specifies the name of the Node; by default it is not set
  containers:
  - name: nginx
    image: nginx
    ports:
    - containerPort: 8080
6.1.1. Binding Object
apiVersion: v1
kind: Binding
metadata:
  name: my-pod
  namespace: default
target:
  apiVersion: v1
  kind: Node
  name: my-node
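A hedged sketch of how the Binding object is actually used: since nodeName cannot be changed on an existing pod, the Binding (converted to JSON) is POSTed to the pod's binding sub-resource on the kube-apiserver. It assumes `kubectl proxy` is running on port 8001 and that a pending pod named my-pod exists in the default namespace:

```bash
kubectl proxy --port=8001 &
curl -X POST http://localhost:8001/api/v1/namespaces/default/pods/my-pod/binding \
  -H "Content-Type: application/json" \
  -d '{"apiVersion":"v1","kind":"Binding","metadata":{"name":"my-pod"},"target":{"apiVersion":"v1","kind":"Node","name":"my-node"}}'
```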
6.2. Taints and Tolerations
- Taints are applied to nodes to prevent certain pods from being scheduled on them
- Tolerations are applied to pods to allow them to be scheduled on nodes with specific taints
kubectl taint nodes <NODE-NAME> <KEY>=<VALUE>:<TAINT-EFFECT>
# TAINT-EFFECT defines what happens to Pods that do not TOLERATE this taint.
# TAINT-EFFECTs: NoSchedule, PreferNoSchedule, NoExecute
# TAINTING NODE EXAMPLE
kubectl taint nodes node1 app=blue:NoSchedule
# TOLERATION DEFINITION OF POD EXAMPLE
apiVersion: v1
kind: Pod
metadata:
  name: nginx
  labels:
    name: nginx
spec:
  containers:
  - name: nginx
    image: nginx
    ports:
    - containerPort: 8080
  tolerations:
  - key: "app"
    operator: "Equal"
    value: "blue"
    effect: "NoSchedule"
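A quick way to verify the taint after applying it, and to remove it again (node name matches the tainting example above):

```bash
kubectl describe node node1 | grep -i taints   # e.g. app=blue:NoSchedule
kubectl taint nodes node1 app=blue:NoSchedule- # trailing '-' removes the taint
```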
6.3. Node Selectors
- Label a node:
kubectl label nodes node01 size=Large
- Assign a Pod to the Node
apiVersion: v1
kind: Pod
metadata:
  name: myapp-pod
spec:
  containers:
  - name: nginx
    image: nginx
  nodeSelector:
    size: Large
- Node Selector limitations:
  - Cannot express advanced rules, e.g. "Large OR Medium" or "NOT Small"; Node Affinity is needed for those
6.4. Node Affinity
apiVersion: v1
kind: Pod
metadata:
  name: with-node-affinity
spec:
  containers:
  - name: nginx
    image: nginx
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: size
            operator: In
            values:
            - Large
            - Medium
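For comparison, a hedged sketch of the softer affinity type together with a NotIn expression, covering the "NOT Small" case that nodeSelector cannot express:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: with-preferred-affinity
spec:
  containers:
  - name: nginx
    image: nginx
  affinity:
    nodeAffinity:
      preferredDuringSchedulingIgnoredDuringExecution: # soft rule: preferred, not required
      - weight: 1
        preference:
          matchExpressions:
          - key: size
            operator: NotIn # avoid nodes labelled size=Small
            values:
            - Small
```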
7. Commands
# GET
kubectl get all
kubectl get pods --watch
kubectl get pods -n kube-system
kubectl get pods -o wide
kubectl get pods -l environment=production,tier=frontend
kubectl get pods -l 'environment in (production),tier in (frontend)'
kubectl get pods -l 'environment,environment notin (frontend)'
kubectl get pods --selector app=App1
kubectl get pods --selector app=App1 --no-headers | wc -l
kubectl get pods --all-namespaces --field-selector metadata.name=<pod-name>
kubectl get replicationcontroller
kubectl get replicaset
kubectl get replicaset myapp-rs -o yaml
kubectl get deployments
kubectl get services
# DESCRIBE
kubectl describe pod myapp-pod
kubectl describe node kubemaster
# APPLY
kubectl apply -f object-definition-file.yaml
kubectl apply -f /path/to/object-definition-files
# CREATE
kubectl create deployment --image=nginx nginx --namespace=dev
kubectl create deployment my-deployment --image=nginx --replicas=3 --dry-run=client -o yaml
kubectl create service nodeport myapp-service --tcp=8080:8080 --dry-run=client -o yaml
kubectl create namespace dev
# EXEC
kubectl exec etcd-master -n kube-system -- etcdctl get / --prefix --keys-only
# RUN
kubectl run nginx --image=nginx --namespace=default --dry-run=client -o yaml
# EDIT
kubectl edit replicaset
# LABEL
kubectl label nodes <node-name> <label-key>=<label-value>
# REPLACE
kubectl replace -f replicaset-definition.yaml
# SCALE
kubectl scale --replicas=6 -f replicaset-definition.yaml
kubectl scale --replicas=6 replicaset myapp-rs
# DELETE
kubectl delete replicaset myapp-replicaset # deletes all underlying Pods
# CONFIG
kubectl config set-context $(kubectl config current-context) --namespace=dev
# TAINT
kubectl taint nodes <NODE-NAME> <KEY>=<VALUE>:<TAINT-EFFECT> # TAINT-EFFECTs: NoSchedule, PreferNoSchedule, NoExecute
kubectl taint nodes <NODE-NAME> <KEY>=<VALUE>:<TAINT-EFFECT>- # UNTAINT