K8s learning notes
Why k8s?
- When deploying an application, you need a computer and a bunch of dependency software to run it.
- To eliminate the complexity of installing that dependency software, you use Docker to package the application and its dependencies into a container.
- But when you have a lot of applications running on Docker, especially in a microservice architecture, there are a bunch of management tasks for each application such as scaling, deployment, resource allocation, communication, and so on. That is where container orchestration comes in, and K8s is one of the most popular orchestrators.
Types of K8s
- Run on-premises: on Windows you can install K8s using Docker Desktop; on Linux you can use Minikube
- Run on cloud: Azure (AKS), AWS (EKS), GCP (GKE)
Internal architecture
A K8s cluster is made up of one or more worker nodes and usually one master node.
- Master Node (Control Plane): manages the cluster, schedules applications, monitors them, and manages the worker nodes
- API Server: the frontend for the K8s control plane
- Scheduler: selects the worker node for the Pod
- Controller Manager: listens to the API Server, then creates and manages controllers (RC, DC, SFC)
- etcd: stores the cluster configuration data
- Worker Node: runs the application (one worker node is similar to one virtual machine)
- kubelet: communicates between the master and the worker node through the API Server
- kube-proxy: maintains network rules on the node so traffic can reach the Pods (Services)
- container runtime: the software that actually runs the containers (e.g. containerd, Docker)
On the cloud, the master node is managed by the cloud provider; the worker nodes are managed by you, using kubectl.
Kubectl
Use kubectl to interact with the master node; the master node then interacts with the worker nodes to create, delete, and update the application.
After connecting to the cluster, you can use kubectl to interact with it.
AKS Network plugin
Defines how the Pods and nodes communicate with each other and with external resources. See the decision guidance from Microsoft Learn.
- Kubenet: the default (designed for fewer than 400 nodes)
- You must define the UDRs (user-defined routes) yourself. To reach the Pods from outside the cluster, a load balancer must be used
- Adds minor extra latency (an additional hop)
- Azure CNI: (the number of nodes depends on how you configure the virtual network address space)
- With Azure CNI, pods get full virtual network connectivity and can be directly reached via their private IP address from connected networks.
- Latency-sensitive workloads
- Your custom CNI: bringing your own CNI plugin is also supported
Pod
- The smallest deployable unit in K8s. A Pod is a wrapper layer around the container, and K8s manages the Pod instead of the container (for example: is the Pod running, should the Pod be restarted, ...)
- 1 Pod = 1 container (in some special cases, a Pod can contain more than one container)
- A Pod does not expose the container port to the outside world. There are several ways to reach it:
- kubectl port-forward pod/<pod-name> 3000:3000
- using a Service (see #service)
A Pod, or any other K8s resource, is described in a YAML template, which you then apply with kubectl.
You can also use kubectl to run a Pod directly instead of declaring it in a YAML file:
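For example, a minimal Pod manifest might look like the sketch below; the name, image, and port are placeholder assumptions, not values taken from these notes:

```yaml
# pod.yaml - a minimal Pod sketch (name, image and port are placeholders)
apiVersion: v1
kind: Pod
metadata:
  name: my-app
  labels:
    app: my-app
spec:
  containers:
    - name: my-app
      image: nginx:1.25        # assumed image; replace with your application image
      ports:
        - containerPort: 80
# Declarative: kubectl apply -f pod.yaml
# Imperative alternative: kubectl run my-app --image=nginx:1.25 --port=80
```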
ReplicationControllers, ReplicaSets, DaemonSets
ReplicationControllers (RC)
- We do not use Pods directly; we use ReplicationControllers to manage them. The reason: a standalone Pod is restarted if its container goes down, but that does not help if the entire node goes down. Remember the Pod runs inside a worker node while the controller runs in the master node.
- When you update the Pod template in the controller's YAML, existing Pods are not changed: you either delete a Pod so the controller re-creates it from the new template, or delete the entire controller and re-create it.
ReplicaSets (RS)
- The newer version of ReplicationControllers; it is used instead of ReplicationControllers
- RS supports more complex selectors than RC via matchExpressions (In, NotIn, Exists, DoesNotExist, ...), as shown in the sketch below
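A minimal ReplicaSet sketch using a matchExpressions selector could look like this (the names and image are placeholders):

```yaml
# replicaset.yaml - a ReplicaSet sketch with matchExpressions (names and image are placeholders)
apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: my-app-rs
spec:
  replicas: 3
  selector:
    matchExpressions:
      - key: app
        operator: In           # other operators: NotIn, Exists, DoesNotExist
        values:
          - my-app
  template:
    metadata:
      labels:
        app: my-app            # must satisfy the selector above
    spec:
      containers:
        - name: my-app
          image: nginx:1.25
```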
DaemonSets (DS)
- While RS and RC can run one or more Pods on the same worker node, a DaemonSet ensures that exactly ONE Pod runs on each worker node (it does not have a replicas property like RS and RC)
- A typical use case is logging or monitoring agents, where only one Pod is needed per node; see the sketch below
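A DaemonSet sketch for such a node-level agent might look like this; the logging agent image is an assumption for illustration:

```yaml
# daemonset.yaml - a DaemonSet sketch for a per-node agent (names and image are placeholders)
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: log-agent
spec:
  selector:
    matchLabels:
      app: log-agent
  template:
    metadata:
      labels:
        app: log-agent
    spec:
      containers:
        - name: log-agent
          image: fluent/fluent-bit:2.2   # assumed logging agent image
# Note: no replicas field - one Pod is scheduled per node automatically
```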
Services
A Service is a single, constant entry point to a Pod or a group of Pods. It has a stable IP and port. Remember that Pods are ephemeral: they can be deleted and re-created at any time, which makes Pod IPs unstable, so we need a Service to keep track of the Pods.
There are 4 types of Services (a sketch of the first two follows the list):
- ClusterIP: Expose the service on a cluster-internal IP. This is the default type. Use for internal communication
- NodePort: Expose the service on each Node’s IP at a static port (the NodePort). A ClusterIP service, to which the NodePort service will route, is automatically created.
- LoadBalancer: Expose the service externally using a cloud provider’s load balancer. NodePort and ClusterIP services, to which the external load balancer will route, are automatically created.
- ExternalName: Maps the service to the contents of the externalName field (e.g. foo.bar.example.com) by returning a CNAME record with its value. No proxying of any kind is set up.
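As an illustration, a ClusterIP and a NodePort Service for the same set of Pods might be declared like this (the names, labels, and ports are placeholders):

```yaml
# services.yaml - ClusterIP and NodePort sketches (names, labels and ports are placeholders)
apiVersion: v1
kind: Service
metadata:
  name: my-app-internal
spec:
  type: ClusterIP              # the default type, reachable only inside the cluster
  selector:
    app: my-app                # routes to Pods carrying this label
  ports:
    - port: 80                 # stable Service port
      targetPort: 80           # container port on the Pods
---
apiVersion: v1
kind: Service
metadata:
  name: my-app-nodeport
spec:
  type: NodePort
  selector:
    app: my-app
  ports:
    - port: 80
      targetPort: 80
      nodePort: 30080          # exposed on every node, default range 30000-32767
```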
Ingress
Allows exposing Services to the outside world, typically via HTTP routing rules (hostnames and paths).
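A sketch of an Ingress that routes a hostname to the ClusterIP Service above; the host name is a placeholder and an Ingress controller is assumed to be installed in the cluster:

```yaml
# ingress.yaml - an Ingress sketch (host and service names are placeholders)
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-app-ingress
spec:
  # ingressClassName: nginx   # may be required, depending on the installed controller
  rules:
    - host: my-app.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: my-app-internal   # the ClusterIP Service from the Services section
                port:
                  number: 80
```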
Deployment
Manages the deployment of the application. It supports two update strategies:
- Recreate
- Rolling Update
A Deployment creates ReplicaSets under the hood, and the RS then creates the Pods. The way a Deployment works: it creates a new ReplicaSet with the new version of the application, then scales down the old ReplicaSet and scales up the new one. The old RS is not deleted, for rollback purposes (you can set revisionHistoryLimit in the Deployment YAML file).
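A Deployment sketch with a RollingUpdate strategy and revisionHistoryLimit could look like this (names and image are placeholders):

```yaml
# deployment.yaml - a Deployment sketch with a rolling update strategy (names and image are placeholders)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 3
  revisionHistoryLimit: 5      # how many old ReplicaSets to keep for rollback
  strategy:
    type: RollingUpdate        # or: Recreate
    rollingUpdate:
      maxSurge: 1              # at most one extra Pod during the update
      maxUnavailable: 1        # at most one Pod missing during the update
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: my-app
          image: nginx:1.25
# Roll back to the previous revision: kubectl rollout undo deployment/my-app
```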
Namespace
A virtual concept to group and manage resources in the cluster. It's like a folder for organizing resources.
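A Namespace is itself a small YAML resource; a sketch with the placeholder name dev:

```yaml
# namespace.yaml - a Namespace sketch (the name "dev" is a placeholder)
apiVersion: v1
kind: Namespace
metadata:
  name: dev
# List resources inside it: kubectl get pods -n dev
```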
Volume
When a Pod restarts, the data inside the container is lost. A Volume lets the container access storage on the node (or external storage), so the application running in the container can keep persistent state or share data with other containers, including containers in a different Pod.
There are many volume types, but the most common are:
- emptyDir: shares data between containers running in the same Pod (not persistent, because the data lives with the Pod); see the sketch after this list
- hostPath: the shared folder lives on the worker node, so multiple Pods on that node can access the same data (not persistent if you have multiple worker nodes)
- cloud storage: gcePersistentDisk, awsElasticBlockStore, azureDisk, azureFile (only available on cloud, persistent)
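A sketch of two containers in one Pod sharing an emptyDir volume; the images, commands, and paths are placeholders for illustration:

```yaml
# pod-emptydir.yaml - two containers sharing an emptyDir volume (images and paths are placeholders)
apiVersion: v1
kind: Pod
metadata:
  name: shared-data-pod
spec:
  volumes:
    - name: shared-data
      emptyDir: {}             # lives only as long as the Pod does, not persistent
  containers:
    - name: writer
      image: busybox:1.36
      command: ["sh", "-c", "while true; do date >> /data/out.log; sleep 5; done"]
      volumeMounts:
        - name: shared-data
          mountPath: /data
    - name: reader
      image: busybox:1.36
      command: ["sh", "-c", "tail -F /data/out.log"]
      volumeMounts:
        - name: shared-data
          mountPath: /data
```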
With the volume types above, the developer has to care a lot about the underlying storage infrastructure in order to set them up. To abstract the underlying storage infrastructure away from the developer, we can use PersistentVolume and PersistentVolumeClaim.
- PersistentVolume (pre-provisioning): the administrator creates the PV and determines the underlying storage infrastructure
- A PV is global and can be used by any namespace (it is a cluster resource, not a namespace resource)
- persistentVolumeReclaimPolicy: Retain (the PV and its data are kept even after the PVC is deleted), Recycle (the PV becomes available for another PVC to pick up), Delete (the PV is deleted when the PVC is deleted)
- PersistentVolume (dynamic provisioning): no administrator effort is needed to set up the PV
- Use a StorageClass with a cloud-provided provisioner. On-premises, you have to declare your own StorageClass and provisioner
- PersistentVolumeClaim: the developer then creates a PVC and uses it in their application, caring only about the size, access mode, and other simple information (see the sketch below)
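A sketch of a PVC using dynamic provisioning, plus a Pod that mounts it; the storage class name managed-csi is an assumed AKS class and may differ in your cluster:

```yaml
# pvc.yaml - a PersistentVolumeClaim sketch with dynamic provisioning (names are placeholders)
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-app-data
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: managed-csi   # assumed AKS StorageClass; omit to use the cluster default
  resources:
    requests:
      storage: 5Gi
---
# The claim is then mounted into a Pod like a normal volume
apiVersion: v1
kind: Pod
metadata:
  name: my-app
spec:
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: my-app-data
  containers:
    - name: my-app
      image: nginx:1.25
      volumeMounts:
        - name: data
          mountPath: /var/lib/data
```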
Secret & ConfigMap (a kind of Volume)
The basic way to configure a container is through environment variables.
- Pass env on the command line: docker run -e "ENV=DEV" ... or kubectl run ... --env="ENV=DEV"
- Declare it in the YAML file under the container spec: env: [{ name: ENV, value: DEV }]
The ways above are not easy to manage for complex configuration, nor are they secure. That's why we have Secret and ConfigMap.
- ConfigMap: stores the application config (see the sketch at the end of this section)
- A CM can pass values into the Pods as environment variables; those values in the Pods won't change until the Pod is re-created
- A CM can also be mounted as a volume, which automatically updates the values inside the Pods (your app also needs to detect the config change)
- Secret: stores credentials (base64-encoded, and optionally encrypted, before being saved)
- Secrets are stored in etcd on the master node and can be encrypted there
- A Secret is distributed only to the particular nodes that need it and is kept in memory on the node
- Finally, only the Pods that were granted the Secret are able to see its value
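A combined sketch of a ConfigMap and a Secret consumed as environment variables in a Pod; all names and values are placeholders:

```yaml
# config.yaml - ConfigMap and Secret sketches consumed via envFrom (names and values are placeholders)
apiVersion: v1
kind: ConfigMap
metadata:
  name: my-app-config
data:
  ENV: DEV
---
apiVersion: v1
kind: Secret
metadata:
  name: my-app-secret
type: Opaque
stringData:                    # written as plain text; stored base64-encoded by K8s
  DB_PASSWORD: changeme        # placeholder credential
---
apiVersion: v1
kind: Pod
metadata:
  name: my-app
spec:
  containers:
    - name: my-app
      image: nginx:1.25
      envFrom:
        - configMapRef:
            name: my-app-config
        - secretRef:
            name: my-app-secret
```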