Kubernetes basics introduction
Kubernetes is an open-source orchestration platform for running and managing containers. It takes care of the following:
- Container runtime
- Compute and memory
It is a complete package for running and managing applications. The management capabilities deserve emphasis: every aspect listed above is straightforward to manage, and the end user can tweak the configuration to fit their needs.
What you can do with Kubernetes also depends on the perspective from which you look at it.
There are two main perspectives:
- Kubernetes Administrator
- End user - Developer
The Kubernetes administrator is responsible for installing, configuring, and running the Kubernetes cluster. Clusters don't appear out of thin air: someone has to take care of the infrastructure the cluster runs on, install the cluster, and configure it. For example:
- Provide storage for the cluster that end users can consume on demand
- Configure networking and how pods communicate with each other
- Maintain RBAC (role-based access control) for the end users
- Take care of security and implement policies that enforce the security rules
The end user (developer), on the other hand, just runs applications on Kubernetes and configures everything the application needs to run smoothly.
The end user consumes all the resources (compute, memory, storage, and network) through YAML definitions applied to the cluster, either with the kubectl CLI or by talking directly to the Kubernetes API.
The available Kubernetes certifications follow these perspectives:
- Kubernetes and Cloud Native Associate (KCNA) focuses on fundamental knowledge of Kubernetes and wider concepts in the cloud-native world.
- Certified Kubernetes Application Developer (CKAD) focuses on the end user who runs applications on Kubernetes, primarily developers.
- Certified Kubernetes Administrator (CKA) focuses on the skills needed by the Kubernetes administrator.
- Certified Kubernetes Security Specialist (CKS) focuses on the security aspects of Kubernetes and best practices.
Kubernetes uses the node as the base unit of the cluster. A node is the largest entity in Kubernetes and is responsible for the operation of the cluster itself.
There are two types of nodes:
- Master node (also called a control plane node)
- Worker node
Kubernetes is API-based, which means that tools like kubectl talk to the Kubernetes API. You could call the Kubernetes API endpoints directly with the right parameters and achieve the same result as with kubectl.
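To make the idea of API endpoints more concrete, here is a minimal sketch of how the REST paths for namespaced resources are laid out. The helper name resource_path is hypothetical and only illustrates the URL structure; it does not contact a cluster.

```python
def resource_path(group: str, version: str, namespace: str, plural: str, name: str = "") -> str:
    """Build the API path for a namespaced resource.

    Resources in the core group (empty string) live under /api,
    resources in named groups (for example "apps") live under /apis.
    """
    prefix = f"/api/{version}" if group == "" else f"/apis/{group}/{version}"
    path = f"{prefix}/namespaces/{namespace}/{plural}"
    return f"{path}/{name}" if name else path


# Listing all pods in the "default" namespace:
print(resource_path("", "v1", "default", "pods"))
# /api/v1/namespaces/default/pods

# Reading a single Deployment named "web" (an illustrative name):
print(resource_path("apps", "v1", "default", "deployments", "web"))
# /apis/apps/v1/namespaces/default/deployments/web
```

A tool like kubectl sends HTTP requests to paths of this shape on the API server, along with credentials.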
The master node is responsible for the Kubernetes control plane: all commands sent to the Kubernetes API are handled by the master node, which then instructs the worker nodes to perform a specific action.
Worker nodes are the workhorses of the cluster: the containers run on them.
There can be multiple master nodes and multiple worker nodes.
A common setup is one master node and multiple worker nodes, with the payload (the running applications) distributed over the worker nodes.
For the cluster to run in harmony, master and worker nodes need software installed on them that provides the required features.
The master node runs the Kubernetes API server. Worker nodes run the kubelet, which handles the running of the containers, and kube-proxy, which handles the networking.
This setup makes it possible to scale the cluster:
- Horizontally (add more nodes to distribute the work)
- Vertically (increase the resources of the existing nodes)
To sum up: the cluster needs to be installed and configured using the right tools for the job. To install the cluster you need an environment that plays the role of the node, either a physical machine or a virtual machine. The networking needs to be configured so that every node can talk to every other node. Other things like storage, compute, and memory need to be configured at the node level by the Kubernetes administrator so that end users can consume them. These are the essential steps for getting a cluster up and running.
Since Kubernetes is open source, there are multiple distributions of Kubernetes, much like there are multiple Linux distributions.
Some of the Kubernetes distributions you may have heard of:
- Rancher K3s
- AKS (Cloud)
- EKS (Cloud)
- VMware Tanzu
They all implement the same Kubernetes concepts but are modified to suit a specific use case.
Kubernetes definitions are usually written in YAML, which the kubectl tool converts to JSON before sending it to the Kubernetes API.
This means that every Kubernetes component can be defined in a YAML file and applied to the cluster. The master node processes the definition and creates the state in the cluster that matches the file.
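As an illustration of what such a definition contains, the sketch below builds a small Pod definition as a Python dictionary and prints it as JSON. The names "hello" and "web", the image tag, and the resource values are all assumptions chosen for the example.

```python
import json

# Hypothetical definition of a single Pod. The equivalent YAML file would start with:
#   apiVersion: v1
#   kind: Pod
#   metadata:
#     name: hello
pod_definition = {
    "apiVersion": "v1",   # API version that defines this kind of object
    "kind": "Pod",        # type of the object to create
    "metadata": {"name": "hello", "labels": {"app": "hello"}},
    "spec": {             # desired state of the object
        "containers": [
            {
                "name": "web",
                "image": "nginx:1.25",
                # Compute and memory the container asks the cluster for.
                "resources": {"requests": {"cpu": "100m", "memory": "64Mi"}},
            }
        ]
    },
}

pod_json = json.dumps(pod_definition, indent=2)
print(pod_json)
```

Saved to a file such as pod.json, this output could be applied with kubectl apply -f pod.json, since kubectl accepts definitions in JSON as well as YAML.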
kubectl is a helper CLI tool that simplifies talking to the Kubernetes API. You can use kubectl from the terminal to talk to the Kubernetes cluster and apply any definition directly from a YAML file.
kubectl also has helper commands (for example, kubectl run or kubectl create deployment) that create a specific component on the cluster directly, without a definition file.
As we increase granularity and shift the focus from the infrastructure part of the cluster to the end user, other things pop up. In simpler terms, when we zoom in on the Kubernetes cluster, we see many components.
There are so many components that one can easily get lost. But if we layer them correctly, categorize them, and change the perspective from which we look at them, this sea becomes much easier to navigate.
The master node runs controllers in an endless loop. Controllers are responsible for tracking the status of the cluster, watching the applied resources, and continuously converging the cluster state toward those resources.
It is worth mentioning that Kubernetes controllers and controller definitions are separate things. Kubernetes controllers are software running in a loop on the master node; they apply whatever a controller definition describes to the cluster.
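The convergence idea can be sketched as a single reconciliation step: compare the desired state with the observed state and work out the actions that close the gap. This is a toy model under simplified assumptions (pods are just names, actions are just strings), not the code real controllers run.

```python
def reconcile(desired_replicas: int, running_pods: list[str]) -> list[str]:
    """Return the actions needed to move the observed state to the desired state."""
    actions: list[str] = []
    diff = desired_replicas - len(running_pods)
    if diff > 0:
        # Too few pods: create the missing ones.
        actions += ["create pod"] * diff
    elif diff < 0:
        # Too many pods: delete the surplus, here simply the last ones in the list.
        actions += [f"delete {name}" for name in running_pods[diff:]]
    return actions


print(reconcile(3, ["web-a"]))                    # ['create pod', 'create pod']
print(reconcile(1, ["web-a", "web-b", "web-c"]))  # ['delete web-b', 'delete web-c']
print(reconcile(2, ["web-a", "web-b"]))           # []
```

A controller repeats a comparison like this over and over, which is why the cluster drifts back to the applied definition after a pod disappears.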
The controllers best known to end users include the Deployment and the StatefulSet.
Each of these controllers is responsible for running pods.
A pod is an abstraction layer around the container and has these properties:
- Each pod has a unique IP address and is the entry point to the container running on the cluster
- Each pod can run one or many containers
- Each pod gets a new IP address when it is recreated by any event on the cluster
To sum up: a pod runs the container payload in which the application is packaged, and it has network capabilities because it has an IP address.
To bring it closer: you tell Kubernetes "here is the container image I want to run on the cluster, please run it". Kubernetes creates the pod, handles the networking, and contacts the container runtime to start running the container image.
A pod by itself is not that useful. Why? If we create a Pod object directly through the Kubernetes API and it is deleted, it is gone for good: nothing recreates it. That is why we have controllers.
For example, the Deployment controller, with the help of a ReplicaSet, ensures that the specified number of pods is running at any moment. Each time a pod is recreated, it gets a new name with a random suffix.
A Deployment lets you define the number of replicas (the number of pods) that it will create.
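A Deployment definition with a replica count could look roughly like the sketch below, built here as a Python dictionary. The helper function deployment, the name "web", and the image tag are assumptions made for the example.

```python
import json


def deployment(name: str, image: str, replicas: int) -> dict:
    """Build a minimal Deployment definition (illustrative helper)."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name},
        "spec": {
            "replicas": replicas,                 # number of pods to keep running
            "selector": {"matchLabels": labels},  # which pods belong to this Deployment
            "template": {                         # the pod that gets replicated
                "metadata": {"labels": labels},
                "spec": {"containers": [{"name": name, "image": image}]},
            },
        },
    }


web = deployment("web", "nginx:1.25", replicas=3)
print(json.dumps(web, indent=2))
```

Note that the selector labels match the labels in the pod template; that match is how the Deployment recognizes its own pods.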
Another type of controller is the StatefulSet. A StatefulSet ensures that pods keep the same name when they are recreated, and that they are created in the same order as the first time.
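The difference in naming can be sketched as follows. StatefulSet pods get a stable ordinal suffix, while the random-suffix function is a deliberate simplification of how Deployment-managed pods are named, not the real generator. Both function names are hypothetical.

```python
import random
import string


def statefulset_pod_names(name: str, replicas: int) -> list[str]:
    """StatefulSet pods get a stable ordinal suffix: <name>-0, <name>-1, ..."""
    return [f"{name}-{i}" for i in range(replicas)]


def deployment_style_pod_name(name: str, rng: random.Random) -> str:
    """Imitate a randomly suffixed pod name (simplified, for illustration only)."""
    alphabet = string.ascii_lowercase + string.digits
    suffix = "".join(rng.choice(alphabet) for _ in range(5))
    return f"{name}-{suffix}"


print(statefulset_pod_names("db", 3))  # ['db-0', 'db-1', 'db-2']
print(deployment_style_pod_name("web", random.Random()))  # different on every run
```

Because "db-0" is always "db-0", other components can rely on that identity, which is what makes StatefulSets a fit for stateful applications.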
Many other resources (though not all) gravitate around the pods: ServiceAccounts, Secrets, ConfigMaps, PersistentVolumeClaims, Limits, and so on.
To sum up: the pod is the basic running unit in which containers run. A pod has an IP address over which the running container can be reached, along with storage, compute and memory resources, and everything else that specific pod needs. Pods are abstracted by a controller, which handles the (re)creation and horizontal scaling of the pods. Many other Kubernetes components gravitate around the pods, and their configuration takes effect when it is applied to specific pods.
ConfigMaps and Secrets
ConfigMaps and Secrets are a way to externalize configuration and secret credentials.
ConfigMaps and Secrets are created separately from the pods and exist on their own, with no connection to the pods or to the controllers running the pods. Once they are created and visible in the cluster, the end user needs to alter the definition of the Pod or controller to reference the ConfigMap or Secret. That reference is what makes the values available to the running container.
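The sketch below shows the three pieces involved: a ConfigMap, a Secret, and a Pod that references both by name. All object names, keys, and values are made up for the example; the definitions are plain Python dictionaries printed as JSON.

```python
import base64
import json

config_map = {
    "apiVersion": "v1",
    "kind": "ConfigMap",
    "metadata": {"name": "app-config"},
    "data": {"LOG_LEVEL": "info"},
}

# Values under "data" in a Secret are base64-encoded (an encoding, not encryption).
secret = {
    "apiVersion": "v1",
    "kind": "Secret",
    "metadata": {"name": "app-secret"},
    "type": "Opaque",
    "data": {"DB_PASSWORD": base64.b64encode(b"example-password").decode("ascii")},
}

# The Pod references both objects by name; each key becomes an environment variable.
pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "app"},
    "spec": {
        "containers": [
            {
                "name": "app",
                "image": "nginx:1.25",
                "envFrom": [
                    {"configMapRef": {"name": config_map["metadata"]["name"]}},
                    {"secretRef": {"name": secret["metadata"]["name"]}},
                ],
            }
        ]
    },
}

for obj in (config_map, secret, pod):
    print(json.dumps(obj, indent=2))
```

The ConfigMap and the Secret know nothing about the Pod; the only link is the name the Pod puts in configMapRef and secretRef.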
A namespace in Kubernetes means the same as in many other fields of software development: a logical group of Kubernetes components that is isolated from other namespaces. For example, two pods in the same namespace can't have the same name, while pods in different namespaces can.
The namespace named default is created when the cluster is installed. The namespace kube-system is where software critical to the operation of the cluster is located, for example kube-proxy.
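The naming rule can be modeled with a toy registry in which a pod is identified by the pair of namespace and name. ToyCluster and NameConflict are hypothetical names for this illustration; this is not a real Kubernetes client.

```python
class NameConflict(Exception):
    """Raised when a pod name is already taken inside a namespace."""


class ToyCluster:
    """A toy in-memory registry of pods, keyed by (namespace, name)."""

    def __init__(self) -> None:
        self._pods: set[tuple[str, str]] = set()

    def create_pod(self, name: str, namespace: str = "default") -> None:
        key = (namespace, name)
        if key in self._pods:
            raise NameConflict(f"pod {name!r} already exists in namespace {namespace!r}")
        self._pods.add(key)


cluster = ToyCluster()
cluster.create_pod("web")             # lands in the "default" namespace
cluster.create_pod("web", "staging")  # same name, different namespace: accepted
try:
    cluster.create_pod("web")         # same name, same namespace: rejected
except NameConflict as err:
    print(err)
```

The "staging" namespace is an assumed example; any namespace other than default would behave the same way in this model.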
These are the main concepts that need to be introduced before one can start using Kubernetes at a basic level and understand what is happening and why it happens the way it does.
The reader should now have a basic understanding of what Kubernetes is, how it runs containers, and what the most basic components are.
The Kubernetes ecosystem is big, and every sub-concept is a story in itself.
Read more in detail
If you are interested in more detail on specific categories, the list of articles below explains different Kubernetes concepts.