core concepts

  • cluster architecture
  • service & other network primitives
  • api primitives

kubernetes architecture

  • worker nodes

    • host applications
    • kubelet
    • kube-proxy
    • container runtime
      • containerd
      • cri-o
      • docker
  • master

    • etcd
    • kube-scheduler
    • node-controller
    • replication-controller
    • kube-apiserver

etcd

a distributed reliable key-value store that is simple, secure & fast

  • key-value store
    • keys are unique - each key maps to exactly one value

installation

  1. download binaries
  2. extract
  3. run the executable

etcd will listen on port 2379 by default
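
for example (the version number is just an example - check the etcd releases page for the current one):

> curl -LO https://github.com/etcd-io/etcd/releases/download/v3.5.9/etcd-v3.5.9-linux-amd64.tar.gz
> tar xzvf etcd-v3.5.9-linux-amd64.tar.gz
> ./etcd-v3.5.9-linux-amd64/etcd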

etcdctl is used to handle keys and values

etcd in kubernetes

all information we get via the kubectl get command is read from etcd

  • nodes
  • pods
  • configs
  • secrets
  • accounts
  • roles
  • bindings
  • others

every change is written to the etcd server

there are 2 ways to install etcd

  • manual setup
    • configure it using the advertise client url (see the sketch below)
  • kubeadm setup
    • kubeadm creates a pod which contains the etcd server

etcdctl is the cli tool to interact with etcd
etcdctl can interact using api version 2 and version 3
by default etcdctl uses version 2 - set the ETCDCTL_API environment variable to 3 to use the version 3 api

etcd commands

# version 2
> etcdctl backup
> etcdctl cluster-health
> etcdctl mk
> etcdctl mkdir
> etcdctl set

# version 3
> etcdctl snapshot save 
> etcdctl endpoint health
> etcdctl get
> etcdctl put
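
for example, switching to the v3 api and storing a key (key and value are examples):

> export ETCDCTL_API=3
> etcdctl put key1 value1
OK
> etcdctl get key1
key1
value1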

kube-apiserver

the primary management component in kubernetes - instead of going through kubectl, the api server can also be reached directly by sending rest requests (e.g. a post request)
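
one way to try this is to query the api server directly through kubectl proxy (namespace is an example):

> kubectl proxy --port=8001 &
> curl http://localhost:8001/api/v1/namespaces/default/pods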

responsible for

  1. authenticate user
  2. validate request
  3. retrieve data
  4. update etcd
  5. interact with the scheduler
  6. interact with the kubelet

the api-server is the only component that communicates with etcd directly

important options in the kube-apiserver service are

  • --etcd-servers
  • --etcd-cafile
  • --etcd-certfile
  • --etcd-keyfile
  • --kubelet-certificate-authority
  • --kubelet-client-certificate
  • --kubelet-client-key
  • --kubelet-https

if installed via kubeadm, the kube-apiserver is deployed as a pod on the master node. in this case the api-server configuration file is located at /etc/kubernetes/manifests/kube-apiserver.yaml. on a non-kubeadm setup, inspect the systemd unit file instead - ps aux | grep kube-apiserver also shows the options actually in use
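
for example, on a kubeadm cluster:

> cat /etc/kubernetes/manifests/kube-apiserver.yaml
> ps aux | grep kube-apiserver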

controller manager

continuously monitors the status of the components in the system and works to bring the whole system into a functional state.

node-controller - receives a heartbeat from each node every 5 seconds (via the api-server)

  • node monitor period = 5sec
  • node monitor grace period = 40sec
  • pod eviction timeout = 5min
    • if a node does not come back within these timeouts, the node controller removes the pods from it and - relying on the replicaset - rolls the pods out on another node

replication controller - monitors the status of the replicasets and ensures the given number of pods is available

there are also

  • deployment-controller
  • namespace-controller
  • endpoint-controller
  • cronjob
  • job-controller
  • pv-protection-controller
  • service-account-controller
  • stateful-set
  • replicaset
  • pv-binder-controller

all of these are managed by the kube-controller-manager - the kube-controller-manager is set up via a systemd unit file as well

in this unit file we can specify

  • --node-monitor-period
  • --node-monitor-grace-period
  • --pod-eviction-timeout

with the option --controllers we can choose which controllers are enabled - by default all controllers are enabled
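
a minimal sketch of these flags in the unit file (values are the defaults mentioned above):

ExecStart=/usr/local/bin/kube-controller-manager \
  --node-monitor-period=5s \
  --node-monitor-grace-period=40s \
  --pod-eviction-timeout=5m0s \
  --controllers=*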

if installed via kubeadm, the kube-controller-manager is deployed as a pod on the master node. in this case its configuration file is located at /etc/kubernetes/manifests/kube-controller-manager.yaml. on a non-kubeadm setup, inspect the systemd unit file instead - ps aux | grep kube-controller-manager also shows the options actually in use

kube-scheduler

responsible for scheduling pods on nodes

the kube-scheduler only decides which pod goes on which node - it does not place the pod on the node itself, that is the kubelet's job

in a multi-node setup with differently sized hardware, this lets us dedicate the right nodes to the applications that run on them

high level workflow

  1. filter out nodes that don't fit the pod's profile - e.g. nodes without 10 free cpus
  2. rank the remaining nodes to get the best node for the pod - e.g. the one left with the most free cpus
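
the filtering and ranking is driven by the pod's resource requests - a minimal sketch (name and values are examples):

apiVersion: v1
kind: Pod
metadata:
  name: big-app
spec:
  containers:
  - name: app
    image: nginx
    resources:
      requests:
        cpu: "10"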

if installed via kubeadm, the kube-scheduler is deployed as a pod on the master node. in this case its configuration file is located at /etc/kubernetes/manifests/kube-scheduler.yaml. on a non-kubeadm setup, inspect the systemd unit file instead - ps aux | grep kube-scheduler also shows the options actually in use

kubelet

the kubelet - which runs on the worker nodes - registers the node with the k8s cluster. when it gets an instruction - from the scheduler via the api-server - it requests the container runtime to pull the required image and run an instance of it. the kubelet also monitors the state of the node and its pods

kubeadm does NOT deploy the kubelet - it must always be installed manually on the nodes

to see its configuration, inspect the systemd unit file - ps aux | grep kubelet also shows the options actually in use

kube proxy

is a process that runs on each node in the cluster. its job is to look for new services and create the appropriate rules on each node to forward traffic for those services to their backend pods. one way to do this is using iptables: traffic heading to the ip of the service gets forwarded to the ip of the pod.
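
conceptually, such a rule looks like this simplified sketch (real rules live in dedicated KUBE-* chains; ips and ports are examples):

> iptables -t nat -A PREROUTING -p tcp -d 10.96.0.12 --dport 3306 -j DNAT --to-destination 10.244.1.2:3306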

if installed via kubeadm, kube-proxy is deployed as a daemonset, i.e. one pod on each node in the cluster

to install this service by hand you have to download the binary and create a systemd unit with all needed options for it - it then runs as a plain process on each node rather than as a daemonset pod.

pod

a single instance of an application. k8s does not deploy containers directly on a node - the containers are encapsulated into an object called a pod. it is the smallest object you can create in a k8s cluster. pods have a 1-to-1 relationship with the containers running our application. to scale up you add additional pods to your cluster - and additional nodes for your pods if needed. what you do not do to scale an application is add more containers to a pod. a pod can still contain more than one container, for example when an application needs a “helper container”. this is helpful when a container stays in a direct relationship with another one: when one gets updated, deleted or scaled, the other one does too. these containers share the same network namespace and storage. multiple containers in a pod are a rare use case.

this example shows how to create a pod using the cli

> kubectl run nginx --image nginx

this example shows how to create a pod using a yml file.

apiVersion: v1
kind: Pod
metadata:
  name: myapp-pod
  labels:
    apps: myapp
    type: front-end

spec:
  containers:
  - name: nginx-container
    image: nginx

under metadata, k8s accepts only predefined fields.
under labels, you can define as many key/value pairs as you like.

to run this yml use kubectl create -f my-pod.yml

to check the pod use kubectl describe pod myapp-pod

whether you run kubectl create or apply doesn't matter here - they do the same; apply can additionally update existing objects

Controllers

replication controller

the rc takes care of our deployment or our pod. it can scale the pods or ensure that a given number of instances is available. however, it is the older specification of controllers - the new recommended way is to use its successor, the replicaset.

replicaset

the rs is very similar to the rc. unlike the rc, the rs can also manage pods which were not created as part of the replication - it selects pods by label, which makes handling many pods very handy.

to update an rs you can use kubectl replace -f foo.yml, or use the scale command: kubectl scale --replicas=6 -f foo.yml (note that scale does not change the replica count inside the file itself)
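
a minimal replicaset definition showing the label selector (names are examples):

apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: myapp-rs
spec:
  replicas: 3
  selector:
    matchLabels:
      type: front-end
  template:
    metadata:
      labels:
        type: front-end
    spec:
      containers:
      - name: nginx-container
        image: nginx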

for more details have a look at the replication part in the k8s beginner section.

deployments

deployments sit above replicasets, which sit above pods, which sit above containers. deployments are useful for rolling upgrades since they instruct replicasets to ensure a given number of pods is available. so if we upgrade our pods - or rather the containers inside the pods - with the rolling upgrade strategy and something goes wrong so that a pod won't come up again, we don't lose availability: the deployment takes the old pods down and starts the new ones one by one, while the rs ensures that a given number of pods is always available.
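
a short sketch of a rolling upgrade and a rollback (names and image tag are examples):

> kubectl create -f deployment.yml
> kubectl set image deployment/myapp-deployment nginx-container=nginx:1.25
> kubectl rollout status deployment/myapp-deployment
> kubectl rollout undo deployment/myapp-deployment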

for more details have a look at the deployments part in the k8s beginner section.

namespaces

per default k8s creates a few namespaces. one of them is the default namespace, in which we run our whole deployment. another one created by k8s is the kube-system namespace, in which all core parts of k8s run - this prevents an accidental deletion of the whole setup. a third one is called kube-public, where resources that should be made available to all users live. on bigger installations we can make use of more than the default namespace to create isolated environments like dev, staging and production - so while working on dev you cannot accidentally modify resources in production. you can call from one ns into another by appending the name of the ns to the hostname you want to call; within the same ns you reach other pods by their hostnames only. example address: servicename.namespace.svc.cluster.local
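
for example, reaching a database service from another namespace (the service name db-service and the namespace dev are examples):

# same namespace
> mysql -h db-service
# from another namespace
> mysql -h db-service.dev.svc.cluster.local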

to start a deployment or a single pod in another ns, just append e.g. --namespace=dev to the kubectl create command. you can also do this by adding namespace: dev under the metadata section of a definition file, alongside name and labels.

...
metadata:
  name: myapp-deployment
  namespace: dev
  labels:
    app: myapp
    type: front-end
...

to create a namespace write a config for it

apiVersion: v1
kind: Namespace
metadata:
  name: dev

or just do a create command

> kubectl create namespace dev

you can also make use of quotas in a namespace. to do this you create a resourcequota with a hard entry under the spec section, in which all limits are listed.
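
a minimal sketch of such a quota (values are examples):

apiVersion: v1
kind: ResourceQuota
metadata:
  name: compute-quota
  namespace: dev
spec:
  hard:
    pods: "10"
    requests.cpu: "4"
    requests.memory: 5Gi
    limits.cpu: "10"
    limits.memory: 10Gi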

services

services enable communication between various components inside and outside of the application. you can connect applications with each other (front end to back end), expose applications to the web, or link them with other services like databases.
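
a minimal service definition (names, selector and ports are examples):

apiVersion: v1
kind: Service
metadata:
  name: back-end
spec:
  type: ClusterIP
  selector:
    app: myapp
  ports:
  - port: 80
    targetPort: 80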

for more details have a look at the services part in the k8s beginner section.