Creating diagrams from the Kubernetes API

Using Python, Diagrams and JupyterLab

Frederico Muñoz

Why?

Writing documentation in text-based formats makes many things easier, mainly because it fits in with the rest of the toolset already in use:

  • Ability to use all the functionalities of SCM systems: merges, forks, pull requests, history, reversion, etc.
  • Easy to automate the production and final output, from PDF documents to HTML pages, all from the same source.
  • Integration of scripts and code.

For the last point, Jupyter notebooks are a common choice (especially for Python). Here I demonstrate the use of the official Kubernetes client library for Python to produce this presentation, along with some very simple diagrams of a Kubernetes cluster (running on minikube) with a simple application deployed.

Above all, this approach allows the creation of up-to-date documentation, since the sections that are supposed to reflect what exists are updated automatically. This in turn reduces the problems caused by out-of-date documentation that is always playing catch-up with changes.

Setting things up

This is straight from the official Kubernetes client for Python documentation. For it to work, kubectl should be working correctly in the desired context, e.g. kubectl config view should work.

In [344]:
from kubernetes import client, config
config.load_kube_config()
v1 = client.CoreV1Api()
display(v1)
<kubernetes.client.api.core_v1_api.CoreV1Api at 0x7f9ae7fec470>
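A quick way to confirm which context the client will pick up is to ask kubectl directly. The following is a minimal sketch, not part of the original notebook: the helper name current_kube_context is hypothetical, and it only assumes that the standard kubectl config current-context command is available.

```python
import shutil
import subprocess


def current_kube_context(timeout=10):
    """Return the active kubectl context name, or None if it cannot be read."""
    if shutil.which("kubectl") is None:
        return None  # kubectl is not on PATH
    try:
        result = subprocess.run(
            ["kubectl", "config", "current-context"],
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    if result.returncode != 0:
        return None  # e.g. no current context is set
    return result.stdout.strip() or None


context = current_kube_context()
print(context if context else "no usable kubectl context found")
```

If this prints a context other than the intended one, switching it with kubectl before calling config.load_kube_config() avoids documenting the wrong cluster.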

Nodes

Let's start by creating a dataframe with some basic information on the nodes.

In [345]:
import collections
import pandas as pd

nodes = v1.list_node(watch=False)
node_list = []
for i in nodes.items:
    node_dict = collections.OrderedDict()
    node_dict["Name"]= i.metadata.name
    node_dict["OS"]= i.status.node_info.operating_system
    node_dict["OS image"]= i.status.node_info.os_image
    node_dict["Runtime"]= i.status.node_info.container_runtime_version
    node_list.append(node_dict)

nodes_df = pd.DataFrame(node_list)
display(nodes_df)
   Name           OS     OS image              Runtime
0  multinode      linux  Buildroot 2020.02.10  docker://20.10.4
1  multinode-m02  linux  Buildroot 2020.02.10  docker://20.10.4

We set up some global attributes to be used later; these are Graphviz attributes that determine how the graphs look.

In [346]:
graph_attr = {
    "layout": "fdp",
}
cluster_attr = {
    "fontsize": "12",
    "style": "filled",
    "fillcolor": "grey97"
}
node_attr = {
    "fontsize": "8"
}

We also do all the imports right away to make the code shorter in later blocks.

In [347]:
from diagrams import Cluster, Diagram
from diagrams.k8s.clusterconfig import HPA
from diagrams.k8s.compute import Deployment, Pod, ReplicaSet
from diagrams.k8s.infra import Node, Master
from diagrams.k8s.network import Ingress, Service
from diagrams.k8s.group import Namespace
from diagrams.k8s.podconfig import Secret

About Diagrams

Diagrams is the library we will use to produce the diagrams; from the home page:

Diagrams lets you draw the cloud system architecture in Python code. It was born for prototyping a new system architecture without any design tools. You can also describe or visualize the existing system architecture as well. Diagram as Code allows you to track the architecture diagram changes in any version control system.

It uses Graphviz to build the graphs, so Graphviz needs to be installed as well.
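Because a missing Graphviz installation only shows up when a diagram is rendered, it can be worth checking for it up front. This is a small sketch under the assumption that Graphviz exposes its usual dot executable on PATH; the helper name graphviz_available is hypothetical.

```python
import shutil


def graphviz_available(executable="dot"):
    """Return True if the given Graphviz executable can be found on PATH."""
    return shutil.which(executable) is not None


if not graphviz_available():
    print("Graphviz 'dot' not found on PATH; install Graphviz before rendering diagrams")
```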

Show the nodes

Using the dataframe and Diagrams we can build this simple cluster diagram by iterating through the nodes:

In [357]:
with Diagram(show=False, direction="LR", graph_attr=graph_attr, node_attr=node_attr) as diag:
    with Cluster("k8s cluster", graph_attr=cluster_attr):
        # One Node icon per row of the nodes dataframe
        for idx, node in nodes_df.iterrows():
            Node(node["Name"])
diag
Out[357]:

Namespaces

The cluster has the following namespaces (or "projects"); again, we process the API response to build a dataframe.

In [358]:
ns = v1.list_namespace(watch=False)
ns_list = []
for i in ns.items:
    ns_dict = collections.OrderedDict()
    ns_dict["Name"]= i.metadata.name
    ns_dict["UID"]= i.metadata.uid
    ns_list.append(ns_dict)
ns_df = pd.DataFrame(ns_list)
display(ns_df)
   Name             UID
0  default          0086019d-bc33-4b04-a4e3-4f63f3472621
1  kube-node-lease  6d7de7e1-02ab-4d70-9e4a-0b9237890f73
2  kube-public      4dd3461e-726c-4595-a3dc-d2bbc0748851
3  kube-system      db61c947-b684-4efc-8c45-9a9ed915852d

We can show the namespaces using a similar approach.

In [359]:
with Diagram(show=False, graph_attr=graph_attr, node_attr=node_attr) as diag:
    with Cluster("k8s cluster", graph_attr=cluster_attr):
        # One Namespace icon per row of the namespaces dataframe
        for idx, namespace in ns_df.iterrows():
            Namespace(namespace["Name"])
diag
Out[359]:

Pods

The approach is the same. Note that we are building pandas dataframes because they are useful for other purposes (namely the production of a PDF via LaTeX), but using the dicts directly is possible and would actually be simpler. I'm using a simplistic approach for the label/selector pairing: I simply add an "App" column that contains the value of the app label.

In [360]:
ret = v1.list_pod_for_all_namespaces(watch=False)
pods_list = []
for i in ret.items:
    pods_dict = collections.OrderedDict()
    pods_dict["Name"]= i.metadata.name
    pods_dict["Namespace"]= i.metadata.namespace
    pods_dict["IP"] = i.status.pod_ip
    pods_dict["Node"] = i.status.host_ip
    pods_dict["Start"]= i.status.start_time
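    # labels can be None for pods without any labels
    labels = i.metadata.labels or {}
    pods_dict["App"] = labels.get("app", "")
    # Default to no secret so the column always exists; a non-secret volume
    # no longer overwrites a secret found earlier (the last secret wins)
    pods_dict["Secret"] = ""
    for v in i.spec.volumes or []:
        if v.secret is not None:
            pods_dict["Secret"] = v.secret.secret_name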
    pods_list.append(pods_dict)
pods_df = pd.DataFrame(pods_list)
display(pods_df)
    Name                               Namespace    IP              Node            Start                      App      Secret
 0  web-79d88c97d6-qjwrb               default      10.244.1.2      192.168.99.102  2021-04-15 11:08:42+00:00  web      default-token-xcg4s
 1  coredns-74ff55c5b-22d2c            kube-system  10.244.0.2      192.168.99.101  2021-04-15 10:56:28+00:00           coredns-token-mh9wz
 2  etcd-multinode                     kube-system  192.168.99.101  192.168.99.101  2021-04-15 10:55:54+00:00
 3  kindnet-h6jqn                      kube-system  192.168.99.102  192.168.99.102  2021-04-15 10:56:47+00:00  kindnet  kindnet-token-9f9xk
 4  kindnet-n4h8k                      kube-system  192.168.99.101  192.168.99.101  2021-04-15 10:56:09+00:00  kindnet  kindnet-token-9f9xk
 5  kube-apiserver-multinode           kube-system  192.168.99.101  192.168.99.101  2021-04-15 10:55:54+00:00
 6  kube-controller-manager-multinode  kube-system  192.168.99.101  192.168.99.101  2021-04-15 10:55:54+00:00
 7  kube-proxy-n8jz5                   kube-system  192.168.99.101  192.168.99.101  2021-04-15 10:56:09+00:00           kube-proxy-token-brdgp
 8  kube-proxy-zlnwn                   kube-system  192.168.99.102  192.168.99.102  2021-04-15 10:56:47+00:00           kube-proxy-token-brdgp
 9  kube-scheduler-multinode           kube-system  192.168.99.101  192.168.99.101  2021-04-15 10:55:54+00:00
10  storage-provisioner                kube-system  192.168.99.101  192.168.99.101  2021-04-15 10:56:28+00:00           storage-provisioner-token-kg6jz
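The pod listing above could also be kept as plain dictionaries, skipping pandas entirely. The following is a rough sketch of that alternative, not the notebook's own code: pod_rows and fake_pod are hypothetical names, only a subset of the columns is kept, and the SimpleNamespace object merely stands in for one item of v1.list_pod_for_all_namespaces().items so the example runs without a cluster.

```python
from types import SimpleNamespace


def pod_rows(pods):
    """Turn pod objects into plain dicts holding a subset of the dataframe columns."""
    rows = []
    for pod in pods:
        labels = pod.metadata.labels or {}
        # Names of all secret-backed volumes; the last one is kept, if any
        secrets = [
            v.secret.secret_name
            for v in (pod.spec.volumes or [])
            if v.secret is not None
        ]
        rows.append({
            "Name": pod.metadata.name,
            "Namespace": pod.metadata.namespace,
            "IP": pod.status.pod_ip,
            "Node": pod.status.host_ip,
            "App": labels.get("app", ""),
            "Secret": secrets[-1] if secrets else "",
        })
    return rows


# Hypothetical stand-in for a pod returned by the API (values are made up)
fake_pod = SimpleNamespace(
    metadata=SimpleNamespace(name="web-1", namespace="default", labels={"app": "web"}),
    status=SimpleNamespace(pod_ip="10.0.0.5", host_ip="192.168.0.10"),
    spec=SimpleNamespace(
        volumes=[SimpleNamespace(secret=SimpleNamespace(secret_name="web-token"))]
    ),
)

rows = pod_rows([fake_pod])
# Filtering by namespace becomes a simple list comprehension
default_rows = [r for r in rows if r["Namespace"] == "default"]
print(default_rows)
```

The trade-off is that the tabular display and the groupby-style summaries that pandas provides would have to be done by hand.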

We also get the Services, including their selector.

In [361]:
services = v1.list_service_for_all_namespaces(watch=False)
srv_list = []
for i in services.items:
    srv_dict = collections.OrderedDict()
    srv_dict["Name"]= i.metadata.name
    srv_dict["Type"]= i.spec.type
    srv_dict["IP"]= i.spec.cluster_ip
    if i.spec.selector is not None:
        if "app" in i.spec.selector:
            srv_dict["App"] = i.spec.selector["app"]
    srv_list.append(srv_dict)
srv_df = pd.DataFrame(srv_list)
display(srv_df)
   Name        Type       IP            App
0  kubernetes  ClusterIP  10.96.0.1     NaN
1  web         NodePort   10.99.132.93  web
2  kube-dns    ClusterIP  10.96.0.10    NaN
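The pairing used here only compares the app label. A Service selector is a map of key/value pairs, and a pod is selected when all of those pairs appear among its labels, so a more general check could look like the sketch below. The function name selector_matches and the sample label sets are hypothetical and only serve as illustration.

```python
def selector_matches(selector, labels):
    """Return True if every key/value pair in the selector appears in the labels."""
    if not selector:
        return False  # a Service without a selector does not select pods by label
    labels = labels or {}
    return all(labels.get(key) == value for key, value in selector.items())


# Hypothetical label sets for illustration
print(selector_matches({"app": "web"}, {"app": "web", "pod-template-hash": "79d88c97d6"}))
print(selector_matches({"app": "web", "tier": "frontend"}, {"app": "web"}))
```

With i.spec.selector and i.metadata.labels stored as extra columns, this function could replace the equality test on the "App" column.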

And now we tie it all together: looking at the default namespace (which is where I deployed the application), the pod, service and secret are drawn along with their relationships.

In [362]:
with Diagram(show=False, graph_attr=graph_attr, node_attr=node_attr) as diag:
    with Cluster("ns: default", graph_attr=cluster_attr):
        for idx,pod in pods_df[pods_df["Namespace"] == "default"].iterrows():
            p = Pod(pod["Name"])
            if srv_df[srv_df["App"] == pod["App"]]["Name"].any():
                p >> Service(pod["App"])
            if pod["Secret"]:
                Secret(pod["Secret"]) >> p
diag
Out[362]:

Final words

JupyterLab can export to several different formats, including AsciiDoc, LaTeX and of course Reveal.js, which was used to produce this presentation, on top of the HTML export and the ability to use it interactively.

Since we're in a Jupyter notebook, a lot more can be done: using any of the visualisation libraries (like Seaborn) we can automatically produce plots and statistics about our cluster. This simple example uses matplotlib and Seaborn to show how many pods are running on each node.

In [363]:
import seaborn as sns
sns.set()
sns.set_style("whitegrid")

pods_df[["Name","Node"]].groupby('Node').count().plot(kind="bar",legend=False)
Out[363]:
<matplotlib.axes._subplots.AxesSubplot at 0x7f9af842dc50>
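The numbers behind the bar chart are just a count of pods per node. As a plain-Python sketch of the same aggregation, using collections.Counter on hypothetical rows shaped like the pods dataframe (the names and addresses below are made up):

```python
from collections import Counter

# Hypothetical rows shaped like the pods dataframe; only "Node" matters here
pods = [
    {"Name": "web-1", "Node": "192.168.99.102"},
    {"Name": "etcd", "Node": "192.168.99.101"},
    {"Name": "kube-proxy-a", "Node": "192.168.99.101"},
]

# Count how many pods report each node address
pods_per_node = Counter(pod["Node"] for pod in pods)
for node, count in sorted(pods_per_node.items()):
    print(node, count)
```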