Kubernetes Service Discovery for a multi-pod design

Service Discovery

In the previous post, we saw services in action when we created a NodePort service to give external clients access to a web server pod running in the cluster. It’s time to see how services are useful within the cluster.

We’ll split our example microservices application into three pods, one for each tier.

Previously we relied on the fact that containers in the same pod can communicate with each other over localhost. In a multi-pod design this won’t work, and this is where Services come in. In the previous post I mentioned that there is a better way to connect the tiers; this is it. Services are what make a multi-pod design work.

Services provide a static endpoint to access pods in each tier.

We could use the individual pod IP addresses on the container network directly, but the application would break whenever a pod restarted with a different IP address.

An added benefit of services is that they distribute load across the selected group of pods, which lets us scale the application tier out across multiple server pods. To realize these benefits we need a data tier service in front of the redis pod and an application tier service in front of the server pod.

Service Discovery mechanisms

There are two service discovery mechanisms built into Kubernetes.

  • The first is environment variables and
  • the second is DNS.

Environment Variables: Kubernetes automatically injects environment variables into containers. These variables provide the addresses needed to reach services, and they follow a naming convention, so all you need to know to access a service is its name.

DNS: Kubernetes also constructs DNS records based on the service name, and containers are automatically configured to query the cluster’s DNS to discover services.

Example

We’ll start with a new namespace to organize the resources for this demo. It is called service-discovery. The namespace manifest is shown here:

apiVersion: v1
kind: Namespace
metadata:
  name: service-discovery
  labels:
    app: counter

Create the namespace:

ubuntu@ip-10-0-128-5:~/src# kubectl create -f 4.1-namespace.yaml
namespace/service-discovery created
ubuntu@ip-10-0-128-5:~/src#

Create the data tier

Moving on to the data tier, we have a manifest that includes multiple resources. The YAML format allows us to declare multiple resources by separating them with three hyphens ---. It’s possible to cram all the pods and services into one file but separating them by tier mimics the way we want to manage each tier independently.

apiVersion: v1
kind: Service
metadata:
  name: data-tier
  labels:
    app: microservices
spec:
  ports:
  - port: 6379
    protocol: TCP # default
    name: redis # optional when only 1 port
  selector:
    tier: data
  type: ClusterIP # default
---
apiVersion: v1
kind: Pod
metadata:
  name: data-tier
  labels:
    app: microservices
    tier: data
spec:
  containers:
    - name: redis
      image: redis:latest
      imagePullPolicy: IfNotPresent
      ports:
        - containerPort: 6379

We have a service and our redis pod. Both are named data-tier.

The pod has a tier label which is used by the service as its selector.

In our example we only have one microservice in the data tier, but that won’t be the case in general. You can include as many labels as necessary in the selector to select just what you need. We can get by with just the one label selector in our case.
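
To get a feel for how several labels narrow a selection, the sketch below joins key=value labels into the comma-separated form accepted by kubectl’s -l (--selector) flag, where a comma means "and". The helper name join_labels is made up for this illustration; the labels in the usage line come from the manifest above.

```shell
# join_labels is a hypothetical helper: it joins key=value labels with commas,
# the form kubectl's -l/--selector flag expects (comma means logical AND).
join_labels() {
  selector=""
  for label in "$@"; do
    if [ -z "$selector" ]; then
      selector="$label"
    else
      selector="$selector,$label"
    fi
  done
  printf '%s\n' "$selector"
}

# Usage (requires access to the cluster):
#   kubectl get pods -n service-discovery -l "$(join_labels app=microservices tier=data)"
```

Listing pods with the same labels you plan to put in a service selector is a quick way to check which pods the service would pick up.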

Services can also publish more than one port which makes naming the ports mandatory to identify them.

We only have one so the name is optional.

Lastly we set the type to ClusterIP, which is the default so the line could be omitted. ClusterIP creates a virtual IP inside the cluster for internal access only.

Time to create the data tier:

ubuntu@ip-10-0-128-5:~/src# kubectl create -f 4.2-data_tier.yaml -n service-discovery
service/data-tier created
pod/data-tier created
ubuntu@ip-10-0-128-5:~/src#

The command is the same regardless of how many resources are specified in the file. The resources are created in the order they are listed.
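
If you want to preview what a multi-resource manifest declares, and in what order, something like the following sketch can help. list_kinds is a hypothetical helper, and it assumes every resource has an unindented top-level kind: line, as the manifests in this post do.

```shell
# list_kinds (hypothetical helper) prints the kind of every resource in a
# manifest, in file order. It assumes top-level "kind:" lines are unindented.
list_kinds() {
  awk '/^kind:/ { print $2 }' "$1"
}

# Usage:
#   list_kinds 4.2-data_tier.yaml
```

For the data tier manifest this prints Service followed by Pod, matching the order in which kubectl reports the resources as created.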

Check that the pod is running with kubectl get pods -n service-discovery. Notice that if we don’t specify the namespace we won’t see the pod, because kubectl then lists the pods in the default namespace.
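
If typing -n on every command gets tedious, one option is to change the namespace of the current kubectl context. The wrapper below is only a sketch; use_namespace is a name I made up, while the kubectl config set-context command it runs is a standard one. The rest of this post keeps passing -n explicitly.

```shell
# use_namespace (hypothetical helper) makes the given namespace the default
# for the current kubectl context, so later commands can omit -n.
use_namespace() {
  kubectl config set-context --current --namespace="$1"
}

# Usage (requires access to the cluster):
#   use_namespace service-discovery
#   kubectl get pods
```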

ubuntu@ip-10-0-128-5:~/src# kubectl get pods
No resources found.
ubuntu@ip-10-0-128-5:~/src# kubectl get pods -n service-discovery
NAME        READY   STATUS    RESTARTS   AGE
data-tier   1/1     Running   0          103s
ubuntu@ip-10-0-128-5:~/src#

Then describe the service. Notice that the service named data-tier has a ClusterIP of 10.105.61.13 and one endpoint, which corresponds to the redis pod selected by the service.

ubuntu@ip-10-0-128-5:~/src# kubectl describe service -n service-discovery data-tier
Name:              data-tier
Namespace:         service-discovery
Labels:            app=microservices
Annotations:       <none>
Selector:          tier=data
Type:              ClusterIP
IP:                10.105.61.13
Port:              redis  6379/TCP
TargetPort:        6379/TCP
Endpoints:         192.168.28.193:6379
Session Affinity:  None
Events:            <none>
ubuntu@ip-10-0-128-5:~/src#

Create the App Tier

Now let’s move on to the app tier. Again we have a service and a pod.

apiVersion: v1
kind: Service
metadata:
  name: app-tier
  labels:
    app: microservices
spec:
  ports:
  - port: 8080
  selector:
    tier: app
---
apiVersion: v1
kind: Pod
metadata:
  name: app-tier
  labels:
    app: microservices
    tier: app
spec:
  containers:
    - name: server
      image: lrakai/microservices:server-v1
      ports:
        - containerPort: 8080
      env:
        - name: REDIS_URL
          # Environment variable service discovery
          # Naming pattern:
          #   IP address: <all_caps_service_name>_SERVICE_HOST
          #   Port: <all_caps_service_name>_SERVICE_PORT
          #   Named Port: <all_caps_service_name>_SERVICE_PORT_<all_caps_port_name>
          value: redis://$(DATA_TIER_SERVICE_HOST):$(DATA_TIER_SERVICE_PORT_REDIS)
          # In multi-container example value was
          # value: redis://localhost:6379

The service selects the pods with the tier: app label, matching the server pod declaration.

The pod spec is the same as before with one exception: the value of the REDIS_URL environment variable is now built from the environment variables that Kubernetes sets for service discovery.

The value used to be redis://localhost:6379, but now we need to reach the data tier service.

Kubernetes makes separate environment variables available for the service address and port. The service ClusterIP address is in a variable whose name is the service name in all capital letters, with hyphens replaced by underscores, followed by _SERVICE_HOST.

  • IP address: <all_caps_service_name>_SERVICE_HOST
  • Port: <all_caps_service_name>_SERVICE_PORT
  • Named Port: <all_caps_service_name>_SERVICE_PORT_<all_caps_port_name>

By knowing the service name you can construct that environment variable name to discover the service IP address.

In our example, that is DATA_TIER_SERVICE_HOST. If the port has a name, you can also append an underscore and the port name, again in all caps with hyphens replaced by underscores, which gives DATA_TIER_SERVICE_PORT_REDIS in our example. The data tier service only declares one port, so the appended name is optional. As a best practice, append the port name anyway so that the pod tolerates ports being added to the service in the future.
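
The naming convention is mechanical enough to express in a few lines of shell. This is a minimal sketch of the rule as described above; the helper names to_env_prefix and service_env_names are made up for this post and are not part of kubectl.

```shell
# to_env_prefix (hypothetical helper) upper-cases a name and replaces hyphens
# with underscores, following the naming convention described above.
to_env_prefix() {
  printf '%s' "$1" | tr 'a-z' 'A-Z' | sed 's/-/_/g'
}

# service_env_names (hypothetical helper) prints the discovery variable names
# for a service. The second argument, a port name, is optional.
service_env_names() {
  prefix=$(to_env_prefix "$1")
  printf '%s_SERVICE_HOST\n' "$prefix"
  printf '%s_SERVICE_PORT\n' "$prefix"
  if [ -n "${2:-}" ]; then
    printf '%s_SERVICE_PORT_%s\n' "$prefix" "$(to_env_prefix "$2")"
  fi
}

# Usage:
#   service_env_names data-tier redis
```

For the data tier service and its redis port this yields DATA_TIER_SERVICE_HOST, DATA_TIER_SERVICE_PORT and DATA_TIER_SERVICE_PORT_REDIS, the names used in the app tier manifest.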

When using environment variables in the value field, you need to enclose the variable name in parentheses and precede it with a dollar sign: $(DATA_TIER_SERVICE_HOST):$(DATA_TIER_SERVICE_PORT_REDIS)

This syntax marks the reference as one that Kubernetes itself expands when it creates the container, which distinguishes it from shell-style $VAR references that would only be resolved inside the container.

When using environment variables for service discovery, the service must be created before the pod. Kubernetes does not update the environment variables of running containers; they are only set at startup.

The service must also be in the same namespace as the pod for the environment variables to be available.
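
One way to check that a pod actually received the variables is to print its environment and filter by the service prefix. The sketch below wraps that in a hypothetical helper called discovery_env. It assumes the container image ships an env binary, which I have not verified for the images used in this post.

```shell
# discovery_env (hypothetical helper) lists the environment variables that a
# running container received for one service.
# Arguments: namespace, pod, container, service name.
# Assumption: the container image provides an `env` binary.
discovery_env() {
  prefix=$(printf '%s' "$4" | tr 'a-z' 'A-Z' | sed 's/-/_/g')
  kubectl exec -n "$1" "$2" -c "$3" -- env | grep "^${prefix}_"
}

# Usage (requires access to the cluster):
#   discovery_env service-discovery app-tier server data-tier
```

If the pod was started before the service existed, or the service lives in another namespace, the filter should come back empty.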

Let’s create the app tier now:

ubuntu@ip-10-0-128-5:~/src$ kubectl create -f 4.3-app_tier.yaml -n service-discovery
service/app-tier created
pod/app-tier created
ubuntu@ip-10-0-128-5:~/src$

ubuntu@ip-10-0-128-5:~/src$ kubectl describe -n service-discovery service app-tier
Name:              app-tier
Namespace:         service-discovery
Labels:            app=microservices
Annotations:       <none>
Selector:          tier=app
Type:              ClusterIP
IP:                10.96.245.156
Port:              <unset>  8080/TCP
TargetPort:        8080/TCP
Endpoints:         192.168.203.66:8080
Session Affinity:  None
Events:            <none>
ubuntu@ip-10-0-128-5:~/src$
ubuntu@ip-10-0-128-5:~/src$ kubectl describe -n service-discovery pod app-tier
Name:         app-tier
Namespace:    service-discovery
Priority:     0
Node:         ip-10-0-17-1.us-west-2.compute.internal/10.0.17.1
Start Time:   Tue, 28 Apr 2020 01:42:30 +0000
Labels:       app=microservices
              tier=app
Annotations:  <none>
Status:       Running
IP:           192.168.203.66
Containers:
  server:
    Container ID:   docker://8268f5e53ae6bfb1acbab78f8932bbaaac8c9934e00a7bb5603e0b4e95986b75
    Image:          lrakai/microservices:server-v1
    Image ID:       docker-pullable://lrakai/microservices@sha256:9e3e3c45bb9d950fe7a38ce5e4e63ace2b6ca9ba8e09240f138c5df39d7b7587
    Port:           8080/TCP
    Host Port:      0/TCP
    State:          Running
      Started:      Tue, 28 Apr 2020 01:42:36 +0000
    Ready:          True
    Restart Count:  0
    Environment:
      REDIS_URL:  redis://$(DATA_TIER_SERVICE_HOST):$(DATA_TIER_SERVICE_PORT_REDIS)
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-rjbsl (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             True
  ContainersReady   True
  PodScheduled      True
Volumes:
  default-token-rjbsl:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-rjbsl
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type    Reason     Age   From                                              Message
  ----    ------     ----  ----                                              -------
  Normal  Scheduled  2m9s  default-scheduler                                 Successfully assigned service-discovery/app-tier to ip-10-0-17-1.us-west-2.compute.internal
  Normal  Pulling    2m7s  kubelet, ip-10-0-17-1.us-west-2.compute.internal  Pulling image "lrakai/microservices:server-v1"
  Normal  Pulled     2m3s  kubelet, ip-10-0-17-1.us-west-2.compute.internal  Successfully pulled image "lrakai/microservices:server-v1"
  Normal  Created    2m3s  kubelet, ip-10-0-17-1.us-west-2.compute.internal  Created container server
  Normal  Started    2m3s  kubelet, ip-10-0-17-1.us-west-2.compute.internal  Started container server
ubuntu@ip-10-0-128-5:~/src$

Create the support tier

Now on to the support tier. We don’t need a service for this tier; a single pod containing the counter and poller containers used before will do.

apiVersion: v1
kind: Pod
metadata:
  name: support-tier
  labels:
    app: microservices
    tier: support
spec:
  containers:

    - name: counter
      image: lrakai/microservices:counter-v1
      env:
        - name: API_URL
          # DNS for service discovery
          # Naming pattern:
          #   IP address: <service_name>.<service_namespace>
          #   Port: needs to be extracted from SRV DNS record
          value: http://app-tier.service-discovery:8080

    - name: poller
      image: lrakai/microservices:poller-v1
      env:
        - name: API_URL
          # omit namespace to only search in the same namespace
          value: http://app-tier:$(APP_TIER_SERVICE_PORT)

This time we use DNS for service discovery of the app tier service. Kubernetes adds DNS A records for every service. The service DNS names follow the pattern <service_name>.<service_namespace>, which is app-tier.service-discovery in our example. If the service is in the same namespace as the client, the service name alone is enough.
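
The short names are abbreviations of a fully qualified name that also contains svc and the cluster domain. The sketch below prints all three forms. service_dns_names is a hypothetical helper, and the cluster.local default is an assumption: it is the usual cluster domain, but a cluster can be configured with a different one.

```shell
# service_dns_names (hypothetical helper) prints the names under which a
# service can be looked up, from shortest to fully qualified.
# Arguments: service name, namespace, optional cluster domain.
# Assumption: the cluster domain is cluster.local unless given.
service_dns_names() {
  domain=${3:-cluster.local}
  printf '%s\n' "$1"                           # resolves within the same namespace
  printf '%s.%s\n' "$1" "$2"                   # resolves from any namespace
  printf '%s.%s.svc.%s\n' "$1" "$2" "$domain"  # fully qualified name
}

# Usage:
#   service_dns_names app-tier service-discovery
```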

The poller omits the namespace in this manifest. There is no need to convert hyphens to underscores or use all caps with DNS service discovery. The cluster DNS resolves the DNS name to the service IP address. Service port information is available through DNS SRV records, but that isn’t something we can use in the manifest, so we have to either hard-code the port or use the service port environment variable.

For illustration, the counter uses a hard-coded port and the poller uses the port environment variable. It is possible to use the DNS SRV record to configure the pod on startup using init containers, which I will cover in a later post.
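
For reference, SRV record names are built from the port name and protocol in front of the service’s fully qualified name. Kubernetes creates them for named ports, so the sketch below uses the data tier’s redis port rather than the unnamed app tier port. srv_record_name is a hypothetical helper, and cluster.local is again assumed as the cluster domain.

```shell
# srv_record_name (hypothetical helper) builds the SRV record name for a
# named service port.
# Arguments: port name, protocol, service name, namespace, optional domain.
# Assumption: the cluster domain is cluster.local unless given.
srv_record_name() {
  domain=${5:-cluster.local}
  proto=$(printf '%s' "$2" | tr 'A-Z' 'a-z')
  printf '_%s._%s.%s.%s.svc.%s\n' "$1" "$proto" "$3" "$4" "$domain"
}

# Usage:
#   srv_record_name redis TCP data-tier service-discovery
# The resulting name could then be queried from inside a pod with a DNS
# tool such as dig, if the image provides one.
```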

Let’s create the support tier and check all the pods again. There are three running pods with four containers in total. Then let’s check the poller logs to see what’s going on with our count.

ubuntu@ip-10-0-128-5:~/src$ kubectl create -f 4.4-support_tier.yaml -n service-discovery
pod/support-tier created
ubuntu@ip-10-0-128-5:~/src$ kubectl get pods -n service-discovery
NAME           READY   STATUS    RESTARTS   AGE
app-tier       1/1     Running   0          3m27s
data-tier      1/1     Running   0          93m
support-tier   2/2     Running   0          13s
ubuntu@ip-10-0-128-5:~/src$ kubectl logs -n service-discovery support-tier poller -f
Current counter: 7
Current counter: 22
Current counter: 33
Current counter: 40
Current counter: 52
Current counter: 60
Current counter: 73
Current counter: 78
Current counter: 92



Current counter: 103
Current counter: 117
^C
ubuntu@ip-10-0-128-5:~/src$

The application is plugging away. A satisfying result.

Summary

  • We’ve covered structuring N-tier applications using services as interfaces between tiers. We used the cluster IP type of service for accessing the data and application tiers within the cluster.
  • We also covered how Kubernetes service discovery works with environment variables and DNS. That allowed us to refactor our multi-container pod application into the multi-tier application that we stood up.
  • When using environment variables for service discovery, the service must be created before the pod and must be in the same namespace.
  • DNS records overcome the shortcomings of environment variables. DNS records are added to and removed from the cluster’s DNS as services are created and deleted. The DNS name for a service includes the namespace, allowing communication with services in other namespaces. SRV DNS records provide service port information.

What’s next?

Consider how we could scale our current n-tier application. We could increase the number of server pods by renaming the pod to something like apptier-1, then creating apptier-2 and so on. We could probably glue this together with some scripting. A bit of extra work, but worth it to make scaling easy.

So what happens when we want to reconfigure the server container? We could create apptier-v1-1 and then apptier-v2-1, and with some updated scripting we could probably handle that too. And what happens when something goes wrong, or there’s an error in the new version? We could probably handle that by polling the API and checking the status, again with some scripting and glue code on our end, but there should be a better way to do this.

The good news is that there is a much better way using deployments.