What is Service Mesh, Data planes, Control planes?

Dec 5, 2020

To understand service mesh concepts, let’s first go back and look at the problem that service mesh technology is trying to solve.

In this modern cloud era, we have containers, VMs and serverless functions, and we’re running this whole collection of infrastructure. One question that comes to mind is: how do all these things communicate with each other?

The core of the problem is that we are running all of these different pieces of distributed applications and distributed infrastructure, and they need a way to communicate.

So a key challenge becomes: how do I route between and discover all these different applications? How do all of these components find each other?

Let’s say we have a web server that needs to talk to the database to be able to query data. So how does the web server even know where the database is running?

It’s running somewhere in the cloud, and if the database fails it may come back up somewhere else in the cloud.

Now that’s a really simple example, but you can imagine a caching layer, API services and all sorts of backend pieces. There are many, many components that all suffer from the same problem.
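The web server and database example can be sketched in a few lines of Python. This is only a minimal illustration of the discovery problem, not how any particular service mesh implements it: the `ServiceRegistry` class, the service name and the addresses are all made up for this sketch.

```python
class ServiceRegistry:
    """Hypothetical in-memory registry: maps a service name to its current address."""

    def __init__(self):
        self._addresses = {}

    def register(self, name, address):
        # Called whenever a service starts, or restarts somewhere else.
        self._addresses[name] = address

    def resolve(self, name):
        # Callers ask for a service by name instead of hardcoding an address.
        if name not in self._addresses:
            raise LookupError(f"no known instance of {name!r}")
        return self._addresses[name]


registry = ServiceRegistry()
registry.register("database", "10.0.0.5:5432")    # made-up address
before_failure = registry.resolve("database")

# The database fails and comes back up somewhere else in the cloud.
registry.register("database", "10.0.1.9:5432")    # made-up address
after_failure = registry.resolve("database")
```

The web server only ever asks for "database", so it keeps working after the move. A service mesh gives every service this kind of lookup in one consistent way.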

Let’s say there is a 1000-person team developing different microservices, each with its own bounded context. In such a scenario, how does a service mesh help? In short, when should I use service mesh technology?

Before answering this question, we need to understand what a service mesh does.
It gives us a consistent way to route between all of our services and to discover them.
With that definition, another question arises: when should we use it?
Does it make sense to use a service mesh if I have just two services, like a web server and a database?
The answer is not really; it’s overkill. There are multiple factors one should consider before diving into the service mesh approach:

How many people are in your engineering organization?
How many microservices do you have?
What languages are being used for these microservices?
Do you have experience adopting open source projects?
What platforms are you running your services on?
What features do you need from service mesh?
Are the features stable for a given service mesh project?

Now, coming back to the scenario of a team of 1000 developers working on thousands of services and microservices (system APIs, process APIs and experience APIs, as in the API-led connectivity approach): this is where we have a ton of complexity to address. Frankly, we don’t want to solve the same problem (how does one service talk to another service) 1000 different times. We’d rather solve it once with a central service mesh and then apply it to all 1000 applications.

Up to this point, we have discussed the first-level concern, i.e. how do we route, discover and connect these pieces. The second-level concern is: once you’ve connected all these pieces, how do we know how much traffic is going from our application to a microservice or to a database? Are there no requests happening, or too many requests? Are they getting errors, or are they too slow?
In a nutshell, how should one observe what’s happening in the service mesh? When we talk about observability, it’s important to understand how it happens: how do we actually get the data about who’s talking to whom?

And that’s where the data plane comes in, to observe the distributed traffic flow and debug problems.
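To make the observability questions concrete, here is a rough sketch of the kind of counters a data plane component could keep for each "who talks to whom" pair. The `TrafficStats` class, its method names and the sample numbers are assumptions made for this illustration; real meshes expose much richer metrics.

```python
from collections import defaultdict


class TrafficStats:
    """Hypothetical per-connection counters a data plane proxy might keep."""

    def __init__(self):
        self.requests = defaultdict(int)         # (source, destination) -> request count
        self.errors = defaultdict(int)           # (source, destination) -> failed requests
        self.total_seconds = defaultdict(float)  # (source, destination) -> summed latency

    def record(self, source, destination, seconds, ok):
        """Record one request that went from source to destination."""
        key = (source, destination)
        self.requests[key] += 1
        self.total_seconds[key] += seconds
        if not ok:
            self.errors[key] += 1

    def error_rate(self, source, destination):
        """Are the requests getting errors?"""
        key = (source, destination)
        count = self.requests.get(key, 0)
        return self.errors.get(key, 0) / count if count else 0.0

    def average_seconds(self, source, destination):
        """Is it too slow?"""
        key = (source, destination)
        count = self.requests.get(key, 0)
        return self.total_seconds.get(key, 0.0) / count if count else 0.0


stats = TrafficStats()
stats.record("web", "database", 0.010, ok=True)
stats.record("web", "database", 0.030, ok=True)
stats.record("web", "database", 0.200, ok=False)
stats.record("web", "cache", 0.002, ok=True)
```

With these counters, "no requests, too many requests, errors or too slow" each become a simple lookup per pair of services.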

Now let’s understand how the service mesh functions.
It has two critical layers: layer one is the control plane and layer two is the data plane.
For instance, if I have to set up a rule that my web server is allowed to talk to my database, or see how much traffic is flowing between the web server and the database, I’ll go to the control plane as the central management unit.
The data plane is the flip side of it: it actually carries the traffic between these things, enforcing who can talk to whom, collecting data on how much traffic we are seeing, and so on.

To simplify further, the service mesh can be compared to making a movie: the control plane is the director and the data plane is the actors.
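The split between the two layers can be sketched as follows. This is a deliberately simplified model, assuming a `ControlPlane` class that only stores "who may talk to whom" rules and a `LocalProxy` class that checks them on every call. Both names are hypothetical, and a real data plane would typically receive its configuration from the control plane rather than query it on each request.

```python
class ControlPlane:
    """Hypothetical central management unit: stores the rules, carries no traffic."""

    def __init__(self):
        self._allowed = set()

    def allow(self, source, destination):
        # An operator sets up a rule such as "web may talk to database".
        self._allowed.add((source, destination))

    def is_allowed(self, source, destination):
        return (source, destination) in self._allowed


class LocalProxy:
    """Hypothetical data plane proxy running next to one service."""

    def __init__(self, service, control_plane):
        self.service = service
        self.control_plane = control_plane
        self.forwarded = 0  # traffic this proxy let through
        self.denied = 0     # traffic this proxy blocked

    def send(self, destination):
        # The data plane enforces the rule and counts what it saw.
        if not self.control_plane.is_allowed(self.service, destination):
            self.denied += 1
            return False
        self.forwarded += 1
        return True


control_plane = ControlPlane()
control_plane.allow("web", "database")

web_proxy = LocalProxy("web", control_plane)
cache_proxy = LocalProxy("cache", control_plane)

web_ok = web_proxy.send("database")      # allowed by the rule above
cache_ok = cache_proxy.send("database")  # no rule, so the proxy blocks it
```

The rule is defined once in the control plane, while the enforcement and the counting happen in each proxy, close to the traffic.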

From the deployment perspective, the data plane lives everywhere. It can run as either an agent or a proxy next to every container, VM and serverless function. The control plane is what you talk to, through an API, a CLI or a web UI. You might interface with the control plane in different ways, but it’s the brain that controls the rest of the data planes for you.

So does that mean that, as an app developer, you’ll not really touch the control plane at all?

The answer is yes, with one caveat: instead of hardcoding the DB IP address in the application, the application talks to the DB through the data plane, i.e. the local proxy or agent, so that the traffic goes over the data plane. Beyond that, the control plane is more or less invisible to the application developer.

To summarize:
The control plane manages
1. Administrative traffic
2. Configuration
3. System control
4. Management
whereas the data plane manages
1. Application traffic
2. Routing
3. Load balancing
4. Observability


Written by Parag Patil