Architecture Design 101 (From Monolith to Multi-Cluster with Istio)
Back then, we hosted all our applications on a single monolith server (we were using Exabytes at the time).
Single Monolith Server
Pros:
- Cheap
Cons:
- Not able to scale
- Single point of failure
- Very hard to isolate applications from one another
- Unable to handle a huge load
- No disaster recovery
In 2018, we migrated from the monolith server to a single cloud Kubernetes cluster with 2 master nodes and 3 worker nodes.
Single Kubernetes Cluster
Pros:
- Cheap
- Easy to manage, especially with very few projects
- Disaster recovery
Cons:
- Single point of failure
- Unable to separate billing by project, so we could not justify the actual server cost of a specific project or product
- Very hard for different teams to manage
- Configuration conflicts, which required us to set a standard naming convention
From 2019 onwards, we created more and more clusters to meet business needs.
Multiple Kubernetes Clusters (Multi-Cluster)
As you can see from the diagram below, we were unable to share common applications such as Redis and NATS between the clusters. This limitation also increased the maintenance cost: whenever a new version of Redis was released, we needed to go into every cluster and upgrade it.
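To make the maintenance cost concrete, here is a minimal sketch of how the upgrade work multiplies when every cluster runs its own copy of the common applications. The cluster names are hypothetical and only for illustration; Redis and NATS are the common applications mentioned above.

```python
from itertools import product

# Hypothetical cluster names, for illustration only.
CLUSTERS = ["cluster-a", "cluster-b", "cluster-c"]
# Common applications that are duplicated in every cluster.
COMMON_APPS = ["redis", "nats"]


def upgrade_tasks(clusters, apps):
    """Return one (cluster, app) upgrade task per duplicated deployment."""
    return list(product(clusters, apps))


tasks = upgrade_tasks(CLUSTERS, COMMON_APPS)
for cluster, app in tasks:
    print(f"upgrade {app} in {cluster}")

# 3 clusters x 2 apps = 6 separate upgrades
print(f"{len(tasks)} upgrades in total")
```

Every new cluster adds one more upgrade per common application, which is why the work grows with the number of clusters.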
Pros:
- Separation of billing
- Easy for different teams to manage
- Separation of products
- Disaster segregation
Cons:
- Much higher cost
- Requires more manpower to manage the clusters
- Wasted resources
- Unable to share common resources
Around September 2021, we decided to restructure the clusters into a service mesh.
Multiple Kubernetes Clusters with Istio (Service Mesh)
We moved the shareable applications to a standalone cluster (named shared in the diagram). This way, we no longer need to deploy a Redis cluster in each Kubernetes cluster. Instead, every cluster consumes Redis from the shared cluster.
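From an application's point of view, consuming from the shared cluster mostly means pointing at one common address instead of a per-cluster one. The sketch below shows one way an application might resolve that address. The hostname, the `REDIS_HOST` and `REDIS_PORT` variable names, and the helper itself are assumptions for illustration, not our actual configuration; the real hostname depends on how the mesh exposes the shared cluster's services.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class RedisEndpoint:
    host: str
    port: int


# Hypothetical mesh-wide hostname for Redis in the shared cluster (an assumption).
DEFAULT_SHARED_REDIS_HOST = "redis.shared.svc.cluster.local"
# 6379 is the default Redis port.
DEFAULT_REDIS_PORT = "6379"


def shared_redis_endpoint(env=None):
    """Resolve the Redis endpoint, falling back to the shared cluster defaults."""
    env = os.environ if env is None else env
    host = env.get("REDIS_HOST", DEFAULT_SHARED_REDIS_HOST)
    port = int(env.get("REDIS_PORT", DEFAULT_REDIS_PORT))
    return RedisEndpoint(host=host, port=port)


# With no overrides, every cluster resolves the same shared endpoint.
print(shared_redis_endpoint({}))
```

Because every workload resolves the same endpoint, a Redis upgrade happens once in the shared cluster rather than once per cluster.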
Pros:
- A/B Testing
- Canary deployment
- Separation of products
- Much better version control
- High availability
- Cheaper than plain multi-cluster because some resources can be shared
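Canary deployment in Istio is typically expressed as weighted routing in a VirtualService. The following is a minimal sketch that builds such a manifest as a Python dictionary. The service name `checkout` and the subsets `v1` and `v2` are hypothetical, and the subsets are assumed to be defined in a separate DestinationRule that is not shown here.

```python
import json


def canary_virtual_service(name, host, stable_subset, canary_subset, canary_percent):
    """Build an Istio VirtualService that sends canary_percent of traffic to the canary."""
    if not 0 <= canary_percent <= 100:
        raise ValueError("canary_percent must be between 0 and 100")
    return {
        "apiVersion": "networking.istio.io/v1beta1",
        "kind": "VirtualService",
        "metadata": {"name": name},
        "spec": {
            "hosts": [host],
            "http": [
                {
                    "route": [
                        {
                            # Stable version keeps the remaining share of traffic.
                            "destination": {"host": host, "subset": stable_subset},
                            "weight": 100 - canary_percent,
                        },
                        {
                            # Canary version receives the configured share.
                            "destination": {"host": host, "subset": canary_subset},
                            "weight": canary_percent,
                        },
                    ]
                }
            ],
        },
    }


# Hypothetical example: route 10% of traffic to subset v2.
manifest = canary_virtual_service("checkout", "checkout", "v1", "v2", 10)
print(json.dumps(manifest, indent=2))
```

Raising `canary_percent` step by step shifts more traffic to the new version, and setting it back to 0 acts as a rollback. The same weighted-routing idea underlies A/B testing, although that usually matches on headers or cookies rather than a plain percentage.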
Cons:
- Slightly higher latency (around 2–5ms)
- Slightly higher memory consumption due to the Envoy proxy
- Hard to learn and set up
Currently, we have configured only a single region with multi-zone support. In the future, we will probably explore multi-region cluster load balancing with Istio.
Thank you for reading.