Horizontal Scaling of the Builder Vault TSM
Horizontal scaling is achieved by running multiple instances of the same MPC node for a given player and distributing requests across them, improving both throughput and fault tolerance. The key challenge is that all state for an MPC session is held in memory, so a session must be handled by the same node throughout its lifetime.
The Builder Vault TSM supports four horizontal scaling strategies:
| Strategy | Load Balancer Required | Session Affinity | Scale Granularity |
|---|---|---|---|
| Replicated TSM | No | N/A (SDK-managed) | Full TSM |
| Message Broker Communication | Standard (round-robin) | Handled by broker | Individual node |
| Direct Communication | Session-affinity capable | HTTP header or query param | Individual node |
| Intra-Node Routing | Standard (round-robin) | Handled by nodes | Individual node |
Note: If you use the DKLS23 protocol and scale horizontally with more than five instances of the same MPC node, make sure to set `DeactivatedPlayersCache = "database"` in the `[DKLS23]` section of the MPC node configuration files, as explained here.
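A minimal sketch of that setting, following the TOML conventions used in the other configuration examples on this page:

```toml
[DKLS23]
# Required when running more than five instances of the same MPC node.
DeactivatedPlayersCache = "database"
```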
Replicated TSM
This is the simplest approach: you deploy multiple complete TSMs, where all nodes of a given player share the same database. Any SDK can connect to any of the TSMs and access the same keys.
How to scale: Add or remove entire TSMs. You cannot add a single node independently.
Trade-offs:
- Simple to set up with no load balancer required.
- The SDK is aware of multiple TSMs; your application must decide which one to use (see the sketch after this list).
- Scaling requires coordinating all players simultaneously, which can be difficult when players are controlled by different organizations.
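To illustrate that application-side decision, here is a minimal Go sketch of a failover strategy across replicated TSMs. The endpoint URLs and the `signWithTSM` helper are hypothetical placeholders for your actual Builder Vault SDK calls, not part of the SDK API.

```go
package main

import (
	"errors"
	"fmt"
)

// signWithTSM is a hypothetical placeholder for an SDK call against one
// specific TSM deployment; substitute your real SDK usage here.
func signWithTSM(tsmURL string, message []byte) ([]byte, error) {
	return nil, errors.New("not implemented")
}

// signWithFailover tries each replicated TSM in order. Because all
// replicas share the same database, any of them can serve the request.
func signWithFailover(tsmURLs []string, message []byte) ([]byte, error) {
	var lastErr error
	for _, url := range tsmURLs {
		signature, err := signWithTSM(url, message)
		if err == nil {
			return signature, nil
		}
		lastErr = err
	}
	return nil, fmt.Errorf("all TSMs failed, last error: %w", lastErr)
}

func main() {
	// Hypothetical endpoints of two replicated TSMs.
	tsms := []string{"https://tsm-a.example.com", "https://tsm-b.example.com"}
	if _, err := signWithFailover(tsms, []byte("message")); err != nil {
		fmt.Println(err)
	}
}
```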
Replicated MPC Nodes with Message Broker Communication
If your MPC nodes communicate via a message broker (such as Redis or AMQP), you can place a standard round-robin load balancer (e.g., HAProxy) in front of multiple nodes representing the same player. Because all inter-node MPC communication goes through the broker, a session always continues on the node that started it — no special session affinity configuration is needed on the load balancer.
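For example, a plain round-robin HAProxy configuration suffices here. This is a sketch only; the frontend/backend names, addresses, and ports are assumptions, not values prescribed by the TSM:

```
# Round-robin across two instances of the same MPC node. No session
# affinity is configured; the message broker keeps each session on the
# node that started it.
defaults
    mode http
    timeout connect 5s
    timeout client  30s
    timeout server  30s

frontend player1_sdk
    bind *:8080
    default_backend player1_nodes

backend player1_nodes
    balance roundrobin
    server node1a 10.0.0.11:8080 check
    server node1b 10.0.0.12:8080 check
```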
How to scale: Add or remove individual node instances at any time, independently of other players.
Trade-offs:
- The SDK sees a single TSM endpoint, with no awareness of the underlying instances.
- No additional MPC node configuration is required.
- Requires a message broker as part of your infrastructure.
Replicated MPC Nodes with Direct Communication
When nodes communicate directly rather than through a message broker, session affinity must be handled by the load balancer. Since most load balancers can only apply affinity on a single port, MPC communication must be routed through the SDK port rather than a separate MPC port.
Configuration — enable MPC over the SDK port:

```toml
[SDKServer]
MPCWebSocketPath = "/mpc"
```

Also configure the other MPC nodes to connect to this player via WebSocket.
Configure the load balancer to apply session affinity based on one of the following (a sketch follows this list):

- The `MPC-RoutingID` HTTP header
- The `mpc_routing_id` HTTP query parameter
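As an illustration, HAProxy can derive affinity from either value by hashing it to select a backend server, via its `balance hdr(...)` or `balance url_param ...` algorithms. The names, addresses, and ports below are assumptions:

```
# Requests carrying the same routing ID are hashed to the same
# MPC node instance.
defaults
    mode http
    timeout connect 5s
    timeout client  30s
    timeout server  30s

frontend player1_sdk
    bind *:8080
    default_backend player1_nodes

backend player1_nodes
    # Affinity on the MPC-RoutingID HTTP header ...
    balance hdr(MPC-RoutingID)
    # ... or, alternatively, on the query parameter:
    # balance url_param mpc_routing_id
    server node1a 10.0.0.11:8080 check
    server node1b 10.0.0.12:8080 check
```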
Trade-offs:
- The SDK sees a single TSM endpoint.
- Requires a load balancer that supports header- or query-parameter-based session affinity.
- MPC and SDK traffic must share the same port.
Replicated MPC Nodes with Intra-Node Routing
If you cannot use a message broker and cannot route MPC traffic over the SDK port, nodes can manage session affinity themselves. In multi-instance mode, each node checks incoming requests and forwards any that belong to a different node's session.
How to scale: Add or remove individual node instances at any time, independently of other players.
Trade-offs:
- The SDK sees a single TSM endpoint.
- No special load balancer configuration required — a standard round-robin balancer works.
- The intra-node forwarding adds overhead; do not expect the same throughput as the other approaches.
Configuration:

```toml
[MultiInstance]

# IP address used to reach this instance from other instances.
# If unset, an address is auto-detected, which may not be correct
# when multiple network interfaces are present.
Address = ""

# SDK port announced to other instances.
# Defaults to the port defined in [SDKServer].
SDKPort = 0

# MPC port announced to other instances.
# Defaults to the port defined in [MPCTCPServer].
MPCPort = 0

# How often to run the cleanup job that removes stale routing entries.
CleanupInterval = "5m"

# Probability (0–100) that the cleanup job runs on each interval.
# 0 = never, 100 = always.
CleanupProbability = 25
```

Contact Blockdaemon for performance benchmarks and guidance on choosing the right strategy for your deployment.