AI Assistant maintains a persistent WebSocket connection between the browser and the backend for real-time chat completion and MCP tool calls. This connection must reach the same backend instance throughout a session, so running multiple instances behind a load balancer isn’t currently supported.

Run a single AI Assistant instance. This is the supported configuration for all current deployments.

If that instance restarts or is terminated, the WebSocket connection closes, and the AI Assistant user interface (UI) displays a connection error. Users can reload the page to start a new session. This is expected behavior.
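The restart behavior can be pictured with a small client-side sketch. It is an illustration only, not the AI Assistant UI's actual code: `SessionMonitor`, `CloseNotifier`, `SessionStatus`, and the error text are hypothetical names chosen for this example. The only real API it assumes is the standard `close` event that a browser `WebSocket` emits when its connection ends.

```typescript
// Hypothetical sketch: none of these names come from AI Assistant itself.

// The minimal part of the browser WebSocket API that this sketch relies on.
interface CloseNotifier {
  addEventListener(type: "close", listener: () => void): void;
}

type SessionStatus =
  | { kind: "connected" }
  | { kind: "connection-error"; message: string };

class SessionMonitor {
  private status: SessionStatus = { kind: "connected" };

  constructor(socket: CloseNotifier) {
    // "close" fires when the connection ends, for example because the
    // single backend instance restarted or was terminated.
    socket.addEventListener("close", () => {
      // No attempt is made to resume the old session: the user reloads
      // the page to start a new one.
      this.status = {
        kind: "connection-error",
        message: "Connection lost. Reload the page to start a new session.",
      };
    });
  }

  // Returns the status the UI would render.
  current(): SessionStatus {
    return this.status;
  }
}
```

In a browser, a `WebSocket` instance should satisfy the `CloseNotifier` shape, so it could be passed to `new SessionMonitor(socket)` directly. The sketch deliberately omits automatic reconnection, matching the documented behavior that a reload starts a new session.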

AI Assistant doesn’t yet support horizontal scaling. This is a known limitation, and multi-instance support is planned for a future release.