
If you search for how to build a real-time chat app, you'll find thousands of tutorials showing you how to send a message from User A to User B on the same screen. But that's "Hello World" complexity. In the real world, scaling that to 10,000 concurrent users with low latency requires a serious architectural shift.
Understanding how to build a real-time chat app with a scalable architecture means abandoning the traditional request-response HTTP model and moving to a persistent connection model. The core problem is this: real-time communication needs open TCP pipes, not a fresh request and header exchange for every message. Whether you are architecting a customer support platform or an internal team tool, the architecture remains strikingly similar.
The essence of real-time communication is bi-directionality. Unlike a traditional REST API where the client must ask (poll) for updates, a chat app needs an open line where the server pushes data instantly.
There are three main paradigms you can use: long polling, Server-Sent Events (SSE), and WebSockets.
When analyzing how to build a real-time chat app, you aren't just choosing an HTTP framework; you are choosing an architecture that maintains stateful connections.
"Do not over-architect your initial chat app with microservices unless you are prepared to handle quorum-based distributed locking manually."
Most engineers default to Kafka or RabbitMQ for chat architecture immediately. I've seen .NET developers spin up a whole Kubernetes cluster for a chat server. My advice: start with a single monolithic WebSocket server or a horizontally scalable stateless gateway. The bottleneck in early-stage chat apps isn't the architecture; it's disk writes and message complexity. Get it working with Redis Pub/Sub first. Only move to Kafka if you are handling thousands of messages per second.
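The Redis Pub/Sub pattern recommended above boils down to named channels and subscriber callbacks. A toy in-memory sketch of the same contract (this is a stand-in for illustration, not the actual `redis` client API) makes the mechanics obvious before you swap Redis in:

```javascript
// Toy in-memory Pub/Sub with the same publish/subscribe contract
// that Redis exposes. One process only: Redis is what lets this
// pattern span multiple WebSocket servers.
class PubSub {
  constructor() {
    this.channels = new Map(); // channel name -> array of handlers
  }
  subscribe(channel, handler) {
    if (!this.channels.has(channel)) this.channels.set(channel, []);
    this.channels.get(channel).push(handler);
  }
  publish(channel, message) {
    const handlers = this.channels.get(channel) || [];
    handlers.forEach((h) => h(message));
    return handlers.length; // like Redis PUBLISH: number of receivers
  }
}

// Usage: each WebSocket server subscribes to the rooms it hosts.
const bus = new PubSub();
const delivered = [];
bus.subscribe('chat:room_1', (msg) => delivered.push(msg));

bus.publish('chat:room_1', '{"text":"hi"}');
console.log(delivered); // ['{"text":"hi"}']
```

Replacing `bus` with a real Redis client is a drop-in change precisely because the contract is this small; that is why Pub/Sub is the right first step before Kafka.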
To build a scalable chat app, you must decouple the protocol layer from the persistence layer.
A classic mistake: if you call `db.collection.insert()` (or a blocking `PUBLISH chat:room_1`) inside your WebSocket `on('message')` handler, you will hit 100ms+ latency instantly. The fix: use a Producer-Consumer pattern. Write to a Redis List/Queue, and have a background worker save to disk.

Keep the message schema lean: `User_ID, Room_ID, Timestamp, Body`. Keep indexes minimal.

The API surface looks like this:

- `POST /api/v1/auth` → Returns JWT + Session Token.
- `GET /api/v1/sessions` → Returns a list of active WebSocket connections (even if ephemeral).
- `POST /api/v1/chat` → For historical data.
- `WS /socket.io/` → For real-time streams.

Here is the production-grade setup logic. Do not copy-paste this blindly; understand why we are separating the logic.
1. The Server (Node.js)
```javascript
const crypto = require('crypto');
const { createClient } = require('redis');
const io = require('socket.io')(server);
const redisAdapter = require('socket.io-redis');

// This allows Socket.IO to broadcast across multiple servers for scaling
io.adapter(redisAdapter({ host: 'redis-cache', port: 6379 }));

// Separate client used as the persistence-queue producer
const redisClient = createClient({ url: 'redis://redis-cache:6379' });
redisClient.connect();

io.on('connection', (socket) => {
  const userId = socket.handshake.auth.token; // From auth header
  const roomId = socket.handshake.query.roomId; // Room requested by the client

  // Join a specific chat room
  socket.join(`room-${roomId}`);

  // Listen for private chat requests or public broadcasts
  socket.on('chat message', async (msgPayload) => {
    try {
      // 1. Create message object (without saving yet)
      const messageEvent = {
        id: crypto.randomUUID(),
        roomId: msgPayload.roomId,
        userId: userId,
        text: msgPayload.text,
        timestamp: new Date()
      };

      // 2. Send to ALL users in that room via Socket.IO (fast, in-memory or Redis)
      io.to(`room-${messageEvent.roomId}`).emit('chat message', messageEvent);

      // 3. Push to a Redis list for persistence (non-blocking for the chat path)
      await redisClient.lPush('chat_queue', JSON.stringify(messageEvent));
    } catch (error) {
      console.error('Failed to process message', error);
      socket.emit('error', 'Message failed to send');
    }
  });
});
```
2. The Background Worker (Persistence)

This is the part usually skipped in tutorials.
```javascript
const { createClient } = require('redis');
const db = require('./databaseClient'); // Your Mongo/Postgres connection

// Dedicated connection: blocking pops would stall a shared client
const consumer = createClient({ url: 'redis://redis-cache:6379' });

async function run() {
  await consumer.connect();
  console.log('Worker listening for messages...');

  while (true) {
    // Blocking pop from the same list the server lPush-es to
    // (timeout 0 = wait forever for the next message)
    const item = await consumer.brPop('chat_queue', 0);
    const data = JSON.parse(item.element);

    // Save to DB
    await db.messages.create({
      id: data.id,
      roomId: data.roomId,
      userId: data.userId,
      text: data.text,
      timestamp: data.timestamp
    });
    // Optimization: if you need message status, update status here
  }
}

run();
```

Note the worker consumes the list with `brPop` rather than Pub/Sub `subscribe`: the server produces with `lPush`, so the queue survives a worker restart, whereas Pub/Sub messages are fire-and-forget.
ws:// is a security liability. You must force wss:// (secure WebSocket) or implement your own TLS termination.

| Feature | WebSockets (Recommended) | SSE (Server-Sent Events) |
|---|---|---|
| Direction | Bi-directional | One-way (server → client only) |
| Implementation | Requires Heart-beats & State | Standard HTTP |
| Browser Support | Universal | Universal |
| NAT Traversal | Needs Config (STUN/TURN) | Works behind Firewalls |
| Best For | Chat, Gaming, Trading | News feeds, File updates |
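The "Heart-beats & State" cell in the table deserves a concrete shape. Socket.IO manages ping/pong for you via its `pingInterval`/`pingTimeout` options; a hand-rolled sketch of the bookkeeping involved (class and field names are illustrative) looks like this:

```javascript
// Track liveness per connection: record each pong, reap anything
// silent longer than the timeout. This is the "state" WebSockets
// force you to keep that plain HTTP does not.
class HeartbeatTracker {
  constructor(timeoutMs = 30000) {
    this.timeoutMs = timeoutMs;
    this.lastSeen = new Map(); // connection id -> last pong timestamp
  }
  pong(id, now = Date.now()) {
    this.lastSeen.set(id, now);
  }
  // Returns ids to disconnect; call this on an interval.
  reapStale(now = Date.now()) {
    const stale = [];
    for (const [id, ts] of this.lastSeen) {
      if (now - ts > this.timeoutMs) {
        stale.push(id);
        this.lastSeen.delete(id);
      }
    }
    return stale;
  }
}

const hb = new HeartbeatTracker(30000);
hb.pong('sock-1', 1000);
hb.pong('sock-2', 25000);
console.log(hb.reapStale(40000)); // ['sock-1']: silent for 39s, past the 30s limit
```

Without this reaping, half-open connections (a phone losing signal, a laptop lid closing) accumulate as ghost users that still receive broadcasts.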
For the runtime, Node.js with socket.io is the pragmatic default; Go is a strong alternative if you need extreme raw TCP performance for massive concurrency.

Next-gen chat architecture is moving toward Commit-Revoke streams (used by Discord/Slack). This allows clients to send a message, get a confirmation, reply "confirm", and later "delete" it, without waiting for a second roundtrip for every modification. We are also seeing deep WebRTC integration for voice/video chat apps built on top of the text chat infrastructure.
For offline users, mark messages with is_read = false in the database; when the client reconnects, fetch all messages where is_read = false.

For connectivity behind NAT, use a public STUN server (stun:stun.l.google.com:19302) for WebRTC cases, or ensure the Load Balancer maintains WebSocket upgrades (sticky sessions or Redis affinity).

Building a scalable chat app is an exercise in managing state and throughput. By decoupling the media (WebSockets) from the memory (Redis) and the storage (Database), you create a system that can handle millions of messages. Don't get stuck trying to rig a static HTTP server to feel real-time; embrace WebSockets and Pub/Sub from day one.