PM2 Clustering in Practice: 4x Throughput on the Same Hardware
A Next.js app running as a single process was using 25% of a 4-core server. Switching to PM2 cluster mode with 4 workers per brand pushed throughput from ~100 to ~800 req/sec and uptime from 95% to 99.95% โ without buying new hardware.
The Next.js application was running as a single process on a 4-core server. Using about 25% of available CPU at all times.
During peak hours, when all 8 brands had users active, request latency spiked and occasionally the server became unresponsive. The fix was not more hardware. The hardware was already there, unused.
Problem
Node.js is single-threaded by default. One process, one core. On a 4-core server, that is 75% of your CPU sitting idle while users experience slowdowns.
When request volume spiked across brands simultaneously, the single process became a bottleneck. Requests queued. Some timed out. The server was not overloaded by capacity. It was bottlenecked by process architecture.

Restarting the process also meant downtime. Any unhandled error that crashed the app meant real service interruption until someone manually restarted it. At 8 brands in active use, that was a real operational risk.
Solution
PM2 cluster mode with 4 worker processes per brand.
The configuration is straightforward: set exec_mode to cluster, set instances to the number of CPU cores. PM2 handles the rest. Each worker runs an independent copy of the Node.js process. Requests distribute across all four via round-robin.
Key configuration settings:
- max_memory_restart at 1GB per process to catch memory leaks before they cascade
- autorestart: true so crashed processes come back automatically
- Graceful SIGTERM and SIGINT handling so active requests complete before shutdown
With 8 brands, 4 processes each, the server runs 32 PM2 worker processes total.

Result
Throughput went from roughly 100 requests per second to around 800. Four processes doing the work instead of one.
CPU utilization moved from 25% to near 100% during peak load.
System uptime moved from about 95% to 99.95%. Process crashes that previously required manual intervention now recover automatically in seconds.
The change required no application code modifications. PM2 handles the clustering transparently. The biggest effort was understanding the configuration options and setting memory limits correctly.
Found this useful?
Share it with someone who'd appreciate it.
https://wardvisual.com/blogs/pm2-clustering-4x-throughput

@wardvisual ยท ๐ต๐ญ Dasmarinas City, Cavite PH
Full-stack engineer. Business systems, database optimization, and operations software.