Handling Overload from within Node.JS code

Now, let’s consider a theoretical Node.JS web application that exposes a REST API to clients. We need to build into our server a mechanism that detects when the existing server load is high and notifies the load-balancer that we are unlikely to be able to handle additional requests correctly. The load-balancer would then route new requests to a different server instance, while our nearly overloaded one processes its outstanding requests.
Let’s use the Express.JS web application framework with the help of the npm library called “overload-protection” and build our server as below:
maxEventLoopDelay: 42, // max delay between event loop ticks
maxHeapUsedBytes: 0,   // max used heap threshold (0 to disable)
// middleware which blocks requests when we're too busy
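The fragments above belong to the overload-protection configuration; a fuller sketch of the server might look as follows. The route, port, and the maxRssBytes option are illustrative additions, not part of the original fragment:

```javascript
// Sketch of an Express.JS server guarded by the overload-protection middleware.
const express = require('express');

const protect = require('overload-protection')('express', {
  maxEventLoopDelay: 42, // max delay between event loop ticks (ms)
  maxHeapUsedBytes: 0,   // max used heap threshold (0 to disable)
  maxRssBytes: 0,        // max RSS threshold (0 to disable)
});

const app = express();
app.use(protect); // middleware which blocks requests when we're too busy

app.get('/', (req, res) => res.send('hello world'));
app.listen(3000); // illustrative port
```

When the configured limits are exceeded, the middleware short-circuits incoming requests with a 503 response instead of queuing more work onto the saturated event loop.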
The maxEventLoopDelay parameter represents the maximum amount of time in milliseconds between event loop ticks, before we consider the process too busy. The more modern Hapi.JS web framework has a built-in maxEventLoopDelay option allowing us to specify at the server configuration level the maximum delay duration before requests are rejected with 503. A sample Hapi.JS connection configuration might look like:
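A sketch of such a configuration, assuming Hapi v16's connection-level API (the text mentions connection configuration; newer Hapi versions accept the same load options directly in the server constructor):

```javascript
const Hapi = require('hapi');

// sampleInterval enables load sampling; without it the limits are not enforced
const server = new Hapi.Server({
  load: { sampleInterval: 1000 }, // measure event loop delay every second
});

server.connection({
  port: 3000, // illustrative port
  load: { maxEventLoopDelay: 1000 }, // reject with 503 beyond 1,000 ms
});
```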
In the above example, the maximum allowed event loop latency is capped at 1,000 milliseconds. Whenever the event loop delay exceeds that limit, our Hapi.JS-based application will automatically send 503 response codes.
In order to implement event loop overload protection in the Fastify framework, use the “under-pressure” Fastify plugin, registering it as below:
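A minimal registration sketch; maxEventLoopDelay is the limit relevant to this discussion, while the message and retryAfter values shown are illustrative uses of the plugin's documented options:

```javascript
const fastify = require('fastify')();

fastify.register(require('under-pressure'), {
  maxEventLoopDelay: 1000,    // ms of event loop delay before replying 503
  message: 'Under pressure!', // body of the 503 response
  retryAfter: 50,             // value for the Retry-After response header
});

fastify.get('/', async () => ({ hello: 'world' }));

fastify.listen({ port: 3000 }); // object form used by recent Fastify versions
```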
Much like the overload-protection module and Hapi.JS’s built-in overload-prevention mechanism, under-pressure allows specifying additional limits, such as maximum heap and RSS usage.
Handling 503 errors
Let’s assume that we have a Node.JS web application hello.js.
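The sample application itself is not shown; a minimal stand-in might look like this (port 3000 is an assumption carried through the nginx examples below):

```javascript
// hello.js — a minimal HTTP server standing in for the sample application
const http = require('http');

http.createServer((req, res) => {
  res.writeHead(200, { 'Content-Type': 'text/plain' });
  res.end('Hello World\n');
}).listen(3000, () => console.log('hello.js listening on port 3000'));
```

Start it with node hello.js.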
This will start our web application. Now that we have our web app running locally, we need to expose it on the network. We will do this using the nginx web server as a reverse-proxy tool. The typical location of the nginx configuration file is /etc/nginx/sites-available/default. For our sample configuration, we could use the below settings:
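One possible set of settings, assuming the application listens on port 3000 (the server_name is a placeholder):

```nginx
server {
    listen 80;
    server_name example.com; # placeholder host name

    location / {
        proxy_pass http://127.0.0.1:3000; # the Node.JS application
        proxy_http_version 1.1;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}
```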
Adjust location and other properties as necessary. To validate configuration, use:
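The standard nginx commands for this are:

```shell
sudo nginx -t                # test the configuration file syntax
sudo systemctl reload nginx  # apply the configuration without downtime
```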
For more information on nginx, please refer to the documentation. At this point, we have configured our sample Node.JS application to run on a single node, as a system process. We could also use nginx as the load-balancer. In order to do that, we need to specify:
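A sketch of the required upstream block (the instance addresses are illustrative; the proxy_next_upstream directive tells nginx to retry the next server when one answers with a 503):

```nginx
upstream node_app {
    server 127.0.0.1:3000;
    server 127.0.0.1:3001;
}

server {
    listen 80;

    location / {
        proxy_pass http://node_app;
        # on a connection error, timeout, or 503, try the next upstream server
        proxy_next_upstream error timeout http_503;
    }
}
```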
When an application runs behind a load-balancer, as in the example above, an overloaded instance sends back the appropriate response code, HTTP 503. The load-balancer then tries the next available server until one of them returns a 2xx-4xx response; only if all servers are unavailable does it send the 503 code to the client.
In order for load-balancers to understand how long a server might be down after a 503 error, system administrators tune vendor-specific load-balancer settings, such as the idle timeout.
Conclusion

This simple “event loop latency” policy, combined with load-balancing, allows us to create scalable Node.JS web applications that are resistant to resource overload. By properly handling 503 HTTP responses and correctly configuring load-balancers, we are able to build Node.JS REST endpoints with virtually limitless horizontal scalability.