Four points to keep in mind when working with Node.js, guaranteed to save you a lot of time and pain.
However, as we all know, with power comes responsibility: developers must understand node's limitations (or pay the price when they misuse it), they must adapt their minds to the asynchronous, event-driven programming paradigms, and they must write robust, efficient code so their apps will scale.
Over the years, I've learned some valuable lessons, lessons learned in blood as they say, which I'd like to share with you here.
Note: this post probably makes more sense if your aim is to write production-grade, heavy-duty, robust and scalable servers / micro-services; however, even if you only intend to write a simple CRUD node app at the moment, I'm sure these will come in handy in time.
Know Your Node
Too often I hear some variation of "node is awesome, let's use it for everything!". Well, that is just too bad (with the reasoning explained in depth in this article), and it is even worse when I hear it coming from professional developers. So if you remember one thing from this section, please make it the following:
At the moment of writing this post, only about 0.17% of servers worldwide run node.js (updated chart here). If node were indeed suitable for all purposes, you'd expect it to have a much more dominant market share; it doesn't (although you can probably try to force it to). What is it good for, then? Well, in a nutshell, node.js is most suitable for I/O-bound apps, and less suitable for CPU-bound apps.
So when you're considering node for a number-crunching or heavy text-manipulation application (yes, that's CPU bound too, sorry), think again; choose well or pay a heavy price later on.
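To see why this matters, here is a tiny demo of my own (not from any real project): a synchronous number-crunching loop blocks node's single thread, so a timer that was due in 10ms can only fire once the loop finishes, and in a real server every other client would be stalled the same way.

```javascript
// CPU-bound work blocks Node's single thread: the 10ms timer below cannot
// fire until the synchronous loop releases the event loop.
const start = Date.now();
let firedAfter;

setTimeout(() => {
  firedAfter = Date.now() - start;
  console.log(`timer fired after ${firedAfter}ms (was due in 10ms)`);
}, 10);

// Stand-in for "number crunching": pure CPU, never yields to the event loop.
let acc = 0;
for (let i = 0; i < 2e8; i++) acc += i;
```

On a typical machine the timer fires only after the loop completes, well past its 10ms deadline; swap the loop for an async I/O call and it fires roughly on time.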
Stating The Obvious: Keep Stateless
This is probably an obvious one, however well worth mentioning here. If you’re using Node for a server, service or micro-service that needs to scale (which is probably the case), make sure to keep your instances stateless.
Stateless means (for the sake of completeness; I'm sure you know what it is) keeping no state information in your instance's memory. That includes in-memory LRU caches, hashmaps, variables or any other data that might change and needs to stay in sync across all your instances. Data of that type should be maintained in an external repository, e.g. Redis, Memcached, Couchbase etc., instead of in your application.
If you keep your node.js instances stateless you will be able to (very) easily scale as your throughput requirements grow.
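To make the point concrete, here is a small sketch of my own; a plain Map stands in for the external store, and in a real app you'd use a client such as node-redis instead (where this whole function collapses into a single INCR command):

```javascript
// Anti-pattern: per-instance state. With N clustered instances, each one
// keeps its own counter and they silently diverge.
let localHits = 0;
function hitStateful() {
  return ++localHits;
}

// Stateless version: the counter lives in an external repository shared by
// every instance. The Map is a stand-in for a real store client (e.g. Redis).
const sharedStore = new Map();
async function hitStateless(store, key) {
  const next = (store.get(key) || 0) + 1;
  store.set(key, next);
  return next;
}
```

With the stateless version, any instance can serve any request, so adding or removing instances never loses or forks state.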
Performance Measurements Gotcha
Sometimes we have a time critical node process, and we’d like to make it extremely efficient so that we adhere to some hard timing limitations.
In a recent project I was trying to cut the server processing time down from about 2 milliseconds to under 1 millisecond. I started by looking for obviously costly operations (heavy looping, inefficient hashing, text manipulation etc.); however, since I had designed the code for performance, it barely had any of these.
I then refactored a call to a 3rd party SAAS service, switching from their HTTP interface to their UDP interface. This alone provided a whopping 700,000-nanosecond (0.7ms) saving, measured on the caller side (my node app's client).
But I wanted to cut down more, so I instrumented the code with process.hrtime measurements around various pieces of code I suspected might be sub-optimal:
var hrstart = process.hrtime();
// measured code here...
var hrend = process.hrtime(hrstart); // a [seconds, nanoseconds] tuple
console.info("Duration: %ds %dms", hrend[0], hrend[1] / 1000000);
This looked fairly straightforward, I thought to myself, and I ran the code, grepping out the timing printouts; however, they made no sense whatsoever. For example, a simple code block containing some variable assignments and a couple of log prints, which should have executed in nearly zero time, took up to 0.5ms.
I was baffled, and it took me a couple of minutes to figure it out and award myself a proper facepalm. I had been trying to measure code which included I/O operations (in this case, simple logging); the I/O obviously preempted my code while waiting, allowing another client to be handled, until execution resumed on the next tick.
Once I figured that out, I canceled all other logging (aside from the duration printout at the end) by raising the log level to critical, and was able to properly log the performance measurements.
Conclusion time: when measuring fine-tuned performance with node, keep in mind that any I/O operations (sometimes they are not obvious to spot) included in the measured code path will impact your measurement, resulting in inaccurate (higher) results. So either:
- Temporarily eliminate those I/O operations if they are non-essential to the application (e.g. logging)
- Measure the performance in an isolated single client test scenario
- Measure the performance from the client side over a large number of requests, and calculate the average throughput (for example you can use siege or ab / apache benchmark)
- Use a 3rd party profiler, or node's built-in profiler (node --prof)
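As a sketch of the isolated approach, here is the same idea with the newer nanosecond clock, process.hrtime.bigint(); the fib function is just a hypothetical stand-in workload, and note that nothing in the measured span touches I/O:

```javascript
// Measure a pure-CPU block with nanosecond resolution. There is no I/O
// (not even a log line) inside the measured span.
function fib(n) {
  return n < 2 ? n : fib(n - 1) + fib(n - 2);
}

const t0 = process.hrtime.bigint();
const result = fib(25); // CPU-only stand-in workload
const t1 = process.hrtime.bigint();

// Logging happens only after t1, so it cannot skew the measurement.
const micros = Number(t1 - t0) / 1000;
console.log(`fib(25) = ${result}, took ${micros.toFixed(0)}µs`);
```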
Clustering To The Rescue (or not…)
As everyone knows, node.js is a single-threaded beast. As everyone also knows, this is not entirely true: there is a thread pool, managed by the libuv library, which does the I/O heavy lifting behind the scenes. However, from our own application's perspective there is only one thread of execution (if you need proof of that, the absence of race conditions, mutexes and semaphores is just one).
So if our node app has a single thread of execution, this essentially means we are only bound to one CPU core (which is indeed the case). In the past, that meant that even on a multi-core machine your app could only utilize a single core while the other cores were potentially just idling by.
But node.js has advanced, and utilizing any number of cores is now child’s play. All you need to do is either use Node’s cluster module, or in case you’re using PM2 as your node process manager (and I strongly recommend you do) then just run your node app in cluster mode. The former approach will require very minor changes to your app’s code, while the latter will require zero changes.
In either case, you’ll have your app transparently running on several cores in no time, and assuming your app is stateless (as recommended previously) it should also work flawlessly.
So now you might be thinking: "well, I have an 8-core machine, I'll run my node.js app in cluster mode and get an immediate 700% boost", which is a legitimate thought. However, I unfortunately have to break it to you: you probably won't, and chances are you'll see a total cluster boost of 15% to 80% at best.
"But why? This is not fair", you might rightfully grumble. Well, the answer might not be obvious, but it is painfully simple: by utilizing multiple cores you make more CPU resources available to your clustered application; however, since most node.js applications are by nature I/O bound rather than CPU bound, you don't really gain much from spreading across a multi-core CPU.
Let me try to put it in other words: node.js' main strength lies in being asynchronous and event driven, with its underlying event loop transparently handling all I/O operations. This is also why it is single threaded: a single thread is enough to run our code while I/O operations are delegated to a pool of kernel threads orchestrated by the framework. And since we have a single thread, we mustn't block it, so CPU-bound operations are strongly discouraged in our node.js apps. That means most node apps are indeed I/O bound, not CPU bound, which means I/O is our bottleneck, not CPU; thus adding cores won't yield the performance boost one might naively expect.
So what do we do? There are a few approaches; perhaps the most trivial one is to run your app on multiple machines / VMs, where each machine can be a cheap, single-core instance.
Thanks for reading, share and prosper!