Wednesday, May 16, 2012

The love and hate of Node.js

I've been in the happy position lately of interviewing engineers for my new start-up. If you've read any of my previous articles or seen my talks (Talk Notes (warning: PDF download)) you know I love this stuff (Jobs page).

I'm always up for a spirited discussion about algorithms or languages with smart people, but I do consider too much technical religion to be a red flag and it seems to be a rather common affliction.

So when I started interviewing recently, I was immediately reminded of the blind loyalty sometimes given to pieces of technology. As if all other competing languages/frameworks/variable-naming-schemes are "crap" where "crap" is loosely defined as, well, I'm not sure - but something that the person saying "crap" sure doesn't like. It's probably safe to say that any popular technology has (or had) a useful purpose at one point, and also I think it's safe to say that same piece of technology is not always the right solution.

Along with the perpetual Hacker News debates, I right away ran into a "Node Guy". That is, a guy maybe just a tad overzealous about using Node.js for, you know, everything.

I asked him why he chose Node for his last project. The answer was "Because it scales super well".

I will say, the marketing hype around Node is pretty good. I am not saying his answer was wrong. It wasn't. But it's pretty similar to answering the question "Why do you want to buy a Porsche?" - with the answer "Because it's fast".

Likely true, but by no means a distinguishing feature.

It isn't hard to find discussions in the developer community defending the performance of Node. Node, at its core, is a C++ server. That part is likely competitively fast. But after that, the focus is on the performance of the JavaScript callouts. Someone told me that "microbenchmarks" aren't fair to Node and don't show the true performance. I think they're right in both cases - because microbenchmarks likely involve only a small amount of JavaScript. Truth is, the more JavaScript your Node application runs, the more it's going to lose against server frameworks built in faster languages. In other words, microbenchmarks are probably the best Node is ever going to look.

Google's V8 JavaScript engine literally enabled Node to exist at all.

There is, of course, another set of people (the "Node haters") that are nothing short of incensed by the idea of Node. To them, it feels rather silly to create a server framework in a language like JavaScript. I can relate to someone who has spent years eek'ing out all possible performance of a C++ server only to watch someone write one in JavaScript and claim speediness.

In the early days of any field of science - science, invention, and engineering must overlap. That is, folks think up science, try it, and piece it together to see if it works - however rickety. Eventually however, enough tools and best-practices exist to allow details to be abstracted.

When that happens, many more people can create cool things with existing and tested pieces (i.e. engineer them together). Simply, you need to worry less about the details of the science to get things done. People with no knowledge of the underlying science can glue widgets together to make something. Often amazing things - at that time, you might consider that that "science" is somewhat beginning its evolution towards being an "art".

Possibly the quintessential computer science course is something like "Algorithms & Data Structures". Do you need that course to develop apps these days? Again, by proof of existence - I think not. If you have a phone interview with me I will ask you the difference between a Growable Array (aka Vector, ArrayList, etc) and a Linked List. Both structures do arguably the same thing but with notably different performance characteristics.

It's quite hard to create any application without using some form of list, but as long as you have a "list" you know you can get the job done. I promise you there are thousands of sites and apps using the wrong list at the wrong time and incurring nasty linear time searches or insertions in nasty places. Truth is, if that never shows up as a measurable bottleneck, then one could argue that despite the algorithmic transgression - that code is "good enough".
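For anyone who wants the phone-interview answer spelled out, here is a minimal sketch in Python. It assumes CPython, where the built-in `list` is a growable array; the `Node` and `LinkedList` classes and the `build_both` helper are hypothetical names written just for this illustration. Both structures hold the same values, but indexing and front insertion cost very different amounts of work.

```python
class Node:
    """One cell of a singly linked list."""

    def __init__(self, value, next_node=None):
        self.value = value
        self.next = next_node


class LinkedList:
    """Minimal singly linked list: cheap front insert, slow indexing."""

    def __init__(self):
        self.head = None
        self.size = 0

    def push_front(self, value):
        # O(1): only the head pointer changes.
        self.head = Node(value, self.head)
        self.size += 1

    def get(self, index):
        # O(n): walk from the head, one node per step.
        if not 0 <= index < self.size:
            raise IndexError(index)
        node = self.head
        for _ in range(index):
            node = node.next
        return node.value


def build_both(n):
    """Fill a Python list (a growable array) and a LinkedList with 0..n-1."""
    array = list(range(n))
    linked = LinkedList()
    for value in reversed(range(n)):
        linked.push_front(value)
    return array, linked


array, linked = build_both(1000)

# Same answer from both, very different cost:
# array[500] is one offset computation; linked.get(500) walks 500 nodes.
print(array[500], linked.get(500))

# Front insertion flips the trade-off:
# array.insert(0, x) shifts every element; push_front touches one pointer.
array.insert(0, -1)
linked.push_front(-1)
print(array[0], linked.get(0))
```

Neither structure is "the wrong list" in general. The transgression is indexing into the linked list inside a loop, or inserting at the front of the array inside a loop, which quietly turns linear work into quadratic work.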

Happy or sad, "good enough" is getting "good enougher" all the time. CPUs are fast and getting faster, covering the tracks of slow code. We've never lived in a better time for algorithmic indifference. Comparatively, disks are slow, which makes databases slow, which makes the performance of the algorithms and languages you choose in your app less important than ever (not to be confused with "unimportant"). In fact, I'd argue that the entire resurgence of interpreted, dynamic languages can be traced back to the lackadaisical 5ms seek times of spinning disk drives. That's a bold statement and probably a whole other article - but if disks/databases are basically always the bottleneck (rather true in most web apps) - who cares how fast your language runs.

(disclaimer: If you've read anything else I've ever written you know I'm merely an observer of this trend, not a subscriber)
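To put rough numbers on the disk-bottleneck argument, here is a back-of-the-envelope sketch. Only the 5ms seek time comes from the paragraph above; the seeks per request and both app-code timings are hypothetical figures picked purely for illustration, not measurements of any real system.

```python
# Illustrative numbers only: apart from the 5 ms seek time quoted in the
# article, every figure here is a hypothetical assumption.
SEEK_MS = 5.0            # disk seek time, from the article
SEEKS_PER_REQUEST = 4    # hypothetical: seeks the database does per request
SLOW_APP_MS = 2.0        # hypothetical: app-code time in a slower language
FAST_APP_MS = 0.2        # hypothetical: the same work in a 10x faster language


def request_ms(app_ms: float) -> float:
    """Total latency = time waiting on the disk + time running app code."""
    return SEEK_MS * SEEKS_PER_REQUEST + app_ms


slow_total = request_ms(SLOW_APP_MS)
fast_total = request_ms(FAST_APP_MS)
saving = (slow_total - fast_total) / slow_total

print(f"slow language: {slow_total:.1f} ms per request")
print(f"fast language: {fast_total:.1f} ms per request")
print(f"a 10x faster language trims total latency by {saving:.1%}")
```

Under these assumptions a tenfold faster language buys less than a tenth off the total request time, which is why language speed matters less than ever. The milliseconds it does cost are still real, which is why it is not "unimportant".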

The controversy over Node is that it implies that developers from the client are piercing into the server, a domain typically full of people that came up from the OS layer. Those people are asking: does it really make sense to write servers in a historically (slow) client language?

Further, and possibly a bit more personally, should people who only know client languages be writing servers at all? Server code is a unique skill just like many other things. Dabbling in writing servers is like me dabbling in doing web design - trust me, it's not pretty. There are only so many lower levels left - would you want a JavaScript operating system?

On the notable other hand - People who only know or love that client language have been given a whole new freedom and ability. They'll argue (with a rather reasonable defense) that Node.js represents one of the easiest ways to create a server or web app. Even if they don't defend the performance, in many practical cases, they don't need to - like it or not at the right time it can be "good enough" (again, proof by existence). It's positively no wonder they defend Node. They are defending their newly found, wondrous abilities that can solve real problems. Wouldn't you?

So as my information-less friend said - Node will scale. But that is, indeed, information-less. So can Ruby, Rails, Java, C++, and COBOL - architectures scale - languages and servers don't. Just like most web apps, a Node application will probably be bottle-necked at its database. You can fool yourself that Node itself is "insanely fast" but you'd be fooling yourself (Java/Vert.x vs. Node, Java/Jetty vs. Node, Node vs. lots) and rest assured that despite scaling, some portion of your latency is baked into your language/framework performance. Your users are paying milliseconds for your choice of an interpreted and/or dynamic language.

Should your start-up use Node? That depends on a lot of things. If history is a teacher however, massive success will likely push you to something statically typed. Facebook "fixed" PHP by compiling PHP to C++ and Twitter painfully (after years of fail whales) migrated from Ruby to Scala/Java. Square has migrated to JRuby to improve performance; I'll be interested to watch if it's enough (I'm feeling yet another article on the nefarious demons upon drop-in replacing a global-interpreter-locked Ruby with a true multithreaded one).

The fight over Node is, in truth, one of the least truly technical developer fights I've seen. People on both sides are simply defending their abilities and maybe their livelihoods - the technical points are rather obvious. I'd say Node is definitely a possible solution for some non-trivial set of problems - then again, I can think of plenty of situations I'd also veer away from it. But of course - I'm not very technically religious - and I'm definitely not a "Node Guy".

All this being said - I am seriously hiring across the stack (including mobile) at my new start-up. If you have a desire to argue with me about this sort of stuff on a daily basis - email me at paul@refresh.io and attach a resume.

This article was spawned from my own comment at William Edwards' blog

19 comments:

Anonymous said...

>> Facebook "fixed" PHP by compiling PHP to C++ and Twitter painfully (after years of fail whales) migrated from Ruby to Scala/Java. Square has migrated to JRuby to improve performance

Actually, these are all poor examples.

I can't imagine Facebook/Twitter would still be alive if they had started with C++/Java at the beginning of their journey.

rbrcurtis said...

We are rewriting our web app at agilezen.com in nodejs using coffeescript. The new version is completely client-side rendered meaning lots and lots of javascript. As such, using nodejs means that we are using the same language in both contexts, which I think greatly speeds development.

David S Zink said...

I've written a bunch of top-of-class servers, and I've come to see the situation as having a couple major axes. On the one hand there's bottleneck versus slack region; if you need to reliably transact then disk is almost always the bottleneck, otherwise it's between CPU/memorySize/memoryIO/network/logging/storage which parts are slack and which are tight. The other domineering axis is talent. If you have enough talent to dredge your bottlenecks then you can do okay. There really aren't any off-the-shelf solutions for any particular bottlenecks that are generally applicable, that's how life is. The difference between a tunable commercial log-structured database and a hand-tuned application specific log-structured database—with, say, zero-cost deletes—can be orders of magnitude. It's fashionable to ignore this issue, but for a start-up, replacing a thousand database servers with one plus a hot spare means being able to negotiate with your hosting company for lower prices. But you can't assign just anyone to write that and expect good results.

If you're sure an area is performance-slack (and your whole world *may* be), then really why not go for ease of implementation, ease of maintenance, ease of modification, that sort of thing? Node.js certainly scales better than some other solutions. If you can use it and you want it, then use it. If it becomes the bottleneck, then replace it. Because in the real world hardware changes and requirements change and thus bottlenecks change. The business joy of loose languages is that you can quickly build things and easily experiment. The important thing is to have all your JS structured so that it is comprehensible and you can possibly translate it to a different system (e.g. scala). That's really the sudden fatal pain of loose languages, that they allow loose programming that is not truly comprehensible and then cannot be replicated. They create lock-in, and there really is nothing in the world so perfect you want to be locked in.

As for your client-programmer observation, I see it as there are client programmers who know the different VMs and the events and UI interactions, and there are the more business-logic programmers that might as well be on both sides of the net and moving functionality back and forth. And in the NodeJS world, I suppose the more serverish programmers who understand how to wring lower latency out of the asynchronous model, and hide most of that from those business programmers who are uninterested in such things.

David S Zink said...

Also, www.twitter.com and www.Facebook.com fail so often in so many different little ways (as they each just did for me) that I'm shocked anyone would use them as an exemplar of successful scaling; they are just better than they were.

Loading Tweets seems to be taking a while.
Twitter may be over capacity or experiencing a momentary hiccup. Try again or visit Twitter Status for more information.

Anonymous said...

Like rbrcurtis says, I don't think a major advantage of JS on the server is speed. But that doesn't matter: PHP, Ruby, and Python are no speed demons either compared to LLVM/GCC frontends. The fact that more and more is moved to the frontend (complex web applications, mobile apps, games) and that both sides are becoming more and more complex begs for greater synergy between client and server programming. Even simple things are easier if you only have to write them once. That's the greatest power of Node IMHO; there are a lot more things to like, especially when the JS execution envs grow up. Things like Fabric are interesting to keep an eye on. V8 is fast enough to battle Ruby/Python; at least now I can write my code once (even complex jquery).

Anonymous said...

get it done in whatever makes you feel better. I love javascript but my server-side code is in sinatra. Later, when your app is successful and merits it, hire someone more knowledgeable to recode it in a better choice. Make everything understandable, well commented, etc. to ease that transition.

If you don't ship, it doesn't matter.

lerouxb said...

The times I chose to write something in node the reason was always because I'm sharing a lot of code between the browser and the server. Otherwise I tend to pick something else because node.js / JavaScript certainly isn't easy compared to the alternatives.

I also don't think I have ever had performance problems caused by my choice of language. (Python, JavaScript, C, Java, C++, even PHP). But maybe that's because I've only been programming since the 90s.

In my experience "server code" tends to be by far the easiest part of most modern web apps. Try to do some state of the art frontend stuff some time without it becoming a mess - you might be surprised. A welcome change from the typical boring old crud.

Anonymous said...

You eked out all possible performance from your C++ server (by which, I assume, you converted it all to proper C code).

ORIGIN Old English ēacian, ēcan (in the sense [increase] ), of Germanic origin; related to Old Norse auka.

Omega said...

Definitely agree. The most important thing for Node is that it hasn't delved back into the science.

They've integrated V8 with a socket library fairly well, but when you look at their reaction to the CommonJS AMD "standard", you can tell there might be some regimentation in their thought processes.

When I see a nicely made IoC built on AMD running on top of Node, I might consider it again. But all the libraries built on Node right now just look like risky one-offs.

I say this as someone with lots of enthusiasm for the whole single-codebase idea.

Some people over at the Dojo Toolkit have been making inroads towards this. Great and very caring people over there!

Anonymous said...

In most cases with startups, the biggest problem they face is not scaling, but finding customers and creating a business model and actually finishing an MVP to test it all out. So that's why I think that you should hesitate to use Node.js for a startup unless it's clearly your fastest path to market. If you're ever in a startup and you are facing serious scaling issues, you will likely have a combination of lead time, more financial resources, and a better sense of what the bottlenecks are to architect the best solution.

Developers want to focus on technical stuff, but it's usually non-technical stuff like design, marketing, pricing, writing, etc that determines whether a business succeeds. Even technically focused startups can focus on this stuff just based on outsourcing some of the stuff they're not familiar with, for example there's a hundred+ companies at BuyFacebookFansReviews that do nothing but social media promotion - you don't have to be an expert to focus on some of this stuff if you are smart. Trying to scale before you have customers and a proven business model is almost always a waste of time.

Over time, NodeJS might be a better option as callback soup problems are solved and better practices are developed and more full-featured frameworks are ready. But it's not quite there yet IMO in the way that PHP/Ruby/Python are there.

Joran Greef said...

You can use TypedArrays in Javascript to work around many of the issues you describe.

obsurvey said...

One of the big reasons I like Node so much is the fact that your code is so much "closer to the metal". There aren't huge frameworks to learn, and building a REST API with it seems optimal, compared to other technology stacks I've tried.

Regarding being a "Node-guy" like you describe, I agree, he's annoying, but so is the "Java-guy", "dotNet-guy", "Scala-guy" or "Ruby-guy". I don't see anything special about node in that regard.

jmarranz said...

Hi Paul I'm surprised you are not commenting about the manual multi-threading simulation of Node.js, anyway as an admirer of your solid knowledge, sometime ago I wrote a piece of node.js criticism, you're going to find it is rooted on your investigations about threading vs magic mono-thread.

TSS article

MP said...

I remember, when I was interviewing for jobs, a lot of places were visibly disappointed that I didn't share their enthusiasm for . I think this problem exists on all sides, and employers are often just as guilty of it as employees.

Anonymous said...

The bottleneck is rarely the language. It's either the developer, the budget, the hardware or a combination thereof.

I recall observing a conversation between two Java devs about the performance of a certain algorithm and whether one approach was faster than the other. They seemed obsessed with squeezing every last bit of juice out of their solution.

When I pointed out that immediately after the algorithm completed the result would be passed through two service boundaries which would result in a glut of XML serialisation their conversation and concern suddenly dried up.

Darryl said...

I tend to agree with most of what you've said. The reason some people stick to node/js is that they can keep the same mindset developing the clientside and server side. You know if you ever have to switch languages mid step that you'll start to let errors creep in and code in the wrong language. I currently support/develop on 5 different languages and switching mental gears is a painful experience.

For a start-up, just getting a working product is hard enough, not to mention surviving a few years to get market traction. If your startup is successful, then you can spend some time and dollars fixing up the mistakes and recoding. Chances are that if you are a success, there will be a lot of redundant code lying around that could be trimmed, so a recode is usually a good idea anyway. Just make everything modular from the start, so you can recode in chunks rather than a big bang approach.

Glenjamin said...

There's a bunch of reasonable reasons for going with NodeJS, few of these are unique to Node, but I'm not aware of another ecosystem which provides all of these.

* A high level language - which makes for faster prototyping
* A well-known language that's pretty easy to pick up
* An excellent module system (both in terms of implementation, and ecosystem)
* Utilising the reactor pattern, for "simple" co-operative multitasking
* An extremely fast VM, JS is probably the only well specified language with multiple popular implementations which compete on speed. This is the reason V8 is fast.
* No blocking modules (compared to twisted on python for example, where there's a whole load of vanilla python libraries you can't use)
* A networking library that abstracts away details while still giving you fine grained control

All of these and more stack up to making it an exciting proposition. "It scales really well" is clearly someone who doesn't understand the technology properly.

NodeJS has its share of bad points, including maturity, especially of the more esoteric 3rd party modules, the GC implementation (1GB heap limit, eek!) and probably some others I've run into but can't recall right now - but on balance I like having it in my toolkit.

Jason said...

My latest project is a financial planning app using a Javascript "database" called TaffyDB. I'm sure it has terrible performance issues, but Javascript is much faster now than it used to be and the dataset will be quite small. It will eliminate disk read/write time and the only network latency is retrieving the html/javascript files.

While I thought about writing a bunch of server calls so that I can hide all the data processing, I expect this project to scale much better by only using javascript... and I mostly don't want to write something that relies a lot on server scripting/database performance and scalability unneeded nonsense.

I agree wholeheartedly with your analysis-- write something that works great, write it good enough, eliminate latency bottlenecks and make it so even if it is an incredible idea- nobody notices because it's fast!

alfred said...

Javascript and Ajax is something that web designers need to get comfortable with. Javascript is the basic interactive element of a web site so, quite obviously, as a good web developer, you need to master it.

javascript jobs