Wednesday, November 2, 2011

Gomemcache with multiget support

It's my pleasure to announce that gomemcache already has six contributors who have supplied changes and fixes to the original code base. A recent patch by Arbo von Monkiewitsch adds a GetMulti() function that fetches values for multiple keys with one function call.
As a side note, one third of the patches, as well as a few of my own commits, are purely compatibility fixes for subsequent Go releases. I think that at the moment frequent changes to the standard libraries and the language core are the main blocker preventing wide adoption of Go by the IT industry.

Friday, September 16, 2011

Human factor

Having recently moved to a netbook, I realized again how important performance is for software development. I realized it even more when my Android phone became almost unusable after another software upgrade. As a kid I was a big fan of the computer demoscene, where gifted coders showed the world how to make the hardware do things unimaginable even to its creators (like the famous Commodore 64 FLI graphics mode). Now that the number of small devices connected to the Internet exceeds the number of computers and laptops wired to the global network, writing efficient software is a must: Facebook develops HipHop for PHP (written in C++), Google releases the Native Development Kit for Android and makes Go the first compiled programming language available for the Google App Engine platform.

It is commonly argued that ANSI C is the "fastest programming language" that exists. No, it's not. A language itself is neither slow nor fast; its implementations are (compare Ruby and JRuby, for example). However, it's true that software written in C can usually be compiled to the most efficient executable code (excluding pure assembly). No wonder that the most efficient pieces of software, like operating system kernels (Linux, BSD, Solaris, Windows, etc.) or programming language VMs (Common Lisp, Python, PHP, Ruby, and even Java), are all written almost exclusively in C (not even C++).

Out of curiosity I started to look around for web software written in C and found a wonderful web server called G-Wan. It's very small and its performance looks really amazing compared to other web servers - some enthusiastic reviews even call it The Future of the Internet and wonder why such brilliant software is not popular. Well, G-Wan is not popular, because there is a problem. A big problem. And the name of the problem of G-Wan is Pierre Gauthier, its author. If you browse through G-Wan's website and forum you can learn that he is a man with a really huge ego (he calls himself one of the best engineers available), who loves to criticize other people's work (like that of Poul-Henning Kamp, the author of Varnish). But at the same time he does not want to publish his own source code or make it open source, because he perceives other developers as inferior idiots, who would surely break it or at least bring nothing interesting to the existing code base. Moreover, evil government agents will try to introduce backdoors into his software and you are more secure when Pierre hides his code deep under his bed and tells you that you should trust him. And of course you do, don't you? And if you invest your time and money in G-Wan you don't have to worry at all about future development and maintenance, because even though the source code is closed, Pierre also gives you his word that he will not drop the software, and that he will never get hit by a bus (I'm not exaggerating, just read this post). I will not go deeper into his paranoia about his website being constantly attacked by Microsoft and NSA servers, because you should already have an idea about the way the guy thinks.

Edit: Contrary to the information provided by its author, some security problems have actually been found in G-Wan. The affected version 2.10.6 is no longer available, and the issue was addressed by Pierre in his usual manner (one of the funniest claims is that nobody ever tried to confirm the bugs, while there is no archive of older G-Wan versions to verify them against). Also, any links leading to the report are being actively removed from Wiki pages (see the comments to this post for details).

At the other pole there is a true pearl called Tiny C Compiler, created by Fabrice Bellard. It's so insanely fast (it can compile a typical Linux kernel in less than 15 seconds!) that it can be used to write scripts or servlets in ANSI C (guess what is used internally by G-Wan to compile the servlets). Together with another open source project, libmicrohttpd, it can become a good alternative if you want to build a small, fast web server that can use ANSI C servlets (for example as embedded router software). Libmicrohttpd is fully HTTP 1.0 and 1.1 compliant, and it offers several threading models, so you can tune your software and choose the one that suits your environment best. If I were to build a tiny embeddable web server, I would definitely choose open source libmicrohttpd + tcc + some personal coding over G-Wan.

P.S. G-Wan's author deleted the forum from his website, also ensuring (with a carefully crafted robots.txt file) that none of its contents are archived by search engines. The forum contained a lot of important information for G-Wan users, but user support is obviously less important than invalidating links in this and other unfavourable posts. Of course you are still welcome to believe in "source code insurance", and that the same will never happen to G-Wan.

Friday, September 2, 2011

Another one bites the dust

A few days ago I found an interesting post on Ken Shirriff's blog, entitled Why your favourite language is unpopular. It summarizes a talk by Erik Meijer, who provided a simple yet very accurate explanation of why some programming languages gain popularity while others (often better in many respects) don't. I thought about this formula in the context of Reia, a very promising language for the Erlang platform that I mentioned back in 2008 in my post Scripting Erlang. The author of Reia has just announced that he is dropping development of Reia in favour of Elixir. It seems that with the amount of software existing today there is very little room for another programming language, even if it addresses problems that are hard to solve with mainstream languages. With the existing army of programmers at its disposal, the IT industry can deal with most of its problems using existing tools, without the need to invent another language to rule them all. A good example of such a philosophy is Cilk Plus, which originates from Cilk and makes it easy to scale C/C++ code on multicore CPUs. With Cilk you can improve your existing software without rewriting it from scratch in a new computer language, which in the long run can create more problems than it actually solves.

Tuesday, August 30, 2011

Running Scala via Nailgun

Recently a friend of mine got me interested in Scala, a modern programming language for the JVM. Scala reached a wide audience in 2009 when Twitter announced that they use Scala in the most critical elements of their backend. It became popular for its efficiency and scalability, and for joining, in a very pragmatic manner, the concepts of functional and object-oriented programming. It also implements some very powerful constructs, like foldLeft, which allows you to compose your own reduction methods operating on lists (see this page for some very interesting examples).
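As a rough illustration of what a left fold does, here is a sketch written in Go rather than Scala. The generic FoldLeft helper is an assumption made up for this example; it is not Scala's implementation and not a Go standard library function. It walks a list from left to right, feeding an accumulator and the current element into a function you supply, which is enough to express sum, maximum or list reversal as one-liners.

```go
package main

import "fmt"

// FoldLeft starts with init and combines the accumulator with each
// element of xs in order, from the first element to the last.
// Hypothetical helper written for this illustration.
func FoldLeft[T, A any](xs []T, init A, f func(acc A, x T) A) A {
	acc := init
	for _, x := range xs {
		acc = f(acc, x)
	}
	return acc
}

func main() {
	nums := []int{3, 1, 4, 1, 5}

	// Sum: the accumulator is the running total.
	sum := FoldLeft(nums, 0, func(acc, x int) int { return acc + x })

	// Maximum: the accumulator is the largest element seen so far.
	largest := FoldLeft(nums, nums[0], func(acc, x int) int {
		if x > acc {
			return x
		}
		return acc
	})

	// Reverse: each element is prepended to the accumulated slice.
	reversed := FoldLeft(nums, []int{}, func(acc []int, x int) []int {
		return append([]int{x}, acc...)
	})

	fmt.Println(sum, largest, reversed)
}
```

In Scala the same sum would be written with the built-in method, along the lines of nums.foldLeft(0)(_ + _).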

I decided to give it a try on my Acer Aspire One netbook, which I often use for daily tasks. Unfortunately, with a 1.6 GHz Intel Atom and 1 GB of RAM running Fedora 14, it took the Scala REPL almost 20 seconds to start. I was wondering if I could make it faster and found out that some people recommended Nailgun for this task. In short, Nailgun preloads the whole JVM into memory and then uses its own bootstrap loader to connect your programs to it, so that they load much faster, without the JVM startup overhead. However, besides some good advice I couldn't find any working example of how to do it, so I decided to do it myself. Fortunately it turned out to be quite an easy task. First, I copied nailgun-0.7.1.jar to Scala's lib directory, then I took the original bin/scala script and created a modified version. I changed line 161 from
scala.tools.nsc.MainGenericRunner  "$@"
to
com.martiansoftware.nailgun.NGServer  "$@"
and line 113 from
CPSELECT="-Xbootclasspath/a:"
to
CPSELECT="-cp "
The last change was necessary for Nailgun to load properly. Then I started the regular Scala REPL through
ng scala.tools.nsc.MainGenericRunner
just to learn that it still took 20 seconds to start. Conclusion? Modern JVM loading overhead is so small that reducing it further makes little sense, even on slow machines.
Lessons learned:
1) Using Nailgun to speed up Scala won't get you anywhere.
2) Don't listen to forum "experts" who recommend solutions which they have obviously never tried in practice.

Saturday, March 26, 2011

Need for speed

Working as a web developer maintaining a big web service can quickly make you understand that "everything counts in large amounts", and identifying and eliminating bottlenecks eventually becomes part of your daily routine. There are many ways to optimize your web application: from migrating to a faster hardware platform, through reducing the number of network requests and making extensive use of various caching layers, to refactoring the code and optimizing your algorithms. Some very interesting performance tips can be found at Yahoo Developer Network - but sometimes it's just not enough, and you have to reach deep into the core of your software engine.

One way is to rewrite your application using either Java or C/C++. Another way, especially when your code base is so huge that it does not pay to rewrite it (even partially), is to try to compile the scripts to machine code. The best known engines implementing this technique are PyPy for Python, developed as a followup to the Psyco project, and HipHop for PHP, developed at Facebook. What is especially interesting about the HipHop story is that it was announced shortly after Koen Deforche stated on his blog that Facebook could reduce the power consumption in their data centers by 75% just by using C++ instead of PHP. Not surprisingly, Koen Deforche is the author of Wt, a C++ toolkit designed for building web applications.

What makes Wt stand out among its alternatives is that it completely masks all underlying web technologies and makes Wt development look more like Qt than web development. However, developing regular web services with Wt is not easy. For the last few days I tried to create a fairly simple application with Wt and I simply gave up. For me it was just like playing with Lego Technic for Aliens. Maybe it's because Wt, as its author admits, is not a toolkit for porting an existing website to C++, but a toolkit for porting an existing desktop application to the web, and I'm primarily a web developer. Or maybe it's because there are too few useful real-life examples out there and the community seems to be almost nonexistent. In my opinion Wt is rather a niche product, best suited for developing highly scalable intranet applications for big companies.

At the opposite pole there is KLone, a lightweight web framework that at first glance looks like JSP. KLone allows creating dynamic content by embedding C++ code into HTML, much like JSP does with Java. It also provides all the necessary mechanisms to deal with sessions, requests, etc. The application is of course compiled to native code and can be distributed as a standalone executable, since it includes its own embedded HTTP server. At the moment it's definitely my choice if you want to write a web service based on C++, no matter if it's because your hardware is too poor for Java or you just don't want to hog your system resources with Tomcat.