FreeMarker Blog

The official weblog of the FreeMarker project

Sunday, February 26, 2006

Musings on Wikipedia and Open Source

Recently, I have been using Wikipedia quite a bit. First off, let me say that Wikipedia really is a blast. It is informative, like a regular encyclopedia, but because anybody can contribute, there is a lot of funky stuff there that would not be found in a conventional encyclopedia. Surfing Wikipedia is just a whole lot of fun!

Of course, a naysayer would likely interject at this point that if you want to have some fun, go to Wikipedia, but if you want complete, reliable information, look at Britannica. I have to admit that this would also be my instinctive reaction. However, much to my surprise (and doubtless that of many other people) a recent study that appeared in Nature magazine found that the accuracy of scientific articles was not significantly different in Wikipedia than in Encyclopedia Britannica. Here is an article on that.

Now, aside from a few very specific topics, like java template engines, or preflop strategy in Omaha Holdem poker, and maybe a couple of other things, I am not really qualified to judge the quality of information in Wikipedia. On the other hand, I have noticed that most Wikipedia articles that I come across are fairly well written. I am sensitive to this and consider myself to be a fairly good judge of it.

I have been thinking about various things regarding open source. It seems to me that Wikipedia's strengths (and weaknesses) as compared to a conventional encyclopedia are pretty much those of the open source development model as compared to conventional software development.

One of the revolutionary aspects of free software is that it drastically reduces barriers to entry. Anybody who is interested and motivated can hack the source code. Similarly, anybody can contribute material to Wikipedia. It seems likely that a conventionally published reference book would have some advantage in quality over the Wikipedia model. After all, there is a paid editorial staff that would do systematic fact checking and do some needed line editing and so on. However, the advantage of the wiki model is how flexible it is, how quickly it can admit new material, updates, and improvements. For example, if a major scientific discovery is made in a field that invalidates previous theories, it is likely that a wiki-based encyclopedia would incorporate this information much more quickly than a conventional publication. So there is a clear trade-off. In a field as rapidly moving as java software, for example, you might well prefer an article about java development tools that is up-to-date but may contain some errors and bits of sloppy prose over a more polished article on the subject that is a couple of years out of date.

Not long ago, I wrote a blog entry about the issues involved in (very) hypothetically joining Jakarta. Though I answered by way of a simile involving King Arthur and the Round Table, I think that the reasons I gave would be clear to anybody reading it. However, an aspect of this that I did not mention there was that I have gradually come to the conclusion that ASF's entire vision of the open source process is incorrect. Certainly, there are reasons to have severe doubts about it. In recent private correspondence, a java developer commented that, once you got beyond surface impressions and actually rooted around, one could see that over 90% of ASF projects were in some kind of state of hibernation or even severe abandonment. He worried that an open source project that he liked and used quite a bit was on the road to becoming an ASF project, and that this would likely be the kiss of death.

I am not so familiar with that many ASF projects, so I cannot vouch for the 90% figure this person gave. However, I think it's clear that there is some kind of systemic problem.

My considered view is that the root of the problem is that ASF wants to project a certain elitist idea, that becoming a committer on an ASF project is some kind of great honor. If you lurk on a given project's mailing list, you will on occasion see them announce, to great fanfare, something like: "John Jones has been accepted as a FooBar committer." This kind of thing has always caused me to roll my eyes. The subtext is a bit like so-and-so has been admitted as a high priest who may now enter the inner sanctum and touch the holy of holies (which is the code repository, presumably.)

So, until they admit you to the holy of holies, you are basically in some kind of supplicant position: "Please sirs, will you look at my patch." And, of course, since the patch in question is typically something that only that person needs at this moment (or maybe other people need it but none of the committers do) what with one thing and another, they typically don't get around to looking at the guy's patch.

Certainly, the FreeMarker project is not run this way. If somebody expresses some interest in hacking the code, I pretty much immediately offer to add them as a developer so that they can commit code. They are added with no great announcement or votes or fanfare. Note that no vetting has occurred here. I will typically have no objective proof of the person's abilities. It is enough for them to say that they are interested in doing something for me to provide CVS access. Basically, we simply assume that somebody is competent until proven otherwise.

Now, as a practical matter, there is not really much problem with people then turning around and committing poor-quality code willy-nilly. Actually, nine times out of ten, somebody expresses interest in doing something and you give them r/w access to CVS and they just never do anything -- good or bad.

So, people committing all kinds of poor quality code is not a common real-world problem. But it can occur. However, even when it does occur, how much of a problem is it? Somebody does something and you can see what they did and modify their work or completely roll it back. This is, in fact, the whole point of a versions repository, is it not? Since it is fairly easy to roll back the code to some previous known state, why should one be so conservative about letting people commit code?

But again, I think the core problem with ASF is this underlying elitist idea. And I think it's wrong; open source is not elitist by nature. It's more like: "If you can do something, then roll up your sleeves and get to it, let's see what you can do." In other words, a person is assumed to be competent until proven otherwise. The ASF approach seems, on the other hand, to assume that you are not competent to collaborate until somehow proven otherwise. What makes matters worse, though, is that it is not really obvious what people who become committers have done to prove their competence. It often seems to be more of a popularity contest with people voting +1 and so on.

So, admittedly, another take on this is that the problem is perhaps not elitism per se, but elitism that is arbitrarily applied. If you're going to be elitist, you should at least have some objective criteria.

The other aspect of this that I think is quite worthy of comment and some analysis is that, as far as I can see, the projects that line up to join ASF are not doing so because they really believe that there is any technical value in it. It is purely to leverage the Apache brand name and thus take advantage of those placement and visibility advantages. Now, if you look at the pages relating to the Apache incubator, they state the supposed technical advantages that getting in with ASF involves: access to world experts in specific domains, experts in running open source projects, etcetera. But again, I do not believe that the OSS projects that want to get in believe any of this. It's purely for the visibility. For example, when the Struts/WebWork merger was discussed on forums, the argument used was that by merging with Struts, they would get WebWork's technical superiority along with Struts's "community". I parsed "community" in this case to be code for the marketing advantages. (In general, "community" is a term that ASF people use frequently in an odd and somewhat mystical way.)

I do not recall that anybody suggested that the Struts people had anything to offer technically. Zero, zilch, squat. Of course, similarly, when Leos Literak asked me about the possibility of joining ASF, it was all about publicity; he never at any moment in the discussion suggested that ASF had anything to offer us on a technical level.

Well, all of this does introduce a real element of moral hazard. If you join ASF purely because so many people believe in this "Apache mystique" (that you, yourself do not believe in) then you now have a vested interest in perpetuating said mystique, since, after all, your whole strategy was based on the continuation of this mystique.

As a final note on this "Apache mystique", if what my correspondent said is true, that over 90% of ASF projects are in a sad state of neglect, then a great gap has opened up between hype and reality. A very huge gap indeed. Is such a situation sustainable long-term? You know, it may be analogous to what happens in a financial boom, where something like internet stocks, say, gets priced at some level completely out of line with whatever real economic value these things have. Such booms ultimately lead to a day of reckoning, a crash. When exactly such a crash occurs is a matter for the theory of tipping points, etcetera. Or maybe there won't be any such "crash". Still, my sense of things is that this Apache mystique will ultimately end up being deflated significantly.

Well, nothing is more humbling than trying to predict the future. I have no crystal ball. We shall see.

Monday, February 20, 2006

Getting back in the Groove

Over the last couple of years, I have been doing fairly little in software development. My main activity has been online poker. I see little reason to keep this much of a secret.

Nonetheless the itch to do some coding has resurfaced (maybe it's in the blood, like a virus or something) and I decided to get back in the groove, as it were. Aside from getting more involved in FreeMarker again, I began writing some poker-related code. The current (still very green) state of things is on sourceforge.


I originally decided to use the pokersim project as a basis to explore Ruby. So I started the pokersim code in Ruby and made very fast progress despite the huge holes in my knowledge of the language.

At some point, however, I decided, on a whim, to port the code over to Java, to compare performance as well as certain code metrics. I'll cut to the chase quickly and say that the same code (algorithmically the same code, it's different code obviously) ran 10x faster in Java. To be fair, this is because of the advanced JIT technology in Java. If you run with the JIT compiler disabled, the interpreted Java bytecode runs approximately at the same speed as the Ruby code. I am not enough of a compiler wonk to know whether the Ruby bytecode is as optimizable in theory as the Java bytecode, or in other words, whether the kind of tricks that Hotspot and its brethren do could be replicated in the Ruby world. (Perhaps not, due to Ruby's much more dynamic nature, but I don't know for sure either.)

In any case, I am really a practitioner in this field, not a theoretician, and I have to work with the tools available rather than theorize about what tools are possible. I wanted to write software that could generate a lot (like maybe some tens of thousands) of random poker deals and generate statistics. To be honest, I'm still not sure about the exact application, but it is clear that, while a 2x or 3x difference might not be so dramatic, a 10-fold speed difference could make a huge difference in the usability of any software that comes out of this.
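As a rough illustration of the kind of simulation meant here, the following minimal Java sketch deals random two-card hands and counts how often they form a pocket pair. The class and method names (DealSimulation, pocketPairFrequency) are invented for this example and are not taken from the pokersim code; the choice of statistic is likewise just an assumption.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical sketch, not pokersim code: estimate how often two random
// cards from a 52-card deck share a rank (a "pocket pair").
final class DealSimulation {

    // Cards are encoded as 0..51; the rank is the code modulo 13.
    static List<Integer> newDeck() {
        List<Integer> deck = new ArrayList<Integer>();
        for (int code = 0; code < 52; code++) {
            deck.add(code);
        }
        return deck;
    }

    // Reshuffles the deck for every deal and looks at the top two cards.
    static double pocketPairFrequency(int deals, Random random) {
        List<Integer> deck = newDeck();
        int pairs = 0;
        for (int i = 0; i < deals; i++) {
            Collections.shuffle(deck, random);
            int first = deck.get(0);
            int second = deck.get(1);
            if (first % 13 == second % 13) {
                pairs++;
            }
        }
        return (double) pairs / deals;
    }

    public static void main(String[] args) {
        double observed = pocketPairFrequency(50000, new Random(42));
        // The exact probability is 3/51: 3 of the remaining 51 cards match.
        System.out.printf("observed %.4f, exact %.4f%n", observed, 3.0 / 51.0);
    }
}
```

Even a toy loop like this runs tens of thousands of shuffles, which is why a 10-fold difference in raw speed matters for this sort of program.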

I decided to switch the project over to Java. If I take another run at Ruby, it will be when I have defined some project where I know performance is not critical. (Or it could be that, at some future point in time, Ruby performance will be much better.)

Java 1.5 and Generics

So now, rather than using pokersim as a testbed for learning Ruby, I decided to use it to explore new features in Java 1.5, in particular type-safe enums, generics, and so on.

The type-safe enum introduced in java 1.5 is unreservedly wonderful. In 1999 or 2000 I remember writing a type-safe enum class that worked in a very similar way to the way java 1.5 enums work. However, it was not part of the language, so it was much much more awkward to use.
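For readers who have not used them yet, here is a minimal sketch of a java 1.5 enum. The Suit type, its symbol field, and the fromSymbol helper are made up for illustration and are not from pokersim.

```java
// Hypothetical example type, not taken from the pokersim code.
enum Suit {
    CLUBS('c'), DIAMONDS('d'), HEARTS('h'), SPADES('s');

    private final char symbol;

    Suit(char symbol) {
        this.symbol = symbol;
    }

    char symbol() {
        return symbol;
    }

    // Looks up a suit by its one-letter symbol; fails loudly on bad input.
    static Suit fromSymbol(char symbol) {
        for (Suit suit : values()) {
            if (suit.symbol == symbol) {
                return suit;
            }
        }
        throw new IllegalArgumentException("Unknown suit symbol: " + symbol);
    }
}

final class SuitDemo {
    public static void main(String[] args) {
        Suit suit = Suit.fromSymbol('h');
        // Language-level enums can be used directly in switch statements,
        // which a hand-written type-safe enum class cannot.
        switch (suit) {
            case HEARTS:
                System.out.println("hearts, ordinal " + suit.ordinal());
                break;
            default:
                System.out.println(suit);
        }
    }
}
```

The compiler supplies values(), valueOf(), and ordinal(), which is exactly the boilerplate a hand-rolled version had to carry around.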

I also started using the genericized containers in java.util.*. The main advantage of generics is notational convenience. The type-safety issue is mostly an ersatz problem. The reason is that you very rarely in my experience have a heterogeneous container anyway. For example, if the cards object below is a regular untyped ArrayList from pre-1.5 java, you might write something like:

for (Iterator it = cards.iterator(); it.hasNext();) {
    Card card = (Card) it.next();

The argument that the (Card) typecast is somehow unsafe is, to my mind, merely a theoretical argument. As a practical matter, the real problem with ungenericized code like that above is its extreme verbosity. In pokersim code, I have a Cards class which is basically an ArrayList with a bunch of convenience routines.

public class Cards extends ArrayList<Card> {

and now I can write instead:

for (Card card : cards) {

And of course, the card variable is already defined within the loop. The whole thing is much more pleasant. It is shorter to type, it is easier to read and see the intent. Since such loops occur all over the place in the code, it makes a huge difference to code readability. It is already not too much worse than python code such as:

for card in cards :

The above is, I guess, notational nirvana. Is there any cleaner, more economical way to express the idea that you are iterating over the elements in a container? The java code still has significant visual clutter, the ( and ) and { and }, delimiters that python does without. Python (and Ruby) are still much nicer languages to write (and read) code in. The generics in java 1.5 simply reduce the gap somewhat.

But again, the practical advantage of the generics here is, in my view, entirely on a notational level. The type-safety argument that is often harped on when generics are discussed strikes me as mostly an ersatz issue. Code throwing ClassCastException when you cast elements taken out of containers was never that much of a problem in pre-1.5 java code.
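To make the comparison concrete, here is a small self-contained sketch that puts the two loop styles side by side. Card is a stand-in class with a single made-up field, and Cards is reduced to the bare declaration quoted earlier; the real pokersim classes surely look different.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Stand-in for the real Card class; the single field is an assumption.
final class Card {
    final String name;

    Card(String name) {
        this.name = name;
    }
}

// Same declaration as quoted earlier, without the convenience routines.
class Cards extends ArrayList<Card> {
}

final class LoopStyles {

    // Pre-1.5 style: raw list, explicit Iterator, explicit cast.
    @SuppressWarnings("rawtypes")
    static String joinOldStyle(List cards) {
        StringBuilder out = new StringBuilder();
        for (Iterator it = cards.iterator(); it.hasNext();) {
            Card card = (Card) it.next();
            out.append(card.name).append(' ');
        }
        return out.toString().trim();
    }

    // 1.5 style: the element type is declared once, so no cast is needed.
    static String joinNewStyle(Cards cards) {
        StringBuilder out = new StringBuilder();
        for (Card card : cards) {
            out.append(card.name).append(' ');
        }
        return out.toString().trim();
    }

    public static void main(String[] args) {
        Cards cards = new Cards();
        cards.add(new Card("Ah"));
        cards.add(new Card("Kd"));
        System.out.println(joinOldStyle(cards));
        System.out.println(joinNewStyle(cards));
    }
}
```

Both methods produce the same string; the only difference is how much ceremony the loop header and the first line of the body need.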


Another thing is that I have started using the Eclipse IDE for my java coding. Even aside from the tool's intrinsic merits, simply the fact that my main FreeMarker collaborators (Attila, Daniel, Stephan) use Eclipse is a major consideration, and I had taken various runs at using it before. However, until now, I had never been able to keep using it. I always ended up switching back to using jEdit (along with certain key plugins like XRefactory and AntFarm) as well as a whole hodgepodge of little scripts I used from the command-line, using a lot of grep and find and so on. I actually recognized that this was a more retrograde approach, but I still abandoned Eclipse each time.

But now I am using Eclipse and there is little likelihood that I am going back.

Perhaps the main reason that I have moved to Eclipse this time for good (as compared to previous runs) is that I am running better hardware. I bought a new Dell Inspiron 9300 last September. My previous memories of using Eclipse were that it was just so monolithic and slow. Also, I am simply running the Windows XP that came with the Dell and in previous runs, I was running Eclipse on Linux, where Java is much less performant. In particular, memory usage was just frightening on linux.

Eclipse itself may have got better too, as may the underlying JVM. But whatever the reasons, it was mainly that the thing was such a hog that, while recognizing the tool's virtues, I couldn't use it; it just drove me nuts.

But that was then and this is now. Consider me an Eclipse convert. Eclipse is also a good argument for Java over Ruby or Python, say. I do not believe the latter have development environments that are comparable.

Friday, February 17, 2006

Google doesn't index us...

Google hasn't indexed our official domain for years, probably because of the earlier domain name hijacking issue, when someone ran a porn link park on that domain. So we haven't stopped using our other address since then, because that one is at least indexed by Google. But the Google rank of a page is mostly calculated from how many external links point to it. Since most external links use the domain name of our official homepage, those links are all wasted as far as Google rank is concerned. Given how critical Google rank is for visibility, I see this as a serious loss.

Thursday, February 16, 2006

Some comments on joining Jakarta (or not)

A longtime FreeMarker user, Leos Literak, asked on our mailing list about the possibility of our joining Jakarta/ASF. I figured that by answering on this blog rather than on the list, I could kill two birds with one stone -- I could answer him and inaugurate this blog with something (hopefully amusing) for people to read.

I guess Leos knows that such a move is highly unlikely, if only on account of the less than cordial relations that some of us have with some of the Jakarta/ASF people. But that is only part of it, at least speaking for myself. Other FreeMarker developers may not share my views (or share them partially with some nuances.) But in any case, I am only speaking for myself.

Okay, I shall assume that the reader is familiar with the Arthurian legends (you know, Camelot, Excalibur, the Round Table) through one or more of their literary, or more likely, cinematographic adaptations. Let us say that I am a knight of that time period, known for certain knightly deeds. It is proposed that I should try to be admitted to the Round Table, to become one of the famed Knights of the Round Table.

I have to admit that the idea has certain appealing aspects. You see, there is a whole mystique around the Round Table. This is mainly because there is a whole legion of wandering troubadours who go about the land singing epic songs about their heroic exploits. This has a lot of advantages. It seems that if you are a Knight of the Round Table, you are given the choicest morsels at feasts. When you show up somewhere, they break out the best wines and spirits to celebrate that such a hero is gracing their presence. The fairest maidens of the land fall at your feet.

So it would seem to be an easy decision. However, there is a dark side to all of this. You see, suppose that you happen to know that the Round Table mystique is based on quite little. On the contrary, these Knights of the Round Table are, in reality, quite the motley crew. Some are not such bad fellows, but many others are thugs and cutthroats and others are cowardly bullies. To your mind, King Arthur himself is no more admirable than any other mafia boss trying to expand his territory. (As a side note, while I grant that I was not there, this version of things strikes me as being at least as likely to be true as the legends.)

Suppose that I don't accept Arthur as a legitimate king. (For example, I happen to know for a fact that when he pulled the sword out of the stone, it was a cheap sleight of hand trick.) However, to become a Knight of the Round Table, I must kneel before him and accept him as my sovereign. I will probably also have to retract any prior statements I made about the supposed exploits of the Knights of the Round Table. I would now have to toe the party line (sorry for the mixed metaphors...) regarding the heroic exploits of my colleagues of the Round Table. (Actually, for an example of this, see here.)

Well, in summary, like any decision it would have its pros and cons to weigh. The pros mainly involve all the troubadours wandering about singing your praises. (I guess the modern equivalent is all the computer magazines and the various people who write O'Reilly books about any load of horse manure that ASF tosses over the fence...) The cons involve compromising one's integrity by accepting illegitimate authority, and also having to be nice and behave respectfully to people you don't particularly like or respect.

I guess the reader has already inferred that, for me, given my characterological makeup, the cons outweigh the pros -- to an extent that it is basically unthinkable.

All of that said, I can respect that other people would see things differently. Certainly, many other open source projects are extremely eager to become ASF projects and I can respect their reasons. It may well be a weakness of overweening pride on my part that I am not willing to kowtow to authority that I perceive to be illegitimate. It certainly is not particularly pragmatic.

Tuesday, February 14, 2006

Feature in spotlight: Error reporting

Max Rydahl Andersen, a Hibernate developer, tells us "A story about FreeMarker and Velocity" on the Hibernate groupblog. It was later covered elsewhere as well. It looks like the one feature among all things that particularly convinced Max to abandon Velocity and start using FreeMarker was - drumroll - error reporting. Now, error reporting is not a feature that's usually prominently displayed on a product's feature sheet, yet if done right it is invaluable for time-efficient problem solving when things go wrong.

So, what's it all about?

Well, it works like this: when FreeMarker encounters an error, it'll not only pinpoint the exact column and line where the error occurred, but it will display a full template stack trace containing all macro invocations and includes that led to the point of exception. To illustrate it with an example, consider we have two templates, a.ftl and b.ftl:


a.ftl:
<#include "b.ftl"/>

b.ftl:
<#macro printFoo y>
${}
</#macro>

<@printFoo bar/>

Then you'll see this output (assuming your error was that "bar" is a simple string instead of a hash):

Expected hash. y evaluated instead to freemarker.template.SimpleScalar
on line 2, column 3 in b.ftl.
Quoting problematic instruction:
==> ${} [on line 2, column 1 in b.ftl]
in user-directive printFoo [on line 5, column 1 in b.ftl]
in include "b.ftl" [on line 1, column 1 in a.ftl]

So you can see precisely what template execution path led to the error - invaluable for figuring out what's going on when macros and includes can be invoked from multiple locations.
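The idea behind such a trace can be sketched in a few lines of plain java. This is not FreeMarker's actual implementation, only an illustration under simple assumptions: a hypothetical TemplateStack class keeps one frame per active include or macro call, and formats the frames innermost-first when something goes wrong.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Illustrative only; the class and method names are invented and do not
// correspond to FreeMarker's internals.
final class TemplateStack {

    // One entry per include or macro call that is currently executing.
    private final Deque<String> frames = new ArrayDeque<String>();

    void enter(String what, String template, int line, int column) {
        frames.push(what + " [on line " + line + ", column " + column
                + " in " + template + "]");
    }

    void leave() {
        frames.pop();
    }

    // Renders the message followed by the active frames, innermost first.
    String describe(String message) {
        StringBuilder out = new StringBuilder(message);
        for (String frame : frames) {
            out.append("\nin ").append(frame);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        TemplateStack stack = new TemplateStack();
        stack.enter("include \"b.ftl\"", "a.ftl", 1, 1);
        stack.enter("user-directive printFoo", "b.ftl", 5, 1);
        System.out.println(stack.describe("Expected hash."));
    }
}
```

Because every include and macro call pushes a frame on entry and pops it on exit, the stack at the moment of failure is exactly the path that led to the error, whichever of the possible call sites was used.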