FreeMarker Blog

The official weblog of the FreeMarker project

Sunday, February 26, 2006

Musings on Wikipedia and Open Source

Recently, I have been using Wikipedia quite a bit. First off, let me say that Wikipedia really is a blast. It is informative, like a regular encyclopedia, but because anybody can contribute, there is a lot of funky stuff there that would not be found in a conventional encyclopedia. Surfing Wikipedia is just a whole lot of fun!

Of course, a naysayer would likely interject at this point that if you want to have some fun, go to Wikipedia, but if you want complete, reliable information, look at Britannica. I have to admit that this would also be my instinctive reaction. However, much to my surprise (and doubtless that of many other people), a recent study that appeared in Nature magazine found that the accuracy of scientific articles in Wikipedia was not significantly different from that in Encyclopedia Britannica. Here is an article on that.

Now, aside from a few very specific topics, like Java template engines, or preflop strategy in Omaha Hold'em poker, and maybe a couple of other things, I am not really qualified to judge the quality of information in Wikipedia. On the other hand, I have noticed that most Wikipedia articles that I come across are fairly well written. I am sensitive to this and consider myself to be a fairly good judge of it.

I have been thinking about various things regarding open source. It seems to me that Wikipedia's strengths (and weaknesses) as compared to a conventional encyclopedia are pretty much those of the open source development model as compared to conventional software development.

One of the revolutionary aspects of free software is that it drastically reduces barriers to entry. Anybody who is interested and motivated can hack the source code. Similarly, anybody can contribute material to Wikipedia. It seems likely that a conventionally published reference book would have some advantage in quality over the Wikipedia model. After all, there is a paid editorial staff that would do systematic fact checking, some needed line editing, and so on. However, the advantage of the wiki model is how flexible it is, and how quickly it can admit new material, updates, and improvements. For example, if a major scientific discovery is made that invalidates previous theories in a field, it is likely that a wiki-based encyclopedia would incorporate this information much more quickly than a conventional publication. So there is a clear trade-off. In a field as rapidly moving as Java software, for example, you might well prefer an article about Java development tools that is up-to-date but may contain some errors and bits of sloppy prose over a more polished article on the subject that is a couple of years out of date.

Not long ago, I wrote a blog entry about the issues involved in (very) hypothetically joining Jakarta. Though I answered by way of a simile involving King Arthur and the Round Table, I think that the reasons I gave would be clear to anybody reading it. However, an aspect of this that I did not mention there was that I have gradually come to the conclusion that ASF's entire vision of the open source process is incorrect. Certainly, there are reasons to have severe doubts about it. In recent private correspondence, a Java developer commented that, once one got beyond surface impressions and actually rooted around, one could see that over 90% of ASF projects were in some kind of state of hibernation or even severe abandonment. He worried that an open source project that he liked and used quite a bit was on the road to becoming an ASF project, and that this would likely be the kiss of death.

I am not so familiar with that many ASF projects, so I cannot vouch for the 90% figure this person gave. However, I think it's clear that there is some kind of systemic problem.

My considered view is that the root of the problem is that ASF wants to project a certain elitist idea, that becoming a committer on an ASF project is some kind of great honor. If you lurk on a given project's mailing list, you will on occasion see them announce, to great fanfare, something like: "John Jones has been accepted as a FooBar committer." This kind of thing has always caused me to roll my eyeballs. The subtext is a bit like: so-and-so has been admitted as a high priest who may now enter the inner sanctum and touch the holy of holies (which is, presumably, the code repository).

So, until they admit you to the holy of holies, you are basically in some kind of supplicant position: "Please sirs, will you look at my patch." And, of course, since the patch in question is typically something that only that person needs at this moment (or maybe other people need it but none of the committers do) what with one thing and another, they typically don't get around to looking at the guy's patch.

Certainly, the FreeMarker project is not run this way. If somebody expresses some interest in hacking the code, I pretty much immediately offer to add them as a developer so that they can commit code. They are added with no great announcement or votes or fanfare. Note that no vetting has occurred here. I will typically have no objective proof of the person's abilities. It is enough for them to say that they are interested in doing something for me to provide CVS access. Basically, we simply assume that somebody is competent until proven otherwise.

Now, as a practical matter, there is not really much problem with people then turning around and committing poor-quality code willy-nilly. Actually, nine times out of ten, somebody expresses interest in doing something and you give them r/w access to CVS and they just never do anything -- good or bad.

So, people committing all kinds of poor quality code is not a common real-world problem. But it can occur. However, even when it does occur, how much of a problem is it? Somebody does something and you can see what they did and modify their work or completely roll it back. This is, in fact, the whole point of a version control repository, is it not? Since it is fairly easy to roll back the code to some previous known state, why should one be so conservative about letting people commit code?

But again, I think the core problem with ASF is this underlying elitist idea. And I think it's wrong; open source is not elitist by nature. It's more like: "If you can do something, then roll up your sleeves and get to it, let's see what you can do." In other words, a person is assumed to be competent until proven otherwise. The ASF approach seems, on the other hand, to assume that you are not competent to collaborate until somehow proven otherwise. What makes matters worse, though, is that it is not really obvious what people who become committers have done to prove their competence. It often seems to be more of a popularity contest with people voting +1 and so on.

So, admittedly, another take on this is that the problem is perhaps not elitism per se, but elitism that is arbitrarily applied. If you're going to be elitist, you should at least have some objective criteria.

The other aspect of this that I think is quite worthy of comment and some analysis is that, as far as I can see, the projects that line up to join ASF are not doing so because they really believe that there is any technical value in it. It is purely to leverage the Apache brand name and thus take advantage of the placement and visibility that come with it. Now, if you look at the pages relating to the Apache incubator, they state the supposed technical advantages that getting in with ASF involves: access to world experts in specific domains, experts in running open source projects, et cetera. But again, I do not believe that the OSS projects that want to get on apache.org believe any of this. It's purely for the visibility. For example, when the Struts/WebWork merger was discussed on opensymphony.com forums, the argument used was that by merging with Struts, they would get WebWork's technical superiority along with Struts's "community". I parsed "community" in this case to be code for the marketing advantages. (In general, "community" is a term that ASF people use frequently in an odd and somewhat mystical way.)

I do not recall that anybody suggested that the Struts people had anything to offer technically. Zero, zilch, squat. Of course, similarly, when Leos Literak asked me about the possibility of joining ASF, it was all about publicity. He never at any moment in the discussion suggested that ASF had anything to offer us on a technical level.

Well, all of this does introduce a real element of moral hazard. If you join ASF purely because so many people believe in this "Apache mystique" (that you, yourself do not believe in) then you now have a vested interest in perpetuating said mystique, since, after all, your whole strategy was based on the continuation of this mystique.

As a final note on this "Apache mystique", if what my correspondent said is true, that over 90% of ASF projects are in a sad state of neglect, then a great gap has opened up between hype and reality. A very huge gap indeed. Is such a situation sustainable long-term? You know, it may be analogous to what happens in a financial boom, where something like internet stocks, say, get priced at some level completely out of line with whatever real economic value these things have. Such booms ultimately lead to a day of reckoning, a crash. When exactly such a crash occurs is a matter for the theory of tipping points, et cetera. Or maybe there won't be any such "crash". Still, my sense of things is that this Apache mystique will ultimately end up being deflated significantly.

Well, nothing is more humbling than trying to predict the future. I have no crystal ball. We shall see.

8 Comments:

At Tue Feb 28, 11:24:00 AM GMT+1, Blogger Dániel Dékány said...

I think the high percentage of abandoned products is an inherent problem of any organization that hosts OSS projects. Starting an OSS project is very often a matter of temporary enthusiasm that then goes away. And then, since it's not your paid work, nothing will force you to continue it when you have lost that flame. One idea for coping with this problem is that new people will come and pick up the project that the previous developers have abandoned. But that requires good visibility, to increase the chances. So in this sense the "community" thing does make sense. The question is whether this works as well as it should in principle, and if not, why not. In the case of ASF I have a *guess*. People who usually pick up projects are some kind of coder geeks, you know, people who do programming because it's fun to create. And maybe these people don't like the atmosphere of ASF, because the ASF people are too much politicians, as opposed to engineers. Also, maybe some of these coders like the idea of Free software a la FSF, and after some thinking will draw the conclusion that the ASF is not on that side behind the scenes. Anyway, which of the coder geeks feels like picking up a project that is technically a pile of crap? Well, I just start out from myself... if I do a project for free, then usually I do it because of the beauty of it, and because of that scratch-the-itch feeling. If I want to mend big piles of crap, then here is my paid work... there is no way I do it for free.

 
At Thu Mar 02, 02:04:00 PM GMT+1, Anonymous Anonymous said...

I don't think the ASF is any different from other open source projects - I'm not familiar with FreeMarker, but from what I can see the original authors have abandoned ship. Fortunately other people like yourself stepped in to keep the project going.

The 90% figure seems like FUD to me - although like you I have no facts to back up this opinion. I wouldn't be surprised if this figure were true for sourceforge, though, which is a reflection of open source in general.

I think there is an issue with the ASF and why some projects have problems - I would label it bureaucracy rather than elitism, and I agree that it is a barrier to getting people involved and releasing software. However, I also think that's one of its strengths. It's because of the procedures/policies on quality, the infrastructure, the licensing, and IP that many corporations will accept ASF software. Whether those corporations are misguided in this is another debate, but I believe the ASF brand opens doors that smaller projects may not be able to.

Projects lining up to join the ASF has nothing to do with technical ability - since the people bringing the project to the ASF are still going to be the committers after joining.

It's interesting that in one of your comments on the "Some comments on joining Jakarta" blog you laud Spring and Hibernate as part of a holy trinity, but it looks to me like it's easier to become a committer on an ASF project than either of those. Do you slam them with the same criticism?

 
At Fri Mar 03, 02:34:00 PM GMT+1, Blogger Jonathan Revusky said...

First of all, to compare the state of projects on apache.org with sf.net is absurd. In general, the projects hosted on apache.org have been, at least at some point in the past, active, with a healthy community. Also, they are projects that have been fairly visible with a certain established user base. This is mostly because the projects that are eventually adopted as ASF projects have those characteristics. (At least, at the point in time when they join ASF.)

The vast majority of projects on sourceforge are basically stillborn. They never were very active to start with, and never became active. (Sourceforge is based on the idea of "let 1000 flowers bloom" and since there are no "gardeners" on staff, this also means "let 999 flowers wither and die.")

The more appropriate question would be, taking just the (small) minority of sf.net projects that had achieved a certain level of visibility and user base and so on, what percentage of them subsequently fell into a state of neglect -- as compared to the situation on apache.org.

I don't know the answer to that but that would be the $64,000 question.

If a high percentage of ASF projects are in a neglected state, this is a much more damning comment than a similar comment about sourceforge, because these are projects that were highly active at some point in the past, and then fell into their current state. Also, there is, in principle, some amount of organizational oversight that is meant to keep an eye on these things, and, at least reduce the likelihood that projects will reach an abandonware state. There is no equivalent to the Jakarta PMC on sourceforge.

If, in fact, it is the case that projects that are active and healthy on apache.org, actuarially are as likely to become neglected and inactive as a similar project on sourceforge, it basically would mean that ASF is dysfunctional.

As for my "holy trinity" comment about Spring+Hibernate+FreeMarker, I was not "lauding" any of the three actually. I was simply quoting private correspondence where my correspondent expressed the opinion that the combination of the above three was increasingly becoming a kind of "best practices" standard in MVC web development -- each one approximately corresponding to M, V, or C of the MVC architecture.

Also, I have no idea how hard it is to become a committer on the Hibernate or Spring projects. What is your basis for saying this? Did you try to get involved in either project and find that you were given the runaround? I can only speak about FreeMarker really. It is extremely easy to become a FreeMarker committer.

I have no doubt that the ASF "brand" opens doors. I also have no doubt that the current president of the United States would not be president if his last name were not "Bush". The question is whether the former is a better criterion for choosing software than the latter is for choosing presidents.

Though you list various reasons, the main practical reason for preferring ASF projects would be that you believe that something being on ASF provides a much greater guarantee (or at least likelihood) that the project will be maintained and supported in the future. If this turns out to be false, IMO, it becomes hard to avoid the conclusion that the emperor is, in fact, wearing no clothes.

 
At Fri Mar 03, 03:48:00 PM GMT+1, Anonymous Anonymous said...

You're right, I haven't tried to get involved with Spring or Hibernate - but I assume (possibly incorrectly) that you need to get a job with their commercial parents to get commit access.

I don't believe the ASF makes any assertions that software is more likely to continue to be maintained and supported (please post links if you have them). If it did/does then IMO that is clearly nonsense. The ASF is made up of individual volunteers and if all the active committers on a project stop being active - there isn't some ASF resource to step in and carry on. So I think your "emperor's clothes" argument is also nonsense - it's based on a misguided assumption on your part.

I do think the ASF faces a problem when projects are abandonware - since at that point new committers are not going to get voted in and so there is no possibility of the project reviving (within the ASF). However, under the ASF license there is nothing to stop people from picking them up and taking them out to sourceforge.

If you believed FreeMarker would get a bigger user base if it were part of the ASF - or even if it merged with/replaced Velocity - then that in itself would be a good reason to join.

 
At Fri Mar 03, 06:57:00 PM GMT+1, Blogger Jonathan Revusky said...

I shall try to deconstruct your underlying argument a little bit by way of a simile. People will believe (and I would say, correctly) that shares listed on the New York Stock Exchange (or any similar major exchange like London or Frankfurt, say) are more likely to be solid investment vehicles -- far more likely than in the case of penny stocks that are traded over the counter. That said, neither the NYSE nor the London Stock Exchange, nor any other exchange, is going to intervene to rescue a company from bankruptcy. They are not committed to doing that, nor do they have the resources to do that in any case.

On occasion, a NYSE-listed company does in fact go bankrupt and the shareholders, of course, lose their money. The risk is inherently there.

Nonetheless, savvy investors do perceive that a company's shares being listed on the NYSE does mean something. This reflects reality because, for a company's shares to be listed there, they must meet certain criteria. When they commenced trading on the NYSE, they certainly met those criteria; circumstances may change. However, when certain warning signs appear, i.e. the company is not continuing to meet certain criteria, the NYSE may well delist a security.

Surely you see my point by now. By the way, I would point out in passing that the idea of ASF intervening in the running of an open-source project is less fanciful than the NYSE intervening in the running of a company. The ASF is, after all, run by professional programmers. That said, they typically will not intervene, I agree.

It is a constant theme of my conversations with people that opting for the better known technology is somehow the safer choice and that using something that may be technically better but they have not heard of is riskier. This is not irrational by any means. People who are going to invest in building systems do not want to build on top of components and libraries that are liable to become abandonware. People obviously opt for ASF projects based on this kind of reasoning. In fact, the success of ASF as an entity cannot, IMO, even be understood properly without taking this mechanism into account.

So, what I am saying is that, if we actually determine that something being in ASF does not make it less likely to be abandoned, this would have negative implications in terms of what ASF is supposed to be offering. It would imply a huge gap between perception and reality. I do not believe that any amount of sophistry can really get around this.

Niall, I do not particularly mind that you characterize my arguments as "nonsense". It is inflammatory, some people would get angry. But actually, I do on occasion utter nonsense unintentionally (it's more frequent than NYSE-listed companies going bankrupt) and if you think I'm talking utter nonsense, feel free to tell me so. But I don't think that the basic core thing I'm saying does qualify as nonsense.

Also, I think that this idea that ASF-hosted abandonware can be picked up on sourceforge is a bit silly, and perhaps a tad disingenuous. If, given the huge visibility advantages that being part of ASF affords, a project cannot maintain a critical mass, its prospects without that ASF visibility advantage, it would seem, are quite poor indeed. Also, the scenario does not quite make sense. If, for example, somebody has a very different vision of what the next version of FreeMarker should be like from what I and my collaborators do, it makes sense for them to fork off a "Refreemarker" or something -- with all our blessings even -- and they might well take along some of our user community who like their vision of things. (I would comment in passing that this might even be a good development since then there would at least be some technically based competition in the Java template engine space. Currently, there basically is none.) However, if I and my collaborators simply lost interest in improving FreeMarker, there would be no reason for a fork. The people who wanted to do something with it should just be passed the torch, it seems to me.

As for the last point, that if I believed that FreeMarker would get a bigger user base by being part of ASF, that would be a reason to join. Yes I do actually believe that, and yes, that would be a reason to join. (Actually, it's the only reason I can think of offhand.) It's just that there are other countervailing considerations that push me the other way, and for me, the cons dominate the pros.

 
At Fri Mar 03, 10:02:00 PM GMT+1, Anonymous Anonymous said...

I do see your point that there may be a perception that an ASF project is less at risk; however, there's nothing on the ASF page that makes that claim.

Also, what are we comparing the ASF to? I agree that comparing it to the whole of sourceforge doesn't make sense (absurd :-)) - but it is equally absurd to compare it to successful sourceforge projects - since by definition successful projects are not abandonware.

It's also the case that the ASF is evolving - new projects have to go through the Incubator, and one of the exit criteria is that a project has to demonstrate an active and diverse development community. This hasn't always been the case - but it seems to me that this is something that will make it less likely that projects become abandonware.

I do agree that any ASF projects that are abandonware are a problem - but the problem with your original blog entry is that it's based on an unsubstantiated claim (90% of projects are abandonware), and I don't believe either you or I know if it's even a major issue at the ASF. It may be the case that it has been an issue in the past and that's exactly why the incubator was formed with its current policies.

I agree with your point about a project like FreeMarker and "passing the torch" - given an abandoned project, it is easy to pass the torch on sourceforge but probably very unlikely to happen with an ASF project.

Apologies for the inflammatory nonsense comment; it was done tongue in cheek following your absurd comment - seeing if you could take what you were dishing out :-). I don't think the core of what you are saying is nonsense, but I do think it's unfounded at this point. It may turn out to be completely true or untrue - but I suspect the reality is somewhere in between.

 
At Mon Mar 06, 08:16:00 PM GMT+1, Blogger Jonathan Revusky said...

Given the visibility advantage that ASF projects have, it should be much easier for them to attract collaborators -- and thus remain active.

If a high percentage of ASF projects lapse into a state of neglect, this strongly suggests that there is a problem.

In my article, I suggested that it was inordinately difficult for people to get involved and that the reason was some kind of misguided elitism. The apache website does speak continually of meritocracy, and the notion that you must prove your worth to become a committer. That is where I deduced that the problem was some kind of elitist mentality.

It may be, on the other hand, that the root of the problem is just excessive bureaucracy, not elitism.

Be that as it may, I think there definitely is a problem. I said pretty clearly that I did not vouch for my correspondent's 90% statistic. There is such a thing as hyperbole. For example, Niall pointed me to a page where people were frustrated with the bureaucracy involved with ASF. They cited a "40-step" release procedure.

Probably the 40 steps is also hyperbole. That does not mean that there is not a problem.

Quibbling about the 90% figure or whether there really are 40 steps in doing a release is just that, quibbling. We understand that these statements contain hyperbole.

If you're going to talk about the "Apache Way" and that projects must demonstrate that they "get it", the "it" being the "Apache Way", there is a very strong implication here that this "Apache Way" represents some kind of superior project management practices. It implies that if you follow this set of practices, your chances of having a successful project are greater than they would be otherwise. Whether this is stated explicitly or not seems beside the point. It stands to reason.

Now, one can also debate what the appropriate definition of "success" for a project is. However, I think it's safe to say that lapsing into an abandoned state would not be most people's definition of "success".

If there is a methodology or set of practices that constitute the "Apache Way", it must, like any such thing, be judged empirically on the basis of results.

The conceptual experiment in question would be whether a healthy, active project hosted on apache.org is more likely than a non-ASF project in a comparable state to remain that way -- I mean, healthy and active.

My correspondent was suggesting that the prognosis for an apache.org hosted project was actually worse. He was worried that a project he liked a lot was in the incubator and would likely go into a state of inactivity after being accepted on ASF. Quite something. If his idea is at all right, that ASF projects are actually more likely to go this route, this would be a devastating implied critique of the "Apache Way".

 
At Mon Jan 11, 01:42:00 PM GMT+1, Anonymous Anonymous said...

Your blog keeps getting better and better! Your older articles are not as good as newer ones you have a lot more creativity and originality now keep it up!

 
