One of the wonderful phenomena of the past few years has been the emergence of Wikipedia as an incredible reference for just about anything. If you've heard Wikipedia's founder Jimmy Wales speak (for example, in this 20-minute video from July 2005), you've also heard about Wikipedia's vibrant community: how 50% of the edits are done by just over 500 people and 75% of the edits are done by fewer than 1,500 people.
It appears these statistics, while true, are very misleading!
Aaron Swartz has looked at who actually provides the most content, and it's very different from who makes the most edits. At first Aaron looked only at specific articles, starting with the one on Alan Alda. Then he wrote a program to examine the Wikipedia archives and extract statistics. He's written a very comprehensive article, but in summary:
If you just count edits, it appears the biggest contributors to the Alan Alda article (7 of the top 10) are registered users who (all but 2) have made thousands of edits to the site. Indeed, #4 has made over 7,000 edits while #7 has over 25,000. In other words, if you use Wales's methods, you get Wales's results: most of the content seems to be written by heavy editors.
But when you count letters, the picture dramatically changes: few of the contributors (2 out of the top 10) are even registered and most (6 out of the top 10) have made less than 25 edits to the entire site. In fact, #9 has made exactly one edit -- this one! With the more reasonable metric -- indeed, the one Wales himself said he planned to use in the next revision of his study -- the result completely reverses.
When you put it all together, the story becomes clear: an outsider makes one edit to add a chunk of information, then insiders make several edits tweaking and reformatting it. In addition, insiders rack up thousands of edits doing things like changing the name of a category across the entire site -- the kind of thing only insiders deeply care about. As a result, insiders account for the vast majority of the edits. But it's the outsiders who provide nearly all of the content.
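To make the difference between the two metrics concrete, here's a rough Python sketch that ranks contributors both ways. This is not Aaron's program, which I haven't seen. It's a simplified illustration that credits each author with the characters their revision adds relative to the previous one, and it doesn't check whether that text survives into the current version of the article. The function names and the toy revision history are my own inventions.

```python
from collections import Counter
from difflib import SequenceMatcher


def chars_added(old: str, new: str) -> int:
    """Count characters present in `new` that were inserted or replaced relative to `old`."""
    matcher = SequenceMatcher(None, old, new, autojunk=False)
    return sum(
        j2 - j1
        for tag, _i1, _i2, j1, j2 in matcher.get_opcodes()
        if tag in ("insert", "replace")
    )


def rank_contributors(revisions):
    """Rank authors by number of edits and by characters added.

    `revisions` is a chronological list of (author, full_article_text) pairs.
    Returns two lists of (author, count) pairs, each sorted highest first.
    """
    edits = Counter()
    chars = Counter()
    previous = ""
    for author, text in revisions:
        edits[author] += 1
        chars[author] += chars_added(previous, text)
        previous = text
    return edits.most_common(), chars.most_common()


# Hypothetical history: one outsider adds the content, one insider tidies it up.
toy_history = [
    ("outsider", "Alan Alda is an American actor best known for his role in the television series MASH."),
    ("insider", "Alan Alda is an American actor, best known for his role in the television series MASH."),
    ("insider", "Alan Alda is an American actor, best known for his role in the television series M*A*S*H."),
    ("insider", "Alan Alda is an American actor, best known for his role in the television series M*A*S*H.\n[[Category:Actors]]"),
]

by_edits, by_chars = rank_contributors(toy_history)
print("By edit count:      ", by_edits)
print("By characters added:", by_chars)
```

On this made-up history the insider leads when you count edits and the outsider leads when you count characters, which is the same reversal Aaron describes.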
Aaron's a smart guy. As a teenager he co-authored RSS 1.0, served on the W3C's RDF Core Working Group and wrote RFC 3870. Then he went to Stanford for a year before dropping out to become an entrepreneur. I'm impressed he has the time to work on Wikipedia. He's currently running for a seat on the Wikimedia Foundation Board.
Of course, his article raises all sorts of governance issues for the Wikimedia Foundation. Today, you have to have made over 400 edits to qualify to vote for members of the board of the Wikimedia Foundation. In other words, only the editors can vote, not the folks who are actually contributing most of the content (and not me, since my contributions are very minor). What's critical is that it remains easy and rewarding for stray individuals to add content. I certainly hope the Wikipedia inner circle takes note of Aaron's data.