I’ve been reading an old post by the wonderful Jason Scott about The Great Failure of Wikipedia, and a more recent one by Megan Garber titled Why did Wikipedia succeed while other encyclopedias failed?, about research by Benjamin Hill. These pieces reminded me of something I’ve wanted to write for a long time, about how Wikipedia is a wild success despite its utter failure.
Hill’s big finding is that Wikipedia has become so popular because its founders didn’t buy into the “if you build it, they will come” meme, so they knew they had to write and edit and evangelize to get people on board instead of fetishizing technology.
And the evangelizing worked. I couldn’t live without Wikipedia anymore, it’s that good. Frankly, I’m so in love with Wikipedia that I’d gladly pay to keep it running, and so I did: last year, I donated. You should too.
Procedural whackjobs
But Wikipedia works its magic through what is actually a massively inefficient publishing process. Important stuff gets deleted by idiots, discussions about unimportant issues drag on, pages get deleted because they’re somehow deemed not notable enough, and so on. People get pissed off and nine times out of ten they’re right to be mad. Wikipedia has accumulated its fair share of self-righteous, power-hungry people with too much time on their hands.
Wikipedia works despite its guardians and community standards, not always because of them.
Luckily Wikipedia is such a big place that each subsection has grown its own contributor culture. As a senior during my undergraduate studies, I contributed pretty actively to Wikipedia, researching and writing about philosophical pragmatism and related topics. Surprise: I enjoyed it, it was great. The procedural whackjobs tend to leave literature, philosophy and science alone.
Share!
As a contributor, it’s more pleasant if you actually understand that Wikipedia has a very specific purpose, and that is to make all kinds of subjects and concepts understandable to a general audience, with a (good!) bias towards perpetuating the common wisdom and mainstream ways of thinking. This is Jason Scott’s old beef with Wikipedia and it doesn’t make sense to me.
If you have uncovered fantastic new information about a certain topic, you should write an essay or a book or a blog post and then maybe that will get incorporated into relevant Wikipedia articles.
If you have a particularly zesty way of writing and are worried that *-for-brain editors will eat it all up and regurgitate it as a bland soup, you’re probably right, and you should find another place to share your knowledge.
If you’re bringing a fresh new angle to a subject, a new way of thinking about things, for Pete’s sake, don’t waste it on Wikipedia.
You’ll find that you’ll feel much better about Wikipedia if you look at it as just one repository of knowledge, rather than as a grand unifying thing.
Create your own knowledge base, answer questions on StackOverflow and Quora instead of writing about it on Wikipedia, blog about your area of expertise, release your rights-free images on Flickr (I do) instead of Wikipedia.
The important part is that you share knowledge. Wherever you want, really, as long as we can find it.
Feeding the machine
Wikipedia is about people, not technology. But here’s a less charitable interpretation: Wikipedia has to be about people because it never cared about technology.
Editors manually create endless list pages, like all people born in 1603 or people from Rhode Island, because Wikipedia’s data model, viz. no data model at all, doesn’t allow these pages to be autogenerated from simple database queries. Same thing for disambiguation pages, figuring out which pages map to which translations, and linking broader topics together with more specific articles.
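To make that concrete, here is a minimal sketch of what "autogenerated from simple database queries" could look like. It assumes a hypothetical structured people table; the table name, columns and sample rows are all made up for illustration and say nothing about how MediaWiki actually stores its pages.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical `people` table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE people (name TEXT, birth_year INTEGER, birth_place TEXT)"
    )
    # Made-up sample rows, for illustration only.
    conn.executemany(
        "INSERT INTO people VALUES (?, ?, ?)",
        [
            ("Example Person A", 1603, "London"),
            ("Example Person B", 1603, "Amsterdam"),
            ("Example Person C", 1975, "Rhode Island"),
            ("Example Person D", 1980, "Rhode Island"),
        ],
    )
    return conn


def list_page(conn, where, params):
    """Return the sorted names for one 'list page'.

    `where` is a trusted SQL fragment written by the developer;
    user-supplied values only ever travel through `params`.
    """
    rows = conn.execute(
        f"SELECT name FROM people WHERE {where} ORDER BY name", params
    )
    return [name for (name,) in rows]


conn = build_demo_db()
born_1603 = list_page(conn, "birth_year = ?", (1603,))
from_rhode_island = list_page(conn, "birth_place = ?", ("Rhode Island",))
```

With data shaped like this, each list page is one query rather than a page somebody has to maintain by hand.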
We’re all complaining about our crappy CMS but our misfortune pales in comparison to MediaWiki and the way it devours Wikipedia contributors’ cognitive surplus and cajoles them into repetitive manual labor that you figure, this being the 21st century and all, computers would do for them.
Wikitext
Then there’s wikitext, which once upon a time had a Markdown-like elegance but has now spiraled so out of control that most local or topical wikis fail before they’ve started: potential contributors take one look at the syntax, decide rocket surgery might be more within their cognitive capacities and run away before contributing even a single word.
Frankly, considering how hard it is for non-techies to write in MediaWiki, I’m surprised that a local city wiki like the Davis Wiki has ever gotten off the ground. It has certainly survived against all odds. And I’m not surprised they’re looking to get rid of MediaWiki.
The Wikimedia Foundation has been looking for a Rich Text Editing software dev for a long time. They try, but I don’t know if they can truly solve much considering MediaWiki is such a decrepit codebase. All improvement is bound to be the electronic equivalent of dodging landmines.
Blobs and bots
Part of the problem is that Wikipedia and its engineers are introducing ever more (confusing) wiki syntax to cope with semi-structured data. Semi-structured data are things like a person’s birth date and current residence, anything that’s not a blob of prose. Structure can give content a second life in maps and timelines, and makes it easier to find what you need, like famous people from Rhode Island born after 1972.
Getting any of that good stuff out is really hard, which is why DBpedia (“a community effort to extract structured information from Wikipedia and to make this information available on the Web”) deserves so much kudos. Of course, if Wikipedia’s data model were anywhere near reasonable, creating an api.wikipedia.org wouldn’t take a separate project like DBpedia. Instead, it would be a good day’s work for a software engineer and that’d be that.
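For a feel of what "getting that good stuff out" involves, here is a deliberately naive sketch that pulls key/value pairs out of an infobox-style template. The sample page and its field names are invented, and this is not how DBpedia works; real infoboxes nest templates, links and references, which is exactly where the hard part starts.

```python
import re

# A made-up, heavily simplified page; real infoboxes are far messier.
SAMPLE_WIKITEXT = """{{Infobox person
| name        = Example Person
| birth_date  = 1975
| birth_place = Rhode Island
}}
'''Example Person''' is a made-up person used for illustration.
"""

# Matches one "| key = value" line, without ever crossing a line break.
FIELD_RE = re.compile(r"^\|[ \t]*(\w+)[ \t]*=[ \t]*(.*?)[ \t]*$", re.MULTILINE)


def extract_infobox(wikitext):
    """Return the `| key = value` pairs of the first infobox as a dict.

    Returns an empty dict when the page has no infobox at all.
    """
    start = wikitext.find("{{Infobox")
    if start == -1:
        return {}
    end = wikitext.find("\n}}", start)
    body = wikitext[start:end if end != -1 else len(wikitext)]
    return dict(FIELD_RE.findall(body))


person = extract_infobox(SAMPLE_WIKITEXT)
# The "from Rhode Island, born after 1972" question, asked of one page.
matches = (
    person.get("birth_place") == "Rhode Island"
    and int(person["birth_date"]) > 1972
)
```

Even this toy only works because the sample is tidy: every value has to be guessed back out of text that was written for human eyes.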
Wikipedia bots alleviate some of the drudge work by gardening and cleaning Wikipedia automatically while crawling through its pages. For example, many American city pages were created and are updated with new census information and maps without human intervention. Thank you, rambot. But these bots themselves are convoluted pieces of technology. Wikipedia’s data model means they have to fudge raw text without stepping on anything real humans have written, which is not easy.
Bots help, just not enough.
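To illustrate why fudging raw text is delicate, here is a rough sketch of the kind of edit a bot has to make: change one value in the page source and leave every other character alone. The page, the field names and the update_field helper are hypothetical, and no real bot (rambot included) is claimed to work this way.

```python
import re


def update_field(wikitext, field, new_value):
    """Replace the value of one `| field = value` line, touching nothing else.

    If the field is missing the text comes back unchanged, so the bot
    never guesses where new data should go.
    """
    pattern = re.compile(
        r"^(\|[ \t]*" + re.escape(field) + r"[ \t]*=[ \t]*).*$",
        re.MULTILINE,
    )
    # A function replacement avoids backslash surprises in new_value.
    return pattern.sub(lambda m: m.group(1) + new_value, wikitext, count=1)


# Made-up page: a simplified template followed by human-written prose.
page = (
    "{{Infobox settlement\n"
    "| name       = Exampleville\n"
    "| population = 1000\n"
    "}}\n"
    "Exampleville is a town that a human lovingly described here.\n"
)

updated = update_field(page, "population", "1200")
```

The sketch only stays this short because the sample is well-behaved; a value that spans several lines or a second template with the same field name would already break it.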
Wikipedia is tragic that way: there has been no money, no strategy and no guts to take the software to the next level for ages now, so we’re stuck with a patchwork of fixes and tweaks on top of software that was already out of date when it was first released in 2002.
Inside the sausage factory
As a reader, you don’t really notice that developers have a hard time getting meaningful data out of this huge bank of knowledge, you don’t notice that professors and experts get frustrated fighting with nitwits about stuff those experts know inside-out, you don’t notice that many early contributors never return, you don’t notice the vandalism, you don’t notice how many people whose contributions we’d cherish are put off by that horrible, horrible wikitext syntax.
(What English readers also don’t notice is that local versions of Wikipedia, like the one in Dutch, are even more inconsistent in their quality than the English-language flagship. Wikipedia wins by sheer numbers, and when those numbers aren’t present, quality suffers.)
But here’s the thing: the common wisdom that garbage in means garbage out doesn’t actually apply to Wikipedia. In the Wikipedia model, you put in lots of raw material that’s decidedly less than perfect, but the stuff that comes out is actually damn tasty. In other words: Wikipedia is a sausage factory.
Wikipedia needs to knock out the bullies and improve its tech, because both are making Wikipedia less great than it could be. But while it does so, let’s also just take a minute to appreciate the enormous value of this thing that we’ve created, we, together, people from all over the world.

22 comments
This article may be written from a fan's point of view, rather than a neutral point of view. Please clean it up to conform to a higher standard of quality, and to make it neutral in tone.
;-)
(This comment may be written from a fan's point of view, rather than a neutral point of view)
I like it!
;-)
I like your thoughts, but I think you're being a little too rough on the Wikipedia engineering team. Creating an API for an arbitrary subset of human knowledge is not something you just get up one morning and do, even if you purport to be the custodian of that knowledge.
This article is written in a primarily in-universe style, but it scores an A+ on WikiProject:Good Blog Posts About Wikipedia's quality scale.
Today is Meta Meta Wednesday, it seems :-) Thanks for the comments, all.
@Court: yeah, over at Hacker News (where this post is featured) there are some good points being made about current efforts to catch up, engineering-wise: http://news.ycombinator.com/item?id=3130134.
That said, I do really think MediaWiki is absolutely craptastic. And the reason why that makes me so upset is something Wikipedia can't actually help: there is no single piece of wiki software out there — and I've seen hundreds — that is any better, so every time I need a wiki I'm stuck using something I hate or something half-assed I build myself. Neither is a good option.
Also, while I think culture has something to do with how bad MediaWiki is ("let's not change too much"), I realize there's only so much you can do with limited resources and a legacy codebase like that. I think we can and should expect more, but hopefully nobody reads my post as saying that all MediaWiki contributors (either inside or out of the foundation) are retarded. They're most certainly not.
HTML tags can have contenteditable attributes. With practical CSS and articles editable inline, wiki markup and markdown become obsolete. Inertia hinders implementation.
Death to deletionist scum!
Indeed. Try working on low-status topics like foreign pop culture, on the other hand, and 'death to deletionism' starts looking like a pretty good slogan.
What troubles me most is that a lot of articles seem to simply no longer have editors who care about them - to have been orphaned. You provide excerpted sources on the talk page, and not only does no one paste them into the article, no one even replies in any way.
@Rob: but if you've ever tried to make a proper WYSIWYG editor, you know that it's browser hell, and weird edge cases abound. And Wikipedia will likely want to keep supporting the old wikitext as well, which makes it doubly hard. Inertia, yes, but it's quite the challenge too.
@gwern: hm, I'm not a contributor anymore but while I was, there was no way to keep tabs on what's happening to articles you care about or have edited in the past... except to note them down somewhere (which many people do, on their user page) and visit them one by one. Doesn't really scale if you're an ambitious editor. Again, a great example of something that a little bit of technology could solve for Wikipedia.
@stijn: the idea of a simple, elegant solution appeals to me. Group inertia with legacy browsers, add edge cases, encoding issues and things like mathematics and it is a big fine mess.
Best to keep some content on WP and other info elsewhere, then. My own edits on WP these days tend to favour simplifying existing, excessively verbose content.
But I guess even TiddlyWiki hasn't gone that far either yet (ref. http://tiddlywikidev.tiddlyspace.com/ ).
Awesome post. Lots of good insight here. Re: WYSIWYG content editing and wikis:
@stijn said: "there is no single piece of wiki software out there — and I've seen hundreds — that is any better." Amen.
I'd make the case that Confluence is marginally better... and that's about it. It's amazing to me that a tool as important and useful as wiki software hasn't seen more progress in terms of technology and UI. This is why I've been advocating in the Drupal community to make the "example distribution/profile" that is being developed for the next version a modern take on the wiki concept.
Either way, I'm working on "the wiki problem" both at my job for the US Dep't of Energy and at my community nonprofit. If you're reading this and want to contribute/discuss, email me at davideads on gmail.
About WYSIWYG editors: The best option I've found is to use a well-established editor like TinyMCE or CKEditor and write a plugin which allows the source to remain in the lightweight markup language of your choice. So for the Dep't of Energy, we use the TinyMCE editor for WYSIWYG, but the source remains Markdown.
I've found it hard to sell this concept, and I'm not sure why: the geeks get their markup, normal people get something visual and easy, and email integration for comment threads and the like is largely trivial if you teach your community the conventions of the chosen markup language. I eat my own dogfood on this one (Drupal users can use a module I wrote, available at https://github.com/ecenter/markdownify - screenshots at http://skitch.com/eads/fn49a/townsquare-editor-04-more-syntax, http://skitch.com/eads/fn49r/townsquare-editor-02-image-list) and find I prefer the approach to anything else I've used.
Thanks for the insight, David. (And I really should get autolinking working, djeez.)
Although I appreciate Drupal (up to a point) I'd be more in favor of building an app from scratch that never ever tries to be anything else than a wiki, which would likely make the experience smoother for both developers and end-users and also cut down on the footprint and improve the performance.
Coaxing a wiki out of Drupal is certainly possible and not even very hard, but there's a certain point at which I fear you'll start stumbling on the same kinds of issues I have with MediaWiki: it becomes a convoluted mess. I'm more in favor of the WordPress model, so to speak.
But anyway, the specific technology isn't even that important. The fact is simply that there are just so many features that are missing even in commercial packages, from auditing to easy content templates to offering aggregate views of structured data on wiki pages to good discussion pages.
I think the problem with wiki software is really that it's so easy to get a simple version working that nobody ever bothers to do it right. Paradoxically, hard challenges often make us work harder. (I've managed to teach myself to play the clarinet and the banjo, but can't manage to teach myself the harmonica.)
Why don't they add videos?
Developing countries need videos, not this:
http://goo.gl/ImRys
@stijn, thanks for the very thoughtful reply.
I agree with you on a theoretical level about a special-purpose wiki tool, but on a practical level I think Drupal has serious appeal because of its popularity in the mid-sized site market. And since I'm paid to do it and I'm an expert in it, that's where I'll be doing any and all wiki work myself.
But just like you say, the choice in technological platform isn't so important. For a wiki platform to be viable, I think it needs a critical mass of developers and users fairly quickly, and needs to have features tuned to whoever will both adopt it and benefit quickly from adopting it. So the question then becomes: where are the users and implementers of such software?
Your point about the ease of writing a wiki is right on, which is why I think a sufficiently broad and engaged community is important from the outset. Otherwise, you're just talking about Yet Another Wiki, which, even if decent and successful, will likely appeal to a fairly small audience of geeks and techies.
There may be a parallel here in email clients: The quality open source email client scene was very fractured in the early 2000s, which meant it was all too easy for Google and others to create marginally better web-based alternatives and completely colonize the market.
I'm afraid it's already happening in the world of collaborative knowledge sharing: Google Docs is becoming my nonprofit's default way to collaboratively edit documents because our Trac-based wiki stinks, and my employer may be going to a 100% SharePoint-based solution.
You mentioned auditing (what did you mean?), structured wiki data, good discussion pages. What else would you like to see? I know I'd like to see good email integration and notifications for discussion pages, great full-text search, easy media embedding, and an increased emphasis on tools for collaborative document editing.
@Jim Pruett: I think it would be wildly expensive. What if they just curated videos and promoted tools like Mozilla's "Universal Subtitles" project?
Hey David,
With auditing I mean tools for finding gaps in information and for knowing when information is out of date and needs to be updated. At the very least, this would be something like an "expiry date" widget and an associated page that gives editors a quick view of all pages or page sections that are "stale". Especially useful for people using a wiki as a knowledge base.
Few people in organizations trust wikis, because after their grand introduction they usually languish and slowly become unreliable — auditing tools would be one way to combat that phenomenon.
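The "expiry date" idea described above could be sketched roughly like this. The page list, the field names and the stale_pages helper are all made up for illustration; a real wiki would read the dates from its own storage.

```python
from datetime import date

# Hypothetical wiki pages, each with an editor-chosen expiry date.
PAGES = [
    {"title": "Office wifi setup", "expires": date(2011, 6, 1)},
    {"title": "Holiday schedule", "expires": date(2012, 1, 15)},
    {"title": "Onboarding checklist", "expires": date(2011, 10, 1)},
]


def stale_pages(pages, today):
    """Return titles of pages whose expiry date has passed, most overdue first."""
    overdue = [p for p in pages if p["expires"] < today]
    overdue.sort(key=lambda p: p["expires"])
    return [p["title"] for p in overdue]


# Passing `today` in explicitly keeps the report reproducible.
report = stale_pages(PAGES, today=date(2011, 10, 19))
```

The same list could just as well be computed per page section instead of per page, which is closer to the "stale sections" view mentioned above.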
I like your list too :-)
Stijn -- Loved this piece! I've spent a lot of time thinking about editing forms for work (enterprise case management software) and play (I use a terrible MediaWiki to document family tree research). I couldn't agree more with all your points.
Have you looked at the Aloha editor project? They're doing some cool stuff, which I used to prototype a CMS that's (a) totally based on in-place editing, and (b) handles semi-structured data in a less insane way: http://richwiki.heroku.com/articles/1 See also: https://github.com/jimlindstrom/RichWiki
Interesting stuff, Jim. Haven't looked at the Aloha editor yet; I've put it on my todo list!
I have some prototypes myself that take a slightly different tack still: every document consists of a number of blocks, and each of those blocks is of a certain type. (Think of a block as a sort of paragraph.) The most obvious types are headings, subheadings, body copy et cetera, but the fun thing is that you can then craft new block types like "person" or "address" or "ISBN" and what-not.
The reason I'm experimenting with blocks is because it allows you to freely mingle prose and structured data, but even more importantly because it allows you to create a special-purpose UI tailored for each block type.
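As a rough illustration of the block idea (not the actual prototype, which isn't shown here), a document could be modeled as a list of typed blocks. The Block class, the block types and the renderers below are hypothetical names chosen for this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class Block:
    """One block of a document: a type name plus the data that type needs."""
    type: str
    data: dict = field(default_factory=dict)


# One renderer per block type. A real UI could attach a special-purpose
# editing widget to each type in much the same way.
RENDERERS = {
    "heading": lambda d: d["text"].upper(),
    "body": lambda d: d["text"],
    "person": lambda d: f"{d['name']} (born {d['born']})",
}


def render(document):
    """Render a list of blocks to plain text, one line per block."""
    return "\n".join(RENDERERS[b.type](b.data) for b in document)


def blocks_of_type(document, type_name):
    """Structured data comes out by simply filtering on block type."""
    return [b.data for b in document if b.type == type_name]


# Prose and structured data mingle freely in one document.
doc = [
    Block("heading", {"text": "Example Person"}),
    Block("body", {"text": "A made-up biography."}),
    Block("person", {"name": "Example Person", "born": 1975}),
]

text = render(doc)
people = blocks_of_type(doc, "person")
```

Because every block carries its type, the same document can be rendered as prose or queried as data without parsing any markup.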
If you're having a hard time imagining how something like that could look, it takes some inspiration from 37signals' BackPack: http://backpackit.com/tour.
Perhaps I'll get around to showing a demo in a blogpost some day :-)
Interesting take. The assumption underlying my approach is that pages are going to be free-form and that users just want to type narrative, but then use sidebar tables to describe the narrative with metadata. The thing I am optimizing for is speed and "naturalness" of editing.
It seems like your approach integrates metadata into the structure of the page, in a modular, additive way. That seems to optimize for better representing the structure of the data as a 1st class object, rather than as tacked-on metadata. I have trouble envisioning what the UI around editing would be, but I'm intrigued. Pull those demos out of the dustbin and share =)
As a developer, I'm all for more tightly structured Wikipedia data, but as a contributor I greatly appreciate the accessibility of a loose structure. I think Wikipedia's success comes from having a loose structure that resulted in a much stronger cultural/social process. To become a high-level Wikipedia Editor is much less about mastering the technical system than it is about participating in the social processes. And social participation is a much lower bar to entry (as well as a more novel, interesting, and satisfying experience) than data entry. While I recognize that technical understanding is necessary to make high level contributions to Wikipedia, I think more people would be willing to participate in designing a better MS Word than in designing a better MS Access (apologies for the Microsoft metaphor).
I do agree though that at this point Wikipedia should tighten up its structure, but I don't think Wikipedia would ever have become the broad success it is today if it had begun with highly structured data.
Agreed, but I'd be happy to see Wikipedia just do something smart with their existing data-like information: the one thing worse than data-entry is data-entry inside of plain-text fields with no interface to guide you and only a list of obscure wikitext commands at your disposal.