Eat It, Colbert

Stephen Colbert fans, Onion editors take note: Wikipedia is a surprisingly reliable source of information.

Three groups of researchers claim to have untangled the process by which many Wikipedia entries achieve their impressive accuracy (1, 2, 3). They say that the best articles are those that are highly edited by many different contributors.

[…] The main lesson for tapping effectively into the ‘wisdom of crowds’, then, is that the crowd should be diverse. In fact, in 2004 Lu Hong and Scott Page of the University of Michigan in Ann Arbor showed that a problem-solving team selected at random from a diverse collection of individuals will usually perform better than a team made up of those who individually perform best — because the latter tend to be too similar, and so draw on too narrow a range of options (5). For crowds, wisdom depends on variety.

The story covers at least five different studies of Wikipedia accuracy and community structure. I’m not about to choose which paragraphs to excerpt and which to leave out, so go read the whole thing. Some studies raised my eyebrows, for example one group’s apparently circular decision to measure the quality of an entry based on the Wiki community’s internal ratings. But overall the article is well worth the time.

To me the most interesting point is the way that Wikipedia turns the old rule that people are smart, crowds are stupid entirely on its head. Wiki entries, which you can think of as a form of extended conversation or debate, only get more accurate the more people jump in, and diversity seems as important as total numbers.

Bloggers ought to pay attention to this. I would even extend the point to say that a diverse commentariat should be a good measure of accurate writing, although it presents a chicken-and-egg problem: increasing the number of readers who will howl if you screw up one way or the other often makes writers more careful to get things right, or to stick to verifiable facts. In my view that ought to count strongly in favor of blogs like Obsidian Wings, which take considerable care to maintain a multipartisan community.

What distinguishes internet communication from crowd behavior? For collective action like bulk emailings and phone campaigns, not much. People can subsume their will to a collective just as easily online as in a noisy mob. But in Wikipedia as in blogging silent agreement is death. We’re less like a mob than an enormous coffeehouse full of well-read people, or at least well-Googled, arguing with each other. The main effect of technology is to streamline that argument and facilitate communication, cross-referencing and fact-checking on an unprecedented scale.

Testing the converse principle, intellectual monocultures like LGF and the Office of Special Plans ought to, and do, put out unreliable crap. Surprise, filtering out the noisiest critics also shuts down your most enthusiastic fact-checkers.

***

While we’re talking about Stephen Colbert, note the very next story at Nature News:

South Africa expected to propose elephant cull

Public consultation could suggest killing to control population.

Wikiality, baby.

33 replies
  1. 1
    dreggas says:

    The converse is also being proven in the case of Conservapedia, of course Conservapedia also is a good place to test the theory that putting one hundred monkeys in a room full of type-writers will produce Shakespeare, but that is another experiment altogether.

  2. 2
    Zifnab says:

    Assuming the statement is “If you have many editors, you will generate a reliable information source”, wouldn’t the converse be, “If you generate a reliable information source, you have many editors”? I think you might mean the inverse – “If you do not have many editors, you will not generate a reliable information source.”

    Of course, Conservapedia is kinda skewed, in that it selects the most deliberately inept authors and (inadvertently) mixes in a heavy amount of spoof.

    And wft? Too many elephants? That sounds like bullshit to me.

  3. 3
    Pb says:

    Wikipedia turns the old rule that people are smart, crowds are stupid entirely on its head

    Not really. Mobs are stupid. Committees can be stupid. But many disparate individuals working on different chunks of the same topic, sometimes at cross-purposes? Smart–or smarter, at least. The individuals end up doing the heavy lifting, and the community (hopefully in accordance with their own guidelines) does the filtering–which is probably the weakest link in the chain.

  4. 4
    Tim F. says:

    And wft? Too many elephants? That sounds like bullshit to me.

    Well, farmers hate elephants. It could be like the US having too many wolves, until we killed the last one.

  5. 5
    ThymeZone says:

    Not really. Mobs are stupid. Committees can be stupid. But many disparate individuals working on different chunks of the same topic, sometimes at cross-purposes? Smart—or smarter, at least. The individuals end up doing the heavy lifting, and the community (hopefully in accordance with their own guidelines) does the filtering—which is probably the weakest link in the chain.

    A blog or Wiki is not a crowd. It’s a forum, or a medium, in which individuals speak and hear each other. Crowds are stupid because they stifle the individual and/or enhance bad aspects of people that would otherwise be inhibited.

    “Two heads are better than one” is not only true, it’s scaleable. Two hundred heads are better than one. And better than one hundred heads.

  6. 6
    Andrew says:

    Conservapedia also is a good place to test the theory that putting one hundred monkeys in a room full of type-writers will produce Shakespeare, but that is another experiment altogether.

    Theory disproved!

  7. 7
    James Gary says:

    “Some studies raised my eyebrows, for example one group’s apparently circular decision to measure the quality of an entry based on the Wiki community’s internal ratings. But overall the article is well worth the time.”

    Yes. Without being too snarky, I’d be really interested in knowing how the researchers measured “accuracy.” Did they use the same method for judging the “accuracy” of Wikipedia’s entry on “Buffy The Vampire Slayer” as they did for the entry on “Israeli Foreign Policy?”

  8. 8
    Mark says:

    What distinguishes Wikipedia from a mob or a bunch of comments on a blog isn’t just the amount of users or the variety of their backgrounds. It’s that in its editing structure and user moderation, it has created a mechanism by which the most accurate information is weeded out of that mob.

  9. 9
    Bubblegum Tate says:

    Conservapedia also is a good place to test the theory that putting one hundred monkeys in a room full of type-writers will produce Shakespeare

    “It was the best of times, it was the blurst of times?!?”

  10. 10
    Andrew says:

    Wait Shakespeare didn’t write anything about tree octopi, did he?

  11. 11
    Krista says:

    We’re less like a mob than an enormous coffeehouse beerhall full of well-read people, or at least well-Googled, arguing with each other.

    We’re not civilized enough to be considered a coffeehouse, Tim.

  12. 12
    demimondian says:

    I find that Wiki is a good initial source for many issues, but, at least in some areas, tends towards politicization. Go look at revert wars on, say, AJAX. I know the history well, as I know many of the people on the team which invented it (as in, worked with them in a single team kind of “know”). Wiki has, in the past, gone back and forth between acknowledging their work and denying it…since, after all, they were a team of Microsofties.

    The wisdom of crowds is a great thing, until it is transformed into the tyranny of the mob. I don’t see Wikipedia as having adequate protections against that transformation, and I do see that the site has made it several times.

  13. 13
    ThymeZone says:

    I do see that the site has made it several times.

    Egads, wrong information “several times?”

    Oh, the humanity!

    Look away, it is too hideous!

    Fuck me. We have instant access to a galaxy of useful information unknown to our parents, unimaginable by our grandparents. How AWFUL that it’s not perfect.

    I am weeping …. I have to go and blow my nose now ….

    { sobs }

  14. 14
    pharniel says:

    actually for general knowledge information or strong fandoms wikki does pretty well.

    it’s when you get into areas where only a few people actually understand the topic where it breaks down.

    Look at Websnark for a listing of the most egregious –
    someone requesting (and almost being granted) removal of Deconstructionist critique as being too vague and not having a solid definition when that’s pretty much what defines the style.

    also the various intentionally mauled bibliographies where the response from the wikki editors (and attendant mob) when attempted to be corrected by the Individual in Question was that they were violating the no-autobiography rule and thus could not correct their own entries.

    Wikki sucks balls for reliable information. For material for conspiracy theory games or a classic case study in mob rule with a hidden oligarchy it rocks.

    Just because the French Revolutionary government got some things right doesn’t mean that there wasn’t a metric fuckton of wrong calls.

  15. 15
    pharniel says:

    errr, biographies…can’t think. need more coffee.
    or gin.

  16. 16
    RSA says:

    Three groups of researchers claim to have untangled the process by which many Wikipedia entries achieve their impressive accuracy (1, 2, 3)

    Interestingly, I happen to know one of the authors of the second reference. It’s not a conventionally peer-reviewed article, but my friend does good research and the paper is pretty interesting.

  17. 17
    ThymeZone says:

    Don’t know how much of your complaint is a valid appraisal .. no, that came out wrong. Don’t know how much of Wikipedia is junk and how much is good.

    What counts is whether people are catching on to the fact that they have responsibility for their own information streams. Just as you choose your nutrition diet, so do you choose your info diet. And you are responsible for processing the info that you consume, for figuring out what is true and what is not.

    As people learn to do that, the bad effects of bad information become diluted and less hideous. Wiki may be 30% trash (I dunno, just picked the number out of thin air) but if I know that, then I can deal with it. I’m the consumer of information, it’s my job to get it right, not to whine about bad information.

    This is true whether we are talking about Wikis or cable news or blogs.

    When people learn to ascertain what is true, then the incentives to provide good information will go up, and there will be more of it. As long as people are content to settle for bad information, then there are plenty of providers out there more than willing to give it to you.

    Caveat emptor. That’s the only scheme that will work, and as a consumer, I have no problem with it whatever. I know how to cross check things.

  18. 18
    goof says:

    This article is why conservatives are not good on the internet.
    First, they are in denial about what happened last November. The election was so far beyond any poll’s expectations, on the left & right.
    What scared the GOP most was the losses in statehouses; we are talking thousands of seats and 8 more Governorships. They blame the internet.
    They have to blame somebody, and this is what they’ve chosen as the method to explain what has gone wrong since 1994. It wasn’t the internet, or the media or anything. It was them who were at fault, in dozens of different aspects.
    But, what really was unexpected was what voters actually did.
    It was the voters’ fault, and the voters blamed the GOP.

    Secondly, as they, in that denial, with this article, are trying to use that same method of attack, (we are still right mode-we just need to prove it someway)
    the internet, aka: Wikipedia is the target today.
    Implied to be the end-all source of liberal mis-information, which apparently 130 million voters dedicated their lives to for the months before last November. If we are to believe the conservatives’ rationale for losing the election.

    Finally, this only supports the voters’ belief that the GOP is, has been, and is likely to be making false attacks and swift boating everything they can think of.
    On Feb 28, it’s Al Gore’s Oscar win. Gore didn’t win any Oscar, he wasn’t nominated. He was the announcer in a film, made by others who won the award. But the right cannot accept the fact, they have to make up their own.
    It’s stupid to do that, voters know better and this only increases the GOP’s chances of being further marginalized in 2008.

    The internet does have some benefits to everyone, it is a good place to find sources, prove facts, etc. The cons, in their own narrow targeting attacks, can only blame one source at a time. This time it’s Wikipedia. In 2008, who knows, maybe Google will be the bad guys. Google was probably a bigger source than Wikipedia for voters in 2006, and Democrats were certainly first out of the gate in using it for their gain. Howard Dean proved that with his fund raising over the internet. A concept the GOP couldn’t grasp, and still hasn’t.
    The internet is a big thing, and it will grow for two simple reasons.
    People can get information anytime, not just at 6PM with the news or just from TV. They can also, within moments validate that information they found, as true, false or something inbetween. that is a fact of the internet user.

    Now the conservatives need to learn a few things, not from the Democrats and especially not from their so called advisors. They lost big-time the last time; and the GOP needs a blank slate after Bush is gone.
    1) They have to start accepting and admitting mistakes, even the small ones.
    2) They have to be open with information, not offer narrow made up facts.
    3) They have to show far more respect for the voters’ intelligence, and not continue fighting them as being duped by the left, swayed by the media or as several conservatives have called voters…Complete idiots.
    4) Allow the moderates within to do all the talking, so far we’ve only seen the most radical of the farthest right faction. The militant religious evangelicals.
    5) Learn how to make press releases that encourage questions and internet research, that get people talking. that get you suggestions from those people.

    There is one fatal aspect of the conservatives’ view. They are 100% right all the time. The voters are 100% right all the time, and the GOP has to learn how to adapt and accept different ideologies and policies, if it will be part of the national political future. Think of it this way, the customer is always right, and the GOP is selling us its wares and widgets.
    Don’t tell us that there is only one way to use or buy that widget, because we will walk right out of your store, and find the very same widget, at the very same price, whose best feature is we can do this AND that with the widget.

    The internet is where we voters play, With the truth, with the faked.
    But we get to play with it, not conservatives or liberals.

    So what did you post here. Some great left wing conspiracy that took over Wikipedia and the idiot voters got sucked into believing it, and if the Wikipedia people had as high of ethics as conservatives do, this election would have never been stolen from those who know what’s right for America.

  19. 19
    Kent M says:

    ” Wiki entries, which you can think of as a form of extended conversation or debate, only get more accurate the more people jump in”

    And it’s worth noting that this is entirely true, or not at all – that Wikipedia is accurate depending on the time that an article is read. It seems that the chances are that an entry will become more correct over time and with more input, but the information that appears in the entry is entirely dependent on the latest edit.

  20. 20
    Kent M says:

    ThymeZone Says: “…as a consumer…”

    …but what about as a Human Being? :-)

  21. 21
    ChrisO says:

    Don’t be so surprised about the South African cull. Didn’t you know the population of elephants had tripled recently?

  22. 22
    Marcus Wellby says:

    Wow, I guess my 1996 Encarta CD isn’t much good anymore. But I still miss the days of flipping through the old Worldbook encyclopedia while also watching cartoons on rainy weekend mornings.

  23. 23
    ThymeZone says:

    …but what about as a Human Being?

    “I am not an animal! I am a human being!”

    — John Merrick (the Elephant Man)

  24. 24
    Richard 23 says:

    We’re not civilized enough to be considered a coffeehouse, Tim.

    I’ll drink to that.

  25. 25
    Zifnab says:

    Wow, I guess my 1996 Encarta CD isn’t much good anymore. But I still miss the days of flipping through the old Worldbook encyclopedia while also watching cartoons on rainy weekend mornings after bonking the neighbor’s wife.

    This way you sound like less of a nerd.

  26. 26
    Marcus Wellby says:

    This way you sound like less of a nerd.

    Ha! Indeed. Well, I don’t go much for nerdy behaviour as an adult, I would never think of such things — like posting comments on blogs.

  27. 27

    Not really. Mobs are stupid. Committees can be stupid. But many disparate individuals working on different chunks of the same topic, sometimes at cross-purposes? Smart—or smarter, at least.

    Fine. Let’s see YOU storm the castle and kill the monster by yourself, asshole.

  28. 28
    Fledermaus says:

    To me the most interesting point is the way that Wikipedia turns the old rule that people are smart, crowds are stupid entirely on its head.

    Ah but you are only looking at one side of things. It is also misphrased: Consumers are dumb. But here we have highly specialized subjects and a large user base. Why do people write and edit entries – well they know a lot about the subject at hand and therefore enjoy talking about it.

    Once you wander off the beaten trail it’s hard to find someone who has read a lot about Pakistani cooking or the history of all those southern Russian republics. But with a large existing base it becomes much more likely that you will find someone who knows what they are talking about.

    Of course this all breaks down when you have entries for evolution or the book of revelations. Here emotions are too strong so everyone considers themselves an expert and entitled to edit whatever facts they disagree with.

    But for the vast majority of entries, let’s face it the number of people running to Wiki to make up lies about, say, The Siege of Hikida is pretty close to 0.

  29. 29
    dslak says:

    The articles on non-Western religions and philosophy are often full of fluff or else fundamentalist takes on issues. I’ve considered attempting to edit the ones on which I have some expertise, but it would pretty much be me alone against a group of fundie Hindus or Buddhists, so it probably wouldn’t be a good use of my time.

  30. 30

    “Two heads are better than one” is not only true, it’s scaleable. Two hundred heads are better than one. And better than one hundred heads.

    Not necessarily. Try expanding a programming team from two to two hundred…

    Seriously though – the wisdom of crowds works best when you have a lot of people who share expertise (not opinions or perspectives, but specific knowledge in a field) in a particular domain all working on a problem in their particular domain. Add in non-experts, and their accuracy rate starts to plummet. Present the same cohort of experts with a problem outside their domain, and you see the same decrease in accuracy. The other circumstance under which crowd wisdom works well is when a particular subject isn’t specialized – for instance, if 1000 people are asked to guess the number of jelly beans in a jar, the average response will be very close to the actual number of beans.

    Wikipedia’s disdain for ‘experts’ probably does drive down the accuracy of any articles with more mainstream appeal, ie. any subject on which Joe PseudoRandom feels entitled to pontificate. It will also suffer, as others have mentioned, where subjects attract sharp partisan divisions. It’s a weird kind of regression to the mean, whereby you produce a ‘truth’ that’s acceptable to a broad range of people. Doesn’t mean that it’s accurate, just acceptable.

  31. 31
    demimondian says:

    The other circumstance under which crowd wisdom works well is when a particular subject isn’t specialized – for instance, if 1000 people are asked to guess the number of jelly beans in a jar, the average response will be very close to the actual number of beans.

    Not true. There’s a canonical counterexample to this, and I’ll bet you use it every day.

    It’s called “Page rank”, and it’s the base of the Google search index. Larry Page’s insight was that the set of web authors is a huge crowd of local experts — linking from page a to page b shows a “value” connection between the two pages. Each link by itself reflects one person’s wisdom, and, by averaging among many of them, you get a source of information about the latent structure of the web.

  32. 32

    Not true. There’s a canonical counterexample to this, and I’ll bet you use it every day.

    Actually, I’d look at Google’s page ranking scheme as combining lots of ‘expert based’ samples (specialized pages providing domain-based sets of links) with huge numbers of generalist pages providing links of overall interest – in effect, combining the knowledge of lots of sets of domain experts with a larger, undifferentiated mass of jelly-bean counters.

  33. 33
    jenniebee says:

    Finally got onto Conservapedia.com & checked out their litany of complaints against wiki. My absolute fave:

    Robert McHenry, former Editor-in-Chief for the Encyclopedia Britannica, wrote about Wikipedia’s bias and included this observation:[18]

    One simple fact that must be accepted as the basis for any intellectual work is that truth – whatever definition of that word you may subscribe to – is not democratically determined.

    Encyclopaedia Britannica’s truth arbitration process is that scholars decide which topics are important and should be included, and which topics are not worth the paper. They then select a bevy of other scholars and ask each to contribute a thousand words on a narrowly defined topic. This is truth oligarchically determined. Conservapedia has 58 homeschooled pre-teens copying and pasting their book reports into a wiki, which is then proofread by Andrew Schlafly. This is truth on a permanent festum fatuorum

Comments are closed.