Algorithm Rates Trustworthiness of Wikipedia Pages

Posted by CowboyNeal on Friday August 31, 2007 @07:38AM from the getting-it-right dept.

paleshadows writes "Researchers at UCSC developed a tool that measures the trustworthiness of each Wikipedia page. Roughly speaking, the algorithm analyzes the entire 7-year user-editing-history and utilizes the longevity of the content to learn which contributors are the most reliable: If your contribution lasts, you gain 'reputation,' whereas if it's edited out, your reputation falls. The trustworthiness of a newly inserted text is a function of the reputation of all its authors, a heuristic that turned out to be successful in identifying poor content. The interested reader can take a look at this demonstration (random page with white/orange background marking trusted/untrusted text, respectively; note "random page" link at the left for more demo pages), this presentation (pdf), and this paper (pdf)."
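
The reputation idea in the summary can be illustrated with a toy sketch. This is not the UCSC researchers' algorithm; it only shows the general shape of "surviving text raises reputation, removed text lowers it." The function names `update_reputations` and `text_trust`, the `gain`/`loss` parameters, and the word-set model of a revision are all assumptions made for this example. A real tool would need proper diffing and attribution.

```python
from collections import defaultdict


def update_reputations(revisions, gain=1.0, loss=1.0):
    """Toy longevity-based reputation (hypothetical, not the UCSC method).

    revisions: list of (author, text) pairs in chronological order.
    Each distinct word is attributed to the author who first inserted it.
    When a different author's revision keeps the word, the original author
    gains reputation; when it drops the word, the original author loses.
    """
    reputation = defaultdict(float)
    owner = {}              # word -> author who first introduced it
    previous_words = set()
    for author, text in revisions:
        words = set(text.split())
        for word in previous_words:
            original = owner[word]
            if original == author:
                continue    # authors do not judge their own text
            if word in words:
                reputation[original] += gain
            else:
                reputation[original] -= loss
        for word in words - previous_words:
            # Assumption: the first author keeps attribution even if the
            # word is removed and later restored by someone else.
            owner.setdefault(word, author)
        previous_words = words
    return dict(reputation), owner


def text_trust(text, owner, reputation):
    """Pair each word with the reputation of the author it is attributed to."""
    return [(w, reputation.get(owner.get(w), 0.0)) for w in text.split()]


history = [
    ("alice", "the sky is blue"),
    ("bob", "the sky is blue and green"),
    ("carol", "the sky is blue"),
]
rep, owner = update_reputations(history)
print(rep)  # {'alice': 8.0, 'bob': -2.0}
print(text_trust("the sky is green", owner, rep))
```

In this toy run, alice's four words survive two later revisions, while bob's two added words are removed by carol, so text attributed to bob would be the part flagged as less trusted.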

Light Bulb Moment (Score:5, Funny)

by dsginter ( 104154 ) writes: on Friday August 31, 2007 @07:41AM (#20422839)

Someone should make a wikipedia entry for this algorithm to see how trustworthy it is.

- algorithmic argumentum ad verecundiam (Score:2)
  
  by bareman ( 60518 ) writes:
  
  It's practically automatic with people, so codifying it for machines should be no surprise.
  
  http://en.wikipedia.org/wiki/Appeal_to_authority [wikipedia.org]
- Re:Light Bulb Moment (Score:5, Interesting)
  
  by marcello_dl ( 667940 ) writes: on Friday August 31, 2007 @09:33AM (#20423683) Homepage Journal
  
  Sounds crappy. Let's say you expose some important misdeed. You're likely to be edited out by an army of paid staff who keeps an eye on the 'net. (Don't tell me I'm paranoid, because I saw it happening and read about stuff like that in the news, even on Slashdot.) You are not contributing much else to Wikipedia because you simply wanted to expose what you know, so you'll end up with low karma.
  
  Anyway, I guess it'll be another PageRank or Slashdot filter affair. People trying to beat it, devs trying to make it better.
  
  The plus is, there is not only Wikipedia. You can always search the rest of the web.
  The minus is, you search the rest of the web with Google, which is equivalent if not worse.
  
  We need a good search engine on top of a Tor network, and bandwidth to make it run smooth. Not many other ways to achieve real net freedom.
  
  - Re: (Score:3, Insightful)
    
    by MrNaz ( 730548 ) writes:
    
    "We need a good search engine on top of a Tor network, and bandwidth to make it run smooth. Not many other ways to achieve real net freedom."
    Can you explain yourself a little more? I don't see how Tor would improve the quality of information being searched for. (Not arguing, just interested in your ideas)
    - Re: (Score:2)
      
      by marcello_dl ( 667940 ) writes:
      
      It's not about the quality of what is searched for (unless we get a little paranoid and consider the case where different content is returned by a site depending on the geolocation of the request IP).
      
      It's about the mere act of searching being reported as suspect activity. Or avoid profiling. Needless to say the potential for abuse of this freedom is huge.
  - Re: (Score:2)
    
    by eh2o ( 471262 ) writes:
    
    This is just one in a long series of "quality" ranking algorithms based on analysis of information networks. The next logical step is a meta-algorithm that ranks the quality of quality ranking algorithms. Metrics will include vulnerability to attack / manipulation, bias or skewness of results, logical consistency of quality estimation, logical / semantic consistency of content judged to be trusted or coherent, and reaction time to incorporation of new information. It's only a matter of time before some poor
  - Re:Light Bulb Moment (Score:4, Insightful)
    
    by PingPongBoy ( 303994 ) writes: on Friday August 31, 2007 @04:01PM (#20428229)
    
    Sounds crappy. Let's say you expose some important misdeed. You're likely to be edited out by an army of paid staff who keeps an eye on the 'net
    
    Nope. If you post one misdeed and that gets edited out, such is life but shouldn't affect your credibility that much because everyone is always getting edited out a few times in the long run.
    
    However, if you edit hundreds or thousands of different articles and people leave you alone, o great guru, you're good.
    
    Wikipedia's ultimate strength depends on the community's desire for good information, readiness to stomp on crap, and will to contribute. Conversely, Wikipedia would decay if people didn't give a rat's ass about Wikipedia and let it go to ruin like an unweeded garden. This mechanism of quality control needs to be applied down the hierarchy of categories, subcategories, and articles. It's understandable that certain areas will have more pristine content overall while other areas will be populated with childish and wanton ideas. Thus, a contributor evaluation program can be tested.
    
    - Re: (Score:2)
      
      by marcello_dl ( 667940 ) writes:
      
      > Nope. If you post one misdeed and that gets edited out, such is life but shouldn't affect your credibility that much because everyone is always getting edited out a few times in the long run.
      
      That makes sense. But if you're one of the bad guys then all you need is a big provider with lots of IP ranges, cleanup of previous cookies, and an altered user agent, and you have a similarly weighted counterattack against the guy who exposes a misdeed.
Seems a bit dangerous (Score:5, Insightful)

by fymidos ( 512362 ) writes: on Friday August 31, 2007 @07:45AM (#20422871) Journal

>If your contribution lasts, you gain 'reputation,' whereas if it's edited out, your reputation falls

And the editor wars start ...

- Re:Seems a bit dangerous (Score:4, Insightful)
  
  by N!k0N ( 883435 ) writes: <dan&djph,net> on Friday August 31, 2007 @07:52AM (#20422921)
  
  Yeah, that is a bit of a "dangerous" way to go about rating the content, however I think it could be a step in the right direction. If this can be improved, perhaps the site will gain a better reputation in the eyes of professors. Now, I don't doubt that there is a lot of misinformation on the site (intentional or otherwise); however, a good deal of the information I have used for research papers or to quickly check something seems to be confirmed elsewhere (texts, journals, etc).
  
  - Re: (Score:3, Insightful)
    
    by xappax ( 876447 ) writes:
    
    If this can be improved, perhaps the site will gain a better reputation in the eyes of professors.
    
    No, it won't gain a better reputation in the eyes of professors (at least decent professors) for two reasons:
    
    1) It's an inherently flawed algorithm and easily gameable. It's useful as a very vague unreliable data-point, and not much else.
    
    2) Wikipedia is not a source for academic research, and never will be. If it's anything to academics, it's a place to go to get some clues on how to proceed with their
    - Hmmmmmmm (Score:2)
      
      by mcmonkey ( 96054 ) writes:
      
      2) Wikipedia is not a source for academic research, and never will be.
      Your comment got me to thinkin'. (and on a Friday! Damn you!)
      The big thing in academic research is peer review, and what is Wikipedia but the extension of peer review to the larger community? I'm certainly not a fanboi and don't use Wikipedia as a source for anything work related, but I'm not too quick to add "never" to the end of that statement.
      When I go to a peer-review journal, either as a source for research or an outlet of pub
      - Re: (Score:2, Insightful)
        
        by skoaldipper ( 752281 ) writes:
        
        What prevents Wikipedia setting up a foo area moderated by a panel of foo experts known to the foo community?
        Define experts.
        
        Wikipedia does an extraordinary job from a wide variety of peer resources, both professional and layman alike. So-called "experts" like academia are just as political in their research and analysis as well - specifically, in the social sciences. Peer review never really amounts to much more than a consensus, but not necessarily an accurate one. Objectivity is the holy grail which I
      - Re: (Score:2)
        
        by xappax ( 876447 ) writes:
        
        The big thing in academic research is peer review, and what is Wikipedia but the extension of peer review to the larger community?
        
        Wikipedia does use peer review, but it's a different kind than what we see in the academic community. If something is peer reviewed in Wikipedia, it means that other people are able to confirm that all the listed information has been published in reliable sources. "Verifiability, not truth" as they say. If something is peer reviewed in the scientific community, it means tha
    - Sounds familiar (Score:2)
      
      by COMON$ ( 806135 ) * writes:
      
      obg quote references: (Thanks for most of them goes to www.av8n.com)
      We will never make a 32-bit operating system, but I'll always love IBM. -Gates
      What, sir, would you make a ship sail against the wind and currents by lighting a bonfire under her deck? I pray you, excuse me, I have not the time to listen to such nonsense. - Napoleon
      I watched his countenance closely, to see if he was not deranged ... and I was assured by other senators after he left the room that they had no confidence in it. - U.S. Senato
  - Re: (Score:2)
    
    by l0b0 ( 803611 ) writes:
    
    Sounds to me like this should really be made into a GreaseMonkey script or Firefox extension, to avoid having an "official" algorithm that everybody will try to appease.
- Re: (Score:3, Informative)
  
  by ajs ( 35943 ) writes:
  
  Editor wars are an old thing. The real concern I'd have would be how you deal with old editors who don't contribute anymore (but were "trustworthy" when they did) vs. new editors. Overall, I think it's a good idea, and I would go so far as to say that MediaWiki should offer a feature that performs this analysis for you.
  
  -~~~~
  - Re: (Score:3, Insightful)
    
    by fymidos ( 512362 ) writes:
    
    > Editor wars are an old thing
    
    but they get a whole new meaning when it makes sense to find all edits by an editor, delete them, and then rewrite them as your own...
    - - Re: (Score:2)
        
        by fymidos ( 512362 ) writes:
        
        They keep the authorship of all text, but if I delete somebody's text, copy it, and submit it as mine, the authorship changes. So, yes, it does affect his reputation because the text that belongs to him will be deleted.
        
        - Re: (Score:3, Informative)
          
          by ajs ( 35943 ) writes:
          
          You're not actually reading the text that they linked to, are you?
          
          We're not talking about Wikipedia's concept of authorship, here, but the tool's. The tool tracks who first wrote something and doesn't re-assign authorship because it was removed (e.g. by a vandal) and then restored.
          
          You would have to remove what they wrote and then restore it in your own words in such a way that your edit was good enough to be retained by the community. In which case, the system worked.
          
          Overall, I think it would be an excellen
- Re: (Score:2)
  
  by Bongo Bill ( 853669 ) writes:
  
  And the editor wars start ...
  You misspelled "continue."
- - Re: (Score:2)
    
    by NickCatal ( 865805 ) writes:
    
    If you read it it does say that reverting vandalism will improve your reliability.
    
    Only problem is, if I continually revert vandalism, am I not also inflating my own rep when I decide to go make an edit that turns out to be incorrect?
    
    It is not only a really good idea, it is a GREAT idea, but as a Wikipedia editor who has introduced some incorrect facts (and changed them back later or had them changed for me, thankfully) into the site I am a little worried on how much trust it gives each user. I have a lot of
Hmmm... A reputation metric... (Score:4, Funny)

by Colin Smith ( 2679 ) writes: on Friday August 31, 2007 @07:46AM (#20422881)

It'd be nice if it could be generalised to other sites...

- Re: (Score:1)
  
  by PJ1216 ( 1063738 ) * writes:
  
  I doubt it could be. The metric itself would be inherently different on various pages. The metric here is that longer-lasting content implies a trustworthy source. On other sites, this metric may be worthless or the exact opposite (sites that constantly update or change, etc.). This metric is tailored to a wiki, so maybe other wiki sites, but not other sites in general.
  - - Re: (Score:2)
      
      by shani ( 1674 ) writes:
      
      I think he was joking about the long-established /. karma system, Mr 7-digit ID.
      
      Don't be too hard on him. The humor of us old-timers is often missed on you kids today.
- Re: (Score:2)
  
  by zeromorph ( 1009305 ) writes:
  
  Ssssh. I know something better: reputation_algorithm 2.0, just let people do it, call "reputation" "karma" (just for the geek factor) and I predict it will be a great success at least in the stranger corners of the internet.
  - Re: (Score:2)
    
    by Anonymous Brave Guy ( 457657 ) writes:
    
    I wonder whether nominating an editor on Wikipedia a "karma whore" will result in a net increase or decrease of "reputation" for the nominee. :-)
Godwin's Second Law (Score:3, Insightful)

by Anonymous Coward writes: on Friday August 31, 2007 @07:47AM (#20422887)

Every paper touting automatic adjustments for gaming the system becomes obsolete the moment it is published.

(Godwin didn't publish this, but I might get around to editing his Wikipedia entry to say that he did).

7 years??? (Score:3, Interesting)

by Anonymous Coward writes: on Friday August 31, 2007 @07:49AM (#20422895)

I've been noticing that some of the edit histories for articles that are 5 years old on Wikipedia stop well before 5 years ago. Have some of the edit histories been lost or deliberately truncated?

- Re: (Score:2, Informative)
  
  by Anonymous Coward writes:
  
  Some edit histories are also completely messed up in random order. Look at the weird edit history for Wikipedia's article on Pi "Revision as of 21:54, 8 September 2002" precedes older revision "Revision as of 06:17, 5 December 2001" [wikipedia.org]
  How can we trust the Wikimedia software if it corrupts the edit database?
  - Re: (Score:2, Informative)
    
    by Stooshie ( 993666 ) writes:
    RTFA!
    
    The demo is based on the Wikipedia dump dated February 6, 2007. The demo contains pages that are contiguous in the dump; pages were not selected manually or individually. The demo contains the last 50 revisions of each page (or fewer, for pages with fewer revisions).
    Occasionally, the coloring breaks the Wikimedia interpretation of the markup language. We are trying to resolve all such issues by locating the coloring information appropriately.
    The algorithms are still very preliminary.
    No, you cann
    - Re: (Score:2)
      
      by Goaway ( 82658 ) writes:
      
      Perhaps you should, instead, try to read and understand the post that you are replying to.
- - Massive truncation of edit histories (Score:2)
    
    by FeatureBug ( 158235 ) writes:
    
    I guess you meant "But other admins can still see it [any edit deleted by an administrator]"?
    Is ordinary admin (non-oversight) deletion used frequently compared to oversight deletion? I've seen articles where the entire edit history before a certain date containing several years' worth of edits was erased.
    What could be causing some edit histories to get out of chronological order as mentioned in this post [slashdot.org]?
Doesn't take into account common myths (Score:5, Interesting)

by Cryophallion ( 1129715 ) writes: on Friday August 31, 2007 @07:51AM (#20422905)

So, if there is a myth that a lot of people believe is true, then it will stay up there as it is not challenged. So, it still gets reputation, and therefore more credibility, making it more likely that the myth will be perpetuated.

Also, if someone hasn't noticed something that is wrong on an esoteric entry, it will also be given credibility, and once again be more likely to be considered to be fact.

While you could add voting to the algorithm to have people vote on whether it is true, that still gets destroyed by someone who just votes because they think it's true, not because they have verified it.

Either way, it potentially gives additional credibility to something that may be very wrong.

- It doesn't have to be perfect (Score:5, Insightful)
  
  by KingSkippus ( 799657 ) * writes: on Friday August 31, 2007 @08:14AM (#20423069) Homepage Journal
  
  No algorithm, except maybe personally checking every single article yourself, will ever be perfect. I suspect that the stuff you talk about will be very rare exceptions, not the rule. In fact, one of the reasons that it is so rare is because people who know what the actual truth of a matter is can post it, cite it, and show it for all to see that some common misconception is, in fact, a misconception. This is much better than, say, a dead tree encyclopedia where, if something incorrect gets printed, it will likely stay that way forever in almost every copy that's out there. (And, incidentally, no such algorithm can exist, since dead tree encyclopedias generally don't include citations and/or articles' editing histories.)
  
  The goal wasn't to create a 100% perfect algorithm, it was to create an algorithm that provides a relatively accurate model and that works in the vast majority of cases. I don't see any reason this shouldn't fit the bill just fine.
  
  - Re: (Score:2, Insightful)
    
    by duggi ( 1114563 ) writes:
    
    Why bother with an algorithm in the first place? Wikipedia is good for learning facts. If someone wants to know what Mary's room experiment was, they can find it. But if they want to know who did it and what kind of a person he is, should they not be referring to two or more sources? I guess the problem with credibility arises only when there is an opinion involved. It might work, sure, but when you come to know that the article is one big lie, would you not do some more research on finding out what is rig
  - Re: (Score:2)
    
    by Colin Smith ( 2679 ) writes:
    
    I suspect that the stuff you talk about will be very rare exceptions, not the rule.
    Not necessarily the case. All it takes is a good propagandist. Knowledge is power, if you can fool most people into believing a lie then they'll maintain the lie for you. It can take decades for the truth to come out, even if it's relatively obvious that the lie doesn't work.
- Re: (Score:2)
  
  by SQLGuru ( 980662 ) writes:
  
  Another way to increase your standing is to invent pages of content no one would ever go to (Xpi - a specific hovercraft model) or to just make small grammatical shifts so that your updated content ages while someone more reliable loses credibility.
  
  Layne
- Re: (Score:2)
  
  by mcrbids ( 148650 ) writes:
  
  So, if there is a myth that a lot of people believe is true, then it will stay up there as it is not challenged. So, it still gets reputation, and therefore more credibility, making it more likely that the myth will be perpetuated.
  
  Yep. There are lots of these. Snopes [snopes.com] is full of these - "everybody knows it's true" but yet it's false.
  
  Also, if someone hasn't noticed something that is wrong on an esoteric entry, it will also be given credibility, and once again be more likely to be considered to be fact.
  
  Oh, you
Seems to work ... (Score:5, Funny)

by Purity Of Essence ( 1007601 ) writes: on Friday August 31, 2007 @07:51AM (#20422909)

Seems to work, the entire page turned orange.

- Re: (Score:2)
  
  by Alsee ( 515537 ) writes:
  
  What does blue mean?
  
  -
hmmm... (Score:5, Funny)

by PJ1216 ( 1063738 ) * writes: on Friday August 31, 2007 @07:52AM (#20422917)

They should just call it wiki-karma.

- Re: (Score:2)
  
  by Magada ( 741361 ) writes:
  
  Ac'lly, they should call it wiki-staleness. Pages/sections which are not edited for AGES should be marked in a sickly green and flagged for editing, as the information is likely to have been obsoleted in some way (yes, even historical information).
#REDIRECT (Score:5, Insightful)

by Chris Pimlott ( 16212 ) writes: on Friday August 31, 2007 @07:56AM (#20422943)

It appears they include #REDIRECT pages; the very first page the random link took me to was Cheliceriformes [ucsc.edu], with the #REDIRECT line in orange. Seems an easy way to gain trust, once a redirect is created it is hardly ever changed.

- Re: (Score:2)
  
  by Chris Pimlott ( 16212 ) writes:
  
  Whoops, I misread the summary; I thought orange was trusted, so maybe they have special consideration for redirects. Or maybe that one redirect is a fluke; I can't tell now that the /.ing has begun.
- Re: (Score:2)
  
  by UnHolier than ever ( 803328 ) writes:
  
  That's only if the redirect points to the correct page. If it is vandalism, or if it points to an article with little relevance to the term searched for, the redirect will be removed, hence losing reputation.
I dunno about this system. (Score:5, Insightful)

by Wilson_6500 ( 896824 ) writes: on Friday August 31, 2007 @07:57AM (#20422949)

Does it take into account magnitude of error corrections? If major portions of someone's articles are being rewritten, that's a good reason to de-rep them. If someone makes a bunch of minor spelling or trivial errors, then that's not necessarily a reason to do so.

And, of course, there is the potential for abuse. If the software could intelligently track reversions and somehow ascribe to those events a neutral sort of rep, that would probably help the system out.

As it stands, they're essentially trying to objectively judge "correctness" of facts without knowing the actual facts to check. That's somewhat like polling a college class for answers and assigning grades based on how many other people DON'T say that they disagree with a certain person in any way.

- Re: (Score:2)
  
  by IBBoard ( 1128019 ) writes:
  
  I was thinking the same thing. Surely it would also penalise you if you didn't take a sufficiently approved writing style and it was re-written, including the same facts but different words, or if the article was restructured and that caused some rewording.
  
  I guess it's a start, but pure longevity of content isn't the best metric for trustworthiness.
- Re: (Score:2)
  
  by ajs ( 35943 ) writes:
  
  I would expect it to consider the percentage of original text that remains. For example, if you write 2 paragraphs and over time 20 words are edited by others, that's a pretty decent rate.
- Re: (Score:2)
  
  by shird ( 566377 ) writes:
  
  Although vandals and repeat offenders etc. often make small edits, such as inserting a 'not' or repeatedly inserting a URL or something. Someone who makes a large edit has gone to a lot of effort, and this is probably more trustworthy than someone who makes a quick hack. People who insert intentionally misleading information wouldn't go to too much effort to just have their edits reverted or easily noticed, so would just make small edits.
  
  So I don't think someone who goes to a lot of effort to insert a larg
  - Cut and paste edits (Score:2)
    
    by benhocking ( 724439 ) writes:
    
    I've seen some very large, profanity-laden edits. Many of these more than double the size of the article. These are not the majority of vandalisms, but they are a significant fraction.
- Re: (Score:2)
  
  by ceoyoyo ( 59147 ) writes:
  
  They're trying to assess trustworthiness, not correctness. They're different.
  
  They've probably fallen slightly short of trustworthiness, but they've hit truthiness straight on.
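
One reply in the thread above suggests scoring an edit by the percentage of its original text that remains. A minimal sketch of that measurement, using Python's standard `difflib` to align word sequences, might look like the following. The function name `surviving_fraction` is hypothetical, and nothing here is taken from the researchers' tool; it only shows how the ratio could be computed.

```python
from difflib import SequenceMatcher


def surviving_fraction(original, current):
    """Fraction of the original words still present, in order, in current.

    A hypothetical helper for illustration: words are compared as
    whitespace-separated tokens, so punctuation and case changes count
    as edits.
    """
    orig_words = original.split()
    cur_words = current.split()
    if not orig_words:
        return 1.0  # nothing was written, so nothing could be removed
    matcher = SequenceMatcher(None, orig_words, cur_words, autojunk=False)
    # get_matching_blocks() ends with a zero-size sentinel, which adds 0.
    kept = sum(block.size for block in matcher.get_matching_blocks())
    return kept / len(orig_words)


print(surviving_fraction("the quick brown fox jumps",
                         "the quick red fox jumps"))
```

Under this measure, a contributor whose two paragraphs lose only 20 words over time would score close to 1.0, while a contribution that is fully rewritten would score near 0.0. It says nothing about whether the surviving text is correct, which is the objection raised elsewhere in the thread.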
I suspect this heuristic measures.... (Score:5, Insightful)

by Anonymous Coward writes: on Friday August 31, 2007 @07:57AM (#20422957)

the relative controversy of the item being edited.

If I edit a history page of a small rural village near where I live, I can guarantee that it will remain unaltered. None of the five people who have any knowledge or interest in this subject have a computer.

If I edit an item on Microsoft's attitude to standards, or the US occupation of Iraq, I'm going to be flamed the minute the page is saved, unless I say something so banal that no one can find anything interesting in it.

But my Microsoft page might be accurate, and my village history a tissue of lies....

- AfD: nn (Score:3, Insightful)
  
  by tepples ( 727027 ) writes:
  
  If I edit a history page of a small rural village near where I live, I can guarantee that it will remain unaltered. None of the five people who have any knowledge or interest in this subject have a computer.
  If nobody else who has a computer cares, then it's less likely that your edits can be backed up with reliable sources [wikipedia.org]. In fact, people might be justified in nominating the article for deletion on grounds of lack of potential sources [wikipedia.org].
- Re: (Score:2)
  
  by zoney_ie ( 740061 ) writes:
  
  A small rural village? Unless you've put a reasonably decent bit of well-presented detail up about it, there's every possibility it'll be deleted as unverified, non-notable, etc., etc.
Tuned for Subject Matter (Score:5, Insightful)

by erroneous ( 158367 ) writes: on Friday August 31, 2007 @07:58AM (#20422961) Homepage

Sounds like a worthy start to the process of introducing more trustworthiness into Wikipedia entries, but this may need tuning for content type too.

After all, just because someone is a reliable expert at editing the Wikipedia entries on Professional Wrestling [wikipedia.org] or Superheroes [wikipedia.org] doesn't necessarily mean we should trust their edits on, for instance, the sensitive issues of Tibetan sovereignty [wikipedia.org].

- Re: (Score:2)
  
  by xappax ( 876447 ) writes:
  
  Sounds like a worthy start to the process of introducing more trustworthiness into Wikipedia entries
  
  It's not, and the reason is that any attempt to introduce more "trustworthiness" into Wikipedia is a waste of time. People distrust Wikipedia because of its most basic, core concept: anyone can contribute. In order to get these people to trust Wikipedia, you'd have to eliminate that core concept. A trust system like this algorithm will just prompt "nay-sayers" to point out how it's not reliable either -
- Re: (Score:2)
  
  by ASBands ( 1087159 ) writes:
  
  While you're absolutely correct, you must also factor in how people generally behave. How often is a reliable expert on Professional Wrestling going to edit the issue of Tibetan sovereignty? Sure, somebody could, but the goal isn't to get a black and white objective analysis of right and wrong, it's to get a gray area subjective rating of "trustworthiness." Better yet, it appears to work - in my article on the politics of Djibouti, all the "facts" were highlighted.
- Re: (Score:2)
  
  by rm999 ( 775449 ) writes:
  
  That's the whole point - he isn't trustworthy if he is going around editing things he doesn't know about. Theoretically, his edit to Tibetan sovereignty will be removed if he adds something untrue, which will hurt his trustworthiness. Additionally, if he wants his trustworthiness to remain high, he won't be editing too many things he doesn't know about. This trustworthiness number attached to him will pressure him to edit only things he knows about, which is a win for the site.
  
  This metric reminds me a littl
Unpopular but neutral points of view? (Score:5, Interesting)

by Knuckles ( 8964 ) writes: <knuckles@dantiEULERan.org minus math_god> on Friday August 31, 2007 @08:01AM (#20422981)

I realize that an encyclopedia by definition will always emphasize the established majority opinion about any given subject. But it seems that this tool might strengthen majority opinions beyond what is reasonable. If you happen to edit an article by adding valid but unpopular dissenting points of view, and the other contributors are sufficiently boneheaded, you lose karma (or whatever the tool calls it) for no good reason. This might then easily develop a life of its own, and you are screwed.

- Re: (Score:2)
  
  by ajs ( 35943 ) writes:
  
  Generalize that to controversy of any form. I have spent some time editing articles that focus on bigotry, genitalia and other topics which get a lot of vandalism... I wonder how that would be dealt with....
- - Re: (Score:2)
    
    by Knuckles ( 8964 ) writes:
    
    Both are good points, I think. But while articles indeed should have links to sources, by far not all do. And by far not all information that isn't backed up by source links right now is worthless or wrong.
    - Re: (Score:2)
      
      by tepples ( 727027 ) writes:
      
      And by far not all information that isn't backed up by source links right now is worthless or wrong.
      One of Wikipedia's core policies is that if you can't find a reliable source for an assertion, you should probably delete the assertion from the article. Wikipedia wants verifiability, not truth [wikipedia.org].
      - Re: (Score:2)
        
        by Knuckles ( 8964 ) writes:
        
        I know, we still have to deal with reality though. Saying "this is not a problem because all statements should have sources" is not helpful, when a huge number of statements don't.
    - Re: (Score:2, Funny)
      
      by gplus ( 985592 ) writes:
      
      And by far not all information that isn't backed up by source links right now is worthless or wrong.
      I want that sentence taken outside and shot. :)
      - Re: (Score:2)
        
        by Knuckles ( 8964 ) writes:
        
        Funny. But you know what I mean: often you come across a statement that you know is true, but is tagged to need a citation. Now, I agree that such statements are not particularly great in an encyclopedia, but the point still stands that the lack of a link does not per se make the statement untrue.
        
        This is made more difficult in case of topics that are (a) not well documented in general, and (b) happened before the internet became mainstream. Anyone who has ever searched for little-known, pre-internet stuff t
        
        - Re: (Score:2)
          
          by SL Baur ( 19540 ) writes:
          
          Yeah, that's a good point. Sometimes a valuable reference disappears too.
          
          One time when I was really bored, I went through the Wikipedia checking all the entries I could find on baseball players who had played professional baseball in Japan. In general they were pretty bad, often times not even mentioning specific teams let alone any statistics.
          
          There used to be a really excellent site that had rosters and statistics going back at least into the 90's. It was even translated into English and unfortunately,
  - Re: (Score:2)
    
    by oni ( 41625 ) writes:
    
    All wikipedia entries should have some reference to source material to be considered valid.
    
    That's true, but that's not what this tool is looking at. I might go edit Conservapedia, adding valid, multiply sourced facts, but then immediately get reverted. And this would happen again and again.
    
    Now this tool comes along and says, "ah ha! everything this guy writes gets reverted. He's obviously not trustworthy." And now my supposed untrustworthiness is used as an excuse to remove everything else I contri
    - Re: (Score:2)
      
      by Knuckles ( 8964 ) writes:
      
      Thanks, you expressed my point much more clearly than I could.
- - Re: (Score:2)
    
    by Knuckles ( 8964 ) writes:
    
    Funny, but I don't see that in reality. I read /. at threshold -1 to watch moderation misuse, and despite the frequent bitching about groupthink, it is extremely rare that anyone is erroneously modded to even 0, much less -1.
Tyranny of the majority (Score:5, Insightful)

by G4from128k ( 686170 ) writes: on Friday August 31, 2007 @08:06AM (#20423015)

Although this method will certainly help filter pranks and cranks, it won't help if the "consensus" among Wikipedia authors is wrong. If a true expert edits a page, but the masses don't agree with the edit, they will undo the expert's addition and give the expert a low reputation. Thus, the trust rating becomes a tool for maintaining erroneous but popular ideas.

That said, I can't help but believe that this tool is a net positive because it makes points of debate more visible. One could even argue that it literally highlights the frontiers of human knowledge. That is, high-trust (white) text is well-known material and highlighted (orange) text represents contentious or uncertain conclusions.

- Re:Tyranny of the majority (Score:5, Insightful)
  
  by Anonymous Brave Guy ( 457657 ) writes: on Friday August 31, 2007 @08:35AM (#20423221)
  
  Yes, this system demonstrates the correlation between the content and the majority opinion, not between the content and the correct information (assuming such objectively exists).
  
  Of course, if you take as an axiom that the majority opinion will, in general, be more reliable than the latest random change by a serial mis-editor, then the correlation with majority opinion is a useful guideline.
  
  Something that might be rather more effective, though perhaps less practical, is for Wikipedia to bootstrap the process much as Slashdot once did: start with a small number of designated "experts", hand-picked, and give them disproportionate reputation. Then consider secondary effects when adjusting reputation: not just whether something was later edited, but the reputation of the editor, and the size of the edit.
  
  This doesn't avoid the underlying theoretical flaw of the whole idea, though, which is simply that in a community-written site like a wiki, edits are not necessarily bad things. Someone might simply be replacing the phrase "(an example would be useful here)" with a suitable example. This would be supporting content that was already worthwhile and correct, not indicating that the previous version was "untrustworthy".
  
  - Re: (Score:2)
    
    by rhizome ( 115711 ) writes:
    
    > Yes, this system demonstrates the correlation between the content and the majority opinion, not between the content and the correct information (assuming such objectively exists).
    
    This objectivity does not and cannot exist. Gödel proved this one in mathematics before Derrida popularized it in literary criticism.
    - Re: (Score:2)
      
      by Anonymous Brave Guy ( 457657 ) writes:
      
      I think you're being a little too clever. If you're talking about an axiomatic system, it's pretty objective to state the axioms, for example...
- Re: (Score:2)
  
  by Ctrl-Z ( 28806 ) writes:
  
  Maybe the article heading should have said "Algorithm Rates Truthiness of Wikipedia Pages". Isn't that what's happening here?
- Re: (Score:2)
  
  by costas ( 38724 ) writes:
  
  This is just a start; you can go a long ways trying to determine who's an expert by checking the extent and trustworthiness of their contributions within an article "cluster" (where clusters can be determined through link graphs or content correlation).
  
  Wikipedia can also try using implicit "voting" on articles by tracking how many of their users have read a page and "approved" it by not changing it. Here your vote can also be linked to your trustworthiness. And of course you can have explicit voting /.-st
Algorithms are handy (Score:2, Offtopic)

by rinkjustice ( 24156 ) * writes:

when things can be quantified and measured. I've always wondered about the algorithm of a brand's worth. What is the logo's value, in relation to the slogan, and the consumer experience?

For instance, Google has a strong brand, despite their hideous logo and "Don't be evil" slogan, because the consumer experience is so good. Coca-Cola, on the other hand, scores big with their logo's distinctive cursive script, despite ongoing criticisms of its health effects and numerous allegations of wrongdoing by the compa
- - Re: (Score:2)
    
    by rinkjustice ( 24156 ) * writes:
    
    A lot of products taste good, and yet don't dominate the market like Coke does. You have to admit there is more than simply taste involved.
    - - Re: (Score:2)
        
        by rinkjustice ( 24156 ) * writes:
        
        I gotcha, and I think you've made a good point. But we come back to the marketing of it, and much of a company's marketing success relies on the image, the personality if you will, of the product. It's referred to as branding, and there are many aspects of branding that need to be considered (I'll point to an article [printpusher.com] I wrote on the very subject). What I'd like to know, and haven't been able to get an answer to, is whether there is a branding formula or algorithm out there.
A reasonable first step... (Score:2, Funny)

by dbolger ( 161340 ) writes:

...but call me when there's a tool to measure the truthiness of an article.
Goddamn... (Score:5, Funny)

by gowen ( 141411 ) writes: <gwowen@gmail.com> on Friday August 31, 2007 @08:10AM (#20423049) Homepage Journal

How did they pass up the chance to name this algorithm "Truthiness"? [wikipedia.org]

Don't Care. (Score:2, Insightful)

by pdusen ( 1146399 ) writes:

I might give a damn if Wikipedia editors had any actual interest in keeping articles truthful.
- Re: (Score:2)
  
  by shutdown -p now ( 807394 ) writes:
  
  > I might give a damn if Wikipedia editors had any actual interest in keeping articles truthful.
  
  And you believe that none of us have such an interest? Not a single one of ... I dunno ... how many are there these days? Several thousand on en-wiki alone?
I do not trust wikipedia on any "divisive issue" (Score:2)

by Shivetya ( 243324 ) writes:

unless it is consistent with what I already know to be true or have had time to verify against other sources.

too many zealots rule certain categories and unfortunately too many of the same are the very powers that be.
- Re:I do not trust wikipedia on any "divisive issue (Score:2)
  
  by Xtifr ( 1323 ) writes:
  
  > unless it is consistent with what I already know to be true
  
  Absolutely. I keep trying to replace all their lies about quantum mechanics with my truth about the Electro-Flux Aether and Spiritual Gravitation, and I keep getting reverted.
  
  > or have had time to verify against other sources
  
  Ah, so you do understand how Wikipedia should be used. Good on yer, mate. :)
  
  > too many zealots rule certain categories
  
  Yeah, like those bastards who keep trying to insist that the Holocaust actually happened, that ev
This will promote one thing (Score:2, Insightful)

by Daimanta ( 1140543 ) writes:

Groupthink.
Maybe in the future (Score:2, Funny)

by Unique2 ( 325687 ) writes:

What we really need is some sort of algorithm that compares new information to that which is already stored. It then could test hypotheses to gain further understanding. Unfortunately a machine with enough processing power to run this "critical thinking and understanding" algorithm would be impossible to build with today's technology. We would need a new type of processor that has maybe billions of "organic neurons", it would need to be equipped with highly sophisticated sensors, a method of self transporta
Should be called "stability" (Score:3, Insightful)

by Random832 ( 694525 ) writes: on Friday August 31, 2007 @09:16AM (#20423551)

"trustworthiness" doesn't enter into whether something gets edited out, for precisely the same reason a need for this is perceived at all: it can be edited by anyone!

Isn't this old news? (Score:2)

by ta bu shi da yu ( 687699 ) writes:

For a site that prides itself on being up there with announcing new things, this is really pretty much old news.
Algorithm doesn't prove what it thinks it does (Score:3)

by MSTCrow5429 ( 642744 ) writes: on Friday August 31, 2007 @10:41AM (#20424569)

That algorithm is a model that does not match real-world data. It might be useful to measure who has protection from the bureaucracy, but it won't and can't decipher how true something is simply by how many times and at what frequency people scribble over it. This algorithm is pseudo-scientific: it assumes a premise without investigating the veracity of said premise, and then runs away with it as if it were a proven one.

It's progress over edit counts (Score:3, Interesting)

by Animats ( 122034 ) writes: on Friday August 31, 2007 @10:54AM (#20424823) Homepage

One big problem with Wikipedia has been that editor status, and promotion to "adminship", is based on edit counts, the number of times someone has changed something. The editors with huge edit counts don't generally write much; it takes too long. Slashdot karma is a more useful metric than edit counts, but Wikipedia doesn't have anything like karma.
I'd suggested on Wikipedia that we needed a metric for editors like "amount of new text that lasted at least 90 days without deletion". This UCSC thing is a similar metric.
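
A minimal sketch of the "new text that lasted at least 90 days" metric mentioned above, for illustration only. It is not the UCSC algorithm. The `Contribution` record, its field names, and the sample history are all hypothetical, and it assumes something else has already worked out when each piece of text was added and removed.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Contribution:
    # Hypothetical record: one chunk of text added by one editor.
    author: str
    chars_added: int
    added_at: datetime
    removed_at: Optional[datetime] = None  # None means the text is still live


def surviving_text(contribs, now, min_age=timedelta(days=90)):
    """Sum, per author, the characters that stayed in place for at least min_age."""
    totals = defaultdict(int)
    for c in contribs:
        # Text that is still live has survived up to "now".
        end = c.removed_at if c.removed_at is not None else now
        if end - c.added_at >= min_age:
            totals[c.author] += c.chars_added
    return dict(totals)


now = datetime(2007, 8, 31)
history = [
    Contribution("alice", 1200, datetime(2007, 1, 10)),  # still live, old enough
    Contribution("alice", 300, datetime(2007, 8, 1)),    # still live, too recent
    Contribution("bob", 5000, datetime(2007, 2, 1), removed_at=datetime(2007, 2, 3)),
    Contribution("bob", 40, datetime(2007, 3, 1), removed_at=datetime(2007, 7, 1)),
]
scores = surviving_text(history, now)
print(scores)
```

Note that bob's 5000-character edit counts for nothing because it was removed after two days, while a raw edit count would have credited him the same as alice.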

- Re: (Score:2)
  
  by The One and Only ( 691315 ) * writes:
  
  I don't think that's been a pressing or urgent problem for a long time--it's more common for adminship candidates to be judged based upon the number of Featured Article development pushes they've been involved in than upon edit count, and it has been for months if not years.
Never confuse popularity with factual truth (Score:2, Insightful)

by presidenteloco ( 659168 ) writes:

They might be somewhat correlated, on a statistical basis, over many cases, but there are many individual cases and times when the currently popular view is wrong and the lone wolf opinions are later proven to have been correct.

This algorithm would seem to be more of a popularity contest than a truth finder. I think we have to be very wary of the truth by mass agreement theory.

Hint: Remember the "weapons of mass delusion"? I bet someone commenting that the US government is lying through their teeth about it would
PageRank (Score:2)

by 12357bd ( 686909 ) writes:

The algorithm looks very similar to Google's PageRank. Take edit time as the inverse of links to/from, and the whole concept looks very similar. The question is: PageRank became terribly biased once people started to automate cross-linking, so will this algorithm perform better against biased editors?
Do they track popularity of topics? (Score:2)

by Bluesman ( 104513 ) writes:

I could submit nonsense on a variety of obscure topics, with low odds that anybody will find and correct them, thereby building up a great reputation. I wonder if their system accounts for that.

This is starting to sound like Karma for Wikipedia.
Compliance vs Compression (Score:2)

by Baldrson ( 78598 ) * writes:

This algorithm is measuring compliance with the Wikipedia dispute processing norms -- not "trustworthiness". A better measure of "trustworthiness" of a passage is its consistency with the rest of the body of human knowledge -- which is most strictly measured by the degree to which it is not a special case within a compressed representation of that knowledge. This is the basis of the Hutter Prize for Lossless Compression of Human Knowledge [hutter1.net]. The Hutter Prize is currently using a 100M sample from Wikipedia
Spelling Mistakes? (Score:3, Insightful)

by logicnazi ( 169418 ) writes: <gerdes@iMENCKENnvariant.org minus author> on Friday August 31, 2007 @03:26PM (#20427967) Homepage

What I want to know is if it is smart enough to distinguish edits that correct spelling and grammar mistakes from those that change content.

In particular I'm worried that the system will undervalue the information from people whose edits are frequently cleaned up by others even if that content is left unchanged.

Pseudonyms? (Score:2)

by chris_sawtell ( 10326 ) writes:

Wouldn't one find an entry, for example, about the early history of the WWW by the genuine Sir Tim Berners-Lee to be considerably more trustworthy than one signed by some anonymous WikiWonderBoy?

I don't think the algorithm takes that into account.
- Re: (Score:3, Funny)
  
  by Tribbin ( 565963 ) writes:
  
  Whereas the implementation of "+1 funny" will be the end of the information age.
- Re: (Score:3)
  
  by Synic ( 14430 ) writes:
  
  No, actually you could use a dummy account to smear anyone's reputation by constantly re-editing their page back to whatever you want. By fighting over the content of a page, you effectively decrease the reputation of both parties. Since the dummy account isn't a real person, you are safe to throw it away after you are finished.
