News for nerds, stuff that matters

Natural Language Processing for State Security

Posted by Zonk on Sun Sep 24, 2006 08:39 PM
from the your-ipod-can-tell-what-you-mean dept.
Roland Piquepaille writes "Obviously, computers can't have an opinion. What computers are very good at, though, is scanning through text to deduct human opinions from factual information. This branch of natural-language processing (NLP) is called 'information extraction' and is used for sorting facts and opinions for Homeland Security. Right now, a consortium of three universities is for the U.S. Department of Homeland Security (DHS) which doesn't have enough in-house expertise in NLP. Read more for additional references and a diagram showing how information extraction is used."
This discussion has been archived. No new comments can be posted.
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.

Natural Language Processing for State Security: 50 Comments

  • tinfoil hat... or is it? (Score:5, Interesting)

    by macadamia_harold (947445) on Sunday September 24 2006, @08:46PM (#16179939) Homepage
    What comptuers are very good at, though, is scanning through text to deduct human opinions from factual information. This branch of natural-language processing (NLP) is called 'information extraction' and is used for sorting facts and opinions for Homeland Security.

    Yeah, because we need AT&T giving wide-scale, undocumented wiretaps to the NSA, who use voice recognition to generate transcripts of everyone's phone calls, and then DHS can run NLP on those transcripts to compile a list of "persons of interest", who are then automatically added to the TSA no-fly lists.

    Yeah, I can envision the future, and the future sucks.
    • Re: (Score:3, Interesting)

      Sorting facts from opinion by use of language, how amazingly pointless and stupid. Now let's see if the program can sort BS facts from real facts. This just seems like another scheme cooked up by incompetent political appointees, who don't have any idea abo
    • Blow the dust off all those AI research papers left over from the 1970's/early 80s.

      Of course universities will be scrambling to help. Big dollars, imprecise goals..... and many of the professors would have done research in related fields.

      • Re: (Score:3, Insightful)

        Especially since the system, whilst it will have some quite interesting applications and the research will yield interesting results, can't work. A computer cannot distinguish between a fact and a lie told as fact...garbage in, and all that.

        Let me rephrase
  • Moo (Score:5, Funny)

    by Chacham (981) on Sunday September 24 2006, @08:47PM (#16179945) Homepage Journal
    What comptuers are very good at, though,

    .... is spell-checking.....

    ....something, apparently, the editors are not good at....
    • Re:Moo (Score:4, Funny)

      by ceoyoyo (59147) on Sunday September 24 2006, @09:06PM (#16180103)
      Maybe Roland had a stroke over the weekend. Sure he's self serving, but at least he's usually literate. That sentence about the universities didn't even make sense!
    • That doesn't stop the really determined idiot though. Oh no.

      I have a spelling checker,
      It came with my PC.
      It plane lee marks four my revue
      Mis
  • Sigh. (Score:5, Insightful)

    by Renraku (518261) on Sunday September 24 2006, @08:49PM (#16179965) Homepage
    The slippery slope to being automatically flagged as someone to watch out for. No human control in the process, but one day when you go to apply for a loan or get your drivers' licence renewed, you might get a surprise.
      • Re:Sigh. (Score:5, Insightful)

        by BiggerIsBetter (682164) <richard@@@vems...co...nz> on Sunday September 24 2006, @11:06PM (#16180967) Homepage
        I'd rather have a computer flagging me than a human who may judge me by the color of my skin

        If they can flag based on what you said, I'm sure they can flag you based on the skin tone in the photo on your driver's license or passport too. Or just by your family history or name. Or where you live. Or where your parents live.

        Anyways, odds are the computer won't be doing the flagging per se, it'll just be following the parameters and policies entered by those humans controlling it. I'm not sure they'd trust "national security" to a self-learning neural net without some sort of bias in it.
  • Number 891224 (Score:5, Insightful)

    by bky1701 (979071) on Sunday September 24 2006, @08:52PM (#16179981) Homepage
    Number 891224 has expressed a dislike of Emperor Bush, incident reported to FBI and Homeland Security.
    • Re: (Score:3, Funny)

      Number 979071 has expressed an interest in Emperors, incident reported to George Lucas
  • ... I want to see this functionality in Internet search engines!
    • Re: (Score:2)

      We [technorati.com] are slowly working towards that, but we are not at the point where this can be done both fast and well. Unless you have FBI/NSA/CIA/government resources, of course.
  • Alias-i's ThreatTracker (Score:5, Interesting)

    by otisg (92803) on Sunday September 24 2006, @08:56PM (#16180011) Homepage Journal
    There is a great little company in Brooklyn, NY called Alias-i [alias-i.com]. Some years ago they built this interesting "tool" called....guess....ThreatTracker [upenn.edu]. Information Extraction, Named Entity Recognition and other interesting stuff, if you are into this.
    No, I don't work for them, but their LingPipe toolkit has some cooooool stuff.
  • really? (Score:2, Insightful)

    "Obviously, computers can't have an opinion. What comptuers are very good at, though, is scanning through text to deduct human opinions from factual information."

    I would say that comptuers (sic) aren't very good at deducting human opinions yet. They _may

  • Is that it could be used to train a true AI (uh... not "artificial insemination"... the other kind). Just what do you think you're doing, Dave?
  • A really difficult problem (Score:5, Insightful)

    by MarkWatson (189759) on Sunday September 24 2006, @09:04PM (#16180089) Homepage
    I have, in aggregate, spent about 3 1/2 years in the last 20 years working on using NLP for semantic information extraction.

    Possible? Yes, given very narrow domains of discourse and lots of work.
    • That's why they're problems and not inconveniences.
      • Re: (Score:2)

        Some problems are more difficult to solve than others.

        Can we have a competition for inane comments?
    • Re: (Score:3, Interesting)

      I agree with the 'lots of work' part, but believe it is possible to achieve good results on wider domains outside of toy worlds. One key - from my own research - is to use (massive) databases of culture-related knowledge (belief systems) to build alternati
    • Re:A really difficult problem (Score:4, Interesting)

      by constantnormal (512494) on Sunday September 24 2006, @11:05PM (#16180961)
      It's the "narrow domains" that is the crux of the problem.

      When used successfully over said "narrow domains", the human tendency (especially that set of humanity which makes the high-level choices for groups and organizations) will be to expand the domain in hopes of applying it to ever greater numbers of items.

      Of course, as the search domain is expanded, the effectiveness of the results declines, with no warning to the clueless idiots driving the search. False positives eventually exceed true positives by greater and greater margins.

      In the end, the strategy collapses, as a great many victims are shown to be wrongly targeted -- but until that point, the system does a LOT more harm than good.

      Thank Goodness our leaders are such wise and contemplative souls that they would never, ever misuse such a tool.
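A rough illustration of the base-rate arithmetic behind the point above, not anything taken from the linked articles. It assumes a fixed number of genuine targets, a fixed detection rate, and a fixed false-positive rate, then widens the screened population. The function name and every number are hypothetical, picked only to show the trend.

```python
def flag_counts(population, true_targets, recall, false_positive_rate):
    """Return (true_positives, false_positives, precision) for one screening pass.

    All parameters are illustrative assumptions, not measured values.
    """
    innocents = population - true_targets
    true_positives = true_targets * recall
    false_positives = innocents * false_positive_rate
    flagged = true_positives + false_positives
    # Precision: the share of flagged people who are genuine targets.
    precision = true_positives / flagged if flagged else 0.0
    return true_positives, false_positives, precision


# Hypothetical scenario: 100 genuine targets, 90% recall, 1% false-positive rate.
for population in (1_000, 100_000, 10_000_000):
    tp, fp, precision = flag_counts(population, 100, 0.90, 0.01)
    print(f"{population:>10,} screened: {tp:.0f} true hits, "
          f"{fp:,.0f} false hits, precision {precision:.2%}")
```

Under these assumptions the classifier itself never gets worse. Only the domain grows, and the false hits swamp the true ones anyway.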
  • A boon to research (Score:5, Interesting)

    by JanneM (7445) on Sunday September 24 2006, @09:04PM (#16180093) Homepage
    It's clear "national security" has become what "the internet" or "the cold war" were in their prime: an all-purpose catchphrase to get funding for any research whatsoever, no matter how tenuously connected.

    Look at the two project proposals below and imagine which one will have an easier time getting funding:

    "An epistemological metaanalysis of object-subject interrelations and conflict avoidance in Beowulf"

    or

    "An epistemological metaanalysis of object-subject interrelations and conflict avoidance in Beowulf to better understand threats to NATIONAL SECURITY"

    • Re: (Score:2)

      A bit of an unfair example (any true slashdotter care to decode the first bit?), but a good point nonetheless. Of course, CHILD PORNOGRAPHY would have worked better, but that's beside the point. Giving an explicit reason, no matter how flawed/shitty/sucku
      • Re: (Score:3, Insightful)

        With all due respect, that is inaccurate.

        DARPA, Defense Advanced Research Projects Agency, is a gigantic agency that funds a large proportion of academic research. The political hot button of child pornography, on the other hand, has no large funding sour
  • by Anonymous Coward on Sunday September 24 2006, @09:06PM (#16180105)
    Wow, thanks for another waste of time. And you people stop linking to his blog in comments, he exists for nothing but ad clicks.
    • Re: (Score:2, Insightful)

      by Anonymous Coward
      Not to mention, he linked to almost the EXACT same blogs he did last night in his Hydrogen junk article, tsk tsk Roland. You can mod us offtopic all you want man, just checking the last article proves he is scamming Slashdot (and its users) for ad click
  • Man... (Score:2, Insightful)

    There goes a promising career path. I know any technology can be used for good or for evil, but in today's political climate, it seems especially irresponsible to be aiding and abetting what may wind up becoming the pretext for torture of some 16 year old
  • well (Score:2)

    Of course, stuff that is stated as fact could be opinion, conveniently made to look like fact. Hence Orwellian doublespeak. Given where AI is at present, I would say that such an algorithm would not really be able to flag doublespeak.
  • Sounds like GALE (Score:5, Interesting)

    by Dr. Eggman (932300) on Sunday September 24 2006, @09:23PM (#16180233)
    Sounds kind of like DARPA's Information Processing Technology Office's GALE [darpa.mil] Program:

    " The goal of the GALE (Global Autonomous Language Exploitation) program is to develop and apply computer software technologies to absorb, analyze and interpret huge volumes of speech and text in multiple languages, eliminating the need for linguists and analysts and automatically providing relevant, distilled actionable information to military command and personnel in a timely fashion. Automatic processing "engines" will convert and distill the data, delivering pertinent, consolidated information in easy-to-understand forms to military personnel and monolingual English-speaking analysts in response to direct or implicit requests."
  • abuse? (Score:3, Insightful)

    by mr100percent (57156) on Sunday September 24 2006, @09:30PM (#16180297) Homepage Journal
    Why do I immediately assume this will be abused?

    DHS officer: Mr. 100%, I'm afraid we'll have to take you into custody. Our information extraction search on your blog concluded you are anti-American.
    Me: From my blog? Is this about my criticism of the Iraq war?
    DHS officer: Our results are classified, but please accompany us to GTMO for further "information extraction" to confirm the results of our investigation...

    Ok, I know I'm taking a very cynical view here and that's pretty full of FUD, but why else does State security need this? Is this for them to monitor every chat room and blog?
  • Aha! (Score:3, Funny)

    by suv4x4 (956391) on Sunday September 24 2006, @09:46PM (#16180435)
    Obviously, computers can't have an opinion.

    Welcome the new opinion-based CAPTCHA-s!
    • Re: (Score:2)

      "We're sorry, your liberal slant is too far left to access this site."
      -- Fox News
  • NLP (Score:2)

    I was a little confused how they used the link between human brain activity on different wave lengths to extract opinion from written text, but Neuro-linguistic programming is apparently not the most popular term with NLP as an acronym.

    This could be a doub
  • Can do or will do? (Score:5, Insightful)

    by Dan East (318230) on Sunday September 24 2006, @10:08PM (#16180581) Homepage
    What comptuers are very good at, though, is scanning through text to deduct human opinions from factual information.

    Funny, because neither of the articles states that. In fact, they don't even say that software can do that at all yet: "A new research program ... aims to teach computers to scan through text and sort opinion from fact." Or, "We're interested in seeing how we would extract information about opinions."

    So yeah, it would be nice if they could sort opinions from facts. While they're at it, why don't they just recognize lies from truth too, because wouldn't that be doing the exact same thing? Then we can just run statements made by people suspected of committing a crime through the software, which can then sort out all the facts from the opinions, and we'll no longer need judges, juries or attorneys.

    Roland, next time save yourself some time and just make the whole freaking thing up from scratch.

    Dan East
  • one thing (Score:2, Funny)

    Another thing Roland's computer is not very good at is spell checking his posts!
  • screw national security (Score:3, Interesting)

    by argoff (142580) * on Sunday September 24 2006, @10:28PM (#16180723)
    Screw national security, how about search, how about business and commerce, how about cultural exchange and global interaction? The chances of me getting attacked by a terrorist are less than those of getting hit by lightning; the chances of dealing with foreign cultures, foreign business and commerce are rapidly approaching 100%. There are 4 billion people out there who have the potential to mutually benefit from clean communication. Please don't patronize me. I'm not too worried about getting nailed by terrorists, but am very bothered by the possibility of having my individual liberties nickeled and dimed to death.
  • "Right now, a consortium of three universities is for the U.S. Department of Homeland Security (DHS) which doesn't have enough in-house expertise in NLP."

    If one of these NLP "expert" systems can extract fact or opinion from that sentence, we should delete
  • Knowing the general quality of the average programmer, it stands to reason that this code will only be validated to function in the usual case; thus, the 3l33t coder immediately realizes that simp1e substitutions present an initial defense against the naiv
  • This story fits in the broader context of a developing "surveillance state" in the USA. Forget about wiretaps and such, I just want to focus on stuff that is out in plain view.

    The 4th amendment says:
    The right of the people to be secure in their persons, hous
  • Bushed (Score:3, Funny)

    by Anne Thwacks (531696) on Monday September 25 2006, @02:53AM (#16182213)
    You mean it was not the computers that voted for George W Bush? Then who the hell did?
    • Re: (Score:2)

      It's supposed to be a way to identify an article based on keywords. It's not an opinion poll. Keywords like "yes", "no", and "duh", are completely irrelevant!

      Whatever the original intention, users have noticed that if enough of them use a particular tag,

      • Re: (Score:2)

        Personally, I find a "yes" or "no" tag somewhat more informative than a list of the words in the story title... (which seems to be the usual situation)

        Really? How is it informative when the same, single article has the following associated tags: "Yes", "N
        • Really? How is it informative when the same, single article has the following associated tags: "Yes", "No", "Maybe", "duh"

          I hope you haven't been relying on those "Yes" or "No" tags to tell you if a story is right or wrong. The point of the tag system is t
          • Re: (Score:2)

            >> Really? How is it informative when the same, single article has the following associated tags: "Yes", "No", "Maybe", "duh"

            > Good point - if it has all of them at once, it's probably a waste of time. Although it could be a good indicator of whet
      • Re: (Score:2)

        Not his site. This time(?) it's actually quite legit, not a blog and no advertising.
    • It's important that we gather our intelligence from computers, because computers cannot form an opinion. If they could, they wouldn't help us for long, or they'd start lying to us.

      That's the basis of our overreliance on technology in intelligence gathering