
Looking for Answers in the Age of Search

Posted by timothy
from the wrong-answers-are-the-best-kind dept.
prostoalex writes "James Fallows, in a New York Times article, notices that search engines are getting pretty good at providing information for simple keyword-based queries. However, when it comes to the actual information, such as finding the necessary data and statistics, they're not doing a great job. The article talks about the NSA- and CIA-sponsored Aquaint project that aims to deliver answers to questions that might be expressed with a variety of keywords, and need to be 'understood' by the search engine before providing the answer."
This discussion has been archived. No new comments can be posted.

  • No matter which search engine I use, still none of them can help me search for my missing car keys or missing left socks.
  • by Colin Smith (2679) on Sunday June 12, 2005 @06:57PM (#12797936)
    Ho yes, there's a good idea. Give Google the ability to understand.

  • by Chowser (888973) on Sunday June 12, 2005 @06:59PM (#12797942)
    One thing missed in so many search engines now is finding information on a particular company quickly and efficiently. Often when I type a particular company into Google or another search engine, I get a bunch of other hits before the actual company itself. Bigger companies often come up as the first hit (e.g., Apple, Dell, Canon). Try searching for a lesser-known company, though, and you'll encounter a lot of crap first. The company listing should always come first.
  • by xyzzy (10685) on Sunday June 12, 2005 @07:02PM (#12797957) Homepage
  • by rah1420 (234198) <rah1420@gmail.com> on Sunday June 12, 2005 @07:02PM (#12797958)
    ...like the Semantic Web? [semanticweb.org]

    No, I don't know why it's being relaunched. My guess is that it's probably one of the answers that we are looking for in the age of search that didn't quite cut it. But isn't that what all these different meta-searches are talking about? The ability to get semantic meaning imbued into the web?
    • by TuringTest (533084) on Sunday June 12, 2005 @07:25PM (#12798074) Journal
      Yes, the Semantic Web goals are exactly the idea stated in the article.

      Regular Slashdot crew don't get it because of the overly complicated status of the current S.W. standards, but in the future some lightweight implementation of the Semantic Web idea will take off and we will have search engines that somehow "answer questions" instead of just "finding words".
      • But right now, even seasoned web developers/programmers don't want to go near it because of things like OWL (it sure took me a long time to figure it out, and I have a background in the logic and technology that it's built on). The first step towards making the Semantic Web (which is really a great idea) usable is to make creating a semantic webpage easier. You can't just say "put up with it now because it will get easier later" - that's not how to get widespread adoption.
        • Have you heard of "lowercase semantic web"? Things like del.icio.us, Flickr, XHTML Friends Network and other open-API lightweight services will be the first tools that will spread the idea of easy-to-use, always-available semantic services.
    • These sorts of things always reduce to the problem of natural language processing and recognition. The question that is really being asked is "Can computers understand human language and act accordingly?" That is not new; it has been the topic of research for decades. To answer a natural language query, the system has to have common sense among other things (stuff that every one of us knows and we know that others know it). When we ask the computer a question most of the time we assume the computer has the same
      • by Anonymous Coward
        To answer a natural language query, the system has to have common sense among other things (stuff that every one of us knows and we know that others know it).

        The computer only has to appear to have common sense; it doesn't actually have to have common sense. This is the key to how many Aquaint systems work.

        The great thing about the web is that there are billions of web pages out there, many of them created by humans, and many of those humans have common sense. If you're looking for an answer, there'

  • Cluster Searching (Score:4, Informative)

    by LordMyren (15499) on Sunday June 12, 2005 @07:08PM (#12797989) Homepage
    Google really lacks at filtering out noise. I was looking for Gran Turismo tuning stuff yesterday. Gran Turismo tuning -"release date" -cheats -faq, &c &c &c. The list of restrictions to filter out noise kept getting bigger and bigger, but it was still just the big agencies that were getting hits, nothing about the game itself.

    Clusty [clusty.com] on the other hand is no sucker for a press release. I find it's much smarter at locating actual content.

    Myren
    • I don't know ... I googled for "go fuck yourself" as a test when I read the article, and it returned this [amishrakefight.org], this [phaeba.net], this [amishrakefight.org]

      Seems to understand pretty well what the average /.er is looking for ...

  • I agree (Score:5, Funny)

    by nxtr (813179) on Sunday June 12, 2005 @07:12PM (#12798012)
    Just try to google a website for these guys [foundrymusic.com]!
  • Homunculus (Score:3, Interesting)

    by headkase (533448) on Sunday June 12, 2005 @07:13PM (#12798016)
    I know where I'd like to see this first: a digital librarian for Wikipedia. An agent that would recommend articles based on your preferences, and maybe store the articles in some language-neutral format from which articles could be rendered into a target language, or into which they could be parsed from a source language. Too bad nobody's publicly demonstrated anything close to the level of machine intelligence that would be required to do it.
  • by Anonymous Coward
    ...she's the only one who does...
  • by oneiros27 (46144) on Sunday June 12, 2005 @07:27PM (#12798085) Homepage
    Than finding 50+ people asking the same question you are, and not a single answer. (or even only one person asking the question, but because the mailing list or newsgroup was being archived on more than one website, you find the same question over and over again).

    It's even more annoying when you had the same question a couple of months before, and had found the answer, but can't remember what the answer was, where you found the answer, or what search terms you had used. (And it's even worse if that site has gone down in its rankings, and something else with people asking the question, but no answer, is now ranked higher.)
    • I found something slightly more annoying..

      Asking a question, only to find that the only answers you get are those 50+ people being told to "Go Google for it. Sheesh!" .... and then not a single answer.

      I run into that on occasion, like with my current mod_perl problems. :|
    • Well, there's one thing more annoying than that: finding the same question, no answer, and a message by the original poster that he/she has found out how/what/when/where, without posting the solution.
  • We use Mindmeld (Score:4, Informative)

    by MichaelPenne (605299) on Sunday June 12, 2005 @07:41PM (#12798151) Homepage
    for an 'intelligent' FAQ.

    It uses more of a human-based system: it 'learns' as folks type in different questions (and versions of the same question) and tell it whether the answers it gives are helpful. As users 'teach' it, it gets better at providing relevant results to natural language queries.

    Worth a look:
    http://mindmeld.sourceforge.net/mmsf/index.php [sourceforge.net]
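
A feedback loop like the one described can be sketched in a few lines of Python. To be clear, this is not Mindmeld's actual algorithm or API (neither is spelled out in this thread); the `FeedbackFaq` class, its method names, and the scoring rule (word overlap plus net "helpful" votes) are all made up for illustration.

```python
from collections import defaultdict

class FeedbackFaq:
    """Toy FAQ that ranks answers by word overlap plus user 'helpful' votes."""

    def __init__(self):
        self.answers = {}                  # answer id -> answer text
        self.keywords = defaultdict(set)   # answer id -> words from its seed question
        self.votes = defaultdict(int)      # (word, answer id) -> net helpful votes

    def add(self, answer_id, question, answer):
        self.answers[answer_id] = answer
        self.keywords[answer_id].update(question.lower().split())

    def score(self, question, answer_id):
        words = set(question.lower().split())
        overlap = len(words & self.keywords[answer_id])
        learned = sum(self.votes[(w, answer_id)] for w in words)
        return overlap + learned

    def ask(self, question):
        """Return the id of the best-scoring answer, or None if the FAQ is empty."""
        if not self.answers:
            return None
        return max(self.answers, key=lambda a: self.score(question, a))

    def feedback(self, question, answer_id, helpful):
        """Nudge every word of the question toward or away from this answer."""
        delta = 1 if helpful else -1
        for w in set(question.lower().split()):
            self.votes[(w, answer_id)] += delta


faq = FeedbackFaq()
faq.add("install", "how do I install the software", "Run the installer.")
faq.add("reset", "how do I reset my password", "Use the reset link.")

question = "cannot get into the software account"
first_guess = faq.ask(question)           # word overlap alone picks "install"
faq.feedback(question, first_guess, helpful=False)
faq.feedback(question, "reset", helpful=True)
second_guess = faq.ask(question)          # votes now favor "reset"
print(first_guess, second_guess)
```

The point is only that the votes, not the wording of the original FAQ entry, end up deciding which answer a rephrased question gets.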
  • Comedy. (Score:3, Funny)

    by Kerago (811845) on Sunday June 12, 2005 @07:48PM (#12798186) Homepage
    What a surprise. http://www.ai.sri.com/search/ [sri.com]
  • by mister_llah (891540) on Sunday June 12, 2005 @07:56PM (#12798229) Homepage Journal
    1. Google needs to find and kill keyword-filled spam/malware/whatever pages. Not just remove them from their search list, but murder the people who started them.

    2. Google needs to filter responses based on ad content. If there are a ton of ads, chances are the site is bupkis and its priority should be massively downgraded.

    3. Google needs to filter based on ownership by holding companies. These cybersquatters should be downgraded in response priority and their pets should be sterilized or neutered to control the pet population.

    4. Google needs to get back in the kitchen and make me a pie.

    ===

    Fix those things, then perhaps we should worry about statistical analysis... (but hey, that's just me)

    IMHO, if you want accurate stats and information, go to a library...
  • and this? (Score:1, Interesting)

    by Anonymous Coward
    http://mindset.research.yahoo.com/ [yahoo.com]

    seems to be a good crap filter
  • by suitepotato (863945) on Sunday June 12, 2005 @07:59PM (#12798258)
    Can you say Dogpile?

    Google is the number one search agent for me as more often than not with a short list of carefully chosen starting terms, and a little refinement from sleuthing, I can find what I need pretty quickly.

    Do the search engines have to be so smart they find what we meant to find or even what we think we meant to find as opposed to what we literally asked for? They're tools, like library cards, not servants there to do our work for us and stop us from thinking about the search process. Are we complaining because this all isn't as brainless as AOL?
  • I think that there will always be a need for knowledge specialists (professional researchers), whose job it is to develop useful results from requests for information, using whatever tools are at hand.

    Tools like Google and MSN Search are not the only thing you need to find information. There are still places for other information, and 'because Google said so' is not a valid reason for accepting information as relevant, or factual.

    Although these tools will continue to improve, the application of wisdom will still require human input to make the results useful.

  • by houghi (78078) on Sunday June 12, 2005 @08:03PM (#12798276)
    is the ability to filter out certain types of sites, like the sites that are web interfaces to Usenet or sites that sell stuff.

    When I now look for a digital camera, I get hundreds of sites trying to sell me one, then a lot of sites that talk about it, before I get to the maker's homepage. An example [google.com]. The page I am looking for is this one [concord-camera.com].
    Vivisimo [vivisimo.com] makes it a bit easier, but not completely.

    A9 also failed to produce the correct page.
    • I checked into this. It turns out that the product name is 5062AF, and the page you wanted only has "5062AF" on the page--not 5062 by itself surrounded by whitespace. If you do the query [concord 5062af] then the page that you wanted shows up at #2, after a PC Magazine review, which would be a pretty solid result too.

      It's interesting to think about indexing 5062af as 5062 as well, but some searches would probably become less precise because we added in more general matches.
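
The trade-off can be shown with a toy indexer. This is only a sketch and not how any real search engine tokenizes; `index_terms`, `matches`, and the `expand_numeric_prefix` flag are hypothetical names, and the second page is an invented example.

```python
import re

def index_terms(text, expand_numeric_prefix=False):
    """Return the set of lowercase terms a toy indexer would store for `text`."""
    terms = set()
    for token in re.findall(r"[A-Za-z0-9]+", text):
        token = token.lower()
        terms.add(token)
        if expand_numeric_prefix:
            # Hypothetical rule: also index the leading digits of codes like "5062af".
            match = re.match(r"(\d+)[a-z]+$", token)
            if match:
                terms.add(match.group(1))
    return terms

def matches(query, page_text, **kwargs):
    """A page matches when every query term appears among its index terms."""
    page_terms = index_terms(page_text, **kwargs)
    return all(term.lower() in page_terms for term in query.split())

product_page = "Concord 5062AF digital camera"
unrelated_page = "Part number 5062XL replacement gasket"   # invented example page

print(matches("concord 5062", product_page))                              # exact tokens only
print(matches("concord 5062", product_page, expand_numeric_prefix=True))  # prefix also indexed
print(matches("5062", unrelated_page, expand_numeric_prefix=True))        # extra, looser match
```

With the prefix rule on, the camera page is found for [concord 5062], but a bare [5062] also starts matching every page that carries any code beginning with those digits, which is the loss of precision mentioned above.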
  • <whine>The stats aren't online, and I can't be arsed to go to the library or ring up someone for some help. Google suxx0rs!!!1!</whine>
    • Library stats are outdated. For example, go to any library and find me recent Google statistics. She (the librarian) will probably point you to a computer with internet access.
  • works surprisingly well. even better is their answer bar which is a simple download. i can alt click on any word in any program and i usually get the information that i am looking for. google usually gives me too much crap.
  • Data isn't information, information isn't knowledge, knowledge isn't wisdom.
  • Here's what I do: (Score:2, Interesting)

    by Hosiah (849792)
    As with *all* things technical when they get too popular for their own good, I actually find Google's hit quality to be going downhill. Now, I use multiple engines, but they're all converging on one standard, which will make them all mediocre in the future.

    If you run Linux, you have a decent tool-kit on hand to enhance search engine performance. Use lynx from the command line, with either the -source or -dump option, and pipe it through sed and such to filter it however you like. A recursive check of each
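
For anyone who would rather not write sed, the filtering step can be roughed out in Python. This sketch assumes the page text has already been dumped to plain text (for example with `lynx -dump`); the `filter_dump` function and the noise patterns are invented for illustration and are not the poster's actual script.

```python
import re

# Hypothetical noise patterns; swap in whatever clutters your own results.
NOISE = [r"\bbuy now\b", r"\bsponsored\b", r"\bpress release\b"]

def filter_dump(lines, patterns=NOISE):
    """Yield only the lines that match none of the noise patterns (case-insensitive)."""
    compiled = [re.compile(p, re.IGNORECASE) for p in patterns]
    for line in lines:
        if not any(rx.search(line) for rx in compiled):
            yield line

sample = [
    "Suspension tuning guide for beginners\n",
    "SPONSORED: cheap memory cards\n",
    "Official press release: new title announced\n",
    "Gear ratio settings explained\n",
]
print("".join(filter_dump(sample)), end="")
```

To use it on real output, pass `sys.stdin` to `filter_dump` and pipe the lynx dump into the script.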

    • Interesting ideas. That's a lot of work for the standard user but the programmer may attempt it. You could also use something like a spreadsheet or database viewer program that has an HTTP query feature. Some spreadsheets let you view, sort, and display data right inside the software without ever touching your web browser. But how do you actually visit and click the links from there on? Link enabled software?

      How 'bout some web browsers with better widgets built in to them? Gridsheets, sorting features, fi

  • Understanding is seldom achieved by anyone. How do we model it? At the very minimum, "understanding" implies a consistent context. This means that the engine needs to be tailored culturally and demographically to be of use. The CIA/FBI/paranoid right-wing fanatic context is probably fairly straightforward to model: one-dimensional definitions with a few weighted variants. But what about real people asking questions not related to the investigative context?
  • what the Symantec Web was all about?

  • If you have to ask, you won't get away with it.

    The insanely talented and successful people I know with extensive mods would never think to ask. They simply wouldn't take a job that required them to change their appearance.

    If that's not where you are in your career, well, suck it up.

    Everyone else in this thread babbling about how "unprofessional" and "childish" mods are, well, they can suck it up too. There are people out there who are good enough to come to work every day with eye patch, a jester'

  • yeah! i was using google to find a fish medicine for asthma - hyderabad website. but google did not return correctly the result. but yahoo did. google's page rank might move the real info pages to the back and may not be visible to the user. in that way, they are lacking.
  • The linked article states,

    "Recently, for example, I was trying to track the changes in California's spending on its schools. In the 1960's, when I was in public school there, the legend was that only Connecticut spent more per student than California did. Now, the legend is that only the likes of Louisiana and Mississippi spend less. Was either belief true? When I finally called an education expert on a Monday morning, she gave me the answer off the top of her head. (Answer: right in spirit, exaggerated in
  • Frankly I think that searching, and the results from searching, are limited by the searcher. If someone wants to search for something complex then they should learn to use the rules of the search engine. Using quotes properly (terms inside of quotes, e.g., "prime numbers"), or putting period marks at the end of words: they're simple rules and can completely change the way you search. But it's not in the search engine's interests to help, as the more searching gets done, the more advertising can be delivered
  • QuASM (Score:2, Informative)

    by mr. mulder (204001)
    I would have to disagree with the poster - significant research effort has been put into question answering and factual data retrieval from the web. Visit the University of Massachusetts Center for Intelligent Information Retrieval website [umass.edu]. For a more specific project, check out QuASM [acm.org].
