Online Search Engines Lift Cover Of Privacy
Rican writes "MSNBC has an interesting article about how 'Googledorks' are using the powerful search engine to do searches across the web for sensitive and/or private information. Some of this information includes 'Medical records, bank account numbers, students' grades, and the docking locations of 804 U.S. Navy ships, submarines and destroyers.'"
The worst example.. (Score:5, Informative)
Nothing new (Score:4, Informative)
Re:Kazaa and Gnutella are cooler (Score:2, Informative)
Then again I don't have a WP that'll run those scripts.
Re:FUD Story to pump MSN Search? (Score:3, Informative)
Re:Um. (Score:5, Informative)
.htaccess anyone?
That, along with an appropriate robots.txt file, should be all you would need to prevent a crawl, right?
Re:Um. (Score:5, Informative)
http://yoursite.com/temporary/hidden/dontreadth
And it is not linked to ever.
I realize this is redundant, and you were likely trolling, but Google will leave you right the fuck alone, so long as you put another little file at:
http://yoursite.com/robots.txt
That contains the text:
User-agent: *
Disallow: /
I realize this is opt-out rather than opt-in, but there's just one place you have to opt, and there isn't another way that Google could possibly do their job. Everybody else seems to understand that the internet is a publicly accessible network.
So who's to blame? You. You put a sensitive document in a publicly accessible location on the internet, and took no precautions to keep it secure. Not linking to it is not a precaution.
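If you want to see what a given robots.txt actually tells a well-behaved crawler, here is a minimal sketch using Python's standard urllib.robotparser. The helper name `allowed` and the example.com URL are made up for illustration. The detail to watch is that an empty `Disallow:` permits everything, while `Disallow: /` blocks the whole site.

```python
from urllib.robotparser import RobotFileParser


def allowed(robots_txt: str, url: str, agent: str = "*") -> bool:
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# "Disallow: /" matches every path, so compliant crawlers stay out entirely.
BLOCK_ALL = "User-agent: *\nDisallow: /"
# An empty Disallow value means nothing is disallowed.
ALLOW_ALL = "User-agent: *\nDisallow:"

# Hypothetical URL, for illustration only.
URL = "http://example.com/temporary/hidden/page.html"
```

This only models crawlers that honor robots.txt; it does nothing against a client that ignores the file, which is why it is not a substitute for real access control.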
Re:Why Google? (Score:5, Informative)
2) This is an article from MSN. This information was available long before Google, but it is, at the very least, curious to see this sort of article from Microsoft when they have been going to the press lately about how Microsoft intends to develop their own search technology...
Re:Kazaa and Gnutella are cooler (Score:5, Informative)
Other examples are ".dbx", the file name extension for mail folders in Outlook Express. Or ".pwl", the Windows 9x system password file (supposedly easily crackable with the correct tool).
Unfortunately, there are clueless users who share their whole hard drive. File sharing programs have, however, gotten better at discouraging or preventing users from doing this.
What I like (Score:5, Informative)
What I like to do is go on gnutella or kazaa and search for "DSN" or one of a number of similar prefixes. Why? Because most digital cameras save their files in a specific hardwired format, and the kind of people who leave their entire hard drive shared on kazaa are the kind of people who don't rename their digital cameras.
You can find the most random, interesting, occasionally personal shit that way.
I'm trying to remember the other common prefixes besides DSN and failing.
-- Super ugly ultraman
Re:Hard to hide (Score:2, Informative)
"""
one of the central tenets of computer network security: If it is connected to the Internet, it can be accessed
"""
That's not one of the central tenets of computer network security.
If it's not connected to the internet, it cannot be accessed, but that doesn't imply what you've said.
If it's connected to the internet, and there's a daemon which answers requests with the information requested, then it can be accessed. There's a subtle difference though - namely the daemon which answers the requests. Without that there's no access, and there can never be any access.
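The daemon point can be tried on a single machine. This is a rough sketch, assuming a loopback interface is available; the names `can_connect` and `demo` are made up for illustration, and a bare listening socket stands in for a real daemon. The same port is reachable while something is listening and refuses connections once nothing is.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if something accepts a TCP connection on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or otherwise unreachable.
        return False


def demo():
    """Probe one loopback port with and without a listener behind it."""
    # Stand-in "daemon": a listening socket on a free loopback port.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    port = listener.getsockname()[1]

    with_daemon = can_connect("127.0.0.1", port)
    listener.close()  # the "daemon" goes away
    without_daemon = can_connect("127.0.0.1", port)
    return with_daemon, without_daemon
```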
YAW.
Re:Um. (Score:2, Informative)
Get a clue (Score:5, Informative)
Noindex (Score:1, Informative)
Enough of the bullshit! (Score:4, Informative)
Re:Hardc0re hax0r. (Score:2, Informative)
It appears to be a buzzword that Johnny Long just kinda made up. I used Google to "hack" away and find his website: http://johnny.ihackstuff.com/ [ihackstuff.com]
It appears his definition of googledorking (?) is not just finding private info, but anything wacky/weird/different; private info is just one of those things.
Do we now call it g00g|3?
Re:Uh-huh. (Score:5, Informative)
> existence of that page get from Opera to Google such that it
> could pin-point (not crawl) that page?
Opera submits the URLs its users browse to Google when ad support is turned on.
http://www.opera.com/adsupport/ [opera.com]
From that page:
--------
What is the connection between the Web page and the relevant ad displayed by Google?
Opera's interaction with the Google ad system:
The Opera browser sends Google the URL of the web page you are visiting and your IP address (with the exceptions Opera filters out -- see below)
--------
Exceptions are https, forms, passwords, cgi, and non-http URLs.
As an example from my Apache log file last night, when I gave a friend a URL to a photo:
It's surprising how many Opera users will deny this happens, despite the evidence. That's a 5 minute delay; Google is pretty quick with its crawling.
Personally, I don't mind. I put things up in my temporary directory and pull them down fairly soon after. I know nothing is secure if it's just an unprotected URL, so I'm not worried like the grandparent poster. However, Opera does send URLs to Google, and Google does come back and check them out.
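To check your own logs for the same pattern, something like the following would do. It is a minimal sketch that assumes Apache's "combined" log format; the regular expression and the function name `hits_for_path` are mine, not anything from Apache. It lists who requested a given path, in log order, so a visit from a browser followed shortly by a crawler stands out.

```python
import re

# Apache "combined" format:
# host ident user [time] "request" status bytes "referer" "user-agent"
COMBINED = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) \S+ '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def hits_for_path(lines, path):
    """Return (time, host, agent) for each request whose URL equals `path`."""
    hits = []
    for line in lines:
        match = COMBINED.match(line)
        if not match:
            continue  # skip lines that are not in combined format
        parts = match.group("request").split()
        # A request line looks like: METHOD URL PROTOCOL
        if len(parts) >= 2 and parts[1] == path:
            hits.append((match.group("time"), match.group("host"), match.group("agent")))
    return hits
```

Feed it the lines of your access log and the path of the file you shared, then compare the timestamps and user agents of the resulting entries.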
Re:Now to use it for good (Score:3, Informative)
Re:Enough of the bullshit! (Score:5, Informative)
Opera's interaction with the Google ad system:
The Opera browser sends Google the URL of the web page you are visiting and your IP address (with the exceptions Opera filters out -- see below)
IP address, to better target the ads
is on that page
and the Web page accessed
finding out whether something has leaked about you (Score:3, Informative)
Keep in mind, however, that Google queries are not encrypted and are not guaranteed to be private or secure, so, for your search, don't use the full SSN or anything else that shouldn't be disclosed.
It's quite clear if you actually read properly (Score:3, Informative)
Comment removed (Score:4, Informative)
Military Records (Score:3, Informative)
Some clues for you (Score:4, Informative)
b) Opera always has the name "Opera" in its UA string, even when masquerading as IE.
c) Mediapartners-google doesn't feed the Google search engine. It is only used for Google adverts.
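Points b) and c) are easy to apply when reading logs. Here is a rough sketch of a user-agent classifier; the function name and the category labels are made up, and the "googlebot" check is my addition (Googlebot is Google's search crawler, as opposed to the Mediapartners-Google ad crawler). It relies only on substring matching, which is the assumption point b) makes about Opera.

```python
def classify_agent(user_agent: str) -> str:
    """Sort a User-Agent string into a coarse category by substring."""
    ua = user_agent.lower()
    if "mediapartners-google" in ua:
        # Ad-targeting crawler; per point c) it does not feed the search index.
        return "google-ads-crawler"
    if "googlebot" in ua:
        return "google-search-crawler"
    if "opera" in ua:
        # Checked before anything IE-related: Opera keeps "Opera" in the
        # string even when it otherwise identifies itself as MSIE.
        return "opera"
    return "other"
```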
Re:Could happen to you (Score:2, Informative)
Google does retain information on search queries in some form. If you go and check the Google Zeitgeist (Weekly Version [google.com] & the Annual Version [google.com]) they have statistics on most searched terms, time graphs showing, for example, the spike in search queries after the California Quake, and lots of other interesting information.
For the week ending February 2, the top search terms in the US were:
Re:Plagiarism (Score:1, Informative)
Re:There's good stuff out there not on Google (Score:3, Informative)
User-agent: *
Disallow: /Archives
Disallow: /Archives/bin
Disallow: /Archives/dev
Disallow: /Archives/etc
Disallow: /Archives/ftp
Disallow: /Archives/gopher
Disallow: /Archives/tmp
Disallow: /Archives/usr
Disallow: /cgi-bin
Disallow: /bin
Disallow: /oursite/previews
Re:Fuck that shit (Score:2, Informative)
Re:Cited MSNNBC web page severely crippled (Score:2, Informative)
if they put it there themselves, yes, but... (Score:3, Informative)
Re:Fuck that shit (Score:3, Informative)
Sure, you could do
Disallow:
Disallow:
Or, if you want to be simpler, you could just do
Disallow:
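For what it's worth, Disallow rules are prefix matches, so one rule on a parent directory also covers everything beneath it. Here is a minimal sketch with Python's urllib.robotparser. The /Archives paths are borrowed from the robots.txt quoted earlier in the thread, example.com is a placeholder host, and the helper name `blocked` is made up, so treat the specific rules as an assumption about what is meant here.

```python
from urllib.robotparser import RobotFileParser


def blocked(rules: str, paths, agent: str = "*"):
    """Return the subset of `paths` that `agent` may not fetch under `rules`."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    # example.com is a placeholder host; only the path matters for matching.
    return [p for p in paths if not parser.can_fetch(agent, "http://example.com" + p)]


# One rule per subdirectory: anything not listed stays crawlable.
PER_DIRECTORY = "User-agent: *\nDisallow: /Archives/bin\nDisallow: /Archives/dev"
# One rule on the parent: every path starting with /Archives is covered.
PARENT_ONLY = "User-agent: *\nDisallow: /Archives"

PATHS = ["/Archives/bin/x", "/Archives/dev/y", "/Archives/etc/z", "/oursite/index.html"]
```

With the per-directory rules, /Archives/etc/z is still fetchable; with the single parent rule it is not.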