
New Method of Spam Filtering
Alephcat writes "A simple and easily implemented scheme for combating e-mail spam has been devised by two researchers in the United States. P. Oscar Boykin and Vwani Roychowdhury of the University of California, Los Angeles use their method to exploit the structure of social networks to quickly determine whether a given message comes from a friend or a spammer. The method works for only about half of all e-mails received - but in all of those cases, it sorts the mail into the right category. The article was published on Nature magazine's website earlier today."
My favorite filter (Score:2, Funny)
Re:My favorite filter (Score:3, Insightful)
How it works - clustering coefficients (Score:5, Informative)
From what I can make out, this system graphs correspondent pairs into correspondence maps. It notes that normal people all email each other and thus have dispersed graphs (high clustering coefficient), while spammers have a distinct pattern, e.g. one person emailing a few million others (low clustering coefficient). There are figures in the article that make this point well.
The system would be ideal for implementation at a fairly high level (e.g. the ISP level), where systems can aggregate email headers across many different users in order to come up with meaningful graphs. Its claimed advantage of no false positives means that it would be feasible at this level.
I'm impressed; it looks like a very clever idea. My only question concerns how this would deal with mailing lists, which must appear to it like spam?
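For anyone who wants to play with the clustering-coefficient idea, here is a rough Python sketch. The graph-building rule (link every pair of addresses that share a message header) and the function names are my own assumptions for illustration, not the paper's actual algorithm.

```python
from collections import defaultdict
from itertools import combinations

def build_graph(messages):
    """Link every pair of addresses that appear together in one message's headers."""
    graph = defaultdict(set)
    for addresses in messages:
        for a, b in combinations(set(addresses), 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph

def clustering_coefficient(graph, node):
    """Fraction of a node's neighbour pairs that are themselves linked."""
    neighbours = graph[node]
    k = len(neighbours)
    if k < 2:
        return 0.0
    linked = sum(1 for a, b in combinations(neighbours, 2) if b in graph[a])
    return linked / (k * (k - 1) / 2)

# Friends copy each other on the same messages, so my neighbours know each other.
friends = build_graph([
    ["me", "bob", "susan"],
    ["me", "bob", "alice"],
    ["me", "susan", "alice"],
])
# A spammer mails strangers one at a time, so none of the recipients are linked.
spam = build_graph([["spammer", f"victim{i}"] for i in range(100)])

print(clustering_coefficient(friends, "me"))
print(clustering_coefficient(spam, "spammer"))
```

In this toy data the friend cluster scores at the top of the range and the spammer's star-shaped graph scores at the bottom, which is the contrast the figures in the article illustrate.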
Sorry: that link is the full pdf, here's abstract (Score:5, Informative)
We provide an automated graph theoretic method for identifying individual users' trusted networks of friends in cyberspace. We routinely use our social networks to judge the trustworthiness of outsiders, i.e., to decide where to buy our next car, or to find a good mechanic for it. In this work, we show that an email user may similarly use his email network, constructed solely from sender and recipient information available in the email headers, to distinguish between unsolicited commercial emails, commonly called "spam", and emails associated with his circles of friends. We exploit the properties of social networks to construct an automated anti-spam tool which processes an individual user's personal email network to simultaneously identify the user's core trusted networks of friends, as well as subnetworks generated by spams. In our empirical studies of individual mail boxes, our algorithm classified approximately 53% of all emails as spam or non-spam, with 100% accuracy. Some of the emails are left unclassified by this network analysis tool. However, one can exploit two of the following useful features. First, it requires no user intervention or supervised training; second, it results in no false negatives i.e., spam being misclassified as non-spam, or vice versa. We demonstrate that these two features suggest that our algorithm may be used as a platform for a comprehensive solution to the spam problem when used in concert with more sophisticated, but more cumbersome, content-based filters.
Re:How it works - clustering coefficients (Score:3, Insightful)
Well, mailing lists are, by definition, identical to spam, so far as an automated program looking at each message is concerned. Whenever there's a test of spam-filtering programs, the "false positives" are mailing lists that the tester forgot to tell the spam filter about.
It would be useful to have some way of publishing a list of mailing lists who have permission to send you email -- I'll leave it up to t
Re:How it works - clustering coefficients (Score:3, Insightful)
Re:How it works - clustering coefficients (Score:3, Interesting)
Before I saw your posting, I was thinking that perhaps one way to deal with it would be for a similar approach to the "social networks" and "web of trust" ones to be applied to the servers and networks themselves: each network could keep a list of mail servers on other net
Re:How it works - clustering coefficients (Score:5, Insightful)
Yeah, but I'd consider a high-level analysis of my email headers (either sent or received) to be a violation of my privacy. Whether or not I'm mailing to kinky@alterate.life.styles.com, fringe.politcal.groups.require@free.speech.too.or
Someone will undoubtedly argue that since headers are sent in the clear anyway, it shouldn't matter, but keeping a database of who mails what to whom only makes abuse -- by freelance busybodies or government spies and censors -- that much the easier.
This is a case, I think, where the threat inherent in the cure is worse than the disease.
Re:How it works - clustering coefficients (Score:3, Insightful)
And in reply to myself.
Since the whole point of this is to build social-connection-webs, it's ideal for government crackdown via the guilt by association angle: not only can you find everybody who is emailing to dump.ashcroft@new.american.revolution.org, you can also find -- and investigate -- all the friends of the dissenter, too.
And for anyone who isn't worried that the FBI o
Re:How it works - clustering coefficients (Score:3, Insightful)
Re:How it works - clustering coefficients (Score:3, Funny)
HOW SPAMMERS CAN BEAT THIS FILTER (Score:5, Interesting)
The first is trivial and certain to succeed but has a drawback for spammers: only send e-mail to single recipients. The drawback is that this puts a much higher load on their servers, since every message is sent individually.
The second method is to always include dummy addresses in the mailing list that the recipients probably have in their address books. For example, add the following names to the To: field: notifications@paypal.com and list-notication@ebay.com.
Any recipient of the spam message who has also received e-mail from eBay or PayPal will trust the message.
One can do even better by planning ahead when harvesting e-mails. For example, if you harvest a set of e-mails from a particular bulletin board, you can make note of message cliques at the time of harvesting and send messages in the same groupings. For good measure, you also send to the addresses of the bulletin board admins.
Third, all the spammer really has to know is one address you have gotten messages from. Thus, either buy mailing lists from legitimate companies people actually do business with, or create your own loss-leader messages. For example, send out some political action alert or anything that has some value or use to most people, maybe a lottery drawing for a prize, or a discount subscription to Time magazine, so they will accept the message. The sender does not have to be the same as your spammer address. Now you know someone in the address book of the victim. Now you spam the crap out of them while including the trojan address in the To: field.
Re:HOW SPAMMERS CAN BEAT THIS FILTER (Score:4, Interesting)
True, this method is strongest against dictionary spam and does not work against non-dictionary spam.
[i]The second method is to always include dummy addresses in the mailing list that the recipients probably have in their address books. For example, add the following names to the To: field: notifications@paypal.com and list-notication@ebay.com.
Any recipient of the spam message who has also received e-mail from eBay or PayPal will trust the message.[/i]
Um, did you RTFA? (And perhaps most importantly, did anybody modding this article RTFA.)
The algorithm has nothing to do with address books. Instead, it looks at friend-of-a-friend networks as identified by mail headers.
For example, I work on a project with Bob and Susan. A typical email message about the project will include my address and their addresses in the header. The algorithm assumes that three first-degree relationships exist:
me-bob
me-susan
susan-bob
There are also three second-degree (friend-of-a-friend) relationships:
me-susan-bob
me-bob-susan
susan
The high ratio of second-degree to first-degree relationships gives Susan and Bob a high score (3/3=1) and puts them on the whitelist.
With paypal.com, there is only one first-degree relationship (paypal.com-me) and no secondary relationships. The algorithm handles single-relationship networks as a special case and defines them as ambiguous.
With a typical dictionary attack, a spam comes with 50 email addresses in the header. However, because a dictionary attack relies on sequential or randomly generated usernames, the number of recipients who are part of my social network is low. So we have 50 first-degree relationships, and let's say the spammer gets lucky and nails Susan and Bob as well. It still gets a low score (2/50=.04).
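To make the arithmetic concrete, here is a toy Python version of that scoring. Note that it is a simplification of my own: instead of counting second-degree relationships properly, it just takes the share of addresses in the header that are already in my network, which reproduces the 2/50 figure. The function name and the example.org addresses are made up for illustration.

```python
def header_score(header_addresses, my_network):
    """Share of the addresses in a message header that I already correspond with."""
    addresses = set(header_addresses)
    if not addresses:
        return 0.0
    return len(addresses & my_network) / len(addresses)

my_network = {"bob@example.org", "susan@example.org"}

# Project mail: everyone in the header is already part of my network.
project_mail = ["bob@example.org", "susan@example.org"]
# Dictionary attack: 48 generated usernames plus two lucky hits.
dictionary_spam = [f"user{i}@example.org" for i in range(48)] + project_mail

print(header_score(project_mail, my_network))     # 2 of 2 addresses known
print(header_score(dictionary_spam, my_network))  # 2 of 50 addresses known
```

A message with only one address in the header would need the "ambiguous" special case mentioned above; this sketch does not model it.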
One can do even better by planning ahead when harvesting e-mails. For example, if you harvest a set of e-mails from a particular bulletin board, you can make note of message cliques at the time of harvesting and send messages in the same groupings. For good measure, you also send to the addresses of the bulletin board admins.
This is a slightly better strategy. However, this only works if you use email from a member of the clique, and limit the recipient list to members of the clique.
But there is a serious problem with the strategy. The stated goal of the authors (did you RTFA?) is to increase the costs of spamming to the point where spamming is no longer economically profitable. Such a strategy would require research, which is expensive.
Or create your own loss-leader messages. For example, send out some political action alert or anything that has some value or use to most people, maybe a lottery drawing for a prize, or a discount subscription to Time magazine, so they will accept the message. The sender does not have to be the same as your spammer address. Now you spam the crap out of them while including the trojan address in the To: field.
Once again, RTFA. The algorithm has nothing to do with address books. But you did raise one possible threat: spoofing. A spammer could not get integrated into my social network by offering a loss-leader (for the same reason that messages from ebay.com would not be whitelisted). A spammer could spoof a member of my social network (for example, using Bob's address). However, the problem here is economics. Bob would probably only be auto-whitelisted by 50 people. Thus spoofing Bob would only get you access to a small population, which defeats the entire economic rationale for spamming.
Mailing lists / newsletters (Score:5, Insightful)
Not necessarily; indeed, most professional ones avoid this. Many spams do contain multiple people in the To: field, but many don't. One way or the other, I don't think this is relevant if we are trying to compare the graph of a mailing list to that of a spammer. To take an example, user slashdot-headlines@newsletters.osdn.com sends thousands of emails to people *who don't know each other*. User enlargeyourdong@hotmail.com has exactly the same pattern. How do you tell these apart?
Re:Mailing lists / newsletters (Score:3, Interesting)
For something based on statistics, the difference would likely be very noticeable.
Most newsletters are one-way (Score:5, Insightful)
Most mailing lists and newsletters are one-way - I'm not talking about discussion lists or listservs, but rather about the bot that sends me Slashdot headlines, Jakob Nielsen's Alertbox, Fred Langa's newsletter, and even commercial speech that I am signed up to and want to hear, such as Komplett's weekly offers, or Ryanair's cheap flights, etc.
Re:Most newsletters are one-way (Score:3, Informative)
Re:Mailing lists / newsletters (Score:3, Insightful)
However, the whitelist that this algorithm generates would still be valid. To me, this is the real strength of the algorithm, to be able to generat
Everytime you filter spam... (Score:5, Funny)
Cleaning up the gene pool (Score:5, Funny)
Scorched Earth:Cleaning up the gene pool (Score:3, Funny)
The Spam Gene is actually a regressive gene, not likely to have appeared in the parents or offspring. Its effect is similar to fouling the nest or pissing on food before eating.
Bugger Off! (Score:5, Interesting)
You know darn well that this will only increase employment in the Spam Technology sector and is a good thing.
Seriously, spammers are often a step ahead, and lately a lot of spam I'm getting is masked to look like Amazon orders or closed eBay auctions. I haven't ordered anything from Amazon (USA) in ages, but I still have to peek to see if someone has cracked my account and ordered something. Just expect that the harder they are pressed, the harder spammers will press back by sinking to new lows.
Vwani Roychowdhury (Score:5, Funny)
Re:Vwani Roychowdhury (Score:3, Funny)
Interesting (Score:5, Interesting)
Easily spoofed? (Score:5, Insightful)
Re:Easily spoofed? (Score:4, Informative)
Re:Easily spoofed? (Score:3, Insightful)
Addressed, not sent by (Score:4, Informative)
I send you and your sister a spam. While both of you are getting the spam, to both of you I am an unknown and therefore the system would flag me. ONLY if I send the spam to you while pretending to be your sister would the system break. I would need to know both your email and the email of someone you know. This would not be impossible to harvest with viruses stealing address books, but it is not what is currently happening. Currently, email address lists used by spammers are very simple flat text files. Of course, nothing complex would be needed. Simply a similar text file, but now with two emails per line: the first the recipient, the second the person to forge as the sender. Simple, but more work.
So it looks like a pretty clever idea, especially for workplace email, where most mail is from people you know and very little arrives from outside. And even when it does, it is usually from a known domain, namely a client or supplier.
Will it work? Who knows. Gotta be worth a try. Unless you want to wait for Bill Gates to fix it. We all know how well the security problems in Windows were fixed, eh?
There is not going to be a magic bullet that fixes spam. We will just have to use a lot of ordinary lead ones. Don't worry Bush says they are safe.
Re:Easily spoofed? (Score:3, Interesting)
And virus-infected machines are being used to send spam, they're also capable of swapping email address details between machines?
Coincidence? You'd better hope the spammers think so.
Re:Easily spoofed? (Score:3, Informative)
Ever wonder why worms use their own SMTP engine? Because those of us who are competent have one mail relay that only accepts messages from the internal domain. We prevent the worm's SMTP engine from working by having MX wildcard records to a logging box only for internal DNS - this ensures that any message sent from an internal box that gets out goes through the relay, which authenticates the user.
Re:Easily spoofed? (Score:4, Interesting)
"We prevent the worm's SMTP engine from working by having MX wildcard records to a logging box only for internal DNS -"
Say what? Why wouldn't you just block outbound port 25 from anyone except YOUR SMTP server's address? If a worm has its own SMTP engine (many do, yes), then what's to stop it from doing its own MX look-ups? It would take about 4 extra lines of code to accomplish this.
Re:Easily spoofed? (Score:2)
Re:Easily spoofed? (Score:2, Informative)
Re:Easily spoofed? (Score:3, Interesting)
So no, this certainly isn't a solution all by itself. It's the best one I've seen so far that doesn't involve more laws, though.
Most of the other ideas surrounding DNS lookups are to enforce accurate From: lines. But then the ideas break down, with the best suggestions to be new laws to punish the sender of the spam. With the proposal here today, it can be done with technology
Re:Easily spoofed? (Score:5, Informative)
A typical message would look like this:
From spammer@baddomain.com
From: Your friend <yourfriend@gooddomain.org>
Subject: Re: your mail
Buy our crap ! Click below to be removed. Blah blah.
The first From field is the 'envelope sender' and comes entirely from the servers that have touched the mail. The rest of the fields are just a freeform part of the message, which by convention most (all?) MUAs treat in a special way to add convenient features like having the 'real name' next to your mail address in the visible From: field.
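If you want to see the two "From" values side by side, Python's standard `email` package keeps them apart. This is only a sketch using the sample message above; the parser stores whatever it is given and authenticates neither value.

```python
from email import message_from_string
from email.utils import parseaddr

raw = (
    "From spammer@baddomain.com\n"
    "From: Your friend <yourfriend@gooddomain.org>\n"
    "Subject: Re: your mail\n"
    "\n"
    "Buy our crap! Click below to be removed.\n"
)

msg = message_from_string(raw)

# The mbox-style "From " line (no colon) is kept separately from the headers.
envelope_line = msg.get_unixfrom()
# The visible From: header splits into a display name and an address.
display_name, header_addr = parseaddr(msg["From"])

print(envelope_line)
print(display_name, header_addr)
```

The display name and address in the From: header are what most mail clients show, which is why a filter that only looks at that header sees the friend and not the envelope address.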
Re:Easily spoofed? (Score:5, Informative)
If you do as most spammers do and connect directly to the receiving server, then you can feed it whatever you like in the envelope sender, and it has no way of checking whether it's genuine or not. This is what stuff like SPF can help with, but as things are currently implemented just about everywhere, the envelope-sender addresses on spam and viruses are generally forged.
Re:Easily spoofed? (Score:3, Insightful)
> whatever you like in the envelope sender, and it has no way of checking whether it's genuine or not.
Isn't it typical for the receiver to reverse-lookup the sender's IP, or at least forward-lookup whatever you hand it in the HELO to make sure you're legit? I could be mistaken here, but that's always been my perception.
Re:Easily spoofed? (Score:4, Informative)
Some systems do this, but any sensible system will not reject solely on this basis because it breaks delivery of some legitimate messages. In particular, nowhere does it say that mail "from" a particular domain has to emanate from a particular host (there's no analogue to MX for *sending* hosts). That's what SPF and similar techniques are trying to impose - registered "senders" for a particular domain.
Erm, not (Score:5, Informative)
Simply: untrue. It's as easy to fake the envelope sender as it is the From: header. I think you're getting confused with "Received" headers, where each mail system inserts its own bit of tracking information. The envelope sender is completely under the control of the sender, and (usually) propagates unmodified as an email is handed between systems (indeed, one of the criticisms [pobox.com] of SPF is that by modifying the envelope sender you break forwarding).
Volume (Score:4, Interesting)
Re:Volume (Score:3, Insightful)
And how does this allow email from internet transactions or other non-social sources through? The article didn't seem to address that so clearly.
Re:Volume (Score:3, Interesting)
Then when I get a random e-mail from a friend of a friend who isn't on my white list, it's a lot more likely to show up in my filtered mail. It's an easy way of having a white list built for you. Besides, I hate maintaining a white list. Anytime someone changes e-mail addresses, I have to go play with the white list. It's not terribly convenient. I'd
Re:Chain of Trust (Score:3, Interesting)
Essentially, that is a short description of how a "Chain of Trust", or better named a "Web of Trust", works in GPG. You have people who verify that person A knows the private key A_1 that corresponds to public key A_2.
Even if they don't bother encrypting everything, but just digitally sign it. It's also just an ant
huh? (Score:5, Interesting)
Of course one huge downside to this "friend of friends" approach is all the virus spam I get that's sent using someone's address book (thanks Outlook!) Guess what... all those addresses are probably whitelisted because it came from someone I "know."
Re:huh? (Score:5, Interesting)
On its own it doesn't sound like it works well, but you can couple it with already-existing systems to boost accuracy.
Re:huh? (Score:5, Funny)
So it's just a very good rule, how is that bad? (Score:5, Informative)
So you could throw this as a rule into SpamAssassin with a 100 weight on Spam results and a -100 weight on non-Spam results. That could only help your filtering. With zero false-positives.
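Roughly what that would look like as a scoring rule, sketched in Python. This is not real SpamAssassin rule syntax; the threshold value, the function names, and the verdict labels are assumptions for illustration only.

```python
SPAM_THRESHOLD = 5.0  # assumed cutoff for this sketch

def network_rule(verdict):
    """Weight for the social-network verdict: 'spam', 'friend', or None if unclassified."""
    return {"spam": 100.0, "friend": -100.0}.get(verdict, 0.0)

def is_spam(other_rule_scores, network_verdict):
    """Add the network rule's weight to the other rules' scores and compare to the cutoff."""
    total = sum(other_rule_scores) + network_rule(network_verdict)
    return total >= SPAM_THRESHOLD

print(is_spam([2.5, 3.1], "friend"))  # network verdict overrides the content rules
print(is_spam([0.4], "spam"))         # network verdict alone is enough
print(is_spam([2.5, 3.1], None))      # unclassified: the content rules decide
```

With weights that large, the network verdict always wins when it exists, and the roughly half of mail it leaves unclassified falls through to the ordinary rules.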
Re:So it's just a very good rule, how is that bad? (Score:5, Interesting)
Re:So it's just a very good rule, how is that bad? (Score:4, Interesting)
Very interesting indeed!
Only 50%, but no false positives (Score:3, Interesting)
Re:huh? (Score:3, Interesting)
No, it works PERFECTLY on that half.
Important distinction. Now, instead of needing to troll through for spam yourself to generate the Bayesian filter, you can set this to automatically generate your Bayesian filter. Not only would this be easier, but it would reduce false negatives/positives by 50%.
More fodder for the mill (Score:3, Interesting)
This is clearly an independent estimate, and a good mechanism to improve the overall detection probability.
What we need is a "meta-Bayesian" process that appropriately weights and combines other spam prediction estimates, not just word counts.
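One common way to combine several spam estimates is the naive-Bayes product rule, sketched below in Python. It assumes the estimates are independent and that spam and non-spam are equally likely beforehand; both are assumptions of this sketch, not claims from the article.

```python
from math import prod

def combine(probabilities):
    """Naive-Bayes combination of independent spam probabilities (equal priors assumed)."""
    spam = prod(probabilities)
    ham = prod(1.0 - p for p in probabilities)
    # Undefined if one estimate is exactly 0 and another exactly 1.
    return spam / (spam + ham)

# A word-count filter says 0.9, a network-based estimate says 0.8.
print(combine([0.9, 0.8]))
# An estimate of 0.5 carries no information and leaves the other unchanged.
print(combine([0.5, 0.9]))
```

Two estimates that agree reinforce each other, so the combined figure ends up higher than either input.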
It wouldn't be meta-bayesian. (Score:4, Informative)
Re:More fodder for the mill (Score:3, Informative)
hm.. (Score:2, Interesting)
Viruses? (Score:4, Interesting)
(OT sig response) (Score:5, Funny)
Right, from now on, it's "micros~1" for me.
Re:Viruses? (Score:3, Insightful)
> virus, trojan and spyware-oriented methods of
> spamming?
Fine by me.. that puts them soundly into the lawbreaking category. Which means that after you track them down and actually find someone operating inside the borders of your country, you can DO something about it.
Since the laws being passed in the US are clearly indicative that spam is and will always be in an impossible to regulate grey area, the next best solution is to make spamming so dif
Re:Viruses? (Score:3, Funny)
Screw that; if they send even one spam to an FBI agent, they're interfering with his ability to do his job, and thus providing aid and comfort to terrorists.
Re:Viruses? (Score:4, Insightful)
Sounds interesting... (Score:2)
Re:Sounds interesting... (Score:5, Insightful)
-
Implementation?? (Score:2)
Good idea (Score:5, Interesting)
It should be interesting to see how this method plays out. (Now, I don't know why I even bothered with that last sentence. Everyone says that about every new spam-filtery thing. ((Don't know why I bothered with that last sentence either. Work is slow today I suppose.)) )
this doesn't address spoofed email (Score:3, Interesting)
Worms, from infected email systems?
The researchers didn't address this.
A two tier system? (Score:5, Interesting)
Happy Trails!
Erick
Re:A two tier system? (Score:2)
email still has to get to user (Score:4, Insightful)
If I understand the technique correctly, it relies on information specific to individual users. Unless there is a way for users to export their information, that means that the filtering can only be done after the email reaches its destination, not by the ISP or central mail server. So it may be helpful to individual users, but unlike some proposed techniques, it won't cut down on total email traffic.
End user's access is not the issue. (Score:3, Insightful)
Spam filtering (Score:5, Funny)
If it doesn't use bullets, I don't want to hear about it.
I don't always like my friends' friends (Score:5, Funny)
It might not be "spam" but I filter it now. I'll stick with my procmail filters.
Good Start (Score:3, Interesting)
I guess this is where the focus has shifted. While some spam can be blanketed and deleted, it's really up to the RECIPIENT to judge whether it's spam or not.
But then again, do we trust the user? Do we trust Joe and Jane (our loving SixPack couple) to make the right decision? Sure, it might be prudent in a company of 5-50, but what about 500-5000? Deploy and manage copies of these programs to see if it's going right or not?
I'm a sysadmin and I prefer the server-based solution: blacklists, SpamAssassin, et al. It's easier to fix one machine than 5000 desktops.
Comments?
'Blacklist of spammers' ?? (Score:2)
Also, this kind of solution will ONLY work if it's not widely used. Once it DOES become widely used, the spammers will simply update their huge network of zombie machines so that the spamming software on those machines sends spam from friends to friends, utilising the available address books and previous recipient list on the infected machine.
In o
Heading the wrong way (Score:5, Interesting)
Pretty soon, you will have to send an MD5 hash of your DNA from a static IP address that is reversible and supply 5 references, all in a PGP-encrypted letter, along with a copy of your passport and birth certificate.
When it's more work to block spam than stop it, you have to ask what is going wrong. Maybe if we somehow figured out wonderful technologies to *stop* spammers instead of blocking them, we'd be getting towards the ultimate goal. This is much like throwing money at a problem to bandage it, not fix it. The solution, however, also has to be easier for end users, who are doing nothing wrong. Why is every solution harder for end users, but just a 'bump in the road' for spammers? Am I missing something?
My own method (Score:2, Interesting)
I use a super-extra-secret e-mail that I give only to my friends.
Spam from Co-workers? (Score:3, Insightful)
These idiots have forgotten the basic rule of dealing with spammers (and other mail miscreants) which is:
They lie in the HELO, they lie in the MAIL FROM:, in the headers, etc. etc. etc. Any method that depends on this kind of data is doomed to a quick failure in the real world.
Re:Spam from Co-workers? (Score:3, Funny)
I foresee a nasty counter-measure to this (Score:2)
The key is the cost (Score:3, Interesting)
Though I'm no fan of Microsoft or Bill Gates, the solution proposed by them - one where a complicated math calculation is required for every mail they send - is on the right track because at least, in theory, it becomes expensive to send mail and therefore spammers are at a disadvantage. If this is to be a really workable solution, only time will tell - and given the MS tradition of hype ... who knows.
Schemes that make it expensive for the handlers (networks, ISPs) or the recipients, are not the way to go. After reading the article, it seems that this is just another one of those.
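For the curious, the "expensive calculation per message" idea can be sketched as a hashcash-style proof of work. The post above gives no details of the actual proposal, so the stamp format, the 12-bit difficulty, and the function names here are all my own assumptions.

```python
import hashlib
from itertools import count

def mint(message, bits=12):
    """Search for a nonce whose SHA-256 digest of message:nonce starts with `bits` zero bits."""
    target = 1 << (256 - bits)
    for nonce in count():
        digest = hashlib.sha256(f"{message}:{nonce}".encode()).digest()
        if int.from_bytes(digest, "big") < target:
            return nonce

def verify(message, nonce, bits=12):
    """One hash is enough for the receiver to check the sender's work."""
    digest = hashlib.sha256(f"{message}:{nonce}".encode()).digest()
    return int.from_bytes(digest, "big") < (1 << (256 - bits))

stamp = mint("to:alice@example.org")
print(verify("to:alice@example.org", stamp))
```

The sender needs thousands of hashes on average to mint one stamp at this difficulty, while the receiver needs one to check it. That asymmetry is what puts the cost on the sender rather than on the networks or recipients.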
Re:The key is the cost (Score:2)
New math? (Score:3, Insightful)
That has to be one of the most ridiculous statements I've heard in a while. That's like saying I've got a great new burglar alarm system. Now, it only works about half of the time, but when it does work it catches the crook with a 100% success rate!
Who's buying?
Spammers already defeat this (partially) (Score:5, Interesting)
In fact, this has provided me with a kind of "honeypot", since I now check for the addresses of several people who are long gone from my site. If I see their address, it's gotta be spam!
- Dave
This method will ruin a cool part of the net (Score:5, Insightful)
Now, if we only have emails from our (already existing) friends or friends of friends, then how will we ever meet anybody new?
Bigger Issue... (Score:4, Insightful)
Link to the Research Paper (Score:4, Informative)
for a MUCH more interesting read... (Score:3, Funny)
Sniffing stools speeds diarrhoea diagnosis
19 February 2004
http://www.nature.com/nsu/040216/040216-13.
I guess that pigs have wings. (Score:4, Interesting)
I think that their idea is good from a technical point of view, but very bad from a privacy point of view. I am of the opinion that gathering social network information is extremely dangerous. A pertinent example: If your friend is branded a "terrorist," then "they" can exploit the information that you have voluntarily provided to then put you on a "terrorist" watch list.
Another example: Say that someone who knows someone that you know actually buys something from a spam. If the spammer can access the social network information, suddenly your little niche of the network is going to be aggressively spammed. After all, like minds congregate.
There is no doubt in my mind that the black hatters will infiltrate the social network communities and use that information to spy on potential viewers. See this bugzilla [spamassassin.org] thread where the folks from Atriks Professional Email Deployment Service follow SpamAssassin's development and adapt their "ratware" tool accordingly.
The biggest problem with collecting social networks is that once the data has been gathered, it is very hard to control. Those of you using Orkut should think long and hard about it.
In conclusion, I think that this is technically a good idea but it opens a Pandora's box.
*Sigh* (Score:3, Insightful)
Frankly, a series of filters is probably the worst approach at stopping SPAM. It's a game of "make the filter, defeat the filter, and risk not getting important mail." Why bother? The solution lies in a different approach. Authorization. There needs to be authorization layers in order to defeat spam. We need buddy lists, we need blacklists, we need the ability to request authorization, etc.
I realize that fixing this problem isn't a simple one given the scale in which it's used. But man, I really wish somebody'd figure out how to do the transitory work. I'm almost completely reliant on ICQ and Private Messaging on forums in order to keep up with everybody.
Reverse MX DNS querying (Score:4, Interesting)
I've been thinking about this method for a while - basically, you configure your SMTP server to do this:
This idea is clearly too simple to have not been thought of before, but I have yet to find a good explanation as to why it won't work. Verizon.net uses this exact method: try sending an SMTP message from a host that isn't listed in your domain's MX records, and you get a "550 Sorry, you aren't allowed to mail for this domain" or something comparable. How come this method isn't more widely used? Going through my own SMTP server logs shows that the vast majority of SMTP servers sending legit mail are also listed in the domain's MX records. The only price is that you require the sender and receiver to be the same within a domain, hardly an unreasonable requirement.
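In rough Python terms, the check described above might look like this. The `MX_RECORDS` table is a hypothetical stand-in for a real DNS lookup, and the hostnames are placeholders; only the 550 wording comes from the post.

```python
# Hypothetical stand-in for DNS: domain -> hosts listed in its MX records.
MX_RECORDS = {
    "example.org": {"mail1.example.org", "mail2.example.org"},
}

def smtp_reply(connecting_host, mail_from):
    """Accept only if the connecting host is listed as an MX for the sender's domain."""
    domain = mail_from.rpartition("@")[2].lower()
    if connecting_host.lower() in MX_RECORDS.get(domain, set()):
        return "250 OK"
    return "550 Sorry, you aren't allowed to mail for this domain"

print(smtp_reply("mail1.example.org", "bob@example.org"))
print(smtp_reply("dialup-1234.example.net", "bob@example.org"))
```

The sketch also shows the price mentioned above: a domain that sends from a host it does not receive on gets rejected.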
Re:Reverse MX DNS querying (Score:3, Informative)
In fact one of the rules I use blocks messages that claim to come from the MXes of certain large service providers because such messages are 100% spam from spammers who already thought of your idea.
Re:Reverse MX DNS querying (Score:3, Informative)
Basically, as part of your DNS entry, you have a record containing a list of all of the addresses that are allowed to send email on your domain's behalf. I think there was a story on Slashdot a few weeks ago about it, as AOL has started using it.
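A minimal sketch of that lookup in Python, assuming the published list has already been fetched into a dictionary. The `ALLOWED_SENDERS` table and the addresses are hypothetical, and this does not parse real record syntax.

```python
import ipaddress

# Hypothetical published policy: domain -> networks allowed to send its mail.
ALLOWED_SENDERS = {
    "example.org": ["192.0.2.0/24", "198.51.100.17/32"],
}

def sender_permitted(domain, client_ip):
    """True if the connecting IP falls inside any network the domain has listed."""
    ip = ipaddress.ip_address(client_ip)
    networks = ALLOWED_SENDERS.get(domain, [])
    return any(ip in ipaddress.ip_network(net) for net in networks)

print(sender_permitted("example.org", "192.0.2.55"))
print(sender_permitted("example.org", "203.0.113.9"))
```

Unlike the MX check, the sending hosts are listed separately from the receiving ones, so a domain can send from machines that never accept mail.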
I once had an evil idea (Score:4, Interesting)
I would've harvested the emails of as many members of the ruling communist party as possible, and used those relays to spam them with anti-communist propaganda. I believe the consequences would've been swift and ruthless.
Unfortunately I can't read/write Chinese, and this idea wouldn't work in less repressive regimes...
bcc to all! (Score:4, Insightful)
A spammer could manipulate the To and CC headers as necessary to fool filters that analyze them, without affecting the ACTUAL list of email addresses to which the email is sent.
I don't think spam can be stopped without replacing or overhauling SMTP, and then ceasing to support "old" SMTP. But that ain't gonna happen anytime soon. (sigh)
Some of us rely on e-mail from strangers (Score:5, Insightful)
However, some of us can't avoid having a publicly available e-mail address. For example, writers such as myself rely on feedback from readers who are, in nearly all cases, strangers (and sometimes strange, but that's another story...). Avoiding false positives from strangers is very important to me. I want their messages. But, since my e-mail address is published frequently (hence no reason to hide it here), I obviously receive a ton of spam.
For the past few months I have experimented with a plug-in called BayesIt! for the Windows email reader The Bat!. As the name implies, it's a Bayesian filter. The nice thing about BayesIt is that I could point it to my already-stuffed spam folder and train it on thousands of messages in one go. So far it has worked out rather well. No false positives, and only about 10-20 false negatives per day (out of approx. 400 spams).
Still, in the long run I support proposals that shift the economics of e-mail in ways that have minimal impact on human beings while making spam unprofitable. Changing the economic model of spam is the only sure solution; relying solely on technology will simply keep us locked in an ongoing arms race.
-Aaron
Plaxo Revealed? (Score:3, Interesting)
Isn't this scheme the perfect use for the wide-ranging social network information being collected by Plaxo?
It makes sense - they certainly haven't announced a revenue stream yet, and "keeping your address book up-to-date," even in a wireless and multiplatform world, just doesn't seem like a big enough idea to justify the huge amounts of data collected.
So is that the announcement that's coming from Plaxo, the unveiling of a broad spam solution that uses 'degrees of separation' data from your address book and the address books of your friends to implement spam filtering?
If I may say, it does seem like the killer app for their unique data set.
-------
Seems like a good use for FOAF (Score:3, Informative)
A lot of sites like Tribe.net [tribe.net] and my own project SongBuddy [songbuddy.com] are working on integrating FOAF into the site, so that you won't have to worry about the mechanics of it unless you want to. Seems like an easy way to build these kinds of white lists.
everything has a weakness... (Score:4, Funny)
Re:Random number generator is just as good (Score:2)
Re:Huh? (Score:3, Funny)
Am I the only one who read this sentence and said "huh??"
Oh, no -- makes perfect sense to me. I applied that logic to quite a few exams when I was in college: "My score on this exam is perfect...I could only come up with answers to half of the questions, but every one that I answered was correct! a+ for me!"
My professors were the bastards who didn't understand...