Anonymity of Netflix Prize Dataset Broken
KentuckyFC writes "The anonymity of the Netflix Prize dataset has been broken by a pair of computer scientists from the University of Texas, according to a report from the physics arXiv blog. It turns out that an individual's set of ratings and the dates on which they were made are pretty unique, particularly if the ratings involve films outside the 100 most popular movies. So it's straightforward to find a match by comparing the anonymized data against publicly available ratings on the Internet Movie Database (IMDb) (abstract on the physics arXiv). The researchers used this method to find how individuals on the IMDb privately rated films on Netflix, in the process possibly working out their political affiliation, sexual preferences and a number of other personal details."
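To make the matching idea in the summary concrete, here is a minimal Python sketch. It is not the researchers' actual algorithm: the data is made up, and the names (`match_score`, `rank_candidates`), the tolerances, and the rarity weighting are all assumptions chosen for illustration. It scores each anonymized record by how many of a public profile's ratings it roughly agrees with on stars and date, counting rarely rated films more heavily.

```python
import math
from collections import Counter
from datetime import date

# Toy, made-up records: pseudonym -> {movie: (stars, date rated)}.
anonymized = {
    "user_17": {"Movie A": (5, date(2005, 3, 1)),
                "Obscure B": (2, date(2005, 3, 4)),
                "Obscure C": (4, date(2005, 6, 9))},
    "user_42": {"Movie A": (5, date(2005, 3, 2)),
                "Obscure D": (1, date(2005, 1, 20))},
    "user_99": {"Movie A": (4, date(2004, 11, 5)),
                "Obscure B": (5, date(2004, 2, 1))},
}

# Hypothetical public profile (also made up), e.g. ratings posted under a real name.
public_profile = {
    "Movie A": (5, date(2005, 3, 2)),
    "Obscure B": (2, date(2005, 3, 5)),
    "Obscure C": (4, date(2005, 6, 10)),
}

def movie_popularity(records):
    """Count how many anonymized users rated each movie."""
    counts = Counter()
    for ratings in records.values():
        counts.update(ratings.keys())
    return counts

def match_score(public, candidate, counts, star_tol=1, day_tol=14):
    """Sum rarity weights over movies where stars and dates roughly agree."""
    score = 0.0
    for movie, (stars, when) in public.items():
        if movie not in candidate:
            continue
        cand_stars, cand_when = candidate[movie]
        close_stars = abs(stars - cand_stars) <= star_tol
        close_dates = abs((when - cand_when).days) <= day_tol
        if close_stars and close_dates:
            # Assumed weighting: a match on a rarely rated film counts for more.
            score += 1.0 / math.log(1 + counts[movie])
    return score

def rank_candidates(public, records):
    """Return (pseudonym, score) pairs, best match first."""
    counts = movie_popularity(records)
    scores = {name: match_score(public, ratings, counts)
              for name, ratings in records.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

ranking = rank_candidates(public_profile, anonymized)
print(ranking[0][0])  # prints user_17
```

In this toy data the popular film alone cannot separate user_17 from user_42; the two obscure films do. A careful attacker would presumably also require the top score to stand well clear of the runner-up before claiming a match.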
Anonymity broken by stupidity (Score:3, Interesting)
did it work? (Score:3, Interesting)
{tongueincheek}Yeah, but the question is, will knowing those personal facts generate better movie recommendations?{/tongueincheek}
When there's a significant prize at stake, researchers can try all sorts of slimy tricks to win. (I'm not saying that's the motive behind this report, but there are many "researchers" going for the prize.) And when there are significant profits at stake, a corporation will damn-fire-certainly use whatever means it can to maximize those profits, regardless of whether it might be "ethical."
Data-mining and the actual problem (Score:4, Interesting)
The second problem is that by deanonymizing the Netflix data, you can start to cheat on the Netflix Prize. The requirement to win $1 million is that your recommendation engine is 10% better than the one they are currently using. However, if you can learn the exact preferences of some users in the dataset (i.e. by finding the rest of their ratings on IMDb), then you can hardcode those into your recommendation engine and get the predictions for these users exactly right. This can boost your score even though your actual system is no better than the existing one. This is an extreme form of over-fitting to the data: the known answers are leaked straight into the predictions.
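A toy illustration of that cheat in Python, under loud assumptions: the ratings are invented, `baseline_predict` is a stand-in for a real engine, and the `leaked` table is a hypothetical set of ratings recovered for a deanonymized user. The score used here is root-mean-square error, where lower is better.

```python
import math

def rmse(predict, truth):
    """Root-mean-square error of predict(user, movie) against true ratings."""
    squared = [(predict(user, movie) - stars) ** 2
               for (user, movie), stars in truth.items()]
    return math.sqrt(sum(squared) / len(squared))

# Made-up held-out ratings: (user, movie) -> true stars.
truth = {
    ("u1", "m1"): 5,
    ("u1", "m2"): 1,
    ("u2", "m1"): 3,
    ("u2", "m3"): 4,
}

def baseline_predict(user, movie):
    """Stand-in engine (assumption): always guess the middle rating."""
    return 3.0

# Hypothetical leak: ratings of deanonymized user u1 found elsewhere.
leaked = {("u1", "m1"): 5, ("u1", "m2"): 1}

def cheating_predict(user, movie):
    """Return the leaked answer when known, otherwise the unchanged baseline."""
    return leaked.get((user, movie), baseline_predict(user, movie))

honest_error = rmse(baseline_predict, truth)
cheating_error = rmse(cheating_predict, truth)
print(honest_error, cheating_error)  # prints 1.5 0.5
```

The underlying model is identical in both runs; only the lookup table changed, yet the error drops from 1.5 to 0.5 on this toy set.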
Finally, this paper is over a year old. Can we please have some new news?
Re:Sexual preferences? (Score:5, Interesting)
Re:What are you rating in IMDB vs Netflix (Score:3, Interesting)
Just out of curiosity, why don't you want to see those films again? Both of them are really good films, and although I would not see them every weekend (as I would, for example, Sin City), I enjoy watching them from time to time. The plot is interesting, the photography/drawing is nice and the screenwriting is well done.
I find it difficult to understand your statement, "favorite movies I never want to see again". If you do not want to see them again, then you do not enjoy watching them... unless you dislike enjoyment and only watch films that make you cry or have a bad time (I would suggest United 93 to you... worst film I have seen in a looong long time... or Brokeback Mountain, a 1 hour Marlboro country ad).
I do not know about the Netflix scoring algorithm, but I have found criticker.com quite reliable for my tastes.
Am I insane in thinking that you can see a movie as being a great artwork and still not like it, or vice versa?
It might be akin to the "La Gioconda" painting. Everybody says it is the best piece of art of all time, yet after having seen it *twice* live in the Louvre, I have yet to find something special about it. (I prefer, for example, paintings by Giovanni Paolo Panini, who is relatively unknown.)
Re:Probabilities (Score:3, Interesting)
Re:only a matter of time (Score:5, Interesting)
Simple as you said, I do NOT enjoy watching them (Score:3, Interesting)
The comment "favorite movie I never want to see again" is one I got from a review of Grave of the Fireflies that I just happened to totally agree with. Don't read the reviews, just watch it yourself, and if you are not into anime, just set that aside for the duration of the movie. Then ask yourself again if you can understand that comment.
It is a powerful movie, like Schindler's List, but not a happy tale. I am not talking about a tear jerker here, I am talking about a "we will all burn in hell for this" movie. Tear jerkers I can take; Christmas in August is one. A sad tale, nicely told, but ultimately human. It makes you sad, not sick of humanity.
Perhaps I am just too emotional about this kinda stuff. One reason might be that I grew up with half-understood tales of "that was where your great-uncle was picked up". Then you realize just why your grandmother had 9 brothers and sisters, yet you never met any. I got one aunt; my grandparents had 3 kids. A starvation story like GotF hits a lot closer with a history like that. (The Dutch Hunger Winter.)
I enjoy all kinds of movies and would NOT have wanted to miss these two, but that doesn't mean I want to see them again. There are some people who list Schindler's List as a feel-good movie because it 'ends well'. I suppose you might see it that way. I don't.
I can recognize your statements that the photography is nice and the screenwriting is well done, but the plot is interesting? To you it is a plot; to me it is a sickening part of history that I am far too close to.
Perhaps it is a bit like how Richard Pryor's monologue about the 200th anniversary of the US was not exactly all that cheerful.
Terry Pratchett's Nanny Ogg describes at one point the difference between merry and mirth (or something like that). She describes how she was joyful when her child was being born, but she wasn't exactly chuckling at the time. Appreciating a movie and enjoying it are two different things, at least for me. I can't describe it any clearer.
More woe for HMRC then (Score:3, Interesting)