Anti-Terrorist Data Mining Doesn't Work Very Well
Presto Vivace and others sent us this CNet report on a just-released NRC report, which concludes (surprising no one here) that data mining doesn't work very well. It's all those darn false positives. The submitter adds, "Any chance we could go back to probable cause?"
"A report scheduled to be released on Tuesday by the National Research Council, which has been years in the making, concludes that automated identification of terrorists through data mining or any other mechanism 'is neither feasible as an objective nor desirable as a goal of technology development efforts.' Inevitable false positives will result in 'ordinary, law-abiding citizens and businesses' being incorrectly flagged as suspects. The whopping 352-page report, called 'Protecting Individual Privacy in the Struggle Against Terrorists,' amounts to at least a partial repudiation of the Defense Department's controversial data-mining program called Total Information Awareness, which was limited by Congress in 2003."
The actual report (Score:5, Informative)
Paradox of the False Positive (Score:5, Informative)
I realize this is likely starting to sound old, but Cory Doctorow's Little Brother should be required reading for people doing something like this. His writings about the "Paradox of the False Positive" are enumerated there, but also in other sources:
http://www.guardian.co.uk/technology/2008/may/20/rare.events [guardian.co.uk]
(emphasis mine)
And, as others have pointed out, this system is likely to have a false positive rate higher than 1%.
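For anyone who wants to see the arithmetic behind the paradox, here is a rough sketch. The 1% false positive rate is the figure mentioned above; the population of one million, the single real target, and the 99% detection rate are assumptions picked purely for illustration and do not come from the NRC report.

```python
def flagged_breakdown(population, true_targets, false_positive_rate, detection_rate):
    """Expected counts when everyone in the population is screened once."""
    innocents = population - true_targets
    false_positives = innocents * false_positive_rate   # innocents flagged anyway
    true_positives = true_targets * detection_rate      # real targets that get flagged
    # Share of flagged people who are actually targets
    precision = true_positives / (true_positives + false_positives)
    return false_positives, true_positives, precision


# Illustrative assumptions: 1,000,000 people, 1 real target,
# 1% false positive rate, 99% detection rate.
fp, tp, precision = flagged_breakdown(1_000_000, 1, 0.01, 0.99)
print(f"innocent people flagged: {fp:.0f}")
print(f"real targets flagged:    {tp:.2f}")
print(f"chance a flagged person is a real target: {precision:.4%}")
```

With those made-up inputs, roughly ten thousand innocent people get flagged for every real target, so almost every alert is a false alarm. A false positive rate above 1% only makes that ratio worse.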
Re:I'd run on that platform. (Score:5, Informative)
The no-fly list doesn't identify people, just names, and it's very exact, so changing Charles to Chuck will defeat it.
No, actually it won't. The newspapers are full of stories of people who were detained or forbidden from flying because their name was similar to a name on the list, or a nickname of a name on the list, or a possible alternative spelling of a name on the list, or names that had once been used as an alias of names on the list.
For example, the name "T. Kennedy" was on the list. Senator Edward Kennedy (whose name does not begin with "T", but who is nicknamed "Teddy") was stopped:
from Wikipedia [wikipedia.org]
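To make the difference concrete, here is a toy sketch of exact matching versus the kind of loose matching those stories describe. Everything in it is hypothetical: the nickname table, the 0.85 similarity threshold, and the function names are my own assumptions, and the real matching rules are not public.

```python
import difflib

# Hypothetical nickname table, for illustration only.
NICKNAMES = {
    "edward": {"ed", "ted", "teddy"},
    "charles": {"chuck", "charlie"},
}


def name_variants(first):
    """Return a first name plus any formal names or nicknames tied to it."""
    first = first.lower()
    variants = {first} | NICKNAMES.get(first, set())
    for formal, nicks in NICKNAMES.items():
        if first in nicks:
            variants |= {formal} | nicks
    return variants


def exact_match(passenger, listed):
    """The 'very exact' matching the parent post assumes."""
    return passenger.lower() == listed.lower()


def loose_match(passenger, listed, threshold=0.85):
    """Match on a similar last name plus any first-name variant or initial."""
    p_first, p_last = passenger.lower().split()
    l_first, l_last = listed.lower().replace(".", "").split()
    if difflib.SequenceMatcher(None, p_last, l_last).ratio() < threshold:
        return False
    for variant in name_variants(p_first):
        if len(l_first) == 1:
            # The listed first name is only an initial, e.g. "T."
            if variant.startswith(l_first):
                return True
        elif difflib.SequenceMatcher(None, variant, l_first).ratio() >= threshold:
            return True
    return False


print(exact_match("Chuck Smith", "Charles Smith"))    # exact comparison misses the nickname
print(loose_match("Chuck Smith", "Charles Smith"))    # loose comparison catches it
print(loose_match("Edward Kennedy", "T. Kennedy"))    # and also sweeps in the senator
```

The looser the matching, the harder it is to dodge with a nickname, and the more unrelated people with similar names get caught.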
Re:Seems (Score:3, Informative)
I'm actually well aware of how intelligence works. Merely cultivating contacts is an arduous process, because pushing it too fast can cause them to become suspicious and either stop talking to or actively turn on the recruiter. Some are eager to provide what the recruiter wants, and some take years to provide any useful information.
Your 80/20 assertion is at least partially incorrect. If it were correct, the US would have been far less worried about the Soviet space program in the latter part of the 1960s, and we'd be spending less effort keeping certain sensitive technologies from getting out to various other entities. We wouldn't spend billions on the NRO, and the NSA wouldn't need to keep upgrading its SIGINT capabilities each year.
There are situations where you have to interface with informants who are part of the entity being watched, and some of those informants are people the US government doesn't want to be publicly seen dealing with. Congress had a small fit about that in the 1990s, and it made life difficult for field agents.
Re:I'd run on that platform. (Score:5, Informative)
It doesn't matter, because the only place where you have to get your ID checked is at the TSA checkpoint, and they don't check it against any databases.
So, the easy recipe for bypassing the no-fly list is:
I flew as recently as last month and was not subjected to anything which would defeat this scheme. It fails if you need to check luggage, but I doubt a terrorist is going to be doing that. The no-fly list is such an obvious joke.
Re:Seems (Score:3, Informative)
Yes, I know about OSINT. It still doesn't replace SIGINT, which cannot replace HUMINT. They're all interlocking pieces of the intelligence realm. HUMINT is more expensive than OSINT, and SIGINT is more expensive than HUMINT. Costs for all of them reach points of diminishing returns. A satellite that shows movements in real time at 1m resolution is better than nothing. Improving that to .5m may cost ten times as much but deliver only five times the value. Improving it to .1m may cost 100 times as much but deliver only 20 times the value.
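A quick back-of-the-envelope version of that diminishing-returns point, using only the satellite multipliers quoted above (which are illustrative numbers to begin with, not real program costs):

```python
# (resolution, cost multiplier, value multiplier) relative to the 1 m baseline.
# These are the illustrative figures from the comment, not real data.
upgrades = [
    ("1 m", 1, 1),
    ("0.5 m", 10, 5),
    ("0.1 m", 100, 20),
]


def value_per_cost(cost, value):
    """Value delivered per unit of cost, relative to the baseline."""
    return value / cost


for resolution, cost, value in upgrades:
    ratio = value_per_cost(cost, value)
    print(f"{resolution:>5}: {cost:>3}x cost, {value:>2}x value, {ratio:.1f} value per unit cost")
```

Each step buys more capability in absolute terms, but the value per unit of cost drops from 1.0 to 0.5 to 0.2.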
Any good intelligence network makes use of everything that it can, whether newspapers, forum posts, criminal contacts, or radio intercepts. All of it is important.