More PDF Blackout Follies
georgewilliamherbert writes "The latest installment of "As the PDF Blackouts Turn" hit today, with a U.S. government apparently releasing a redacted version of their court filing in the Balco grand jury leak case which merely stuck a black line over the text, which remains available in the document. As with prior documents, entering text cut/paste mode in a normal PDF browser such as Acrobat allows a reader to access the concealed text. Previous incidents include an AT&T filing in the NSA case." This works with Xpdf and KPDF, too; for KPDF, use the selection tool (under the Tools menu) around the redacted section, copy to clipboard, then paste into the text-manipulator of your choice.
Maybe (Score:5, Funny)
Re:Maybe (Score:4, Insightful)
Re:Maybe (Score:4, Informative)
This is a confusion over the way the Adobe Imaging Model works, not white-on-white or black-on-black. In Adobe's model, you start with a blank page, and you essentially paint on it; newly drawn things cover previously drawn things. Basically, despite what the previous commenter said, it really is like a Sharpie.
When you physically draw over something with a black marker, the previous text may be impossible to see, but it's still there. In the PDF, you'd only have to skip the instruction drawing the box to get the text out. Even if Acrobat didn't let you get at the text by cutting-and-pasting, someone familiar with the PDF format could still get to it with some work.
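To make the "paint over it" point concrete, here is a toy sketch of that imaging model. It is not real PDF parsing: the `DrawText` and `FillRect` classes and the character-grid page are hypothetical stand-ins for content-stream operators, made up for illustration. The point it shows is that the rendered page hides the text while the instruction list still carries it.

```python
from dataclasses import dataclass


@dataclass
class DrawText:
    """Hypothetical 'show text' instruction: put text at column x, row y."""
    x: int
    y: int
    text: str


@dataclass
class FillRect:
    """Hypothetical 'fill rectangle' instruction, i.e. the black box."""
    x: int
    y: int
    width: int
    height: int


def render(ops, cols, rows):
    """Paint instructions in order; later paint covers earlier paint."""
    grid = [[" "] * cols for _ in range(rows)]
    for op in ops:
        if isinstance(op, DrawText):
            for i, ch in enumerate(op.text):
                if 0 <= op.y < rows and 0 <= op.x + i < cols:
                    grid[op.y][op.x + i] = ch
        elif isinstance(op, FillRect):
            for r in range(max(op.y, 0), min(op.y + op.height, rows)):
                for c in range(max(op.x, 0), min(op.x + op.width, cols)):
                    grid[r][c] = "#"
    return ["".join(row) for row in grid]


def extract_text(ops):
    """Skip every box-drawing instruction and keep the text instructions."""
    return [op.text for op in ops if isinstance(op, DrawText)]


page = [
    DrawText(0, 0, "Public line"),
    DrawText(0, 1, "Secret line"),
    FillRect(0, 1, 11, 1),  # the "redaction" drawn on top of the second line
]

rendered = render(page, cols=11, rows=2)
recovered = extract_text(page)
print(rendered)
print(recovered)
```

On screen the second row is a solid bar, but `recovered` still holds both strings, which is roughly what copy/paste in a viewer is doing.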
Re:Maybe (Score:5, Interesting)
Real redactors use razors. You hold up one of those redacted documents and it looks like a punch card.
Re:Maybe (Score:5, Funny)
Re:Maybe (Score:3, Informative)
Skip to pages 6-16 of the PDF for the not-so-hidden goods
http://www.easy-sharing.com/528126/BALCO_quash_subpoena_sfchronicle_unredacted.pdf.html [easy-sharing.com]
P.S. I did it in FoxIt PDF Editor Pro, which I wouldn't really recommend to anyone
Re:Maybe (Score:3, Funny)
--jeffk++
(... a fool and his money are best parted... )
Re:Maybe (Score:5, Funny)
It was good enough back in the days of wood-burning computers;
it should be good enough now.
Re:Maybe (Score:3, Funny)
It was good enough back in the days of wood-burning computers;
it should be good enough now.
Definitely! Then we can redact things with fancy ANSI terminal codes ^[[30;40mlike this super secret hidden message^[[m!
w00t! No one will EVER figure how to defeat that!
Re:Maybe (Score:5, Funny)
It was good enough back in the days of wood-burning computers
Oh man, that brings back some memories! Late nights cranking out code on my Bunyan 2500 - that puppy went through three cords of oak a week, and it kept the place warm to boot. And we didn't need any of that fancy book learnin' to make it work either; if you were a good hand at whittling, you could be a programmer. Never had a lick of trouble with the Bunyan, except for the occasional splinter. Oh sure, you had to keep some kindling around to get her started, but once she got goin' she could do anything - add, multiply, and of course, branch.
Internet? Pfft. We modulated the smoke exhaust by opening and closing the flue - you could see it for miles, unless it was raining, or windy. Hell, we had peer to peer networks back before most of you guys were even a swimmer in your dad's testicles.
There's still a few Bunyans around, if you know where to look. Auditors like them, since they're so good at logging, and keeping a paper trail. I think the Vatican still has one, though they only fire it up when they elect a new Pope. Ah, the good old days...
Re:Maybe (Score:5, Funny)
Re:Maybe (Score:3, Funny)
Re:Maybe (Score:3, Funny)
Re:Maybe (Score:5, Insightful)
Sometimes I wonder if these incidents are really "accidents" or somebody's way of feigning ignorance of technology to get the facts out to the public.
Re:Maybe (Score:4, Insightful)
NSA? Since when does the NSA redact subpoenas for the District Attorney?
Re:Maybe (Score:5, Funny)
Re:Maybe (Score:4, Insightful)
Re:Maybe (Score:3, Informative)
This is why... (Score:2, Funny)
Or... they could just find a better technological solution. Seems like a no brainer to me.
Re:This is why... (Score:3, Funny)
Are you saying "boo" or "Boo urns"?
</oblig>
History repeats itself (Score:5, Funny)
Re:History repeats itself (Score:5, Insightful)
Re:History repeats itself (Score:5, Insightful)
Circumvention (Score:3, Insightful)
Re:Circumvention (Score:3, Insightful)
Re:History repeats itself (Score:3, Funny)
Train them to use the blackout method, but to replace the redacted text with "If you can read this, you're under arrest!"
Re:History repeats itself (Score:5, Insightful)
My grandfather used to say that there is one irreducible requirement for training a dog: you have to be smarter than the dog.
Re:History repeats itself (Score:5, Insightful)
I can easily train people that are smarter than myself, if the conditions are right. For instance, I know a fair bit about statistics and data analysis, and would be perfectly comfortable training certain folks in the field, as long as they didn't know more than I do. Even then, it is perfectly possible for me to come up with a unique idea that someone smarter than myself hasn't (note that I didn't say couldn't) considered.
In the public schools there are frequent cases of a teacher training a student more intelligent than themselves. It is unavoidable, although it could be reduced by making sure only the smartest teachers were hired.
Smarter? Not a requirement. More experienced? Having unique knowledge? Yes, that is required, but maybe not irreducibly.
HAND
Re:History repeats itself (Score:3, Informative)
Comment removed (Score:5, Insightful)
Re:History repeats itself (Score:4, Insightful)
Comment removed (Score:4, Interesting)
Re:History repeats itself (Score:4, Insightful)
Barring that, PLEASE don't educate them, or make it easier for them to really redact anything.
Re:History repeats itself (Score:4, Funny)
Re:History repeats itself (Score:4, Interesting)
People...learn...? (Score:3, Interesting)
--
"And the geek shall inherit the earth."
Re:People...learn...? (Score:5, Insightful)
You would think that people would have learned after the first time around. Apparently not.
You're giving people too much credit; as has been noted in this forum many times, the average computer user is not exactly bright and doesn't read Slashdot, so they would have no idea that this is a problem. People just assume that if something appears to work a certain way, it in fact works that way.
Re:People...learn...? (Score:5, Insightful)
Re:People...learn...? (Score:5, Insightful)
You're giving people too little credit. Most people who use computers are probably fairly bright -- they're lawyers, doctors, accountants, and all sorts of things most people on Slashdot can't do. Reading Slashdot doesn't make you bright (in fact, given much of the drivel, just the opposite.)
But, they expect computers to work like a friggin' toaster, and to them, if the text is blanked out, it's not readable. They're not going to realize the 'black' is a representation of a rectangle in a different document layer, and that the actual internal tree of the PDF still contains the actual text. Really, how could they?
They understand computers by metaphor and analogy to the real world. They don't know or care about the actual internal stuff. Since the paradigms have been done to look like the real-world, these people assume that the rest of the things also apply.
Many people use computers who don't have a full grasp on all of their intricacies. However, I haven't looked inside of a TV in 20+ years, but I'm comfortable using one.
Cheers
Acceptance of Risk (Score:5, Insightful)
If you were somebody who made your living in television, but didn't understand anything about it, you would likewise be taking a great risk. You might, for instance, look like a big idiot when you show up to work at your anchor desk wearing a horizontally pinstriped shirt (which looks like ass on TV because of the Moire effect between the lines on the shirt and the TV scanlines). If you had understood the technology a little better, you might not have done that. That's a trivial example -- undoubtedly if you were a TV anchor, you'd learn or be told at some point not to wear a shirt like that without having to learn about scanlines -- but I hope you see my point.
Whenever you use a technology without learning about it, you accept a certain amount of risk. Sometimes, you gamble and win: you just use the technology, get your job done, and nobody's the wiser. You're faster, more efficient, more competitive, you look like a hero to your boss, whatever. But if the technology doesn't work, then you're SOL -- but that's the price you pay for not understanding it. That's the risk you accepted when you said to yourself "eh, I don't really care what goes on inside there."
In the case of PDF, we have a lot of people using a certain technology without knowing anything about how it works, and thus -- like the TV anchor in his pinstriped shirt (or a weatherman wearing chroma-key blue or green) -- you get these gaffes.
I'm not saying that everybody needs to learn about how everything they use all day works, down to the bare metal. Virtually nobody needs to know that, except perhaps people who are doing things that are so dangerous that they can't afford to fuck up. However, people should be aware of the tradeoff they're making and the risk they're accepting when they forgo figuring out the internal details of a system and simply accept it as a whole, on faith that it will always work a certain way. As long as people are aware of that decision, and make it consciously, and accept the results, you can't ask for more.
Generally speaking: faith is a fine thing, as long as you know when you're relying on it. It's when you thought you were relying on something else, and find out that you had nothing but faith, that a problem has occurred.
Re:Acceptance of Risk (Score:3, Insightful)
Millions of Americans hitch the physical "wagon" (or SUV, or sedan, or minivan) of their livelihood to a bundle of "horsepower" that they don't know shit about every single day, and then they drive that wagon at 75 MPH.*
In the case of their cars, the consequences for misuse are serious injury or death. In comparison, the consequences for learning next to nothing about their computers seem slight
Re:Acceptance of Risk (Score:3, Insightful)
And, I'm equally amazed at how many people are too damned ignorant and intent at driving at Max 0.6 to realize I'm in the middle of fscking passing this guy (as evidenced by the fact that I'm going faster than him), and t
Re:People...learn...? (Score:3, Interesting)
This represents a fundamental difference between how geeks/nerds think, and how the population at large thinks. Those technically inclined, whether they're gear-heads, pencil-pushe
which? (Score:5, Funny)
Which U.S. government?
Re:which? (Score:3, Funny)
Works in Safari directly (Score:5, Informative)
Re:Works in Safari directly (Score:2)
It will paste into any text editor, even vi-inside-an-xterm-window.
works in older acroread too (Score:4, Interesting)
I hate the new Acrobat Reader. Some claim it calls home to the mothership (Adobe), which I don't approve of either (spyware)...
Re:works in older acroread too (Score:2, Informative)
Then you should try Foxit Reader [foxitsoftware.com]. Apart from being free, light-weight and best for everyday use, it also has got a 'Fox' in its name. :)
Even more shocking (Score:5, Funny)
Redacting right is HARD (Score:5, Informative)
Re:Redacting right is HARD (Score:5, Funny)
This page intentionally left blank.
I was going to say, those guys are goooood.
Re:Redacting right is HARD (Score:2)
Re:Redacting right is HARD (Score:4, Funny)
Funniest post ever!!! (Score:2)
I'm skimming through the instructions and what do I see?? A brethren of Clippy. At one point it seems like she/he is writing down all the secrets. One of two things is going on: either the document is a fake, or I should join the government because they need all the help they can get.
Re:Redacting right is HARD (Score:3, Interesting)
Re:Redacting right is HARD (Score:3, Insightful)
Because management and clueless users will demand that there be an "unredact selection" menu option, also. I'll let you sort out the implications of that. Either that or original copies of documents everywhere will have text permanently blocked out by the above-mentioned clueless users and management types.
Re:Redacting right is HARD (Score:5, Funny)
Yeah, cut them some slack (Score:2)
Seriously though, if the government gets TOO embarrassed about this sort of thing, they'll do something even more stupid, and mandate all official documents to use some proprietary DRM/TPM/HDCP/BVD format that only Windows Vista can display.
Re:Redacting right is HARD (Score:3, Informative)
At least it's obvious that the folks who know what they're doing, know that MS products aren't the best solution. From the doc:
Re:Redacting right is HARD (Score:3, Informative)
Re:Redacting right is HARD (Score:5, Informative)
There is a much simpler solution.
1.) Print document.
2.) Locate and uncap Sharpie.
3.) Black out text.
4.) Scan to DocRedacted.pdf
Wow, less than the average government paragraph. Seems like the way they have been doing it for years, so why change now?
Cache (Score:4, Informative)
Anyone into mirroring it?
PDF Redaction (Score:4, Informative)
How does this keep happening?
I wonder how long it'll be... (Score:3, Insightful)
Re:I wonder how long it'll be... (Score:2)
While not a bad suggestion, there is a major problem with it. Many offices will use Paint for this process, with the final image saved as a bitmap. Ever tried making a PDF file out of 8.5x11 inch bitmap images? The resulting filesize tends to be pretty nasty. Of course, there are ways around this, but the requisite knowledge of graphics is far beyond the knowledge necessary to understand that white text is still text--e.g., if you can properly
Disability guidelines prohibit rasterized docs. (Score:4, Insightful)
They would be a lot better off going through the document in Word (or Notepad/Textedit/vi/EMACS/whatever) and just selecting the regions of text that they want to remove, and replacing them with [-- TEXT REMOVED --] or even [REDACTED]. If they were really slick, I'm sure somebody could write a little macro to replace the text with an equivalent number of characters of whitespace or random text or dashes, to preserve formatting. (Okay, so to really preserve the formatting it would have to be replaced with characters that have the same width as the deleted characters; maybe there's a font-set containing various widths of whitespace characters that they could use? In TeX it would be trivial.)
The results would be ugly (but really, were black bars ever very beautiful?) but at least it would actually remove the information, and wouldn't result in an inaccessible, rasterized document.
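The "little macro" idea might look something like the sketch below. It works on plain text only and assumes a fixed-width font, so equal character counts mean equal widths; proportional fonts would still shift the layout, as noted above. The function names and the sample sentence are made up for illustration.

```python
def redact_spans(text, spans, filler="x"):
    """Overwrite each (start, end) span with filler, keeping length and line breaks."""
    chars = list(text)
    for start, end in spans:
        for i in range(max(start, 0), min(end, len(chars))):
            if chars[i] != "\n":
                chars[i] = filler
    return "".join(chars)


def redact_phrase(text, phrase, filler="x"):
    """Find every occurrence of phrase and overwrite it in place."""
    if not phrase:
        return text
    spans = []
    pos = text.find(phrase)
    while pos != -1:
        spans.append((pos, pos + len(phrase)))
        pos = text.find(phrase, pos + len(phrase))
    return redact_spans(text, spans, filler)


original = "The witness, John Doe, testified on Tuesday."
redacted = redact_phrase(original, "John Doe", filler="-")
print(redacted)
```

The removed characters are gone from the output string itself, so there is nothing underneath for a viewer to copy out, and the text stays as text rather than a rasterized image.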
Copy? Paste? (Score:2)
Just select the text and bang,.... there it is for reading.
-Steve
Nice and secure. Riiiiggght... (Score:4, Interesting)
Really nice to know that these folks have taken an apparent cue on safe and secure documents from the folks in Redmond.
On a serious note... this is seriously scary. Imagine if the NSA and other agencies are redacting all of their documents this way and passing them around the world to field offices, embassies and elsewhere.
Imagine the implications during legal proceedings here in the States. Yikes.
Re:Nice and secure. Riiiiggght... (Score:3, Insightful)
I would much prefer my government be unable to successfully keep secrets from me.
blonde joke (Score:5, Funny)
A: There is magic marker ink all over the screen!
The New Way for Gov't Transparency (Score:5, Interesting)
Leave PDF the way it is. In fact, make it really hard to actually redact something, but put a tool front-and-center that looks like it's redacting something.
Then - remove any delete capability from Outlook. Trash is fine, but not delete.
Then - configure all Windows machines to be inherently wide open, so that we may all peer into gov't computers. Oh wait, this is already true.
Sometimes I think those in positions of high gov't power should forfeit practically all privacy for the duration of their term. Put a webcam on these fuckers 24/7. Does that sound... draconian? Unreasonable? Maybe. But after losing billions of dollars in things like Iraq military contract debacles, I don't trust any of these people. They certainly don't trust us.
Someone missed the memo (Score:5, Interesting)
http://www.nsa.gov/snac/vtechrep/I333-TR-015R-200
Re:Someone missed the memo (Score:3, Funny)
Re:NSA procedure sucks! (Score:3, Insightful)
Pretension (Score:2, Funny)
Is it just me or do they look a little pretentious?
That's the problem with these powerful formats (Score:3, Informative)
It's the same old story as with operating systems or anything else: features are usually either a plus or a "don't matter", except when serious security issues are involved, in which case you can't always predict what is benign, whether in and of itself or in combination with other features. Adobe tried to position PDF for all kinds of other things like portable forms and collaboration, but obviously their users are running into the same problems as MS Word users have with leaking sensitive information.
What there should be is a standard document format for outside release of legal or sensitive documents, that doesn't have any features that could be inadvertently used. Maybe it is RTF or a stripped down PDF; but something where you can tell the intern to release this press release, and not count on him being smart enough to check for hidden comments and workflow information. It should be WYSIAYG -- what you see is ALL you get -- and any additional features, other than possibly a small and well defined set of metadata, should parse as an error.
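A rough sketch of what "should parse as an error" could mean in practice, using a made-up release format (a dict with a `text` field and a tiny metadata allowlist). Every name here is hypothetical; no such standard exists as far as the comment above is concerned.

```python
# Hypothetical release format: one text body plus a small, fixed set of metadata.
ALLOWED_TOP_LEVEL = {"text", "metadata"}
ALLOWED_METADATA = {"title", "date"}


class ReleaseFormatError(ValueError):
    """Raised when a document carries anything beyond the visible text."""


def validate_release(doc):
    """Return the visible text, or raise if any unexpected field is present."""
    unknown = set(doc) - ALLOWED_TOP_LEVEL
    if unknown:
        raise ReleaseFormatError(f"unexpected fields: {sorted(unknown)}")
    if not isinstance(doc.get("text"), str):
        raise ReleaseFormatError("'text' must be a string")
    metadata = doc.get("metadata", {})
    if not isinstance(metadata, dict):
        raise ReleaseFormatError("'metadata' must be a mapping")
    unknown_meta = set(metadata) - ALLOWED_METADATA
    if unknown_meta:
        raise ReleaseFormatError(f"unexpected metadata: {sorted(unknown_meta)}")
    return doc["text"]


clean = {"text": "Press release body.", "metadata": {"title": "Release"}}
leaky = {"text": "Press release body.", "comments": ["remove before sending"]}

print(validate_release(clean))
try:
    validate_release(leaky)
except ReleaseFormatError as err:
    print("rejected:", err)
```

The design choice is an allowlist rather than a blocklist: the intern doesn't have to know which hidden features exist, because anything not explicitly permitted stops the release.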
Re:That's the problem with these powerful formats (Score:2)
Adobe can come out of this smelling like a rose! (Score:3, Funny)
Re:Adobe can come out of this smelling like a rose (Score:2)
Kind of kills the market for the third party vendors who already provide tools which do just that.
Maybe those vendors would have an anti-trust case against Adobe for doing it.
Would be ironic given Adobe's anti-trust allegations against MS for essentially doing the same thing (adding a "Save as PDF" tool to the MS office toolbar).
Hush! Hush! (Score:5, Funny)
This frightens me!!!!! (Score:5, Interesting)
Normal average government workers should NOT be redacting, the people who redact should be those who know that if they screw-up, they may be screwing themselves or good friends in the process. Have people do it(redact) who have something to lose.
Just my 2 cents.
Re:This frightens me!!!!! (Score:3, Funny)
That's just what's called (Score:4, Insightful)
We have to act! (Score:5, Funny)
This law would instruct the FCC to create a program to certify approved PDF viewers; such viewers must make it impossible for users to steal the redacted data in a file, along with technical measures to prevent tampering with the viewers by hackers. Certified viewers will be made available to the public by software companies on a list of government-approved PDF vendors. After it becomes illegal to own a non-certified pirate PDF viewer, these dangerous information leaks will thankfully become a thing of the past.
Seems to be a common occurrence (Score:4, Insightful)
When I started searching, I googled for redact. There were two ads for products that remove the text from the PDF as well as create the black bar. One made it clear that the text would be inaccessible to hackers.
So, why aren't these types of tools being used for all redactions?
Congratulations. (Score:5, Informative)
Oh, yeah - it's a no-knock warrant, so put your pants on now.
Re:Congratulations. (Score:4, Insightful)
Clear as Mud (Score:5, Interesting)
Command Line Programs; evince (Score:3, Informative)
'pdftotext' comes with xpdf & is even available natively on windows.
Similarly, for MS Word documents, you may use 'antiword' [demon.nl], 'catdoc' [free.net], and 'wv' [sourceforge.net].
These programs are quite nice in that they can easily batch-process a lot of documents & then you can go grepping through them for interesting tidbits.
(On the GUI front, evince [gnome.org] deserves a plug. It uses the same poppler [freedesktop.org] backend as xpdf and kpdf. I used to use tiny & fast xpdf for most of my pdf viewing, but evince has a few nice features which xpdf lacks & has become my personal favorite pdf viewer.)
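For the batch-and-grep workflow, a minimal sketch follows. It assumes `pdftotext` is installed and on the PATH, and relies on passing `-` as the output file to write to stdout; the function names and the directory layout are made up for illustration.

```python
import subprocess
from pathlib import Path


def pdftotext_command(pdf_path):
    """Build the command line; '-' asks pdftotext to write the text to stdout."""
    return ["pdftotext", str(pdf_path), "-"]


def extract_text(pdf_path):
    """Run pdftotext on one file and return the extracted text."""
    result = subprocess.run(
        pdftotext_command(pdf_path),
        capture_output=True,
        text=True,
        check=True,  # raise if pdftotext exits with an error
    )
    return result.stdout


def matching_lines(text, keyword):
    """Case-insensitive 'grep': return the lines that contain keyword."""
    needle = keyword.lower()
    return [line for line in text.splitlines() if needle in line.lower()]


def search_directory(directory, keyword):
    """Map each PDF file name in directory to its matching lines, if any."""
    hits = {}
    for pdf_path in sorted(Path(directory).glob("*.pdf")):
        lines = matching_lines(extract_text(pdf_path), keyword)
        if lines:
            hits[pdf_path.name] = lines
    return hits
```

Something like `search_directory("filings", "grand jury")` would then list every line mentioning the phrase across a folder of filings, blacked out on screen or not.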
Common problem with today's UIs (Score:5, Informative)
1) FTP sites in Windows Explorer look like regular Windows folders. People expect them to work like regular folders. I had a field sales force try to "share" an Excel spreadsheet expecting the others to get a "Read Only" copy just like would happen on a local network share. Overwriting madness ensued. You can't blame them, there was no indication that it would work differently. Asking them to understand FTP is like accounting expecting me to fully understand the accounting rules behind my IT purchases.
2) A manager where I used to work had an Excel spreadsheet with payroll data for the entire company. He wanted to send each department their subset of the data. So he filtered his spreadsheet and sent the filtered lists to each department not knowing that he was sending each department the whole list under the covers. Luckily, the file was 30MB and choked in the mail server and I was able to bail him out of that huge mistake. But you really can't blame him - he saw something on the screen and sent "it". There should be an indication of underlying data. BTW, doing a cut and paste special made each file about 25k or so.
Same thing with this PDF error. If your file shows certain information, it should contain that information only or indicate (or warn) otherwise.
By "simplifying" everything, nobody knows what's really going on. A couple times per week I have to explain some type of issue to some user about how "It's really more complicated than that, see Windows (or an app) hides this from you." Users roll their eyes as their simple task has become obscurely complicated - all in the name of making things "easier" to understand, ironically.
If something works differently, it should be displayed differently - that at least gives the user a chance to question what they are doing.
Re:Common problem with today's UIs (Score:3, Interesting)
I use that feature quite often and it was only yesterday that I noticed that the little triangle turns from black to dark blue when you’re viewing a filtered set. All this time I was thinking there really ought to be some sort of visual indication (other than the wonky row numbers).
MS Word Redaction Tool (Score:5, Informative)
Er... pdftotext...? (Score:3, Informative)
Using a STANDARD pdf handling tool:
% pdftotext BALCO_quash_subpoena_sfchronicle.pdf
From the PDF->TXT file:
[snipped to first line before the "blacked out section"]
C. Movants' Efforts to Obtain the Secret Grand Jury Transcripts
[beginning of first blacked out section]
Prior to the return of the Balco indictments, the lead defendant, Victor Conte ("Conte"), began to correspond via e-mail with Movants. (See Ex. 1 to Donnelan Aff.). Neither Movants nor Conte attempted to keep their relationship confidential, as the e-mail correspondence routinely was reported by Movants.2 (Exs. 1, 2, 3, and 11 to Donnelan 1
[... snipped for brevity]
On June 23, 2004, Fainaru-Wada sent an e-mail to Conte indicating that he (Fainaru-Wada) was busy working on some stories that may be "up on the web soon. Hope you like them." (Ex. T to Hershman Decl.). Conte responded that he was looking forward to seeing the article and that his lawyer would be available for comment. (Id.).
[end of first blacked out section]
D. Disclosure of the Montgomery Grand Jury Transcript On June 24, 2004
[more, but why post it when you can read it yourself!?]
Okay... WTF!? Doesn't ANYONE check this stuff before it goes out the door!?
OMG! Wonder if this is how our private documents are "made safe"....
Re:A redacted document? Say it ain't so! (Score:2)
Re:A redacted document? Say it ain't so! (Score:5, Informative)
For lawyers/courts/etc., redacted (per Black's Law Dictionary) means:
The lesson here is this: if you see a word used in a legal context (or any professional context) and it sounds entirely wrong...ask yourself first whether it might have a special meaning before complaining.
They're correct. (Score:5, Informative)
If I am releasing a document for publication and decide to remove information from it, this is redaction. It's editing for publication, which can include the removal of information. It could also include the addition of new information, but that's not what typically happens. Redaction can be a form of self-censorship, but it's not always the same.
Censorship is when a third party, generally a person in authority, suppresses information which is considered objectionable. The 'authority' can be the same as the author (e.g. 'self-censorship'), or the suppression can be indirect -- it need not be editing per se.
It's my understanding that "redact" is used only in reference to written documents that are being edited, while 'censor' is more general and can refer to anything. The terms are closely related, especially in their typical use, but they're not exactly the same. "Redact" is actually a more specific and precise word for what's going on in this instance. We can argue about whether censorship is also going on, but redaction definitely is.
Anyway, arguing about definitions by citing dictionaries is always a bit pedantic, since dictionaries are not authoritative except as a historical reference: they can tell you what a word meant at the time the dictionary was written, but not what it means right now, since a word's definition is determined by its usage. All language is inherently arbitrary: they're just sounds we make or things we write down in order to convey ideas, and the relationship between the sounds/characters and ideas is not fixed, but infinitely variable. If everyone were to decide tomorrow that 'redaction' meant the same thing as 'censorship,' that's what it would mean, and next year's dictionaries would have to be updated to reflect that.
Re:And these are... (Score:2)
Re:And these are... (Score:4, Insightful)
"Another thing that pisses me off is incopetence."
Oh, the irony.
Re:This proves it: (Score:4, Insightful)
all the viewers 100% honor the redaction as it's intended. In the case of a bitmap format,
you can burn a black or white rectangle into the original image and then add an annotation
a la TIFF's annotations that contains the original portion of the image that was redacted
in an encrypted format so that it's difficult to expose the redaction- IF you need to have
the redaction exposed. If not, you hand across the redacted image as-is without annotations.
This has NOTHING to do with PDF or ODF at all- trying to make this a connection to these
is bogus to say the least. In this case, I believe that the people doing it used the MS Office
redaction capabilities and then exported the redacted content to PDF, which the export
carried the same sort of redactions across to the other format. What happened is because
someone didn't understand the tools they were using, not because of PDF or ODF.
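For the bitmap case described above, "burning in" the rectangle means overwriting the pixel values themselves. A toy sketch on a plain list-of-lists grayscale image (no real image library or TIFF annotations involved; the function name and sample grid are made up for illustration):

```python
def burn_rectangle(pixels, left, top, width, height, color=0):
    """Return a copy of a row-major pixel grid with the rectangle overwritten."""
    redacted = [row[:] for row in pixels]
    for r in range(max(top, 0), min(top + height, len(redacted))):
        for c in range(max(left, 0), min(left + width, len(redacted[r]))):
            redacted[r][c] = color
    return redacted


image = [
    [255, 255, 255, 255],
    [255, 17, 42, 255],   # 17 and 42 stand in for the sensitive pixels
    [255, 255, 255, 255],
]

safe = burn_rectangle(image, left=1, top=1, width=2, height=1)
print(safe)
```

Unlike a box drawn on top of text, the original values are simply not in `safe` any more; getting them back would need the separately stored copy mentioned above, and only if someone chose to attach it.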