More PDF Blackout Follies

Posted by timothy on Thu Jun 22, 2006 09:35 AM
from the it's-even-secret-what-they-want-secret dept.
georgewilliamherbert writes "The latest installment of "As the PDF Blackouts Turn" hit today, with a U.S. government apparently releasing a redacted version of their court filing in the Balco grand jury leak case which merely stuck a black line over the text, which remains available in the document. As with prior documents, entering text cut/paste mode in a normal PDF browser such as Acrobat allows a reader to access the concealed text. Previous incidents include an AT&T filing in the NSA case." This works with Xpdf and KPDF, too; for KPDF, use the selection tool (under the Tools menu) around the redacted section, copy to clipboard, then paste into the text-manipulator of your choice.

Related Stories

[+] Entertainment: FBI Wiretapping Audit Secrets Uncovered Via Ctrl+C 231 comments
mytrip notes a story in Wired's Threat Level blog on the latest boneheaded government moves with redaction. (We've been discussing redaction follies here for years.) This time it's an FBI report (PDF) on implementing CALEA — you can select text from redacted areas, copy it, and paste into a text editor, as University of Pennsylvania professor Matt Blaze discovered. From Wired: "Once again, supposedly sensitive information blacked out from a government report turns out to be visible by computer experts armed with the Ctrl+C keys — and that information turns out to be not very sensitive after all... [Among] the tidbits considered too sensitive to be aired publicly: The FBI paid Verizon $2,500 apiece to upgrade 1,140 old telephone switches. Oddly the report didn't redact the total amount paid to the telecom — slightly more than $2.9 million dollars — but somehow the bad guys will win if they knew the number of switches and the cost paid."
This discussion has been archived. No new comments can be posted.
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
  • Maybe (Score:5, Funny)

    by GmAz (916505) on Thursday June 22 2006, @09:38AM (#15582378) Journal
    Perhaps the people making these "blacked out documents" should be taught a little about Vector Graphics and that a black box is not the same as a sharpie. One word for them 'n00b'!!
    • Re:Maybe (Score:4, Insightful)

      by gEvil (beta) (945888) on Thursday June 22 2006, @09:47AM (#15582461)
      You don't even need to go into vector graphics with these people. All you need to do is attempt to convince them that white text is still text, or that black text on a black background is still text. Either way, the text is still there. The only way to ensure that it's gone is to ACTUALLY GET RID OF THE TEXT.
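The point above (a box drawn over text leaves the text in the file) can be sketched with a toy page model. This is only an illustration under loud assumptions: the `Text` and `Rect` classes and the `black_box`, `redact`, and `extract_text` helpers are all hypothetical names invented here, not any real PDF library's API, and a real PDF needs glyph-level and metadata cleanup that this ignores.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Text:
    """A run of text placed on the page (hypothetical toy model)."""
    x: float       # left edge of the run
    y: float       # baseline, treated as a single point for simplicity
    width: float
    content: str


@dataclass(frozen=True)
class Rect:
    """A filled black rectangle (hypothetical toy model)."""
    x: float
    y: float
    width: float
    height: float


def extract_text(page):
    """Return every text run on the page, the way copy/paste would."""
    return " ".join(op.content for op in page if isinstance(op, Text))


def covers(rect, text):
    """True if the text run overlaps the rectangle."""
    horizontal = text.x < rect.x + rect.width and rect.x < text.x + text.width
    vertical = rect.y <= text.y <= rect.y + rect.height
    return horizontal and vertical


def black_box(page, rect):
    """The mistake: paint a rectangle on top and leave the text in place."""
    return page + [rect]


def redact(page, rect):
    """Drop the covered text runs first, then paint the rectangle."""
    kept = [op for op in page if not (isinstance(op, Text) and covers(rect, op))]
    return kept + [rect]


page = [
    Text(x=10, y=100, width=80, content="The witness named"),
    Text(x=95, y=100, width=60, content="SECRET NAME"),
]
bar = Rect(x=95, y=95, width=60, height=12)

print(extract_text(black_box(page, bar)))  # the covered text is still extracted
print(extract_text(redact(page, bar)))     # the covered text is no longer present
```

Both results would look identical on screen, since the rectangle is painted last in each case; only the second one has actually lost the text.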
    • Re:Maybe (Score:5, Funny)

      by Mirlas (760973) * on Thursday June 22 2006, @10:14AM (#15582667)
      Maybe we need to go back to good old-fashioned text files.
      It was good enough back in the days of wood-burning computers;
      it should be good enough now.

      • Re:Maybe (Score:5, Funny)

        by OldManAndTheC++ (723450) on Thursday June 22 2006, @12:19PM (#15583567)

        It was good enough back in the days of wood-burning computers

        Oh man, that brings back some memories! Late nights cranking out code on my Bunyan 2500 - that puppy went through three cords of oak a week, and it kept the place warm to boot. And we didn't need any of that fancy book learnin' to make it work either; if you were a good hand at whittling, you could be a programmer. Never had a lick of trouble with the Bunyan, except for the occasional splinter. Oh sure, you had to keep some kindling around to get her started, but once she got goin' she could do anything - add, multiply, and of course, branch.

        Internet? Pfft. We modulated the smoke exhaust by opening and closing the flue - you could see it for miles, unless it was raining, or windy. Hell, we had peer to peer networks back before most of you guys were even a swimmer in your dad's testicles.

        There's still a few Bunyans around, if you know where to look. Auditors like them, since they're so good at logging, and keeping a paper trail. I think the Vatican still has one, though they only fire it up when they elect a new Pope. Ah, the good old days...

    • Re:Maybe (Score:5, Insightful)

      by frdmfghtr (603968) on Thursday June 22 2006, @03:34PM (#15584906)
      Perhaps the people making these "blacked out documents" should be taught a little about Vector Graphics and that a black box is not the same as a sharpie. One word for them 'n00b'!!


      Sometimes I wonder if these incidents are really "accidents" or somebody's way of feigning ignorance of technology to get the facts out to the public.
  • by alshithead (981606) on Thursday June 22 2006, @09:40AM (#15582396)
    Perhaps after another dozen or so incidents they'll decide a little training is appropriate for the folks who are doing the redacting.
    • by cavtroop (859432) on Thursday June 22 2006, @09:51AM (#15582495)
      No, more than likely they will just pass a new law, stating that "Copying and pasting of blacked out (redacted) lines is a felony" or somesuch...
    • by richg74 (650636) on Thursday June 22 2006, @09:55AM (#15582525) Homepage
      This is in principle a good idea. However, the implementation may suffer from a fundamental problem.

      My grandfather used to say that there is one irreducible requirement for training a dog: you have to be smarter than the dog.

      • by DarkSarin (651985) on Thursday June 22 2006, @11:17AM (#15583165) Homepage Journal
        Fortunately this does not apply to humans--not directly.

        I can easily train people that are smarter than myself, if the conditions are right. For instance, I know a fair bit about statistics and data analysis, and would be perfectly comfortable training certain folks in the field, as long as they didn't know more than I do. Even then, it is perfectly possible for me to come up with a unique idea that someone smarter than myself hasn't (note that I didn't say couldn't) considered.

        In the public schools there are frequent cases of a teacher training a student more intelligent than themselves. It is unavoidable, although it could be reduced by making sure only the smartest teachers were hired.

        Smarter? Not a requirement. More experienced? Having unique knowledge? Yes, that is required, but maybe not irreducibly.

        HAND
    • by squiggleslash (241428) on Thursday June 22 2006, @10:19AM (#15582703) Homepage Journal

      Alternatively, perhaps the technology is at fault. If the same mistake is made over, and over, and over again, many user interface experts would start investigating whether it's the UI, not the user that's at fault. The argument is that the mistake is being made because the correct solution is not intuitively obvious.

      I'd be curious to know what tool the users are using to black out the text. Are they just exporting from Word but, before exporting, "blocking it out" in Word? If so, how? Are they putting black blocks over text, or setting attributes of the relevant text? If these are the wrong techniques, what can be done to make the right techniques obvious (and the wrongness of these techniques equally obvious)?

      I've designed enough crappy UIs in the past and justified them with "It's user error! All they have to do is hit the OK or CANCEL buttons, of course it's not going to work if they close the window instead!" and other such stuff that, with hindsight, was utterly wrong and elitist of me, to know that technically skilled people are not the best judge of intuitiveness. The fact is, I'm a programmer. You're probably technically minded too. The average user isn't. We can't avoid making assumptions about what the user thinks works that are, on occasion, completely, 180 degrees, wrong. What we can do is own up to them and try to determine how to steer the user in the right direction.

      • by gEvil (beta) (945888) on Thursday June 22 2006, @10:44AM (#15582910)
        What happens when I actually want to print white text on a black background? Will I have to go through some convoluted process because setting the background as black doesn't actually change the background to black, but rather also eliminates any text contained within it?
        • by squiggleslash (241428) on Thursday June 22 2006, @10:48AM (#15582948) Homepage Journal

          If the user interface is designed well, you'll know exactly what to do, just as you'll know intuitively how to really redact text.

          If you're asking me to tell you how such a properly designed UI will work, you're asking the wrong person. It'd be interesting to get someone like Bruce Tognazzini [asktog.com] to give their take on it. Right now, all we can be fairly sure of is that the UI isn't working because people are constantly choosing the wrong tool for the job.

  • which? (Score:5, Funny)

    by Anonymous Coward on Thursday June 22 2006, @09:41AM (#15582408)
    with a U.S. government apparently releasing a redacted version of their court filing

    Which U.S. government?
  • by Deep Fried Geekboy (807607) on Thursday June 22 2006, @09:41AM (#15582413)
    You can open them directly in Safari and cut/paste into TextEdit too.
  • by FudRucker (866063) on Thursday June 22 2006, @09:42AM (#15582415)
    i keep an older version of adobe's acrobat reader for Linux version 5.0 and copy & paste in to a text editor works in it too...

    i hate the new acrobat reader. some claim it calls home to the mothership(Adobe) which i dont approve of either (spyware)...
  • by $RANDOMLUSER (804576) on Thursday June 22 2006, @09:42AM (#15582416)
    What's this in TFA about Barry Bonds and steroids? I had no idea.
  • by nweaver (113078) on Thursday June 22 2006, @09:43AM (#15582428) Homepage
    Redacting electronic documents right is HARD. See, for example, The NSA's guide to redacting word documents as PDF [fas.org].
  • Cache (Score:4, Informative)

    by Rob T Firefly (844560) on Thursday June 22 2006, @09:44AM (#15582437) Homepage Journal
    Coral cache of the PDF [nyud.net]

    Anyone into mirroring it?
  • PDF Redaction (Score:4, Informative)

    by Fedallah (25362) on Thursday June 22 2006, @09:44AM (#15582440) Homepage
    This is pretty ridiculous. Products have existed for years to take care of this sort of thing, such as http://www.appligent.com/products/product_families/redaction.php [appligent.com].

    How does this keep happening?
  • by blcamp (211756) on Thursday June 22 2006, @09:46AM (#15582452) Homepage

    Really nice to know that these folks have taken an apparent cue on safe and secure documents from the folks in Redmond.

    On a serious note... this is seriously scary. Imagine if the NSA and other agencies are redacting all of their documents this way and passing them around the world to field offices, embassies and elsewhere.

    Imagine the implications during legal proceedings here in the States. Yikes.

  • Q: How can you tell when a blonde NSA agent has been redacting PDFs?

    A: There is magic marker ink all over the screen!

  • by thatguywhoiam (524290) on Thursday June 22 2006, @09:53AM (#15582505)
    I love this idea.

    Leave PDF the way it is. In fact, make it really hard to actually redact something, but put a tool front-and-center that looks like it's redacting something.

    Then - remove any delete capability from Outlook. Trash is fine, but not delete.

    Then - configure all Windows machines to be inherently wide open, so that we may all peer into gov't computers. Oh wait, this is already true.

    Sometimes I think those in positions of high gov't power should forfeit practically all privacy for the duration of their term. Put a webcam on these fuckers 24/7. Does that sound... draconian? Unreasonable? Maybe. But after losing billions of dollars in things like Iraq military contract debacles, I don't trust any of these people. They certainly don't trust us.

  • by Tozog (599414) on Thursday June 22 2006, @09:53AM (#15582511)
    Here's how the NSA recommends redacting files:

    http://www.nsa.gov/snac/vtechrep/I333-TR-015R-2005.PDF [nsa.gov]
  • Hush! Hush! (Score:5, Funny)

    by Anonymous Coward on Thursday June 22 2006, @10:02AM (#15582583)
    Why are we publicizing this flaw? We have a US Government in power that increasingly wants to peer into the lives of innocent citizens, while becoming less transparent itself in order to cover up deceit, fraud, abuse, and just plain bumbling incompetence. If these Keystone Kops want to believe that they are criminal masterminds, let them, but don't help them actually cover stuff up!
  • by waif69 (322360) on Thursday June 22 2006, @10:06AM (#15582609) Journal
    Having worked for the gov't and knowing that some documents that I have signed and worked on should be redacted, this scares the crap out of me. It's not that I did anything that was illegal or "evil" as google would put it, I just don't want the "bad guys" (terrorists, etc.) knowing my name is attached to anything that resulted in their cohorts arrested or killed on the battlefield (also includes CONUS since 9/11).

    Normal average government workers should NOT be redacting; the people who redact should be those who know that if they screw up, they may be screwing themselves or good friends in the process. Have people do it (redact) who have something to lose.

    Just my 2 cents.
  • by alewar (784204) on Thursday June 22 2006, @10:06AM (#15582610)
    "Security by obscurity" :)
  • by Waffle Iron (339739) on Thursday June 22 2006, @10:08AM (#15582622)
    Clearly, these information leaks are a major security threat that is aided and abetted by these renegade PDF viewers. I'm encouraging my representatives in Congress to introduce a "Digital Millennium Redaction Act" that will prohibit the manufacture, sale, discussion or hyperlinks to any PDF viewers which enable the illicit extraction of redacted data from PDF documents. Such viewers are little more than the preferred tools for information thieves, hiding in the guise of "productivity applications". It's despicable.

    This law would instruct the FCC to create a program to certify approved PDF viewers; such viewers must make it impossible for users to steal the redacted data in a file, along with technical measures to prevent tampering with the viewers by hackers. Certified viewers will be made available to the public by software companies on a list of government-approved PDF vendors. After it becomes illegal to own a non-certified pirate PDF viewer, these dangerous information leaks will thankfully become a thing of the past.

  • by milgr (726027) on Thursday June 22 2006, @10:11AM (#15582654)
    I googled for redacted documents, chose some pdfs at random, and found that the text is behind the black bars.

    When I started searching, I googled for redact. There were two ads for products that remove the text from the pdf as well as create the black bar. One made it clear that the text would be inaccessible to hackers.

    So, why aren't these types of tools being used for all redactions?
  • Congratulations. (Score:5, Informative)

    by sammy baby (14909) on Thursday June 22 2006, @10:14AM (#15582672) Journal
    Congratulations, Slashdot! The FBI will be along shortly to raid your offices on suspicion of violating the DMCA, the Patriot Act, and probably some other bullshit piece of legislation we don't even know about.

    Oh, yeah - it's a no-knock warrant, so put your pants on now.
  • Clear as Mud (Score:5, Interesting)

    by Doc Ruby (173196) on Thursday June 22 2006, @10:52AM (#15582979) Homepage Journal
    Why doesn't Adobe upgrade their PDF generators to include a "Real Redact" button that actually deletes the redacted data? They could sell it to governments at the usual 1000x government markup rate, and the government would probably still save money vs the fallout from these illusory blackout follies. Neither the government nor Adobe is in the "freedom of information" camp. Maybe the government just refuses to buy an upgrade because that would save money overall.
  • by Namlak (850746) on Thursday June 22 2006, @11:02AM (#15583046)
    The industry at large (Microsoft being a big offender) has been trying to get us to this magical place where everything is system and location independent and this is where we end up:

    1) FTP sites in Windows Explorer look like regular Windows folders. People expect them to work like regular folders. I had a field sales force try to "share" an Excel spreadsheet expecting the others to get a "Read Only" copy just like would happen on a local network share. Overwriting madness ensued. You can't blame them, there was no indication that it would work differently. Asking them to understand FTP is like accounting expecting me to fully understand the accounting rules behind my IT purchases.

    2) A manager where I used to work had an Excel spreadsheet with payroll data for the entire company. He wanted to send each department their subset of the data. So he filtered his spreadsheet and sent the filtered lists to each department, not knowing that he was sending each department the whole list under the covers. Luckily, the file was 30MB and choked in the mail server and I was able to bail him out of that huge mistake. But you really can't blame him - he saw something on the screen and sent "it". There should be an indication of underlying data. BTW, doing a cut and paste special made each file about 25k or so.

    Same thing with this PDF error. If your file shows certain information, it should contain that information only or indicate (or warn) otherwise.

    By "simplifying" everything, nobody knows what's really going on. A couple times per week I have to explain some type of issue to some user about how "It's really more complicated than that, see Windows (or an app) hides this from you." Users roll their eyes as their simple task has become obscurely complicated - all in the name of making things "easier" to understand, ironically.

    If something works different, it should be displayed different - that at least gives the user a chance to question what they are doing.
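The payroll story in item 2 comes down to the difference between hiding rows and writing out a fresh copy that never contained them. A minimal sketch of the second approach, assuming made-up data: the `export_visible` helper and the payroll rows are hypothetical, and only Python's standard `csv` and `io` modules are used.

```python
import csv
import io


def export_visible(rows, keep):
    """Write only the rows that pass the filter into a fresh CSV string.

    Rows that fail the filter are never written, so nothing is left
    hiding in the result for the recipient to un-filter.
    """
    out = io.StringIO()
    writer = csv.writer(out)
    for row in rows:
        if keep(row):
            writer.writerow(row)
    return out.getvalue()


# Hypothetical payroll data: (name, department, salary)
payroll = [
    ("Alice", "Sales", 50000),
    ("Bob", "Engineering", 65000),
    ("Carol", "Sales", 52000),
]

sales_only = export_visible(payroll, lambda row: row[1] == "Sales")
print(sales_only)
```

This is roughly what the "cut and paste special" workaround achieved: the new file is built from what should be visible, instead of being the full file with a filter laid over it.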
  • by blackstripe (635857) on Thursday June 22 2006, @11:14AM (#15583146)
    Assuming the original document was in Word format, I'm surprised they didn't use Microsoft's freely available redaction add-in [microsoft.com].
    • by Billosaur (927319) * <wgrother@opt o n l i n e.net> on Thursday June 22 2006, @09:48AM (#15582473) Journal

      You would think that people would have learned after the first time around. Apparently not.

      You're giving people too much credit; as has been noted in this forum many times, the average computer user is not exactly bright and doesn't read Slashdot, so they would have no idea that this is a problem. People just assume that if something appears to work a certain way, it in fact works that way.

      • by jimktrains (838227) on Thursday June 22 2006, @09:55AM (#15582528) Homepage
        "Human beings, who are almost unique in having the ability to learn from the experience of others, are also remarkable for their apparent disinclination to do so." - Douglas Adams
      • by gstoddart (321705) on Thursday June 22 2006, @10:20AM (#15582716) Homepage
        You're giving people too much credit; as has been noted in this forum many times, the average computer user is not exactly bright and doesn't read Slashdot

        You're giving people too little credit. Most people who use computers are probably fairly bright -- they're lawyers, doctors, accountants, and all sorts of things most people on Slashdot can't do. Reading Slashdot doesn't make you bright (in fact, given much of the drivel, just the opposite.)

        But, they expect computers to work like a friggin' toaster, and to them, if the text is blanked out, it's not readable. They're not going to realize the 'black' is a representation of a rectangle in a different document layer, and that the actual internal tree of the PDF still contains the actual text. Really, how could they?

        They understand computers by metaphor and analogy to the real world. They don't know or care about the actual internal stuff. Since the paradigms have been done to look like the real-world, these people assume that the rest of the things also apply.

        Many people use computers who don't have a full grasp on all of their intricacies. However, I haven't looked inside of a TV in 20+ years, but I'm comfortable using one.

        Cheers
        • Acceptance of Risk (Score:5, Insightful)

          by Kadin2048 (468275) <slashdot@kadin.xoxy@net> on Thursday June 22 2006, @10:39AM (#15582875) Homepage Journal
          While you make a good point, the people who have to use computers to accomplish their jobs, but do not make an attempt to understand how they work (and just treat them like "black boxes") are taking an enormous risk. They are hitching the metaphorical wagon of their livelihood to a team of horses that they don't know shit about.

          If you were somebody who made your living in television, but didn't understand anything about it, you would likewise be taking a great risk. You might, for instance, look like a big idiot when you show up to work at your anchor desk wearing a horizontally pinstriped shirt (which looks like ass on TV because of the Moire effect between the lines on the shirt and the TV scanlines). If you had understood the technology a little better, you might not have done that. That's a trivial example -- undoubtedly if you were a TV anchor, you'd learn or be told at some point not to wear a shirt like that without having to learn about scanlines -- but I hope you see my point.

          Whenever you use a technology without learning about it, you accept a certain amount of risk. Sometimes, you gamble and win: you just use the technology, get your job done, and nobody's the wiser. You're faster, more efficient, more competitive, you look like a hero to your boss, whatever. But if the technology doesn't work, then you're SOL -- but that's the price you pay for not understanding it. That's the risk you accepted when you said to yourself "eh, I don't really care what goes on inside there."

          In the case of PDF, we have a lot of people using a certain technology without knowing anything about how it works, and thus -- like the TV anchor in his pinstriped shirt (or a weatherman wearing chroma-key blue or green) -- you get these gaffes.

          I'm not saying that everybody needs to learn about how everything they use all day works, down to the bare metal. Virtually nobody needs to know that, except perhaps people who are doing things that are so dangerous that they can't afford to fuck up. However, people should be aware of the tradeoff they're making and the risk they're accepting when they forgo figuring out the internal details of a system and simply accept it as a whole, on faith that it will always work a certain way. As long as people are aware of that decision, and make it consciously, and accept the results, you can't ask for more.

          Generally speaking: faith is a fine thing, as long as you know when you're relying on it. It's when you thought you were relying on something else, and find out that you had nothing but faith, that a problem has occurred.
    • by The Only Druid (587299) on Thursday June 22 2006, @10:08AM (#15582629)
      "Redacted" is a legal term of art (i.e. it has a special meaning in the legal context).

      For lawyers/courts/etc., redacted (Per Black's Legal Dictionary) means:
      redaction, n. 1. The careful editing of a document, esp. to remove confidential references or offensive material. (Cases: Criminal Law 663; Federal Civil Procedure 2011; Trial 39. C.J.S. Criminal Law 1210-1211; Trial 148-153.) 2. A revised or edited document. -- redactional, adj. -- redact, vb.


      The lesson here is this: if you see a word used in a legal context (or any professional context) and it sounds entirely wrong...ask yourself first whether it might have a special meaning before complaining.
    • They're correct. (Score:5, Informative)

      by Kadin2048 (468275) <slashdot@kadin.xoxy@net> on Thursday June 22 2006, @10:09AM (#15582640) Homepage Journal
      Their use of redact is completely correct.

      If I am releasing a document for publication and decide to remove information from it, this is redaction. It's editing for publication, which can include the removal of information. It could also include the addition of new information, but that's not what typically happens. Redaction can be a form of self-censorship, but it's not always the same.

      Censorship is when a third party, generally a person in authority, suppresses information which is considered objectionable. The 'authority' can be the same as the author (e.g. 'self-censorship'), or the suppression can be indirect -- it need not be editing per se.

      It's my understanding that "redact" is used only in reference to written documents that are being edited, while 'censor' is more general and can refer to anything. The terms are closely related, especially in their typical use, but they're not exactly the same. "Redact" is actually a more specific and precise word for what's going on in this instance. We can argue about whether censorship is also going on, but redaction definitely is.

      Anyway, arguing about definitions by citing dictionaries is always a bit pedantic, since dictionaries are not authoritative except as a historical reference: they can tell you what a word meant at the time the dictionary was written, but not what it means right now, since a word's definition is determined by its usage. All language is inherently arbitrary: they're just sounds we make or things we write down in order to convey ideas, and the relationship between the sounds/characters and ideas is not fixed, but infinitely variable. If everyone were to decide tomorrow that 'redaction' meant the same thing as 'censorship,' that's what it would mean, and next year's dictionaries would have to be updated to reflect that.
    • Re:This proves it: (Score:4, Insightful)

      by Svartalf (2997) on Thursday June 22 2006, @10:33AM (#15582822) Homepage
      Excuse me, any electronic format, unless it is a bitmap format, will have this problem unless
      all the viewers 100% honor the redaction as it's intended. In the case of a bitmap format,
      you can burn a black or white rectangle into the original image and then add an annotation
      a la TIFF's annotations that contains the original portion of the image that was redacted
      in an encrypted format so that it's difficult to expose the redaction- IF you need to have
      the redaction exposed. If not, you hand across the redacted image as-is without annotations.

      This has NOTHING to do with PDF or ODF at all- trying to make this a connection to these
      is bogus to say the least. In this case, I believe that the people doing it used the MS Office
      redaction capabilities and then exported the redacted content to PDF, and the export
      carried the same sort of redactions across to the other format. What happened is because
      someone didn't understand the tools they were using, not because of PDF or ODF.
      • I am pretty sure that rasterized PDF documents violate government disability-access guidelines, since they can't be read with screenreaders, braille terminals, or basically anything other than a set of human eyes (or a good OCR program).

        They would be a lot better off going through the document in Word (or Notepad/TextEdit/vi/EMACS/whatever) and just selecting the regions of text that they want to remove, and replacing them with [-- TEXT REMOVED --] or even [REDACTED]. If they were really slick, I'm sure somebody could write a little macro to replace the text with an equivalent number of characters of whitespace or random text or dashes, to preserve formatting. (Okay, so to really preserve the formatting it would have to be replaced with characters that have the same width as the deleted characters; maybe there's a font-set containing various widths of whitespace characters that they could use? In TeX it would be trivial.)

        The results would be ugly (but really, were black bars ever very beautiful?) but at least it would actually remove the information, and wouldn't result in an inaccessible, rasterized document.
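The "little macro" idea above, replacing the removed text with the same number of filler characters so the layout holds, might look something like the following. It is a sketch only: `redact_spans` is a hypothetical helper, it works on plain text by character offsets, and it assumes a monospaced layout (the proportional-width problem raised above is not handled).

```python
def redact_spans(text, spans, fill="-"):
    """Replace each (start, end) character span with filler of equal length.

    Newlines inside a span are kept so that line breaks survive.
    The original characters are overwritten, not merely covered.
    """
    chars = list(text)
    for start, end in spans:
        for i in range(start, min(end, len(chars))):
            if chars[i] != "\n":
                chars[i] = fill
    return "".join(chars)


line = "Paid to Verizon: $2,500 per switch."
secret = "$2,500"
start = line.index(secret)
redacted = redact_spans(line, [(start, start + len(secret))])
print(redacted)
```

One tradeoff worth noting: equal-length filler keeps the formatting but also tells the reader exactly how long the removed text was, which a fixed marker like [REDACTED] does not.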