Xerox Photocopiers Randomly Alter Numbers, Says German Researcher 290

Posted by timothy on Tuesday August 06, 2013 @08:47AM from the we-think-you-meant-9 dept.

First-time accepted submitter sal_park writes "According to a report from German computer scientist D. Kriesel, some Xerox WorkCentre copiers and scanners may alter numbers that appear in scanned documents. Having analyzed the output of two such devices, the Xerox WorkCentre 7535 and 7556, Kriesel found that 'patches of the pixel data are randomly replaced in a very subtle and dangerous way': in particular, some numbers appearing in a document may be replaced by other numbers when it is scanned."

This discussion has been archived. No new comments can be posted.


These numbers are not the true numbers (Score:5, Funny)

by hawkinspeter ( 831501 ) writes: on Tuesday August 06, 2013 @08:50AM (#44485185)

So, it has come to this.

- Re:These numbers are not the true numbers (Score:4, Funny)
  
  by durrr ( 1316311 ) writes: on Tuesday August 06, 2013 @09:12AM (#44485343)
  
  The dark lord is touching the world, and he's doing it through photocopy machines.
  I would've expected printers or those cheap ISP-provided routers to be his preferred way of evildoing, though I guess even he/it couldn't get those to work properly.
  
  - Re: These numbers are not the true numbers (Score:5, Funny)
    
    by rickb928 ( 945187 ) writes: on Tuesday August 06, 2013 @10:54AM (#44486351) Homepage Journal
    
    The Dark Lord uses SAP to interact with our world. You know nothing?
    
- Re: (Score:2, Informative)
  
  by Joce640k ( 829181 ) writes:
  
  Too much XKCD?
  https://xkcd.com/1022/ [xkcd.com]
  - Re:These numbers are not the true numbers (Score:5, Insightful)
    
    by Zaatxe ( 939368 ) writes: on Tuesday August 06, 2013 @11:34AM (#44486793)
    
    Too much XKCD?
    There is no such thing as "too much XKCD".
    
    - Re:These numbers are not the true numbers (Score:5, Funny)
      
      by davidbrit2 ( 775091 ) writes: on Tuesday August 06, 2013 @01:29PM (#44488275) Homepage
      
      Maybe before we rush to adopt XKCD, we should stop to consider the consequences of blithely giving this technology such a central position in our lives.
      
- - - - Re: (Score:2)
        
        by somersault ( 912633 ) writes:
        
        What exactly are you referring to with your "they" and "their"? Because his post was implying that a bunch of Europeans imported a bunch of Africans. Though actually the Europeans constitute a fair amount of diversity too, so bringing slavery into it was just trolling.
Slashdot affected as well (Score:5, Funny)

by Anonymous Coward writes: on Tuesday August 06, 2013 @08:54AM (#44485211)

Kriesel found that âoepatches of the pixel data are randomly replaced in a very subtle and dangerous wayâ
Slashdot users are advised not to use Xerox copiers for submissions.

- Re:Slashdot affected as well (Score:5, Informative)
  
  by J'raxis ( 248192 ) writes: on Tuesday August 06, 2013 @09:01AM (#44485245) Homepage
  
  That bug is caused by Slashdot still refusing to implement this 20-year-old technology [wikipedia.org]. I mean, this being some sort of cutting-edge tech blog and all, who'd expect them to properly support a character-encoding technology that came out two decades ago?
  
  - Re:Slashdot affected as well (Score:5, Funny)
    
    by intermodal ( 534361 ) writes: on Tuesday August 06, 2013 @09:06AM (#44485295) Homepage Journal
    
    Especially with such an international audience.
    
    - Re:Slashdot affected as well (Score:4, Insightful)
      
      by Anonymous Coward writes: on Tuesday August 06, 2013 @10:28AM (#44486045)
      
      Especially with such an international audience.
      You must have missed the memo. Slashdot is a US site that tolerates international visitors. These are not, however, encouraged to return.
      
  - Re: (Score:2)
    
    by Minwee ( 522556 ) writes:
    
    If that technology is too arcane, perhaps this helpful tool [fourmilab.ch] might be useful.
    On the other hand, it might backfire and wipe out half of the site's users, so maybe that's not such a good idea.
  - - - Re:Slashdot affected as well (Score:4, Informative)
        
        by Mr Z ( 6791 ) writes: on Tuesday August 06, 2013 @09:38AM (#44485579) Homepage Journal
        
        No, just significantly harder to filter effectively. Also, there were a rash of troll accounts with names that looked like the various Slashdot editors, only using accented variants of letters, such as 'tÍmothy'. All those shenanigans added up to where we are today.
        
        
        Re: (Score:2)
        
        by xaxa ( 988988 ) writes:
        
        No, just significantly harder to filter effectively. Also, there were a rash of troll accounts with names that looked like the various Slashdot editors, only using accented variants of letters, such as 'tÍmothy'. All those shenanigans added up to where we are today.
        So filter usernames and email addresses for ASCII, perhaps filter comments for UTF8 basic type 'Graphic' and \n.
        Problem solved? http://slashdot.jp/ [slashdot.jp] supports Unicode.
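The filtering suggested above could look roughly like the following. This is only a sketch under assumptions: the function names are hypothetical, "Graphic" is read as the Unicode general categories L, M, N, P, S plus the space separator Zs, and nothing here reflects how Slashdot's own code works.

```python
import unicodedata


def is_ascii_username(name):
    """Accept only non-empty, printable, pure-ASCII usernames (hypothetical rule)."""
    return name != "" and name.isascii() and name.isprintable()


def filter_comment(text):
    """Keep newlines and 'graphic' characters; drop controls, format chars, etc."""
    kept = []
    for ch in text:
        category = unicodedata.category(ch)
        # Letters, marks, numbers, punctuation, symbols, and plain spaces survive.
        if ch == "\n" or category[0] in "LMNPS" or category == "Zs":
            kept.append(ch)
    return "".join(kept)
```

Under these rules a lookalike name such as 'tÍmothy' is rejected as a username, while curly quotes and accented letters still pass through in comment text.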
        
        Re: (Score:2)
        
        by NatasRevol ( 731260 ) writes:
        
        That sounds like a whole lot of whining on the editors' part.
    - - Re: (Score:2)
        
        by gl4ss ( 559668 ) writes:
        
        Actually the goatse ascii art is done completely in ASCII. I've never seen a UTF-8 version of the goatman.
        yeah.. and what matters to ascii art is getting a monospace font. and they do provide that. so what the fuck. at least have them on the story submissions.
        
        /xenu\. . . . . \rulz/ ' ' ' ' '
  - - Re: (Score:3)
      
      by NatasRevol ( 731260 ) writes:
      
      Sorry, but you have that exactly backwards.
      Online publishing is a blight on smart quotes.
      If your publishing can't handle smart quotes, then stop publishing. All they are is a different character. Deal with it properly or GTFO.
      - Re: (Score:2)
        
        by zieroh ( 307208 ) writes:
        
        Sorry, but you have that exactly backwards.
        Online publishing is a blight on smart quotes.
        If your publishing can't handle smart quotes, then stop publishing. All they are is a different character. Deal with it properly or GTFO.
        Why bother? Smart quotes add no value to the document. They are just fluff. I would think that any self-respecting slashdot reader would immediately see through such silliness.
        Oh, wait...
        
        Re: (Score:2)
        
        by omnichad ( 1198475 ) writes:
        
        It actually looks better. We shouldn't go back 100 years or more on typography just to placate technical users who can't be bothered to deal with it. They're only "smart" quotes if the publishing software picks the left or right quotation mark for you automatically. They are just standard left quote and right quote characters otherwise. Characters that weren't on early computers because we were so bit-frugal, not because they weren't already in use in typography.
        The only time it's a bad thing is if you'
        
        Re:Slashdot affected as well (Score:5, Informative)
        
        by tibit ( 1762298 ) writes: on Tuesday August 06, 2013 @10:37AM (#44486149)
        
        Just in case people miss the obvious: The differing opening and closing quotes are the correct punctuation marks. It was only due to the typewriters and teletypes that the mangling into one quote has begun. The MS Office quotes are not "smart", they are merely correct.
        
        
        Re: (Score:3)
        
        by J'raxis ( 248192 ) writes:
        
        "Smart parentheses" add no value to a document, either. They're just fluff. We should start using | for both opening and closing parentheses, no? We could even use the same symbol in place of "smart brackets" and "smart braces."
        
        Re:Slashdot affected as well (Score:5, Funny)
        
        by nmb3000 ( 741169 ) writes: on Tuesday August 06, 2013 @12:17PM (#44487355) Journal
        
        "Smart parentheses" add no value to a document, either. They're just fluff. We should start using | for both opening and closing parentheses, no?
        Wow, you've somehow managed to make Lisp even more difficult to read
        |defun proj |y x||+|*|flet ||ip |x y||sum |* x y|||||* |/|ip x y||ip x x||x||x|y||
        Congratulations are in order, but I'm sure people will still keep using it :|
        
      - "Windows" and "American" etymological fallacies (Score:2)
        
        by tepples ( 727027 ) writes:
        
        Windows-1251 character codes do not belong on the internet. Many people now a days don't use windows.
        
        For one thing, using Windows code pages does not require Windows. They are well-defined encodings of a subset of Unicode. If I were to apply the same etymological fallacy to your suggestion to stick to the American Standard Code for Information Interchange, it might look like this: "Many people now a days don't live in America." A lot of languages don't easily map to just the Basic Latin block (U+0020 through U+007E). For example, in Spanish, "esta" means "this" while "está" means "is currently" or "is
        
        Re:Slashdot affected as well (Score:5, Informative)
        
        by J'raxis ( 248192 ) writes: on Tuesday August 06, 2013 @11:18AM (#44486591) Homepage
        
        The typo in the article evidences that they were using UTF-8. If a quotation mark is turned into three separate characters, that's the tell-tale that it was UTF-8 (multibyte) and not a Windows code page (all single-byte encodings).
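The three-character tell-tale can be reproduced in a few lines. This sketch assumes the garbling comes from UTF-8 bytes being read back as Windows-1252; the exact pipeline on Slashdot's side is not known here.

```python
left_quote = "\u201c"  # LEFT DOUBLE QUOTATION MARK

# In UTF-8 the quote is a three-byte sequence.
utf8_bytes = left_quote.encode("utf-8")

# Decoding those bytes as a single-byte code page yields three characters.
garbled = utf8_bytes.decode("cp1252")

# In Windows-1252 itself the same quote is a single byte.
cp1252_bytes = left_quote.encode("cp1252")

print(len(utf8_bytes), len(garbled), len(cp1252_bytes))
```

Had the text been a Windows code page from the start, a mis-decode would have produced one wrong character per quote, not three.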
        
    - Re: (Score:3)
      
      by J'raxis ( 248192 ) writes:
      
      Yes, professional-looking typography is such a blight. Instead we should use kludges invented for typewriters and held over since the 1960s in computer charsets because of 7-bit character size limitations.
      Perhaps we should go back to using 'O' for '0' and 'l' for '1', too.
    - - Re:Slashdot affected as well (Score:4, Informative)
        
        by dolmen.fr ( 583400 ) writes: on Tuesday August 06, 2013 @10:30AM (#44486073) Homepage
        
        Slashdot uses Perl, which is the programming language that has the best support for Unicode [98.245.80.27] (while PHP support for this is comparatively almost nonexistent).
        But that doesn't make Unicode work magically. The slashcode [slashcode.com] has to take it into account.
        
  - - Re: (Score:2)
      
      by operagost ( 62405 ) writes:
      
      ISO should release a UTF-8.1 standard. They'll all adopt it immediately.
      "My browser uses UTF-8.1. You probably haven't heard of it."
      "I used UTF-8.1 before it was cool."
- Re: (Score:2)
  
  by nospam007 ( 722110 ) * writes:
  
  "Slashdot users are advised not to use Xerox copiers for submissions."
  Imagine what Excel, rounding problems and this copy-machine could do to the economy of your enemies.
- Re: (Score:2)
  
  by omnichad ( 1198475 ) writes:
  
  And how will they keep up with making all the dupe posts? The summary is supposed to be grossly wrong here anyway.
oh man, what a mess (Score:5, Informative)

by Trepidity ( 597 ) writes: <delirium-slashdot AT hackish DOT org> on Tuesday August 06, 2013 @08:54AM (#44485213)

Some of these machines have been used for digitizing documents whose originals were later shredded, so some people now have subtly wrong "original" digitals. It's particularly problematic because of the nature of degradation; usual lossy degradation of images is in a non-semantic way, just produces blurring or blocking or other kinds of artifacts, not OCR-error style mistakes.
The issue here seems to be the lossy mode of JBIG2 [wikipedia.org], which tries to find patches of the image that approximately match, and consolidates them. The idea seems to be that if the letter "e" appears 5000 times in a document in the same typeface, you just store some version of it once, and then reference it everywhere it appears. But now you get OCR-style errors, if you end up matching some patches to incorrect partners. You have your lightly printed "8" replaced by the "0" patch now and then, that kind of thing. And unlike people doing OCR, who know they need to take this into account, the operators of these machines likely had no idea this was even a possible failure mode to watch for, so who knows how many numbers are wrong in miscellaneous documents (letters are a little less problematic, because most random letter mutations don't destroy meaning).
Blargh.
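A toy sketch of the patch-matching idea described above. It is not the JBIG2 algorithm and not Xerox's implementation; the 3x5 bitmaps, the `max_diff` threshold, and the first-match rule are all assumptions made for illustration.

```python
def hamming(a, b):
    """Count the pixels that differ between two equally sized bitmaps."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def compress(glyphs, max_diff):
    """Store each distinct-enough glyph once; everything else becomes a reference."""
    symbols, refs = [], []
    for glyph in glyphs:
        for index, symbol in enumerate(symbols):
            if hamming(glyph, symbol) <= max_diff:
                refs.append(index)  # "close enough": reuse the stored patch
                break
        else:
            symbols.append(glyph)
            refs.append(len(symbols) - 1)
    return symbols, refs


def decompress(symbols, refs):
    """Rebuild the page by pasting the stored patch at every reference."""
    return [symbols[i] for i in refs]


EIGHT = ("###", "#.#", "###", "#.#", "###")
SIX = ("###", "#..", "###", "#.#", "###")  # one pixel away from EIGHT
page = [EIGHT, SIX, EIGHT]

lossless = decompress(*compress(page, max_diff=0))
lossy = decompress(*compress(page, max_diff=1))
```

With `max_diff=0` the page round-trips exactly. With `max_diff=1` the 6 comes back as a clean, artifact-free 8, which is the failure mode in question.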

- Re:oh man, what a mess (Score:5, Insightful)
  
  by iguana ( 8083 ) * writes: <davepNO@SPAMextendsys.com> on Tuesday August 06, 2013 @09:08AM (#44485317) Homepage Journal
  
  Could also be a problem with an overly aggressive hole filling algorithm. http://www.mathworks.com/help/images/ref/imfill.html [mathworks.com]
  I'd expect there's nothing nefarious going on. It's very likely an overly aggressive image processing algorithm.
  
  - Re:oh man, what a mess (Score:5, Interesting)
    
    by Anonymous Coward writes: on Tuesday August 06, 2013 @09:32AM (#44485529)
    
    While it isn't nefarious so far as a deliberate plot to destroy documents and their integrity, it is a bug that is of concern for those who want to preserve documents for long-term storage in an archival situation.... such as was the case with the architectural documents being scanned.
    Keep in mind that in some archival situations, the original paper documents are destroyed, so the scanned versions in these files are all that remains of those documents. Ultimately, having the numbers change like this, regardless of why it is happening, throws serious doubt on the validity of any of the numbers in that document. This can have an enormous set of consequences if you are using this scanned document as a receipt, for banking purposes (e.g. the check amount might have a different number than was originally written) or in other similar kinds of situations. Engineering offices, banks, and a great many other businesses are shredding mountains of paper and archiving those documents electronically, so it is a big deal.
    I guess it really boils down to understanding the limitations of compression algorithms, and not buying into a vendor's hype that you can save all kinds of storage space with this incredible algorithm.... only to find out that all of your documents are worthless when you try to submit them to a judge & jury in a lawsuit as evidence. Perhaps an engineer needs to find the dimensions and tolerance limits of a bolt in an obscure subsystem... and the numbers change? Do you really want to fly in an airplane where the parts specifications have changed because of an error like this? Do you mind if a few hundred or even thousand dollars are taken out of your bank account that you didn't authorize?
    
  - Re:oh man, what a mess (Score:5, Funny)
    
    by Hatta ( 162192 ) writes: on Tuesday August 06, 2013 @09:33AM (#44485543) Journal
    
    That's what she said.
    
  - Re: (Score:3)
    
    by Agent0013 ( 828350 ) writes:
    
    Can't be a hole filling algorithm. The 8 that replaces the 6 still has the little dent on the left between the two round parts. It isn't just filling in the 6 to make an 8, it is actually replacing the 6 with a copy of the 8 from elsewhere on the page.
- Re: (Score:2, Insightful)
  
  by sh00z ( 206503 ) writes:
  
  The issue here seems to be the lossy mode of JBIG2
  combined with the fact that he's complaining about errors in scans of a 7-point font. At that size, it probably only takes two erroneous pixels to change a 6 to an 8.
  - Re:oh man, what a mess (Score:5, Informative)
    
    by Trepidity ( 597 ) writes: <delirium-slashdot AT hackish DOT org> on Tuesday August 06, 2013 @09:23AM (#44485449)
    
    Ran some numbers to check, and with some assumptions your estimate seems pretty close.
    The modern standard "postscript point" is 1/72 in, so a 7-point font has a height 7/72 inches. The stroke distinguishing the 6 from the 8 is maybe 1/4 of the height, so let's say ~0.025 inches. If the print/scan cycle roundtrips at somewhere in the range 75-150 dpi, that's 2-4 pixels. If you can manage a professional-standard 300 dpi, you get more like 7-8 pixels, but that's a fairly optimistic case.
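The estimate above can be checked with a short calculation. The one-quarter stroke fraction is the same rough guess used in the comment, and the function name is made up for this sketch.

```python
POINT_IN_INCHES = 1 / 72  # PostScript point


def stroke_pixels(font_pt, dpi, stroke_fraction=0.25):
    """Approximate pixel height of the stroke that separates a 6 from an 8."""
    return font_pt * POINT_IN_INCHES * stroke_fraction * dpi


for dpi in (75, 150, 300, 600):
    print(f"{dpi} dpi: {stroke_pixels(7, dpi):.1f} px")
```

For a 7-point font this gives roughly 1.8, 3.6, 7.3 and 14.6 pixels, in line with the 2-4 pixel range at 75-150 dpi and about 7 pixels at 300 dpi.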
    
    - Re:oh man, what a mess (Score:4, Interesting)
      
      by dj245 ( 732906 ) writes: on Tuesday August 06, 2013 @09:35AM (#44485563)
      
      Ran some numbers to check, and with some assumptions your estimate seems pretty close.
      The modern standard "postscript point" is 1/72 in, so a 7-point font has a height 7/72 inches. The stroke distinguishing the 6 from the 8 is maybe 1/4 of the height, so let's say ~0.025 inches. If the print/scan cycle roundtrips at somewhere in the range 75-150 dpi, that's 2-4 pixels. If you can manage a professional-standard 300 dpi, you get more like 7-8 pixels, but that's a fairly optimistic case.
      Why wouldn't you use at least 300dpi?
      
      Most "serious" office printers print at 600dpi or better, so the information is there. Even my $100 Brother laser printer defaults to 600dpi. Every recent office multifunction I have seen can scan at 200, 300, or 600dpi, but every single one defaults to 200dpi. 200dpi scans are hard on the eyes. I always scan at 600dpi, the file size isn't bad in the age of 300GB laptop hard drives, and if I need to send it to someone external to the company, I can always reduce the size.
      
      - Re: (Score:2)
        
        by N1AK ( 864906 ) writes:
        
        I always scan at 600dpi, the file size isn't bad in the age of 300GB laptop hard drives, and if I need to send it to someone external to the company, I can always reduce the size.
        In which case it raises the question: why bother using an algorithm that makes substitutions in the real content to save space if space isn't an issue regardless of what DPI you use? Clearly space saving was a consideration for someone ;)
  - Re: (Score:3)
    
    by v1 ( 525388 ) writes:
    
    I think the problem isn't so much the problem recognition, but the reproduction. It may be looking at two numbers that both look about the same, and using the same compressed data to draw both of them back out. Making them look identical. So if you started with two numbers, say one that was 70% like a 6 and 30% like a 8, and another that was 40% like a 6 and 60% like an 8, it's deciding they're "close enough" and is drawing the 70/30 image in both places. A human could figure out the second one was supp
- Re: (Score:3)
  
  by N1AK ( 864906 ) writes:
  
  I have to admit I'm actually really surprised by this. The idea and technology are good but I would think it fundamentally breaks a key feature of digitising a document: removing the need to keep the hard copy. The moment the digitised copy is more than an electronic representation of the physical document then the authenticity of anything in the digitised document is in doubt. Can it really be used to prove what someone read and signed for example, even if the chance of an error in any case is 1/10,000?
- Re: (Score:2)
  
  by nine-times ( 778537 ) writes:
  
  Thanks for the quick explanation. This is kind of hilariously unfortunate, since it has the potential to undermine the reliability of lots of documents.
- - Re:oh man, what a mess (Score:5, Informative)
    
    by Trepidity ( 597 ) writes: <delirium-slashdot AT hackish DOT org> on Tuesday August 06, 2013 @09:07AM (#44485297)
    
    Yeah, it's not OCR per se, but it operates on a somewhat similar principle to OCR, identifying which numbers are which and consolidating things it thinks are the same glyph. I agree it's much worse, because it alters the actual image. And it does so in a way that still looks plausible and "clean". Really bad lossy compression that just produced a lot of artifacts so that certain numbers were unreadable would at least telegraph that you shouldn't trust the result, but the numbers here look clean and artifact-free, they just happen to be wrong.
    
- - Re:oh man, what a mess (Score:5, Interesting)
    
    by Trepidity ( 597 ) writes: <delirium-slashdot AT hackish DOT org> on Tuesday August 06, 2013 @10:41AM (#44486191)
    
    It could just be a particularly poor JBIG2 implementation: the format and decompressor are standardized, but the standard doesn't specify how to find the matches, so various companies have their own proprietary versions.
    
JBIG2 (Score:5, Insightful)

by Anonymous Coward writes: on Tuesday August 06, 2013 @08:54AM (#44485217)

Caused by misconfigured JBIG2 compression. When pixel error rate is low enough, similar looking features get printed with the same subimage.

Problem with JBIG2, not OCR (Score:3, Insightful)

by Anonymous Coward writes: on Tuesday August 06, 2013 @09:05AM (#44485283)

Before anyone spreads wrong information: The problem is with the JBIG2 image compression algorithm used when scanning to PDF format. OCR has nothing to do with this. Also, TIFF format images are not affected as they don't use JBIG2.

- - Re: (Score:3)
    
    by barlevg ( 2111272 ) writes:
    
    He's not making an assumption--it says so right in the article.
  - The codecs commonly used with a container (Score:3)
    
    by tepples ( 727027 ) writes:
    
    In theory, TIFF is a container format for any image codec that has a TIFF embedding defined. In practice, TIFF is a container format only for those codecs supported by common TIFF viewers. To use your analogy to AVI, when people see "AVI", they think of the codecs commonly used with an AVI container, such as MPEG-4 ASP video and MPEG-1 Layer III audio back in the DivX era. I could wrap the obscure codec of PlayStation 1 or Game Boy Advance FMV in an AVI or MKV container, but there'd be no use because next t
see the Xerox user manual (Score:5, Informative)

by mejustme ( 900516 ) writes: on Tuesday August 06, 2013 @09:08AM (#44485311)

Quote: "Normal/Small produces small files by using advanced compression techniques. Image quality is acceptable but some quality degradation and character substitution errors may occur with some originals"
Source: http://www.cs.unc.edu/cms/help/help-articles/files/xerox-copier-user-guide.pdf [unc.edu]

- Re: (Score:2, Insightful)
  
  by Racemaniac ( 1099281 ) writes:
  
  thanks for mentioning where in the 328 page document you linked that is written :)
  - Re: (Score:3)
    
    by mejustme ( 900516 ) writes:
    
    That is why keyboards have CTRL+F. (Top of page 107.)
  - Re: (Score:2)
    
    by mwvdlee ( 775178 ) writes:
    
    Page 107.
    It literally took longer to download the PDF than it took to find the page by Ctrl+S.
  - Re: (Score:2)
    
    by h4rr4r ( 612664 ) writes:
    
    Try searching for that phrase. Should be pretty simple.
- Re: (Score:3, Informative)
  
  by Anonymous Coward writes:
  
  Interesting, since as far as I remember from reading about this issue yesterday, Xerox had not yet responded to this issue. Strange, since it's in the documentation.
  But then, reading the manual in context, the quote appears on pages 107, 129, and 179, which are the chapters "Fax", "Workflow Scanning", and "Save and Reprint Jobs" respectively.
  It's not in the chapter "Copying" (pages 39..63), so there is no excuse that this issue occurs in simple copy mode.
  - Re: (Score:2)
    
    by NatasRevol ( 731260 ) writes:
    
    If their response is anything other than RTFM, they're dying.
- Re: (Score:2)
  
  by timeOday ( 582209 ) writes:
  
  Seriously, how did you happen to know about that?
- Re: (Score:2)
  
  by Rob the Bold ( 788862 ) writes:
  
  Quote: "Normal/Small produces small files by using advanced compression techniques. Image quality is acceptable but some quality degradation and character substitution errors may occur with some originals"
  Source: http://www.cs.unc.edu/cms/help/help-articles/files/xerox-copier-user-guide.pdf [unc.edu]
  Very interesting find, although that warning only appears in the "Fax" section of the manual, and not in the "Copy" or "Workflow Scanning" sections.
  - Re:see the Xerox user manual (Score:5, Informative)
    
    by Rob the Bold ( 788862 ) writes: on Tuesday August 06, 2013 @09:55AM (#44485713)
    
    Very interesting find, although that warning only appears in the "Fax" section of the manual, and not in the "Copy" or "Workflow Scanning" sections.
    AND I'd be wrong, it's in all three sections. Ctrl-F'ing in Okular only finds "character substitution" when the words are side-by-side, not split by a line break as they appear in the copying and scanning sections.
    That's way worse. Xerox knows about this, and just puts in a little note, rather than a big old: "WARNING: Normal/Small mode may produce undetectable text errors."
    And that type of warning should be defined in the beginning of the manual as "operations that may cause data transcription errors resulting in financial harm, damage to property, injury or death".
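The search miss described above, where a phrase split by a line break is not found, can be avoided by collapsing whitespace before matching. A minimal sketch, assuming the manual's text has already been extracted to a plain string; `contains_phrase` is a made-up helper, not a feature of any PDF viewer.

```python
import re


def contains_phrase(text, phrase):
    """Case-insensitive phrase search that treats any run of whitespace as one space."""
    def collapse(s):
        return re.sub(r"\s+", " ", s).strip().lower()

    return collapse(phrase) in collapse(text)


manual_excerpt = (
    "Image quality is acceptable but some quality degradation and character\n"
    "substitution errors may occur with some originals."
)

naive_hit = "character substitution" in manual_excerpt
robust_hit = contains_phrase(manual_excerpt, "character substitution")
```

Here the naive substring test misses the phrase because of the newline, while the whitespace-collapsing search finds it.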
    
- Re:see the Xerox user manual (Score:5, Insightful)
  
  by Atzanteol ( 99067 ) writes: on Tuesday August 06, 2013 @09:41AM (#44485593) Homepage
  
  That's "Normal" quality? That could be *very* misleading. If you have an option that has negative side-effects such as this then the option should be titled something to indicate the risk - "Super-compressed", "dangerously small" or the like.
  Though I'm surprised Xerox would even allow such a compression if such an obvious issue occurs. People would expect image quality to suffer - but full character substitution?
  
- Re: (Score:3)
  
  by petermgreen ( 876956 ) writes:
  
  The problem is that most people only read the manual when they discover something is wrong and there is no immediately obvious problem with the results of these scans. The problem only gets noticed much later when someone tries to work with the scanned information and discovers that it is readable but doesn't make sense.
  I also notice that the manual says that the other options give larger files with better image quality but does not state clearly whether compression algorithms that can cause character subst
- - Re:see the Xerox user manual (Score:5, Insightful)
    
    by Rob the Bold ( 788862 ) writes: on Tuesday August 06, 2013 @10:31AM (#44486093)
    
    Seems a little dangerous for that algorithm to be the default, doesn't it? Plus, burying the warning deep in the documentation.
    And an insufficient warning, at that.
    Something more like:
    Normal/Small Mode may not be suitable for documents where faithful reproduction of the original text, numbers or illustrations is critical. Examples would include legal documents (contracts, wills, articles of incorporation, etc.), medical documents (patient charts, orders, medication lists, etc.), financial documents (bills, invoices, statements, reconciliations, etc.), business documents (HR records, meeting minutes, memoranda, etc.), engineering documents (drawings, plans, change orders, instructions, bills of material, etc.) or any other document where incorrect data could result in financial loss, injury, death, property damage or destruction, legal liability, loss of reputation or other harm. These examples should not be considered an exhaustive list of documents not suited for scanning, copying or faxing using Normal/Small mode.
    would be more appropriate.
    
    - Re: (Score:3)
      
      by Chelloveck ( 14643 ) writes:
      
      "I think it'd be more appropriate if the box bore a great red label WARNING: LARK'S VOMIT!"
      I'm boggled. I can't believe any copier maker would use this algorithm for its default mode. Disk and bandwidth are cheap, the space savings can't possibly be worth the risk.
Proofreading @ Xerox Development? (Score:2)

by BoRegardless ( 721219 ) writes:

How could Xerox make copiers for this length of time and not have a proofreading algorithm that works with a super-resolution scan & no interpolation to "machine check" the final commercial copier as a way of quickly finding errors?
Internatlly, Xerox engineering had to know they were "correcting" pixels, rather than just "copying" them, so how did they verify their software?
- Re: (Score:3)
  
  by Fnord666 ( 889225 ) writes:
  
  How could Xerox make copiers for this length of time and not have a proofreading algorithm that works with a super-resolution scan & no interpolation to "machine check" the final commercial copier as a way of quickly finding errors?
  Internatlly(sic), Xerox engineering had to know they were "correcting" pixels, rather than just "copying" them, so how did they verify their software?
  They do know [slashdot.org] about it.
Free Speech (Score:5, Funny)

by BradyB ( 52090 ) writes: on Tuesday August 06, 2013 @09:11AM (#44485341) Homepage

Hey, even photo copiers and faxes need freedom of speech.

Known Xerox Issue..... in documentation (Score:5, Informative)

by Anonymous Coward writes: on Tuesday August 06, 2013 @09:13AM (#44485351)

If you read the documentation from XEROX... it claims that on scanning it is a known problem that "Image quality is acceptable but some quality degradation and character substitution errors may occur with some originals." page 107 from http://www.cs.unc.edu/cms/help/help-articles/files/xerox-copier-user-guide.pdf
also on page 129 we have the following: "Quality / File Size
The Quality / File Size settings allow you to choose between scan image quality and file size. These settings allow you to deliver the highest quality or make smaller files. A small file size delivers slightly reduced image quality but is better when sharing the file over a network. A larger file size delivers improved image quality but requires more time when transmitting over the network. The options are:
Normal/Small produces small files by using advanced compression techniques. Image quality is acceptable but some quality degradation and character substitution errors may occur with some originals."

- Re:Known Xerox Issue..... in documentation (Score:5, Insightful)
  
  by Chris Mattern ( 191822 ) writes: on Tuesday August 06, 2013 @09:30AM (#44485507)
  
  Now the question becomes: what moron made this setting the default? Maybe a setting that can undetectably corrupt your data can be provided if appropriate warnings are given, but it sure as hell should never be the default. I would've thought that was obvious.
  
- Re: (Score:3)
  
  by Nimey ( 114278 ) writes:
  
  So you're telling us this is a problem caused by a user not RTFMing and Slashdot sensationalized it?
  Surely you're joking. :P
  - Re:Known Xerox Issue..... in documentation (Score:4, Insightful)
    
    by MozeeToby ( 1163751 ) writes: on Tuesday August 06, 2013 @09:59AM (#44485759)
    
    Substitution errors shouldn't happen in corporate level scanning hardware, even if you bury a warning about it 107 pages into the 350 page manual. You can't have something that fundamentally makes your product not fit for purpose and claim that it's ok just because it's a known issue.
    
  - Re:Known Xerox Issue..... in documentation (Score:4, Interesting)
    
    by Rob the Bold ( 788862 ) writes: on Tuesday August 06, 2013 @09:59AM (#44485761)
    
    So you're telling us this is a problem caused by a user not RTFMing and Slashdot sensationalized it?
    Surely you're joking. :P
    I admit that I, for one, don't usually RTFM before using the copier. Certainly not when I'm using the copier in "Normal" mode.
    And don't call me "Shirley".
    
Wub fur (Score:2)

by Errol backfiring ( 1280012 ) writes:

They probably have some parts made of wub fur [wikipedia.org]. Those machines are more advanced than I thought!
I recognise the algorithm that gives those errors. (Score:2)

by SuricouRaven ( 1897204 ) writes:

I just spent ten minutes describing exactly how JBIG works here before noticing someone already realised what is happening and put it up on the page.
ImageRunner (Score:4, Funny)

by poofmeisterp ( 650750 ) writes: on Tuesday August 06, 2013 @09:26AM (#44485483) Journal

OMG, my Canon ImageRunners are doing the same thing! It must be a virus!
I'd better write up a research document on this and request some grant money.

Interesting (Score:4, Interesting)

by jones_supa ( 887896 ) writes: on Tuesday August 06, 2013 @09:31AM (#44485527)

The things you learn. I never knew before about JBIG2 and how scanners use it to repeat pieces of image. Seems to me that the JBIG2 parameters are tuned incorrectly in these scanners.

Corporate decision (Score:4, Funny)

by Dunbal ( 464142 ) * writes: on Tuesday August 06, 2013 @09:33AM (#44485537)

This was a decision by Xerox to get around ever being sued for copyright violations...

NSA BUG (Score:2, Funny)

by Sentrion ( 964745 ) writes:

It's just a bug in the NSA eavesdropping algorithm.
I can't understand (Score:3)

by joh ( 27088 ) writes: on Tuesday August 06, 2013 @09:46AM (#44485639)

how a compression that may lead to documents altered in such a way (numbers replaced by other numbers) can be considered fit for use in a photocopier. This can lead to very real, expensive and even dangerous problems down the line.

Surprised nobody asked this... (Score:4, Informative)

by ZorinLynx ( 31751 ) writes: on Tuesday August 06, 2013 @10:22AM (#44485959) Homepage

Why do we need such aggressive compression algorithms, algorithms that can make the data WRONG, in this day and age when storage and memory are so incredibly cheap?
This is not 1987 when every byte was precious and 1MB of RAM cost a hundred bucks. There is NO EXCUSE for this these days; just use PNG or JPG compression; at least those don't freaking CHANGE THE DATA!!

Self-Correcting Bug (Score:5, Funny)

by JeanCroix ( 99825 ) writes: on Tuesday August 06, 2013 @10:23AM (#44485983) Journal

I printed out the article in order to hang it on the wall above my office's Workcentre as a warning to coworkers. But apparently printing it fixed the problem, because the article headline became:
"Xerox scanners/photocopiers Scan Documents Flawlessly and are the Best in the Industry"

This is HUGE! (Score:5, Interesting)

by tekrat ( 242117 ) writes: on Tuesday August 06, 2013 @11:15AM (#44486563) Homepage Journal

This is how people get shot, because the police are given the wrong address to raid a house. This is how people get foreclosed on because a few account numbers are switched.
Holy crap. That makes me never want to go near a copier again.

Hello? QA? Hello!!! (Score:3)

by nanospook ( 521118 ) writes: on Tuesday August 06, 2013 @03:18PM (#44489905)

Something like this shouldn't have passed QA.. did we outsource or what?

- Re:Mission Impossible 4? (Score:5, Funny)
  
  by Entropius ( 188861 ) writes: on Tuesday August 06, 2013 @09:00AM (#44485241)
  
  That's Xenu, not Xerox.
  
  - Re: (Score:2)
    
    by CanHasDIY ( 1672858 ) writes:
    
    That's Xenu, not Xerox.
    Xenu... Xerox... Xenu-Rox?
  - Re: (Score:3)
    
    by AJH16 ( 940784 ) writes:
    
    That's what he said before he scanned it on his WorkCentre.
- Re:Some image smoothing algorithm... (Score:5, Informative)
  
  by Sponge Bath ( 413667 ) writes: on Tuesday August 06, 2013 @09:02AM (#44485257)
  
  This is not smoothing, distortion or individual bad pixels. Entire image patches are copied incorrectly, essentially repeating a scanned section containing one number over another part of the image containing a different number.
  
- Re:Anti-counterfeiting (Score:5, Insightful)
  
  by J'raxis ( 248192 ) writes: on Tuesday August 06, 2013 @09:03AM (#44485269) Homepage
  
  Maybe you should read the article.
  
  - Re: (Score:3, Funny)
    
    by Anonymous Coward writes:
    
    Huh?
    I'm sorry. I understand those 6 words individually. But when you put them in that order, they don't make any sense.
    Read? The? Article? You are not making any sense, man!
  - Re: (Score:3, Funny)
    
    by Anonymous Coward writes:
    
    I lack the proper attention span to read the article. Let's make a deal: I quickly skim through it, and soon return here with another completely wrong conclusion. Be back in 30 seconds.
  - Re: (Score:2)
    
    by niftydude ( 1745144 ) writes:
    
    That is asking too much of him - maybe if he just looked at the pictures in the article?
    - Re:Anti-counterfeiting (Score:5, Informative)
      
      by Anubis IV ( 1279820 ) writes: on Tuesday August 06, 2013 @10:53AM (#44486341)
      
      That's all I did, and I learned what they were talking about pretty quickly.
      It's actually pretty insane. They had architectural diagrams that had the square meters for the rooms copy/pasted by the scanner into other rooms. For instance, here were the room sizes for the three rooms on the diagram as reported on the original diagram and various scans of it (I've marked incorrect values with an asterisk):
      Original Diagram: 14.13m^2, 21.11m^2, 17.42m^2
      Xerox WorkCentre 7535 scan: 14.13m^2, 14.13m^2*, 14.13m^2*
      Xerox WorkCentre 7556 scan 1: 14.13m^2, 14.13m^2*, 14.13m^2*
      Xerox WorkCentre 7556 scan 2: 17.42m^2*, 21.11m^2, 17.42m^2
      Xerox WorkCentre 7556 scan 3: 14.13m^2, 14.13m^2*, 17.42m^2
      They have images of this happening. It's just outright substituting blocks of text from one part of a scanned image into an entirely separate part. Not just mangling pixels or uniformly displacing each by a few mm, but outright moving them into a different part of the image that was similar, yet slightly different. Maybe it's some sort of optimization or compression gone wrong? I.e. They detected a block that appeared to be the same as a previous one, so assumed they were the same and only kept one copy of that data?
      It's bizarre.
      
      - Re:Anti-counterfeiting (Score:5, Funny)
        
        by Anubis IV ( 1279820 ) writes: on Tuesday August 06, 2013 @11:28AM (#44486719)
        
        You came up with the exact same conclusion as the author of the article you just read:
        Hey now, there's no need to accuse me of reading the article just because I looked at the pictures.
        
- Probably an image quality enhancement fix. (Score:4, Insightful)
  
  by jellomizer ( 103300 ) writes: on Tuesday August 06, 2013 @12:00PM (#44487143)
  
  I expect the bug is because it is trying to clean up the scanned image, trying to account for what it thinks is missing data.
  14.13m^2, 21.11m^2, 17.42m^2
  It sees 3 blocks of information that probably look roughly the same to the software once it accounts for errors. The number of pixels used in each is fairly close. I expect the scanner sees the three blocks, thinks they are the same, and tries to find the block that seems the sharpest and reproduce it over the other spots.
  Scanning isn't pixel perfect, so you get a different match each time. So the image cleaning processor will probably clean the numbers differently on each scan.
  
- Re:Really? (Score:5, Insightful)
  
  by Sponge Bath ( 413667 ) writes: on Tuesday August 06, 2013 @09:04AM (#44485273)
  
  Scanning an article without comprehension and you're complaining about your misinterpretation. Really?
  
- Re:Really? (Score:5, Informative)
  
  by fuzzyfuzzyfungus ( 1223518 ) writes: on Tuesday August 06, 2013 @09:07AM (#44485305) Journal
  
  Scanning 7pt text at 200dpi with consumer level scanner technology and you're complaining about scan errors. Really?
  These 'errors' are substantially worse than ordinary scanner suckitude or lossy-compression legovision: JBIG2's pixel-block matching creates the potential for a block containing one character to be mis-identified and replaced with a block containing a different character.
  The replaced character will be exactly as legible as text elsewhere on the page, just entirely incorrect.
  If it were just the scan quality being lousy, or somebody turning, say, JPEG compression up to the point of pain, mangled characters would be obviously mangled. Not as good as being legible; but the issue is obvious. In this case, the errors will look as good as the rest of the document.
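The block-matching failure described above can be sketched with a toy example. This is only an illustration under stated assumptions: it is not a real JBIG2 codec and not Xerox's firmware, and the tiny glyph bitmaps, the `hamming`, `encode` and `decode` helpers and the `max_diff` threshold are all hypothetical names made up for the sketch. The idea is that each glyph is stored once and later glyphs that look "close enough" are replaced by a reference to the stored one.

```python
# Toy illustration only: NOT a real JBIG2 codec and not Xerox's firmware.
# The bitmaps, helper names and max_diff threshold are hypothetical.

SIX = ("111", "100", "111", "101", "111")
EIGHT = ("111", "101", "111", "101", "111")


def hamming(a, b):
    """Count the pixels that differ between two same-sized bitmaps."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def encode(glyphs, max_diff):
    """Store each glyph once; reuse an earlier one if it is close enough."""
    dictionary, indices = [], []
    for glyph in glyphs:
        for i, known in enumerate(dictionary):
            if hamming(glyph, known) <= max_diff:
                indices.append(i)  # reuse the stored patch instead of this one
                break
        else:
            dictionary.append(glyph)
            indices.append(len(dictionary) - 1)
    return dictionary, indices


def decode(dictionary, indices):
    """Rebuild the page by pasting the stored patch for every position."""
    return [dictionary[i] for i in indices]


page = [SIX, EIGHT, SIX]
exact = decode(*encode(page, max_diff=0))  # only identical patches are merged
lossy = decode(*encode(page, max_diff=1))  # one differing pixel is tolerated

print(exact == page)             # True: nothing was substituted
print(lossy == [SIX, SIX, SIX])  # True: the 8 silently became a crisp 6
```

In this sketch the 6 and the 8 differ by a single pixel, so a tolerance of one pixel is enough to swap them, and the decoded output shows a perfectly sharp but wrong digit rather than a visibly mangled one.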
  
  - Re: (Score:2)
    
    by jeffmeden ( 135043 ) writes:
    
    Scanning 7pt text at 200dpi with consumer level scanner technology and you're complaining about scan errors. Really?
    These 'errors' are substantially worse than ordinary scanner suckitude or lossy-compression legovision: JBIG2's pixel-block matching creates the potential for a block containing one character to be mis-identified and replaced with a block containing a different character.
    The replaced character will be exactly as legible as text elsewhere on the page, just entirely incorrect.
    If it were just the scan quality being lousy, or somebody turning, say, JPEG compression up to the point of pain, mangled characters would be obviously mangled. Not as good as being legible; but the issue is obvious. In this case, the errors will look as good as the rest of the document.
    After actually looking at the images in TFA, it does seem like there is a problem with the way 6/8 and 4/7 are interpreted. However, you can't say that the results aren't quite noisy; I would look at a scan like that with a squinty eye and be super annoyed at the jerk who couldn't just procure the *original* electronic format. Just because the scanner "seems to do ok" on other equally tiny numbers doesn't make it right. Get the goddamn original file.
- Re:Really? (Score:5, Informative)
  
  by xaxa ( 988988 ) writes: on Tuesday August 06, 2013 @09:12AM (#44485345)
  
  Scanning 7pt text at 200dpi with consumer level scanner technology and you're complaining about scan errors. Really?
  Consumer level? This isn't a home, or even home-office, machine. It's sold on the website [xerox.co.uk] under the office section.
  
- Re: (Score:3)
  
  by Atzanteol ( 99067 ) writes:
  
  A $12,000 scanner/printer is "consumer level?"
- Re:Really? (Score:5, Informative)
  
  by UnknowingFool ( 672806 ) writes: on Tuesday August 06, 2013 @09:46AM (#44485649)
  
  If you read the article you would see it's not a simple case of scan error where a "13" appears blurry and looks like "B". Whole numbers are changed: 21.11 --> 17.42. This would be a major issue on a construction drawing, for example. A beam 4m too short would be a problem. Even if it were caught, the engineer signing off might have to go through a whole audit process.
  
- Re: (Score:3)
  
  by stjobe ( 78285 ) writes:
  
  Ah, my favourite Star Trek / Computer nerd pastiche:
  "I am Pentium of Borg. Division is futile, you will be approximated".
  Caused me endless mirth in the early nineties - and still does, although these days it's nostalgic more than funny.
- Re: (Score:3)
  
  by itsdapead ( 734413 ) writes:
  
  I work for Xerox. I specifically support these machines in a tier 3 capacity. I have not seen or heard a single case of this. My group handles calls from all of North America, and some South.
  Perhaps they're all trying to call the support number on the user guide that they just printed out... :-)
- Re:I call BS (Score:4, Informative)
  
  by Guy Harris ( 3803 ) writes: <guy@alum.mit.edu> on Tuesday August 06, 2013 @01:35PM (#44488361)
  
  I work for Xerox. I specifically support these machines in a tier 3 capacity. I have not seen or heard a single case of this.
  So does Francis Tse [xerox.com], and he's apparently heard of it.
  My group handles calls from all of North America, and some South.
  You might want to talk to somebody who handles calls from Western Europe - Germany [dkriesel.com], in particular.
  
