Pdf.js Reaches First Milestone

Posted by timothy
from the daddy-what-were-operating-systems? dept.
theweatherelectric writes "The pdf.js project aims to implement a PDF viewer using standards-compliant Web technologies. The project has reached its first milestone: it renders the sample PDF (a paper on Mozilla's Tracemonkey JavaScript engine) perfectly. However, that perfection currently comes with some caveats: 'pdf.js produces different results on pretty much every element in the browser×OS matrix. We said above that pdf.js renders the Tracemonkey paper "perfectly" if you're running a Firefox nightly. On a Windows 7 machine where Firefox can use Direct2D and DirectWrite. If you ignore what appears to be a bug in DirectWrite's font hinting. The paper is rendered less well on other platforms and in older Firefoxen, and even worse in other browsers. But such is life on the bleeding edge of the web platform.'"
This discussion has been archived. No new comments can be posted.

  • Even reading the summary it is clear that this is a very, very early development work. This is their *first* milestone, of course it's going to be severely lacking in almost every way. Of course it's not cross-browser and doesn't allow selectable text... but eventually it will be. I, for one, think this is a great idea, and can't wait to see it done!
  • by Anonymous Coward on Monday July 04, 2011 @10:34AM (#36652094)

    I can understand the use of this to find and fix browser bugs.

    But it seems amazingly inferior to a platform native PDF reader, on any platform imaginable. It will be slower than native x86/ARM code by far, and won't integrate well with the desktop environment.

    What's with this trend recently to build everything on fundamentally sucky technologies?

    • by DrXym (126579)

      What's with this trend recently to build everything on fundamentally sucky technologies?

      I think it's becoming increasingly obvious that browsers need something that allows native client functionality without the burden of shoehorning everything through Javascript's loosely typed, garbage-collected, non-addressable world. LLVM is gaining a lot of steam, so perhaps it should be that, with each app seeing a limited API that maps out onto the DOM. Perhaps that can even be created from JS, e.g. a vmEval(url, canvas) function that loads bitcode from some url, turns it into an invokable object wh

      • Great idea, the bit code could even execute inside a vm for cross platform compatibility and also be constrained by a sandbox.
      • by yarnosh (2055818)
        So... like Java applets?
        • by DrXym (126579)
          No. Java is higher level, garbage collected, has its own system libraries which run independently of the browser, and is a very large environment in its own right.

          I'm suggesting that there should be provision for LLVM bitcode to be compiled and executed natively in the browser. Its only interaction with the outside world is via exposed DOM APIs which are already security hardened. A canvas would be its "display", its sockets would map to websockets, its file I/O to web storage and so on. The display could

    • by jlebar (1904578)

      It will be slower than native x86/ARM code by far, and won't integrate well with the desktop environment.

      Does your PDF reader integrate well with the browser environment?

      One of the major benefits of rendering PDFs in the browser, aside from the fact that users don't have to download, trust, and run a separate PDF viewer, is that you reduce the security vulnerability surface area. PDF (well, Adobe Reader) is a major vector for attacks, but that goes away when you sandbox it in the browser.

      I think you might

      • by hedwards (940851)

        If it's a vulnerability thing, then what you really need to do is go over to Adobe and bitch smack the moron there that decided that it was a good idea to include scripting and linking abilities into a document format. And if you choose the Seattle branch you're just a short ways from MS so you can bitch slap the hell out of them for doing the same sort of bullshit with .DOC.

        Documents are for reading, if you want people to be able to fill in a form, then they should have to use a separate program. It's just

        • by sjames (1099)

          Oddly enough, PDF (a descendant of PostScript) has always been more program than data. Like PostScript, a pdf document is a program in a Forth like language that draws the document on a canvas. Adobe's mistake is in letting it out of the sandbox with bolted on extras.

          A number of others have managed to implement PDF sandboxes (often without the bolt-ons) without all the holes.
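To make the "program that draws on a canvas" point concrete, here is a minimal sketch in Python. It assumes only a tiny subset of the real PDF path operators (`m` = move to, `l` = line to, `S` = stroke); the function name `run_content_stream` and the returned segment list are invented for this illustration, and a real viewer handles many more operators and draws to an actual canvas.

```python
def run_content_stream(stream: str) -> list[tuple]:
    """Interpret a tiny subset of PDF path operators.

    Operands are pushed on a stack and an operator consumes them,
    which is the postfix style PDF shares with PostScript.
    Returns the stroked line segments instead of drawing them.
    """
    stack: list[float] = []    # pending numeric operands
    path: list[tuple] = []     # segments in the current, unstroked path
    stroked: list[tuple] = []  # segments that have been stroked
    current = None             # current point, set by "m" or "l"

    for token in stream.split():
        if token == "m":       # x y m : begin a new subpath at (x, y)
            y, x = stack.pop(), stack.pop()
            current = (x, y)
        elif token == "l":     # x y l : straight line to (x, y)
            y, x = stack.pop(), stack.pop()
            if current is None:
                raise ValueError("'l' used before 'm'")
            path.append((current, (x, y)))
            current = (x, y)
        elif token == "S":     # S : stroke the path, then clear it
            stroked.extend(path)
            path = []
            current = None
        else:                  # anything else is treated as a number
            stack.append(float(token))
    return stroked


segments = run_content_stream("10 10 m 100 10 l 100 50 l S")
print(segments)
```

Because the interpreter only ever appends to a list of segments, nothing in the stream can reach outside it, which is roughly the sandboxing property described above.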

    • by Vellmont (569020) on Monday July 04, 2011 @02:24PM (#36654442)


      But it seems amazingly inferior to a platform native PDF reader, on any platform imaginable. It will be slower than native x86/ARM code by far, and won't integrate well with the desktop environment.

      What's with this trend recently to build everything on fundamentally sucky technologies?

      You're absolutely right. A platform native PDF reader is technically superior. But opening up a new window for each PDF you display really sucks as a user experience. To eliminate this sucky UI experience, browsers would have to support PDF natively (I'm not sure why this hasn't happened), and not rely on Adobe Reader or some other helper application. Even if all the major browsers supported that TODAY, it would be literally years before a broad enough spectrum of people upgraded to use inline PDFs in a design.

      What implementing a PDF reader in javascript accomplishes is across the board inline PDFs today. No upgrades required. I think that's worth some sucky technology and inefficient code.

      • So are you asserting that a browser window with a PDF document is somehow worse than a PDF viewer's window? Or that a tab is better than a window?

        • by Vellmont (569020)

          I'm saying none of that. I'm saying that sometimes it would be very useful to display a PDF inline in the same page, and not have it displayed in another window, or tab. Another poster pointed out this is already possible. I'm not familiar with how well this works, and the limitations of this method. I will say that being able to treat a PDF like any other object and have it be manipulated programmatically would be a huge advantage for some people.

        • It would be nice if text-based PDF's interacted with my browser the same way text-based HTML does. So find, save as and other functions would be browser native rather than the kooky half-breed we have today (with pdf reader stuffed into a browser tab for some docs and mysteriously for other pdf docs a new pop up window with native pdf reader).

      • by pond0123 (784875)

        But opening up a new window for each PDF you display really sucks as a user experience.

        Having "defected" from Win XP to Mac OS X back when Vista was released, it's been many years since I used Windows or Linux for long periods of time, rather than temporarily in VMs for work purposes. Now and then, stories like this, or even entire pieces of technology like the renderer in question, remind me just how awful things still are on other platforms.

        Another poster asked if the PDF renderer was integrated into the browser, rather than the OS. What a bizarre question. My PDF renderer is integrated

        • It's not a kludge, it's not a bodged add-on, it's an extensible, intelligent, well integrated piece of technology that's part of a wider architecture that makes more sense than any other OS architecture I've seen above kernel level.

          Indeed. I came from a long line of AmigaOS based systems, after which I finally "gave up" and got an XP based laptop after being forced through Win2k at work. That drove me mad for a while and after significant playing with Desktop Linux (just never feels "right" to me... very happy with it on my servers, but not my desktop), now I've got mostly Macs in the house.

          The system you're referring to here is indeed very simple and elegant. It reminds me a lot of what AmigaOS did with the "datatype" system - Appl

      • by elgaard (81259)

        > But opening up a new window for each PDF you display really sucks as a user experience.

        My browser can show PDF in tabs.
        But I almost always want PDF-s to open in a new window, full-screen.
        Why would I want to read a 200 page report inside a browser?

        The same goes for video, I usually want to view it full-screen not embedded in some page in a browser.

    • But it seems amazingly inferior to a platform native PDF reader, on any platform imaginable. It will be slower than native x86/ARM code by far, and won't integrate well with the desktop environment.

      Regarding speed, two things: First, this will spend most of its time in calls to the browser's Canvas API, which all browsers implement in C++. So it isn't clear that it should be significantly slower than a native implementation. Second, even if this were in 100% JavaScript, that is just around 5X slower than C++ these days. Rendering PDFs might be plenty fast enough at that speed, since you typically render once then show it for a long time. In other words, this isn't something like a game engine that nee

    • by yarnosh (2055818)
      The real question is, if you know you have to display your PDFs in a web browser, why not just convert them to something more web friendly on the server side and then display that to the client? It isn't like you'd be using pdf.js as your PDF viewer for any PDF. It has to be embedded in a website for specific PDFs on that site.
    • by roca (43122)

      Have you tried it? It's fast.

      One interesting fact that wasn't called out in the blog post is that since Gecko's canvas is GPU-accelerated on Windows 7, on that platform pdf.js is GPU-accelerated, unlike every other PDF viewer I know of.

      Browser developers are doing tons of work to make graphics and JS incredibly fast. pdf.js leverages that.

    • The point is that once you can draw a PDF on the screen, you can draw anything. It means you can implement photoshop in Javascript. More importantly, it means you can draw something on the screen, and get it to render exactly the way you want it to, on any system. Right now this isn't possible in a browser, and it sucks.

      It is the PDF language that matters, which is basically a successor to Postscript, not the bloated document reader.
    • It's actually rather fast here, as fast as the native reader.
      Somehow, it's not as smooth in Chrome as it is in Firefox, however.

  • by hackertourist (2202674) <hackertourist.xmsnet@nl> on Monday July 04, 2011 @10:35AM (#36652104)

    I currently have PDFs set to be downloaded and opened in an external application, because PDF rendering in a browser tab (using Adobe's PDF plugin) fucks up important shortcuts: Cmd-W no longer closes the tab but throws up an annoying dialog. That alone would be reason enough to switch.

    • by Yvan256 (722131)

      Small note to webmasters everywhere (if you think about what the parent said): what I hate is websites that force PDF files to be downloads instead of letting my browser handle them. On Mac OS X, viewing a PDF is basically the same as viewing a JPEG. No Adobe reader required, it just works.

      • That capability is Safari's, nothing to do with OS X. Google Chrome can also display PDF's inline.
        • by Yvan256 (722131)

          OS X itself handles PDF just fine (Quick Look, Preview, Safari, print directly to PDF from any application that can print, etc).

          • by tlhIngan (30335)

            OS X itself handles PDF just fine (Quick Look, Preview, Safari, print directly to PDF from any application that can print, etc).

            That's because OS X's underlying display API is... display PDF! Similar to ye olde Solaris Display PostScript. As a side effect, display and generation of PDFs is trivial - you're outputting to a file rather than to the rasterizer.

            It's also the reason why PDFs are trivially displayed in iOS as well - again, being based on OS X means it also inherits display PDF.

      • what I hate is websites that force PDF files to be downloads instead of letting my browser handle them.

        The problem is that the web site incorrectly specifies the file's MIME type as e.g. "Content-Type: text/html" instead of "Content-Type: application/pdf". While in theory the ".pdf" extension or content inspection could be used to guess it, Firefox (for example) does not use MIME type guessing since it is a security issue: What should Firefox do with this file? [mozillazine.org].
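For webmasters wondering what the fix looks like, a minimal sketch in Python of building the response headers on the server side. The helper name `pdf_friendly_headers` and its `inline` flag are made up for this example, and it assumes filenames without quote characters; how the headers get attached depends on whatever server or framework is in use.

```python
import mimetypes


def pdf_friendly_headers(filename: str, inline: bool = True) -> dict[str, str]:
    """Build response headers that let the browser decide how to show a file."""
    # Guess the MIME type from the extension on the server,
    # so the browser does not have to guess on its own.
    mime_type, _encoding = mimetypes.guess_type(filename)
    if mime_type is None:
        # Unknown extension: fall back to a generic binary type.
        mime_type = "application/octet-stream"
    # "attachment" asks the browser to download the file;
    # "inline" leaves the choice to the browser or its PDF handler.
    disposition = "inline" if inline else "attachment"
    return {
        "Content-Type": mime_type,
        "Content-Disposition": f'{disposition}; filename="{filename}"',
    }


print(pdf_friendly_headers("paper.pdf"))
```

Sending `application/pdf` with an `inline` disposition is what lets a browser with a built-in or plugin viewer show the document in place; `attachment` is the usual cause of the forced downloads complained about upthread.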

    • by headbulb (534102) on Monday July 04, 2011 @11:00AM (#36652284)

      I have it disabled since it's buggy and it's a huge security risk.

    • Adobe's PDF sucks. I use either:
      a) Google Quick View (my favourite),
      b) Chrome's built-in PDF viewer (which is fast, and doesn't crash often, and doesn't hang everything while the PDF is being downloaded.), or
      c) Foxit's plugin (very rarely),
      depending on the browser and OS being used. But I tried it out, and though the rendering was horrible (Chromium daily on Natty), it didn't seem to hang or ask anything on being closed. The slide-out sidebar was neat, but the open file button didn't do anything.
    • by Malc (1751)

      I was hoping somebody around here might explain the point of opening PDFs embedded in the browser. Instead, your post just confirms my own prejudices. The PDF plugins that I've seen trade off screen space for another toolbar, restrict the functionality over standalone PDF viewer, and break the browser's UI. Chrome's handling of PDF was the single reason I ditched it after a few weeks last year when I tried to switch to it from Firefox (even set to open the PDF viewer was broken as it didn't seem to pass

  • by Anonymous Coward

    So, "Firefoxen" is now the plural of Firefox?

    • by Anonymous Coward

      Emacs --> emacsen; ergo, firefox --> firefoxen. Now, let's go back to comparing Officen and OSen.

      • by Maclir (33773) on Monday July 04, 2011 @12:23PM (#36653314) Journal

        No. The plural of "fox" is "foxes".

        If someone can't use the English language correctly, how seriously do you expect me to take anything they write?

        • Re: (Score:2, Insightful)

          by Anonymous Coward

          Despite your relatively low uid, you must be new to hacker slang.

          from the Jargon File [catb.org]:

          On a similarly Anglo-Saxon note, almost anything ending in ‘x’ may form plurals in ‘-xen’ (see VAXen and boxen in the main text). Even words ending in phonetic /k/ alone are sometimes treated this way; e.g., ‘soxen’ for a bunch of socks. Other funny plurals are the Hebrew-style ‘frobbotzim’ for the plural of ‘frobbozz’ (see frobnitz) and ‘Unices’ and ‘Twenices’ (rather than ‘Unixes’ and ‘Twenexes’; see Unix, TWENEX in main text). But note that ‘Twenexen’ was never used, and ‘Unixen’ was seldom sighted in the wild until the year 2000, thirty years after it might logically have come into use; it has been suggested that this is because ‘-ix’ and ‘-ex’ are Latin singular endings that attract a Latinate plural.

          Now get off my lawn.

          • by Tim C (15259)

            That's as may be, but *I* am not new to hacker slang, and frankly, it sounds stupid.

            Yes, the plural of ox is oxen. No, the plural of box is not boxen, nor is foxen the plural of fox.

            Now you get off *my* lawn.

      • by hedwards (940851)

        Actually proper English dictates that with Emacs and Firefox you'd need a partitive, making it versions of Emacs or versions of Firefox.

  • The one true markup (Score:3, Interesting)

    by Ed Avis (5917) <ed@membled.com> on Monday July 04, 2011 @10:50AM (#36652208) Homepage
    This is really cool. Now we just need to have web2js instead of web2c, and we can typeset documents with TeX in the browser.
    • I've written a clang plugin that translates C to JavaScript, so it should be possible to chain that with web2c to produce JavaScript...
    • Are you sure you want to do that? I can understand typesetting math in the browser, but typesetting entire TeX documents?
      There's already an AMS-endorsed way of typesetting TeX math (Javascript-based) called MathJax (http://www.mathjax.org/), and it works pretty well (well enough for sites like http://mathoverflow.net/ [mathoverflow.net]).

    • But TeX/LaTeX works by having a fixed page size. While one can make the vertical height large enough to accommodate the page, how do you adjust the width if a user resizes the browser?

  • I'm the one who finds this "We do all things now in the browser" highly suspect. I already have a perfectly fine Pdf viewer, called Okular. Why not just give me a link to the Pdf file, so that I can download it, use my favorite Pdf-Viewer and print it out if I like?

    I would really appreciate their efforts, but I just know this is not done for my convenience, but for corporate interests. The only reason this is developed is so that they can put some Ads inside the Pdf file, prevent me from downloading it or pr

    • Re: (Score:2, Informative)

      by paimin (656338)
      Bzzzzt! iOS handles viewing and saving PDF's fine. Thank you for playing "I Bashed Apple on Slashdot". Try again.
    • by tepples (727027)

      I already have a perfectly fine Pdf viewer, called Okular.

      From Okular's web site [kde.org]: "For Windows have a look at the KDE on Windows Initiative webpage for information on how to install KDE on Windows." The download page on KDE Windows Initiative [kde.org] links to detailed installation instructions [kde.org]. I'm not in a position to try it myself because the PC on which I'm typing this has integrated graphics, which isn't enough to run KDE according to a forum post [google.com] linked from a Google search for kde system requirements.

      • I'm not in a position to try it myself because the PC on which I'm typing this has integrated graphics, which isn't enough to run KDE according to some idiot who doesn't know what he's talking about.

        Fixed that for you. KDE 4 works perfectly with integrated graphics, you just have to turn desktop effects off. It's perfectly usable without desktop effects enabled, all applications detect it and degrade gracefully, and all the controls etc. work pretty much the same. I have a laptop with integrated graphics that doesn't support desktop effects, and I don't notice the difference apart from once a week or so when I suddenly wonder why my terminal emulator doesn't have a transparent background.

        This graceful

      • Oh, and I also use Okular on Windows. It works quite nicely.

      • by makomk (752139)

        Is this a troll, or did you just spectacularly miss the point? That forum post is fairly obviously about the system requirements for the KDE equivalent of Aero Glass or whatever it's called these days...

    • by Anonymous Coward

      The end game is that shifting focus from desktop applications to cloud applications makes the desktop operating system much less important.
      Envisage a day when you don't need to run a particular operating system just so you can run that one specific app.

      this might sound over the top - but i am sure that given time we will be able to play the new "Crysis" (whatever that might be) in the browser on any operating system. (of course there will still likely be some beefy hardware requirements and a juicy broadband). Although im fairly con

  • pdf is old vector graphics news. If they want to help a parky [google.com] out they can get TinySVG support built in to Firefox so I can finish rebuilding all of my XUL UI's in SVG. ...that don't work now unless the user knows how to re-enable support then ends up getting owned instead of a warning like getting a self signed cert... Cough. Sorry. Oh while I'm dreaming, getURL, putURL, and parseXML functions so I don't have to "if typeof (parseXML=='undefined')" override them every time would be nice too :) Oh and t

  • I find it quite hilarious that people speak seriously about coding artificial intelligence as if it will happen in this decade, when at the same time we can't even achieve a consistent rendering of the same elements in different browsers.

    • by Blobule (913778)
      These problems are generally disjoint. Identical cross browser rendering depends on everyone playing the standards game or everyone playing the let's add fixes for every browser game. The AI problem depends on solving the problem of genuine artificial intelligence without the need to pay attention to cross platform compatibility. That said, I doubt we'll see real AI this decade either :)
    • by Kz (4332)
      You're missing the goal: we need strong AI so that all web documents can be sentient. That way, they'll make a conscious effort to be usable on any kind of browser. See? It's not because AI is cool, it's our only hope!
  • Fun fact: (Score:5, Informative)

    by giuseppemag (1100721) <giuseppemag@g m a i l .com> on Monday July 04, 2011 @11:16AM (#36652396)
    I can render a PDF perfectly on all OSes I own (Windows, OS X, iOS, Windows Phone 7) already!
  • by FlyingGuy (989135) <flyingguy@g m a i l . com> on Monday July 04, 2011 @11:25AM (#36652504)

    This is just silly. While I can appreciate it from a point of curiosity and it is probably a fun project, this is really overloading the browser.

    I would submit that things like this are actively breaking the browser paradigm. Every PDF viewer allows you to save a local copy of the PDF after it has read it from the temp directory or the download directory. To implement this thing correctly, it would require that JS have direct access to the file system, which as I understand it, ain't fucking supposed to happen, since that would create untold numbers of security problems in a system already plagued by security problems.

    While there may be arguments that this would be ok, they would all be moronic.

    The entire notion of the browser needs to be forked out to an application shell with hard as nails security and a presentation shell and never the twain shall meet.

    • by jlebar (1904578)

      To implement this thing correctly, it would require that JS have direct access to the file system, which as I understand it, ain't fucking supposed to happen

      Too late. [w3.org]

      The entire notion of the browser needs to be forked out to an application shell with hard as nails security and a presentation shell and never the twain shall meet.

      What a novel [chromium.org] idea [webkit.org]!

    • Next step: implementing gecko or webkit in javascript, so that developers have the same experience everywhere.

      In fact, now that I'm thinking of it, that would not be such a bad idea.

    • by arose (644256)
      Or the save button could be a link to the PDF.
  • by Anonymous Coward

    Direct2D and DirectWrite? Sorry but browser graphical acceleration must end.
    WebGL should be implemented with no hardware acceleration, using graphic card emulation.

    • by Skuto (171945)

      Neither are related to WebGL specifically. They're used for much more mundane things such as Canvas rendering.

    • by BZ (40346)

      > Sorry but browser graphical acceleration must end.

      You can have two of the following three:

      1) Scrolling that doesn't feel like molasses.
      2) Box blur and shadow effects of various sorts.
      3) Software rendering backends.

      You're presumably voting for #1 and #3, but web designers are voting with their fingers for #2, so a browser's options at that point are limited to either not supporting those and reaping the web-developer hate consequences (cf. IE6) or dropping either #1 or #3. Which one do you think they're

  • The Javascript code isn't doing the rendering of text. It uses dynamically loaded fonts and lets the platform's own font renderer render the glyphs. The Javascript code isn't pushing pixels.

    There's less need for PDF than there used to be, now that you can download fonts in the browser. It might be worthwhile to take this PDF viewer and turn it into a server-side PDF to HTML translator.

    • by hey (83763)

      Thank you for saying that. It seems to me that displaying an HTML file isn't much different than displaying a PDF file. At a high level the program reads the description (e.g. this font size, put the text here, etc.) and hands it off to some renderer.

    • by tyrione (134248)

      The Javascript code isn't doing the rendering of text. It uses dynamically loaded fonts and lets the platform's own font renderer render the glyphs. The Javascript code isn't pushing pixels.

      There's less need for PDF than there used to be, now that you can download fonts in the browser. It might be worthwhile to take this PDF viewer and turn it into a server-side PDF to HTML translator.

      OS X is Display PDF and 10.7 is OpenGL 3.2 system-wide. PDF is extremely lightweight on OS X.

  • But WHY? Why spend precious cycles that eat battery life and heat up your PC innards doing the same thing through twenty layers of twisted human logic, that a piece of native runtime plugin code can do as well? 'Plugin' is just a word, it doesn't need to be insecure, alien or buggy. And even if they are that, the problem lies at another level.

    If anything, Pdf.js will be suitable where and when energy and resource conservation isn't a factor.

    As for me, I prefer to avoid all the extra layers of abstractions

    • by Vellmont (569020)


      Please enlighten me, a software developer of many years, what is this gold that is Pdf.js? I mean, apart from proof-of-concept being gold in itself.

      Gold is just a bit of an overstatement. More like a valuable, but not precious metal like copper.

      The value is that you can display a PDF inline with the website, rather than bringing up a clunky external application like adobe reader. "But you could do that with a plugin!" you say? Correct, but what you can't do with a plugin is actually get most people to in

    • by BZ (40346)

      > that a piece of native runtime plugin code can do
      > as well

      Because in some environments you can't drop in native runtime plugin code.

      Or put another way, where's that 64-bit Flash for Linux? Or heck, for that new architecture that people will come up with tomorrow?

      Whereas this way porting a web browser (a must for a new consumer-facing architecture anyway) will get you a PDF renderer for free.

      > 'Plugin' is just a word, it doesn't need to be
      > insecure, alien, buggy

      A PDF-renderer plug-in can be s

    • by Atzanteol (99067)

      This is a great point. With the current scarcity of precious computer cycles left in the world why would we waste them on this? The latest government report on general availability of cycles estimates that we will be running out in the next 50 years. Just think of that - a world with no computer cycles remaining! In fact we'll feel the crunch far before that as the scarcity drives up computer cycle prices. Every cycle needs to be preserved and only used for productive purposes!

      This post made using as f

  • Whoever wrote that should just go and shoot themselves in the face a couple of times.
