
Celebrate the XML Decade (177 comments)

IdaAshley writes "IBM Systems Journal recently published an issue dedicated to XML's 10th anniversary. Take a look at XML application techniques, and general discussion of the technical, economic and even cultural effects of XML. Learn why XML has been successful, and what it would take for XML to continue its success."
This discussion has been archived. No new comments can be posted.

  • by eldavojohn ( 898314 ) * <eldavojohn@noSpAM.gmail.com> on Thursday November 16, 2006 @10:51PM (#16879658) Journal
    Celebrate the XML Decade
    I tried. Oh Lord, how I tried!

    I started this morning by talking to everyone in XML.

    I hope the black eye my coworker gave me heals before my presentation to the CTO tomorrow morning :-(
    • by Duhavid ( 677874 ) on Thursday November 16, 2006 @11:45PM (#16880022)
      Really...

      We all needed to leave the first post in this to the guy with
      the sig

      "XML is like violence: if it doesn't fix the problem, you aren't using enough"

      Or words to that effect.
      • by Dahamma ( 304068 ) on Friday November 17, 2006 @04:45AM (#16881356)
        Someone put that in our Bugzilla quips a while back - it's still one of my favorites!

        My conspiracy theory is that XML was secretly invented by Intel in order to require 3GHz processors for the simplest of tasks.
      • That would be me, but I ripped it from some AC comment.
        • by Duhavid ( 677874 )
          It's still a great sig. I laughed like the dickens when
          I first saw it, especially since I was working for a place
          that seemed to apply that theory liberally. You just could
          not read the code and know what would happen, it was all
          driven by the XML fed into it.
    • by Randolpho ( 628485 ) on Friday November 17, 2006 @12:24AM (#16880262) Homepage Journal
      I started this morning by talking to everyone in XML.
      <conversation>
      <greeting type="friendly">Hello, fellow coworker type dude!</greeting>
      <response type="violent">Have a black eye!</response>
      </conversation>
      • by Anonymous Coward on Friday November 17, 2006 @12:43AM (#16880382)
        <greeting type="friendly">Hello, fellow coworker type dude!</greeting>
        That's a poorly designed format. You should make "greeting" a complex type and use elements to represent the greeting text and the greeting type. Then, the greeting type can be properly validated against a W3C XML Schema. There's no valid reason to use an attribute in cases like these.
        • by zootm ( 850416 ) on Friday November 17, 2006 @06:31AM (#16881764)

          That's a poorly designed format. You should make "greeting" a complex type and use elements to represent the greeting text and the greeting type. Then, the greeting type can be properly validated against a W3C XML Schema. There's no valid reason to use an attribute in cases like these.

          I took the liberty of revising the format a little, is this better?

          <?xml version="1.0" encoding="UTF-8" standalone="no"?>
          <conversation
          xmlns="http://slashdot.org/sarcasm/XML/conversation"
          xmlns:html="http://www.w3.org/1999/xhtml">

          <participants>
          <participant>
          <short-name>OP</short-name>
          <full-name>Original poster</full-name>
          </participant>
          <participant>
          <short-name>CW</short-name>
          <full-name>Unwitting coworker</full-name>
          </participant>
          </participants>

          <relationships>
          <two-way-relationship name="coworker">
          <person>OP</person>
          <person>CW</person>
          </two-way-relationship>
          </relationships>

          <greeting time="2006-11-17T10:12:10Z" speaker="OP" targets="CW">
          <type>
          <demeanour>friendly</demeanour>
          </type>
          <speech>
          <text type="text/plain">
          Hello, fellow coworker type dude!
          </text>
          </speech>
          </greeting>

          <response time="2006-11-17T10:12:34Z" speaker="CW" targets="OP">
          <type>
          <demeanour>angry</demeanour>
          <context>
          <divorce type="messy"/>
          <custody-battle type="messy"/>
          </context>
          </type>
          <speech>
          <text type="application/xhtml+xml">
          Have a <html:em>black eye</html:em>!
          </text>
          </speech>
          <action>
          <punch>
          <recipient>OP</recipient>
          <aim>eye</aim>
          </punch>
          </action>
          </response>

          </conversation>

          I'm sort of disappointed that I only got to use two namespaces. Can't get indentation to work either, unfortunately.

        • That's a poorly designed format. You should make "greeting" a complex type and use elements to represent the greeting text and the greeting type. Then, the greeting type can be properly validated against a W3C XML Schema. There's no valid reason to use an attribute in cases like these.

          You're his co-worker, aren't you? Glad to see you've calmed down a bit.

    • Actually, I was looking at the title and I did a double-take, since the first time I saw it I thought it said "Celebrate the XML Debacle". Oop. I thought, surely it's not that bad...

      Eh, what do I know? Maybe it is that bad. =)

    • Re: (Score:3, Funny)

      by gbobeck ( 926553 )
      I started this morning by talking to everyone in XML.

      Care to share the DTD and schema you used for that?
  • by Ant P. ( 974313 ) on Thursday November 16, 2006 @11:02PM (#16879730)

    Marketing to PHBs, mostly.

    However here on earth a lot of people still hand-code the stuff. IMO a C-like syntax using nested {}s would've been better.

    • by MP3Chuck ( 652277 ) on Thursday November 16, 2006 @11:15PM (#16879812) Homepage Journal
      "IMO a C-like syntax using nested {}s would've been better."

      JSON [wikipedia.org]?
      • by Ankh ( 19084 ) * on Friday November 17, 2006 @01:19AM (#16880588) Homepage
        A lot of people ask about using a different syntax, such as @name{....} as Scribe (and later LaTeX) did. Note that @element{xxx} is in fact a possible syntax that can be defined using SGML. But we were after something different.

        When we designed XML, we had over a decade of solid experience with interoperability in the world of SGML, and we also knew about the kinds of problems that different sorts of users had with different sorts of syntax.

        The primary users of SGML-based documentation systems were not programmers. They were people who were often not likely to know about a bracket-matching option in an editor or about code indenting, for example. But they were still legitimate users.

        You can't easily test the markup in a declarative system: if in an HTML document I used H3 instead of P in a document it might not look right, but it would still parse OK. If I muddle up Author and Title in a bibliography, same thing.

        So, the redundancy of end tags in XML is there because, in practice, if you didn't have it, we had learned that our users had problems correcting their documents, and we knew that, in general, it was only rarely possible for software to give the users much help. There were some experiments early on with </>, allowed by SGML (with various options set) to end any element; it soon became obvious that this caused more problems than it was worth, and even Microsoft disabled the troublesome feature in their XML parser.

        It's true that today XML is used in lots of situations we didn't predict. We were amazed that by the time we got XML published as a Recommendation there were over 200 users. So no, we didn't predict the future perfectly. But the popularity of XML shows we can't have done all that badly, really ;-)

        Liam

        (Liam Quin, currently W3C XML Activity Lead)
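The point about named end tags catching a wrong start tag can be tried out with any conforming parser. Here is a minimal sketch using Python's standard `xml.etree.ElementTree`; the `bib`, `author`, and `title` element names and the helper `well_formedness_error` are made up for illustration and are not from the post.

```python
import xml.etree.ElementTree as ET


def well_formedness_error(text):
    """Return the parser's error message, or None if the text is well-formed."""
    try:
        ET.fromstring(text)
    except ET.ParseError as exc:
        return str(exc)
    return None


# Start and end tags agree, so the parser accepts the document.
good = "<bib><author>Quin</author></bib>"

# The writer opened <title> by mistake but closed </author>.
# The named end tag exposes the slip; an anonymous closer such as
# "}" or SGML's "</>" would have matched silently.
bad = "<bib><title>Quin</author></bib>"

print(well_formedness_error(good))
print(well_formedness_error(bad))
```

With a brace-style syntax the second document would simply parse, and the mislabelled field would only show up later, if at all.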
        • Re: (Score:3, Insightful)

          by thsths ( 31372 )
          > So, the redundancy of end tags in XML is there because, in practice, if you didn't have it, we had learned that our users had problems correcting their documents, and we knew that, in general, it was only rarely possible for software to give the users much help. There were some experiments early on with </>, allowed by SGML (with various options set) to end any element; it soon became obvious that this caused more problems than it was worth, and even Microsoft disabled the troublesome feature in their XML
          • by Ankh ( 19084 ) * on Friday November 17, 2006 @11:40AM (#16884552) Homepage
            > The error message does not help people all that much

            One case where it helps most is when an incorrect start tag was applied; with the empty end tag this could not be detected, and it turned out to be more common than one might expect. You're right that the error messages often aren't good, but did you ever try debugging a large SGML document with OMITTAG and SHORTREF in use? The error message was almost always "characters found after end of document" because the required strategy by SGML (in one of the most common error situations) was to close elements until you got a match, so the parser typically closed elements all the way up the tree to the document element, and then gave up.

            We were bound, at the time, to strict SGML compatibility; perhaps if we had known XML would succeed we could have made more changes, but then we would have strayed further from the well-trodden path of implementation experience.

            As to comments for attributes, I agree with you; we lost them, though, because we needed a language simple enough that it could be processed, e.g., with Perl. We didn't dare dream that Perl would support XML natively!

            I agree with you that structured tools should generally be used. The redundancy and simplicity help computer-generated XML, and help to detect, say, missing portions of documents. If xml-rpc is scary, s-expr rpc is even scarier! :-)

            Liam
        • So, the redundancy of end tags in XML is there because, in practice, if you didn't have it, we had learned that our users had problems correcting their documents, and we knew that, in general, it was only rarely possible for software to give the users much help.

          If you're going to make claims about the usability of something like XML, you better be able to back them up with user studies. Of course, there are none. But, in fact, you don't even seem to be clear on the concept of what you were optimizing: err
          • Re: (Score:3, Interesting)

            by Ankh ( 19084 ) *
            Thank you for your kind words :-)

            We weren't really aiming at HTML users.

            I'm afraid the only usability studies of SGML tools that I saw were not released to the public. At the time I worked for a vendor of SGML-based software (e.g. including an editor, a viewer, a development environment) and it was a matter of great concern to us.

            It's possible we could open up the archives of the XML Working Group, but it would mean getting the permission of several hundred people. I'll ask some people at the upcoming XM
            • I'm afraid the only usability studies of SGML tools that I saw were not released to the public.

              I'm not talking about the usability of SGML tools, I'm talking about the usability of the syntax itself. You claimed that its cumbersome syntax was chosen because people would often have to use it without tools. In order to use that justification, you really need to compare multiple different kinds of syntactic choices experimentally, and you actually have to have some sound criteria to compare it on. Even you
              • by Ankh ( 19084 ) *
                I should admit that our mandate was to put SGML on the Web, not to make an entirely new syntax; we really didn't expect people to be using what became called XML for anything other than predominantly textual documents. There are, in fact, problems with anonymous end tags for metadata: these are not in general problems for computer programming so much since you can test the code in a way that's often not possible with metadata interchange. Unfortunately, no, I can't give you references for studies that wer
          • by nuzak ( 959558 )
            > Don't kid yourself: XML is popular because it's similar to HTML

            I find it amusing that you're lecturing one of the creators of XML about the finer points of its lineage. SGML was quite successful long before HTML had ever hit the scene. Perhaps you've heard of SGML and a little-known app called FrameMaker that output it? I sure bet he has.
    • by porkThreeWays ( 895269 ) on Thursday November 16, 2006 @11:20PM (#16879842)
      Sorta... XML came at a time when there weren't a whole lot of good viable data representation standards. Those that did exist (e.g. SGML) were too complicated for light use. XML was meant to be used by the masses while still technically remaining an SGML subset. We have better alternatives today, but once something is in widespread use, it's not going away for a while.
    • I keep hearing this and it seems foolish every time. If you just used {} how would you easily tell which tag you were closing? It would be too easy to mistake one brace for another, especially when there are several tags. Sure it'd be more efficient: but the idea was to have something that was equally readable by machines and humans. You take any non-trivial piece of XHTML or other XML and convert it to your new {} syntax. Then go try to add some more mark-up to it. And to non-technical users it would be eve

      • Re: (Score:3, Funny)

        by theodicey ( 662941 )
        It would be too easy to mistake one brace for another, especially when there are several tags

        I hack LISP, you insensitive clod!
      • by Ant P. ( 974313 )
        People seem to have coped fine reading code without closing tags for the past 30 years.
    • by smallpaul ( 65919 ) <paul @ p r e s c o d . net> on Friday November 17, 2006 @12:07AM (#16880132)
      A curly brace syntax would have been a better format for "large scale enterprise publishing"? As someone who has spent more than a decade in that field, I must disagree strongly. A curly brace would have been better to enable generic SGML to be served, received, and processed on the Web in the way that is now possible with HTML? [w3.org] Please do not confuse what XML is used for with what it was designed for. There is a reason that XML delivery units are called "documents" and not "messages".
    • by Decaff ( 42676 )
      Marketing to PHBs, mostly.

      Yes, because only PHBs would be interested in portable data that can be easily transformed, and have new formats added without losing old data.

      I mean, developers would never want that, would they?
    • Take a look at YAML [yaml.org]. That looks programmer friendly.

  • by d3ik ( 798966 ) on Thursday November 16, 2006 @11:02PM (#16879734)
    ... and most "enterprisey" Java developers have never met a problem that couldn't be fixed with more XML.
    • by ashultz ( 141393 )
      That gets my goat too.

      Nothing makes a set of code harder to deal with than taking half of it and writing it in a variety of XML config files and then scattering them throughout the distribution. That way you ensure that anyone doing something foolish like trying to understand it through javadoc or use their IDE to learn it gets nowhere.

      I'm going to go cry now.
    • by dch24 ( 904899 ) on Friday November 17, 2006 @12:26AM (#16880276) Journal
      My bosses were wary when I suggested XML as our data representation for a new project. Here were some of the arguments:

      Pro
      • Easy to change the schema, don't have to convert old data.
      • They didn't know exactly what XML was, so if I recommended it, ... (a.k.a. "gee whiz" factor?)
      • The other developers liked the idea
      Con
      • They weren't sure whether this would increase (better system = save time?) or decrease (reinvent the wheel = waste a lot of time in meetings?) productivity
      • Takes lots of space (no "binary XML")
      • Slow processing, right? (see "Takes lots of space")

      Eventually we settled on gzipped xml. It required a little more code, but everyone seemed happy. Oh, and we stored images as separate .png.

      I think my experience is pretty common, though. And from experience, libxml2 + libz is still very, very fast, and there's not a (whole lot) of wasted space.

      I'd like to hear other people's success stories, if anyone wants to reply... I liked reading the article, too.
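For anyone who wants to see what the gzipped-XML approach above looks like in code, here is a rough sketch. The post used libxml2 + libz; this swaps in Python's standard `gzip` and `xml.etree.ElementTree` modules instead, and the `records`/`record` document is invented purely to have something repetitive to compress.

```python
import gzip
import xml.etree.ElementTree as ET


def to_gzipped_xml(root):
    """Serialize an element tree to UTF-8 XML and gzip the bytes."""
    return gzip.compress(ET.tostring(root, encoding="utf-8"))


def from_gzipped_xml(blob):
    """Decompress gzipped bytes and parse them back into an element tree."""
    return ET.fromstring(gzip.decompress(blob))


# Hypothetical sample data: many near-identical records, which is the
# case where XML's repeated tag names compress particularly well.
root = ET.Element("records")
for i in range(200):
    rec = ET.SubElement(root, "record", {"id": str(i)})
    ET.SubElement(rec, "colour").text = "brown"

raw = ET.tostring(root, encoding="utf-8")
blob = to_gzipped_xml(root)
restored = from_gzipped_xml(blob)

print(len(raw), len(blob), len(restored))
```

The extra code really is small: one call on the way out and one on the way in. How much space it saves depends entirely on how repetitive the markup is.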
      • by j. andrew rogers ( 774820 ) on Friday November 17, 2006 @04:21AM (#16881266)
        The "slow processing" is caused by more than taking a lot of space. XML is basically a document markup but is frequently and regularly used as a wire protocol, which has very different design requirements if you want a good standard. And in fact we already have a good standard for this kind of thing called "ASN.1", which was actually engineered to be extremely efficient as a wire protocol standard. (There is also an ITU standard for encoding XML as ASN.1 called XER, which solves many of the performance problems.)

        Arguably the single biggest problem with XML that causes slow processing is that software can predict almost nothing about an XML stream and therefore has to allow for anything. The opening bracket tells you very little about what to expect, and creates few implicit failure or non-conformance tests that allow one to terminate processing because there is no definition of "unreasonable". If I want to embed a terabyte of data between XML tags, there is no built-in basic mechanism to inform the software of how much data I should expect to see before a closing tag and no basic mechanism to cue the software as to the type of data to expect. (Yes, you can sort of do it with lots of other layers strapped on, but it isn't core and strapping it on adds complexity.) This is the primary reason it gives miserable performance as a wire protocol format -- the software cannot make decisions about the data without slurping most or all of it, with no way to predict what "most" or "all" actually is. In well-engineered standards such as ASN.1, they use the good old tag-length-value (TLV) format. The "tag" tells you what to expect, the length tells you how many bytes to expect, and the value is the actual data. In short, the encoding tells the software exactly what it is about to do before it does it in enough detail that the software can make smart and performant handling decisions.

        The only real advantage XML has is that it is (sort of) human readable. Raw TLV formatted documents are a bit opaque, but they can be trivially converted into an XML-like format with no loss (and back) without giving software parsers headaches. There is buckets of irony that the deficiencies of XML are being fixed by essentially converting it to ASN.1 style formats so that machines can parse them with maximum efficiency. Yet another case of computer science history repeating itself. XML is not useful for much more than a presentation layer, and the fact that it is often treated as far more is ridiculous.
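To make the tag-length-value idea above concrete, here is a toy sketch. It is not ASN.1 BER/DER: the layout (a 1-byte tag and a 4-byte big-endian length) and the function names `encode_tlv` and `decode_tlv` are assumptions chosen only to keep the example short.

```python
import struct

HEADER = struct.Struct(">BI")  # assumed layout: 1-byte tag, 4-byte length


def encode_tlv(tag, value):
    """Prefix the value with its tag and its length in bytes."""
    return HEADER.pack(tag, len(value)) + value


def decode_tlv(buf):
    """Split a buffer into (tag, value) records without scanning the values."""
    records = []
    offset = 0
    while offset < len(buf):
        if offset + HEADER.size > len(buf):
            raise ValueError("truncated header")
        tag, length = HEADER.unpack_from(buf, offset)
        offset += HEADER.size
        # The reader knows the value's size up front, so it can reject,
        # skip, or pre-allocate before touching a single payload byte.
        if offset + length > len(buf):
            raise ValueError("truncated value")
        records.append((tag, buf[offset:offset + length]))
        offset += length
    return records


stream = encode_tlv(1, b"hello") + encode_tlv(2, b"")
print(decode_tlv(stream))
```

Compare that with an XML reader, which has to examine every byte of element content looking for the closing tag before it knows how long the value was.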
        • The only real advantage XML has is that it is (sort of) human readable.

          Actually, it is not. Many people I know, and me, have trouble looking at XML config files that span more than a few rows. You need a tool that presents the XML document as a tree, so you can collapse some nodes in order to focus in the interesting ones.

        • by Kjella ( 173770 )
          There is buckets of irony that the deficiencies of XML are being fixed by essentially converting it to ASN.1 style formats so that machines can parse them with maximum efficiency.

          There's no secret binary is faster. I imagine you can make a lot more compact datastructures with TLV encoding, and codes instead of actual tag names. Do I care? No. XML is brilliant for everything I use it for, and if there's a better way to encode XML for some edge cases fine. If you want to go "maximum efficiency" I'm sure you c
      • The company I work for has had a lot of success with XML, and are planning to move the internal data structure for our application from maps to XML. There is one simple reason for our success with it: XSLT. A customer asks for output in a specific format? Write a template. Want to display the data on a web page? Write a template that converts to HTML. Want to print to PDF? Write a template that converts to XSL, and use one of many available XSL->PDF processors. Want to use PDF forms to input data? Write a
    • by TheMCP ( 121589 )
      Java? What is this "Java"? I program in XML [waterlanguage.org].
  • by Centurix ( 249778 ) <centurixNO@SPAMgmail.com> on Thursday November 16, 2006 @11:06PM (#16879760) Homepage
    This year I'll be sending out christmas cards in XML and then placing a large banner outside my house with the appropriate schema.

    Then with every following year, I'll be sending a stylesheet card which they can apply to the original XML.

    And if they need to locate their names on the card, they can use //recipient[@name='mum']
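For the curious, the XPath above can be run as-is by a full XPath engine. The sketch below uses Python's standard `xml.etree.ElementTree`, which supports only a subset of XPath and rejects an absolute `//` path when searching from an element, so it uses `.//` instead. The `card` document and the `find_recipients` helper are invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical card; only the recipient element and its name attribute
# come from the XPath in the post.
CARD = """
<card>
  <greeting>Merry Christmas</greeting>
  <recipient name="mum">With love</recipient>
  <recipient name="dad">See you on the 25th</recipient>
</card>
"""


def find_recipients(xml_text, name):
    """Return the text of every recipient element with the given name."""
    root = ET.fromstring(xml_text)
    # ".//" searches all descendants of the root element.
    matches = root.findall(".//recipient[@name='%s']" % name)
    return [m.text for m in matches]


print(find_recipients(CARD, "mum"))
```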
  • by elving ( 133577 ) on Thursday November 16, 2006 @11:07PM (#16879766)
    Strange that an article celebrating XML [w3.org]'s anniversary would neglect to mention XML's creator [tbray.org]. I wonder if the fact he works for a competitor [sun.com] has anything to do with it...
    • by tbray ( 95102 ) on Friday November 17, 2006 @01:49AM (#16880714) Homepage Journal

      I have to do this once per year or so, here's the 2006 iteration: I am not XML's inventor. There were 150 people in the debating society and 11 people in the voting cabal and 3 co-editors of the spec. Of the core group, I (a) was the loudest mouth, (b) was independent so I didn't have to get PR clearance to talk, and (c) don't mind marketing work.
      -Tim

      • by jlowery ( 47102 ) on Friday November 17, 2006 @02:39AM (#16880936)
        Al Gore declaims the same every anniversary of the Internet.
      • by x2A ( 858210 )
        you're too modest ;-)

        • Re: (Score:2, Funny)

          by ravenlock ( 693538 )
          I'd try to be modest too if people blamed me for XML :P
          • by x2A ( 858210 )
            *lol* very good point! I'd probably try and deny it in as few characters and with as little punctuation as possible, just to lend more credit to the idea that I wouldn't invent such a thing.

      • Re: (Score:3, Informative)

        by smallpaul ( 65919 )
        In addition, XML was never intended to be an "invention". It was a simplification. Some innovation slipped in, but the vast majority was just debating what aspects of SGML to strip out and how to fix some well-known flaws in it. The innovation primarily was about how to integrate modern standards like URLs and Unicode.
  • news flash (Score:2, Insightful)

    by User 956 ( 568564 )
    Take a look at XML application techniques, and general discussion of the technical, economic and even cultural effects of XML.

    Cultural Effects? This is a spec for structuring data, not a Picasso.
    • by grcumb ( 781340 )
      Take a look at XML application techniques, and general discussion of the technical, economic and even cultural effects of XML.
      Cultural Effects? This is a spec for structuring data, not a Picasso.

      Philistine. You just don't appreciate abstraction.

      8^)

  • XML Decade? (Score:5, Funny)

    by RealGrouchy ( 943109 ) on Thursday November 16, 2006 @11:35PM (#16879954)
    Wait... let me figure this one out...

    MCMXC was 1990...
    MDCCCLX was 1860...

    I give up! Which decade was XML?

    - RG>
    • by guruevi ( 827432 )
      Converting XML to Decimal is 1060. Long time ago ;-)
      • Re: (Score:3, Informative)

        by Anonymous Coward
        Actually, that would be 1040 -- 'X' (10) before 'M' (1000) = 990 + 'L' (50) = 1040
        • No. 1040 is MXL.

          XML is non-convertable.
          It's as much 1040 as it is 1060 as it is -940 (MX = 990, which is before the L). All of which are wrong answers.
          It's like the Roman "divide by nero error".
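The disagreement above is easy to check mechanically. Here is a small sketch with two helpers, both named here for illustration: `naive_value` applies the usual "smaller before larger means subtract" rule to any string of numeral letters, and `is_standard` tests against the conventional strict grammar for 1 to 3999.

```python
import re

VALUES = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}

# Conventional strict form: thousands, then hundreds, tens, units.
STRICT = re.compile(
    r"^M{0,3}(CM|CD|D?C{0,3})(XC|XL|L?X{0,3})(IX|IV|V?I{0,3})$"
)


def naive_value(numeral):
    """Subtract a symbol when the next one is larger, otherwise add it."""
    total = 0
    for i, ch in enumerate(numeral):
        value = VALUES[ch]
        if i + 1 < len(numeral) and value < VALUES[numeral[i + 1]]:
            total -= value
        else:
            total += value
    return total


def is_standard(numeral):
    """True only for non-empty numerals in the strict conventional form."""
    return bool(numeral) and STRICT.match(numeral) is not None


print(naive_value("XML"), is_standard("XML"))  # 1040 False
print(naive_value("MXL"), is_standard("MXL"))  # 1040 True
```

So a lenient reader gets 1040 out of XML, but a strict one rejects it, because X may only be subtracted from L or C. Both posters have a point.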

    • by aibrahim ( 59031 )
      To answer the joking question...

      XML = 1040

      Oh, those 40's were great too. Lots of good gossip. Macbeth killed Duncan. William the Conqueror took Normandy. There was that business with Zoe, Michael, Theodora and Constantine in the Eastern Roman Empire. Oh, and don't get me started on the Simonious Popes.
  • Stuck (Score:2, Insightful)

    by Duncan3 ( 10537 )
    So we're officially stuck with this crap forever.

    Yay! Let's party!

    XML is for data interchange, nothing else. Unfortunately, it's being used for everything but.
    • Re: (Score:2, Insightful)

      by l0b0 ( 803611 )
      XML is for data interchange, nothing else.

      Isn't all data interchanged? From client to server, from blogger to browser, from developer to developer, etc. Any data which is not interchanged is either useless or forgotten. And XML has shown its strength in all these areas: Ease of human and computer parsing.

      • XML is horrible to human-parse. If you think XML is OK for human parsing it is because you've not seen real file formats. I've freakin' worked with structured *binary* formats that were usually easier to interpret than most XML (the IFF - interchange file format - for the Amiga).

        Eivind.

        • XML is horrible to human-parse.

          100% agreed.
          XML done "right" (with all the abstraction fluff that your eclipse-jockeys deem necessary)
          quickly becomes totally unreadable.

          And as if the brace and quotes soup wasn't bad enough the XML files that you meet in reality
          are usually poorly indented, too.

          This means that in practice you need a special XML viewer software (or editor support)
          to make sense of a non-trivial XML document anyways.

          So, why not just store the meat as well condensed binary blob
          that can be sliced

      • by theCoder ( 23772 )
        Any moderately complicated XML document is very difficult for humans to parse. Sure, they can read it, but it's often hard to understand. See, for example, this post [slashdot.org] on this article.

        Also, parsing XML isn't exactly easy for computers either. Especially because it is so flexible, the parser has to be written very generically. This has the benefit of only having to write one generic parser (which is good), but something still has to drive that parser to interpret the information, and that can get compilcat
      • by mcvos ( 645701 )
        Isn't all data interchanged?

        Well, what do you call data? There are quite a lot of XML-based languages, including programming languages, like XSLT. XSLT is basically the ultimate XML language, because it's in XML and it's made to operate on XML. So you could use XSLT to generate the XSLT you want to handle your data, or crazy stuff like that.

    • "Stuck" may be a good thing. From a historical perspective, XML has a huge benefit. We have data tapes going back 50 years, but don't know the format of the data. 50 years from now, when researchers want to read what data we're processing, they'll have a much easier time of it because of XML. It's not perfect, but they will be able to take a stab at what the data means just by looking at it.

      As for a party, what the hell, any excuse is fine by me!
    • It's also great when you want random documents to be easily reverse-engineerable. XML markup can't replace good documentation, but it can give pointers.

      This occurred to me when I had to reverse-engineer a format of concatenated ASCII lines mostly containing sets of small integers with strings interspersed (i.e. "1\n0\n22 12\nbrown\ngreen\n12 11 12 10 9 13 14\n5 5 6 3 3 2 3 3") and had to come up with my own format to store the data in. Just in case in the future someone has to write a replacement for my s
  • Apple replacing the perfectly fine, hand editable plist format with an XML version. ick.
  • a decade of ... (Score:3, Insightful)

    by The Pim ( 140414 ) on Friday November 17, 2006 @12:36AM (#16880358)
    vague semantics, confusing specifications, unwarranted complexity, standards proliferation, poor tools, and wildly inappropriate application. Not to mention rampant disregard for existing work in nearly every arena it entered. So the essence of XML is this: the problem it solves is not hard, and it does not solve the problem well. [bell-labs.com]
  • Celebrate the XML Decade:

    Bah, it's too late to tell us to celebrate during the decade of XML because that decade is now over!

    Yeah, should have done that; celebrating.
  • Ten years of XML ... and here I am relearning TeX.

    L
  • I see XML as a glorified CSV file. Instead of being a two-dimensional representation of a data structure, you can use an N-dimensional representation.

    If that's all it was, I wouldn't mind. But it gets so much hype. Why?

  • XML is a necessary evil. Even though it's slow and inefficient, it's a good system for quickly developing a parser for varied file types without having to design an editor.

    Academically pure XML is just needless overkill, IMO.
