The Goals of XML

    XML is designed to work well on the Web. While that goal is certainly a primary factor in the design of the language, XML is intended to work in many environments outside the Web as well, including publishing, data interchange, and commercial applications. For XML to be adopted in such a wide variety of applications, its designers knew that it would have to be simple, powerful, and easy to implement for many different kinds of users.

    Defining the Goals for XML

    To help you better understand what XML is intended to do, let's look at the goals for the language as defined by the creators of XML.

    GOAL 1: XML SHALL BE STRAIGHTFORWARDLY USABLE OVER THE INTERNET. First, XML should work well on the Internet and take into account the needs of applications running in a distributed, networked environment. This does not mean that XML must plug directly into existing Web applications, only that working well on the Internet is of primary importance.

    The next point regarding this goal focuses on the word straightforwardly. SGML has proven to be too complex for some authors to work with and its structure too sophisticated for clients to process efficiently in networked environments. Because XML leaves out all the extras and leaves in only what is necessary, most of the complexity and overhead that was problematic with SGML is removed as well. Note that this goal does not mean that XML should be limited to Internet applications, and this brings us to the next goal.

    GOAL 2: XML SHALL SUPPORT A WIDE VARIETY OF APPLICATIONS. This goal stipulates that XML should be usable by a wide range of applications, such as authoring tools, content display engines, translation tools, and even database applications. The creators of XML realized that the rapid adoption of XML would depend on the availability of software applications that use it.

    Quite a few freestanding software products now support XML. Check out http://www.gca.org/conf/xml/xml_what.shtml#xmlsoft for a list of some of them. Eventually, most popular applications that work with text or other data will support XML.

    GOAL 3: XML SHALL BE COMPATIBLE WITH SGML. This goal is a crucial aspect of the XML design, but it is also one of the most troublesome goals. The idea is that any valid XML document must also be a valid SGML document. This goal was established so that existing SGML tools would also work with and parse XML code. Note that while valid XML is also valid SGML, the reverse is not true. Remember, XML was built by removing all the inessential pieces from SGML, so SGML contains a lot more that would cause a failure when using an XML processor.

    What is parsing? To parse an XML or SGML document, a processor must break the document down into its constituent parts and understand the structure of the document and the relationships among its parts. XML and SGML are considered parsable because their documents are required to follow strict rules (usually in the form of a DTD) that allow any processor that understands those rules to interpret the document correctly. HTML, even though it is an application of SGML, is usually not truly parsable, since most processors don't strictly enforce the rules and most authors don't strictly follow them.
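
    As a minimal sketch of what parsing produces, the following Python example breaks a small document into its parts and reports how they nest. The sample document and the outline helper are hypothetical and written only for illustration. Python's standard xml.etree.ElementTree module is a non-validating parser: it checks that the document follows the XML syntax rules but does not check it against a DTD.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, used only for illustration.
SAMPLE = """<book>
  <title>Enter XML</title>
  <chapter number="2">
    <section>The Goals of XML</section>
  </chapter>
</book>"""


def outline(element, depth=0):
    """Return (depth, tag) pairs for an element and all of its descendants."""
    rows = [(depth, element.tag)]
    for child in element:
        rows.extend(outline(child, depth + 1))
    return rows


# Parsing turns the flat text into a tree of elements.
root = ET.fromstring(SAMPLE)
structure = outline(root)

# Print each element name indented by its nesting depth.
for depth, tag in structure:
    print("  " * depth + tag)
```

    The processor needs no knowledge of what a book or a chapter is. The syntax rules alone are enough to recover the structure.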

    GOAL 4: IT SHALL BE EASY TO WRITE PROGRAMS WHICH PROCESS XML DOCUMENTS. The idea behind this goal is that the adoption of the language would be proportional to the availability of tools. The definition of "easy" is, of course, relative, but the original idea was that someone with a computer science degree should be able to write a basic XML processor in a week or two. While the "two-week" benchmark has all but been abandoned, the proliferation of XML tools, many of them freeware, is a testament that the goal has been reached.

    GOAL 5: THE NUMBER OF OPTIONAL FEATURES IN XML IS TO BE KEPT TO THE ABSOLUTE MINIMUM, IDEALLY ZERO. This goal comes from another problem users had with SGML. The SGML specification comes with a lot of options, many of which are never used. This adds excess overhead to SGML processors and often makes document/processor compatibility difficult, if not impossible. If, for example, an application was designed to read and process SGML documents with specific options, the application might not correctly understand documents that used different options. XML avoids this potential for incompatibility by keeping the number of options as close to zero as possible. This means that any XML processor should be able to parse any XML document, no matter what data or structure the document contains.

    GOAL 6: XML DOCUMENTS SHOULD BE HUMAN-LEGIBLE AND REASONABLY CLEAR. This goal is included for both philosophical and practical purposes. Since XML uses plain text to describe data and relationships among that data, it has the inherent advantage of being easier to work with and read than a binary format that accomplishes the same thing. Since the code is formatted in a straightforward way, it makes sense for XML to be easily readable by humans as well as machines. From a practical standpoint, if XML is easy to understand and use, you don't need complex tools and procedures to work with it. So no matter how sophisticated some tools get, any author can sit down with a simple text editor and write XML code.
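
    To make the contrast with a binary format concrete, here is a small Python sketch that stores the same record both ways. The record, its field names, and the binary layout are all assumptions invented for this example.

```python
import struct
import xml.etree.ElementTree as ET

# Hypothetical order record, used only for illustration.
item_id, quantity, price = 1042, 3, 19.99

# Binary form: compact, but unreadable without knowing the layout
# ("<HHd" = little-endian, two unsigned 16-bit integers, one 64-bit float).
binary_record = struct.pack("<HHd", item_id, quantity, price)

# XML form: longer, but every value is labeled in plain text.
order = ET.Element("order")
ET.SubElement(order, "itemId").text = str(item_id)
ET.SubElement(order, "quantity").text = str(quantity)
ET.SubElement(order, "price").text = str(price)
xml_record = ET.tostring(order, encoding="unicode")

print(len(binary_record), "bytes of binary data:", binary_record)
print(len(xml_record), "characters of XML:", xml_record)
```

    The binary record is much shorter, but a reader needs the layout string to make sense of it. The XML record can be read, and corrected, in any text editor.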

    GOAL 7: THE XML DESIGN SHOULD BE PREPARED QUICKLY. As already mentioned, XML was conceived because an extensible language for the Web was needed. This goal was included out of concern that if XML was not made available quickly as a way to extend HTML, another organization might come up with a proprietary solution, a binary solution, or both. The SGML Editorial Review Board (ERB) believed that the solution for extensibility needed to come from the SGML community, the same group responsible for HTML. The board also determined that the solution had to be open and extensible and could not be owned by any single vendor. An ad hoc group was formed to work on XML, and eventually the XML Working Group was formed and operated within the guidelines of the W3C to formalize XML.

    GOAL 8: THE DESIGN OF XML SHALL BE FORMAL AND CONCISE. This goal focuses on the XML specification. The goal was to make the specification as concise as possible by formalizing its wording. To help accomplish this, the specification uses Extended Backus-Naur Form (EBNF), a standard notation for describing the grammar of formal languages. The syntax rules are expressed in EBNF rather than in prose alone, which keeps the specification as formal and concise as possible. As with Goal 4, the idea with this goal is that the language will be adopted more readily if it is easy to understand and use.

    GOAL 9: XML DOCUMENTS SHALL BE EASY TO CREATE. Just as an XML document should be easily understood by human readers as stated in Goal 6, creating an XML document should be easy as well. And while XML documents can be authored in something as simple as a plain text editor, the reality is that complex documents sometimes prove to be too cumbersome to work with in that environment. It will be up to the market to decide ultimately if this goal has been met, but many tools (both commercial and freeware) are already available for creating and using XML, which indicates that the W3C has succeeded so far in meeting this goal.

    GOAL 10: TERSENESS IN XML MARKUP IS OF MINIMAL IMPORTANCE. This is another goal that grew out of the need to remedy implementation problems in SGML. SGML supports minimization techniques, which simply means that SGML allows some shortcuts in the interest of reducing the amount of typing the author has to do. One well-known SGML minimization that is also part of HTML is the ability to omit the closing tag for many elements. In these languages, the beginning of the next opening tag is enough to signal that the previous element should be closed. Although this reduces some work for the author, it can be a source of confusion for the reader. In XML, clarity always takes precedence over brevity.
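
    The effect of this goal is easy to observe with any XML parser. The sketch below, which uses Python's standard xml.etree.ElementTree module, feeds the parser a list written with HTML-style omitted closing tags and the same list with every element closed explicitly. Both sample strings and the is_well_formed helper are hypothetical and written only for this illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the text parses as XML, False if the parser rejects it."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# HTML/SGML-style minimization: the closing tags for <li> are omitted.
minimized = "<ul><li>one<li>two</ul>"

# The same list with every element closed explicitly, as XML requires.
explicit = "<ul><li>one</li><li>two</li></ul>"

print("minimized:", is_well_formed(minimized))
print("explicit: ", is_well_formed(explicit))
```

    The parser rejects the minimized version because the closing tag for the list arrives while a list item is still open. XML trades a few extra keystrokes for a document whose structure is never ambiguous.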

    Implementing the Goals

    These 10 goals were used to drive the development of the XML specification. As you get to know XML, you will see that the "flavor" and many of the features of the language are direct results of decisions that were made based on these goals. I think you will find that XML is a simple yet extremely powerful and flexible markup language.
