XML and Java: Developing Web Applications, Second Edition

9.3 RELAX NG

RELAX NG is a simple yet powerful alternative to W3C XML Schema. RELAX NG was created by an international standardization organization, the Organization for the Advancement of Structured Information Standards (OASIS). RELAX NG was later published as a committee specification of ISO/IEC JTC1 SC34.

RELAX NG was created by unifying two schema languages: TREX and RELAX Core. RELAX Core was originally submitted to ISO/IEC JTC1 by Japan and has been published by ISO/IEC JTC1 as a Technical Report (ISO/IEC TR 22250-1:2001). Under the influence of RELAX Core, James Clark (the technical lead of the original XML WG, the editor of the W3C XSLT and XPath specifications, and the implementer of reference implementations of XML and XSLT) designed TREX. The RELAX NG technical committee of OASIS unified these two schema languages and published RELAX NG version 1.0 as a committee specification in December 2001.

9.3.1 Mimicking DTDs

This subsection demonstrates how to mimic DTD features in RELAX NG by rewriting the schemas shown in Section 9.2.1. Features specific to RELAX NG are not covered.

Mimicking Element Type Declarations

Recall that we rewrote a simple DTD (see Listing 9.2) in W3C XML Schema (see Listing 9.3). Listing 9.17 is a rewrite in RELAX NG.

Listing 9.17 A simple schema, chap09/itemList0.rng
[1]     <?xml version="1.0" encoding="utf-8"?>
        <grammar xmlns="http://relaxng.org/ns/structure/1.0">
[3]       <start>
            <ref name="itemList"/>
          </start>

          <define name="itemList">
            <element name="itemList">
[9]           <zeroOrMore>
[10]            <ref name="item"/>
              </zeroOrMore>
            </element>
          </define>

          <define name="item">
            <element name="item">
[17]          <ref name="name"/>
[18]          <ref name="quantity"/>
            </element>
          </define>

          <define name="name">
            <element name="name">
[24]          <text/>
            </element>
          </define>

          <define name="quantity">
            <element name="quantity">
[30]          <text/>
            </element>
          </define>

        </grammar>

The root grammar (line 2) declares the namespace http://relaxng.org/ns/structure/1.0. This schema has four define elements, each of which has an element child element. These element elements specify the permissible tag names itemList, item, name, and quantity. The zeroOrMore element (line 9) and its child ref (line 10) specify that an itemList element can contain zero or more item elements. The two ref elements (lines 17 and 18) specify that an item element has a name element followed by a quantity element. The text elements (lines 24 and 30) specify that a name element and a quantity element can have any string as their contents. The start element (line 3) specifies that the root is an itemList element.
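To make the content model concrete, here is a minimal sketch of a document that this grammar should accept. It is a hypothetical instance written for illustration, not the itemList.xml file used elsewhere in this chapter, and the item names and quantities are invented.

```xml
<?xml version="1.0" encoding="utf-8"?>
<!-- Hypothetical instance: the root is itemList, which holds zero or more items -->
<itemList>
  <item>
    <!-- name must come first, followed by quantity -->
    <name>pen</name>
    <quantity>5</quantity>
  </item>
  <item>
    <name>eraser</name>
    <quantity>7</quantity>
  </item>
</itemList>
```

An empty `<itemList/>` should be accepted as well, because zeroOrMore permits the absence of item elements.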

We can make this schema more compact. First, we can eliminate ref and define. The schema in Listing 9.18 is created by replacing each ref with the body of the corresponding define.

Listing 9.18 A compact schema, chap09/itemList1.rng
<?xml version="1.0" encoding="utf-8"?>
<grammar xmlns="http://relaxng.org/ns/structure/1.0">
  <start>
    <element name="itemList">
      <zeroOrMore>
        <element name="item">
          <element name="name">
            <text/>
          </element>
          <element name="quantity">
            <text/>
          </element>
        </element>
      </zeroOrMore>
    </element>
  </start>
</grammar>

Because this grammar no longer contains define, we can eliminate grammar and start (and move the namespace declaration to the root element), as shown in Listing 9.19.

Listing 9.19 An even more compact schema, chap09/itemList2.rng
<?xml version="1.0" encoding="utf-8"?>
<element name="itemList"
  xmlns="http://relaxng.org/ns/structure/1.0">
  <zeroOrMore>
    <element name="item">
      <element name="name">
        <text/>
      </element>
      <element name="quantity">
        <text/>
      </element>
    </element>
  </zeroOrMore>
</element>

Let us validate itemList.xml against itemList0.rng. As a RELAX NG validator, we use Jing by James Clark. Version 2001-12-03 of jing.jar is included on the accompanying CD-ROM. The latest version is available from http://www.thaiopensource.com/relaxng/jing.html.

In preparation, add the jar file of Jing to your CLASSPATH. For example:

set CLASSPATH=".;c:\relaxng\jing.jar;c:\xerces-1_4_3\xerces.jar"

We can invoke Jing by specifying com.thaiopensource.relaxng.util.Driver as the main class. The first argument is a schema and the second is an instance document.

R:\samples>java com.thaiopensource.relaxng.util.Driver
                           chap09/itemList0.rng chap09/itemList.xml

Because no errors are found, Jing does not report anything.
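For contrast, the following sketch shows a document that a RELAX NG validator would be expected to reject against itemList0.rng. The file is hypothetical and is not part of the chapter's samples. The exact error message depends on the validator and its version, so none is reproduced here.

```xml
<?xml version="1.0" encoding="utf-8"?>
<!-- Hypothetical invalid instance for chap09/itemList0.rng -->
<itemList>
  <item>
    <!-- Violation: the schema requires name before quantity -->
    <quantity>5</quantity>
    <name>pen</name>
  </item>
</itemList>
```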

Mimicking Attribute-List Declarations

Recall that we added an attribute-list declaration to itemList-attribute.dtd (see Listing 9.6). To capture this attribute-list declaration, we only have to introduce an attribute element, as shown in Listing 9.20.

Listing 9.20 A schema with an attribute declaration, chap09/itemList-attribute.rng
       <?xml version="1.0" encoding="utf-8"?>
       <element name="itemList"
         xmlns="http://relaxng.org/ns/structure/1.0">
         <zeroOrMore>
           <element name="item">
[6]          <attribute name="number"/>
             <element name="name">
               <text/>
             </element>
             <element name="quantity">
               <text/>
             </element>
           </element>
         </zeroOrMore>
       </element>

The only difference from itemList2.rng (see Listing 9.19) is the addition of attribute (line 6). It declares number as a mandatory attribute.

To make the attribute number optional, we wrap the attribute with an optional element as follows:

[6]   <optional><attribute name="number"/></optional>

Comments in Schemas

As a representation of comments, RELAX NG provides documentation elements in the namespace http://relaxng.org/ns/compatibility/annotations/1.0, as shown in Listing 9.21. Elements or attributes of foreign namespaces such as <p> of XHTML can also be used for representing comments.

Listing 9.21 A schema with annotations, chap09/itemList-attribute-annotation.rng
       <?xml version="1.0" encoding="utf-8"?>
       <element name="itemList"
         xmlns="http://relaxng.org/ns/structure/1.0"
         xmlns:a="http://relaxng.org/ns/compatibility/annotations/1.0">

[6]      <a:documentation>The root of a schema</a:documentation>

         <zeroOrMore>
           <element name="item">
[10]         <a:documentation>Attributes and child elements can be
                              described together.</a:documentation>
             <attribute name="number">
[13]           <a:documentation>Describes an attribute.</a:documentation>
             </attribute>
             <element name="name">
[16]           <a:documentation>Describes a child element.</a:documentation>
               <text/>
             </element>
             <element name="quantity">
[20]           <a:documentation>Describes another child element.</a:documentation>
               <text/>
             </element>
           </element>
         </zeroOrMore>
       </element>

This schema is similar to itemList-attribute.rng (see Listing 9.20), but five documentation elements (lines 6, 10, 13, 16, and 20) are added.

Table 9.4 summarizes how each construct of DTD can be mimicked in RELAX NG. We assume that unqualified elements are in the namespace http://relaxng.org/ns/structure/1.0, and the prefix a refers to the namespace http://relaxng.org/ns/compatibility/annotations/1.0.

Table 9.4. Mimicking DTD Constructs in RELAX NG

DTD: <!ELEMENT name (#PCDATA)> with optional attribute-list declarations
RELAX NG:
<define name="name">
  <element name="name">
    attribute-declarations
    <text/>
  </element>
</define>

DTD: <!ELEMENT name (#PCDATA | foo1 | foo2 | ...)*> with optional attribute-list declarations
RELAX NG:
<define name="name">
  <element name="name">
    attribute-declarations
    <mixed>
      <zeroOrMore>
        <choice>
          <ref name="foo1"/>
          <ref name="foo2"/>
          ...
        </choice>
      </zeroOrMore>
    </mixed>
  </element>
</define>

DTD: <!ELEMENT name (element-content-model)> with optional attribute-list declarations
RELAX NG:
<define name="name">
  <element name="name">
    attribute-declarations
    element-content-model
  </element>
</define>

DTD: reference to other elements in content models
RELAX NG: <ref name="name"/>

DTD: , in content models
RELAX NG: <group> ... </group> (If an element, define, or optional element has more than one child, an intervening group element is implicitly introduced.)

DTD: | in content models
RELAX NG: <choice> ... </choice>

DTD: * in content models
RELAX NG: <zeroOrMore> ... </zeroOrMore>

DTD: + in content models
RELAX NG: <oneOrMore> ... </oneOrMore>

DTD: ? in content models or #IMPLIED in attribute-list declarations
RELAX NG: <optional> ... </optional>

DTD: attribute declarations
RELAX NG: <attribute name="...">...</attribute>

DTD: CDATA
RELAX NG: <text/>

DTD: <!-- ... -->
RELAX NG: <a:documentation>... </a:documentation>

DTD: <!ENTITY % param-ent-name attribute-declarations>
RELAX NG:
<define name="param-ent-name">
   attribute declarations
</define>

DTD: <!ENTITY % param-ent-name element-content-model>
RELAX NG:
<define name="param-ent-name">
   element content model
</define>

DTD: %name; (reference to parameter entities in content models)
RELAX NG: <ref name="name"/>

DTD: %name; (reference to parameter entities in attribute declarations)
RELAX NG: <ref name="name"/>

DTD: <!ENTITY name ...> (parsed or unparsed entity declarations)
RELAX NG: Not applicable
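
As a worked illustration of these correspondences, the following sketch converts a small DTD into RELAX NG. The DTD and its element names (section, title, para, em, code) are hypothetical and were invented for this example; the DTD is quoted in the comment at the top.

```xml
<?xml version="1.0" encoding="utf-8"?>
<!-- Hypothetical DTD being mimicked:
     <!ELEMENT section (title, para+)>
     <!ELEMENT title   (#PCDATA)>
     <!ELEMENT para    (#PCDATA | em | code)*>
     <!ATTLIST para    id CDATA #IMPLIED>
     <!ELEMENT em      (#PCDATA)>
     <!ELEMENT code    (#PCDATA)>
-->
<grammar xmlns="http://relaxng.org/ns/structure/1.0">
  <start>
    <ref name="section"/>
  </start>

  <!-- (title, para+): a sequence, with + expressed as oneOrMore -->
  <define name="section">
    <element name="section">
      <ref name="title"/>
      <oneOrMore>
        <ref name="para"/>
      </oneOrMore>
    </element>
  </define>

  <define name="title">
    <element name="title"><text/></element>
  </define>

  <!-- Mixed content: #IMPLIED becomes optional, the choice goes inside mixed -->
  <define name="para">
    <element name="para">
      <optional><attribute name="id"/></optional>
      <mixed>
        <zeroOrMore>
          <choice>
            <ref name="em"/>
            <ref name="code"/>
          </choice>
        </zeroOrMore>
      </mixed>
    </element>
  </define>

  <define name="em">
    <element name="em"><text/></element>
  </define>

  <define name="code">
    <element name="code"><text/></element>
  </define>
</grammar>
```

The sequence in section needs no explicit group element, because the children of element are implicitly grouped.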

9.3.2 Using Datatypes and Facets of W3C XML Schema

RELAX NG does not have a fixed set of datatypes. Rather, it utilizes datatype libraries defined elsewhere. In particular, RELAX NG can use datatypes and facets of W3C XML Schema.

The schema in Listing 9.22 is equivalent to itemList-facet.xsd (see Listing 9.9), which specifies datatypes and facets.

Listing 9.22 Using datatypes and facets, chap09/itemList-facet.rng
       <?xml version="1.0" encoding="utf-8"?>
       <element name="itemList"
         xmlns="http://relaxng.org/ns/structure/1.0"
         xmlns:a="http://relaxng.org/ns/compatibility/annotations/1.0"
[5]      datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes">
         <zeroOrMore>
           <element name="item">
             <attribute name="number"/>
             <element name="name">
[10]           <data type="token">
                 <a:documentation>Tokens up to 5 characters</a:documentation>
[12]             <param name="maxLength">5</param>
               </data>
             </element>

             <element name="quantity">
[17]           <data type="short">
                 <a:documentation>Short numbers between 0 and 20</a:documentation>
[19]             <param name="maxInclusive">20</param>
[20]             <param name="minInclusive">0</param>
               </data>
             </element>
           </element>
         </zeroOrMore>
       </element>

The attribute datatypeLibrary (line 5) specifies which datatype library is used. The value http://www.w3.org/2001/XMLSchema-datatypes indicates the use of W3C XML Schema datatypes. The first data (line 10) specifies that the content of a name element is of the datatype token, which is a datatype of W3C XML Schema. The first param (line 12) specifies that tokens are up to 5 characters. This is done by specifying the facet maxLength of W3C XML Schema. The second data (line 17) specifies that the content of a quantity element is of the datatype short, which is a datatype of W3C XML Schema. The second param (line 19) specifies that the value is less than or equal to 20, while the third param (line 20) specifies that the value is greater than or equal to 0. This is done by specifying the facets maxInclusive and minInclusive of W3C XML Schema.
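
To see what these facets accept and reject, consider the following sketch of an instance document. It is hypothetical and written only for illustration. Only the first item satisfies itemList-facet.rng; each of the others violates one facet, so the document as a whole would be expected to fail validation.

```xml
<?xml version="1.0" encoding="utf-8"?>
<!-- Hypothetical instance for chap09/itemList-facet.rng -->
<itemList>
  <!-- Accepted: the name has 3 characters and the quantity lies in 0..20 -->
  <item number="1"><name>pen</name><quantity>20</quantity></item>

  <!-- Rejected: "pencil" has 6 characters, exceeding maxLength 5 -->
  <item number="2"><name>pencil</name><quantity>3</quantity></item>

  <!-- Rejected: 21 exceeds maxInclusive 20 -->
  <item number="3"><name>ink</name><quantity>21</quantity></item>

  <!-- Rejected: -1 is below minInclusive 0 -->
  <item number="4"><name>cap</name><quantity>-1</quantity></item>
</itemList>
```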

9.3.3 Using Namespaces

Handling of namespaces in RELAX NG is quite different from that in W3C XML Schema. In RELAX NG, one-to-one correspondence between namespaces and schemas is not required. Rather, a single RELAX NG schema can handle multiple namespaces. When we declare an element or attribute, we can indicate the namespace name by specifying or inheriting the attribute ns.

Recall multiNS.xml (see Listing 9.12) and multiNS-differentPrefix-xsd.xml (see Listing 9.16). Because these XML documents have two namespaces, we had to create two W3C XML Schema schemas, foo.xsd (see Listing 9.13) and xhtml.xsd (see Listing 9.14). However, RELAX NG allows us to create a single RELAX NG schema as shown in Listing 9.23.

Listing 9.23 A schema for two namespaces, chap09/multiNS.rng
       <?xml version="1.0" encoding="utf-8"?>
[2]    <element
         name="foo"
         ns="http://www.example.net/foo"
         xmlns="http://relaxng.org/ns/structure/1.0"
         datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes">

          <zeroOrMore>
            <choice>

[11]          <element name="p"  ns="http://www.w3.org/1999/xhtml">
                <data type="string"/>
              </element>

[15]          <element name="ol" ns="http://www.w3.org/1999/xhtml">
                <zeroOrMore>
[17]              <element name="li">
                    <data type="string"/>
                  </element>
                </zeroOrMore>
              </element>

            </choice>
          </zeroOrMore>
       </element>

Among the four element elements in this schema, the first (line 2) specifies the attribute ns and announces the namespace http://www.example.net/foo. The second (line 11) announces the namespace http://www.w3.org/1999/xhtml. The third (line 15) announces the same namespace, which is inherited by the fourth one (line 17).
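
A document of the following shape should be accepted by this schema. It is a hypothetical sketch, not a reproduction of multiNS.xml, and the prefix h is an arbitrary choice; RELAX NG compares namespace names, not prefixes.

```xml
<?xml version="1.0" encoding="utf-8"?>
<!-- Hypothetical instance for chap09/multiNS.rng -->
<foo xmlns="http://www.example.net/foo"
     xmlns:h="http://www.w3.org/1999/xhtml">
  <!-- p and ol are declared with ns="http://www.w3.org/1999/xhtml" -->
  <h:p>A paragraph.</h:p>
  <h:ol>
    <!-- li inherits the XHTML namespace from the enclosing ol declaration -->
    <h:li>First entry</h:li>
    <h:li>Second entry</h:li>
  </h:ol>
</foo>
```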

9.3.4 Co-occurrence Constraints

Co-occurrence constraints are interdependencies between attributes and elements. An example of co-occurrence constraints is that either the attribute foo or an element foo shall be specified, but not both. W3C XML Schema cannot capture co-occurrence constraints, although many users require them.
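
In RELAX NG, the constraint in this example can be sketched in a few lines. The enclosing element name bar is hypothetical and serves only to give the attribute and the child element a parent.

```xml
<?xml version="1.0" encoding="utf-8"?>
<!-- Sketch: exactly one of the attribute foo or the child element foo -->
<element name="bar" xmlns="http://relaxng.org/ns/structure/1.0">
  <choice>
    <attribute name="foo"/>
    <element name="foo">
      <text/>
    </element>
  </choice>
</element>
```

Under this pattern, `<bar foo="x"/>` and `<bar><foo>x</foo></bar>` are both permitted, while a bar element with both or with neither is not.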

RELAX NG is very powerful in handling co-occurrence constraints. To demonstrate, we use the schema shown in Listing 9.24.

Listing 9.24 A schema for co-occurrence constraints, chap09/bookList.rng
       <?xml version="1.0" encoding="utf-8"?>
       <element name="bookList"
         ns="http://www.example.org"
         xmlns="http://relaxng.org/ns/structure/1.0"
         datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes">
         <zeroOrMore>
           <element name="book">
[8]          <interleave>
[9]            <choice>
[10]             <element name="title"><data type="string"/></element>
[11]             <attribute name="title"><data type="string"/></attribute>
[12]           </choice>
[13]           <choice>
[14]             <oneOrMore>
[15]               <element name="author"><data type="string"/></element>
[16]             </oneOrMore>
[17]             <attribute name="author"><data type="string"/></attribute>
[18]           </choice>
[19]           <optional>
[20]             <choice>
[21]               <element name="price"><data type="decimal"/></element>
[22]               <attribute name="price"><data type="decimal"/></attribute>
[23]             </choice>
[24]           </optional>
[25]         </interleave>
           </element>
         </zeroOrMore>
       </element>

Lines 8 through 25 describe the content and attributes of book elements. The first choice element (line 9) specifies that a title is represented by either a child element (line 10) or an attribute (line 11). Either a single child element or an attribute exists, but not both.

The second choice element (line 13) specifies that authors are represented either by child elements (lines 14-16) or an attribute (line 17). More than one author element may exist. The attribute author is present only if there are no author elements.

The third choice element (line 20) specifies that prices are represented by a child element (line 21) or an attribute (line 22). Because we have optional (line 19), there might not be any price information.

These three choice elements are combined by interleave (line 8) rather than group. As a result, the child elements title, author, and price (if any) may occur in any order.
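
The following sketch lists book elements that this schema would be expected to reject, one per constraint. The document is hypothetical and the titles and author names are placeholders.

```xml
<?xml version="1.0" encoding="utf-8"?>
<!-- Hypothetical invalid instance for chap09/bookList.rng -->
<bookList xmlns="http://www.example.org">

  <!-- Rejected: the title is given both as an attribute and as an element -->
  <book title="A">
    <title>A</title>
    <author>X</author>
  </book>

  <!-- Rejected: the author attribute is combined with an author element -->
  <book title="B" author="X">
    <author>Y</author>
  </book>

  <!-- Rejected: no title at all, although the title choice is mandatory -->
  <book author="X"/>

</bookList>
```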

The bookList document, shown in Listing 9.25, is valid against this schema. This document contains three book elements. They represent title, author, and price information differently.

Listing 9.25 A document with book information, chap09/bookList.xml
<?xml version="1.0" encoding="utf-8"?>
<bookList xmlns="http://www.example.org">

  <book price="81.94">
    <title>The Java (tm) Programming Language, Third Edition</title>
    <author>Ken Arnold</author>
    <author>James Gosling</author>
    <author>David Holmes</author>
  </book>

  <book title="The Java Tutorial Second Edition">
    <author>Mary Campione</author>
    <price>45.95</price>
    <author>Kathy Walrath</author>
  </book>

  <book author="Peter Holman" title="Dowland, Lachrimae"/>

</bookList>

To represent titles, the first book element uses an element, while the second and third use the title attribute. To represent authors, the first and second use author elements, while the third uses the attribute author. To represent prices, the first uses the attribute price, and the second uses a price element. The third does not have price information. Observe that the second book element has an author element, a price element, and another author element in that order. This sequence will be disallowed if we replace interleave (line 8 in Listing 9.24) with group.
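
To illustrate that last point, here is a sketch of how the second book element would have to be written if interleave were replaced with group. Child elements would then have to follow the order title, author, price; attributes remain unordered in either case. The wrapper document is hypothetical.

```xml
<?xml version="1.0" encoding="utf-8"?>
<!-- Hypothetical instance for a group-based variant of chap09/bookList.rng -->
<bookList xmlns="http://www.example.org">
  <book title="The Java Tutorial Second Edition">
    <!-- All author elements must now precede the price element -->
    <author>Mary Campione</author>
    <author>Kathy Walrath</author>
    <price>45.95</price>
  </book>
</bookList>
```

This document is also valid against the original interleave-based schema, since interleave permits any ordering, including this one.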

9.3.5 Further Information

Although we have seen most of the keywords of RELAX NG, we have not quite covered all its features. The details of RELAX NG are described by the official documents in the following list.

RELAX NG Tutorial

This tutorial demonstrates most of the features of RELAX NG with ample examples.

RELAX NG Specification

This is the definitive specification of RELAX NG and is intended to be used in conjunction with the next two specifications.

RELAX NG DTD Compatibility

This specification defines datatypes and annotations for use in RELAX NG schemas to support some of the features of XML 1.0 DTDs that are not supported directly by RELAX NG.

Guidelines for using W3C XML Schema Datatypes with RELAX NG

This document specifies guidelines for using the datatypes and facets of W3C XML Schema from RELAX NG.

In parallel to the language design, RELAX NG has been actively implemented. As of this writing, three validators for RELAX NG have been developed. They are Jing, Multi-Schema Validator, and VBRELAXNG. Jing and VBRELAXNG are included on the accompanying CD-ROM.

  • Jing was developed by James Clark, the chair of the RELAX NG technical committee and a co-editor of the specifications.

  • Multi-Schema Validator was developed by Kohsuke Kawaguchi of Sun Microsystems. Besides supporting RELAX NG, it supports RELAX Namespace, RELAX Core, TREX, and a subset of W3C XML Schema.

  • VBRELAXNG was developed by Koji Yonekura of NEC. It is written in Visual Basic and provides an interactive tutorial.

Further information, including these specifications, is available at the Web page of the OASIS RELAX NG technical committee (http://relaxng.org/).
