RELAX NG for Python

February 7, 2002

A.M. Kuchling

What is RELAX NG?

RELAX NG (or RNG) is a schema language for XML.

Schema languages let you check whether an XML document conforms to a given schema: that it follows a certain structure of elements and attributes.
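
To make "conforms to a given schema" concrete, here is a minimal sketch of the kind of check a schema language automates, written by hand with the standard library's xml.etree.ElementTree. The rule set (a title element with a fixed set of permitted attributes and no child elements) and the names ALLOWED_ATTRIBUTES and conforms are assumptions made up for illustration; a real schema would express these rules declaratively.

```python
import xml.etree.ElementTree as ET

# Hypothetical rule set for illustration only: a <title> element whose
# attributes must all come from this set, and which has no child elements.
ALLOWED_ATTRIBUTES = {"id", "class", "title"}


def conforms(xml_text):
    """Return True if xml_text is a <title> element obeying the rules above."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        return False  # not even well-formed XML
    if root.tag != "title":
        return False  # wrong root element
    if len(root) > 0:
        return False  # child elements are not permitted
    # Every attribute present must be one of the allowed names.
    return set(root.attrib) <= ALLOWED_ATTRIBUTES


print(conforms('<title id="t1">RELAX NG</title>'))       # True
print(conforms('<title colour="red">RELAX NG</title>'))  # False
```

Hand-written checks like this grow quickly as the document format grows, which is the motivation for stating the structure once in a schema and letting a validator do the work.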


Other Schema Languages (I): DTDs

DTDs date back to SGML, and are also in XML 1.0.

<!ELEMENT title (#PCDATA)>
<!ATTLIST title
        id ID #IMPLIED
        class CDATA #IMPLIED
        title CDATA #IMPLIED>

Opinion in the XML community seems to be running against them:

So long, and thanks for all the fish...

Other Schema Languages (II): XML Schema

Variously called XML Schema or XSD.

<xs:schema xmlns:xs="http://.../2001/XMLSchema">
  <xs:element name="title">
    <xs:complexType mixed="true">
      <xs:attribute name="id" type="xs:ID"/>
      <xs:attribute name="class" type="xs:string"/>
      <xs:attribute name="title" type="xs:string"/>
    </xs:complexType>
  </xs:element>
</xs:schema>


RELAX NG Example

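<?xml version="1.0"?>
<element name="title" xmlns="http://relaxng.org/ns/structure/1.0"
         datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes">

  <attribute name="id"><text/></attribute>

  <attribute name="class"><text/></attribute>

  <attribute name="timestamp">
    <data type="dateTime"/>
  </attribute>
</element>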

Implementation: Derivatives

The validation algorithm for RELAX NG is remarkably elegant, and is based on computing the derivative of a pattern.

A pattern P is nullable if the empty string (or empty tree) matches it.

The derivative of a pattern P with respect to a tree X is
a pattern matching what's left of P after matching X.

Pattern   Text   Derivative
a+b+      a      a*b+
a*b+      aaa    a*b+
a*b       b      Empty pattern
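
The three rows above can be reproduced with a small Python sketch of derivatives over plain strings. This is not the full RELAX NG algorithm, which works on XML trees with elements and attributes; it only illustrates nullability and derivatives for regular-expression-like patterns. All class and function names here (Empty, NotAllowed, Char, Seq, Choice, Star, seq, choice, plus, nullable, deriv, deriv_text, matches) are assumptions chosen for this illustration.

```python
from dataclasses import dataclass


class Pattern:
    """Base class for the illustrative pattern types below."""


@dataclass(frozen=True)
class Empty(Pattern):        # matches only the empty string
    pass


@dataclass(frozen=True)
class NotAllowed(Pattern):   # matches nothing at all
    pass


@dataclass(frozen=True)
class Char(Pattern):         # matches exactly one character
    ch: str


@dataclass(frozen=True)
class Seq(Pattern):          # first followed by second
    first: Pattern
    second: Pattern


@dataclass(frozen=True)
class Choice(Pattern):       # left or right
    left: Pattern
    right: Pattern


@dataclass(frozen=True)
class Star(Pattern):         # zero or more repetitions of inner
    inner: Pattern


def seq(p, q):
    """Build a sequence, simplifying trivial cases."""
    if isinstance(p, NotAllowed) or isinstance(q, NotAllowed):
        return NotAllowed()
    if isinstance(p, Empty):
        return q
    if isinstance(q, Empty):
        return p
    return Seq(p, q)


def choice(p, q):
    """Build a choice, simplifying trivial cases."""
    if isinstance(p, NotAllowed):
        return q
    if isinstance(q, NotAllowed):
        return p
    return p if p == q else Choice(p, q)


def plus(p):
    """One or more repetitions: p followed by p*."""
    return seq(p, Star(p))


def nullable(p):
    """True if the empty string matches p."""
    if isinstance(p, (Empty, Star)):
        return True
    if isinstance(p, Seq):
        return nullable(p.first) and nullable(p.second)
    if isinstance(p, Choice):
        return nullable(p.left) or nullable(p.right)
    return False  # Char and NotAllowed


def deriv(p, c):
    """Pattern matching what is left of p after consuming character c."""
    if isinstance(p, Char):
        return Empty() if p.ch == c else NotAllowed()
    if isinstance(p, Seq):
        rest = seq(deriv(p.first, c), p.second)
        if nullable(p.first):
            # c may also be consumed by the second part.
            return choice(rest, deriv(p.second, c))
        return rest
    if isinstance(p, Choice):
        return choice(deriv(p.left, c), deriv(p.right, c))
    if isinstance(p, Star):
        return seq(deriv(p.inner, c), p)
    return NotAllowed()  # Empty and NotAllowed cannot consume a character


def deriv_text(p, text):
    """Take the derivative one character at a time."""
    for c in text:
        p = deriv(p, c)
    return p


def matches(p, text):
    """text matches p if the derivative with respect to text is nullable."""
    return nullable(deriv_text(p, text))


a, b = Char("a"), Char("b")
print(deriv_text(seq(plus(a), plus(b)), "a") == seq(Star(a), plus(b)))    # row 1
print(deriv_text(seq(Star(a), plus(b)), "aaa") == seq(Star(a), plus(b)))  # row 2
print(deriv_text(seq(Star(a), b), "b") == Empty())                        # row 3
```

The seq and choice helpers simplify as they build, so each derivative comes out in the same form as the table rather than as a nest of redundant Empty and NotAllowed nodes. Matching then reduces to one question: after taking the derivative with respect to the whole input, is the remaining pattern nullable?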

Current Status


