Modularization of XHTML™


W3C Recommendation 10 April 2001



This version:
http://www.w3.org/TR/2001/REC-xhtml-modularization-20010410 (Single HTML file (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html), PostScript version (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.ps), PDF version (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.pdf), ZIP archive (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.zip), or Gzip'd TAR archive (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.tgz))
Latest version:
http://www.w3.org/TR/xhtml-modularization
Previous version:
http://www.w3.org/TR/2001/PR-xhtml-modularization-20010222
Editors:
Murray Altheim (mailto:altheim@eng.sun.com), Sun Microsystems (http://www.sun.com/)
Frank Boumphrey (mailto:bckman@ix.netcom.com), HTML Writers Guild (http://www.hwg.org/)
Sam Dooley (mailto:dooley@watson.ibm.com), IBM (http://www.ibm.com/)
Shane McCarron (mailto:shane@aptest.com), Applied Testing and Technology (http://www.aptest.com/)
Sebastian Schnitzenbaumer (mailto:schnitz@mozquito.com), Mozquito Technologies AG (http://www.mozquito.com/)
Ted Wugofski (mailto:ted.wugofski@openwave.com), Openwave (http://www.openwave.com/) (formerly Gateway)
Copyright (http://www.w3.org/Consortium/Legal/ipr-notice-20000612#Copyright) ©2001 W3C® (MIT, INRIA, Keio (http://www.keio.ac.jp/)), All Rights Reserved. W3C liability (http://www.w3.org/Consortium/Legal/ipr-notice-20000612#Legal_Disclaimer), trademark (http://www.w3.org/Consortium/Legal/ipr-notice-20000612#W3C_Trademarks), document use (http://www.w3.org/Consortium/Legal/copyright-documents-19990405) and software licensing (http://www.w3.org/Consortium/Legal/copyright-software-19980720) rules apply.





Abstract


This Recommendation specifies an abstract modularization of XHTML and an implementation of the abstraction using XML Document Type Definitions (DTDs). This modularization provides a means for subsetting and extending XHTML, a feature needed for extending XHTML's reach onto emerging platforms.


Status of This Document


This section describes the status of this document at the time of its publication. Other documents may supersede this document. The latest status of this document series is maintained at the W3C.


This document has been reviewed by W3C Members and other interested parties and has been endorsed by the Director as a W3C Recommendation. It is a stable document and may be used as reference material or cited as a normative reference from another document. W3C's role in making the Recommendation is to draw attention to the specification and to promote its widespread deployment. This enhances the functionality and interoperability of the Web.


This document has been produced by the W3C HTML Working Group (http://www.w3.org/MarkUp/Group/) (members only (http://cgi.w3.org/MemberAccess/AccessRequest)) as part of the W3C HTML Activity (http://www.w3.org/MarkUp/). The goals of the HTML Working Group are discussed in the HTML Working Group charter (http://www.w3.org/MarkUp/2000/Charter). The W3C staff contact for work on HTML is Masayasu Ishikawa (mailto:mimasa@w3.org).


Public discussion of HTML takes place on www-html@w3.org (mailto:www-html@w3.org) (archive (http://lists.w3.org/Archives/Public/www-html/)). To subscribe send an email to www-html-request@w3.org (mailto:www-html-request@w3.org) with the word subscribe in the subject line.


Please report errors in this document to www-html-editor@w3.org (mailto:www-html-editor@w3.org) (archive (http://lists.w3.org/Archives/Public/www-html-editor/)). The list of known errors (http://www.w3.org/2001/04/REC-xhtml-modularization-20010410-errata) in this specification is available at http://www.w3.org/2001/04/REC-xhtml-modularization-20010410-errata.


The English version of this specification is the only normative version. Information about translations of this document (http://www.w3.org/MarkUp/translations) is available at http://www.w3.org/MarkUp/translations.


A list of current W3C Recommendations and other technical documents can be found at http://www.w3.org/TR.


Full Table of Contents




  • 1. Introduction (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#s_intro)


  • 1.1. What is XHTML? (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#s_intro_whatisxhtml)
  • 1.2. What is XHTML Modularization? (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#s_intro_whatismod)
  • 1.3. Why Modularize XHTML? (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#s_intro_xhtml_mods)


  • 1.3.1. Abstract modules (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#s_intro_abstract)
  • 1.3.2. Module implementations (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#s_intro_module_implementation)
  • 1.3.3. Hybrid document types (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#s_intro_hybrid)
  • 1.3.4. Validation (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#s_intro_validation)
  • 1.3.5. Formatting Model (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#s_intro_formatting)





  • 4.1. Syntactic Conventions (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#sec_4.1.)
  • 4.2. Content Types (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#s_common_types)
  • 4.3. Attribute Types (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#s_common_attrtypes)
  • 4.4. An Example Abstract Module Definition (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#sec_4.4.)


  • 4.4.1. XHTML Skiing Module (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#sec_4.4.1.)


  • 5. XHTML Abstract Modules (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#s_xhtmlmodules)


  • 5.1. Attribute Collections (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#s_commonatts)
  • 5.2. Core Modules (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#sec_5.2.)


  • 5.2.1. Structure Module (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#s_structuremodule)
  • 5.2.2. Text Module (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#s_textmodule)
  • 5.2.3. Hypertext Module (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#s_hypertextmodule)
  • 5.2.4. List Module (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#s_listmodule)


  • 5.3. Applet Module (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#s_appletmodule)
  • 5.4. Text Extension Modules (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#s_text)


  • 5.4.1. Presentation Module (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#s_presentationmodule)
  • 5.4.2. Edit Module (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#s_editmodule)
  • 5.4.3. Bi-directional Text Module (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#s_bdomodule)


  • 5.5. Forms Modules (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#s_forms)


  • 5.5.1. Basic Forms Module (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#s_sformsmodule)
  • 5.5.2. Forms Module (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#s_extformsmodule)


  • 5.6. Table Modules (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#sec_5.6.)


  • 5.6.1. Basic Tables Module (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#s_simpletablemodule)
  • 5.6.2. Tables Module (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#s_tablemodule)


  • 5.7. Image Module (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#s_imagemodule)
  • 5.8. Client-side Image Map Module (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#s_imapmodule)
  • 5.9. Server-side Image Map Module (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#s_servermapmodule)
  • 5.10. Object Module (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#s_objectmodule)
  • 5.11. Frames Module (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#s_framesmodule)
  • 5.12. Target Module (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#s_targetmodule)
  • 5.13. Iframe Module (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#s_iframemodule)
  • 5.14. Intrinsic Events Module (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#s_intrinsiceventsmodule)
  • 5.15. Metainformation Module (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#s_metamodule)
  • 5.16. Scripting Module (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#s_scriptmodule)
  • 5.17. Style Sheet Module (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#s_stylemodule)
  • 5.18. Style Attribute Module (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#s_styleattributemodule)
  • 5.19. Link Module (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#s_linkmodule)
  • 5.20. Base Module (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#s_basemodule)
  • 5.21. Name Identification Module (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#s_nameidentmodule)
  • 5.22. Legacy Module (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#s_legacymodule)








  • E.4. Creating a new DTD (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#sec_E.4.)



  • E.5. Using the new DTD (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#sec_E.5.)



  • F.1. XHTML Character Entities (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#a_xhtml_character_entities)



  • F.2. XHTML Modular Framework (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#a_xhtml_framework)


  • F.2.1. XHTML Base Architecture (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#a_module_XHTML_Base_Architecture)
  • F.2.2. XHTML Notations (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#a_module_XHTML_Notations)
  • F.2.3. XHTML Datatypes (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#a_module_XHTML_Datatypes)
  • F.2.4. XHTML Common Attribute Definitions (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#a_module_XHTML_Common_Attribute_Definitions)
  • F.2.5. XHTML Qualified Names (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#a_module_XHTML_Qualified_Names)
  • F.2.6. XHTML Character Entities (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#a_module_XHTML_Character_Entities)



  • F.3.1. XHTML Core Modules (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#a_modules_basicmods)
  • F.3.2. Applet (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#a_module_Applet)
  • F.3.3. Text Modules (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#a_textmodule)
  • F.3.4. Forms (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#sec_F.3.4.)
  • F.3.5. Tables (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#sec_F.3.5.)
  • F.3.6. Image (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#a_module_Image)
  • F.3.7. Client-side Image Map (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#a_module_Client-side_Image_Map)
  • F.3.8. Server-side Image Map (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#a_module_Server-side_Image_Map)
  • F.3.9. Object (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#a_module_Object)
  • F.3.10. Frames (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#a_module_Frames)
  • F.3.11. Target (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#a_module_Target)
  • F.3.12. Iframe (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#a_module_Iframe)
  • F.3.13. Intrinsic Events (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#a_module_Intrinsic_Events)
  • F.3.14. Metainformation (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#a_module_Metainformation)
  • F.3.15. Scripting (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#a_module_Scripting)
  • F.3.16. Style Sheet (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#a_module_Style_Sheet)
  • F.3.17. Style Attribute (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#a_module_Style_Attribute)
  • F.3.18. Link (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#a_module_Link)
  • F.3.19. Base (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#a_module_Base)
  • F.3.20. Name Identification (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#a_module_Name_Identification)
  • F.3.21. Legacy (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#a_module_Legacy)



  • F.4.1. Block Phrasal (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#a_module_Block_Phrasal)
  • F.4.2. Block Presentational (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#a_module_Block_Presentational)
  • F.4.3. Block Structural (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#a_module_Block_Structural)
  • F.4.4. Inline Phrasal (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#a_module_Inline_Phrasal)
  • F.4.5. Inline Presentational (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#a_module_Inline_Presentational)
  • F.4.6. Inline Structural (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#a_module_Inline_Structural)
  • F.4.7. Param (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#a_module_Param)
  • F.4.8. Legacy Redeclarations (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#a_module_Legacy_Redeclarations)


  • G. References (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#a_refs)


  • G.1. Normative References (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#a_normrefs)
  • G.2. Informative References (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#a_inforefs)


  • H. Design Goals (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#s_intro_design)


  • H.1. Requirements (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#s_intro_requirements)


  • H.1.1. Granularity (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#s_req_granularity)
  • H.1.2. Composibility (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#s_req_composibility)
  • H.1.3. Ease of Use (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#s_req_easeofuse)
  • H.1.4. Compatibility (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#s_req_compatibility)
  • H.1.5. Conformance (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#s_req_conformance)


  • J. Acknowledgements (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#a_acks)


1. Introduction


This section is informative.


1.1. What is XHTML?


XHTML is the reformulation of HTML 4 as an application of XML. XHTML 1.0 [XHTML1 (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#ref_xhtml1)] specifies three XML document types that correspond to the three HTML 4 DTDs: Strict, Transitional, and Frameset. XHTML 1.0 is the basis for a family of document types that subset and extend HTML.


1.2. What is XHTML Modularization?


XHTML Modularization is a decomposition of XHTML 1.0, and by reference HTML 4, into a collection of abstract modules that provide specific types of functionality. These abstract modules are implemented in this specification using the XML Document Type Definition language, but an implementation using XML Schemas is expected. The rules for defining the abstract modules, and for implementing them using XML DTDs, are also defined in this document.


These modules may be combined with each other and with other modules to create XHTML subset and extension document types that qualify as members of the XHTML-family of document types.


1.3. Why Modularize XHTML?


The modularization of XHTML refers to the task of specifying well-defined sets of XHTML elements that can be combined and extended by document authors, document type architects, other XML standards specifications, and application and product designers to make it economically feasible for content developers to deliver content on a greater number and diversity of platforms.


Over the last couple of years, many specialized markets have begun looking to HTML as a content language. There is a great movement toward using HTML across increasingly diverse computing platforms. Currently there is activity to move HTML onto mobile devices (hand held computers, portable phones, etc.), television devices (digital televisions, TV-based Web browsers, etc.), and appliances (fixed function devices). Each of these devices has different requirements and constraints.


Modularizing XHTML provides a means for product designers to specify which elements are supported by a device using standard building blocks and standard methods for specifying which building blocks are used. These modules serve as "points of conformance" for the content community. The content community can now target the installed base that supports a certain collection of modules, rather than worry about the installed base that supports this or that permutation of XHTML elements. The use of standards is critical for modularized XHTML to be successful on a large scale. It is not economically feasible for content developers to tailor content to each and every permutation of XHTML elements. By specifying a standard, either software processes can autonomously tailor content to a device, or the device can automatically load the software required to process a module.


Modularization also allows for the extension of XHTML's layout and presentation capabilities, using the extensibility of XML, without breaking the XHTML standard. This development path provides a stable, useful, and implementable framework for content developers and publishers to manage the rapid pace of technological change on the Web.


1.3.1. Abstract modules


An XHTML document type is defined as a set of abstract modules. An abstract module defines one kind of data that is semantically different from all others. Abstract modules can be combined into document types without a deep understanding of the underlying schemas that define the modules.


1.3.2. Module implementations


A module implementation consists of a set of element types, a set of attribute-list declarations, and a set of content model declarations, where any of these three sets may be empty. An attribute-list declaration in a module may modify an element type outside the element types defined in the module, and a content model declaration may modify an element type outside the element type set of the module.


One implementation mechanism is XML DTDs. An XML DTD is a means of describing the structure of a class of XML documents, collectively known as an XML document type. XML DTDs are described in the XML 1.0 Recommendation [XML (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#ref_xml)]. Another implementation mechanism is XML Schema [XMLSCHEMA (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#ref_xmlschema)].


1.3.3. Hybrid document types


A hybrid document type is a document type composed from a collection of XML DTDs or DTD Modules. The primary purpose of the modularization framework described in this document is to allow a DTD author to combine elements from multiple abstract modules into a hybrid document type, develop documents against that hybrid document type, and validate those documents against the associated hybrid document type definition.


One of the most valuable benefits of XML over SGML is that XML reduces the barrier to entry for standardization of element sets that allow communities to exchange data in an interoperable format. However, the relatively static nature of HTML as the content language for the Web has meant that any one of these communities has previously held out little hope that its XML document types would be able to see widespread adoption as part of Web standards. The modularization framework allows for the dynamic incorporation of these diverse document types within the XHTML-family of document types, further reducing the barriers to the incorporation of these domain-specific vocabularies in XHTML documents.


1.3.4. Validation


The use of well-formed, but not valid, documents is an important benefit of XML. In the process of developing a document type, however, the additional leverage provided by a validating parser for error checking is important. The same statement applies to XHTML document types with elements from multiple abstract modules.
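

The distinction between well-formed and valid can be sketched in Python. This is a minimal illustration, assuming the standard library's non-validating parser (xml.etree.ElementTree); the function name is_well_formed and the ski-lift element are hypothetical and do not come from this specification. The check covers well-formedness only and says nothing about validity against a DTD.

```python
import xml.etree.ElementTree as ET


def is_well_formed(document: str) -> bool:
    """Return True if the text parses as well-formed XML.

    The parser used here is non-validating, so an element that no DTD
    declares is still accepted as long as the markup nests correctly.
    """
    try:
        ET.fromstring(document)
    except ET.ParseError:
        return False
    return True


# Well-formed, although <ski-lift> (hypothetical) is not an XHTML element.
WELL_FORMED = "<p>Ride the <ski-lift>gondola</ski-lift> to the summit.</p>"

# Not well-formed: <b> and <i> overlap instead of nesting.
NOT_WELL_FORMED = "<p><b>bold <i>both</b> italic</i></p>"
```

A validating parser would additionally reject the first sample when it is checked against a DTD that does not declare the extra element.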


A document is an instance of one particular document type defined by the DTD identified in the document's prologue. Validating the document is the process of checking that the document complies with the rules in the document type definition.
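

As a rough sketch of how a processor might find the DTD named in a document's prologue, the following Python reads the document type declaration with the standard library's expat bindings. The helper name read_doctype and the sample document are illustrative assumptions. The code only reports what the declaration says; it neither fetches the DTD nor validates the document.

```python
from xml.parsers import expat


def read_doctype(document: str) -> dict:
    """Return the root element name and the public and system identifiers
    from the document type declaration, or an empty dict if there is none."""
    found = {}

    def on_doctype(name, system_id, public_id, has_internal_subset):
        # expat calls this handler once, when it reaches <!DOCTYPE ...>.
        found.update(name=name, public_id=public_id, system_id=system_id)

    parser = expat.ParserCreate()
    parser.StartDoctypeDeclHandler = on_doctype
    parser.Parse(document, True)
    return found


SAMPLE = """<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Example</title></head>
  <body><p>Hello</p></body>
</html>"""

doctype = read_doctype(SAMPLE)
```

The public identifier recovered this way is what a validating processor would typically use to locate the document type definition.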


One document can consist of multiple document fragments. Validating only fragments of a document, where each fragment is of a different document type than the other fragments in the document, is beyond the scope of this framework, since it would require technology that is not yet defined.


However, the modularization framework allows multiple document type definitions to be integrated and form a new document type (e.g. SVG integrated into XHTML). The new document type definition can be used for normal XML 1.0 validation.


1.3.5. Formatting Model


Earlier versions of HTML attempted to define parts of the model that user agents are required to use when formatting a document. With the advent of HTML 4, the W3C started the process of divorcing presentation from structure. XHTML 1.0 maintained this separation, and this document continues moving HTML and its descendants down this path. Consequently, this document makes no requirements on the formatting model associated with the presentation of documents marked up with XHTML Family document types.


Instead, this document recommends that content authors rely upon style mechanisms such as CSS to define the formatting model for their content. When user agents support the style mechanisms, documents will format as expected. When user agents do not support the style mechanisms, documents will format as appropriate for that user agent. This permits XHTML Family user agents to support rich formatting models on devices where that is appropriate, and lean formatting models on devices where that is appropriate.


2. Terms and Definitions


This section is informative.


While some terms are defined in place, the following definitions are used throughout this document. Familiarity with the W3C XML 1.0 Recommendation [XML (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#ref_xml)] is highly recommended.



abstract module
a unit of document type specification corresponding to a distinct type of content, corresponding to a markup construct reflecting this distinct type.
content model
the declared markup structure allowed within instances of an element type. XML 1.0 differentiates two types: elements containing only element content (no character data) and mixed content (elements that may contain character data optionally interspersed with child elements). The latter are characterized by a content specification beginning with the "#PCDATA" string (denoting character data).
document model
the effective structure and constraints of a given document type. The document model constitutes the abstract representation of the physical or semantic structures of a class of documents.
document type
a class of documents sharing a common abstract structure. The ISO 8879 [SGML (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#ref_sgml)] definition is as follows: "a class of documents having similar characteristics; for example, journal, article, technical manual, or memo. (4.102)"
document type definition (DTD)
a formal, machine-readable expression of the XML structure and syntax rules to which a document instance of a specific document type must conform; the schema type used in XML 1.0 to validate conformance of a document instance to its declared document type. The same markup model may be expressed by a variety of DTDs.
driver
a generally short file used to declare and instantiate the modules of a DTD. A good rule of thumb is that a DTD driver contains no markup declarations that comprise any part of the document model itself.
element
an instance of an element type.
element type
the definition of an element, that is, a container for a distinct semantic class of document content.
entity
an entity is a logical or physical storage unit containing document content. Entities may be composed of parse-able XML markup or character data, or unparsed (i.e., non-XML, possibly non-textual) content. Entity content may be either defined entirely within the document entity ("internal entities") or external to the document entity ("external entities"). In parsed entities, the replacement text may include references to other entities.
entity reference
a mnemonic string used as a reference to the content of a declared entity (e.g., "&amp;" for "&", "&lt;" for "<", "&copy;" for "©").
generic identifier
the name identifying the element type of an element. Also, element type name.
hybrid document
A hybrid document is a document that uses more than one XML namespace. Hybrid documents may be defined as documents that contain elements or attributes from hybrid document types.
instantiate
to replace an entity reference with an instance of its declared content.
markup declaration
a syntactical construct within a DTD declaring an entity or defining a markup structure. Within XML DTDs, there are four specific types: entity declaration defines the binding between a mnemonic symbol and its replacement content; element declaration constrains which element types may occur as descendants within an element (see also content model); attribute definition list declaration defines the set of attributes for a given element type, and may also establish type constraints and default values; notation declaration defines the binding between a notation name and an external identifier referencing the format of an unparsed entity.
markup model
the markup vocabulary (i.e., the gamut of element and attribute names, notations, etc.) and grammar (i.e., the prescribed use of that vocabulary) as defined by a document type definition (i.e., a schema). The markup model is the concrete representation in markup syntax of the document model, and may be defined with varying levels of strict conformity. The same document model may be expressed by a variety of markup models.
module
an abstract unit within a document model expressed as a DTD fragment, used to consolidate markup declarations to increase the flexibility, modifiability, reuse and understanding of specific logical or semantic structures.
modularization
an implementation of a modularization model; the process of composing or de-composing a DTD by dividing its markup declarations into units or groups to support specific goals. Modules may or may not exist as separate file entities (i.e., the physical and logical structures of a DTD may mirror each other, but there is no such requirement).
modularization model
the abstract design of the document type definition (DTD) in support of the modularization goals, such as reuse, extensibility, expressiveness, ease of documentation, code size, consistency and intuitiveness of use. It is important to note that a modularization model is only orthogonally related to the document model it describes, so that two very different modularization models may describe the same document type.
parameter entity
an entity whose scope of use is within the document prolog (i.e., the external subset/DTD or internal subset). Parameter entities are disallowed within the document instance.
parent document type
A parent document type of a hybrid document is the document type of the root element.
tag
descriptive markup delimiting the start and end (including its generic identifier and any attributes) of an element.
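
The terms "entity reference" and "instantiate" defined above can be illustrated with a short Python sketch. It assumes the standard library's xml.etree.ElementTree parser, which expands internal general entities declared in the internal subset. The note element and the product entity are hypothetical names chosen for the example.

```python
import xml.etree.ElementTree as ET

# The internal subset declares one general entity, "product".
# "&amp;" is predefined by XML 1.0 and needs no declaration.
DOCUMENT = """<!DOCTYPE note [
  <!ENTITY product "XHTML Modularization">
]>
<note>About &product; &amp; DTDs</note>"""

root = ET.fromstring(DOCUMENT)

# Each entity reference has been replaced by its declared content.
text = root.text
```

After parsing, the element's character data holds the replacement text rather than the mnemonic references.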

3. Conformance Definition


This section is normative.


In order to ensure that XHTML-family documents are maximally portable among XHTML-family user agents, this specification rigidly defines conformance requirements for both documents and user agents, as well as for XHTML-family document types. While the conformance definitions can be found in this section, they necessarily reference normative text within this document, within the base XHTML specification [XHTML1 (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#ref_xhtml1)], and within other related specifications. It is only possible to fully comprehend the conformance requirements of XHTML through a complete reading of all normative references.


The keywords "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119 (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#ref_RFC2119)].


3.1. XHTML Host Language Document Type Conformance


It is possible to modify existing document types and define wholly new document types using both modules defined in this specification and other modules. Such a document type is "XHTML Host Language Conforming" when it meets the following criteria:



  • The document type must be defined using one of the implementation methods defined by the W3C. Currently this is limited to XML DTDs, but XML Schema will be available soon. The rest of this section refers to "DTDs" although other implementations are possible.
  • The DTD which defines the document type must have a unique identifier as defined in Naming Rules (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#s_conform_naming_rules) that uses the string "XHTML" in its first token of the public text description.
  • The DTD which defines the document type must include, at a minimum, the Structure, Hypertext, Text, and List modules defined in this specification.
  • For each of the W3C-defined modules that are included, all of the elements, attributes, types of attributes (including any required enumerated value lists), and any required minimal content models must be included (and optionally extended) in the document type's content model. When content models are extended, all of the elements and attributes (along with their types or any required enumerated value lists) required in the original content model must continue to be required.
  • The DTD which defines the document type may define additional elements and attributes. However, these must be in their own XML namespace [XMLNAMES (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#ref_xmlns)].
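

Two of the criteria above, the identifier rule and the minimum module set, lend themselves to a mechanical check. The Python below is a minimal sketch under stated assumptions: the function names are hypothetical, and the identifier is assumed to be a formal public identifier of the form "-//owner//class description//language". The Naming Rules section remains the authority on the identifier format, and the remaining criteria are not checked.

```python
from typing import Iterable, List

# Modules the criteria above require at a minimum.
REQUIRED_HOST_MODULES = {"Structure", "Hypertext", "Text", "List"}


def public_text_description(fpi: str) -> str:
    """Return the description part of a formal public identifier, e.g.
    'XHTML 1.0 Strict' for '-//W3C//DTD XHTML 1.0 Strict//EN'."""
    fields = fpi.split("//")
    if len(fields) < 3:
        raise ValueError(f"not a formal public identifier: {fpi!r}")
    # The third field is '<class> <description>', e.g. 'DTD XHTML 1.0 Strict'.
    _public_text_class, _, description = fields[2].partition(" ")
    return description


def host_language_problems(fpi: str, modules: Iterable[str]) -> List[str]:
    """List the violations found for the two criteria this sketch covers."""
    problems = []
    tokens = public_text_description(fpi).split()
    if not tokens or "XHTML" not in tokens[0]:
        problems.append('public text description must use "XHTML" in its first token')
    missing = REQUIRED_HOST_MODULES - set(modules)
    if missing:
        problems.append("missing required modules: " + ", ".join(sorted(missing)))
    return problems
```

An empty result means only that these two checks passed; it does not establish conformance.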


3.2. XHTML Integration Set Document Type Conformance


It is also possible to define document types that are based upon XHTML, but do not adhere to its structure. Such a document type is "XHTML Integration Set Conforming" when it meets the following criteria:



  • The document type must be defined using one of the implementation methods defined by the W3C. Currently this is limited to XML DTDs, but XML Schema will be available soon. The rest of this section refers to "DTDs" although other implementations are possible.
  • The DTD which defines the document type must have a unique identifier as defined in Naming Rules (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#s_conform_naming_rules) that includes the string "XHTML", but NOT in the first token of the public text description.
  • The DTD which defines the document type must include, at a minimum, the Hypertext, Text, and List modules defined in this specification.
  • For each of the W3C-defined modules that are included, all of the elements, attributes, types of attributes (including any required enumerated value lists), and any required minimal content models must be included (and optionally extended) in the document type's content model. When content models are extended, all of the elements and attributes (along with their types or any required enumerated value lists) required in the original content model must continue to be required.
  • The DTD which defines the document type may define additional elements and attributes. However, these must be in their own XML namespace [XMLNAMES (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#ref_xmlns)].
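
Example (informative): The two conformance classes above differ in their minimum module sets. The following is a minimal sketch of that one criterion only. The module names are taken from the lists above; the function name, the class labels "host-language" and "integration-set", and the idea of passing module names as strings are assumptions made for illustration. It does not check the other criteria (identifier, content models, namespaces).

```python
# Minimum module sets, as listed in sections 3.1 and 3.2.
REQUIRED_MODULES = {
    "host-language": {"Structure", "Hypertext", "Text", "List"},
    "integration-set": {"Hypertext", "Text", "List"},
}


def missing_modules(conformance_class, included):
    """Return the required modules that a document type fails to include.

    conformance_class: "host-language" or "integration-set" (labels are
                       hypothetical, chosen for this sketch).
    included:          iterable of module names the DTD includes.
    """
    return REQUIRED_MODULES[conformance_class] - set(included)
```

An empty result means only that the minimum module requirement is met, not that the document type conforms as a whole.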


3.3. XHTML Family Module Conformance


This specification defines a method for defining XHTML-conforming modules. A module conforms to this specification when it meets all of the following criteria:



  • The module must be defined using one of the implementation methods defined by the W3C. Currently this is limited to XML DTDs, but XML Schema will be available soon. The rest of this section refers to "DTDs" although other implementations are possible.
  • The DTD which defines the module must have a unique identifier as defined in Naming Rules (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#s_conform_naming_rules).
  • When the module is defined using an XML DTD, the module must insulate its parameter entity names through the use of unique prefixes or other, similar methods.
  • The module definition must have a prose definition that describes the syntactic and semantic requirements of the elements, attributes, and/or content models that it declares.
  • The module definition must not reuse any element names that are defined in other W3C-defined modules, except when the content model and semantics of those elements are either identical to the original or an extension of the original, or when the reused element names are within their own namespace (see below).
  • The module definition's elements and attributes must be part of an XML namespace [XMLNAMES (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#ref_xmlns)]. If the module is defined by an organization other than the W3C, this namespace must NOT be the same as the namespace in which other W3C modules are defined.


3.4. XHTML Family Document Conformance


A conforming XHTML family document is a valid instance of an XHTML Host Language Conforming Document Type.


3.5. XHTML Family User Agent Conformance


A conforming user agent must meet all of the following criteria (as defined in [XHTML1 (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#ref_xhtml1)]):



  • In order to be consistent with the XML 1.0 Recommendation [XML (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#ref_xml)], the user agent must parse and evaluate an XHTML document for well-formedness. If the user agent claims to be a validating user agent, it must also validate documents against their referenced DTDs according to [XML (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#ref_xml)].
  • When the user agent claims to support facilities defined within this specification or required by this specification through normative reference, it must do so in ways consistent with the facilities' definition.
  • When a user agent processes an XHTML document as generic [XML (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#ref_xml)], it shall only recognize attributes of type ID (e.g., the id attribute on most XHTML elements) as fragment identifiers.
  • If a user agent encounters an element it does not recognize, it must continue to process the children of that element. If the content is text, the text must be presented to the user.
  • If a user agent encounters an attribute it does not recognize, it must ignore the entire attribute specification (i.e., the attribute and its value).
  • If a user agent encounters an attribute value it doesn't recognize, it must use the default attribute value.
  • If it encounters an entity reference (other than one of the predefined entities) for which the user agent has processed no declaration (which could happen if the declaration is in the external subset which the user agent hasn't read), the entity reference should be rendered as the characters (starting with the ampersand and ending with the semi-colon) that make up the entity reference.
  • When rendering content, user agents that encounter characters or character entity references that are recognized but not renderable should display the document in such a way that it is obvious to the user that normal rendering has not taken place.
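
Example (informative): The rule for unrecognized elements (continue processing the children, and present any text) can be sketched as a small recursive renderer. This is an illustration under stated assumptions, not a conforming user agent: the HANDLERS table, the "*" presentation for em, and the function names are all hypothetical, and namespaces are ignored. Attributes are never inspected, so unrecognized attributes are ignored as a side effect.

```python
import xml.etree.ElementTree as ET

# Hypothetical presentation rules for the element types this toy
# user agent recognizes. Anything not listed here is "unrecognized".
HANDLERS = {
    "p": lambda body: body,
    "em": lambda body: "*" + body + "*",
}


def render(element):
    """Render an element and all of its descendants as plain text."""
    parts = [element.text or ""]
    for child in element:
        # Children are processed whether or not the parent is recognized.
        parts.append(render(child))
        parts.append(child.tail or "")
    body = "".join(parts)
    handler = HANDLERS.get(element.tag)
    # Unrecognized element: no special presentation, but its text is kept.
    return handler(body) if handler else body


def render_document(source):
    """Parse a well-formed document string and render its root element."""
    return render(ET.fromstring(source))
```

With this sketch, the text inside an unknown element such as blink is still presented, and a recognized em nested inside it is still rendered with its own presentation.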

White space is handled according to the following rules. The following characters are defined in [XML (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#ref_xml)] as white space characters:



  • SPACE (&#x0020;)
  • HORIZONTAL TABULATION (&#x0009;)
  • CARRIAGE RETURN (&#x000D;)
  • LINE FEED (&#x000A;)


The XML processor normalizes the line end codes of different systems into a single LINE FEED character, which is passed up to the application.


The user agent must process white space characters in the data received from the XML processor as follows:



  • All white space surrounding block elements should be removed.
  • Comments are removed entirely and do not affect white space handling. One white space character on either side of a comment is treated as two white space characters.
  • When the 'xml:space' attribute is set to 'preserve', white space characters must be preserved and consequently LINE FEED characters within a block must not be converted.
  • When the 'xml:space' attribute is not set to 'preserve', then:
      • Leading and trailing white space inside a block element must be removed.
      • LINE FEED characters must be converted into one of the following characters: a SPACE character, a ZERO WIDTH SPACE character (&#x200B;), or no character (i.e. removed). The choice of the resulting character is user agent dependent and is conditioned by the script property of the characters preceding and following the LINE FEED character.
      • A sequence of white space characters without any LINE FEED characters must be reduced to a single SPACE character.
      • A sequence of white space characters with one or more LINE FEED characters must be reduced in the same way as a single LINE FEED character.
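
Example (informative): The rules for the case where 'xml:space' is not 'preserve' can be sketched for the text of a single block as follows. This is a simplified illustration: it assumes the block contains only character data, it does not handle comments or white space around block elements, and the function name and the lf_replacement parameter are hypothetical. The replacement for LINE FEED is user agent dependent, so it is passed in by the caller and defaults to SPACE.

```python
import re

# The four characters XML defines as white space.
XML_WHITE_SPACE = " \t\r\n"
_WHITE_SPACE_RUN = re.compile("[ \t\r\n]+")


def normalize_block_text(text, preserve=False, lf_replacement=" "):
    """Apply the white space rules to the character data of one block.

    preserve:       True when xml:space="preserve" is in effect.
    lf_replacement: what a LINE FEED becomes: " ", "\u200b", or "".
    """
    if preserve:
        return text  # white space, including LINE FEED, is kept as-is

    # Leading and trailing white space inside the block is removed.
    text = text.strip(XML_WHITE_SPACE)

    def replace(match):
        # A run containing a LINE FEED is treated like a single LINE FEED;
        # any other run is reduced to a single SPACE.
        return lf_replacement if "\n" in match.group(0) else " "

    return _WHITE_SPACE_RUN.sub(replace, text)
```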


White space in attribute values is processed according to [XML (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#ref_xml)].


Note (informative): In determining how to convert a LINE FEED character, a user agent should consider the following cases, in which the script of the characters on either side of the LINE FEED determines the choice of replacement. Characters of COMMON script (such as punctuation) are treated as belonging to the same script as the character on the other side:



  1. If the characters preceding and following the LINE FEED character belong to a script in which the SPACE character is used as a word separator, the LINE FEED character should be converted into a SPACE character. Examples of such scripts are Latin, Greek, and Cyrillic.
  2. If the characters preceding and following the LINE FEED character belong to an ideographic-based script or writing system in which there is no word separator, the LINE FEED should be converted into no character. Examples of such scripts or writing systems are Chinese, Japanese.
  3. If the characters preceding and following the LINE FEED character belong to a non ideographic-based script in which there is no word separator, the LINE FEED should be converted into a ZERO WIDTH SPACE character (&#x200B;) or no character. Examples of such scripts are Thai, Khmer.
  4. If none of the conditions in (1) through (3) are true, the LINE FEED character should be converted into a SPACE character.


The Unicode [UNICODE (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#ref_unicode)] technical report TR#24 (Script Names) provides an assignment of script names to all characters.
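
Example (informative): The cases in the note above can be sketched as a function of the two characters adjacent to the LINE FEED. This is a rough approximation, not an implementation of the Unicode script property: the code point ranges below are assumptions standing in for real script data, punctuation, symbols, and digits outside those ranges are treated as COMMON, and every other character is assumed to belong to a script that uses SPACE as a word separator. For case 3 the sketch picks ZERO WIDTH SPACE, although removing the character is equally permitted. All names are hypothetical.

```python
import unicodedata

ZERO_WIDTH_SPACE = "\u200b"

# Rough code point ranges standing in for the Unicode script property.
_IDEOGRAPHIC = [(0x3040, 0x30FF), (0x4E00, 0x9FFF)]   # kana, CJK ideographs
_NO_SEPARATOR = [(0x0E00, 0x0E7F), (0x1780, 0x17FF)]  # Thai, Khmer


def _in_ranges(ch, ranges):
    return any(low <= ord(ch) <= high for low, high in ranges)


def _classify(ch):
    if _in_ranges(ch, _IDEOGRAPHIC):
        return "ideographic"
    if _in_ranges(ch, _NO_SEPARATOR):
        return "no-separator"
    if unicodedata.category(ch)[0] in "PSN":
        return "common"  # punctuation, symbols, digits (approximation)
    return "space-separated"


def line_feed_replacement(before, after):
    """Choose the replacement for a LINE FEED between two characters."""
    a, b = _classify(before), _classify(after)
    # COMMON characters take on the script class of the other side.
    if a == "common":
        a = b
    elif b == "common":
        b = a
    if a == b == "ideographic":
        return ""                 # case 2
    if a == b == "no-separator":
        return ZERO_WIDTH_SPACE   # case 3 (removal is also allowed)
    return " "                    # cases 1 and 4
```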


3.6. Naming Rules


XHTML Host Language document types must adhere to strict naming conventions so that it is possible for software and users to readily determine the relationship of document types to XHTML. The names for document types implemented as XML document type definitions are defined through Formal Public Identifiers (FPIs). Within FPIs, fields are separated by double slash character sequences (//). The various fields must be composed as follows:



  • The leading field must be "-" to indicate a privately defined resource.
  • The second field must contain the name of the organization responsible for maintaining the named item. There is no formal registry for these organization names. Each organization should define a name that is unique. The name used by the W3C is, for example, W3C.
  • The third field contains two constructs: the public text class followed by the public text description. The first token in the third field is the public text class which should adhere to ISO 8879 Clause 10.2.2.1 Public Text Class. Only XHTML Host Language conforming documents should begin the public text description with the token XHTML. The public text description should contain the string XHTML if the document type is Integration Set conforming. The field must also contain an organization-defined unique identifier (e.g., MyML 1.0). This identifier should be composed of a unique name and a version identifier that can be updated as the document type evolves.
  • The fourth field defines the language in which the item is developed (e.g., EN).


Using these rules, the name for an XHTML Host Language conforming document type might be -//MyCompany//DTD XHTML MyML 1.0//EN. The name for an XHTML family conforming module might be -//MyCompany//ELEMENTS XHTML MyElements 1.0//EN. The name for an XHTML Integration Set conforming document type might be -//MyCompany//DTD Special Markup with XHTML//EN.
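

Example (informative): The field structure described above can be sketched as a small parser that splits an FPI on the double slash and then inspects the public text description. The FPI tuple, the function names, and the result labels are hypothetical and chosen for illustration; the classification is meant for document type identifiers (public text class DTD) and checks only the position of the XHTML token, not uniqueness or the presence of a version identifier.

```python
from typing import NamedTuple


class FPI(NamedTuple):
    registration: str   # leading field, "-" for a privately defined resource
    owner: str          # organization name, e.g. "W3C"
    text_class: str     # public text class, e.g. "DTD" or "ELEMENTS"
    description: str    # public text description
    language: str       # e.g. "EN"


def parse_fpi(identifier):
    """Split a Formal Public Identifier into its four //-separated fields."""
    fields = identifier.split("//")
    if len(fields) != 4:
        raise ValueError("expected four //-separated fields: %r" % identifier)
    registration, owner, text, language = fields
    if registration != "-":
        raise ValueError('leading field must be "-": %r' % identifier)
    # The third field is the public text class followed by the description.
    text_class, _, description = text.partition(" ")
    return FPI(registration, owner, text_class, description, language)


def xhtml_relationship(fpi):
    """Classify a document type FPI by where XHTML appears in its description."""
    tokens = fpi.description.split()
    if tokens[:1] == ["XHTML"]:
        return "host-language"
    if "XHTML" in tokens:
        return "integration-set"
    return "unrelated"
```

Applied to the sample names above, the first is classified as a host language identifier and the third as an integration set identifier, while the module name parses with the public text class ELEMENTS.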


3.7. XHTML Module Evolution


Each module defined in this specification is given a unique identifier that adheres to the naming rules in the previous section. Over time, a module may evolve. A logical ramification of such evolution may be that some aspects of the module are no longer compatible with its previous definition. To help ensure that document types defined against modules defined in this specification continue to operate, the identifiers associated with a module that changes will be updated. Specifically, the Formal Public Identifier and System Identifier of the module will be changed by modifying the version identifier included in each. Document types that wish to incorporate the updated functionality will need to be similarly updated.


In addition, the earlier version(s) of the module will continue to be available via its earlier, unique identifier(s). In this way, document types developed using XHTML modules will continue to function seamlessly using their original definitions even as the collection expands and evolves. Similarly, document instances written against such document types will continue to validate using the earlier module definitions.


Other XHTML Family Module and Document Type authors are encouraged to adopt a similar strategy to ensure the continued functioning of document types based upon those modules and document instances based upon those document types.


4. Defining Abstract Modules


This section is normative.


An abstract module is a definition of an XHTML module using prose text and some informal markup conventions. While such a definition is not generally useful in the machine processing of document types, it is critical in helping people understand what is contained in a module. This section defines the way in which XHTML abstract modules are defined. An XHTML-conforming module is not required to provide an abstract module definition. However, anyone developing an XHTML module is encouraged to provide an abstraction to ease the use of that module.


4.1. Syntactic Conventions


The abstract modules are not defined in a formal grammar. However, the definitions do adhere to the following syntactic conventions. These conventions are similar to those of XML DTDs, and should be familiar to XML DTD authors. Each discrete syntactic element can be combined with others to make more complex expressions that conform to the algebra defined here.



element name
When an element is included in a content model, its explicit name will be listed.
content set
Some modules define lists of explicit element names called content sets. When a content set is included in a content model, its name will be listed.
expr ?
Zero or one instances of expr are permitted.
expr +
One or more instances of expr are required.
expr *
Zero or more instances of expr are permitted.
a , b
Expression a is required, followed by expression b.
a | b
Either expression a or expression b is required.
a - b
Expression a is permitted, omitting elements in expression b.
parentheses
When an expression is contained within parentheses, evaluation of any subexpressions within the parentheses takes place before evaluation of expressions outside of the parentheses (starting at the deepest level of nesting first).
extending pre-defined elements
In some instances, a module adds attributes to an element. In these instances, the element name is followed by an ampersand (&).
defining required attributes
When an element requires the definition of an attribute, that attribute name is followed by an asterisk (*).
defining the type of attribute values
When a module defines the type of an attribute value, it does so by listing the type in parentheses after the attribute name.
defining the legal values of attributes
When a module defines the legal values for an attribute, it does so by listing the explicit legal values (enclosed in quotation marks), separated by vertical bars (|), inside of parentheses following the attribute name. If the attribute has a default value, that value is followed by an asterisk (*). If the attribute has a fixed value, the attribute name is followed by an equals sign (=) and the fixed value enclosed in quotation marks.
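
Example (informative): Although the conventions above are not a formal grammar, the operators ?, +, *, comma, vertical bar, and parentheses map closely onto regular expressions over sequences of child element names. The following is a minimal sketch of that mapping. It is an illustration only: the function names are hypothetical, the a - b exclusion operator, PCDATA, and the attribute conventions are not handled, and the sample content models used with it are illustrative rather than quoted from any module.

```python
import re

# An element (or content set) name, or one of the operator characters.
_TOKEN = re.compile(r"\s*([A-Za-z_][\w.-]*|[()?+*,|])")


def content_model_to_regex(model):
    """Translate a content model expression into a compiled regex.

    Each child element name is matched as the name followed by a space.
    """
    model = model.strip()
    parts = []
    pos = 0
    while pos < len(model):
        match = _TOKEN.match(model, pos)
        if match is None:
            raise ValueError("unsupported syntax at offset %d" % pos)
        token = match.group(1)
        pos = match.end()
        if token == "(":
            parts.append("(?:")
        elif token == ",":
            continue  # "a , b" is plain concatenation in a regex
        elif token in ")?+*|":
            parts.append(token)
        else:
            parts.append("(?:" + re.escape(token) + " )")
    return re.compile("".join(parts))


def children_match(model, children):
    """Return True if the list of child element names satisfies the model."""
    sequence = "".join(name + " " for name in children)
    return content_model_to_regex(model).fullmatch(sequence) is not None
```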

4.2. Content Types


Abstract module definitions define minimal, atomic content models for each module. These minimal content models reference the elements in the module itself. They may also reference elements in other modules upon which the abstract module depends. Finally, the content model in many cases requires that text be permitted as content to one or more elements. In these cases, the symbol used for text is PCDATA. This is a term, defined in the XML 1.0 Recommendation, that refers to processed character data. A content type can also be defined as EMPTY, meaning the element has no content in its minimal content model.


4.3. Attribute Types


In some instances, it is necessary to define the types of attribute values or the explicit set of permitted values for attributes. The following attribute types (defined in the XML 1.0 Recommendation) are used in the definitions of the abstract modules:





Attribute Type
Definition


CDATA
Character data

ID
A document-unique identifier

IDREF
A reference to a document-unique identifier

IDREFS
A space-separated list of references to document-unique identifiers

NAME
A name with the same character constraints as ID above

NMTOKEN
A name composed of only name tokens as defined in XML 1.0 [XML (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#ref_xml)]

NMTOKENS
One or more white space separated NMTOKEN values

PCDATA
Processed character data


In addition to these pre-defined data types, XHTML Modularization defines the following data types and their semantics (as appropriate):





Data type
Description

Character
A single character from [ISO10646 (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#ref_ISO10646)].

Charset
A character encoding, as per [RFC2045 (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#ref_RFC2045)].

Charsets
A space-separated list of character encodings, as per [RFC2045 (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#ref_RFC2045)].

Color

The attribute value type "Color" refers to color definitions as specified in [SRGB]. A color value may either be a hexadecimal number (prefixed by a hash mark) or one of the following sixteen color names. The color names are case-insensitive.



Color names and sRGB values

Black = "#000000"      Green = "#008000"
Silver = "#C0C0C0"     Lime = "#00FF00"
Gray = "#808080"       Olive = "#808000"
White = "#FFFFFF"      Yellow = "#FFFF00"
Maroon = "#800000"     Navy = "#000080"
Red = "#FF0000"        Blue = "#0000FF"
Purple = "#800080"     Teal = "#008080"
Fuchsia = "#FF00FF"    Aqua = "#00FFFF"


Thus, the color values "#800080" and "Purple" both refer to the color purple.
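
Example (informative): The equivalence of a color name and its hexadecimal value can be sketched as a small lookup. The sixteen names and values are those listed above; the function name is hypothetical, and the assumption that a hexadecimal value has exactly six digits after the hash mark is made for this sketch.

```python
import re

# The sixteen color names and their sRGB values, keyed in lower case
# because color names are case-insensitive.
COLOR_NAMES = {
    "black": "#000000", "green": "#008000",
    "silver": "#C0C0C0", "lime": "#00FF00",
    "gray": "#808080", "olive": "#808000",
    "white": "#FFFFFF", "yellow": "#FFFF00",
    "maroon": "#800000", "navy": "#000080",
    "red": "#FF0000", "blue": "#0000FF",
    "purple": "#800080", "teal": "#008080",
    "fuchsia": "#FF00FF", "aqua": "#00FFFF",
}

# Assumption: a hash mark followed by exactly six hexadecimal digits.
_HEX_COLOR = re.compile(r"#([0-9A-Fa-f]{6})\Z")


def parse_color(value):
    """Return the (red, green, blue) components of a Color value."""
    value = COLOR_NAMES.get(value.lower(), value)
    match = _HEX_COLOR.match(value)
    if match is None:
        raise ValueError("not a recognized Color value: %r" % value)
    digits = match.group(1)
    return tuple(int(digits[i:i + 2], 16) for i in (0, 2, 4))
```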



ContentType
A media type, as per [RFC2045 (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#ref_RFC2045)].

ContentTypes
A comma-separated list of media types, as per [RFC2045 (http://www.w3.org/TR/xhtml-modularization/xhtml-modularization.html#ref_RFC2045)].

Coords
A comma-separated list of coordinates to use in defining areas.

Datetime
Date and time information.

FPI
A character string representing an SGML Formal Public Identifier.

FrameTarget
Frame name used as destination for results of certain actions.

LanguageCode
A language code, as per