
- Number of slides: 78
XML Introduction with a DRM Focus
by Ken Sall, SAIC, for the FEA DRM (Data Reference Model) Team
June 27, 2005
http://kensall.com/gov/xml-intro.ppt
Agenda
• What is XML? (The Big Picture)
• Presentation vs. Structure
• Anatomy of XML
• Well-Formed vs. Valid XML
• Namespaces
• Structure and Datatypes: XML Schema
• Understanding the DRM XML Schema
• Metadata: Dublin Core, DDMS
• Searching: XQuery
• [Transforming XML with XSLT]
6/27/2005 XML Introduction with DRM Focus: by Ken Sall
What is XML?
The Big Picture / Structure vs. Presentation
XML Is…
• Extensible Markup Language (XML) is a set of syntax rules used to define special-purpose languages that meet the diverse needs of business, science, and government, as well as the publishing world.
• The term "XML" also refers more broadly to the set of related technical standards created with XML syntax, as well as to custom (non-standard) languages based upon these standard specifications.
• Collectively, the family of XML specifications enables the interchange of data and structured text across dissimilar computer systems, especially when the Internet is the transmission medium.
Characteristics of XML
• Extensible web standard:
  • W3C "Recommendation" (2/1998)
  • Standard for 7+ yrs; now "3rd Edition" (2/2004)
  • http://www.w3.org/TR/REC-xml
• Human-readable data format: visually similar to HTML, but closer to Standard Generalized Markup Language (SGML)
• Machine understandable; easily parsable by freeware
• Structured (hierarchical) data, any degree of complexity; database-neutral format
Characteristics of XML [cont.]
XML is an enabling syntax upon which many more interesting and powerful things are being built.
• Verifiable: can be validated for correctness / integrity with little program-specific code
• Extensible: meta language used to define other domain- or industry-specific languages by supplying a specific Document Type Definition (DTD; like BNF for computer languages) or XML Schema
• Media independent: describes data, not visual presentation
• Interoperable: device, platform, and vendor independent
The Big Picture: XML Family of Specifications
Presentation vs. Structure
HTML: Only reflects how text should be presented. It is not obvious to a computer that "Public" is a last name, or even that we're referring to an employee.
    <B>John Q Public</B><BR>
    <I>john.q.public@ex.com</I>
Displayed as: John Q. Public (bold), john.q.public@ex.com (italic)
XML: Only indicates how information is structured. The relationship of LAST name to EMPLOYEE is obvious, but how we display this info is not specified, so different applications can extract or display pieces as per their own requirements.
    <EMPLOYEE>
      <NAME>
        <FIRST>John</FIRST>
        <MIDDLE>Q</MIDDLE>
        <LAST>Public</LAST>
      </NAME>
      <EMAIL>john.q.public@ex.com</EMAIL>
    </EMPLOYEE>
Question: What is EMPLOYEE/NAME/LAST?
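To make the "structure, not presentation" point concrete, here is a minimal sketch using Python's standard `xml.etree.ElementTree` module. It answers the slide's question (what is EMPLOYEE/NAME/LAST?) by path, and shows that presentation is left to the application. The `display_name` helper and its "Last, First M." layout are illustrative assumptions, not part of the original deck.

```python
import xml.etree.ElementTree as ET

EMPLOYEE_XML = """<EMPLOYEE>
  <NAME>
    <FIRST>John</FIRST>
    <MIDDLE>Q</MIDDLE>
    <LAST>Public</LAST>
  </NAME>
  <EMAIL>john.q.public@ex.com</EMAIL>
</EMPLOYEE>"""

employee = ET.fromstring(EMPLOYEE_XML)

# Paths are relative to the root element (EMPLOYEE), so NAME/LAST
# corresponds to EMPLOYEE/NAME/LAST on the slide.
last_name = employee.findtext("NAME/LAST")
email = employee.findtext("EMAIL")


def display_name(emp):
    """One possible presentation chosen by this application: 'Last, First M.'"""
    name = emp.find("NAME")
    return "{}, {} {}.".format(
        name.findtext("LAST"), name.findtext("FIRST"), name.findtext("MIDDLE")
    )


print(last_name)
print(display_name(employee))
```

A different application could read the same document and render only the e-mail address, or sort a staff list by LAST, without any change to the XML.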
How HTML and XML Differ
HTML | XML
for display | for data structure
some optional end tags | can't omit end tags
no knowledge of data | presentation independent
known, closed tag set defined by W3C | open language; define your own elements for all browsers
shallow learning curve | steeper learning curve (tools help)
case insensitive | case sensitive!!
empty tag like <IMG [attrs]> requires nothing special | empty tag syntax: <image [attrs] />
white space ignored | white space significant within content; returned to app by default
mostly ISO Latin 1 | must use Unicode [cf: Java]
Look, Ma! No browser!
Anatomy of XML
What's Inside and Between the Angle Brackets?
Anatomy of XML, 1
    <?xml version="1.0" encoding="UTF-8"?>
    <!DOCTYPE ARCXML SYSTEM "arcxml40.dtd">
    <!-- incomplete example -->
    <ARCXML version="1.1">
      <CONFIG>
        <ENVIRONMENT>
          <LOCALE country="US" language="en" />
        </ENVIRONMENT>
        <MAP>
          <LAYER type="image" name="Scanned Quads" minscale="1:150000">
            <DATASET name="*ImageDirectory" workspace="jai_ws-46" />
          </LAYER>
        </MAP>
        <EXTRA>This element is not part of the DTD.</EXTRA>
      </CONFIG>
    </ARCXML>
Callouts:
• <?xml ... ?>: XML Declaration (optional)
• <!DOCTYPE ... >: DocType declaration
• <!-- ... -->: Comment (optional)
• <ARCXML>: Document (Root) Element
• <MAP> is an element's start tag; </MAP> is an element's end tag
• What is the content of the <CONFIG> element?
• Content of <EXTRA>: "This element is not part of the DTD."
• Elements may contain other elements or character data (text). They may occur 0 or more times, may be in sequence, or as a choice.
Anatomy of XML, 1-a
• DTD reference replaced with an XML Schema reference as an attribute of the root element. (The xsi prefix must be declared for the attribute to be usable by a namespace-aware parser.)
    <?xml version="1.0" encoding="UTF-8"?>
    <ARCXML version="1.1"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:noNamespaceSchemaLocation="http://doi.gov/ARC arcxml40.xsd">
      <CONFIG>
        <ENVIRONMENT>
          <LOCALE country="US" language="en" />
        </ENVIRONMENT>
        <MAP>
          <LAYER type="image" name="Scanned Quads" minscale="1:150000">
            <DATASET name="*ImageDirectory" workspace="jai_ws-46" />
          </LAYER>
        </MAP>
        <EXTRA>This element is not part of the DTD.</EXTRA>
      </CONFIG>
    </ARCXML>
Anatomy 2 – Attributes
• Attributes are details that always appear within the start tag of an element.
• Each element may have 0 or more attributes, but an attribute name cannot repeat within a given element.
• Each attribute has a name and a quoted value.
• The <LAYER> element has 3 attributes:
  • Name "type" with the value "image"
  • Name "name" with the value "Scanned Quads"
  • Name "minscale" with the value "1:150000"
• Attribute order is undefined.
• Attributes may be required or optional.
• They may have default values or fixed values.
    <?xml version="1.0" encoding="UTF-8"?>
    <!DOCTYPE ARCXML SYSTEM "arcxml40.dtd">
    <ARCXML version="1.1">
      <CONFIG>
        <ENVIRONMENT>
          <LOCALE country="US" language="en" />
          <UIFONT color="0,0,0" name="Arial" size="12" />
        </ENVIRONMENT>
        <MAP>
          <LAYER type="image" name="Scanned Quads" minscale="1:150000">
            <DATASET name="*ImageDirectory" workspace="jai_ws-46" />
            <IMAGEPROPERTIES transparency="1.0" transcolor="255,255" />
          </LAYER>
        </MAP>
      </CONFIG>
    </ARCXML>
Anatomy 3 – Empty Elements
Elements may:
• contain child elements (nested)
• contain only text content
• be "empty" (no content, but may optionally have attributes)
    <?xml version="1.0" encoding="UTF-8"?>
    <!DOCTYPE ARCXML SYSTEM "arcxml40.dtd">
    <ARCXML version="1.1">
      <CONFIG>
        <ENVIRONMENT>
          <LOCALE country="US" language="en"></LOCALE>
          <UIFONT color="0,0,0" name="Arial" size="12" />
        </ENVIRONMENT>
        <MAP>
          <LAYER type="image" name="Scanned Quads" minscale="1:150000">
            <DATASET name="*ImageDirectory" workspace="jai_ws-46" />
            <IMAGEPROPERTIES transparency="1.0" transcolor="255,255" />
          </LAYER>
        </MAP>
        <EXTRA>This element is not part of the DTD.</EXTRA>
      </CONFIG>
    </ARCXML>
• ARCXML, CONFIG, ENVIRONMENT, MAP, and LAYER all contain nested child elements.
• LOCALE, UIFONT, DATASET, and IMAGEPROPERTIES are empty. They convey meaning simply by their presence or by their attributes; no content.
• EXTRA is the only element with text content.
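The three kinds of element (nested children, text only, empty) and the attribute list of <LAYER> can be checked mechanically. The following is a small sketch with Python's standard `xml.etree.ElementTree`; the `classify` function is a hypothetical helper written for this illustration, and the document is a shortened copy of the slide's example without the DOCTYPE line.

```python
import xml.etree.ElementTree as ET

ARCXML = """<ARCXML version="1.1">
  <CONFIG>
    <ENVIRONMENT>
      <LOCALE country="US" language="en"></LOCALE>
      <UIFONT color="0,0,0" name="Arial" size="12" />
    </ENVIRONMENT>
    <MAP>
      <LAYER type="image" name="Scanned Quads" minscale="1:150000">
        <DATASET name="*ImageDirectory" workspace="jai_ws-46" />
      </LAYER>
    </MAP>
    <EXTRA>This element is not part of the DTD.</EXTRA>
  </CONFIG>
</ARCXML>"""


def classify(elem):
    """Return 'children', 'text', or 'empty' for one element."""
    if len(elem) > 0:                      # has at least one child element
        return "children"
    if elem.text and elem.text.strip():    # has non-whitespace character data
        return "text"
    return "empty"                         # <X/> and <X></X> are equivalent


root = ET.fromstring(ARCXML)
kinds = {elem.tag: classify(elem) for elem in root.iter()}

# Attributes arrive as a name -> value mapping; their order carries no meaning.
layer_attrs = dict(root.find("CONFIG/MAP/LAYER").attrib)

for tag, kind in kinds.items():
    print(tag, kind)
```

Note that `<LOCALE ...></LOCALE>` and `<UIFONT ... />` are both reported as empty, matching the slide's point that the two spellings are interchangeable.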
Anatomy: Schema and Namespace References
Anatomy Quiz #1
Well-Formed vs. Valid XML
A Crucial Distinction
Well-Formed Document
• Document adheres to basic XML syntax rules
  • Each start-tag has a matching end-tag
  • All attribute values are quoted
  • No attribute may appear more than once on the same start-tag
  • Etc.
• Well-formedness is the minimum requirement in XML.
• By definition, if a document isn't well-formed, it's not XML because it is technically not parsable.
• Browsers should not display a document if it is not well-formed. (Major difference from HTML.)
• No implication about validity!
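The rules above can be tried out with any conforming parser. As a minimal sketch, Python's standard `xml.etree.ElementTree` raises `ParseError` whenever the input is not well-formed; the `is_well_formed` wrapper and the tiny sample documents are assumptions made for this illustration, not taken from the DRM examples.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """True if the parser accepts the text, False if it raises ParseError."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


good = '<a id="x1"><b/></a>'
unquoted = '<a id=x1><b/></a>'            # attribute value not quoted
missing_end = '<a id="x1"><b></a>'        # <b> has no matching end-tag
duplicate_attr = '<a id="x1" id="x2"/>'   # same attribute twice on one start-tag

for label, doc in [("good", good), ("unquoted", unquoted),
                   ("missing_end", missing_end), ("duplicate_attr", duplicate_attr)]:
    print(label, is_well_formed(doc))
```

None of these checks consult a DTD or schema, which is the sense in which well-formedness carries no implication about validity.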
Well-Formed Example, Part 1
• Note the lack of a reference to an XML Schema or DTD, since none is needed for well-formedness.
    <?xml version="1.0" encoding="utf-8"?>
    <drm:DataReferenceModel
        xmlns:drm="http://egov.gov/fea"
        xmlns:dc="http://purl.org/dc/elements/1.1/"
        xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
      <!-- drm:SubmissionMetadata, DataSharing and DataContext deleted -->
      <drm:DataDescription>
        <drm:Data>
          <drm:StructuredData>
            <drm:ExternalEntitiesRefs>
              <drm:ExternalEntitiesRef rdf:id="OMB300"
                  drm:href="http://www.whitehouse.gov/omb/egov/documents/OMB300v2.95.xsd"
                  drm:representationFormat="XSD" />
            </drm:ExternalEntitiesRefs>
            <drm:Entities>
              <drm:Entity rdf:id="BusinessArea" drm:name="FEA BRM Business Area">
                <drm:Attributes>
                  <drm:Attribute drm:keyType="primary key" drm:name="businessAreaID" drm:datatype="xsd:integer"/>
                  <drm:Attribute drm:name="BusinessAreaName" drm:datatype="xsd:string"/>
                  <drm:Attribute drm:name="BusinessAreaDefinitionText" drm:datatype="xsd:string"/>
                </drm:Attributes>
                <drm:DataSourceRefs>
                  <drm:DataSourceRef drm:authoritativeSource="false" rdf:idref="data_asset01"/>
(continued in Part 2)
Well-Formed Example, Part 2
                </drm:DataSourceRefs>
                <drm:NodeRefs>
                  <drm:NodeRef drm:type="partOf" rdf:idref="node06"/>
                </drm:NodeRefs>
                <drm:Relationships>
                  <drm:Relationship drm:cardinality="1" drm:key="businessAreaID"
                      drm:name="Business Area to Business Line Association">
                    <drm:RelationshipTarget drm:key="refbusinessAreaID" drm:cardinality="unbounded"
                        rdf:idref="BusinessLine"></drm:RelationshipTarget>
                  </drm:Relationship>
                </drm:Relationships>
                <drm:ResourceRefs>
                  <drm:ResourceRef drm:type="partOf" rdf:idref="resource04" />
                </drm:ResourceRefs>
              </drm:Entity>
            </drm:Entities>
          </drm:StructuredData>
        </drm:Data>
        <drm:DataSources />
      </drm:DataDescription>
    </drm:DataReferenceModel>
Not Well-Formed, 1
• The error: the rdf:id value on drm:ExternalEntitiesRef is not quoted.
    <?xml version="1.0" encoding="utf-8"?>
    <drm:DataReferenceModel
        xmlns:drm="http://egov.gov/fea"
        xmlns:dc="http://purl.org/dc/elements/1.1/"
        xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
      <!-- drm:SubmissionMetadata, DataSharing and DataContext deleted -->
      <drm:DataDescription>
        <drm:Data>
          <drm:StructuredData>
            <drm:ExternalEntitiesRefs>
              <drm:ExternalEntitiesRef rdf:id=OMB300
                  drm:href="http://www.whitehouse.gov/omb/egov/documents/OMB300v2.95.xsd"
                  drm:representationFormat="XSD" />
            </drm:ExternalEntitiesRefs>
            <drm:Entities>
              <drm:Entity rdf:id="BusinessArea" drm:name="FEA BRM Business Area">
                <drm:Attributes>
                  <drm:Attribute drm:keyType="primary key" drm:name="businessAreaID" drm:datatype="xsd:integer"/>
                  <drm:Attribute drm:name="BusinessAreaName" drm:datatype="xsd:string"/>
                  <drm:Attribute drm:name="BusinessAreaDefinitionText" drm:datatype="xsd:string"/>
                </drm:Attributes>
                <drm:DataSourceRefs>
                  <!-- etc from previous example -->
Not Well-Formed, 1a
Not Well-Formed, 2
• The error: the </drm:Entity> end tag has been deleted, so <drm:Entity> is never closed.
    <drm:Entities>
      <drm:Entity rdf:id="BusinessArea" drm:name="FEA BRM Business Area">
        <drm:Attributes>
          <drm:Attribute drm:keyType="primary key" drm:name="businessAreaID" drm:datatype="xsd:integer"/>
          <drm:Attribute drm:name="BusinessAreaName" drm:datatype="xsd:string"/>
          <drm:Attribute drm:name="BusinessAreaDefinitionText" drm:datatype="xsd:string"/>
        </drm:Attributes>
        <drm:DataSourceRefs>
          <drm:DataSourceRef drm:authoritativeSource="false" rdf:idref="data_asset01"/>
        </drm:DataSourceRefs>
        <drm:NodeRefs>
          <drm:NodeRef drm:type="partOf" rdf:idref="node06"/>
        </drm:NodeRefs>
        <drm:Relationships>
          <drm:Relationship drm:cardinality="1" drm:key="businessAreaID"
              drm:name="Business Area to Business Line Association">
            <drm:RelationshipTarget drm:key="refbusinessAreaID" drm:cardinality="unbounded"
                rdf:idref="BusinessLine"></drm:RelationshipTarget>
          </drm:Relationship>
        </drm:Relationships>
        <drm:ResourceRefs>
          <drm:ResourceRef drm:type="partOf" rdf:idref="resource04" />
        </drm:ResourceRefs>
      <!-- DELETED drm:Entity -->
    </drm:Entities>
Not Well-Formed, 2a
Valid XML
• For an XML instance (document or message) to be valid, it must:
  • Be well-formed (necessary but not sufficient condition)
  • Reference either a DTD or XML Schema
  • Adhere perfectly to the model that the DTD or XML Schema represents
    • Elements appear in the correct order and nesting
    • All required elements and attributes must appear
    • Values are of the correct datatype, etc.
• Validity checking
  • Is not necessarily reported by a browser!
  • Takes time (read and construct model, read and apply to instance; may cache model)
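The difference between the two levels can be sketched in a few lines. Python's standard library has no DTD or XML Schema validator, so the check below is a hand-written stand-in for a real validator, not a use of one: the element names `Record`, `Steward`, and `Date`, the `REQUIRED` tuple, and the `check` function are all hypothetical and exist only to illustrate "well-formed but invalid".

```python
import xml.etree.ElementTree as ET
from datetime import date

REQUIRED = ("Steward", "Date")  # hypothetical content model: both children required


def check(text):
    """Return a list of problems; an empty list means the document passed."""
    try:
        root = ET.fromstring(text)          # level 1: well-formedness
    except ET.ParseError:
        return ["not well-formed"]
    problems = []                           # level 2: validity against the model
    for name in REQUIRED:
        if root.find(name) is None:
            problems.append("missing <{}>".format(name))
    date_text = root.findtext("Date")
    if date_text is not None:
        try:
            date.fromisoformat(date_text.strip())   # stands in for xsd:date
        except ValueError:
            problems.append("Date is not in YYYY-MM-DD form")
    return problems


valid = "<Record><Steward>Agency X</Steward><Date>2005-06-27</Date></Record>"
bad_date = "<Record><Steward>Agency X</Steward><Date>06/27/2005</Date></Record>"
missing = "<Record><Date>2005-06-27</Date></Record>"
broken = "<Record><Steward>Agency X</Record>"

for doc in (valid, bad_date, missing, broken):
    print(check(doc))
```

In practice the second level would be delegated to a schema-aware validator driven by the DTD or XSD, so the model lives in one place rather than in program code.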
Valid Instance
Invalid Datatype but Well-Formed: Date is in the wrong format.
Invalid Structure but Well-Formed: Missing <drm:Steward>
Browsers Don't Check Validity: No error message although an element is missing!
Namespaces
An element by any other name would smell just as sweet.
What are Namespaces?
• Way to distinguish between otherwise duplicate element and attribute names (e.g., when schemas are combined).
• Dublin Core uses very common element names (Type, Description, Title, Source, Date)
• Prefix disambiguates XML names
  • dc:Date
  • xsd:date
  • groc:date
  • match:date
• Note: Namespace URIs are not guaranteed to point to schemas, files, or anything else; they're just identifiers.
More on Namespaces
• A namespace is identified by a unique name, which is a URI (Uniform Resource Identifier):
  • URL – Locator (http://www.foo.com/bar)
  • URN – Name (urn:com:books-r-us)
• Any element type or attribute name in an XML namespace can be uniquely identified by a two-part name: the name of its XML namespace and its local name.
• Fully Qualified Name (FQN) = drm:dataType
  • Namespace prefix "drm"
  • Colon separator
  • Local part of element name "dataType"
Anatomy: Namespace References
• XML instance (of the DRM XML Schema) with 4 namespaces declared, with prefixes "drm", "dc", "rdf", and "xsi".
• Each prefix is shorthand for the associated namespace URI. The URI is the actual namespace.
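That the URI, not the prefix, is the real namespace can be observed directly in a namespace-aware parser. In the sketch below, Python's standard `xml.etree.ElementTree` expands each prefixed name to `{namespace-URI}local-name`; the `dublin` and `x` prefixes and the `urn:example:other` URI are made-up values for the demonstration.

```python
import xml.etree.ElementTree as ET

DC = "http://purl.org/dc/elements/1.1/"

# Same namespace URI bound to two different prefixes.
doc_a = ET.fromstring('<dc:Date xmlns:dc="{}">2005-06-27</dc:Date>'.format(DC))
doc_b = ET.fromstring('<dublin:Date xmlns:dublin="{}">2005-06-27</dublin:Date>'.format(DC))

# Same local name, different (hypothetical) namespace URI.
doc_c = ET.fromstring('<x:Date xmlns:x="urn:example:other">2005-06-27</x:Date>')

print(doc_a.tag)               # the expanded two-part name
print(doc_a.tag == doc_b.tag)  # prefixes differ, names are identical
print(doc_a.tag == doc_c.tag)  # local names match, namespaces do not
```

The parser never dereferences the URI; it is compared as a plain string, consistent with the note that namespace URIs are just identifiers.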
Structure and Datatypes: XML Schema
Modeling XML Documents
What is XML Schema?
• Abbreviated XSD or WXS (W3C XML Schema).
• W3C REC in May 2001 (3.5 yrs in development)
• XML Schema is a metalanguage (written in XML) for defining other XML languages.
• A specific XML Schema is a model that describes the entire class of possible documents (all variations) that are valid instances of that schema.
• Instances are validated against that XML Schema in terms of their:
  • element structure (order and hierarchy)
  • datatypes
• Our DRM XML Schema is an XML Schema that defines what is valid for DRM instances.
What XML Schema (XSD, WXS) Adds Compared to DTDs, 1
• Datatypes
  • 44 built-in types: primitive vs. derived (e.g., xsd:float vs. xsd:positiveInteger)
  • User defined: simple vs. complex (e.g., my:string vs. ubl:Address, drm:DataDescription)
• Constraints: datatype-dependent "facets"
  • Numeric:
    • Ranges: minInclusive, maxInclusive, minExclusive, maxExclusive
    • totalDigits, fractionDigits
  • Strings:
    • length, minLength, maxLength, pattern, enumeration, whiteSpace
What XML Schema Adds, 2
• Object-oriented subclassing (derivation, reuse)
  • Inheritance:
    • extension (adding) or
    • restriction (removing or constraining)
  • Can prohibit changes with the final attribute
• Grouping of elements or attributes
  • related functionality named for re-use
Major XSD Components, 1
• Complete list of XML Schema elements
• xsd:element – element in XSD used to define elements in your target schema (e.g., DRM XSD)
• xsd:attribute – element in XSD used to define attributes in your target schema
• Compositors
  • xsd:choice – one of N elements
  • xsd:sequence – all of N elements, in given order
  • xsd:all – all of N elements, in any order, no repeats
Major XSD Components, 2
• Referencing external files
  • xsd:include – target namespace of the included components must be the same as the target namespace of the including schema
  • xsd:import – enables schema components from different target namespaces to be used together
Major XSD Components, 3
• xsd:simpleType
  • Any of 44 built-in types (e.g., xsd:integer, xsd:string, xsd:anyURI, xsd:base64Binary)
  • Can define your own simpleTypes by restriction: the set of legal values can be restricted from a "base" simpleType (see Facets slide)
  • Examples from our DRM:
    • drm:AvailabilityLevelType
    • drm:AccessControlProtocolType
    • drm:EncryptionType
    • drm:TransactionType
Major XSD Components, 4
• xsd:complexType
  • Used to construct your own types composed of
    • multiple elements
    • attributes (optionally)
    • text content (optionally, with restricted values optionally)
  • Examples of xsd:complexType
    • drm:DataReferenceModelType
    • drm:DataDescriptionType
    • drm:DataExchangeType
    • drm:DataSharingType
    • drm:QueryPointType (was: AccessPointType)
    • drm:ResourceType
    • drm:EntityType
XSD Built-in Type Hierarchy
This slide is taken from the imagemap at http://www.w3.org/TR/xmlschema-2/#built-in-datatypes
Copyright © 2004 W3C® (MIT, ERCIM, Keio)
Understanding the DRM XML Schema
Anatomy: Schema References
• XML instance points to the drm:DataReferenceModel XML Schema for validation.
• Namespace declarations (as before)
• Schema namespace URI
• Hint as to the location of an XML Schema that defines the elements and types from this namespace.
Top Level DRM complexType (Schema), 1
    <xsd:complexType name="DataReferenceModelType">
      <xsd:all>
        <xsd:element ref="drm:SubmissionMetadata"/>
        <xsd:element ref="drm:DataDescription"/>
        <xsd:element ref="drm:DataSharing" minOccurs="0"/>
        <xsd:element ref="drm:DataContext"/>
      </xsd:all>
      <xsd:attribute ref="xml:base" use="optional"/>
    </xsd:complexType>
• Defines the structure of the entire DRM: 3 main data sections plus 1 section of metadata, with sections permitted in any order.
• DataSharing is optional since minOccurs="0". The other 3 sections are required, exactly one of each.
• The actual definitions of the 4 sections appear elsewhere; they are referred to ("ref") from here.
• An optional attribute of xml:base may appear within the drm:DataReferenceModel start tag.
Top Level DRM complexType, 2
    <xsd:element name="DataReferenceModel" type="drm:DataReferenceModelType">
      <xsd:annotation>
        <xsd:documentation>
          Root node of the DRM XML instance document.
        </xsd:documentation>
      </xsd:annotation>
    </xsd:element>
• Earlier in the schema, we have a global element declaration (above) that uses the complexType defined in the previous slide.
• A global element declaration cannot carry minOccurs or maxOccurs. A DRM instance contains exactly one of these because it is the document (root) element.
• <xsd:annotation> and <xsd:documentation> are not relevant to the instance (except as descriptive information).
How Does This Map to an Instance?
    <?xml version="1.0" encoding="utf-8" ?>
    <drm:DataReferenceModel xmlns:drm="http://egov.gov/fea" ….
        xsi:schemaLocation="http://egov.gov/fea Draft_FEA_DRM_XML_Schema.xsd">
      <drm:SubmissionMetadata> … </drm:SubmissionMetadata>
      <drm:DataDescription> … </drm:DataDescription>
      <drm:DataSharing> … </drm:DataSharing>
      <drm:DataContext> … </drm:DataContext>
    </drm:DataReferenceModel>
Another Possible Instance
    <?xml version="1.0" encoding="utf-8" ?>
    <drm:DataReferenceModel xmlns:drm="http://egov.gov/fea" ….
        xsi:schemaLocation="http://egov.gov/fea Draft_FEA_DRM_XML_Schema.xsd">
      <drm:DataDescription> … </drm:DataDescription>
      <drm:DataContext> … </drm:DataContext>
      <drm:SubmissionMetadata> … </drm:SubmissionMetadata>
    </drm:DataReferenceModel>
• This instance omits the optional DataSharing section and places SubmissionMetadata last. It is still valid according to the schema.
• Remember: The DRM Schema is a model that describes all possible variations of DRM instances (submissions).
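The rule that xsd:all imposes on the four DRM sections (any order, no repeats, required members present) can be restated as a small predicate. This is only a sketch of the rule, not a schema validator: the `ALL_GROUP` table and `satisfies_all` function are hypothetical names, and child elements are represented by their local names with namespaces ignored.

```python
# Local element name -> required?  Mirrors the xsd:all group in
# DataReferenceModelType; DataSharing has minOccurs="0".
ALL_GROUP = {
    "SubmissionMetadata": True,
    "DataDescription": True,
    "DataSharing": False,
    "DataContext": True,
}


def satisfies_all(children, group=ALL_GROUP):
    """children: local names of the child elements, in document order."""
    if len(set(children)) != len(children):          # xsd:all forbids repeats
        return False
    if any(name not in group for name in children):  # unknown child element
        return False
    # Every required member must be present; order is irrelevant.
    return all(name in children for name, required in group.items() if required)


first_instance = ["SubmissionMetadata", "DataDescription", "DataSharing", "DataContext"]
second_instance = ["DataDescription", "DataContext", "SubmissionMetadata"]
no_context = ["SubmissionMetadata", "DataDescription"]
repeated = ["SubmissionMetadata", "DataDescription", "DataContext", "DataContext"]

for children in (first_instance, second_instance, no_context, repeated):
    print(children, satisfies_all(children))
```

Both instances shown on the slides pass, which is the sense in which one schema describes every permitted variation of a submission.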
xsd:complexType with xsd:all
    <xsd:complexType name="ExchangePackageType">
      <xsd:all>
        <!-- All 8 children but in any order. -->
        <xsd:element ref="drm:Sender"/>
        <xsd:element ref="drm:Recipient"/>
        <xsd:element ref="drm:ExchangeFrequency" minOccurs="0"/>
        <xsd:element ref="drm:Classification"/>
        <xsd:element ref="drm:TransactionType"/>
        <xsd:element ref="drm:Status" minOccurs="0"/>
        <xsd:element ref="drm:EntityRefs"/>
        <xsd:element ref="drm:NodeRefs" minOccurs="0"/>
      </xsd:all>
      <!-- with rdf:id attribute attached to ExchangePackage element. -->
      <xsd:attribute ref="rdf:id" use="required"/>
    </xsd:complexType>
• What does minOccurs tell you here?
xsd:complexType with xsd:sequence
    <xsd:complexType name="SubmissionMetadataType">
      <xsd:sequence>
        <!-- All 8 children in this order. -->
        <xsd:element ref="drm:SubmittingAgency"/>
        <xsd:element ref="drm:SubmissionDate"/>
        <xsd:element ref="drm:SubmissionVersion" minOccurs="0"/>
        <xsd:element ref="dc:Title" minOccurs="0"/>
        <xsd:element ref="dc:Identifier" minOccurs="0"/>
        <xsd:element ref="dc:Description" minOccurs="0"/>
        <xsd:element ref="dc:Subject" minOccurs="0"/>
        <xsd:element ref="drm:PointOfContact" minOccurs="0" maxOccurs="unbounded"/>
      </xsd:sequence>
    </xsd:complexType>
• The 8 children must appear in this order; those with minOccurs="0" are optional.
• "unbounded" means PointOfContact may repeat any number of times.
• Note the 2 different namespace prefixes for the children.
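How a sequence with minOccurs and maxOccurs constrains an instance can be sketched as a single left-to-right pass. The code below is an illustration of the rule only, not an XML Schema processor: `SEQUENCE` and `satisfies_sequence` are hypothetical names, namespace prefixes are dropped, and the simple greedy match relies on every particle in this particular sequence having a distinct name.

```python
import math

# (local name, minOccurs, maxOccurs) in declaration order, mirroring
# SubmissionMetadataType; math.inf stands for maxOccurs="unbounded".
SEQUENCE = [
    ("SubmittingAgency", 1, 1),
    ("SubmissionDate", 1, 1),
    ("SubmissionVersion", 0, 1),
    ("Title", 0, 1),
    ("Identifier", 0, 1),
    ("Description", 0, 1),
    ("Subject", 0, 1),
    ("PointOfContact", 0, math.inf),
]


def satisfies_sequence(children, particles=SEQUENCE):
    """children: local names of the child elements, in document order."""
    pos = 0
    for name, min_occurs, max_occurs in particles:
        count = 0
        # Consume as many consecutive matches as this particle allows.
        while pos < len(children) and children[pos] == name and count < max_occurs:
            pos += 1
            count += 1
        if count < min_occurs:
            return False
    return pos == len(children)   # nothing may be left over


minimal = ["SubmittingAgency", "SubmissionDate"]
with_contacts = ["SubmittingAgency", "SubmissionDate", "Title",
                 "PointOfContact", "PointOfContact"]
wrong_order = ["SubmissionDate", "SubmittingAgency"]
two_titles = ["SubmittingAgency", "SubmissionDate", "Title", "Title"]

for children in (minimal, with_contacts, wrong_order, two_titles):
    print(children, satisfies_sequence(children))
```

Compared with xsd:all, the only change in the rule is that position now matters, which is why the swapped pair fails here.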
XSD Constraints (Facets)
• Facets restrict values of a datatype (simpleType).
• xsd:minExclusive | minInclusive | maxExclusive | maxInclusive | totalDigits | fractionDigits | length | minLength | maxLength | enumeration | whiteSpace | pattern
• Which facets apply depends upon the datatype; some apply only to text-related types, others apply only to numeric types.
• Can use more than one facet in one restriction.
• Example: Restricts string type to 4 specific values.
    <xsd:simpleType name="referenceModelType">
      <xsd:restriction base="xsd:string">
        <xsd:enumeration value="BRM"/>
        <xsd:enumeration value="PRM"/>
        <xsd:enumeration value="SRM"/>
        <xsd:enumeration value="TRM"/>
        <xsd:length value="3"/> <!-- not in DRM XSD -->
      </xsd:restriction>
    </xsd:simpleType>
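The effect of combining the enumeration and length facets can be mirrored in a short predicate. This sketch restates the referenceModelType restriction by hand; `ALLOWED`, `REQUIRED_LENGTH`, and `is_reference_model_type` are names invented for the illustration, and a real validator would derive the same checks from the schema.

```python
ALLOWED = {"BRM", "PRM", "SRM", "TRM"}   # the four xsd:enumeration values
REQUIRED_LENGTH = 3                       # the xsd:length facet


def is_reference_model_type(value):
    """True only if the string passes every facet in the restriction."""
    # The base type is xsd:string, so comparison is exact: case and
    # surrounding white space are significant.
    return len(value) == REQUIRED_LENGTH and value in ALLOWED


for candidate in ("BRM", "TRM", "DRM", "brm", "BRM "):
    print(repr(candidate), is_reference_model_type(candidate))
```

With these particular values the length facet is redundant, since every enumerated value already has three characters; it is shown only because the slide uses it to demonstrate that facets can be combined.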
DRM Schema Components, 1
DRM Schema Components, 2
Anatomy Quiz #2
Metadata: Dublin Core and DDMS
DCMI Metadata, 1
• Dublin Core Metadata Initiative: http://dublincore.org/
• Terms: http://dublincore.org/documents/dcmi-terms/
• Type vocabulary: http://dublincore.org/documents/dcmi-type-vocabulary/
• Browse Dublin Core Metadata Registry
• ISO 15836:2003(E). Information and documentation - The Dublin Core metadata element set
• Element list from Users Guide: 16 (or 18?)
DCMI Metadata, 2
• xmlns:dc="http://purl.org/dc/elements/1.1/"
• Creator="Internal Revenue Service. Customer Complaints Unit" (a person, an organization, or a service). See also Contributor.
• Date="1998-02-16"
• Relation "is Refined by": conformsTo, hasFormat, hasPart, hasVersion, isFormatOf, isPartOf, isReferencedBy, isReplacedBy, isRequiredBy, isVersionOf, references, replaces, requires
• Identifier: would be desirable if the registry could assign this automatically as a UID
• Audience
• Title == Term
• Subject == Context
DDMS (DoD Discovery Metadata Specification)
DRM ResourceType Uses DC and DDMS
The schema imports the Dublin Core, DDMS, and IC-ISM schemas:
    <xsd:import namespace="http://purl.org/dc/elements/1.1/" schemaLocation="dc.xsd" />
    <xsd:import namespace="http://metadata.dod.mil/mdr/ns/DDMS/1.2/" schemaLocation="DDMS-v1_2.xsd" />
    <xsd:import namespace="urn:us:gov:ic:ism:v2" schemaLocation="IC-ISM-v2.xsd" />
ResourceType then refers to elements from those namespaces:
    <xsd:complexType name="ResourceType">
      <xsd:all>
        <xsd:element ref="drm:ResourceRefs" minOccurs="0" />
        <xsd:element ref="drm:DataAssetRefs" minOccurs="0" />
        <xsd:element ref="ddms:geospatialCoverage" minOccurs="0" />
        <xsd:element ref="dc:Title" minOccurs="0" />
        <xsd:element ref="dc:Identifier" minOccurs="0" />
        <xsd:element ref="dc:Date" minOccurs="0" />
        <xsd:element ref="dc:Creator" minOccurs="0" />
        <xsd:element ref="dc:Format" minOccurs="0" />
        <xsd:element ref="dc:Description" minOccurs="0" />
        <xsd:element ref="dc:Source" minOccurs="0" />
        <xsd:element ref="dc:Subject" minOccurs="0" />
        <xsd:element ref="dc:Type" minOccurs="0" />
        <xsd:element ref="dc:Publisher" minOccurs="0" />
        <xsd:element ref="dc:Contributor" minOccurs="0" />
        <xsd:element ref="dc:Language" minOccurs="0" />
        <xsd:element ref="dc:Relation" minOccurs="0" />
        <xsd:element ref="dc:Coverage" minOccurs="0" />
        <xsd:element ref="dc:Rights" minOccurs="0" />
        <xsd:element ref="ddms:temporalCoverage" minOccurs="0" />
        <xsd:element ref="drm:NodeRefs" minOccurs="0" />
      </xsd:all>
      <xsd:attribute ref="rdf:ID" use="required" />
    </xsd:complexType>
ddms:GeospatialCoverageType
    <xs:complexType name="GeospatialCoverageType">
      <xs:choice>
        <xs:element name="Place" type="PlaceType"/>
        <xs:element name="Facility" type="FacilityType"/>
      </xs:choice>
    </xs:complexType>
    <xs:complexType name="PlaceType">
      <xs:sequence>
        <xs:element name="name" type="xs:string" minOccurs="0"/>
        <xs:element name="region" type="xs:string" minOccurs="0"/>
        <xs:element ref="geoRef" minOccurs="0" maxOccurs="unbounded"/>
        <xs:element name="street" type="xs:string" minOccurs="0"/>
        <xs:element name="city" type="xs:string" minOccurs="0"/>
        <xs:element name="state" type="xs:string" minOccurs="0"/>
        <xs:element name="postalCode" type="xs:string" minOccurs="0"/>
        <xs:element ref="countryCode" minOccurs="0"/>
        <xs:element name="province" type="xs:string" minOccurs="0"/>
      </xs:sequence>
    </xs:complexType>
    <xs:complexType name="FacilityType">
      <xs:choice>
        <xs:sequence>
          <xs:element name="name" type="xs:string" minOccurs="0"/>
          <xs:element name="region" type="xs:string" minOccurs="0"/>
          <xs:element ref="geoRef" minOccurs="0" maxOccurs="unbounded"/>
          <xs:element name="street" type="xs:string" minOccurs="0"/>
          <xs:element name="city" type="xs:string" minOccurs="0"/>
          <xs:element name="state" type="xs:string" minOccurs="0"/>
          <xs:element name="postalCode" type="xs:string" minOccurs="0"/>
          <xs:element ref="countryCode" minOccurs="0"/>
          <xs:element name="province" type="xs:string" minOccurs="0"/>
        </xs:sequence>
        <xs:sequence>
          <xs:element name="facilityIdentifier" type="FacilityIdentifierType" minOccurs="0"/>
        </xs:sequence>
      </xs:choice>
    </xs:complexType>
More DDMS Geospatial Details
    <xs:element name="geoRef" type="CompoundGeoRefIdentifierType"/>
    <xs:complexType name="CompoundGeoRefIdentifierType">
      <xs:attribute name="qualifier" type="xs:string"/>
      <xs:attribute name="value" type="xs:string"/>
    </xs:complexType>
    <xs:element name="countryCode" type="CompoundCountryCodeIdentifierType"/>
    <xs:complexType name="CompoundCountryCodeIdentifierType">
      <xs:attribute name="qualifier" type="xs:string"/>
      <xs:attribute name="value" type="xs:string"/>
    </xs:complexType>
    <xs:complexType name="FacilityIdentifierType">
      <xs:attribute name="beNumber" type="xs:string" use="required">
        <xs:annotation>
          <xs:documentation>
            (GMI: BE_NUMBER, 1.0) Uniquely identifies the installation of the facility. The BE_NUMBER is generated based on the value input for the COORD to determine the appropriate World Area Code (WAC), the system assigned record originator and a one-up-number. [STUFF DELETED]
          </xs:documentation>
        </xs:annotation>
      </xs:attribute>
      <xs:attribute name="osuffix" type="xs:string" use="required">
        <xs:annotation>
          <xs:documentation>
            (GMI: OSUFFIX, 1.0) Uniquely identifies a facility or demographic area in conjunction with a BE_NUMBER. Pos. 12. SYSTEM ASSIGNED RECORD_ORIGINATOR. The organization creating the facility or demographic area. DIA installation records created prior to IDB generation of OSUFFIX contain DD. Pos. 3-5 A one-up number.
          </xs:documentation>
        </xs:annotation>
      </xs:attribute>
    </xs:complexType>
Searching: XQuery
Collect, Register, Harmonize, Discover, and Measure
Why Do We Need XQuery?
• XML is not relational
• Native XML query language
• XML views of data sources and relational data
• XPath 2.0 work began in 2001, as did XQuery
• Operate on collections of documents
• Join data from multiple documents
• Useful for structured and unstructured data
• Strongly typed, declarative language
• Compact, non-XML syntax for querying
• http://xml.house.gov - site generated using XML/eXist/XQuery/XSLT
• Note: As of this presentation (June 2005), XQuery is not yet a W3C REC.
Play Structure
Simple Example
• Assuming input is Jon Bosak's XML collection of Shakespeare plays… [see previous slide]
• Find the speaker of the 1st speech of the 1st scene of the 1st act of each play in a database.
• Can be simply XPath:
    /PLAY/ACT[1]/SCENE[1]/SPEECH[1]/SPEAKER
• Search scope is implicitly a collection of documents in a database, or can be narrowed using the doc() function.
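The same path can be tried without an XQuery engine. Python's standard `xml.etree.ElementTree` supports a limited XPath subset that includes positional predicates such as `[1]`, which is enough for this query. The tiny `PLAY` document below is invented for the sketch and only imitates the element nesting of the play collection; it is not taken from the real files.

```python
import xml.etree.ElementTree as ET

# Hypothetical miniature play with the PLAY/ACT/SCENE/SPEECH/SPEAKER nesting.
PLAY = """<PLAY>
  <TITLE>Example Play</TITLE>
  <ACT>
    <SCENE>
      <SPEECH><SPEAKER>FIRST</SPEAKER><LINE>placeholder</LINE></SPEECH>
      <SPEECH><SPEAKER>SECOND</SPEAKER><LINE>placeholder</LINE></SPEECH>
    </SCENE>
    <SCENE>
      <SPEECH><SPEAKER>THIRD</SPEAKER><LINE>placeholder</LINE></SPEECH>
    </SCENE>
  </ACT>
  <ACT>
    <SCENE>
      <SPEECH><SPEAKER>FOURTH</SPEAKER><LINE>placeholder</LINE></SPEECH>
    </SCENE>
  </ACT>
</PLAY>"""

play = ET.fromstring(PLAY)

# ElementTree paths are relative to the element being searched, so the
# leading /PLAY step of the slide's expression is dropped here.
speakers = [s.text for s in play.findall("ACT[1]/SCENE[1]/SPEECH[1]/SPEAKER")]
print(speakers)
```

Unlike an XML database, this searches a single parsed document; running it over a collection would mean looping over the files and applying the same path to each root.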
Complex Example
• Find books sorted in reverse order by author, showing the number of titles by each author.
    xquery version "1.0";
    declare namespace rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#";
    declare namespace dc="http://purl.org/dc/elements/1.1/";
    (: This is a comment. Next is a FLWOR expression :)
    for $p in distinct-values(doc('/db/library/biblio.rdf')//dc:creator)
    let $books := //rdf:Description[dc:creator &= $p]
    order by $p descending
    return
      <result>
        <creator titles="{count($books)}">{$p}</creator>
        {for $b in $books return $b/dc:title}
      </result>
• Query prolog (shown in blue on the slide): the version and namespace declarations. Note the semi-colon after each line.
• Query body (shown in purple on the slide): the FLWOR expression; no semi-colons.
FLWOR Expressions (“flower”)
• For, Let, Where, Order by, Return
    (ForClause | LetClause)+ WhereClause? OrderByClause? "return" ExprSingle
• Similar to SELECT-FROM-WHERE in SQL
• For: iterate over nodes
• Let: assign value to variable(s)
• Where: under some optional condition
• Order by: optional sorting (ascending, descending, etc.)
• Return: construct elements and values to return
FLWOR

<bib>
{
  for $b in doc("http://bstore1.example.com/bib.xml")//book
  where $b/publisher = "Addison-Wesley" and $b/@year > 1991
  order by $b/title
  return
    <book>
      { $b/@year }
      { $b/title }
    </book>
}
</bib>
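Each FLWOR clause has a familiar counterpart in ordinary code. The Python sketch below mirrors the query clause by clause over a small inline sample; the sample records are assumptions for illustration, since the contents of bib.xml are not shown on the slide.

```python
import xml.etree.ElementTree as ET

# Assumed sample data; the real bib.xml is not reproduced in the deck.
SAMPLE_BIB = """
<bib>
  <book year="1994"><title>TCP/IP Illustrated</title><publisher>Addison-Wesley</publisher></book>
  <book year="1990"><title>An Older Book</title><publisher>Addison-Wesley</publisher></book>
  <book year="2000"><title>Data on the Web</title><publisher>Morgan Kaufmann Publishers</publisher></book>
  <book year="1992"><title>Advanced Programming in the Unix environment</title><publisher>Addison-Wesley</publisher></book>
</bib>
"""


def recent_addison_wesley(bib_xml):
    """Build a <bib> element holding year and title of the matching books."""
    source = ET.fromstring(bib_xml)
    # for $b in ...//book  where $b/publisher = "Addison-Wesley" and $b/@year > 1991
    selected = [b for b in source.iter("book")
                if b.findtext("publisher") == "Addison-Wesley"
                and int(b.get("year")) > 1991]
    # order by $b/title (plain code-point ordering here, no collation)
    selected.sort(key=lambda b: b.findtext("title"))
    # return <book>{ $b/@year }{ $b/title }</book>, wrapped in <bib>
    result = ET.Element("bib")
    for b in selected:
        out = ET.SubElement(result, "book", year=b.get("year"))
        ET.SubElement(out, "title").text = b.findtext("title")
    return result


print(ET.tostring(recent_addison_wesley(SAMPLE_BIB), encoding="unicode"))
```

The list comprehension plays the role of for and where, sort() the role of order by, and the element construction the role of return.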
Transforming XML with XSLT
Extensible Stylesheet Language Transformations
XSLT to Transform XML to HTML or PDF (e.g., Section 207d Use Case)
XSLT Functional Capabilities, 1
• filtering XML elements from the input document using match patterns
• sorting input based on element names or attribute values
• transposing the order of elements
• creating new elements and attributes
• repeating text content extracted from the input for multiple purposes in the output (e.g., headings and table of contents)
• computing new content based on existing content in the source document
• incorporating fixed text in output
XSLT Functional Capabilities, 2
• conditional processing based on values in the XML source
• looping through a set of nodes
• copying subtrees of your source documents to output
• processing the same elements in multiple ways (passes) based on different template modes
• accessing XPath and XSLT functions for string manipulation, basic math operations, and node processing
• reusing code by defining named templates and attribute sets
XSLT Processing Diagram [from Sall’s book, “XML Family of Specifications: A Practical Guide”]
XML Instance Can Be Associated with XSLT

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet href="DRM-Mixed-Ex.xsl" type="text/xsl"?>
<Collection xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="thesaurus.xsd">
  <Concept>
    <prefLabel>Baseline Architecture</prefLabel>
    <definition>The set of products that portray the existing enterprise, the current business practices, and technical infrastructure (commonly referred to as the "As-Is" architecture)</definition>
    <subject>Federal Enterprise Architecture</subject>
    <scopeNote>US Federal Information Technology</scopeNote>
    <SOURCE>Data Reference Model, Volume 1 [http://www.whitehouse.gov/omb/egov/documents/feadrm1.PDF]</SOURCE>
  </Concept>
  <Concept>
    <prefLabel>BRM</prefLabel>
    <ABBREVIATION_OR_ACRONYM>BRM</ABBREVIATION_OR_ACRONYM>
    <definition>Business Reference Model of the Federal Enterprise Architecture</definition>
    <subject>Federal Enterprise Architecture</subject>
    <scopeNote>US Federal Information Technology</scopeNote>
    <SOURCE>Data Reference Model, Volume 1 [http://www.whitehouse.gov/omb/egov/documents/feadrm1.PDF]</SOURCE>
    <broader>Federal Enterprise Architecture</broader>
  </Concept>
  <!-- etc -->
</Collection>
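The association is carried entirely by the xml-stylesheet processing instruction in the prolog. As an illustration of how a consuming program might discover it, here is a sketch using Python's standard xml.dom.minidom, which keeps processing instructions as nodes. The sample is a cut-down copy of the instance above, and the regular expression is a simplification that only recognises double-quoted pseudo-attributes.

```python
import re
from xml.dom import minidom

# Cut-down copy of the slide's instance (XML declaration omitted for brevity).
SAMPLE_INSTANCE = """<?xml-stylesheet href="DRM-Mixed-Ex.xsl" type="text/xsl"?>
<Collection>
  <Concept><prefLabel>BRM</prefLabel></Concept>
</Collection>"""


def stylesheet_links(xml_text):
    """Return one dict of pseudo-attributes per xml-stylesheet instruction."""
    doc = minidom.parseString(xml_text)
    links = []
    for node in doc.childNodes:
        if (node.nodeType == node.PROCESSING_INSTRUCTION_NODE
                and node.target == "xml-stylesheet"):
            # node.data is the raw text after the target,
            # e.g. href="..." type="text/xsl"
            links.append(dict(re.findall(r'([\w-]+)\s*=\s*"([^"]*)"', node.data)))
    return links


print(stylesheet_links(SAMPLE_INSTANCE))
```

A browser performs the equivalent lookup before fetching the stylesheet and applying it to the document.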
Custom XSLT for Glossary XML

<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Copyright (c) 2005. Ken Sall (xml@kensall.com). All Rights Reserved. -->
  <xsl:output method="html" indent="yes" />

  <xsl:template match="/">
    <html>
      <head>
        <style type="text/css"> body { margin: 2em; font-family: Arial; } </style>
      </head>
      <body>
        <xsl:apply-templates select="/Collection/Concept" />
      </body>
    </html>
  </xsl:template>

  <!-- One table per Concept; a row is emitted only when the child element is present. -->
  <xsl:template match="Concept">
    <xsl:variable name="term"><xsl:value-of select="prefLabel" /></xsl:variable>
    <table align="center" border="1" cellpadding="3" width="90%">
      <tr><td colspan="2">Concept: <b><xsl:value-of select="$term" /></b></td></tr>
      <tr><td>prefLabel</td><td><xsl:value-of select="$term" /></td></tr>
      <xsl:if test="altLabel"><tr><td width="35%">altLabel</td><td><xsl:value-of select="altLabel" /></td></tr></xsl:if>
      <xsl:if test="ABBREVIATION_OR_ACRONYM"><tr><td width="35%">ABBREVIATION_OR_ACRONYM</td><td><xsl:value-of select="ABBREVIATION_OR_ACRONYM" /></td></tr></xsl:if>
      <xsl:if test="definition"><tr><td width="35%">definition</td><td><xsl:value-of select="definition" /></td></tr></xsl:if>
      <xsl:if test="subject"><tr><td width="35%">subject</td><td><xsl:value-of select="subject" /></td></tr></xsl:if>
      <xsl:if test="scopeNote"><tr><td width="35%">scopeNote</td><td><xsl:value-of select="scopeNote" /></td></tr></xsl:if>
      <xsl:if test="SOURCE"><tr><td width="35%">SOURCE</td><td><xsl:value-of select="SOURCE" /></td></tr></xsl:if>
      <xsl:if test="example"><tr><td width="35%">example</td><td><xsl:value-of select="example" /></td></tr></xsl:if>
      <xsl:if test="narrower"><tr><td width="35%">narrower</td><td><xsl:value-of select="narrower" /></td></tr></xsl:if>
      <xsl:if test="broader"><tr><td width="35%">broader</td><td><xsl:value-of select="broader" /></td></tr></xsl:if>
      <xsl:if test="related"><tr><td width="35%">related</td><td><xsl:value-of select="related" /></td></tr></xsl:if>
    </table>
  </xsl:template>
</xsl:stylesheet>
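To make the stylesheet's logic concrete without an XSLT processor, the sketch below re-creates it by hand in Python. It is a stand-in for the transformation, not a way of running the stylesheet: the function names are made up for this example, the head and style elements are left out, and the standard library has no XSLT engine.

```python
import xml.etree.ElementTree as ET
from html import escape

# Optional child elements, in the order the stylesheet tests for them.
OPTIONAL_FIELDS = [
    "altLabel", "ABBREVIATION_OR_ACRONYM", "definition", "subject", "scopeNote",
    "SOURCE", "example", "narrower", "broader", "related",
]


def concept_to_html(concept):
    """Mirror the Concept template: one table, one row per field present."""
    term = concept.findtext("prefLabel", default="")
    rows = [
        f'<tr><td colspan="2">Concept: <b>{escape(term)}</b></td></tr>',
        f"<tr><td>prefLabel</td><td>{escape(term)}</td></tr>",
    ]
    for name in OPTIONAL_FIELDS:
        value = concept.findtext(name)
        if value is not None:  # plays the role of <xsl:if test="name">
            rows.append(
                f'<tr><td width="35%">{name}</td><td>{escape(value)}</td></tr>'
            )
    return ('<table align="center" border="1" cellpadding="3" width="90%">'
            + "".join(rows) + "</table>")


def collection_to_html(xml_text):
    """Mirror the root template: apply the Concept logic to each Concept."""
    root = ET.fromstring(xml_text)  # root is the Collection element
    tables = "".join(concept_to_html(c) for c in root.findall("Concept"))
    return f"<html><body>{tables}</body></html>"


sample = ("<Collection><Concept><prefLabel>BRM</prefLabel>"
          "<definition>Business Reference Model</definition></Concept></Collection>")
print(collection_to_html(sample))
```

The XSLT version expresses the same rules declaratively, which is why it can be attached to the instance and applied by any conforming processor or browser.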
XML Instance in Browser Rendered as HTML
Acronyms [1]
• CSS: Cascading Style Sheets [sic]
• CCTS: Core Components Technical Specification (for ebXML)
• DAML+OIL: DARPA Agent Markup Language + Ontology Inference Layer
• DocBook: XML/SGML vocabulary particularly well suited to books and papers about computer hardware and software; from OASIS
• DOM: Document Object Model
• DTD: Document Type Definition
• ebXML: eBusiness XML (ebXML.org)
• GXA and WS-*: Microsoft's Global XML Web Services Architecture
• HumanML: Human Markup Language; from OASIS
• Java WSDP: Java Web Services Developer Pack
• LegalXML: electronic exchange of legal data
• NDR: Naming and Design Rules (for XML Schema)
• MathML: Mathematical Markup Language
• MSXML: Microsoft XML Parser [in IE 5, IE 6, or standalone]
• OASIS: Organization for the Advancement of Structured Information Standards
• OWL: semantic markup language for publishing and sharing ontologies on the Web, based on DAML+OIL and RDF
• RDF: Resource Description Framework
• RELAX NG: REgular LAnguage description for XML; Schema alternative
• REST: REpresentational State Transfer
Acronyms [2]
• RSS: Really Simple Syndication (aka RDF Site Summary or Rich Site Summary)
• SAML: Security Assertion Markup Language; from OASIS
• SAX: Simple API for XML
• SKOS: Simple Knowledge Organization System
• SOAP: Simple Object Access Protocol [deprecated]
• SVG: Scalable Vector Graphics
• UBL: Universal Business Language (see ebXML); from OASIS
• UDDI: Universal Description, Discovery and Integration
• WSDL: Web Services Description Language
• XACML: Extensible Access Control Markup Language
• XBRL: Extensible Business Reporting Language
• XForms: not really an acronym; XML-enabled web forms
• XHTML: Extensible HyperText Markup Language
• XLink: XML Linking Language
• XML: Extensible Markup Language
• XPath: XML Path Language
• XPointer: XML Pointer Language
• XQuery: XML Query Language
• XSD: XML Schema (also “WXS” for W3C XML Schema)
• XSL: Extensible Stylesheet Language family (XSL-FO, XSLT, XPath)
• XSL-FO: XSL Formatting Objects (cf: CSS)
• XSLT: Extensible Stylesheet Language Transformations
• XTM: XML Topic Maps